From: vincent.belaiche@gmail.com (Vincent Belaïche)
Newsgroups: gmane.emacs.bugs
Subject: bug#27391: 25.2.50; utf-8 coding cookie is not applied on some specific markdown file
Date: Sat, 17 Jun 2017 07:45:35 +0200
Message-ID: <84wp8bxj40.fsf@gmail.com>
References: <841sqkdzh5.fsf@gmail.com>
To: Eli Zaretskii, 27391@debbugs.gnu.org, p.stephani2@gmail.com
Cc: Vincent Belaïche

On 17/06/2017 at 00:23, Vincent Belaïche wrote:
>
>
> On 17/06/2017 at 00:09, Vincent Belaïche wrote:
>>
>> On 16/06/2017 at 21:37, Vincent Belaïche wrote:
>>>
>>> On 16/06/2017 at 21:15, Vincent Belaïche wrote:
>> [...]
>>
>>>>
>>> After some more investigation, I think that the bug is in function
>>> insert-file-contents of fileio.c, which is the one that decides and
>>> sets the coding system well before the other local variables are
>>> looked into.
>> I have located the bug.
>>
>> After some more investigation, in the end the find-auto-coding of
>> mule.el is what is called to detect the coding.
>>
>> This function evaluates this expression to find the local variables:
>>
>>   (re-search-forward
>>    "[\r\n]\\([^[\r\n]*\\)[ \t]*Local Variables:[ \t]*\\([^\r\n]*\\)[\r\n]"
>>    tail-end t)
>>
>> This expression evaluates to nil over file CONTRIBUTING.md
>>
>> I can make a simple fix if you tell me on which branch to do it.
>>
>> However I think that the root of the problem is poor code factorization
>> of local variable parsing between mule.el and files.el. A better, more
>> futureproof fix would be a single local variable parser with some
>> input constraint telling what sort of settings are sought. The output
>> of the parse could be used in files.el and mule.el.
>>
>>   Vincent.
>>
>>
> Ooops... my lengthy email of T23:34 was sent by mistake. A shorter
> version with only the conclusion and w/o all the details of my
> investigation is above.
>
> Anyway, Philipp's patch is what I had in mind as a quick fix. However, I
> don't think that it is a good solution to leave code unfactorized when
> factorizing is possible. Factorizing makes it more maintainable.
>
>   V.

Just to mention the following points, which I noted when comparing the
code in find-auto-coding and in hack-local-variables:

* In hack-local-variables the trailing local variables section is
  considered to lie within at most 3000 characters of eob, while in
  find-auto-coding the limit is 3072. The "correct" figure should be
  3072, not 3000, for consistency with the "1024 * 3" code in function
  Finsert_file_contents of fileio.c:

      if (nread == 1024)
        {
          int ntail;
          if (lseek (fd, - (1024 * 3), SEEK_END) < 0)
            report_file_error ("Setting file position", orig_filename);
          ntail = emacs_read_quit (fd, read_buf + nread, 1024 * 3);
          nread = ntail < 0 ? ntail : nread + ntail;
        }

  Maybe the exact value should be defined as a constant.

* In find-auto-coding the regexp operators "^" (for bol) and "$" (for
  eol) are not used; "[\r\n]" is used instead. I suspect that this is
  because at this stage the coding system is not yet set, and therefore
  there is no notion of bol or eol: the whole buffer is a single
  line. If so, I withdraw my previous statement that code factorization
  is desirable.

* In both cases what is sought is the *FIRST* occurrence, searched
  *FORWARD*, of the case-sensitive string "Local Variables:" in the
  trailing 3000--3072 characters of the buffer. I think that this is a
  problem and that we should either search *BACKWARD* or, after finding
  the first occurrence, keep searching for subsequent occurrences and
  use the last one. I say this because with the emacs-template package
  it is possible that the template file has some local variables in the
  template definition section that differ from those of the template
  itself. See (info "(template) DefSect").

  For instance, the end of the template file could be as follows:

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
... blah blah blah template content ...

// Local Variables:
// toto: "tata"
// End:
>>>TEMPLATE-DEFINITION-SECTION<<<
... blah blah blah Lisp Template rules ...

;; Local Variables:
;; foo: "bar"
;; End:
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

Maybe excluding the [ character from the prefix string is not a typo but
an intentional design choice to prevent false detection of the local
variables section. I strongly recommend that, before making any fix,
somebody dig into the file history to find out when and *WHY* this
exclusion of [ was introduced --- sorry, but I do not volunteer for this
tedious, time-consuming kind of work...

  Vincent.
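
The difference between taking the first and the last occurrence of
"Local Variables:" in the file tail can be illustrated with a minimal C
sketch. This is not the Emacs implementation: find_marker, TAIL_SIZE and
MARKER are hypothetical names introduced only for illustration, the
3072-byte window is taken from the "1024 * 3" figure discussed above, and
the prefix/suffix handling done by the real regexp is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: same window as the "1024 * 3" tail read in fileio.c.  */
#define TAIL_SIZE (1024 * 3)

/* Hypothetical constant: the case-sensitive marker string.  */
static const char MARKER[] = "Local Variables:";

/* Scan the final TAIL_SIZE bytes of BUF (LEN bytes, not necessarily
   NUL-terminated) for MARKER.  Return the offset from the start of BUF
   of the first match if LAST is zero, of the last match otherwise, or
   -1 if the marker does not occur in the window.  */
long
find_marker (const char *buf, size_t len, int last)
{
  assert (buf != NULL);

  size_t mlen = sizeof MARKER - 1;
  size_t start = len > TAIL_SIZE ? len - TAIL_SIZE : 0;
  long found = -1;

  for (size_t i = start; i + mlen <= len; i++)
    if (memcmp (buf + i, MARKER, mlen) == 0)
      {
        found = (long) i;
        if (!last)
          break;          /* forward search: stop at the first match */
      }

  return found;           /* with LAST set, the final match wins */
}
```

On a tail shaped like the template example, calling find_marker with
last == 0 lands on the "// Local Variables:" block of the template
content, while last != 0 lands on the ";; Local Variables:" block that
follows the template definition section marker.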