From: Philipp Stephani
Newsgroups: gmane.emacs.bugs
Subject: bug#27391: 25.2.50; utf-8 coding cookie is not applied on some specific markdown file
Date: Fri, 16 Jun 2017 21:39:59 +0000
References: <841sqkdzh5.fsf@gmail.com> <84zid7y668.fsf@gmail.com>
To: Vincent Belaïche, Eli Zaretskii, 27391@debbugs.gnu.org

On Fri, 16 June 2017 at 23:34, Philipp Stephani wrote:

> On Fri, 16 June 2017 at 23:28, Vincent Belaïche wrote:
>
>> On 16/06/2017 at 21:37, Vincent Belaïche wrote:
>> >
>> > On 16/06/2017 at 21:15, Vincent Belaïche wrote:
>> >>
>> >> [...]
>> >>
>> >
>> > After some more investigation, I think that the bug is in function
>> > insert-file-contents of fileio.c, which is the one that decides and sets
>> > the coding system well before the other local variables are looked into.
>>
>> After some more investigation, in the end the find-auto-coding of
>> mule.el is what is called to detect the coding. This function calls some
>> re-coding regexp.
>>
>> Here is a test function defining the same regexp.
>>
>> (defun doit ()
>>   (interactive)
>>   (let* ((prefix (regexp-quote "[comment]: # ("))
>>          (suffix (regexp-quote ")"))
>>          (re-coding
>>           (concat
>>            "[\r\n]" prefix
>>            ;; N.B. without the \n below, the regexp can
>>            ;; eat newlines.
>>            "[ \t]*coding[ \t]*:[ \t]*\\([^ \t\r\n]+\\)[ \t]*"
>>            suffix "[\r\n]")))
>>     (message (if (looking-at re-coding) "ok" "nak"))))
>>
>> I tried it with point at end of line
>>
>> [comment]: # ( Local Variables: )
>>
>> and it answered "ok". Now I defined this with re-search-forward instead
>> of looking-at:
>>
>> (defun doit ()
>>   (interactive)
>>   (let* ((prefix (regexp-quote "[comment]: # ("))
>>          (suffix (regexp-quote ")"))
>>          (re-coding
>>           (concat
>>            "[\r\n]" prefix
>>            ;; N.B. without the \n below, the regexp can
>>            ;; eat newlines.
>>            "[ \t]*coding[ \t]*:[ \t]*\\([^ \t\r\n]+\\)[ \t]*"
>>            suffix "[\r\n]")))
>>     (message (if (re-search-forward re-coding nil t) "ok" "nak"))))
>>
>> I placed the point before the coding: line, and I also got the answer "ok".
>>
>> So I don't think that the regexp as such is to blame. Something else
>> seems to happen. It is too late now, I need to go to bed...
>>
>> Vincent.
>>
> I think it's actually the regexp that searches for "Local Variables". The
> following minimal example fails for me:
>
> (with-temp-buffer
>   (insert "
>
> [comment]: # ( Local Variables: )
> [comment]: # ( coding: utf-8 )
> [comment]: # ( End: )
>
> ")
>   (goto-char (point-min))
>   (re-search-forward
>    "[\r\n]\\([^[\r\n]*\\)[ \t]*Local Variables:[ \t]*\\([^\r\n]*\\)[\r\n]"))
>

Does anybody know why the second character range says [^[\r\n] instead of
[^\r\n]? This seems to explicitly exclude a leading [.


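To make the question about the two character ranges concrete, here is a minimal sketch that runs the same search twice on the sample text from the message, once with the prefix class [^[\r\n] and once with [^\r\n]. It is not taken from mule.el or files.el; the helper name my-local-variables-prefix is made up for illustration, and the only thing it assumes is the regexp and buffer contents quoted above.

```elisp
;; Hypothetical helper, for illustration only: search the sample text
;; for the "Local Variables:" line using PREFIX-CLASS as the character
;; class that is allowed to match the line prefix.
(defun my-local-variables-prefix (prefix-class)
  "Return the prefix matched before \"Local Variables:\", or nil.
PREFIX-CLASS is a regexp character class given as a string."
  (with-temp-buffer
    (insert "\n\n"
            "[comment]: # ( Local Variables: )\n"
            "[comment]: # ( coding: utf-8 )\n"
            "[comment]: # ( End: )\n\n")
    (goto-char (point-min))
    ;; Passing t as NOERROR makes a failed search return nil
    ;; instead of signalling `search-failed'.
    (when (re-search-forward
           (concat "[\r\n]\\(" prefix-class "*\\)[ \t]*Local Variables:"
                   "[ \t]*\\([^\r\n]*\\)[\r\n]")
           nil t)
      (match-string 1))))

;; Class from the failing regexp: the prefix may not contain "[".
(my-local-variables-prefix "[^[\r\n]")   ; => nil

;; Same class without the "[": the prefix is matched.
(my-local-variables-prefix "[^\r\n]")    ; => "[comment]: # ( "
```

With the first class the prefix group cannot consume the leading "[" of "[comment]: # (", so the search has nothing to match "Local Variables:" against and fails. With the second class the whole prefix is captured. This only shows where the search stops matching; it does not say why the "[" was excluded in the first place.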