From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Newsgroups: gmane.emacs.bugs Subject: bug#53260: char-syntax differs in interpreter and bytecode [PATCH] Date: Sun, 16 Jan 2022 12:04:51 +0100 Message-ID: References: <0A87E4BA-4741-4688-A005-912ABFC86B83@acm.org> <8735lpmm4f.fsf@gnus.org> <978B1D34-1F52-4B3E-B2E2-7ADA95155068@acm.org> Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.13\)) Content-Type: multipart/mixed; boundary="Apple-Mail=_484B250A-F9B6-4D6B-8FC7-3F701E74A15C" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="32804"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Lars Ingebrigtsen , 53260@debbugs.gnu.org To: Stefan Monnier Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Sun Jan 16 12:06:12 2022 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1n93MR-0008KM-Tb for geb-bug-gnu-emacs@m.gmane-mx.org; Sun, 16 Jan 2022 12:06:12 +0100 Original-Received: from localhost ([::1]:53098 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n93MQ-0002S3-A3 for geb-bug-gnu-emacs@m.gmane-mx.org; Sun, 16 Jan 2022 06:06:10 -0500 Original-Received: from eggs.gnu.org ([209.51.188.92]:58888) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n93MI-0002O5-KS for bug-gnu-emacs@gnu.org; Sun, 16 Jan 2022 06:06:02 -0500 Original-Received: from debbugs.gnu.org ([209.51.188.43]:49694) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1n93MI-0002pi-9E for bug-gnu-emacs@gnu.org; Sun, 16 Jan 2022 06:06:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 
1n93MI-0005hM-4e for bug-gnu-emacs@gnu.org; Sun, 16 Jan 2022 06:06:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 16 Jan 2022 11:06:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 53260 X-GNU-PR-Package: emacs Original-Received: via spool by 53260-submit@debbugs.gnu.org id=B53260.164233110921838 (code B ref 53260); Sun, 16 Jan 2022 11:06:02 +0000 Original-Received: (at 53260) by debbugs.gnu.org; 16 Jan 2022 11:05:09 +0000 Original-Received: from localhost ([127.0.0.1]:42597 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1n93LQ-0005g9-Sn for submit@debbugs.gnu.org; Sun, 16 Jan 2022 06:05:09 -0500 Original-Received: from mail1435c50.megamailservers.eu ([91.136.14.35]:54748 helo=mail263c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1n93LK-0005fM-EG for 53260@debbugs.gnu.org; Sun, 16 Jan 2022 06:05:08 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1642331095; bh=eP0OdjGT/8u8GTWWcEqNnNX381ufA48pot1aglia6Gs=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=UKM5Wnk3endsAXJ9P430Z8Y+I+j4uFZqWAbOkqKesl4oX6gPpamoRXbNbJkIBYMnN QHkkfVeLfJxBQfh9XehxuYWvpL7MvGpiBh3YR9VIYaUHrxF8LyDZJdUbRq7AYeAa5J 8D7P3HFJWgK0RXv+6Pp/e9aJZm1CjX0y3FFJoJ5E= Feedback-ID: mattiase@acm.or Original-Received: from smtpclient.apple (c188-150-171-71.bredband.tele2.se [188.150.171.71]) (authenticated bits=0) by mail263c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 20GB4qta009775; Sun, 16 Jan 2022 11:04:54 +0000 In-Reply-To: X-Mailer: Apple Mail (2.3654.120.0.1.13) X-CTCH-RefID: str=0001.0A742F1A.61E3FBD7.0010, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Flags: 0 X-CTCH-ScoreCust: 
0.000 X-Origin-Country: SE X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:224380 Archived-At:

--Apple-Mail=_484B250A-F9B6-4D6B-8FC7-3F701E74A15C
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=us-ascii

On 15 Jan 2022, at 23:51, Stefan Monnier wrote:

> Doesn't sound right: char tables are indexed by chars (i.e. Unicode code
> points) not by bytes, so we need to convert the byte into a char
> before indexing.

Sure, I'm happy to do it either way. Chars retrieved from unibyte
buffers or strings really should be converted to multibyte before being
used with char-syntax; unibyte buffers are not very common, but unibyte
strings are slightly more so.
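To illustrate the conversion discussed above, here is a minimal standalone sketch in C. It is not the Emacs implementation: every `sketch_*` name and the toy syntax lookup are hypothetical stand-ins, and the only detail taken from Emacs is that raw byte 128 corresponds to character code #x3FFF80 (the value of `(unibyte-char-to-multibyte 128)`).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constant for this sketch: offset that maps raw bytes
   0x80..0xFF into the raw-byte character range 0x3FFF80..0x3FFFFF.  */
enum { SKETCH_RAW_BYTE_OFFSET = 0x3FFF00 };

/* Convert a unibyte value (0..255) to a character code.
   ASCII maps to itself; bytes 0x80..0xFF map to raw-byte chars.  */
int
sketch_unibyte_to_char (int byte)
{
  assert (byte >= 0 && byte <= 0xFF);
  return byte < 0x80 ? byte : byte + SKETCH_RAW_BYTE_OFFSET;
}

/* Toy stand-in for a char table, indexed by character code (not by
   byte): only the raw-byte char for 0x80 has symbol syntax '_',
   everything else is punctuation '.'.  */
char
sketch_syntax_of_char (int ch)
{
  return ch == 0x3FFF80 ? '_' : '.';
}

/* Look up the syntax of C, converting to a character code first
   when the current buffer is unibyte.  */
char
sketch_char_syntax (int c, bool buffer_is_multibyte)
{
  if (!buffer_is_multibyte)
    c = sketch_unibyte_to_char (c);
  return sketch_syntax_of_char (c);
}
```

With this toy table, `sketch_char_syntax (128, false)` finds the symbol-syntax entry, whereas indexing the table directly with the byte value 128 would look up a different character (U+0080) and miss it.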
--Apple-Mail=_484B250A-F9B6-4D6B-8FC7-3F701E74A15C
Content-Disposition: attachment;
	filename=0001-Fix-Fchar_syntax-for-non-ASCII-in-unibyte-buffers.patch
Content-Type: application/octet-stream; x-unix-mode=0644;
	name="0001-Fix-Fchar_syntax-for-non-ASCII-in-unibyte-buffers.patch"
Content-Transfer-Encoding: 7bit

From 2adb5c862232abf126f73d0aa514f5ae8b6babba Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Sun, 16 Jan 2022 11:58:00 +0100
Subject: [PATCH] Fix Fchar_syntax for non-ASCII in unibyte buffers

Fchar_syntax did not convert unibyte characters to multibyte when the
current buffer was unibyte, in contrast to `char-syntax` in
byte-compiled code (bug#53260).

* src/bytecode.c (exec_byte_code): Call out to Fchar_syntax;
the dynamic frequency is too low to justify inlining here, and it
did lead to implementations diverging.
* src/syntax.c (Fchar_syntax): Convert non-ASCII unibyte values to
multibyte.  Remove useless SETUP_BUFFER_SYNTAX_TABLE which has no
effect here.
* test/src/syntax-tests.el (syntax-char-syntax): New test.
---
 src/bytecode.c           |  8 +-------
 src/syntax.c             |  6 +++---
 test/src/syntax-tests.el | 15 +++++++++++++++
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/bytecode.c b/src/bytecode.c
index 472992be18..b7e65d05ae 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1167,13 +1167,7 @@ #define DEFINE(name, value) LABEL (name) ,
 	  NEXT;
 
 	CASE (Bchar_syntax):
-	  {
-	    CHECK_CHARACTER (TOP);
-	    int c = XFIXNAT (TOP);
-	    if (NILP (BVAR (current_buffer, enable_multibyte_characters)))
-	      c = make_char_multibyte (c);
-	    XSETFASTINT (TOP, syntax_code_spec[SYNTAX (c)]);
-	  }
+	  TOP = Fchar_syntax (TOP);
 	  NEXT;
 
 	CASE (Bbuffer_substring):
diff --git a/src/syntax.c b/src/syntax.c
index 9df878b8ed..c1e81dfa47 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -1101,10 +1101,10 @@ DEFUN ("char-syntax", Fchar_syntax, Schar_syntax, 1, 1, 0,
 `syntax-after' instead.  */)
   (Lisp_Object character)
 {
-  int char_int;
   CHECK_CHARACTER (character);
-  char_int = XFIXNUM (character);
-  SETUP_BUFFER_SYNTAX_TABLE ();
+  int char_int = XFIXNAT (character);
+  if (NILP (BVAR (current_buffer, enable_multibyte_characters)))
+    char_int = make_char_multibyte (char_int);
   return make_fixnum (syntax_code_spec[SYNTAX (char_int)]);
 }
 
diff --git a/test/src/syntax-tests.el b/test/src/syntax-tests.el
index 3b9f21cde3..501b5e067f 100644
--- a/test/src/syntax-tests.el
+++ b/test/src/syntax-tests.el
@@ -506,4 +506,19 @@ test-from-to-parse-partial-sexp
     (should (parse-partial-sexp 1 1))
     (should-error (parse-partial-sexp 2 1))))
 
+(ert-deftest syntax-char-syntax ()
+  ;; Verify that char-syntax behaves identically in interpreted and
+  ;; byte-compiled code (bug#53260).
+  (let ((cs (byte-compile (lambda (x) (char-syntax x)))))
+    ;; Use a unibyte buffer with a syntax table using symbol syntax
+    ;; for raw byte 128.
+    (with-temp-buffer
+      (set-buffer-multibyte nil)
+      (let ((st (make-syntax-table)))
+        (modify-syntax-entry (unibyte-char-to-multibyte 128) "_" st)
+        (set-syntax-table st)
+        (should (equal (char-syntax 128) ?_))
+        (should (equal (funcall cs 128) ?_))))
+    (list (char-syntax 128) (funcall cs 128))))
+
 ;;; syntax-tests.el ends here
-- 
2.32.0 (Apple Git-132)

--Apple-Mail=_484B250A-F9B6-4D6B-8FC7-3F701E74A15C--