From: Paul Eggert
Newsgroups: gmane.emacs.bugs
Subject: bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault
Date: Wed, 17 Aug 2016 13:52:42 -0700
Organization: UCLA Computer Science Department
To: Eli Zaretskii
Cc: johnw@gnu.org, 24206@debbugs.gnu.org

Eli Zaretskii wrote:

>> Some change in this area was needed because the 'multibyte' flag went away.
>
> Only because you removed it.  You could have left it alone, it would
> have worked

Sure, but it was no longer necessary, as the code no longer needs to record
whether the original string was multibyte. Keeping an unnecessary variable
around would make the code harder to read.

> even after the call to Fstring_make_multibyte, for the
> reasons I explained earlier: the result is not necessarily a multibyte
> string.

That doesn't affect the fact that the 'multibyte' variable is no longer
necessary. In emacs-25, 'multibyte' does not mean that the result is a multibyte
string; it means that the input is a multibyte string. There is no need to keep
track of that in master now, and it simplifies the code to not worry about it.

>> While doing that, I noticed that discarding all the code made this
>> somewhat-tricky area easier to follow.
>> It's not merely that the old multibyte
>> code is unnecessarily long and hard to follow; it's that the old code does
>> something fairly-typical (copy a multibyte character) in an unusual way, which
>> is too likely to lead the reader into incorrectly thinking that there is
>> something actually unusual about the action.

> I don't see why it is tricky, we do that in Emacs in other places.

Really? A call to STRING_CHAR_AND_LENGTH followed by a length test followed by a
call to memcpy for length > 1 and a special case inline copy for length == 1?
When copying multibyte data? Where else does Emacs do that?

> it's more clear for you

Replacing 14 unusually and unnecessarily tricky lines with zero lines should
help clarify things for most readers.

> I could simply revert your commit, it would have saved us both quite
> some time.  Would you prefer that?

It'd be even simpler to leave things alone, as the master code works better than
emacs-25 does. (Merely reverting the commit wouldn't suffice, of course.)

>> This one is not merely a style change. The old code matched \[ even if not
>> followed by ], the new code does not. This is an intended improvement. I plead
>> guilty to the charge that the new code is also shorter and clearer.
>
> Then why is there nothing about this in the log entry?

I didn't think such detail was necessary, since it was a change to undocumented
behavior. If you think it worth mentioning, I can add a NEWS item.

>> Alan wanted something that he could put into his .emacs that would cause
>> (message PERCENTLESS) to output the string PERCENTLESS as-is, assuming
>> PERCENTLESS lacks %. This was the point of his original bug report; his original
>> example involved ` and ' but he wanted the same behavior for ‘ and ’, a point
>> that became clear during the discussion of Bug#23425.
>
> Then why not for '..' as well?  How is that different from ‘..’?

It's not different.
Alan wanted the same behavior for '..', and he got that too.
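
To make the multibyte-copy point concrete, here is a minimal standalone sketch of the two shapes under discussion. It is not the Emacs source: utf8_seq_len is a hypothetical stand-in for what STRING_CHAR_AND_LENGTH reports, it handles only standard UTF-8 (Emacs's internal encoding also has longer sequences), and the function names copy_special_cased and copy_plain are invented for illustration. Both functions are assumed to be given a destination of at least nbytes bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an Emacs macro: byte length of the UTF-8
   sequence introduced by lead byte C.  */
static size_t
utf8_seq_len (unsigned char c)
{
  if (c < 0x80)
    return 1;
  if ((c & 0xE0) == 0xC0)
    return 2;
  if ((c & 0xF0) == 0xE0)
    return 3;
  if ((c & 0xF8) == 0xF0)
    return 4;
  return 1;	/* Stray continuation byte: treat it as one byte.  */
}

/* Style A: measure each character, then branch on its length, with an
   inline copy for one byte and memcpy for anything longer.  */
static size_t
copy_special_cased (char *dst, const char *src, size_t nbytes)
{
  assert (dst != NULL && src != NULL);
  size_t i = 0;
  while (i < nbytes)
    {
      size_t len = utf8_seq_len ((unsigned char) src[i]);
      if (len > nbytes - i)
        len = nbytes - i;	/* Truncated trailing sequence.  */
      if (len == 1)
        dst[i] = src[i];
      else
        memcpy (dst + i, src + i, len);
      i += len;
    }
  return i;
}

/* Style B: copy byte by byte.  A multibyte character needs no special
   case, since its bytes are copied unchanged.  */
static size_t
copy_plain (char *dst, const char *src, size_t nbytes)
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < nbytes; i++)
    dst[i] = src[i];
  return nbytes;
}
```

Under these assumptions the two functions write identical bytes for any input, which is the sense in which the length test and memcpy add nothing to a plain copy.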