From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault
Date: Wed, 17 Aug 2016 21:06:42 +0300
Message-ID: <83tweje0ct.fsf@gnu.org>
References: <8337m7h1dp.fsf@gnu.org> <83zioffew5.fsf@gnu.org> <83popaf1yz.fsf@gnu.org> <87bn0u3rqc.fsf@linux-m68k.org> <83mvkdg91i.fsf@gnu.org> <8b78f23f-4a4f-e568-b760-3350ca7bb8d3@cs.ucla.edu> <83d1l8g3zs.fsf@gnu.org> <4822bfeb-c507-a9ff-93bc-1d27ba93b9d7@cs.ucla.edu> <8360qzfmz3.fsf@gnu.org> <11031c1e-c784-0ba2-4b6c-4fab0cb92354@cs.ucla.edu>
Reply-To: Eli Zaretskii
To: Paul Eggert
Cc: johnw@gnu.org, 24206@debbugs.gnu.org
In-reply-to: <11031c1e-c784-0ba2-4b6c-4fab0cb92354@cs.ucla.edu> (message from
Paul Eggert on Wed, 17 Aug 2016 10:41:52 -0700)

> Cc: johnw@gnu.org, 24206@debbugs.gnu.org
> From: Paul Eggert
> Date: Wed, 17 Aug 2016 10:41:52 -0700
>
> > -          if (multibyte)
> > -            {
> > -              int len;
> > -
> > -              STRING_CHAR_AND_LENGTH (strp, len);
> > -              if (len == 1)
> > -                *bufp = *strp;
> > -              else
> > -                memcpy (bufp, strp, len);
> > -              strp += len;
> > -              bufp += len;
> > -              nchars++;
> > -            }
> > -          else
> > -            *bufp++ = *strp++, nchars++;
> > +          /* Fall through to copy one char. */
>
> Some change in this area was needed because the 'multibyte' flag went away.

Only because you removed it.  You could have left it alone, it would
have worked even after the call to Fstring_make_multibyte, for the
reasons I explained earlier: the result is not necessarily a multibyte
string.

> While doing that, I noticed that discarding all the code made this
> somewhat-tricky area easier to follow. It's not merely that the old multibyte
> code is unnecessarily long and hard to follow; it's that the old code does
> something fairly-typical (copy a multibyte character) in an unusual way, which
> is too likely to lead the reader into incorrectly thinking that there is
> something actually unusual about the action. Misleading code like this really
> cries out to be rewritten, particularly if the rewriting simply involves
> deleting it.

I don't see why it is tricky, we do that in Emacs in other places.
It's pretty boilerplate.

> In short, the main motivation here was clarity, not merely style.

That's exactly my point: it's more clear for you, but that alone is
not a good enough reason to make such changes in code that worked for
many years.

> (I hope I don't have to go into such details to defend every code change I
> install! I'm finding it difficult-enough now to find time to improve Emacs.)

I could simply revert your commit, it would have saved us both quite
some time.  Would you prefer that?

> This one is not merely a style change. The old code matched \[ even if not
> followed by ], the new code does not. This is an intended improvement. I plead
> guilty to the charge that the new code is also shorter and clearer.

Then why is there nothing about this in the log entry?

> > -          /* Note the Fwhere_is_internal can GC, so we have to take
> > -             relocation of string contents into account. */
> > -          strp = SDATA (string) + idx;
> > -          start = SDATA (string) + start_idx;
> > +          /* Take relocation of string contents into account. */
> > +          strp = SDATA (str) + idx;
> > +          start = strp - length_byte - 1;
>
> The new comment came because I copied it from somewhere else in the interest of
> consistency. You're right, I omitted some commentary in the process. I thought
> the omitted info obvious, but evidently you think otherwise. It's obviously no
> big deal, so I brought it back by applying the attached patch to master.

Thanks.

> > Unlike at that time, I now think
> > this was a bad move, because Emacs 25.1 will have the disabled
> > conversion in it, so by the time we release the code in master, it
> > would be an incompatible change.
>
> If that's the main objection, then let's change Emacs 25 to behave similarly.
> This would be a simple and conservative change to Emacs 25.
> But even if you don't want to change Emacs 25 (and thus you want Emacs 25 to
> continue to be less-compatible with Emacs 24), it's OK to change this minor
> detail back to the way Emacs 24 does things.

Alas, it's too late to change Emacs 25.1.

> > (I also don't see how it is related to the
> > original bug report, which AFAIU was about (message "`foo'") that
> > still behaves as in the bug report.)
>
> Alan wanted something that he could put into his .emacs that would cause
> (message PERCENTLESS) to output the string PERCENTLESS as-is, assuming
> PERCENTLESS lacks %. This was the point of his original bug report; his original
> example involved ` and ' but he wanted the same behavior for ‘ and ’, a point
> that became clear during the discussion of Bug#23425.

Then why not for '..' as well?  How is that different from ‘..’?  What
we have now on master is inconsistent and cannot be defended, IMO.

> > (Mumbles something about Emacs maintenance being a lonely business...)
>
> But we have all these nice conversations! :-)

Oh yes, what a relief!
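
For readers following the "copy one multibyte character" hunk quoted
near the top of this message, here is a minimal standalone sketch of
that step.  It is not the Emacs code and not the committed patch: it
assumes plain UTF-8 (the Emacs internal representation is an extension
of UTF-8), and the names utf8_len and copy_one_char are hypothetical,
chosen only for illustration in place of STRING_CHAR_AND_LENGTH
followed by memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte length of the UTF-8 sequence whose lead
   byte is C.  Returns 1 for ASCII and for bytes that are not a valid
   lead byte, so that a caller always makes progress.  */
static size_t
utf8_len (unsigned char c)
{
  if (c < 0x80)
    return 1;
  if ((c & 0xE0) == 0xC0)
    return 2;
  if ((c & 0xF0) == 0xE0)
    return 3;
  if ((c & 0xF8) == 0xF0)
    return 4;
  return 1;
}

/* Hypothetical helper: copy one character from *SRCP to *DSTP and
   advance both pointers.  SRC_END points one past the last source
   byte; a truncated sequence is copied only up to SRC_END.  Returns
   the number of bytes copied (0 if the source is exhausted).  The
   caller must provide enough room at *DSTP.  */
static size_t
copy_one_char (unsigned char **dstp, const unsigned char **srcp,
               const unsigned char *src_end)
{
  size_t avail = (size_t) (src_end - *srcp);
  if (avail == 0)
    return 0;

  size_t len = utf8_len (**srcp);
  if (len > avail)
    len = avail;

  /* A single memcpy covers the one-byte case too, so no separate
     "len == 1" branch is needed.  */
  memcpy (*dstp, *srcp, len);
  *dstp += len;
  *srcp += len;
  return len;
}
```

In this sketch an ASCII byte and a three-byte sequence such as the
curly quote ‘ (bytes E2 80 98) go through the same path; only the
length differs.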