From: Paul Eggert
Newsgroups: gmane.emacs.bugs
Subject: bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault
Date: Sun, 14 Aug 2016 19:04:42 -0700
Organization: UCLA Computer Science Department
References: <8337m7h1dp.fsf@gnu.org> <83zioffew5.fsf@gnu.org>
Cc: p.stephani2@gmail.com, johnw@gnu.org, nicolas@petton.fr, 24206@debbugs.gnu.org
To: Eli Zaretskii
In-Reply-To: <83zioffew5.fsf@gnu.org>

Eli Zaretskii wrote:

> Its multibyteness is entirely in Emacs's imagination.

Sure, but Emacs should not substitute "\342\200\230" for "`". The point of
text-quoting-style is to substitute quotes, not byte string encodings of
quotes.

>> > More generally, Fsubstitute_command_keys is quite confused about unibyte
>> > versus multibyte issues. It merges together a number of strings, and
>> > assumes that they are all multibyte iff the original string is
>> > multibyte, which is obviously not true in general.

> Could you please point out the specific places where this is done?

OK, here's a contrived example. Run this code in emacs-25:

    (progn
      (setq km (make-keymap))
      (define-key km "≠" 'global-set-key)
      (substitute-command-keys "\200\\<km>\\[global-set-key]"))

This should return a 2-character string equal to "\200≠". But in Emacs 25 it
dumps core, at least on my platform (Fedora 23 x86-64). And in Emacs 24 on my
platform it returns a malformed string that prints as "\242\1340" but has
length 2. I suppose we could make Emacs 24 dump core too, though I haven't
tried hard to do that.

The problem is that the older Emacs code incorrectly assumes that the output
of substitution must be properly-encoded if the substitution changes
something. This assumption can fail if the input is unibyte and contains
bytes that are not properly-encoded for UTF-8. (There are other ways the
assumption can fail.)
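
To make the failing assumption concrete outside Emacs, here is a rough Python
analogy. It is not the Emacs code: the names RAW, KEY and is_valid_utf8 are
made up for illustration, and Python's "surrogateescape" handler only loosely
stands in for the way Emacs represents a raw byte as a single character. The
sketch merges a lone byte with the UTF-8 encoding of "≠" and shows that the
merged bytes are not valid UTF-8, even though one of the pieces is.

```python
# Illustrative analogy only; none of these names come from Emacs.
RAW = b"\x80"              # a lone byte, like "\200" in the unibyte input
KEY = "≠".encode("utf-8")  # b"\xe2\x89\xa0", like the text of the key binding


def is_valid_utf8(data: bytes) -> bool:
    """Return True if DATA decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Naive merge: glue the byte sequences together and hope the result is
# properly encoded because a substitution happened.
merged = RAW + KEY

# A merge that tracks the stray byte explicitly: surrogateescape maps each
# undecodable byte to one placeholder code point, so the result has one
# element for the raw byte and one for "≠".
chars = merged.decode("utf-8", errors="surrogateescape")

print(is_valid_utf8(KEY))     # the substituted piece is fine on its own
print(is_valid_utf8(merged))  # the merged output is not valid UTF-8
print(len(merged), len(chars))
```

The point of the sketch is only that "something was substituted" says nothing
about whether the whole output is well-formed; the stray byte has to be
carried through as its own character, which is what the expected 2-character
result above amounts to.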