From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#15984: 24.3; Problem with combining characters in attachment filename
Date: Sat, 30 Nov 2013 15:20:13 +0200
Message-ID: <83siue58mq.fsf@gnu.org>
References: <83iovc8eaq.fsf@gnu.org> <83a9gn8yoz.fsf@gnu.org> <831u1z8twg.fsf@gnu.org> <83r49z78jp.fsf@gnu.org> <83a9gn6v2f.fsf@gnu.org>
In-reply-to: <83a9gn6v2f.fsf@gnu.org>
To: Kenichi Handa
Cc: 15984@debbugs.gnu.org, nisse@lysator.liu.se

> From: Eli Zaretskii
> Cc: 15984@debbugs.gnu.org
>
> > From: nisse@lysator.liu.se (Niels Möller)
> > Cc: 15984@debbugs.gnu.org
> > Date: Fri, 29 Nov 2013 13:41:01 +0100
> >
> > $ rm -rf ~/tmp/home/ && mkdir ~/tmp/home/ && HOME=$HOME/tmp/home emacs -nw -Q -l bug.el
> >
> > where bug.el contains
> >
> > (setq gnus-init-file nil)
> > (setq gnus-nntp-server nil)
> > (gnus-no-server)
> >
> > Then create the group with G d, pointing out the spool-like directory,
> > enter the group (RET), view the message (RET), try to write out the
> > attachment ("o" on the attachment button). Still crashes for me.
>
> It crashes in the current development trunk as well, but only if the
> locale is set to Latin-1, like yours.
>
> I'm looking at this.

There's something strange going on here; I'm CC'ing Handa-san, because
the problem is related to processing character compositions on a TTY.

The reason for the crash is simple: the following code from
indent.c:scan_for_column

      /* Check composition sequence.  */
      if (cmp_it.id >= 0
          || (scan == cmp_it.stop_pos
              && composition_reseat_it (&cmp_it, scan, scan_byte, end,
                                        w, NULL, Qnil)))
        composition_update_it (&cmp_it, scan, scan_byte, Qnil);
      if (cmp_it.id >= 0)
        {
          scan += cmp_it.nchars;
          scan_byte += cmp_it.nbytes;
          if (scan <= end)
            col += cmp_it.width;
          if (cmp_it.to == cmp_it.nglyphs)
            {
              cmp_it.id = -1;
              composition_compute_stop_pos (&cmp_it, scan, scan_byte, end,
                                            Qnil);
            }
          else
            cmp_it.from = cmp_it.to;
          continue;
        }

incorrectly steps into the middle of a multibyte sequence #xCC #x88
for the character U+0308, the Combining Diaeresis, because
cmp_it.nbytes is computed as 1 instead of 2.  The question is why it
does so.

From stepping through composition_reseat_it and composition_update_it,
it looks like the code contradicts itself: it thinks that 'a' and the
combining diaeresis should be composed, but then acts as if no
composition should happen.  As a result, this code in
composition_update_it:

      glyph = LGSTRING_GLYPH (gstring, cmp_it->from);
      cmp_it->nchars = LGLYPH_TO (glyph) + 1 - from;
      cmp_it->nbytes = 0;
      cmp_it->width = 0;
      for (i = cmp_it->nchars - 1; i >= 0; i--)
        {
          c = XINT (LGSTRING_CHAR (gstring, i));
          cmp_it->nbytes += CHAR_BYTES (c);
          cmp_it->width += CHAR_WIDTH (c);
        }

always considers only 'a', never the diaeresis, and so cmp_it->nbytes
is always computed as 1.  So scan_for_column advances only 1 byte,
instead of 2, and finds itself in the middle of a multibyte sequence.
From there, it's a sure way to a crash.

I hope Handa-san will be able to find the problem.  The crash is 100%
reproducible with the steps described above and a mail message that
Niels can send you off-list.

TIA