From: Eli Zaretskii
Subject: bug#14368: 24.3.50; Big screw: multibyte characters become unibyte
Date: Sun, 12 May 2013 19:04:30 +0300
Message-ID: <83a9o09oc1.fsf@gnu.org>
References: <83r4hda11h.fsf@gnu.org> <83fvxsap1k.fsf@gnu.org>
To: Stefan Monnier, Kenichi Handa
Cc: 14368@debbugs.gnu.org, rms@gnu.org

> Date: Sun, 12 May 2013 05:51:35 +0300
> From: Eli Zaretskii
> Cc: 14368@debbugs.gnu.org
>
> > Date: Sat, 11 May 2013 17:44:27 -0400
> > From: Richard Stallman
> > CC: 14368@debbugs.gnu.org
> >
> > Can you reproduce it starting with "emacs -Q"?
> >
> > Yes. I type
> >
> > emacs -Q
> > C-\ latin-1-postfix RET
> > a ' C-a
> >
> > and it fails
>
> It doesn't fail for me, with yesterday's trunk. C-a just moves to the
> beginning of the line, as expected.
>
> Wait, I can reproduce this in a TTY session (the above was a GUI
> session). I will try to look into it.

I found the reason, but I don't know enough about quail or input
decoding to suggest a solution.

The reason seems to be this changeset:

  112000: Stefan Monnier 2013-03-11
  * src/keyboard.c: Move keyboard decoding to read_key_sequence.

The problem is that we now decode all input that comes from quail
(read_char calls input-method-function, and then read_decoded_char
decodes the result). However, quail seems to work by deleting some
characters from the buffer, and then reinserting them, possibly after
translation, as instructed by the additional characters you type.

In this case, typing "a '" inserts á, and quail then waits for another
character. Typing C-a at this point removes á from the buffer, and
then sends as input 2 events: a self-inserting character whose code is
225 decimal (that's á), followed by the code 1, which is C-a. (I
don't know if this is how quail is supposed to work; what I described
is what I saw in the debugger. Perhaps Handa-san could comment on
that.)

What happens next is that read_decoded_char attempts to decode 225,
which will cause different results depending on the current keyboard
encoding: on GNU/Linux, we get an 8-bit raw byte \341 (that's octal
for 225), while on Windows with cp862 as the keyboard encoding, I get ß.
C-a is executed as expected, but the net result is that á was replaced
by something else.

I'm not sure how to fix this cleanly. One way would be to get quail
to encode the character events it sends, but then we have problems
with un-encodable characters. Another way would be to somehow detect
that the character comes from quail and refrain from decoding it,
although I always thought that one of the goals of revision 112000 was
precisely to _allow_ decoding characters coming from quail.

Stefan, can you take a look, please?
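
P.S. For illustration only (this is not Emacs code): a minimal Python
sketch of why decoding the already-decoded code 225 a second time gives
a result that depends on the keyboard encoding. The helper name
`redecode` and the list of codecs are assumptions made for this
example, and Python's `surrogateescape` handler is used only as a rough
stand-in for the 8-bit raw byte Emacs produces when a byte cannot be
decoded.

```python
def redecode(char_code: int, keyboard_coding: str) -> str:
    """Treat an already-decoded character code as one raw byte and decode it again.

    Hypothetical helper: it mimics the double decoding described above,
    not the actual logic in src/keyboard.c.
    """
    raw = bytes([char_code])
    # surrogateescape keeps an undecodable byte as a lone surrogate instead
    # of raising; this approximates an Emacs raw 8-bit byte.
    return raw.decode(keyboard_coding, errors="surrogateescape")


code = ord("á")  # 225 decimal, 341 octal: the event quail re-sends

results = {name: redecode(code, name) for name in ("latin-1", "cp862", "utf-8")}

for name, text in results.items():
    # ascii() avoids trying to print a lone surrogate to the terminal
    print(f"{name:8} -> {ascii(text)}")
```

Only a keyboard encoding that happens to map byte 225 back to á (such
as latin-1) round-trips; cp862 maps it to ß, and in UTF-8 a lone byte
225 is not a valid sequence at all.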