* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte @ 2013-05-08 8:21 Richard Stallman 2013-05-11 16:49 ` Richard Stallman 2013-05-24 14:51 ` Handa Kenichi 0 siblings, 2 replies; 15+ messages in thread From: Richard Stallman @ 2013-05-08 8:21 UTC (permalink / raw) To: 14368 When I use latin-1-postfix to enter characters such as i-with-acute-accent and inverse-?, as soon as I type another character they turn into Latin-1 single byte codes. For instance, when I type i and ', the input method turns that into i-with-acute-accent; but then my next keystroke turns the i-with-acute-accent into \355. This is horrible! I can't find what is doing it. It seems to happen no matter what the next character is -- even M-x does it. But I can't see this on post-command-hook. In GNU Emacs 24.3.50.1 (mips64el-unknown-linux-gnu, GTK+ Version 2.20.1) of 2013-05-01 on chiefs-gnewsense Bzr revision: 112434 juri@jurta.org-20130501081012-n3c351r92cr17lu5 System Description: Debian GNU/Linux 6.0.6 (squeeze) Configured using: `configure CFLAGS=-g -O0' Important settings: value of $LANG: en_US.UTF-8 locale-coding-system: utf-8-unix default enable-multibyte-characters: t Major mode: Dired by date Minor modes in effect: shell-dirtrack-mode: t gpm-mouse-mode: t tooltip-mode: t mouse-wheel-mode: t tool-bar-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t buffer-read-only: t line-number-mode: t transient-mark-mode: t abbrev-mode: t Recent input: n . C-c C-c d d d x C-x b o u t g TAB RET g C-u C-p C-p C-p C-p C-p C-o C-n C-o C-x o C-x o d x y e s RET C-x b R TAB RET C-x 1 d u d d d d u d x 3 0 p d d d x n x C-a C-u C-u C-n C-u C-n C-n C-n C-n C-@ C-n C-n C-n r ESC , RET P u e d e SPC s e r SPC e l SPC 4 ? C-a ? / DEL DEL C-\ ? 
/ C-e C-c C-c C-n C-@ C-n C-n ESC w C-x b o u t g TAB RET g C-p C-p C-p C-o C-x o ESC > RET RET ESC , RET C-o S e r i ' DEL DEL C-\ i ' a SPC p o s i b l e SPC a d n DEL m i t i r SPC a SPC o t r o s SPC s i n SPC ESC DEL g r a t u i t a m e n t e SPC y SPC n o SPC e n t r e g a r l e s SPC n a d a ? C-a ? / C-c C-c C-g C-x C-s C-g ESC x t o g g l e SPC e n a TAB RET ESC x ESC p RET C-x C-s RET C-x k RET C-x o e C-x 1 C-u C-u C-n C-n C-n C-d C-x C-s C-x k RET ESC x b u g SPC g n u SPC e m a DEL DEL DEL ESC DEL ESC DEL r e p o r t SPC e m a c s SPC b u g RET Recent messages: Wrote /home/rms/outgoing/out-56 Sending...done Mark set [3 times] Quit Saving file /home/rms/outgoing/out-56... Quit Saving file /home/rms/outgoing/out-56... Wrote /home/rms/outgoing/out-56 Saving file /home/rms/outgoing/out-56... Wrote /home/rms/outgoing/out-56 Load-path shadows: None found. Features: (shadow emacsbug quail cal-move cal-menu calendar cal-loaddefs dired-aux rmailsum grep compile parse-time vc-cvs sgml-mode shell pcomplete comint ansi-color ring mule-util qp help-mode rmailout misearch multi-isearch dabbrev mailalias rmailmm message sendmail format-spec rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode mailabbrev gmm-utils mailheader mail-parse rfc2231 dired t-mouse time-date rmailedit rmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils paren cus-start cus-load nadvice advice help-fns tooltip ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment lisp-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env 
code-pages mule custom widget hashtable-print-readable backquote make-network-process dbusbind inotify dynamic-setting system-font-setting font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs) -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-08 8:21 bug#14368: 24.3.50; Big screw: multibyte characters become unibyte Richard Stallman @ 2013-05-11 16:49 ` Richard Stallman 2013-05-11 17:17 ` Eli Zaretskii 2013-05-24 14:51 ` Handa Kenichi 1 sibling, 1 reply; 15+ messages in thread From: Richard Stallman @ 2013-05-11 16:49 UTC (permalink / raw) To: 14368 I did some debugging and found that the mistaken replacement of multibyte characters with unibyte occurs in quail-start-translation. However, the bug is probably not in quail.el, because quail.el has not changed since the start of the year, and the last change log entry was long before that. Can someone who understands quail please investigate this bug? -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-11 16:49 ` Richard Stallman @ 2013-05-11 17:17 ` Eli Zaretskii 2013-05-11 21:44 ` Richard Stallman 0 siblings, 1 reply; 15+ messages in thread From: Eli Zaretskii @ 2013-05-11 17:17 UTC (permalink / raw) To: rms; +Cc: 14368 > Date: Sat, 11 May 2013 12:49:07 -0400 > From: Richard Stallman <rms@gnu.org> > > I did some debugging and found that the mistaken > replacement of multibyte characters with unibyte > occurs in quail-start-translation. > > However, the bug is probably not in quail.el, because quail.el has not > changed since the start of the year, and the last change log entry was > long before that. > > Can someone who understands quail please investigate this bug? Can you reproduce it starting with "emacs -Q"? I tried, but couldn't. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-11 17:17 ` Eli Zaretskii @ 2013-05-11 21:44 ` Richard Stallman 2013-05-12 2:51 ` Eli Zaretskii 0 siblings, 1 reply; 15+ messages in thread From: Richard Stallman @ 2013-05-11 21:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 14368 Can you reproduce it starting with "emacs -Q"? Yes. I type emacs -Q C-\ latin-1-postfix RET a ' C-a and it fails -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-11 21:44 ` Richard Stallman @ 2013-05-12 2:51 ` Eli Zaretskii 2013-05-12 16:04 ` Eli Zaretskii 0 siblings, 1 reply; 15+ messages in thread From: Eli Zaretskii @ 2013-05-12 2:51 UTC (permalink / raw) To: rms; +Cc: 14368 > Date: Sat, 11 May 2013 17:44:27 -0400 > From: Richard Stallman <rms@gnu.org> > CC: 14368@debbugs.gnu.org > > Can you reproduce it starting with "emacs -Q"? > > Yes. I type > > emacs -Q > C-\ latin-1-postfix RET > a ' C-a > > and it fails It doesn't fail for me, with yesterday's trunk. C-a just moves to the beginning of the line, as expected. Wait, I can reproduce this in a TTY session (the above was a GUI session). I will try to look into it. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-12 2:51 ` Eli Zaretskii @ 2013-05-12 16:04 ` Eli Zaretskii 2013-05-13 15:50 ` Stefan Monnier 2013-05-23 17:28 ` Stefan Monnier 0 siblings, 2 replies; 15+ messages in thread From: Eli Zaretskii @ 2013-05-12 16:04 UTC (permalink / raw) To: Stefan Monnier, Kenichi Handa; +Cc: 14368, rms > Date: Sun, 12 May 2013 05:51:35 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: 14368@debbugs.gnu.org > > > Date: Sat, 11 May 2013 17:44:27 -0400 > > From: Richard Stallman <rms@gnu.org> > > CC: 14368@debbugs.gnu.org > > > > Can you reproduce it starting with "emacs -Q"? > > > > Yes. I type > > > > emacs -Q > > C-\ latin-1-postfix RET > > a ' C-a > > > > and it fails > > It doesn't fail for me, with yesterday's trunk. C-a just moves to the > beginning of the line, as expected. > > Wait, I can reproduce this in a TTY session (the above was a GUI > session). I will try to look into it. I found the reason, but I don't know enough about quail or input decoding to suggest a solution. The reason seems to be this changeset: 112000: Stefan Monnier 2013-03-11 * src/keyboard.c: Move keyboard decoding to read_key_sequence. The problem is that we now decode all input that comes from quail (read_char calls input-method-function, and then read_decoded_char decodes the result). However, quail seems to work by deleting some characters from the buffer, and then reinserting them, possibly after translation, as instructed by the additional characters you type. In this case, typing "a '" inserts á, and quail then waits for another character. Typing C-a at this point removes á from the buffer, and then sends as input 2 events: a self-inserting character whose code is 225 decimal (that's á), followed by the code 1, which is C-a. (I don't know if this is how quail is supposed to work; what I described is what I saw in the debugger. Perhaps Handa-san could comment on that.) 
What happens next is that read_decoded_char attempts to decode 225, which will cause different results depending on the current keyboard encoding: on GNU/Linux, we get an 8-bit raw byte \341 (that's octal for 225), while on Windows with cp862 as the keyboard encoding, I get ß. C-a is executed as expected, but the net result is that á was replaced by something else. I'm not sure how to fix this cleanly. One way would be to get quail to encode the character events it sends, but then we have problems with un-encodable characters. Another way would be to somehow detect that the character comes from quail and refrain from decoding it, although I always thought that one of the goals of revision 112000 was precisely to _allow_ decoding characters coming from quail. Stefan, can you take a look, please? ^ permalink raw reply [flat|nested] 15+ messages in thread
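The symptom described in the analysis above can be imitated outside Emacs. The sketch below is only an illustration in Python, not the code path in src/keyboard.c: the helper name `misdecode` is hypothetical, and the fallback branch merely mimics how Emacs displays an undecodable raw byte as an octal escape. It takes a character code that is already decoded, treats it as a single byte, and decodes it again with the keyboard coding system.

```python
# Illustration only: imitates the double-decoding symptom in plain Python.
# This is NOT the Emacs implementation; `misdecode` is a hypothetical helper.

def misdecode(code_point: int, keyboard_encoding: str) -> str:
    """Treat an already-decoded character code as one raw byte and decode it again."""
    raw = bytes([code_point])  # only meaningful for codes below 256
    try:
        return raw.decode(keyboard_encoding)
    except UnicodeDecodeError:
        # Assumption for display purposes: show the undecodable byte the way
        # Emacs shows a raw 8-bit byte, as a backslash followed by octal digits.
        return "\\%o" % code_point


a_acute = ord("á")
print(a_acute)                      # 225
print(misdecode(a_acute, "utf-8"))  # \341
print(misdecode(a_acute, "cp862"))  # ß
```

Under a UTF-8 keyboard encoding the lone byte 225 is not a valid sequence, which corresponds to the \341 seen on GNU/Linux (and to the \355 for i-with-acute-accent in the original report); under cp862 the same byte maps to ß, as observed on Windows.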
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-12 16:04 ` Eli Zaretskii @ 2013-05-13 15:50 ` Stefan Monnier 2013-05-23 17:28 ` Stefan Monnier 1 sibling, 0 replies; 15+ messages in thread From: Stefan Monnier @ 2013-05-13 15:50 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 14368, rms > The reason seems to be this changeset: > 112000: Stefan Monnier 2013-03-11 * src/keyboard.c: Move keyboard > decoding to read_key_sequence. > The problem is that we now decode all input that comes from quail > (read_char calls input-method-function, and then read_decoded_char > decodes the result). [...] > Stefan, can you take a look, please? Your analysis makes a lot of sense. I'll take a look as soon as I can, but this week is pretty busy. Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-12 16:04 ` Eli Zaretskii 2013-05-13 15:50 ` Stefan Monnier @ 2013-05-23 17:28 ` Stefan Monnier 2013-05-23 18:55 ` Eli Zaretskii 1 sibling, 1 reply; 15+ messages in thread From: Stefan Monnier @ 2013-05-23 17:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 14368, rms > I found the reason, but I don't know enough about quail or input > decoding to suggest a solution. I just installed a patch which should fix it. During development, I got some weird behavior and a few crashes, but I was neither able to track them down, nor to reproduce them now, so I installed the code as is. Let's hope for the best. Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-23 17:28 ` Stefan Monnier @ 2013-05-23 18:55 ` Eli Zaretskii 0 siblings, 0 replies; 15+ messages in thread From: Eli Zaretskii @ 2013-05-23 18:55 UTC (permalink / raw) To: Stefan Monnier; +Cc: 14368, rms > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Kenichi Handa <handa@gnu.org>, rms@gnu.org, 14368@debbugs.gnu.org > Date: Thu, 23 May 2013 13:28:01 -0400 > > > I found the reason, but I don't know enough about quail or input > > decoding to suggest a solution. > > I just installed a patch which should fix it. Thanks, it seems to work well for me, both on Windows and on GNU/Linux. > During development, I got some weird behavior and a few crashes, but > I was neither able to track them down, nor to reproduce them now, so > I installed the code as is. Let's hope for the best. Didn't crash for me. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-08 8:21 bug#14368: 24.3.50; Big screw: multibyte characters become unibyte Richard Stallman 2013-05-11 16:49 ` Richard Stallman @ 2013-05-24 14:51 ` Handa Kenichi 2013-05-24 15:31 ` Eli Zaretskii 2013-05-24 15:34 ` bug#14368: 24.3.50; Big screw: multibyte characters become unibyte Stefan Monnier 1 sibling, 2 replies; 15+ messages in thread From: Handa Kenichi @ 2013-05-24 14:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 14368, rms I'm very sorry for the late response on this matter. In article <83a9o09oc1.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > However, quail seems to work by deleting some characters from the > buffer, and then reinserting them, possibly after translation, as > instructed by the additional characters you type. In this case, > typing "a '" inserts á, and quail then waits for another character. > Typing C-a at this point removes á from the buffer, and then sends as > input 2 events: a self-inserting character whose code is 225 decimal > (that's á), followed by the code 1, which is C-a. (I don't know if > this is how quail is supposed to work; what I described is what I saw > in the debugger. Perhaps Handa-san could comment on that.) Your analysis is correct. Quail is an event translator. It is designed not to insert a character directly but to generate proper character events. > I'm not sure how to fix this cleanly. One way would be to get quail > to encode the character events it sends, but then we have problems > with un-encodable characters. That is a possible way, but I don't think it is the right thing. Making quail encode characters and then making the caller re-decode them looks very silly. > Another way would be to somehow detect > that the character comes from quail and refrain from decoding it, It's not only a quail problem. 
Currently the handling of unread-command-events is broken; this does not work correctly on a terminal: (setq unread-command-events '(?À)) --- Kenichi Handa handa@gnu.org ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-24 14:51 ` Handa Kenichi @ 2013-05-24 15:31 ` Eli Zaretskii 2013-05-25 1:05 ` bug#14368: 24.3.50; Big screw: multibyte characters become Handa Kenichi 2013-05-24 15:34 ` bug#14368: 24.3.50; Big screw: multibyte characters become unibyte Stefan Monnier 1 sibling, 1 reply; 15+ messages in thread From: Eli Zaretskii @ 2013-05-24 15:31 UTC (permalink / raw) To: Handa Kenichi; +Cc: 14368, rms > From: Handa Kenichi <handa@gnu.org> > Cc: monnier@iro.umontreal.ca, rms@gnu.org, 14368@debbugs.gnu.org > Date: Fri, 24 May 2013 10:51:20 -0400 > > Currently the handling of unread-command-events is broken; this does > not work correctly on terminal > > (setq unread-command-events '(?À)) Did you try with the latest trunk? If so, please explain what doesn't work with this, because it seems to work for me after Stefan's changes yesterday (I get À inserted). ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become 2013-05-24 15:31 ` Eli Zaretskii @ 2013-05-25 1:05 ` Handa Kenichi 2013-05-25 19:22 ` Richard Stallman 0 siblings, 1 reply; 15+ messages in thread From: Handa Kenichi @ 2013-05-25 1:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 14368, rms In article <838v34s8cj.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > Currently the handling of unread-command-events is broken; this does > > not work correctly on terminal > > > > (setq unread-command-events '(?À)) > Did you try with the latest trunk? No. I wrote that before reading Stefan's mail. I've just tried the latest code and confirmed that it was fixed. Thank you. --- Kenichi Handa handa@gnu.org ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become 2013-05-25 1:05 ` bug#14368: 24.3.50; Big screw: multibyte characters become Handa Kenichi @ 2013-05-25 19:22 ` Richard Stallman 2013-05-26 0:58 ` Stefan Monnier 0 siblings, 1 reply; 15+ messages in thread From: Richard Stallman @ 2013-05-25 19:22 UTC (permalink / raw) To: Handa Kenichi; +Cc: 14368 I too observe the bug to be fixed now. Thanks for fixing it. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become 2013-05-25 19:22 ` Richard Stallman @ 2013-05-26 0:58 ` Stefan Monnier 0 siblings, 0 replies; 15+ messages in thread From: Stefan Monnier @ 2013-05-26 0:58 UTC (permalink / raw) To: Richard Stallman; +Cc: 14368-done > I too observe the bug to be fixed now. Thanks for fixing it. Thanks for confirming, Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#14368: 24.3.50; Big screw: multibyte characters become unibyte 2013-05-24 14:51 ` Handa Kenichi 2013-05-24 15:31 ` Eli Zaretskii @ 2013-05-24 15:34 ` Stefan Monnier 1 sibling, 0 replies; 15+ messages in thread From: Stefan Monnier @ 2013-05-24 15:34 UTC (permalink / raw) To: Handa Kenichi; +Cc: 14368, rms > It's not only the quail problem. Currently the handling of > unread-command-events is broken; this does not work correctly on > terminal > (setq unread-command-events '(?À)) It should work now, with the patch I installed yesterday, Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread