* bug#4037: Characters garbled in self-insert-command
@ 2009-08-04 19:27 ` Juri Linkov
  2009-08-28  8:55   ` bug#4037: marked as done (Characters garbled in self-insert-command) Emacs bug Tracking System
  0 siblings, 1 reply; 9+ messages in thread
From: Juri Linkov @ 2009-08-04 19:27 UTC
To: emacs-pretest-bug

I just noticed a regression against Emacs 22.

In GNU Emacs 23.1.50 (x86_64-pc-linux-gnu) typing

  C-u 5 C-x 8 ' a

inserts into the current buffer

  á\341\341\341á

whereas in GNU Emacs 22.1.1 typing the same correctly inserts

  ááááá

The command `self-insert-command' in Emacs 23 inserts the first and
the last characters without any modifications, but applies the
following conversion for the remaining characters:

      /* Add the offset to the character, for Finsert_char.
         We pass internal_self_insert the unmodified character
         because it itself does this offsetting. */
      if (! NILP (current_buffer->enable_multibyte_characters))
        modified_char = unibyte_char_to_multibyte (modified_char);

Commenting out the above 2 lines produces the correct result.
However, I'm not sure what is the right fix.

--
Juri Linkov
http://www.jurta.org/emacs/
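The pattern in the report above (first and last character intact, the ones in between converted) can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the code from src/cmds.c: the names `model_unibyte_to_multibyte` and `model_repeat_insert` are hypothetical, and the `0x3FFF00` raw-byte offset is an assumption about Emacs 23's internal character space that the report itself does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: offset at which raw bytes 0x80..0xFF are placed in the
   multibyte character space.  Not quoted anywhere in the report.  */
#define MODEL_RAW_BYTE_OFFSET 0x3FFF00

/* Hypothetical stand-in for unibyte_char_to_multibyte: any code in
   0x80..0xFF is treated as a raw byte, not as a Latin-1 character.  */
static int
model_unibyte_to_multibyte (int c)
{
  return (c >= 0x80 && c <= 0xFF) ? c + MODEL_RAW_BYTE_OFFSET : c;
}

/* Model of a repeated self-insert of character C, N times.  The first
   and last characters are stored unmodified; the N - 2 characters in
   between go through the conversion when APPLY_CONVERSION is nonzero.
   OUT must have room for N entries.  Returns the number written.  */
static size_t
model_repeat_insert (int c, size_t n, int apply_conversion, int *out)
{
  assert (out != NULL);
  int middle = apply_conversion ? model_unibyte_to_multibyte (c) : c;
  for (size_t i = 0; i < n; i++)
    out[i] = (n >= 2 && i > 0 && i < n - 1) ? middle : c;
  return n;
}
```

Calling `model_repeat_insert (0xE1, 5, 1, out)` leaves `0xE1` (á) in the first and last slots and the raw-byte code in the three middle slots, which corresponds to the `á\341\341\341á` shape in the report. Passing 0 for `apply_conversion` models the effect of commenting out the two lines.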
* bug#4037: marked as done (Characters garbled in self-insert-command)
  2009-08-04 19:27 ` bug#4037: Characters garbled in self-insert-command Juri Linkov
@ 2009-08-28  8:55   ` Emacs bug Tracking System
  0 siblings, 0 replies; 9+ messages in thread
From: Emacs bug Tracking System @ 2009-08-28  8:55 UTC
To: Eli Zaretskii

[-- Attachment #1: Type: text/plain, Size: 880 bytes --]

Your message dated Fri, 28 Aug 2009 11:52:21 +0300
with message-id <83fxbccoca.fsf@gnu.org>
and subject line Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
has caused the Emacs bug report #4037,
regarding Characters garbled in self-insert-command
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact
owner@emacsbugs.donarmstrong.com immediately.)

--
4037: http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037
Emacs Bug Tracking System
Contact owner@emacsbugs.donarmstrong.com with problems

[-- Attachment #2: Type: message/rfc822, Size: 2934 bytes --]

From: Juri Linkov <juri@jurta.org>
To: emacs-pretest-bug@gnu.org
Subject: Characters garbled in self-insert-command
Date: Tue, 04 Aug 2009 22:27:42 +0300
Message-ID: <87ws5jh0ql.fsf@mail.jurta.org>

[-- Attachment #3: Type: message/rfc822, Size: 2732 bytes --]

From: Eli Zaretskii <eliz@gnu.org>
To: Kenichi Handa <handa@m17n.org>
Cc: monnier@iro.umontreal.ca, 4240-done@emacsbugs.donarmstrong.com, deniz.a.m.dogan@gmail.com, 4037-done@emacsbugs.donarmstrong.com
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Fri, 28 Aug 2009 11:52:21 +0300
Message-ID: <83fxbccoca.fsf@gnu.org>

> From: Kenichi Handa <handa@m17n.org>
> Cc: eliz@gnu.org, 4240@emacsbugs.donarmstrong.com, deniz.a.m.dogan@gmail.com
> Date: Thu, 27 Aug 2009 15:23:25 +0900
>
> In article <jwvocq14zlk.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
> >>> > Please see bug#4037:
> >>> > http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037
> >>> > I received no confirmation that my proposed fix is correct.
> >>> I think those two lines are not necessary anymore and should be
> >>> removed (together with the comments which explain their need). I
> >>> think they belong to the old pre-unicode days when raw eight-bit
> >>> characters needed such special treatment.
>
> > I believe you're right. Nowadays, the keyboard-decoding should always
> > take place before we get to that point.
>
> Sorry for the late response on this matter. Yes, that
> unibyte->multibyte conversion is not necessary. I've just
> installed a fix.

Thanks. I'm closing the two related bug reports.
* bug#4240: 23.1.50; C-u doesn't work with Swedish characters
@ 2009-08-23 13:28 ` Deniz Dogan
  2009-08-23 18:54   ` Juri Linkov
  2009-08-28  8:55   ` bug#4240: marked as done (23.1.50; C-u doesn't work with Swedish characters) Emacs bug Tracking System
  0 siblings, 2 replies; 9+ messages in thread
From: Deniz Dogan @ 2009-08-23 13:28 UTC
To: emacs-pretest-bug

Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
fourteen "\344" and then a trailing "ä". This happens no matter which
kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
always the leading and the trailing character that come out right, all
of the other ones are "broken".

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
    `bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/home/deniz/usr/share/emacs/23.1.50/etc/DEBUG for instructions.

In GNU Emacs 23.1.50.2 (i686-pc-linux-gnu, GTK+ Version 2.16.5)
 of 2009-08-13 on stalin
Windowing system distributor `The X.Org Foundation', version 11.0.10603000
configured using `configure '--without-rsvg' '--without-tiff' '--without-xpm' '--prefix=/home/deniz/usr''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: C
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.utf8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
C-u ä <return> C-u C-u ä <return> M-5 M-0 ä <return>
C-1 C-0 u <return> C-1 C-0 ä M-x r e p o r t - e m
a c s - b u g f <return> <backspace> <return>

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.
* bug#4240: 23.1.50; C-u doesn't work with Swedish characters
  2009-08-23 13:28 ` bug#4240: 23.1.50; C-u doesn't work with Swedish characters Deniz Dogan
@ 2009-08-23 18:54   ` Juri Linkov
  2009-08-23 20:40     ` Eli Zaretskii
  2009-08-28  8:55   ` bug#4240: marked as done (23.1.50; C-u doesn't work with Swedish characters) Emacs bug Tracking System
  1 sibling, 1 reply; 9+ messages in thread
From: Juri Linkov @ 2009-08-23 18:54 UTC
To: Deniz Dogan; +Cc: 4240

> I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
> as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
> fourteen "\344" and then a trailing "ä". This happens no matter which
> kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
> always the leading and the trailing character that come out right, all
> of the other ones are "broken".

Please see bug#4037:
http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037

I received no confirmation that my proposed fix is correct.

Maybe the right fix is to reverse negation?

It seems logical to check if a buffer is unibyte before converting
from unibyte to multibyte, but I don't understand what this code was
supposed to do.

Index: src/cmds.c
===================================================================
RCS file: /sources/emacs/emacs/src/cmds.c,v
retrieving revision 1.107
diff -u -r1.107 cmds.c
--- src/cmds.c 13 Jul 2009 01:02:51 -0000 1.107
+++ src/cmds.c 10 Aug 2009 22:54:02 -0000
@@ -337,7 +337,7 @@
       /* Add the offset to the character, for Finsert_char.
          We pass internal_self_insert the unmodified character
          because it itself does this offsetting. */
-      if (! NILP (current_buffer->enable_multibyte_characters))
+      if (NILP (current_buffer->enable_multibyte_characters))
         modified_char = unibyte_char_to_multibyte (modified_char);
 
       XSETFASTINT (n, XFASTINT (n) - 2);

--
Juri Linkov
http://www.jurta.org/emacs/
* bug#4240: 23.1.50; C-u doesn't work with Swedish characters
  2009-08-23 18:54 ` Juri Linkov
@ 2009-08-23 20:40   ` Eli Zaretskii
  2009-08-26 17:08     ` Eli Zaretskii
  0 siblings, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2009-08-23 20:40 UTC
To: Juri Linkov, 4240; +Cc: deniz.a.m.dogan

> From: Juri Linkov <juri@jurta.org>
> Date: Sun, 23 Aug 2009 21:54:04 +0300
> Cc: 4240@emacsbugs.donarmstrong.com
>
> > I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
> > as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
> > fourteen "\344" and then a trailing "ä". This happens no matter which
> > kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
> > always the leading and the trailing character that come out right, all
> > of the other ones are "broken".
>
> Please see bug#4037:
> http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037
>
> I received no confirmation that my proposed fix is correct.

I think those two lines are not necessary anymore and should be
removed (together with the comments which explain their need). I
think they belong to the old pre-unicode days when raw eight-bit
characters needed such special treatment.

Handa-san, can you please comment on that?

> Maybe the right fix is to reverse negation?

Why, do you see that the code without these two lines doesn't DTRT when
the characters are inserted into a unibyte buffer? If it works in
both cases, that is evidence that I'm right and this code is not
needed anymore.

> It seems logical to check if a buffer is unibyte before converting
> from unibyte to multibyte, but I don't understand what this code was
> supposed to do.

It was supposed to produce a multibyte character from a unibyte one,
by using a special locale-dependent table that mapped, e.g., 8859-1
encoded Latin-1 characters in the range [128..255] to the
corresponding multibyte codepoints of Latin-1 characters in the
internal representation of characters Emacs 22 used. See the Emacs 22
definition of unibyte_char_to_multibyte in src/charset.c.

Nowadays we don't need that, since we have a special range of
multibyte codepoints for representing unibyte characters in multibyte
buffers and strings, and insert-char and the primitives it calls
already DTRT with them. So there should be no need to do anything
special outside insert-char.
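The "special range of multibyte codepoints" mentioned above can be sketched in C as follows. This is a hedged illustration rather than Emacs source: the function names are hypothetical, and the range 0x3FFF80..0x3FFFFF (raw byte plus 0x3FFF00) is an assumption about the Emacs 23 character space that the thread does not spell out.

```c
#include <assert.h>

/* Assumed layout of the raw-byte range in the multibyte character
   space: byte B in 0x80..0xFF is represented as B + 0x3FFF00.  */
enum
{
  MODEL_BYTE8_OFFSET = 0x3FFF00,
  MODEL_BYTE8_FIRST = 0x3FFF80,
  MODEL_BYTE8_LAST = 0x3FFFFF
};

/* Nonzero if character code C lies in the assumed raw-byte range.  */
static int
model_char_is_byte8 (int c)
{
  return c >= MODEL_BYTE8_FIRST && c <= MODEL_BYTE8_LAST;
}

/* Map a raw byte 0x80..0xFF to its assumed multibyte character code.  */
static int
model_byte8_to_char (int byte)
{
  assert (byte >= 0x80 && byte <= 0xFF);
  return byte + MODEL_BYTE8_OFFSET;
}

/* Inverse mapping: recover the raw byte from a raw-byte character.  */
static int
model_char_to_byte8 (int c)
{
  assert (model_char_is_byte8 (c));
  return c - MODEL_BYTE8_OFFSET;
}
```

Under this model the character ä (code 0xE4) and the raw byte 0xE4 are distinct codes, which would explain why an already decoded ä that is converted a second time shows up as "\344" instead of "ä".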
* bug#4240: 23.1.50; C-u doesn't work with Swedish characters
  2009-08-23 20:40 ` Eli Zaretskii
@ 2009-08-26 17:08   ` Eli Zaretskii
  2009-08-27  5:04     ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2009-08-26 17:08 UTC
To: 4240, handa; +Cc: deniz.a.m.dogan

Ping!

> Date: Sun, 23 Aug 2009 23:40:00 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: deniz.a.m.dogan@gmail.com
>
> > From: Juri Linkov <juri@jurta.org>
> > Date: Sun, 23 Aug 2009 21:54:04 +0300
> > Cc: 4240@emacsbugs.donarmstrong.com
> >
> > > I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
> > > as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
> > > fourteen "\344" and then a trailing "ä". This happens no matter which
> > > kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
> > > always the leading and the trailing character that come out right, all
> > > of the other ones are "broken".
> >
> > Please see bug#4037:
> > http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037
> >
> > I received no confirmation that my proposed fix is correct.
>
> I think those two lines are not necessary anymore and should be
> removed (together with the comments which explain their need). I
> think they belong to the old pre-unicode days when raw eight-bit
> characters needed such special treatment.
>
> Handa-san, can you please comment on that?
>
> > Maybe the right fix is to reverse negation?
>
> Why, do you see that the code without these two lines doesn't DTRT when
> the characters are inserted into a unibyte buffer? If it works in
> both cases, that is evidence that I'm right and this code is not
> needed anymore.
>
> > It seems logical to check if a buffer is unibyte before converting
> > from unibyte to multibyte, but I don't understand what this code was
> > supposed to do.
>
> It was supposed to produce a multibyte character from a unibyte one,
> by using a special locale-dependent table that mapped, e.g., 8859-1
> encoded Latin-1 characters in the range [128..255] to the
> corresponding multibyte codepoints of Latin-1 characters in the
> internal representation of characters Emacs 22 used. See the Emacs 22
> definition of unibyte_char_to_multibyte in src/charset.c.
>
> Nowadays we don't need that, since we have a special range of
> multibyte codepoints for representing unibyte characters in multibyte
> buffers and strings, and insert-char and the primitives it calls
> already DTRT with them. So there should be no need to do anything
> special outside insert-char.
* bug#4240: 23.1.50; C-u doesn't work with Swedish characters
  2009-08-26 17:08 ` Eli Zaretskii
@ 2009-08-27  5:04   ` Stefan Monnier
  2009-08-27  6:23     ` Kenichi Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Monnier @ 2009-08-27 5:04 UTC
To: Eli Zaretskii; +Cc: deniz.a.m.dogan, 4240

>> > Please see bug#4037:
>> > http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037
>> > I received no confirmation that my proposed fix is correct.
>> I think those two lines are not necessary anymore and should be
>> removed (together with the comments which explain their need). I
>> think they belong to the old pre-unicode days when raw eight-bit
>> characters needed such special treatment.

I believe you're right. Nowadays, the keyboard-decoding should always
take place before we get to that point.

        Stefan
* bug#4240: 23.1.50; C-u doesn't work with Swedish characters
  2009-08-27  5:04 ` Stefan Monnier
@ 2009-08-27  6:23   ` Kenichi Handa
  0 siblings, 0 replies; 9+ messages in thread
From: Kenichi Handa @ 2009-08-27 6:23 UTC
To: Stefan Monnier; +Cc: 4240, deniz.a.m.dogan

In article <jwvocq14zlk.fsf-monnier+emacsbugreports@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> > Please see bug#4037:
>>> > http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4037
>>> > I received no confirmation that my proposed fix is correct.
>>> I think those two lines are not necessary anymore and should be
>>> removed (together with the comments which explain their need). I
>>> think they belong to the old pre-unicode days when raw eight-bit
>>> characters needed such special treatment.

> I believe you're right. Nowadays, the keyboard-decoding should always
> take place before we get to that point.

Sorry for the late response on this matter. Yes, that
unibyte->multibyte conversion is not necessary. I've just
installed a fix.

---
Kenichi Handa
handa@m17n.org
* bug#4240: marked as done (23.1.50; C-u doesn't work with Swedish characters)
  2009-08-23 13:28 ` bug#4240: 23.1.50; C-u doesn't work with Swedish characters Deniz Dogan
  2009-08-23 18:54   ` Juri Linkov
@ 2009-08-28  8:55   ` Emacs bug Tracking System
  1 sibling, 0 replies; 9+ messages in thread
From: Emacs bug Tracking System @ 2009-08-28  8:55 UTC
To: Eli Zaretskii

[-- Attachment #1: Type: text/plain, Size: 888 bytes --]

Your message dated Fri, 28 Aug 2009 11:52:21 +0300
with message-id <83fxbccoca.fsf@gnu.org>
and subject line Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
has caused the Emacs bug report #4240,
regarding 23.1.50; C-u doesn't work with Swedish characters
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact
owner@emacsbugs.donarmstrong.com immediately.)

--
4240: http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=4240
Emacs Bug Tracking System
Contact owner@emacsbugs.donarmstrong.com with problems

[-- Attachment #2: Type: message/rfc822, Size: 5152 bytes --]

From: Deniz Dogan <deniz.a.m.dogan@gmail.com>
To: emacs-pretest-bug@gnu.org
Subject: 23.1.50; C-u doesn't work with Swedish characters
Date: Sun, 23 Aug 2009 15:28:58 +0200
Message-ID: <7b501d5c0908230628r5bc2cad2he3fc7a2249fcac5@mail.gmail.com>

[-- Attachment #3: Type: message/rfc822, Size: 2732 bytes --]

From: Eli Zaretskii <eliz@gnu.org>
To: Kenichi Handa <handa@m17n.org>
Cc: monnier@iro.umontreal.ca, 4240-done@emacsbugs.donarmstrong.com, deniz.a.m.dogan@gmail.com, 4037-done@emacsbugs.donarmstrong.com
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Fri, 28 Aug 2009 11:52:21 +0300
Message-ID: <83fxbccoca.fsf@gnu.org>
Thread overview: 9+ messages
     [not found] <83fxbccoca.fsf@gnu.org>
2009-08-04 19:27 ` bug#4037: Characters garbled in self-insert-command Juri Linkov
2009-08-28  8:55   ` bug#4037: marked as done (Characters garbled in self-insert-command) Emacs bug Tracking System
2009-08-23 13:28 ` bug#4240: 23.1.50; C-u doesn't work with Swedish characters Deniz Dogan
2009-08-23 18:54   ` Juri Linkov
2009-08-23 20:40     ` Eli Zaretskii
2009-08-26 17:08       ` Eli Zaretskii
2009-08-27  5:04         ` Stefan Monnier
2009-08-27  6:23           ` Kenichi Handa
2009-08-28  8:55   ` bug#4240: marked as done (23.1.50; C-u doesn't work with Swedish characters) Emacs bug Tracking System