unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#1006: garbled unicode characters in M-x term
@ 2008-09-24 20:30 Chong Yidong
  0 siblings, 0 replies; 2+ messages in thread
From: Chong Yidong @ 2008-09-24 20:30 UTC (permalink / raw)
  To: Andreas Politz; +Cc: 1006

> Ok, I think I found the problem. term uses `binary' as input coding.
> After it has examined the input, it inserts the relevant/visible parts
> of it into the buffer. Only at this point it decodes the bytes with
> the apropriate coding (variable:locale-coding-system).  At some point
> it splits the input string, to make it suitable for the with of the
> `terminal'. The problem is, that it measures bytes not characters. So
> the 3-byte character in question in aptitude, which is mostly on the
> last column, gets split in 2 strings a 1 and 2 byte. This 2 strings,
> when encoded and inserted independently, will result in what was
> described as the problem.

Thanks for the analysis.  Could you try to write a patch to fix this?






^ permalink raw reply	[flat|nested] 2+ messages in thread
* bug#1006: garbled unicode characters in M-x term
@ 2008-09-19 19:34 Andreas Politz
  0 siblings, 0 replies; 2+ messages in thread
From: Andreas Politz @ 2008-09-19 19:34 UTC (permalink / raw)
  To: bug-gnu-emacs


Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the bug-gnu-emacs@gnu.org mailing list,
and to the gnu.emacs.bug news group.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:


Problem : Under certain circumstances multibyte characters in M-x term
become garbled and display as single byte escape sequences.

Example : debians aptitude (character U+2592)

 From a post I made to gnu.emacs.help:

Ok, I think I found the problem. term uses `binary' as input coding.
After it has examined the input, it inserts the relevant/visible parts
of it into the buffer. Only at this point it decodes the bytes with
the apropriate coding (variable:locale-coding-system).
At some point it splits the input string, to make it suitable for the
with of the `terminal'. The problem is, that it measures bytes not
characters. So the 3-byte character in question in aptitude, which is mostly
on the last column, gets split in 2 strings a 1 and 2 byte. This 2
strings, when encoded and inserted independently, will result in
what was described as the problem.

Solution would be to encode the string before checking the length of
it.

-ap

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
     `bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/share/emacs/22.2/etc/DEBUG for instructions.


In GNU Emacs 22.2.1 (i486-pc-linux-gnu, GTK+ Version 2.12.11)
  of 2008-07-25 on raven, modified by Debian
Windowing system distributor `The X.Org Foundation', version 11.0.10402000
configured using `configure  '--build=i486-linux-gnu' '--host=i486-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs22:/etc/emacs:/usr/local/share/emacs/22.2/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/22.2/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/22.2/leim' '--with-x=yes' '--with-x-toolkit=gtk' '--with-toolkit-scroll-bars' 'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu' 'CFLAGS=-DDEBIAN -g -O2' 'LDFLAGS=-g' 'CPPFLAGS=''

Important settings:
   value of $LC_ALL: nil
   value of $LC_COLLATE: nil
   value of $LC_CTYPE: nil
   value of $LC_MESSAGES: nil
   value of $LC_MONETARY: nil
   value of $LC_NUMERIC: nil
   value of $LC_TIME: nil
   value of $LANG: en_US.UTF-8
   locale-coding-system: utf-8
   default-enable-multibyte-characters: t

Major mode: Fundamental

Minor modes in effect:
   shell-dirtrack-mode: t
   auto-fill-function: do-auto-fill
   show-paren-mode: t
   savehist-mode: t
   icomplete-mode: t
   global-hi-lock-mode: t
   hi-lock-mode: t
   display-time-mode: t
   tooltip-mode: t
   mouse-wheel-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   unify-8859-on-encoding-mode: t
   utf-translate-cjk-mode: t
   auto-compression-mode: t
   column-number-mode: t
   line-number-mode: t

Recent input:
C-x C-s M-x d i f f SPC u DEL C-g C-x o M-? m C-M-v
C-x k RET C-x C-g M-x d i f f RET RET t e r m . RET
C-x o C-v C-v C-v C-v C-v M-< M-x w o m a n RET d i
f f RET C-v C-v C-v M-v C-r i g n o r e C-r C-g C-x
b t e r C-s C-s C-g C-x o M-x C-g C-u M-x d i f f RET
RET t e r C-s RET w <return> C-x o C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-x o C-x
o M-< C-x k RET C-x o C-u M-x d i f f RET RET t e r
C-s RET DEL w <return> C-x C-g C-u C-g M-x d i f f
RET RET t e r m . RET C-x o C-v C-v C-v C-v C-v M-v
M-v M-v M-v M-v C-x o C-x C-w ~ / . e m / t e r m .
e l <return> C-x b f o RET C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n M-x r e p o SPC r t RET g r a <backspace>
<backspace> a r b e l e d DEL DEL DEL DEL l e d C-g

Recent messages:
Repeating command 1 other-window
Quit
Repeating command 1 other-window [2 times]
Saving file /home/andy/.emacs.d/term.el...
Wrote /home/andy/.emacs.d/term.el
Making completion list...
Loading emacsbug...done
Quit







^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2008-09-24 20:30 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-24 20:30 bug#1006: garbled unicode characters in M-x term Chong Yidong
  -- strict thread matches above, loose matches on Subject: below --
2008-09-19 19:34 Andreas Politz

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).