* bug#970: 23.0.60; Non-ASCII display problems on a tty
@ 2008-09-18 18:32 Chong Yidong
2008-09-19 8:44 ` Eli Zaretskii
2008-09-27 14:48 ` Eli Zaretskii
0 siblings, 2 replies; 4+ messages in thread
From: Chong Yidong @ 2008-09-18 18:32 UTC
To: Eli Zaretskii; +Cc: 970
> emacs -Q
> C-h H
>
> Type C-n several times, and you will see some very strange behavior:
> for example, some lines are skipped and point never enters them.
I think Kenichi Handa's latest composition changes should have fixed
this. Can you verify?
* bug#970: 23.0.60; Non-ASCII display problems on a tty
2008-09-18 18:32 bug#970: 23.0.60; Non-ASCII display problems on a tty Chong Yidong
@ 2008-09-19 8:44 ` Eli Zaretskii
2008-09-27 14:48 ` Eli Zaretskii
1 sibling, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2008-09-19 8:44 UTC
To: Chong Yidong, 970; +Cc: bug-gnu-emacs
> From: Chong Yidong <cyd@stupidchicken.com>
> Date: Thu, 18 Sep 2008 14:32:00 -0400
> Cc: 970@emacsbugs.donarmstrong.com
>
> > emacs -Q
> > C-h H
> >
> > Type C-n several times, and you will see some very strange behavior:
> > for example, some lines are skipped and point never enters them.
>
> I think Kenichi Handa's latest composition changes should have fixed
> this. Can you verify?

The ``some lines are skipped'' part is indeed solved. But the other
problems mentioned in my bug report are still there. For example,
compare the "South Asia" and "Bengali" lines with a graphics display:
the number and screen position of the `?' question marks displayed on
a tty instead of non-ASCII characters do not match those displayed on
a graphics terminal.
* bug#970: 23.0.60; Non-ASCII display problems on a tty
2008-09-18 18:32 bug#970: 23.0.60; Non-ASCII display problems on a tty Chong Yidong
2008-09-19 8:44 ` Eli Zaretskii
@ 2008-09-27 14:48 ` Eli Zaretskii
1 sibling, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2008-09-27 14:48 UTC
To: Kenichi Handa, 970; +Cc: bug-gnu-emacs

I have some more info about this bug.

The below is based on displaying a file that is encoded in
iso-2022-7bit-unix, and has a single line that is a copy of line 20
from etc/HELLO, which is the entry for the Bengali language. To
produce this file, copy line 20 of HELLO, paste it into a new file,
type "C-x RET f iso-2022-7bit-unix RET" and save the file.

The display problems for this line are directly caused by the fact
that tty_write_glyphs is called with its last argument len=22, which
means the display engine expects 22 characters to be displayed. And
tty_write_glyphs therefore moves cursor by 22 positions to account for
that. However, encode_terminal_code returns a string whose length is
only 13 characters, and the difference between 13 and 22 is the
immediate cause for display problems: the displayed string looks as if
it were padded by whitespace, but typing "C-x =" on these
``whitespace'' characters reveals that they are not spaces at all.

Looking inside encode_terminal_code, I see that the problem is somehow
related to composite characters. The first group of non-ASCII
characters (in parentheses) are composite characters whose
u.cmp.automatic flag is set. The Lisp object returned by
composition_gstring_from_id for this group of characters is a Lisp
vector:

  [[nil 2476 2494 2434 2482 2494] 0
   [0 0 2476 2476 1 0 1 1 0 nil]
   [1 1 2494 2494 1 0 1 1 0 nil]
   [2 2 2434 2434 1 0 1 1 0 nil]
   [3 3 2482 2482 1 0 1 1 0 nil]
   [4 4 2494 2494 1 0 1 1 0 nil]]

When this code:

  if (src->u.cmp.automatic)
    for (i = src->u.cmp.from; i < src->u.cmp.to; i++)
      {
        Lisp_Object g = LGSTRING_GLYPH (gstring, i);
        int c = LGLYPH_CHAR (g);

        if (! char_charset (c, charset_list, NULL))
          break;
        buf += CHAR_STRING (c, buf);
        nchars++;
      }

walks this Lisp vector, it immediately finds that the 1st character
cannot be encoded by the current terminal's encoding, and breaks out
of the loop. Then the `?' character gets stored in the buffer that is
being prepared for encoding:

  if (i == 0)
    {
      /* The first character of the composition is not encodable. */
      *buf++ = '?';
      nchars++;
    }

This is all as expected, but because of the "if (i == 0)" clause
above, the `?' character gets stored only for the first character in
this composition, whose codepoint is 2476. For other characters, the
u.cmp.from value is greater than 0, so `?' is not stored for them.

By contrast, on a graphics terminal, the 5 characters inside the
parentheses are displayed as 2 visible glyphs, one (codepoint 2476)
for buffer position 10, the other (codepoint 2482) for buffer position
13. Thus, I would expect to see two `?' question marks inside
parentheses, not one.

Similar problem happens with the second group of non-ASCII characters
on this line, the characters that follow the TAB character. Here's
the Lisp object returned by composition_gstring_from_id:

  [[nil 2472 2478 2488 2509 2453 2494 2480] 1
   [0 0 2472 2472 1 0 1 1 0 nil]
   [1 1 2478 2478 1 0 1 1 0 nil]
   [2 3 2488 2488 1 0 1 1 0 nil]
   [2 3 2509 2509 0 0 0 1 0 nil]
   [4 4 2453 2453 1 0 1 1 0 nil]
   [5 5 2494 2494 1 0 1 1 0 nil]
   [6 6 2480 2480 1 0 1 1 0 nil]]

(Note that in this case, there are elements in this vector whose
FROM-IDX and TO-IDX values are not identical, and also the WIDTH value
is zero for one of them.)

This group of characters is displayed as 4 visible glyphs on a
graphics terminal: respectively, for buffer positions 17 (code 2472),
18 (code 2478), 19 (code 2488), and 23 (2480). On a TTY, only one `?'
is shown, again for the same reason as described above: the
"if (i == 0)" test.

My first suspicion would be that the object returned by
composition_gstring_from_id gives incorrect data for FROM-IDX and
TO-IDX, but I'm not sure I understood the composition machinery enough
to draw a definitive conclusion. It is not even clear to me how do we
want to display these characters: do we want the number of `?'s to be
identical to the number of glyphs displayed by a graphics terminal, or
do we want something else?

Handa-san, can you please comment on these findings?
* bug#970: 23.0.60; Non-ASCII display problems on a tty
@ 2008-09-12 10:18 Eli Zaretskii
0 siblings, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2008-09-12 10:18 UTC
To: emacs-pretest-bug

emacs -Q
C-h H

Type C-n several times, and you will see some very strange behavior:
for example, some lines are skipped and point never enters them.

Also, some non-ASCII characters are displayed incorrectly. For
example, the "Bengali" line has only 1 "?" character in the
parentheses following the language name, whereas 2 characters are
displayed on a graphics display (I tried MS-Windows). On the same
line, under "HELLO", there are 2 "?" characters instead of 4, and they
are not aligned with the rest of greetings; moving point with C-f
skips those "?"s and lands on what is displayed as space, but "C-x ="
shows that there are non-ASCII characters in the buffer at those
"blank" positions.

Etc., etc., it looks like tty display of non-ASCII characters that
cannot be displayed by the current terminal-coding-system is very much
screwed up.

Here's what "locale" reports, in case it's important:

  eliz@fencepost:~/emacs.cvs/emacs$ locale
  LANG=
  LC_CTYPE="POSIX"
  LC_NUMERIC="POSIX"
  LC_TIME="POSIX"
  LC_COLLATE="POSIX"
  LC_MONETARY="POSIX"
  LC_MESSAGES="POSIX"
  LC_PAPER="POSIX"
  LC_NAME="POSIX"
  LC_ADDRESS="POSIX"
  LC_TELEPHONE="POSIX"
  LC_MEASUREMENT="POSIX"
  LC_IDENTIFICATION="POSIX"
  LC_ALL=

In GNU Emacs 23.0.60.63 (x86_64-unknown-linux-gnu, X toolkit)
 of 2008-09-12 on fencepost
configured using `configure '--with-jpeg=no' '--with-png=no' '--with-gif=no' '--with-tiff=no''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: nil
  value of $XMODIFIERS: nil
  locale-coding-system: nil
  default-enable-multibyte-characters: t

Major mode: Fundamental

Minor modes in effect:
  tooltip-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t
  view-mode: t

Recent input:
ESC [ > 0 ; 1 3 6 ; 0 c C-h H ESC O B ESC O B ESC O B
ESC O B ESC O B ESC O B ESC O B ESC O B C-n C-n C-n
C-n C-n C-n ESC x r e p o r t - e m a TAB TAB RET

Recent messages:
("./src/emacs" "-Q")
For information about GNU Emacs and the GNU system, type C-h C-a.
Loading vc-cvs...done
View mode: type C-h for help, h for commands, q to quit.