From: Kenichi Handa <handa@m17n.org>
Cc: dalias@aerifal.cx, emacs-devel@gnu.org
Subject: Re: [dalias@aerifal.cx: BUG: Emacs ignores charcell width when running on terminal (w/rtfs & ideas for fix)]
Date: Tue, 24 Oct 2006 09:30:38 +0900
Message-ID: <E1GcABi-0003oX-00@etlken>
In-Reply-To: <E1GZSrj-0005dK-W6@fencepost.gnu.org> (message from Richard Stallman on Mon, 16 Oct 2006 09:50:51 -0400)
In article <E1GZSrj-0005dK-W6@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> Would you please look at this issue and comment?
> I am not sure if this is something we should try to fix, now or ever.
> But I would like you to think about it.
Sorry for the late response. Actually there's not that
much we can do on this matter.
> ------- Start of forwarded message -------
> Date: Wed, 11 Oct 2006 15:16:50 -0400
> To: bug-gnu-emacs@gnu.org
> From: Rich Felker <dalias@aerifal.cx>
> Subject: BUG: Emacs ignores charcell width when running on terminal (w/rtfs
> & ideas for fix)
[...]
> When GNU Emacs is run on a terminal (-nw mode) and editing UTF-8 text
> files, it treats all characters as if they occupy one character cell
> column on the terminal. This causes it to become confused about the
> cursor position whenever there is CJK fullwidth text or scripts that
> use nonspacing combining characters present, to the point that editing
> is impossible.
Unfortunately, the current Emacs assumes that all characters
in a charset have the same width. As long as we were dealing
only with legacy charsets (e.g. ISO8859, JISX, KSC, GB), that
assumption worked well.
> Attached to this email is a UTF-8 file you can open in Emacs which
> exhibits the problem: Japanese Hiragana (for CJK wide) and Tibetan and
> Thai (for nonspacing).
> The root of the problem: In term.c, produce_glyphs() function, the
> code assumes all multibyte characters for a given 'charset' have the
> same width:
The root of the problem is that there's no way for Emacs to
know how many columns a terminal uses to display a specific
character. For Hiragana, Emacs can guess that each character
will be displayed in two columns, but for Tibetan and
Thai, it heavily depends on the terminal's capability of
handling CTL (Complex Text Layout). If a terminal doesn't
know how to do CTL for Tibetan, it will just produce glyphs
for each syllable component without stacking (and thus
occupy several columns). If a terminal does, it will display
them in one (or two) columns. But there's no way for Emacs
to know which is the case.
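
To make the limitation concrete, here is a minimal sketch of the kind
of guess an application can make from Unicode properties alone. It is
written in Python rather than Emacs C, and the function names
guess_columns and guess_string_columns are hypothetical. It assumes the
terminal stacks nonspacing marks and draws East Asian Wide characters
in two cells, which is exactly what cannot be verified from inside the
application.

```python
import unicodedata

def guess_columns(ch: str) -> int:
    """Guess how many terminal columns one character occupies."""
    # Assumption: nonspacing and enclosing marks stack on the previous
    # character. A terminal without CTL support may draw them in a
    # cell of their own instead.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # Assumption: East Asian Wide and Fullwidth characters use two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def guess_string_columns(text: str) -> int:
    """Sum the per-character guesses for a whole string."""
    return sum(guess_columns(ch) for ch in text)
```

For Tibetan KA followed by subjoined KA ("\u0f40\u0f90") this guess is
one column. A terminal that does not stack the two would use two
columns, and the cursor position would then be off by one.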
> Correctly fixing the issue:
> 1. Needs some sort of width lookup for unicode characters without
> having to convert from Emacs' native encoding to UCS thru UTF-8.
> This should be straightforward for someone who understands the
> code.
That only works for such simple characters as Hiragana. In
the emacs-unicode-2 branch, I introduced char-width-table, which
maps each character to the column width that character
occupies on screen.
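
The idea of a per-character width table can be sketched as a sorted
list of code point ranges searched with bisection. This is only an
illustration in Python: WIDTH_RANGES and table_width are hypothetical
names, the ranges are deliberately incomplete, and the widths are
assumptions, not the contents of the real char-width-table.

```python
import bisect

# Hypothetical, incomplete table of (first, last, width) code point
# ranges, sorted by first code point and non-overlapping.
WIDTH_RANGES = [
    (0x0E31, 0x0E31, 0),  # Thai MAI HAN-AKAT (nonspacing)
    (0x0E34, 0x0E3A, 0),  # Thai nonspacing vowel signs
    (0x0F71, 0x0F7E, 0),  # Tibetan vowel signs
    (0x0F90, 0x0FBC, 0),  # Tibetan subjoined consonants
    (0x3040, 0x309F, 2),  # Hiragana block
    (0x30A0, 0x30FF, 2),  # Katakana block
    (0x4E00, 0x9FFF, 2),  # CJK Unified Ideographs block
]
_STARTS = [first for first, _, _ in WIDTH_RANGES]

def table_width(ch: str, default: int = 1) -> int:
    """Look up the assumed column width of ch, or return default."""
    cp = ord(ch)
    # Find the last range whose first code point is <= cp.
    i = bisect.bisect_right(_STARTS, cp) - 1
    if i >= 0:
        first, last, width = WIDTH_RANGES[i]
        if first <= cp <= last:
            return width
    return default
```

A table like this only records what the application expects. It still
cannot tell whether a given terminal honors the zero-width entries.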
> 2. The apppend_glyph() function needs to handle width==0 case, perhaps
> converting the previous glyph into a COMPOSITE_GLYPH instead of
> adding a CHAR_GLYPH. However I don't understand the COMPOSITE_GLYPH
> system in Emacs so I don't know if this is feasible.
COMPOSITE_GLYPH is a glyph containing multiple characters
that must be displayed as a single grapheme cluster. On X,
Emacs displays characters in a COMPOSITE_GLYPH correctly
(sometimes by stacking, sometimes by overstriking, sometimes
by using an alternate glyph, etc.). But, as there's no way on
a terminal to perform such an operation, the current Emacs just
displays the first character of a COMPOSITE_GLYPH.
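
The terminal behavior described above can be modeled roughly as
follows. This Python sketch is not the Emacs composition code:
split_clusters and terminal_fallback are hypothetical names, and
grouping a base character with the nonspacing marks that follow it is
a simplifying assumption, not a full grapheme cluster algorithm.

```python
import unicodedata

def split_clusters(text: str) -> list[str]:
    """Group each base character with the marks that follow it."""
    clusters: list[str] = []
    for ch in text:
        # Simplification: only nonspacing (Mn) and enclosing (Me)
        # marks are attached to the preceding character.
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def terminal_fallback(text: str) -> str:
    """Keep only the first character of every cluster."""
    return "".join(cluster[0] for cluster in split_clusters(text))
```

With Thai KO KAI, SARA I, NO NU ("\u0e01\u0e34\u0e19") the fallback
keeps the two consonants and drops the vowel sign, so the text stays
aligned to one column per cluster but loses information on screen.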
> At present this issue is making it very difficult for me to use
> Tibetan text in composing email and material for the web, so I'm
> looking for some way to fix it, either upstream or with hacks I can
> make locally for the time being until it's fixed properly.
If you want to handle Tibetan text, using X is the only way
for the moment.
---
Kenichi Handa
handa@m17n.org