unofficial mirror of emacs-devel@gnu.org 
* Re: utf8 char display in buffer
       [not found] ` <878wjzugdo.fsf@iki.fi>
@ 2009-06-11 12:55   ` Lennart Borgman
  2009-06-11 13:04     ` Andreas Schwab
  0 siblings, 1 reply; 25+ messages in thread
From: Lennart Borgman @ 2009-06-11 12:55 UTC (permalink / raw)
  To: Teemu Likonen, Emacs-Devel

Teemu mentioned this on gnu-emacs. It seems nice, but the help text
that C-h I rfc1345 brings up is not very helpful for someone who
does not know this well. Could it perhaps be enhanced with some links
to relevant information?


On Thu, Jun 11, 2009 at 2:03 PM, Teemu Likonen<tlikonen@iki.fi> wrote:
> On 2009-06-08 14:33 (-0400), ken wrote:
>
>> I already use a few utf8 characters in emacs (and in web pages), but
>> recently needed to use a couple more. One is an 'a' with a horizontal
>> line above it, the other an 'i' with a [horizontal] line above it. How
>> do I input these into a buffer?
>
> Let’s add one more nice way to insert Unicode chars: “rfc1345” input
> method. It’s an input method for Unicode characters using mnemonics.
> Examples:
>
>    &a- = ā
>    &i- = ī
>    &W* = Ω
>    &"6 = “
>    &"9 = ”
>
> For more info: C-h I rfc1345 RET
>
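
To make the mnemonic idea concrete, here is a minimal Python sketch that
expands only the five examples quoted above. It is an illustration, not
how Emacs implements the input method: the table name MNEMONICS and the
helper expand() are hypothetical, and the real rfc1345 table is far
larger.

```python
import unicodedata

# Only the five mnemonics quoted above; names here are illustrative.
MNEMONICS = {
    "&a-": "\u0101",  # a with macron
    "&i-": "\u012b",  # i with macron
    "&W*": "\u03a9",  # Greek capital omega
    '&"6': "\u201c",  # left double quotation mark
    '&"9': "\u201d",  # right double quotation mark
}


def expand(text):
    """Replace each known three-character mnemonic in text with its character."""
    out = []
    i = 0
    while i < len(text):
        chunk = text[i:i + 3]
        if chunk in MNEMONICS:
            out.append(MNEMONICS[chunk])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(expand("p&a-li"))
    for key, ch in MNEMONICS.items():
        # Show each mnemonic with its code point and Unicode name.
        print(key, "U+%04X" % ord(ch), unicodedata.name(ch))
```

Text without a known mnemonic passes through unchanged, so the helper
can be applied to ordinary prose.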




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-11 12:55   ` utf8 char display in buffer Lennart Borgman
@ 2009-06-11 13:04     ` Andreas Schwab
  2009-06-11 13:07       ` Lennart Borgman
  0 siblings, 1 reply; 25+ messages in thread
From: Andreas Schwab @ 2009-06-11 13:04 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Teemu Likonen, Emacs-Devel

Lennart Borgman <lennart.borgman@gmail.com> writes:

> Teemu mentioned this on gnu-emacs. It seems nice, but the help text
> that C-h I rfc1345 brings up is not very helpful for someone who
> does not know this well. Could it perhaps be enhanced with some links
> to relevant information?

This has been fixed in Emacs 23, where the complete translation table is
included.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-11 13:04     ` Andreas Schwab
@ 2009-06-11 13:07       ` Lennart Borgman
  2009-06-11 13:08         ` Lennart Borgman
  0 siblings, 1 reply; 25+ messages in thread
From: Lennart Borgman @ 2009-06-11 13:07 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Teemu Likonen, Emacs-Devel

On Thu, Jun 11, 2009 at 3:04 PM, Andreas Schwab<schwab@linux-m68k.org> wrote:
> Lennart Borgman <lennart.borgman@gmail.com> writes:
>
>> Teemu mentioned this on gnu-emacs. It seems nice, but the help text
>> that C-h I rfc1345 brings up is not very helpful for someone who
>> does not know this well. Could it perhaps be enhanced with some links
>> to relevant information?
>
> This has been fixed in Emacs 23, where the complete translation table is
> included.

Really? This is what I get with
GNU Emacs 23.0.94.1 (i386-mingw-nt5.1.2600) of 2009-06-10 on
LENNART-69DE564 (patched)

------------------------------
Input method: rfc1345 (`m' in mode line) for UTF-8
  Unicode characters input method using RFC1345 mnemonics (non-ASCII only).
E.g. &a' -> á

[back]
------------------------------




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-11 13:07       ` Lennart Borgman
@ 2009-06-11 13:08         ` Lennart Borgman
  2009-06-11 13:24           ` Tassilo Horn
  0 siblings, 1 reply; 25+ messages in thread
From: Lennart Borgman @ 2009-06-11 13:08 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Teemu Likonen, Emacs-Devel

Hm, the check out date is 2009-05-29, not the date below.


On Thu, Jun 11, 2009 at 3:07 PM, Lennart
Borgman<lennart.borgman@gmail.com> wrote:
> On Thu, Jun 11, 2009 at 3:04 PM, Andreas Schwab<schwab@linux-m68k.org> wrote:
>> Lennart Borgman <lennart.borgman@gmail.com> writes:
>>
>>> Teemu mentioned this on gnu-emacs. It seems nice, but the help text
>>> that C-h I rfc1345 brings up is not very helpful for someone who
>>> does not know this well. Could it perhaps be enhanced with some links
>>> to relevant information?
>>
>> This has been fixed in Emacs 23, where the complete translation table is
>> included.
>
> Really? This is what I get with
> GNU Emacs 23.0.94.1 (i386-mingw-nt5.1.2600) of 2009-06-10 on
> LENNART-69DE564 (patched)
>
> ------------------------------
> Input method: rfc1345 (`m' in mode line) for UTF-8
>  Unicode characters input method using RFC1345 mnemonics (non-ASCII only).
> E.g. &a' -> á
>
> [back]
> ------------------------------
>




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-11 13:08         ` Lennart Borgman
@ 2009-06-11 13:24           ` Tassilo Horn
  0 siblings, 0 replies; 25+ messages in thread
From: Tassilo Horn @ 2009-06-11 13:24 UTC (permalink / raw)
  To: Emacs-Devel

Lennart Borgman <lennart.borgman@gmail.com> writes:

Hi Lennart,

>>> This has been fixed in Emacs 23, where the complete translation
>>> table is included.
>>
>> Really?

Mine is about a week old, and it displays the complete table.  Nice, I
wish I'd write more Unicode chars. :-)

Bye,
Tassilo




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
       [not found]                 ` <4A32D54D.1040405@mousecar.com>
@ 2009-06-12 22:27                   ` Lennart Borgman
  2009-06-12 23:38                     ` ken
  2009-06-13  1:36                     ` Miles Bader
  0 siblings, 2 replies; 25+ messages in thread
From: Lennart Borgman @ 2009-06-12 22:27 UTC (permalink / raw)
  To: gebser, Emacs-Devel devel

Ken, I think this is a good idea so I have sent this along to Emacs devel.

On Sat, Jun 13, 2009 at 12:23 AM, ken<gebser@mousecar.com> wrote:
> Yet emacs puts a little box in the place of a character it cannot find
> (or, per your explanation, is possibly confused about).  The fact remains
> that the little box is not a correct rendering of the code.  It is an
> error... at least it is for me, because that's not what I typed in.  So
> it is an error.  As an error, there should be a corresponding error
> message, hopefully one (or more) which would help diagnose the problem.
>  It seems obvious that, given the long thread on this issue with no
> resolution, we could use some help-- like an error message-- which would
> help in diagnosis.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-12 22:27                   ` Lennart Borgman
@ 2009-06-12 23:38                     ` ken
  2009-06-13  4:11                       ` Eli Zaretskii
  2009-06-14 20:59                       ` Stefan Monnier
  2009-06-13  1:36                     ` Miles Bader
  1 sibling, 2 replies; 25+ messages in thread
From: ken @ 2009-06-12 23:38 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Emacs-Devel devel



On 06/12/2009 06:27 PM Lennart Borgman wrote:
> Ken, I think this is a good idea so I have sent this along to Emacs devel.
> 
> On Sat, Jun 13, 2009 at 12:23 AM, ken<gebser@mousecar.com> wrote:
>> Yet emacs puts a little box in the place of a character it cannot find
>> (or, per your explanation, is possibly confused about).  The fact remains
>> that the little box is not a correct rendering of the code.  It is an
>> error... at least it is for me, because that's not what I typed in.  So
>> it is an error.  As an error, there should be a corresponding error
>> message, hopefully one (or more) which would help diagnose the problem.
>>  It seems obvious that, given the long thread on this issue with no
>> resolution, we could use some help-- like an error message-- which would
>> help in diagnosis.

Thank you, Lennart!  To give the people at emacs-devel some context to
the issue, the salient portion of the previous post is pasted below:

0) Some others and myself want to include some non-English characters in
a file being edited in emacs. Problems arise, however:

1) In a buffer which is already utf-8 encoded, I set the appropriate
input method, type in the desired characters. They display just peachy
and there is happiness in EmacsLand.

2) I save the buffer to a file, then close the buffer.

3) I visit the same file (i.e., load it again into emacs). Because it
has <!-- -*- coding: utf-8; -*- --> as the first line, it opens
utf-8 encoded. This is confirmed by the presence of a 'u' as the second
character in the status bar.

4) The text in the buffer displays fine, except that in place of each of
those non-English characters is a little empty box. With the cursor on
one of those boxes, an 'a' with a horizontal bar above it, doing "C-x
=", emacs returns "Char: ā (01210041, 331809, 0x51021, file ...)".
(While, in emacs the character after "Char:" is a little box, if I load
this same file into Firefox, that same character appears as it should,
as an 'a' with a horizontal bar above it. How it appears in your email
client will depend upon your email client.)

A) The fact that, as described in (4), the characters display correctly
in Firefox, but not in emacs indicates that emacs is not drawing on the
needed character set. Yet, the fact that in (1) the characters initially
display correctly (when first input) indicates that the needed character
set is present on the system and emacs can find it and has permission to
access it. Further, we would think that emacs would throw out an error
message if either of these conditions were not met... and it doesn't. We
can only assume that, when visiting and then decoding a file and pulling
into a buffer for display, emacs is not even asking for the proper
character set when encountering a non-English character. This is where I
would start to look for the error.

B) It would be helpful if the code which decodes a file and renders it
into the buffer display would throw an error message when it encounters
a character it doesn't know how to display, i.e., when a little box
character is displayed. After all,
isn't it an error when a little box is displayed in lieu of the correct
character? Possible error messages would be something like: "decoding
process can't find /path/to/charset.file" or "decoding process doesn't
have requisite permission to read /path/to/charset.file" or "invalid
character: [hex/decimal value]" or other.
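
A small Python check of the "C-x =" line quoted in (4) may help when
reading it. The three fields in parentheses are one number written in
octal, decimal and hex; the sketch below confirms that they agree and
that the value is not the Unicode code point of the character (U+0101,
decimal 257). The reading that it is an internal Emacs character code is
an assumption, and parse_char_triple() is a hypothetical helper.

```python
def parse_char_triple(octal_text, decimal_text, hex_text):
    """Return the integer if all three spellings agree, else raise ValueError."""
    values = {int(octal_text, 8), int(decimal_text, 10), int(hex_text, 16)}
    if len(values) != 1:
        raise ValueError("the three fields disagree: %r" % sorted(values))
    return values.pop()


# Fields copied from: Char: ... (01210041, 331809, 0x51021, file ...)
code = parse_char_triple("01210041", "331809", "0x51021")

# U+0101 is LATIN SMALL LETTER A WITH MACRON, the character ken typed.
print(code, "same as U+0101:", code == 0x101)
```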

###

Thanks much,
ken




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-12 22:27                   ` Lennart Borgman
  2009-06-12 23:38                     ` ken
@ 2009-06-13  1:36                     ` Miles Bader
  2009-06-13  1:43                       ` Lennart Borgman
  2009-06-13  5:50                       ` Richard Stallman
  1 sibling, 2 replies; 25+ messages in thread
From: Miles Bader @ 2009-06-13  1:36 UTC (permalink / raw)
  To: emacs-devel

Lennart Borgman <lennart.borgman@gmail.com> writes:
> Ken, I think this is a good idea so I have sent this along to Emacs devel.
>
>> Yet emacs puts a little box in the place of a character it cannot find
>> (or, per your explanation, is possibly confused about).  The fact remains
>> that the little box is not a correct rendering of the code.  It is an
>> error... at least it is for me, because that's not what I typed in.  So
>> it is an error.  As an error, there should be a corresponding error
>> message, hopefully one (or more) which would help diagnose the problem.

An "error message" _when_?  Whether a character is displayable or not
isn't known until display time, and error messages for display issues
are generally a very bad idea.  For some display errors, the display
code will put messages in the *Messages* buffer (though they aren't
displayed to the user), but even that must be done with careful
consideration; currently they typically indicate that something
seriously screwy is going on (and a non-displayable character, however
annoying, isn't "seriously screwy").

If an "error message" were displayed, what would it say?  The only thing
I can think of is "no font could be found to display character FOO", but
that fact is already obvious from the little box.  Given no extra
detail, is a message even useful?

Maybe some sort of _once-only_ pop-up buffer note to the user saying
"little boxes indicate characters for which a font could not be found; see
info manual section X.Y for details on blah blah"?  I suppose that could
help some people, but such a thing, even if once-only, would probably be
pretty annoying to non-complete-noob users, so ...

-Miles

-- 
[|nurgle|]  ddt- demonic? so quake will have an evil kinda setting? one that
            will  make every christian in the world foamm at the mouth?
[iddt]      nurg, that's the goal





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-13  1:36                     ` Miles Bader
@ 2009-06-13  1:43                       ` Lennart Borgman
  2009-06-13  5:50                       ` Richard Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: Lennart Borgman @ 2009-06-13  1:43 UTC (permalink / raw)
  To: Miles Bader; +Cc: emacs-devel

On Sat, Jun 13, 2009 at 3:36 AM, Miles Bader<miles@gnu.org> wrote:
> Lennart Borgman <lennart.borgman@gmail.com> writes:
>> Ken, I think this is a good idea so I have sent this along to Emacs devel.
>>
>>> Yet emacs puts a little box in the place of a character it cannot find
>>> (or, per your explanation, is possibly confused about).  The fact remains
>>> that the little box is not a correct rendering of the code.  It is an
>>> error... at least it is for me, because that's not what I typed in.  So
>>> it is an error.  As an error, there should be a corresponding error
>>> message, hopefully one (or more) which would help diagnose the problem.
>
> An "error message" _when_?  Whether a character is displayable or not
> isn't known until display time, and error messages for display issues
> are generally a very bad idea.  For some display errors, the display
> code will put messages in the *Messages* buffer (though they aren't
> displayed to the user), but even that must be done with careful
> consideration; currently they typically indicate that something
> seriously screwy is going on (and a non-displayable character, however
> annoying, isn't "seriously screwy").
>
> If an "error message" were displayed, what would it say?  The only thing
> I can think of is "no font could be found to display character FOO", but
> that fact is already obvious from the little box.  Given no extra
> detail, is a message even useful?
>
> Maybe some sort of _once-only_ pop-up buffer note to the user saying
> "little boxes indicate characters for which a font could not be found; see
> info manual section X.Y for details on blah blah"?  I suppose that could
> help some people, but such a thing, even if once-only, would probably be
> pretty annoying to non-complete-noob users, so ...

Yes, I know you have to be careful with error messages in a case like
this, but giving some type of information (like the one you suggested)
the first time in an emacs session when such a little box is shown was
what I had in mind.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-12 23:38                     ` ken
@ 2009-06-13  4:11                       ` Eli Zaretskii
  2009-06-13 12:30                         ` ken
  2009-06-14 20:59                       ` Stefan Monnier
  1 sibling, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2009-06-13  4:11 UTC (permalink / raw)
  To: gebser; +Cc: lennart.borgman, emacs-devel

> Date: Fri, 12 Jun 2009 19:38:30 -0400
> From: ken <gebser@mousecar.com>
> Cc: Emacs-Devel devel <emacs-devel@gnu.org>
> Reply-To: gebser@mousecar.com
> 
> Thank you, Lennart!  To give the people at emacs-devel some context to
> the issue, the salient portion of the previous post is pasted below:

Please provide the output of "C-u C-x =" on these characters, both
when they are displayed correctly and when they are displayed as empty
boxes.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-13  1:36                     ` Miles Bader
  2009-06-13  1:43                       ` Lennart Borgman
@ 2009-06-13  5:50                       ` Richard Stallman
  2009-06-15  4:34                         ` Miles Bader
  2009-06-15 20:06                         ` Chong Yidong
  1 sibling, 2 replies; 25+ messages in thread
From: Richard Stallman @ 2009-06-13  5:50 UTC (permalink / raw)
  To: Miles Bader; +Cc: emacs-devel

Would it be possible to display the codepoint numerically in the box?




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-13  4:11                       ` Eli Zaretskii
@ 2009-06-13 12:30                         ` ken
  2009-06-13 13:23                           ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: ken @ 2009-06-13 12:30 UTC (permalink / raw)
  To: Eli Zaretskii, GNU Emacs List, emacs-devel

On 06/13/2009 12:11 AM Eli Zaretskii wrote:
>> ....
> 
> Please provide the output of "C-u C-x =" on these characters, both
> when they are displayed correctly and when they are displayed as empty
> boxes.


In a similar post on the same thread Eli Zaretskii wrote:
> Please post here the full output of "C-u C-x =" (from a buffer popped
> up by Emacs) for these characters, both when you type them using the
> appropriate input method and they are displayed correctly (as in 1)
> above), and when you see them as empty boxes after revisiting the
> file.  The differences between these two cases should give you a hint
> what is wrong; if not, someone else here might have ideas.

Eli, thanks for your response.  Here it is:

^[$-1 ¡ is 'a' with a horizontal bar over it.  On first inputting it
(after doing "set-input-method latin-4-postfix" and before changing the
input method to anything else), it appears correctly and "C-u C-x =" yields:

=============================================

  character: ^[$-1 ¡ (05140, 2656, 0xa60)
    charset: latin-iso8859-4
	     (Right-Hand Part of Latin Alphabet 4 (ISO/IEC 8859-4): ISO-IR-110)
 code point: 96
     syntax: word
   category: l:Latin
buffer code: 0x84 0xE0
  file code: 0xC4 0x81 (encoded by coding system mule-utf-8-unix)
       font: -ETL-Fixed-Medium-R-Normal--16-160-72-72-C-80-ISO8859-4

=============================================

When I reload the file (revisit the file), the same character is
replaced with a little box.  Doing "C-u C-x =" here yields:

=============================================

  character: ^[$-1 ¡ (01210041, 331809, 0x51021)
    charset: mule-unicode-0100-24ff
	     (Unicode characters of the range U+0100..U+24FF.)
 code point: 32 33
     syntax: word
   category: l:Latin
buffer code: 0x9C 0xF4 0xA0 0xA1
  file code: 0xC4 0x81 (encoded by coding system mule-utf-8-unix)
       font: -- none --

=============================================

Note: For some reason, possibly related, had difficulty copying the
above text from emacs into clipboard (i.e., "M-w" didn't do anything),
so had to use a workaround.  It seems that this workaround altered the
character in question, the one above following each of the two instances
of "character:".

As for the meaning of the two outputs above, all that I can confidently
glean is that, if I want to use non-English characters in emacs, I have
to be an expert emacs developer.  :)





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-13 12:30                         ` ken
@ 2009-06-13 13:23                           ` Eli Zaretskii
  0 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2009-06-13 13:23 UTC (permalink / raw)
  To: gebser; +Cc: emacs-devel

> Date: Sat, 13 Jun 2009 08:30:37 -0400
> From: ken <gebser@mousecar.com>
> Reply-To:  gebser@mousecar.com
> 
> ^[$-1 ¡ is 'a' with a horizontal bar over it.  On first inputting it
> (after doing "set-input-method latin-4-postfix" and before changing the
> input method to anything else), it appears correctly and "C-u C-x =" yields:
> 
> =============================================
> 
>   character: ^[$-1 ¡ (05140, 2656, 0xa60)
>     charset: latin-iso8859-4
> 	     (Right-Hand Part of Latin Alphabet 4 (ISO/IEC 8859-4): ISO-IR-110)
>  code point: 96
>      syntax: word
>    category: l:Latin
> buffer code: 0x84 0xE0
>   file code: 0xC4 0x81 (encoded by coding system mule-utf-8-unix)
>        font: -ETL-Fixed-Medium-R-Normal--16-160-72-72-C-80-ISO8859-4
> 
> =============================================
> 
> When I reload the file (revisit the file), the same character is
> replaced with a little box.  Doing "C-u C-x =" here yields:
> 
> =============================================
> 
>   character: ^[$-1 ¡ (01210041, 331809, 0x51021)
>     charset: mule-unicode-0100-24ff
> 	     (Unicode characters of the range U+0100..U+24FF.)
>  code point: 32 33
>      syntax: word
>    category: l:Latin
> buffer code: 0x9C 0xF4 0xA0 0xA1
>   file code: 0xC4 0x81 (encoded by coding system mule-utf-8-unix)
>        font: -- none --
> 
> =============================================

So I think everything is clear now: you have a font that covers these
characters when they are from the 8859-4 character set, but you do not
have a font that covers them in Unicode.  You should install the
Unicode font that supports these characters.

> As for the meaning of the two outputs above, all that I can confidently
> glean is that, if I want to use non-English characters in emacs, I have
> to be an expert emacs developer.  :)

That's an exaggeration, I think.  You can use the "C-u C-x =" command,
just as you did above, to find out what Emacs thinks about each
character that is displayed as an empty box.  You can then look for
fonts that cover these characters.  "C-u C-x =" is a user-level
command, and one of its uses is precisely this: to find out what fonts
are missing on your machine.
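
The two "C-u C-x =" outputs quoted above can be cross-checked with a
short Python sketch. It shows that the shared "file code" 0xC4 0x81 is
the UTF-8 encoding of U+0101, that the same character is the byte 0xE0
in ISO 8859-4 (whose low seven bits give the reported code point 96),
and that a 96x96 layout starting at U+0100 yields the reported "32 33".
The 96x96 layout is an assumption about how the
mule-unicode-0100-24ff charset is organized, chosen because it
reproduces the quoted numbers; mule_unicode_0100_position() is a
hypothetical helper, not an Emacs function.

```python
FILE_CODE = bytes([0xC4, 0x81])  # "file code" reported in both outputs


def mule_unicode_0100_position(codepoint):
    """Position codes in an assumed 96x96 charset covering U+0100..U+24FF."""
    if not 0x100 <= codepoint <= 0x24FF:
        raise ValueError("code point outside U+0100..U+24FF")
    row, col = divmod(codepoint - 0x100, 96)
    return row + 32, col + 32


# Both buffers save to the same UTF-8 bytes, so they hold the same character.
ch = FILE_CODE.decode("utf-8")

# The same character as a single byte in ISO 8859-4 (Latin-4).
latin4 = ch.encode("iso8859_4")

print("Unicode code point: U+%04X" % ord(ch))
print("ISO 8859-4 byte: 0x%02X, 7-bit position: %d" % (latin4[0], latin4[0] & 0x7F))
print("96x96 position:", mule_unicode_0100_position(ord(ch)))
```

In other words, the file on disk is identical in both cases; only the
charset chosen for the in-memory character differs, and with it the
font that has to be found.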




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-12 23:38                     ` ken
  2009-06-13  4:11                       ` Eli Zaretskii
@ 2009-06-14 20:59                       ` Stefan Monnier
  1 sibling, 0 replies; 25+ messages in thread
From: Stefan Monnier @ 2009-06-14 20:59 UTC (permalink / raw)
  To: gebser; +Cc: Lennart Borgman, Emacs-Devel devel

> Thank you, Lennart!  To give the people at emacs-devel some context to
> the issue, the salient portion of the previous post is pasted below:

IIUC an important missing detail is that you're using Emacs-22 and that
this same problem won't happen in Emacs-23, right?
I could imagine adding a help-text that would pop-up when the mouse is
over one of those dreaded square boxes.  But this problem has been
around for a while now, and should be much more rare in Emacs-23, so I'm
not sure it's worth "fixing".  Or rather I think that the approach taken
in Emacs-23 is such a fix.


        Stefan




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-13  5:50                       ` Richard Stallman
@ 2009-06-15  4:34                         ` Miles Bader
  2009-06-15 19:30                           ` Richard Stallman
  2009-06-15 20:06                         ` Chong Yidong
  1 sibling, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-06-15  4:34 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

Richard Stallman <rms@gnu.org> writes:
> Would it be possible to display the codepoint numerically in the box?

Would that help much?  I'm not sure that the codepoint is very useful to
most people (and the information is easily available via C-x =)...

[Well I suppose it might be a little useful, but maybe not enough to
justify implementation costs...]

-Miles

-- 
(\(\
(^.^)
(")")
*This is the cute bunny virus, please copy this into your sig so it can spread.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-15  4:34                         ` Miles Bader
@ 2009-06-15 19:30                           ` Richard Stallman
  2009-06-16  0:30                             ` James Cloos
  2009-06-16 20:48                             ` Stefan Monnier
  0 siblings, 2 replies; 25+ messages in thread
From: Richard Stallman @ 2009-06-15 19:30 UTC (permalink / raw)
  To: Miles Bader; +Cc: emacs-devel

    > Would it be possible to display the codepoint numerically in the box?

    Would that help much?  I'm not sure that the codepoint is very useful to
    most people (and the information is easily available via C-x =)...

I think it would be quite useful.  First, you would immediately see
which of the undisplayable characters are the same.  Second, you might
come to recognize a few common codepoints, and that would be useful.

Whether it's worth the trouble depends on how much trouble that is,
which I don't know.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-13  5:50                       ` Richard Stallman
  2009-06-15  4:34                         ` Miles Bader
@ 2009-06-15 20:06                         ` Chong Yidong
  2009-06-15 21:57                           ` Drew Adams
  2009-06-16  5:30                           ` Richard Stallman
  1 sibling, 2 replies; 25+ messages in thread
From: Chong Yidong @ 2009-06-15 20:06 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel, Miles Bader

Richard Stallman <rms@gnu.org> writes:

> Would it be possible to display the codepoint numerically in the box?

I don't think there's enough space.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: utf8 char display in buffer
  2009-06-15 20:06                         ` Chong Yidong
@ 2009-06-15 21:57                           ` Drew Adams
  2009-06-16  5:30                           ` Richard Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: Drew Adams @ 2009-06-15 21:57 UTC (permalink / raw)
  To: 'Chong Yidong', rms; +Cc: 'Miles Bader', emacs-devel

> > Would it be possible to display the codepoint numerically 
> > in the box?
> 
> I don't think there's enough space.

I don't know whether Richard really meant to put the number inside the little
character-size box.

But a tooltip (mouseover) would work. And it would have room for more than just
the codepoint. It would not show you more than one at a time, however. (He
mentioned seeing immediately which little boxes represented the same codepoint.)





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-15 19:30                           ` Richard Stallman
@ 2009-06-16  0:30                             ` James Cloos
  2009-06-16  1:10                               ` Miles Bader
  2009-06-16 13:53                               ` Chong Yidong
  2009-06-16 20:48                             ` Stefan Monnier
  1 sibling, 2 replies; 25+ messages in thread
From: James Cloos @ 2009-06-16  0:30 UTC (permalink / raw)
  To: emacs-devel; +Cc: rms, Miles Bader

>> Would it be possible to display the codepoint numerically in the box?

Displaying the UCS Code Point for characters which lack font support is
the norm in GTK.  This is done by drawing a box around four or six digit
glyphs which are rendered in a smaller point size.  I'd expect that this
is easier to read when using a proportional face.

Apple went in a slightly different direction and commissioned a fallback
font from Michael Everson which has one glyph per Unicode script and uses
that for each character associated with said script.

Emacs could easily do either (w/o the need for a font in the latter case).

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-16  0:30                             ` James Cloos
@ 2009-06-16  1:10                               ` Miles Bader
  2009-06-16  1:12                                 ` Miles Bader
  2009-06-16 13:53                               ` Chong Yidong
  1 sibling, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-06-16  1:10 UTC (permalink / raw)
  To: James Cloos; +Cc: rms, emacs-devel

James Cloos <cloos@jhcloos.com> writes:
> Displaying the UCS Code Point for characters which lack font support is
> the norm in GTK.  This is done by drawing a box around four or six digit
> glyphs which are rendered in a smaller point size.  I'd expect that this
> is easier to read when using a proportional face.
>
> Apple went in a slightly different direction and commissioned a fallback
> font from Michael Everson which has one glyph per Unicode script and uses
> that for each character associated with said script.
>
> Emacs could easily do either (w/o the need for a font in the latter case).

The GTK method does screw up one good thing about emacs' method -- the
boxes it displays are generally the correct width (single- or double-
width [CJK etc]), so text alignment is preserved.
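
The width point can be sketched in Python using the East Asian Width
property from the standard unicodedata module. This is only an
illustration of keeping placeholders one or two columns wide, not how
the Emacs display code decides; placeholder_columns() and
render_with_placeholders() are hypothetical names, and treating only
the "W" and "F" classes as double-width is an assumption.

```python
import unicodedata


def placeholder_columns(ch):
    """Return 2 for wide or fullwidth characters (e.g. CJK), otherwise 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def render_with_placeholders(text, displayable):
    """Replace characters not in displayable with '?' of matching width."""
    out = []
    for ch in text:
        if ch in displayable:
            out.append(ch)
        else:
            out.append("?" * placeholder_columns(ch))
    return "".join(out)


if __name__ == "__main__":
    # U+4E2D is a CJK ideograph, so its placeholder takes two columns.
    print(render_with_placeholders("a\u4e2db", {"a", "b"}))
```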

The apple method might be able to preserve the width, and seems better
for the user anyway -- I think the most useful info is "what kind of
font should I install to fix this" and/or "do I really care enough to
fix this", so identifying the script is probably more important than
identifying the precise codepoint.

Drew's suggestion of a tooltip seems like it might be easier to
implement, and more functional than either in practice though --
it could display a lot more information without screwing up alignment,
basically a slightly more convenient/obvious version of C-x =

-Miles

-- 
Egotist, n. A person of low taste, more interested in himself than in me.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-16  1:10                               ` Miles Bader
@ 2009-06-16  1:12                                 ` Miles Bader
  2009-06-17  5:07                                   ` Richard Stallman
  0 siblings, 1 reply; 25+ messages in thread
From: Miles Bader @ 2009-06-16  1:12 UTC (permalink / raw)
  To: James Cloos; +Cc: rms, emacs-devel

Miles Bader <miles@gnu.org> writes:
> The GTK method does screw up one good thing about emacs' method -- the
> boxes it displays are generally the correct width (single- or double-
> width [CJK etc]), so text alignment is preserved.

Er, to be more clear, that should be "the boxes _emacs_ displays are
generally the correct width...".

-Miles

-- 
My books focus on timeless truths.  -- Donald Knuth




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-15 20:06                         ` Chong Yidong
  2009-06-15 21:57                           ` Drew Adams
@ 2009-06-16  5:30                           ` Richard Stallman
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Stallman @ 2009-06-16  5:30 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel, miles

    > Would it be possible to display the codepoint numerically in the box?

    I don't think there's enough space.

What determines how much space there is?  Could we make it wider?
Perhaps an earlier stage of redisplay could check whether there is a font
for the character.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: utf8 char display in buffer
  2009-06-16  0:30                             ` James Cloos
  2009-06-16  1:10                               ` Miles Bader
@ 2009-06-16 13:53                               ` Chong Yidong
  1 sibling, 0 replies; 25+ messages in thread
From: Chong Yidong @ 2009-06-16 13:53 UTC (permalink / raw)
  To: James Cloos; +Cc: Miles Bader, rms, emacs-devel

James Cloos <cloos@jhcloos.com> writes:

> Apple went in a slightly different direction and commissioned a fallback
> font from Michael Everson which has one glyph per Unicode script and uses
> that for each character associated with said script.

If your system has a font for displaying a character, Emacs will
automatically use that font.  But I don't think we should package a
fallback font into the Emacs distribution.  The benefit would be
marginal anyway; if a user lacks a system font for displaying a script,
he or she probably cannot read that script.





* Re: utf8 char display in buffer
  2009-06-15 19:30                           ` Richard Stallman
  2009-06-16  0:30                             ` James Cloos
@ 2009-06-16 20:48                             ` Stefan Monnier
  1 sibling, 0 replies; 25+ messages in thread
From: Stefan Monnier @ 2009-06-16 20:48 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel, Miles Bader

Can we stop wasting time on this: this was a problem in Emacs-22.
There's no evidence that this problem is significant in Emacs-23.


        Stefan "I said it already elsewhere, but I'm afraid it got lost"





* Re: utf8 char display in buffer
  2009-06-16  1:12                                 ` Miles Bader
@ 2009-06-17  5:07                                   ` Richard Stallman
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Stallman @ 2009-06-17  5:07 UTC (permalink / raw)
  To: Miles Bader; +Cc: cloos, emacs-devel

    > The GTK method does screw up one good thing about emacs' method -- the
    > boxes it displays are generally the correct width (single- or double-
    > width [CJK etc]), so text alignment is preserved.

    Er, to be more clear, that should be "the boxes _emacs_ displays are
    generally the correct width...".

Another idea is to display the bottom byte of the character code as
hex in the box.  Maybe two hex digits can fit if they are small, and
the user would sometimes be able to identify the character from those.

Another idea is to display in the box a slightly smaller version of
the character which the bottom byte represents.
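
To make the bottom-byte idea concrete, here is a minimal sketch. It
assumes "bottom byte" means the low 8 bits of the Unicode code point;
the function name is hypothetical and this is not how Emacs draws its
boxes.

```python
def low_byte_label(ch: str) -> str:
    """Return the low 8 bits of ch's code point as two hex digits."""
    return format(ord(ch) & 0xFF, "02X")


# Characters taken from the rfc1345 examples earlier in the thread.
for ch in ("\u0101", "\u012b", "\u03a9"):
    print(f"U+{ord(ch):04X} -> {low_byte_label(ch)}")
```

Two hex digits cannot identify a character on their own, since every
code point that shares the same low byte gets the same label; that
matches the caveat that the user would only sometimes be able to
identify the character.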





Thread overview: 25+ messages
     [not found] <mailman.227.1244485995.2239.help-gnu-emacs@gnu.org>
     [not found] ` <878wjzugdo.fsf@iki.fi>
2009-06-11 12:55   ` utf8 char display in buffer Lennart Borgman
2009-06-11 13:04     ` Andreas Schwab
2009-06-11 13:07       ` Lennart Borgman
2009-06-11 13:08         ` Lennart Borgman
2009-06-11 13:24           ` Tassilo Horn
     [not found] ` <mailman.297.1244559110.2239.help-gnu-emacs@gnu.org>
     [not found]   ` <7I2dndeTy7sqkLLXnZ2dnUVZ_gmdnZ2d@sysmatrix.net>
     [not found]     ` <pc7hbyofaoq.fsf@panix2.panix.com>
     [not found]       ` <aJKdnVvH4ebY5a3XnZ2dnUVZ_vOdnZ2d@sysmatrix.net>
     [not found]         ` <mailman.522.1244818530.2239.help-gnu-emacs@gnu.org>
     [not found]           ` <pc7fxe5mpfh.fsf@panix2.panix.com>
     [not found]             ` <JOqdncHRqaVlG6_XnZ2dnUVZ_u6dnZ2d@sysmatrix.net>
     [not found]               ` <db6d8fe0-4cd5-4499-a8a8-466203889a83@y34g2000prb.googlegroups.com>
     [not found]                 ` <4A32D54D.1040405@mousecar.com>
2009-06-12 22:27                   ` Lennart Borgman
2009-06-12 23:38                     ` ken
2009-06-13  4:11                       ` Eli Zaretskii
2009-06-13 12:30                         ` ken
2009-06-13 13:23                           ` Eli Zaretskii
2009-06-14 20:59                       ` Stefan Monnier
2009-06-13  1:36                     ` Miles Bader
2009-06-13  1:43                       ` Lennart Borgman
2009-06-13  5:50                       ` Richard Stallman
2009-06-15  4:34                         ` Miles Bader
2009-06-15 19:30                           ` Richard Stallman
2009-06-16  0:30                             ` James Cloos
2009-06-16  1:10                               ` Miles Bader
2009-06-16  1:12                                 ` Miles Bader
2009-06-17  5:07                                   ` Richard Stallman
2009-06-16 13:53                               ` Chong Yidong
2009-06-16 20:48                             ` Stefan Monnier
2009-06-15 20:06                         ` Chong Yidong
2009-06-15 21:57                           ` Drew Adams
2009-06-16  5:30                           ` Richard Stallman
