unofficial mirror of emacs-devel@gnu.org 
* 22.2.50; Display of "zero width no-break space" (U+FEFF)
@ 2008-04-06 20:53 Reiner Steib
  2008-04-20 10:35 ` Reiner Steib
  0 siblings, 1 reply; 7+ messages in thread
From: Reiner Steib @ 2008-04-06 20:53 UTC (permalink / raw)
  To: emacs-pretest-bug

Hi,

in Emacs 22.2.50, "zero width no-break space" (U+FEFF) is displayed as
a hollow box. [1]

I think for normal buffers it should be displayed like ` ' (NO-BREAK
SPACE, U+00A0) i.e. using the face `nobreak-space'.  But when
`nobreak-char-display' is nil, it should not be displayed at all or
like a normal SPC.  (Dunno what Unicode says about it.)

In gedit, xedit (with the same font as in Emacs) and Firefox[2], the
char is displayed like a space char (i.e. not "zero width").

Bye, Reiner.

[1]
,----[ M-x describe-char RET ]
|       character:  (325983, #o1174537, #x4f95f, U+FEFF)
|         charset: mule-unicode-e000-ffff
| 		 (Unicode characters of the range U+E000..U+FFFF.)
|      code point: #x72 #x5F
|          syntax: w 	which means: word
|     buffer code: #x9C #xF3 #xF2 #xDF
|       file code: #xEF #xBB #xBF (encoded by coding system utf-8)
|         display: by this font (glyph code)
|      -Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1 (#xFEFF)
|    Unicode data:  
|            Name: ZERO WIDTH NO-BREAK SPACE
|        Category: other format
| Combining class: Spacing
|   Bidi category: Boundary Neutral
|        Old name: BYTE ORDER MARK
`----
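
The Unicode properties and the UTF-8 file code reported by
describe-char above can be cross-checked outside Emacs.  The following
is a minimal Python sketch using only the standard `unicodedata'
module.  It is an illustration for readers, not part of the original
report, and the helper name `describe' is hypothetical.

```python
import unicodedata


def describe(ch):
    """Return a few Unicode properties of a single character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),      # two-letter general category
        "bidi": unicodedata.bidirectional(ch),     # bidirectional class
        "combining": unicodedata.combining(ch),    # canonical combining class
        "utf8": ch.encode("utf-8"),                # bytes as written to a UTF-8 file
    }


info = describe("\ufeff")
for key, value in info.items():
    print(f"{key:>10}: {value}")
```

The category code "Cf" corresponds to what describe-char prints as
"other format", and "BN" to "Boundary Neutral".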

[2] In both lines, the first char after the first space is U+FEFF:
,----[ http://article.gmane.org/gmane.discuss/11614/ ]
|regarding gmane.linux.debian.devel.www :  [...]
|like http://news.gmane.org/gmane.linux.debian.devel.www 
`----
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/





* Re: 22.2.50; Display of "zero width no-break space" (U+FEFF)
  2008-04-06 20:53 22.2.50; Display of "zero width no-break space" (U+FEFF) Reiner Steib
@ 2008-04-20 10:35 ` Reiner Steib
  2008-04-20 23:57   ` Juri Linkov
  0 siblings, 1 reply; 7+ messages in thread
From: Reiner Steib @ 2008-04-20 10:35 UTC (permalink / raw)
  To: emacs-pretest-bug

On Sun, Apr 06 2008, Reiner Steib wrote:

> in Emacs 22.2.50, "zero width no-break space" (U+FEFF) is displayed as
> a hollow box. [1]
>
> I think for normal buffers it should be displayed like ` ' (NO-BREAK
> SPACE, U+00A0) i.e. using the face `nobreak-space'.  But when
> `nobreak-char-display' is nil, it should not be displayed at all or
> like a normal SPC.  (Dunno what Unicode says about it.)

Any opinions on this?

> In gedit, xedit (with the same font as in Emacs) and Firefox[2], the
> char is displayed like a space char (i.e. not "zero width").

Similar: U+2403

,----
|       character: ␃ (343747, #o1237303, #x53ec3, U+2403)
| [...]
|         display: by this font (glyph code)
|      -Misc-Fixed-Medium-R-SemiCondensed--12-110-75-75-C-60-ISO10646-1 (#x2403)
|    Unicode data:  
|            Name: SYMBOL FOR END OF TEXT
|        Category: other symbol
| Combining class: Spacing
|   Bidi category: Other Neutrals
|        Old name: GRAPHIC FOR END OF TEXT
`----

> [1]
> ,----[ M-x describe-char RET ]
> |       character:  (325983, #o1174537, #x4f95f, U+FEFF)
> |         charset: mule-unicode-e000-ffff
> | 		 (Unicode characters of the range U+E000..U+FFFF.)
> |      code point: #x72 #x5F
> |          syntax: w 	which means: word
> |     buffer code: #x9C #xF3 #xF2 #xDF
> |       file code: #xEF #xBB #xBF (encoded by coding system utf-8)
> |         display: by this font (glyph code)
> |      -Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1 (#xFEFF)
> |    Unicode data:  
> |            Name: ZERO WIDTH NO-BREAK SPACE
> |        Category: other format
> | Combining class: Spacing
> |   Bidi category: Boundary Neutral
> |        Old name: BYTE ORDER MARK
> `----
>
> [2] In both lines, the first char after the first space is U+FEFF:
> ,----[ http://article.gmane.org/gmane.discuss/11614/ ]
> |regarding gmane.linux.debian.devel.www :  [...]
> |like http://news.gmane.org/gmane.linux.debian.devel.www 
> `----

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/





* Re: 22.2.50; Display of "zero width no-break space" (U+FEFF)
  2008-04-20 10:35 ` Reiner Steib
@ 2008-04-20 23:57   ` Juri Linkov
  2008-04-21  3:09     ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Juri Linkov @ 2008-04-20 23:57 UTC (permalink / raw)
  To: emacs-pretest-bug

>> in Emacs 22.2.50, "zero width no-break space" (U+FEFF) is displayed as
>> a hollow box. [1]
>>
>> I think for normal buffers it should be displayed like ` ' (NO-BREAK
>> SPACE, U+00A0) i.e. using the face `nobreak-space'.  But when
>> `nobreak-char-display' is nil, it should not be displayed at all or
>> like a normal SPC.  (Dunno what Unicode says about it.)
>
> Any opinions on this?

This is the same character as BOM.  But since it is *ZERO-WIDTH*
NO-BREAK SPACE, it seems it shouldn't be displayed at all in view mode.
But in edit mode, it would be preferable to have some indication
about the presence of this character in the buffer.
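
Because U+FEFF doubles as the byte order mark, whether it ends up in
the decoded text at all can depend on how the file is decoded.  A
minimal Python sketch of that distinction follows.  It uses Python's
codec names, not Emacs coding systems, and the sample bytes are made
up for illustration.

```python
import codecs

# A UTF-8 byte stream that starts with the BOM bytes EF BB BF.
raw = codecs.BOM_UTF8 + "text".encode("utf-8")

kept = raw.decode("utf-8")          # plain UTF-8 keeps U+FEFF as a character
stripped = raw.decode("utf-8-sig")  # the "-sig" codec drops one leading BOM

print(repr(kept))
print(repr(stripped))

# A U+FEFF in the middle of the text is not treated as a BOM and survives.
inner = "a\ufeffb".encode("utf-8").decode("utf-8-sig")
print(repr(inner))
```

Only the leading occurrence is a candidate for being consumed as a
signature; any later U+FEFF remains an ordinary character that the
display has to deal with.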

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: 22.2.50; Display of "zero width no-break space" (U+FEFF)
  2008-04-20 23:57   ` Juri Linkov
@ 2008-04-21  3:09     ` Eli Zaretskii
  2008-04-21 11:51       ` Reiner Steib
  0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2008-04-21  3:09 UTC (permalink / raw)
  To: emacs-devel

> From: Juri Linkov <juri@jurta.org>
> Date: Mon, 21 Apr 2008 02:57:29 +0300
> Cc: 
> 
> >> in Emacs 22.2.50, "zero width no-break space" (U+FEFF) is displayed as
> >> a hollow box. [1]
> >>
> >> I think for normal buffers it should be displayed like ` ' (NO-BREAK
> >> SPACE, U+00A0) i.e. using the face `nobreak-space'.  But when
> >> `nobreak-char-display' is nil, it should not be displayed at all or
> >> like a normal SPC.  (Dunno what Unicode says about it.)
> >
> > Any opinions on this?
> 
> This is the same character as BOM.  But since it is *ZERO-WIDTH*
> NO-BREAK SPACE, it seems it shouldn't be displayed at all in view mode.
> But in edit mode, it would be preferable to have some indication
> about the presence of this character in the buffer.

I think there will be many characters such as U+FEFF defined by
Unicode that we will need to decide how to display in various modes.
It's not right, IMO, to decide about their display one by one, when
someone happens to pop a question.  We should probably have some kind
of general policy about them.  I suggest that Someone(TM) looks at
this issue and suggests how we should deal with these characters,
based on Unicode recommendations and on what other Unicode-capable
editors do.
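
As a starting point for such a survey, the characters in question can
be enumerated from the Unicode character database rather than
collected one by one.  A rough Python sketch follows, assuming that
general category Cf ("other format") is the relevant class and
restricting the scan to the Basic Multilingual Plane.  Both choices
are assumptions made for this illustration, not something proposed in
this thread, and the function name is hypothetical.

```python
import unicodedata


def format_characters(start=0x0000, end=0xFFFF):
    """Yield (codepoint, name) for characters of category Cf in [start, end]."""
    for cp in range(start, end + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Cf":
            # Fall back to a placeholder for any character without a name.
            yield cp, unicodedata.name(ch, "<unnamed>")


found = dict(format_characters())
for cp, name in sorted(found.items()):
    print(f"U+{cp:04X}  {name}")
```

The exact list depends on the Unicode version shipped with the Python
interpreter.  NO-BREAK SPACE (U+00A0) is a space separator rather than
a format character, so it does not appear in this list.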






* Re: 22.2.50; Display of "zero width no-break space" (U+FEFF)
  2008-04-21  3:09     ` Eli Zaretskii
@ 2008-04-21 11:51       ` Reiner Steib
  2008-04-21 22:33         ` Juri Linkov
  0 siblings, 1 reply; 7+ messages in thread
From: Reiner Steib @ 2008-04-21 11:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[ The following message is a courtesy copy of an article that has
  been posted to news:gmane.emacs.devel as well. ]

On Mon, Apr 21 2008, Eli Zaretskii wrote:

>> From: Juri Linkov <juri@jurta.org>
>> This is the same character as BOM.  But since it is *ZERO-WIDTH*
>> NO-BREAK SPACE, it seems it shouldn't be displayed at all in view mode.
>> But in edit mode, it would be preferable to have some indication
>> about the presence of this character in the buffer.

If "view mode" is meant to be more general than the minor mode
`view-mode', I agree.  E.g. in Gnus article buffer (a read-only buffer
where articles are displayed), `nobreak-char-display' is set to nil
whereas when composing or replying, Emacs' default is used.

> I think there will be many characters such as U+FEFF defined by
> Unicode that we will need to decide how to display in various modes.
> It's not right, IMO, to decide about their display one by one, when
> someone happens to pop a question.  We should probably have some kind
> of general policy about them.  I suggest that Someone(TM) looks at
> this issue and suggests how we should deal with these characters,
> based on Unicode recommendations and on what other Unicode-capable
> editors do.

I agree.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/





* Re: 22.2.50; Display of "zero width no-break space" (U+FEFF)
  2008-04-21 11:51       ` Reiner Steib
@ 2008-04-21 22:33         ` Juri Linkov
  2008-04-23  5:14           ` Chong Yidong
  0 siblings, 1 reply; 7+ messages in thread
From: Juri Linkov @ 2008-04-21 22:33 UTC (permalink / raw)
  To: emacs-devel

>>> This is the same character as BOM.  But since it is *ZERO-WIDTH*
>>> NO-BREAK SPACE, it seems it shouldn't be displayed at all in view mode.
>>> But in edit mode, it would be preferable to have some indication
>>> about the presence of this character in the buffer.
>
> If "view mode" is meant to be more general than the minor mode
> `view-mode', I agree.  E.g. in Gnus article buffer (a read-only buffer
> where articles are displayed), `nobreak-char-display' is set to nil
> whereas when composing or replying, Emacs' default is used.

We definitely should highlight ZERO-WIDTH NO-BREAK SPACE in
whitespace-mode, but for editing modes I'm not so sure.

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: 22.2.50; Display of "zero width no-break space" (U+FEFF)
  2008-04-21 22:33         ` Juri Linkov
@ 2008-04-23  5:14           ` Chong Yidong
  0 siblings, 0 replies; 7+ messages in thread
From: Chong Yidong @ 2008-04-23  5:14 UTC (permalink / raw)
  To: Juri Linkov; +Cc: emacs-devel

Juri Linkov <juri@jurta.org> writes:

>>>> This is the same character as BOM.  But since it is *ZERO-WIDTH*
>>>> NO-BREAK SPACE, it seems it shouldn't be displayed at all in view mode.
>>>> But in edit mode, it would be preferable to have some indication
>>>> about the presence of this character in the buffer.
>>
>> If "view mode" is meant to be more general than the minor mode
>> `view-mode', I agree.  E.g. in Gnus article buffer (a read-only buffer
>> where articles are displayed), `nobreak-char-display' is set to nil
>> whereas when composing or replying, Emacs' default is used.
>
> We definitely should highlight ZERO-WIDTH NO-BREAK SPACE in
> whitespace-mode, but for editing modes I'm not so sure.

By default, Emacs should highlight ZERO-WIDTH NO-BREAK SPACE.  If
individual major modes, like Gnus buffers, want to hide this character,
that's up to them; we don't have to worry about it.  It's just like how
Emacs ordinarily displays page breaks as ^L, whereas the Gnus article
buffer has special code to split articles into pages.




