* whitespace includes U+3000
From: Dan Jacobson @ 2006-06-25 2:11 UTC
Cc: handa

Are emacs whitespace detectors aware of Unicode characters like U+3000?
show-trailing-whitespace and other (apropos (quote ("whitespace")))
stuff aren't.

* Re: whitespace includes U+3000
From: Kenichi Handa @ 2006-06-26 1:00 UTC
Cc: bug-gnu-emacs

In article <871wtegf13.fsf@jidanni.org>, Dan Jacobson <jidanni@jidanni.org> writes:

> Are emacs whitespace detectors aware of Unicode characters like
> U+3000?

No. The current Emacs treats only TAB and SPACE as "whitespace"
characters. This should be fixed in emacs-unicode-2, which contains
Unicode character category data.

---
Kenichi Handa
handa@m17n.org

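To make the "Unicode character category data" point concrete, here is a minimal sketch in Python (not Emacs code) that looks up the general category of the characters that come up in this thread. The table `CANDIDATES` and the helper `is_space_separator` are hypothetical names introduced for illustration, and treating category Zs ("space separator") as the definition of whitespace is an assumption, not something the thread settles.

```python
import unicodedata

# Characters discussed in the thread (names follow the Unicode character names
# where one exists; TAB and FORMFEED are control characters).
CANDIDATES = {
    "TAB": "\t",
    "SPACE": " ",
    "FORMFEED": "\f",
    "NO-BREAK SPACE": "\u00a0",
    "EN QUAD": "\u2000",
    "IDEOGRAPHIC SPACE": "\u3000",
}


def is_space_separator(ch: str) -> bool:
    """Return True if ch has Unicode general category Zs (space separator).

    Assumption: Zs is used here as one possible definition of "whitespace".
    Control characters such as TAB and formfeed are category Cc, so they
    are NOT covered by this test.
    """
    return unicodedata.category(ch) == "Zs"


# Print code point, name, and general category for each candidate.
for name, ch in CANDIDATES.items():
    print(f"U+{ord(ch):04X} {name:18} {unicodedata.category(ch)}")
```

Note that a category-based test and the "TAB and SPACE" rule disagree in both directions: TAB is not in Zs, while U+00A0 and U+3000 are.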
* Re: whitespace includes U+3000
From: Richard Stallman @ 2006-06-27 10:34 UTC
Cc: bug-gnu-emacs, jidanni

    > Are emacs whitespace detectors aware of Unicode characters like
    > U+3000?

    No. The current Emacs treats only TAB and SPACE as "whitespace"
    characters.

It would be very easy to fix this by setting the syntax table entries
for those characters--if there are not too many of them. So why not
fix it?

* Re: whitespace includes U+3000
From: Kenichi Handa @ 2006-06-27 11:48 UTC
Cc: bug-gnu-emacs, handa, jidanni

In article <E1FvAu1-0000gN-I2@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>> Are emacs whitespace detectors aware of Unicode characters like
>> U+3000?

> No. The current Emacs treats only TAB and SPACE as
> "whitespace" characters.

> It would be very easy to fix this by setting the syntax table entries
> for those characters--if there are not too many of them. So why not
> fix it?

Are you sure that the "whitespace" of syntax has the same meaning as the
"whitespace" of show-trailing-whitespace?

For instance, ^L (formfeed) currently has "whitespace" syntax, but it is
displayed with the glyph "^L". Should it be a target of
show-trailing-whitespace?

For instance, NBSP (U+00A0) currently has syntax "." (punctuation), and
it is displayed with a special face to indicate the existence of that
character. Should it be changed to "whitespace" syntax, or should it
not be changed?

Have you considered these things? Please try M-x apropos RET whitespace
RET. The word "whitespace" is used with slightly different meanings. I
think we can't blindly use "whitespace" syntax in some cases.

---
Kenichi Handa
handa@m17n.org

* Re: whitespace includes U+3000
From: Richard Stallman @ 2006-06-28 17:25 UTC
Cc: bug-gnu-emacs, jidanni, handa

    > No. The current Emacs treats only TAB and SPACE as
    > "whitespace" characters.

    > It would be very easy to fix this by setting the syntax table entries
    > for those characters--if there are not too many of them. So why not
    > fix it?

    Are you sure that the "whitespace" of syntax has the same meaning as
    the "whitespace" of show-trailing-whitespace?

I am not sure which one we're talking about here.
Is it show-trailing-whitespace?

If so, that would also be easy to change, if it ought to be changed.

    For instance, ^L (formfeed) currently has "whitespace" syntax, but
    it is displayed with the glyph "^L". Should it be a target of
    show-trailing-whitespace?

No.

    For instance, NBSP (U+00A0) currently has syntax "." (punctuation),
    and it is displayed with a special face to indicate the existence of
    that character. Should it be changed to "whitespace" syntax, or
    should it not be changed?

The special face for that character should not be overridden, but the
other whitespace after it _and before it_ should probably be displayed
specially by show-trailing-whitespace.

You can probably get this result by putting NBSP into the pattern
for show-trailing-whitespace to recognize. Redisplay will override
the face for the NBSP.

* Re: whitespace includes U+3000
From: Kenichi Handa @ 2006-06-29 2:01 UTC
Cc: bug-gnu-emacs, handa, jidanni

In article <E1Fvdme-00060m-A2@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>> No. The current Emacs treats only TAB and SPACE as
>> "whitespace" characters.

>> It would be very easy to fix this by setting the syntax table entries
>> for those characters--if there are not too many of them. So why not
>> fix it?

> Are you sure that the "whitespace" of syntax has the same
> meaning as the "whitespace" of show-trailing-whitespace?

> I am not sure which one we're talking about here.
> Is it show-trailing-whitespace?

show-trailing-whitespace is just an example. I think his question is
about every Emacs facility that handles "whitespace" in some sense
(examples are listed by M-x apropos RET whitespace RET).

> If so, that would also be easy to change, if it ought to be changed.

Of course it's easy to change. The difficult thing is to determine
whether it ought to be changed.

> For instance, ^L (formfeed) currently has "whitespace" syntax, but
> it is displayed with the glyph "^L". Should it be a target of
> show-trailing-whitespace?

> No.

Then "whitespace" has different meanings: the set of characters that
have "whitespace" syntax differs from the set of characters that are
displayed with a "whitespace" glyph. So we can't use "whitespace"
syntax, at least not for show-trailing-whitespace.

> For instance, NBSP (U+00A0) currently has syntax "."
> (punctuation), and it is displayed with a special face to
> indicate the existence of that character. Should it be
> changed to "whitespace" syntax, or should it not be changed?

> The special face for that character should not be overridden, but the
> other whitespace after it _and before it_ should probably be displayed
> specially by show-trailing-whitespace.

> You can probably get this result by putting NBSP into the pattern
> for show-trailing-whitespace to recognize. Redisplay will override
> the face for the NBSP.

What do you mean by "pattern" here? A regular expression? Currently the
function highlight_trailing_whitespace doesn't use a regular expression
but checks for TAB and SPACE directly (i.e. they are hardcoded).

By the way, I've just found that the special face for NBSP is currently
overridden by show-trailing-whitespace. That is because
highlight_trailing_whitespace is called near the end of display_line.

Anyway, Unicode has many more space-like characters (e.g.
U+2000..U+200B). Should they be treated the same way as NBSP (i.e.
displayed with nobreak-face)? Or as SPACE?

How about the case of fixup-whitespace? It seems that this function
should delete only TAB and SPACE. So here we have a third meaning of
"whitespace": just TAB and SPACE.

How about the case of delete-trailing-whitespace?

How about the case of ...

Do you think we should clearly define the semantics of "whitespace" and
"space character" in all cases before the release, and modify the code
if necessary?

---
Kenichi Handa
handa@m17n.org

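The question of which character set a trailing-whitespace check should use can be sketched as follows. This is a Python illustration only, not the logic of highlight_trailing_whitespace in Emacs; the set names and the function `trailing_whitespace_start` are hypothetical, and the choice of U+2000..U+200A plus U+3000 for the widest set is an assumption made for the example.

```python
# Three hypothetical definitions of "whitespace", loosely following the thread.
TAB_AND_SPACE = frozenset("\t ")                      # the hardcoded pair
WITH_NBSP = TAB_AND_SPACE | {"\u00a0"}                # add NBSP to the "pattern"
WITH_UNICODE_SPACES = (
    WITH_NBSP
    | {chr(c) for c in range(0x2000, 0x200B)}         # U+2000..U+200A (assumed)
    | {"\u3000"}                                      # IDEOGRAPHIC SPACE
)


def trailing_whitespace_start(line: str, space_chars=TAB_AND_SPACE) -> int:
    """Return the index at which the trailing run of space_chars begins.

    The scan walks backward from the end of the line and stops at the
    first character that is not in space_chars. If the line has no
    trailing whitespace, the result equals len(line).
    """
    i = len(line)
    while i > 0 and line[i - 1] in space_chars:
        i -= 1
    return i


if __name__ == "__main__":
    sample = "foo \u3000"
    # With only TAB and SPACE, the final U+3000 stops the scan immediately.
    print(trailing_whitespace_start(sample, TAB_AND_SPACE))
    # With the wider set, both the SPACE and U+3000 count as trailing.
    print(trailing_whitespace_start(sample, WITH_UNICODE_SPACES))
```

Passing the character set as a parameter makes the disagreement explicit: the same line has trailing whitespace under one definition and none under another, which is why the semantics have to be decided per feature.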