* whitespace includes U+3000
From: Dan Jacobson @ 2006-06-25  2:11 UTC
Cc: handa

Are emacs whitespace detectors aware of Unicode characters like
U+3000?  show-trailing-whitespace and other
(apropos (quote ("whitespace"))) stuff aren't.

* Re: whitespace includes U+3000
From: Kenichi Handa @ 2006-06-26  1:00 UTC
Cc: bug-gnu-emacs

In article <871wtegf13.fsf@jidanni.org>, Dan Jacobson <jidanni@jidanni.org> writes:

> Are emacs whitespace detectors aware of Unicode characters like
> U+3000?

No.  The current Emacs treats only TAB and SPACE as "whitespace"
characters.  This should be fixed in emacs-unicode-2, which contains
Unicode character category data.

---
Kenichi Handa
handa@m17n.org

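As a rough illustration of the "Unicode character category data" mentioned in the reply above, the following Python sketch prints the general category of the characters that come up in this thread. Python is used only for illustration; this is not Emacs code, and the helper names `describe` and `is_space_separator` are hypothetical.

```python
import unicodedata

# Code points discussed in this thread, plus TAB and SPACE for comparison.
CANDIDATES = [0x0009, 0x000C, 0x0020, 0x00A0, 0x2000, 0x200B, 0x3000]


def describe(cp):
    """Return (general category, character name) for a code point.

    Control characters have no Unicode name, so "<control>" is used instead.
    """
    ch = chr(cp)
    return unicodedata.category(ch), unicodedata.name(ch, "<control>")


def is_space_separator(cp):
    """True when the general category is Zs (Space_Separator)."""
    return unicodedata.category(chr(cp)) == "Zs"


if __name__ == "__main__":
    for cp in CANDIDATES:
        category, name = describe(cp)
        print(f"U+{cp:04X}  {category}  {name}")
```

Category data alone already separates SPACE, NBSP and U+3000 (all Zs) from TAB and formfeed (both Cc), so a category lookup is not the same thing as the "TAB and SPACE" set described above.
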
* Re: whitespace includes U+3000
From: Richard Stallman @ 2006-06-27 10:34 UTC
Cc: bug-gnu-emacs, jidanni

    > Are emacs whitespace detectors aware of Unicode characters like
    > U+3000?

    No.  The current Emacs treats only TAB and SPACE as
    "whitespace" characters.

It would be very easy to fix this by setting the syntax table entries
for those characters--if there are not too many of them.  So why not
fix it?

* Re: whitespace includes U+3000
From: Kenichi Handa @ 2006-06-27 11:48 UTC
Cc: bug-gnu-emacs, handa, jidanni

In article <E1FvAu1-0000gN-I2@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>> Are emacs whitespace detectors aware of Unicode characters like
>> U+3000?

> No.  The current Emacs treats only TAB and SPACE as
> "whitespace" characters.

> It would be very easy to fix this by setting the syntax table entries
> for those characters--if there are not too many of them.  So why not
> fix it?

Are you sure that "whitespace" of syntax has the same meaning as the
"whitespace" of show-trailing-whitespace?

For instance, currently ^L (formfeed) has syntax "whitespace".  But
it is displayed with the glyph "^L".  Should it be the target of
show-trailing-whitespace?

For instance, currently NBSP (U+00A0) has syntax "." (punctuation),
and it is displayed with a special face to indicate the existence of
that character.  Should it be changed to "whitespace" syntax, or
shouldn't it be changed?

Have you considered these things?

Please try M-x apropos RET whitespace RET.  The word "whitespace" is
used with slightly different meanings.  I think we can't blindly use
"whitespace" syntax in some cases.

---
Kenichi Handa
handa@m17n.org

* Re: whitespace includes U+3000
From: Richard Stallman @ 2006-06-28 17:25 UTC
Cc: bug-gnu-emacs, jidanni, handa

    > No.  The current Emacs treats only TAB and SPACE as
    > "whitespace" characters.

    > It would be very easy to fix this by setting the syntax table entries
    > for those characters--if there are not too many of them.  So why not
    > fix it?

    Are you sure that "whitespace" of syntax has the same
    meaning as the "whitespace" of show-trailing-whitespace?

I am not sure which one we're talking about here.
Is it show-trailing-whitespace?

If so, that would also be easy to change, if it ought to be changed.

    For instance, currently ^L (formfeed) has syntax
    "whitespace".  But it is displayed with the glyph "^L".  Should
    it be the target of show-trailing-whitespace?

No.

    For instance, currently NBSP (U+00A0) has syntax "."
    (punctuation), and it is displayed with a special face to
    indicate the existence of that character.  Should it be
    changed to "whitespace" syntax, or shouldn't it be changed?

The special face for that character should not be overridden, but the
other whitespace after it _and before it_ should probably be displayed
specially by show-trailing-whitespace.

You can probably get this result by putting NBSP into the pattern
for show-trailing-whitespace to recognize.  Redisplay will override
the face, for the NBSP.

* Re: whitespace includes U+3000
From: Kenichi Handa @ 2006-06-29  2:01 UTC
Cc: bug-gnu-emacs, handa, jidanni

In article <E1Fvdme-00060m-A2@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>> No.  The current Emacs treats only TAB and SPACE as
>> "whitespace" characters.

>> It would be very easy to fix this by setting the syntax table entries
>> for those characters--if there are not too many of them.  So why not
>> fix it?

> Are you sure that "whitespace" of syntax has the same
> meaning as the "whitespace" of show-trailing-whitespace?

> I am not sure which one we're talking about here.
> Is it show-trailing-whitespace?

show-trailing-whitespace is just an example.  I think his question is
about all Emacs functionality that handles "whitespace" in some sense
(examples are listed by M-x apropos RET whitespace RET).

> If so, that would also be easy to change, if it ought to be changed.

Of course it's easy to change.  The difficult thing is to determine
whether it ought to be changed.

> For instance, currently ^L (formfeed) has syntax
> "whitespace".  But it is displayed with the glyph "^L".  Should
> it be the target of show-trailing-whitespace?

> No.

Then we have different meanings of "whitespace"; the set of characters
that have "whitespace" syntax is different from the set of characters
that are displayed with a "whitespace" glyph.  And we can't use
"whitespace" syntax, at least for show-trailing-whitespace.

> For instance, currently NBSP (U+00A0) has syntax "."
> (punctuation), and it is displayed with a special face to
> indicate the existence of that character.  Should it be
> changed to "whitespace" syntax, or shouldn't it be changed?

> The special face for that character should not be overridden, but the
> other whitespace after it _and before it_ should probably be displayed
> specially by show-trailing-whitespace.

> You can probably get this result by putting NBSP into the pattern
> for show-trailing-whitespace to recognize.  Redisplay will override
> the face, for the NBSP.

What do you mean by "pattern" here?  A regular expression?  Currently
the function highlight_trailing_whitespace doesn't use a regular
expression but checks for TAB and SPACE directly (i.e. they are
hardcoded).

By the way, I've just found that currently the special face for NBSP
is overridden by show-trailing-whitespace.  That is because
highlight_trailing_whitespace is called near the end of display_line.

Anyway, Unicode has lots more space-like characters
(e.g. U+2000..U+200B).  Should they be treated the same way as NBSP
(i.e. displayed with nobreak-face)?  Or as SPACE?

How about the case of fixup-whitespace?  It seems that this function
should delete only TAB and SPACE.  So here we have a third meaning
of "whitespace": just TAB and SPACE.

How about the case of delete-trailing-whitespace?

How about the case of ...

Do you think we should clearly define the semantics of "whitespace"
and "space character" in all cases before the release, and modify the
code if necessary?

---
Kenichi Handa
handa@m17n.org

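To make the "hardcoded TAB and SPACE" point above concrete, here is a small Python model of a trailing-whitespace check that takes the set of blank characters as a parameter. It is only a sketch under stated assumptions: it does not reproduce the C code of highlight_trailing_whitespace, and the names `HARDCODED`, `EXTENDED` and `trailing_whitespace_start` are hypothetical.

```python
# Hypothetical model of the check; NOT the C code in the Emacs display engine.
HARDCODED = frozenset(" \t")                  # the set the thread says is checked today
EXTENDED = HARDCODED | {"\u00a0", "\u3000"}   # assumption: also NBSP and U+3000


def trailing_whitespace_start(line, blanks=HARDCODED):
    """Return the index at which the trailing run of `blanks` begins.

    The result equals len(line) when the line has no trailing blanks.
    """
    i = len(line)
    while i > 0 and line[i - 1] in blanks:
        i -= 1
    return i


if __name__ == "__main__":
    sample = "foo\u3000"
    # With the hardcoded set, the ideographic space is not seen as trailing.
    print(trailing_whitespace_start(sample))
    # With the extended set, it is.
    print(trailing_whitespace_start(sample, EXTENDED))
```

Keeping the set as a parameter is one way to separate the question "which characters count?" from the scanning logic, which is the part of the question this thread leaves open.
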
* Re: whitespace includes U+3000
From: Richard Stallman @ 2006-06-29 17:57 UTC
Cc: jidanni, handa, emacs-devel

    Then we have different meanings of "whitespace"; the set of
    characters that have "whitespace" syntax is different from the
    set of characters that are displayed with a "whitespace" glyph.

That's right.  There is "characters that would print as whitespace"
and there is "characters that would display as whitespace in Emacs."
These are different for good reason; it is not a mistake that
they are different.

Maybe we need to clarify the documentation so that people
will understand that there are two different concepts of whitespace.

In theory we might want to use two different words for these concepts.
But that seems strained and difficult.  They really are two
applications of the standard concept of "whitespace".

We might want to speak of "screen whitespace" and "text whitespace".

    And we can't use "whitespace" syntax, at least for
    show-trailing-whitespace.

Yes, that is true.

    > You can probably get this result by putting NBSP into the pattern
    > for show-trailing-whitespace to recognize.  Redisplay will override
    > the face, for the NBSP.

    What do you mean by "pattern" here?  A regular expression?

Yes, I assumed it used one.

However, on second thought, I've concluded that
show-trailing-whitespace doesn't need to know about NBSP at all.
Since NBSP is now indicated on the screen by a color, it is no longer
likely to go unnoticed.  So there is no problem with NBSP and
show-trailing-whitespace.

show-trailing-whitespace ought to know about all characters that will
be indistinguishable on the screen from "end of the line".

    By the way, I've just found that currently the special face
    for NBSP is overridden by show-trailing-whitespace.

Do you mean, show-trailing-whitespace would override the special face
for NBSP _if_ you modify it to recognize NBSP along with SPC and TAB?
That means my expectation was mistaken; I stand corrected.

But since show-trailing-whitespace does not need to recognize NBSP,
this isn't a _problem_.

    Anyway, Unicode has lots more space-like characters
    (e.g. U+2000..U+200B).  Should they be treated the same way as
    NBSP (i.e. displayed with nobreak-face)?  Or as SPACE?

It depends how they are used.  How does Emacs display them?

    How about the case of fixup-whitespace?  It seems that this
    function should delete only TAB and SPACE.  So here we have a
    third meaning of "whitespace": just TAB and SPACE.

It is an interesting question what fixup-whitespace should do with
NBSP.  I am not sure; it depends on how NBSP is used.

When the existing space is just one NBSP, fixup-whitespace should not
change it.

Do people use multiple NBSP to force more space between two words?  If
so, maybe fixup-whitespace should leave that untouched.  Or maybe
fixup-whitespace should convert a run of NBSP to a single NBSP.

When there is a series of whitespace including NBSP and SPC (or TAB),
the runs of ordinary whitespace should be compacted to a single SPC,
and the runs of NBSP should be treated as above.

Similar reasoning needs to be applied to other kinds of whitespace, to
figure out what behavior users will really find useful and helpful in
fixup-whitespace.

    How about the case of delete-trailing-whitespace?

That is meant to get rid of junk.  It should probably delete
NBSP just like SPC and TAB, since that is useless at the end of a line.

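The compaction rule proposed above for mixed runs can be sketched in Python as follows. This models only the string transformation described in the message, not the actual behavior of fixup-whitespace at point; the function name `compact_whitespace` and the `collapse_nbsp` flag are hypothetical, and the flag exists because the message leaves open whether a run of NBSP should be left untouched or reduced to a single NBSP.

```python
import re

NBSP = "\u00a0"


def compact_whitespace(text, collapse_nbsp=False):
    """Compact each run of SPC/TAB to a single SPC.

    Runs of NBSP are left untouched by default; with collapse_nbsp=True
    each run of NBSP is reduced to a single NBSP (the alternative the
    message mentions).
    """
    text = re.sub(r"[ \t]+", " ", text)
    if collapse_nbsp:
        text = re.sub(NBSP + "+", NBSP, text)
    return text


if __name__ == "__main__":
    mixed = "a  " + NBSP * 2 + " \tb"
    print(repr(compact_whitespace(mixed)))
    print(repr(compact_whitespace(mixed, collapse_nbsp=True)))
```

In both variants a single NBSP is never changed, which matches the one point the message states firmly.
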
* Re: whitespace includes U+3000
From: Kevin Rodgers @ 2006-07-05 17:33 UTC

Richard Stallman wrote:
>     > You can probably get this result by putting NBSP into the pattern
>     > for show-trailing-whitespace to recognize.  Redisplay will override
>     > the face, for the NBSP.
>
>     What do you mean by "pattern" here?  Regular expression?
>
> Yes, I assumed it used one.
>
> However, on second thought, I've concluded that
> show-trailing-whitespace doesn't need to know about NBSP at all.
> Since NBSP is now indicated on the screen by a color, it is no longer
> likely to go unnoticed.  So there is no problem with NBSP and
> show-trailing-whitespace.

That is true by default, but not if the user has set
nobreak-char-display to nil.  I think show-trailing-whitespace should
DTRT even if the user has made such a customization and ensure that
the trailing whitespace is indicated.

> show-trailing-whitespace ought to know about all characters that will
> be indistinguishable on the screen from "end of the line".

Agreed!  (for non-default values of display options like
nobreak-char-display as well)

>     By the way, I've just found that currently the special face
>     for NBSP is overridden by show-trailing-whitespace.
>
> Do you mean, show-trailing-whitespace would override the special face
> for NBSP _if_ you modify it to recognize NBSP along with SPC and TAB?
> That means my expectation was mistaken; I stand corrected.

I think that is good: it means that show-trailing-whitespace will
indicate NBSP regardless of nobreak-char-display.

> But since show-trailing-whitespace does not need to recognize NBSP,
> this isn't a _problem_.

I don't think so.  (To reiterate: show-trailing-whitespace does need
to recognize NBSP in case nobreak-char-display is nil).

...

>     How about the case of delete-trailing-whitespace?
>
> That is meant to get rid of junk.  It should probably delete
> NBSP just like SPC and TAB, since that is useless at the end of a line.

It would be surprising if delete-trailing-whitespace deleted anything
(e.g. NBSP) that was not displayed specially by
show-trailing-whitespace.

--
Kevin

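The consistency argument above (nothing should be deleted that was not first shown as trailing whitespace) can be modeled by having both operations share one character set. The Python sketch below is illustrative only and is not how Emacs implements either feature; `TRAILING_BLANKS`, `trailing_span` and `delete_trailing` are hypothetical names, and including NBSP in the set is an assumption taken from the proposal in this message.

```python
# Assumption: one shared set drives both highlighting and deletion.
TRAILING_BLANKS = frozenset(" \t\u00a0")


def trailing_span(line, blanks=TRAILING_BLANKS):
    """Return (start, end) of the trailing blank run that would be highlighted."""
    start = len(line)
    while start > 0 and line[start - 1] in blanks:
        start -= 1
    return start, len(line)


def delete_trailing(line, blanks=TRAILING_BLANKS):
    """Remove exactly the span that trailing_span reports for the same set."""
    start, _end = trailing_span(line, blanks)
    return line[:start]


if __name__ == "__main__":
    line = "text\u00a0 "
    print(trailing_span(line))
    print(repr(delete_trailing(line)))
```

Because deletion is defined in terms of the highlighted span, the two cannot disagree about NBSP, whatever set is eventually chosen.
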
* Re: whitespace includes U+3000
From: Richard Stallman @ 2006-07-07  4:13 UTC
Cc: emacs-devel

    > However, on second thought, I've concluded that
    > show-trailing-whitespace doesn't need to know about NBSP at all.
    > Since NBSP is now indicated on the screen by a color, it is no longer
    > likely to go unnoticed.  So there is no problem with NBSP and
    > show-trailing-whitespace.

    That is true by default, but not if the user has set
    nobreak-char-display to nil.  I think show-trailing-whitespace should
    DTRT even if the user has made such a customization and ensure that
    the trailing whitespace is indicated.

I won't object, if someone wants to do it.