From: Kenichi Handa <handa@m17n.org>
Cc: bug-gnu-emacs@gnu.org, handa@m17n.org, jidanni@jidanni.org
Subject: Re: whitespace includes U+3000
Date: Thu, 29 Jun 2006 11:01:15 +0900
Message-ID: <E1FvlqF-0007yZ-00@etlken>
In-Reply-To: <E1Fvdme-00060m-A2@fencepost.gnu.org> (message from Richard Stallman on Wed, 28 Jun 2006 13:25:00 -0400)

In article <E1Fvdme-00060m-A2@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

>> No.  The current Emacs treats only TAB and SPACE as
>> "whitespace" characters.

>> It would be very easy to fix this by setting the syntax table entries
>> for those characters--if there are not too many of them.  So why not
>> fix it?

>     Are you sure that "whitespace" of syntax has the same
>     meaning as the "whitespace" of show-trailing-whitespace?

> I am not sure which one we're talking about here.
> Is it show-trailing-whitespace?

show-trailing-whitespace is just an example.  I think his
question is about every Emacs feature that handles
"whitespace" in some sense (examples are listed by M-x
apropos RET whitespace RET).

> If so, that would also be easy to change, if it ought to be changed.

Of course it's easy to change.  The difficult thing is to
determine if it ought to be changed.

>     For instance, currently ^L (formfeed) has syntax
>     "whitespace".  But, it is displayed with glyph "^L".  Should
>     it be the target of show-trailing-whitespace?

> No.

Then "whitespace" has different meanings: the set of
characters that have "whitespace" syntax is different from
the set of characters that are displayed with a "whitespace"
glyph.  So we can't use "whitespace" syntax, at least not for
show-trailing-whitespace.
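
To make the distinction concrete, here is a minimal Python sketch.
The two sets are assumptions written out by hand for illustration
(they are not read from Emacs); they only encode what is said above:
TAB, SPACE and formfeed have "whitespace" syntax, but formfeed is
drawn as "^L" rather than as blank space.

```python
# Hand-written sets for illustration only; not taken from Emacs itself.
SYNTAX_WHITESPACE = {" ", "\t", "\f"}  # chars with "whitespace" syntax
BLANK_GLYPH = {" ", "\t"}              # chars drawn as blank space


def only_in_syntax_class():
    """Return chars that have whitespace syntax but are not drawn blank."""
    return sorted(SYNTAX_WHITESPACE - BLANK_GLYPH)


# Formfeed is the one character on which the two notions disagree here.
print([f"U+{ord(c):04X}" for c in only_in_syntax_class()])
```

Any feature that picks one of these sets silently picks a definition
of "whitespace" along with it.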

>     For instance, currently NBSP (U+00A0) has syntax "."
>     (punctuation), and it is displayed with a special face to
>     indicate the existence of that character.  Should it be
>     changed to "whitespace" syntax, or should it not be changed?

> The special face for that character should not be overridden, but the
> other whitespace after it _and before it_ should probably be displayed
> specially by show-trailing-whitespace.

> You can probably get this result by putting NBSP into the pattern
> for show-trailing-whitespace to recognize.  Redisplay will override
> the face, for the NBSP.

What do you mean by "pattern" here?  A regular expression?
Currently the function highlight_trailing_whitespace doesn't
use a regular expression but checks for TAB and SPACE
directly (i.e. they are hardcoded).
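
As a rough illustration of the difference between a hardcoded check
and a configurable "pattern", here is a Python sketch.  It is not the
code of highlight_trailing_whitespace; the function name and the
`blanks` parameter are hypothetical, chosen only for this example.

```python
NBSP = "\u00a0"


def trailing_whitespace_start(line, blanks=(" ", "\t")):
    """Return the index where the trailing run of `blanks` begins.

    Returns len(line) when the line does not end in any of `blanks`.
    The default mimics a hardcoded TAB/SPACE check.
    """
    i = len(line)
    while i > 0 and line[i - 1] in blanks:
        i -= 1
    return i


line = "foo" + NBSP + " "
# With the default set, the scan stops at the NBSP.
print(trailing_whitespace_start(line))
# With NBSP added to the set, the run extends back over it.
print(trailing_whitespace_start(line, blanks=(" ", "\t", NBSP)))
```

With the default set only the final SPACE counts as trailing; once
NBSP is added, the run also covers the NBSP before it.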

By the way, I've just found that currently the special face
for NBSP is overridden by show-trailing-whitespace.  That is
because highlight_trailing_whitespace is called near the
end of display_line.

Anyway, Unicode has lots more space-like characters
(e.g. U+2000..U+200B).  Should they be treated the same
way as NBSP (i.e. displayed with nobreak-face)?  Or as
SPACE?
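
For reference, the characters in question can be listed with their
Unicode general categories.  This is a small Python sketch using the
standard unicodedata module; the helper name `describe` is
hypothetical, and the categories shown are those of the Unicode
database shipped with the Python in use, which may differ from the
Unicode version current when this message was written.

```python
import unicodedata


def describe(cp):
    """Return (code point label, general category, name) for `cp`."""
    ch = chr(cp)
    # Control characters such as TAB have no name, so pass a default.
    return (f"U+{cp:04X}", unicodedata.category(ch),
            unicodedata.name(ch, "<unnamed>"))


# TAB, SPACE, NBSP, the U+2000..U+200B range, and U+3000.
for cp in [0x0009, 0x0020, 0x00A0, *range(0x2000, 0x200C), 0x3000]:
    print(*describe(cp))
```

In recent Unicode data, U+2000..U+200A, NBSP and U+3000 are space
separators (Zs), while TAB is a control character (Cc) and U+200B
(ZERO WIDTH SPACE) is a format character (Cf), so even Unicode does
not give a single answer to what a "space" is.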

How about the case of fixup-whitespace?  It seems that this
function should delete only TAB and SPACE.  So, here we have
a third meaning of "whitespace": just TAB and SPACE.

How about the case of delete-trailing-whitespace?

How about the case of ...

Do you think we should clearly define the semantics of
"whitespace" and "space character" in all cases before the
release, and modify the code if necessary?

---
Kenichi Handa
handa@m17n.org
