unofficial mirror of help-gnu-emacs@gnu.org
* Strange whitespaces.
@ 2021-09-30  9:37 Hongyi Zhao
  2021-09-30  9:56 ` Gregory Heytings
  2021-09-30 10:08 ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 2 replies; 31+ messages in thread
From: Hongyi Zhao @ 2021-09-30  9:37 UTC (permalink / raw)
  To: help-gnu-emacs

[-- Attachment #1: Type: text/plain, Size: 2667 bytes --]

I've seen two strange whitespace characters, which are shown as underscores in
the scratch buffer, and `M-x describe-char RET' gives the following
results:

The first one:

===============
           position: 146 of 148 (98%), column: 0
            character:   (displayed as  ) (codepoint 160, #o240, #xa0)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0xA0
               script: latin
               syntax:       which means: whitespace
             category: .:Base, b:Arabic, j:Japanese, l:Latin
             to input: type "C-x 8 RET a0" or "C-x 8 RET NO-BREAK SPACE"
          buffer code: #xC2 #xA0
            file code: #xC2 #xA0 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    ftcrhb:-PfEd-DejaVuSansMono Nerd Font
Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x62)
       hardcoded face: nobreak-space

Character code properties: customize what to show
  name: NO-BREAK SPACE
  old-name: NON-BREAKING SPACE
  general-category: Zs (Separator, Space)
  decomposition: (noBreak 32) (noBreak ' ')

There are text properties here:
  fontified            t
  wrap-prefix          " "
  ws-butler-chg        delete


The second:

 ===============
           position: 148 of 148 (99%), column: 2
            character:   (displayed as  ) (codepoint 8194, #o20002, #x2002)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x2002
               script: symbol
               syntax:       which means: whitespace
             category: .:Base
             to input: type "C-x 8 RET 2002" or "C-x 8 RET EN SPACE"
          buffer code: #xE2 #x80 #x82
            file code: #xE2 #x80 #x82 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    ftcrhb:-PfEd-DejaVuSansMono Nerd Font
Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x712)
       hardcoded face: nobreak-space

Character code properties: customize what to show
  name: EN SPACE
  general-category: Zs (Separator, Space)
  decomposition: (compat 32) (compat ' ')

There are text properties here:
  fontified            t
  rear-nonsticky       t
  wrap-prefix          " "
  ws-butler-chg        chg
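
The "buffer code" and "file code" rows in the two reports above are just the
UTF-8 encodings of the two code points. A minimal sketch that reproduces
them, assuming a reasonably recent Emacs; the helper name `my/utf8-bytes' is
made up for illustration and is not part of Emacs:

```elisp
;; Hypothetical helper (not part of Emacs): return the UTF-8 encoding of
;; CHAR as a list of byte values.
(defun my/utf8-bytes (char)
  "Return the UTF-8 encoding of CHAR as a list of integers."
  (append (encode-coding-string (string char) 'utf-8-unix) nil))

(my/utf8-bytes #xA0)    ; NO-BREAK SPACE => (194 160), i.e. #xC2 #xA0
(my/utf8-bytes #x2002)  ; EN SPACE       => (226 128 130), i.e. #xE2 #x80 #x82
```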


If I copy and paste these two characters into other editors, say,
Gmail web client or gedit, I will see nothing of them. OTOH, if I copy
them back to Emacs again, for the Gmail web client case, the first
character will be lost.

I am puzzled by this phenomenon: Why do people design so many
whitespace representations, and how can one safely manipulate them between
different editors?
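
One possible way to sidestep the copy-and-paste loss described above is to
turn the special spaces into plain ASCII spaces before the text leaves
Emacs. This is only a sketch: the function name `my/asciify-spaces' is
hypothetical, and the replacement deliberately throws away the no-break and
width semantics, which may or may not be acceptable.

```elisp
;; Hypothetical helper (not part of Emacs): replace every non-ASCII
;; character whose Unicode general category is Zs (Space Separator)
;; with a plain ASCII space.
(defun my/asciify-spaces (string)
  "Return a copy of STRING with non-ASCII Zs characters turned into spaces."
  (concat
   (mapcar (lambda (c)
             (if (and (> c 127)
                      (eq (get-char-code-property c 'general-category) 'Zs))
                 ?\s
               c))
           string)))

;; "a", NO-BREAK SPACE, "b", EN SPACE, "c" becomes "a b c".
(my/asciify-spaces (string ?a #xA0 ?b #x2002 ?c))
```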

Regards, HZ

[-- Attachment #2: whitespaces.png --]
[-- Type: image/png, Size: 158467 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Strange whitespaces.
  2021-09-30  9:37 Strange whitespaces Hongyi Zhao
@ 2021-09-30  9:56 ` Gregory Heytings
  2021-09-30 10:11   ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-09-30 10:08 ` Emanuel Berg via Users list for the GNU Emacs text editor
  1 sibling, 1 reply; 31+ messages in thread
From: Gregory Heytings @ 2021-09-30  9:56 UTC (permalink / raw)
  To: Hongyi Zhao; +Cc: help-gnu-emacs


>
> name: NO-BREAK SPACE
>
> name: EN SPACE
>
> I am puzzled by this phenomenon: Why do people design so many whitespace 
> representations and how to safely manipulate them between different 
> editors
>

There's nothing strange happening here.  There are indeed many kinds of 
spaces with different properties, just like there are many kinds of dashes 
with different properties.  An en space is a half em space, just like an 
en dash is a half em dash.  A no-break space is what its name suggests: it 
means that the editor should not break the line there.  Say you want to 
indicate a list in a paragraph with (a), (b), (c), for example in: "We 
need to buy (a) apples, (b) brownies, (c) chocolate."  You typically do 
not want to see:

We need to buy (a) apples, (b)
brownies, (c) chocolate.

and you do this by putting a no-break space between "(a)" and "apples", 
"(b)" and "brownies", "(c)" and "chocolate".

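The shopping-list example can be sketched in Emacs Lisp as follows. The
helper name `my/tie' is invented for illustration; whether a given editor or
renderer actually honors the no-break space when wrapping is up to that
program.

```elisp
;; Hypothetical helper (not part of Emacs): join LABEL and WORD with
;; U+00A0 NO-BREAK SPACE instead of an ordinary space.
(defun my/tie (label word)
  "Return LABEL and WORD joined by a no-break space."
  (concat label (string #xA0) word))

;; Ordinary spaces separate the items; no-break spaces tie each label
;; to the word that follows it.
(concat "We need to buy "
        (my/tie "(a)" "apples") ", "
        (my/tie "(b)" "brownies") ", "
        (my/tie "(c)" "chocolate") ".")
```
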



* Re: Strange whitespaces.
  2021-09-30  9:37 Strange whitespaces Hongyi Zhao
  2021-09-30  9:56 ` Gregory Heytings
@ 2021-09-30 10:08 ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-09-30 13:53   ` Hongyi Zhao
  1 sibling, 1 reply; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-09-30 10:08 UTC (permalink / raw)
  To: help-gnu-emacs

Hongyi Zhao wrote:

>   name: NO-BREAK SPACE
>   old-name: NON-BREAKING SPACE

This is to prevent an auto line break.
<https://en.wikipedia.org/wiki/Non-breaking_space>

>   name: EN SPACE

"A space which has a nominal width of 1 en".
<https://en.wiktionary.org/wiki/en_space>

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Strange whitespaces.
  2021-09-30  9:56 ` Gregory Heytings
@ 2021-09-30 10:11   ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-09-30 10:19     ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 1 reply; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-09-30 10:11 UTC (permalink / raw)
  To: help-gnu-emacs

Gregory Heytings wrote:

> There's nothing strange happening here. There are indeed
> many kinds of spaces with different properties, just like
> there are many kinds of dashes

For example this one, common in Lisp: -

I know what you are thinking! A dash LOL

But actually it is a HYPHEN-MINUS in the general-category
Pd (Punctuation, Dash) ...

(defalias 'what-char #'describe-char)

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Strange whitespaces.
  2021-09-30 10:11   ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-09-30 10:19     ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-09-30 13:44       ` Hongyi Zhao
  0 siblings, 1 reply; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-09-30 10:19 UTC (permalink / raw)
  To: help-gnu-emacs

> For example this one, common in Lisp: -
>
> I know what you are thinking! A dash LOL
>
> But actually it is a HYPHEN-MINUS in the general-category
> Pd (Punctuation, Dash) ...
>
> (defalias 'what-char #'describe-char)

And what about /bin/bash in Debian, is that a dash as well?

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Strange whitespaces.
  2021-09-30 10:19     ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-09-30 13:44       ` Hongyi Zhao
  2021-09-30 15:39         ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 1 reply; 31+ messages in thread
From: Hongyi Zhao @ 2021-09-30 13:44 UTC (permalink / raw)
  To: Emanuel Berg, help-gnu-emacs

On Thu, Sep 30, 2021 at 6:20 PM Emanuel Berg via Users list for the
GNU Emacs text editor <help-gnu-emacs@gnu.org> wrote:
>
> > For example this one, common in Lisp: -
> >
> > I know what you are thinking! A dash LOL
> >
> > But actually it is a HYPHEN-MINUS in the general-category
> > Pd (Punctuation, Dash) ...
> >
> > (defalias 'what-char #'describe-char)
>
> And what about /bin/bash in Debian, is that a dash as well?

As noted here [1]:

Debian uses Bash as the default interactive shell.
Debian uses Dash as the default non-interactive shell.

[1] https://wiki.debian.org/Shell

HZ




* Re: Strange whitespaces.
  2021-09-30 10:08 ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-09-30 13:53   ` Hongyi Zhao
  2021-09-30 15:20     ` [External] : " Drew Adams
  2021-09-30 15:41     ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 2 replies; 31+ messages in thread
From: Hongyi Zhao @ 2021-09-30 13:53 UTC (permalink / raw)
  To: Emanuel Berg, help-gnu-emacs

On Thu, Sep 30, 2021 at 6:08 PM Emanuel Berg via Users list for the
GNU Emacs text editor <help-gnu-emacs@gnu.org> wrote:
>
> Hongyi Zhao wrote:
>
> >   name: NO-BREAK SPACE
> >   old-name: NON-BREAKING SPACE
>
> This is to prevent an auto line break.
> <https://en.wikipedia.org/wiki/Non-breaking_space>
>
> >   name: EN SPACE
>
> "A space which has a nominal width of 1 en".
> <https://en.wiktionary.org/wiki/en_space>

Why are both displayed as red underscores in Emacs?

HZ




* RE: [External] : Re: Strange whitespaces.
  2021-09-30 13:53   ` Hongyi Zhao
@ 2021-09-30 15:20     ` Drew Adams
  2021-09-30 15:46       ` Hongyi Zhao
                         ` (2 more replies)
  2021-09-30 15:41     ` Emanuel Berg via Users list for the GNU Emacs text editor
  1 sibling, 3 replies; 31+ messages in thread
From: Drew Adams @ 2021-09-30 15:20 UTC (permalink / raw)
  To: Hongyi Zhao, Emanuel Berg, help-gnu-emacs

> Why are both displayed as red underscores in Emacs?

I don't think the EN SPACE is displayed that way.
Not with `emacs -Q' (no init file), at least.  But
I don't have an Emacs 28 prerelease build - maybe
they changed Emacs 28 to highlight that as well.

As for a NO-BREAK SPACE, it's highlighted (with face
`nobreak-space', which inherits from `escape-glyph')
if variable `nobreak-char-display' is non-nil, which
it is by default.
___

Note that `nobreak-char-display' highlights all
no-break (aka "hard") characters, not just the
no-break space.  IOW, it highlights hard hyphens
(code point 8209) as well as hard spaces.
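
For reference, a minimal init-file sketch of the two knobs mentioned above.
The variable `nobreak-char-display' and the face `nobreak-space' are the
ones named in this message; the face attributes chosen below are arbitrary
examples, not a recommendation.

```elisp
;; Turn off the special display of no-break characters altogether.
(setq nobreak-char-display nil)

;; Alternatively, leave `nobreak-char-display' at its default and only
;; restyle the face it uses.  The attribute values are arbitrary examples.
(set-face-attribute 'nobreak-space nil
                    :underline t
                    :foreground "gray50")
```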

Vanilla Emacs doesn't let you highlight these
chars differently (e.g. highlight only hard spaces
or only hard hyphens).  (This highlighting
is low-level - it doesn't use Font Lock mode.)

But you can highlight such chars differently, or
highlight only one of them, using library
`highlight-chars.el'. 

https://www.emacswiki.org/emacs/ShowWhiteSpace#HighlightChars


* Re: Strange whitespaces.
  2021-09-30 13:44       ` Hongyi Zhao
@ 2021-09-30 15:39         ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 0 replies; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-09-30 15:39 UTC (permalink / raw)
  To: help-gnu-emacs

Hongyi Zhao wrote:

> Debian uses Bash as the default interactive shell.
> Debian uses Dash as the default non-interactive shell.

+1 :)

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Strange whitespaces.
  2021-09-30 13:53   ` Hongyi Zhao
  2021-09-30 15:20     ` [External] : " Drew Adams
@ 2021-09-30 15:41     ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01  1:45       ` Hongyi Zhao
  2021-10-01 10:15       ` Eric S Fraga
  1 sibling, 2 replies; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-09-30 15:41 UTC (permalink / raw)
  To: help-gnu-emacs

Hongyi Zhao wrote:

>>> name: NO-BREAK SPACE
>>> old-name: NON-BREAKING SPACE
>>
>> This is to prevent an auto line break.
>> <https://en.wikipedia.org/wiki/Non-breaking_space>
>>
>>> name: EN SPACE
>>
>> "A space which has a nominal width of 1 en".
>> <https://en.wiktionary.org/wiki/en_space>
>
> Why are both displayed as red underscores in Emacs?

Maybe there is no better way to display them in your setup?

As for the red color, use this

(defun what-face (pos)
  (interactive "d")
  (let ((face (or (get-char-property pos 'face)
                  (get-char-property pos 'read-cf-name))))
    (message "%s" (or face "no face!"))))

What does it say?

PS. Suggestion: add that to vanilla Emacs if it isn't there!
  
-- 
underground experts united
https://dataswamp.org/~incal





* Re: [External] : Re: Strange whitespaces.
  2021-09-30 15:20     ` [External] : " Drew Adams
@ 2021-09-30 15:46       ` Hongyi Zhao
  2021-09-30 16:26         ` Drew Adams
  2021-09-30 16:06       ` tomas
  2021-09-30 16:12       ` Eli Zaretskii
  2 siblings, 1 reply; 31+ messages in thread
From: Hongyi Zhao @ 2021-09-30 15:46 UTC (permalink / raw)
  To: Drew Adams; +Cc: help-gnu-emacs, Emanuel Berg

On Thu, Sep 30, 2021 at 11:20 PM Drew Adams <drew.adams@oracle.com> wrote:
>
> > Why are both displayed as red underscores in Emacs?
>
> I don't think the EN SPACE is displayed that way.
> Not with `emacs -Q' (no init file), at least.  But
> I don't have an Emacs 28 prerelease build - maybe
> they changed Emacs 28 to highlight that as well.
>
> As for a NO-BREAK SPACE, it's highlighted (with face
> `nobreak-space', which inherits from `escape-glyph')
> if variable `nobreak-char-display' is non-nil, which
> it is by default.
> ___
>
> Note that `nobreak-char-display' highlights all
> no-break (aka "hard") characters, not just the
> no-break space.  IOW, it highlights hard hyphens
> (code point 8209) as well as hard spaces.

What's the meaning of "hard" here?

> Vanilla Emacs doesn't let you highlight these
> chars differently (e.g. highlight only hard spaces
> or only hard hyphens).  (This highlighting
> is low-level - it doesn't use Font Lock mode.)
>
> But you can highlight such chars differently, or
> highlight only one of them, using library
> `highlight-chars.el'.
>
> https://www.emacswiki.org/emacs/ShowWhiteSpace#HighlightChars

Thank you for telling me about it.

HZ




* Re: [External] : Re: Strange whitespaces.
  2021-09-30 15:20     ` [External] : " Drew Adams
  2021-09-30 15:46       ` Hongyi Zhao
@ 2021-09-30 16:06       ` tomas
  2021-09-30 16:12       ` Eli Zaretskii
  2 siblings, 0 replies; 31+ messages in thread
From: tomas @ 2021-09-30 16:06 UTC (permalink / raw)
  To: help-gnu-emacs

[-- Attachment #1: Type: text/plain, Size: 507 bytes --]

On Thu, Sep 30, 2021 at 03:20:30PM +0000, Drew Adams wrote:
> > Why are both displayed as red underscores in Emacs?
> 
> I don't think the EN SPACE is displayed that way.
> Not with `emacs -Q' (no init file), at least.  But
> I don't have an Emacs 28 prerelease build - maybe
> they changed Emacs 28 to highlight that as well.

Emacs 28.0.50-somethingsomething (compiled off git,
about yesterday). It is shown highlighted.

It is controlled by the option `nobreak-char-display'.

Cheers
 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: [External] : Re: Strange whitespaces.
  2021-09-30 15:20     ` [External] : " Drew Adams
  2021-09-30 15:46       ` Hongyi Zhao
  2021-09-30 16:06       ` tomas
@ 2021-09-30 16:12       ` Eli Zaretskii
  2021-09-30 16:32         ` Drew Adams
  2021-10-01  1:51         ` Hongyi Zhao
  2 siblings, 2 replies; 31+ messages in thread
From: Eli Zaretskii @ 2021-09-30 16:12 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Drew Adams <drew.adams@oracle.com>
> Date: Thu, 30 Sep 2021 15:20:30 +0000
> 
> > Why are both displayed as red underscores in Emacs?
> 
> I don't think the EN SPACE is displayed that way.
> Not with `emacs -Q' (no init file), at least.  But
> I don't have an Emacs 28 prerelease build - maybe
> they changed Emacs 28 to highlight that as well.

They did.  We now highlight any non-ASCII character whose Unicode
general category is "Space Separator" (or Zs for short).  That
includes EN SPACE.
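
The rule as stated here can be written down as a small predicate. This is
only a sketch of the stated rule, not the actual display-engine code, and
the function name is hypothetical. It covers the "Space Separator" rule
only; hard hyphens, mentioned in the quoted text below, are a separate case.

```elisp
;; Hypothetical predicate (not part of Emacs): non-nil if CHAR is a
;; non-ASCII character whose Unicode general category is Zs.
(defun my/non-ascii-space-separator-p (char)
  "Return non-nil if CHAR is a non-ASCII Space Separator (Zs) character."
  (and (> char 127)
       (eq (get-char-code-property char 'general-category) 'Zs)))

(my/non-ascii-space-separator-p ?\s)     ; ASCII space        => nil
(my/non-ascii-space-separator-p #xA0)    ; NO-BREAK SPACE     => t
(my/non-ascii-space-separator-p #x2002)  ; EN SPACE           => t
(my/non-ascii-space-separator-p #x2011)  ; hard hyphen (8209) => nil
```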

> Note that `nobreak-char-display' highlights all
> no-break (aka "hard") characters, not just the
> no-break space.  IOW, it highlights hard hyphens
> (code point 8209) as well as hard spaces.

It highlights much more, see above.




* RE: [External] : Re: Strange whitespaces.
  2021-09-30 15:46       ` Hongyi Zhao
@ 2021-09-30 16:26         ` Drew Adams
  0 siblings, 0 replies; 31+ messages in thread
From: Drew Adams @ 2021-09-30 16:26 UTC (permalink / raw)
  To: Hongyi Zhao; +Cc: help-gnu-emacs, Emanuel Berg

> > no-break (aka "hard") characters, not just the
> > no-break space.
> 
> What's the meaning of "hard" here?

No-break.  ("aka" is an abbreviation for "Also known as".)


* RE: [External] : Re: Strange whitespaces.
  2021-09-30 16:12       ` Eli Zaretskii
@ 2021-09-30 16:32         ` Drew Adams
  2021-09-30 16:45           ` Eli Zaretskii
  2021-10-01  1:51         ` Hongyi Zhao
  1 sibling, 1 reply; 31+ messages in thread
From: Drew Adams @ 2021-09-30 16:32 UTC (permalink / raw)
  To: Eli Zaretskii, help-gnu-emacs@gnu.org

> > I don't think the EN SPACE is displayed that way.
> > Not with `emacs -Q' (no init file), at least.  But
> > I don't have an Emacs 28 prerelease build - maybe
> > they changed Emacs 28 to highlight that as well.
> 
> They did.  We now highlight any non-ASCII character whose Unicode
> general category is "Space Separator" (or Zs for short).  That
> includes EN SPACE.
> 
> > Note that `nobreak-char-display' highlights all
> > no-break (aka "hard") characters, not just the
> > no-break space.  IOW, it highlights hard hyphens
> > (code point 8209) as well as hard spaces.
> 
> It highlights much more [in Emacs 28], see above.

All the more need for an ability to highlight or
not highlight individually.

Treating all such chars the same might be handy,
but it's also a limiting, all-or-nothing choice.
If you want different treatment for different chars
then it doesn't fit the bill.





* Re: [External] : Re: Strange whitespaces.
  2021-09-30 16:32         ` Drew Adams
@ 2021-09-30 16:45           ` Eli Zaretskii
  2021-09-30 17:03             ` Drew Adams
  0 siblings, 1 reply; 31+ messages in thread
From: Eli Zaretskii @ 2021-09-30 16:45 UTC (permalink / raw)
  To: Drew Adams; +Cc: help-gnu-emacs

> From: Drew Adams <drew.adams@oracle.com>
> Date: Thu, 30 Sep 2021 16:32:24 +0000
> 
> > They did.  We now highlight any non-ASCII character whose Unicode
> > general category is "Space Separator" (or Zs for short).  That
> > includes EN SPACE.
> > 
> > > Note that `nobreak-char-display' highlights all
> > > no-break (aka "hard") characters, not just the
> > > no-break space.  IOW, it highlights hard hyphens
> > > (code point 8209) as well as hard spaces.
> > 
> > It highlights much more [in Emacs 28], see above.
> 
> All the more need for an ability to highlight or
> not highlight individually.

That'd be a different feature.  This feature is for making sure the
user is aware of unusual characters that might display as something
innocent.  If someone wants a different feature, they should turn this
special display off, and program whatever they want instead.




* RE: [External] : Re: Strange whitespaces.
  2021-09-30 16:45           ` Eli Zaretskii
@ 2021-09-30 17:03             ` Drew Adams
  0 siblings, 0 replies; 31+ messages in thread
From: Drew Adams @ 2021-09-30 17:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: help-gnu-emacs@gnu.org

> > All the more need for an ability to highlight or
> > not highlight individually.
> 
> That'd be a different feature.  This feature is for making sure the
> user is aware of unusual characters that might display as something
> innocent.  If someone wants a different feature, they should turn this
> special display off, and program whatever they want instead.

Yes, exactly.  Both features are useful.
And yes, to use the one I mentioned (which
vanilla Emacs lacks) you have to turn off the
hard-coded highlighting of no-break chars.

From the commentary of `highlight-chars.el':

 Using `highlight-chars.el' to highlight hard space
 and hyphen chars requires turning off their default
 highlighting provided by vanilla Emacs, that is,
 setting `nobreak-char-display' to nil.  This is
 done automatically by the functions defined here.
 When you turn off this font-lock highlighting, the
 vanilla Emacs highlighting is automatically restored.




* Re: Strange whitespaces.
  2021-09-30 15:41     ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-10-01  1:45       ` Hongyi Zhao
  2021-10-01  1:56         ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01 10:15       ` Eric S Fraga
  1 sibling, 1 reply; 31+ messages in thread
From: Hongyi Zhao @ 2021-10-01  1:45 UTC (permalink / raw)
  To: Emanuel Berg, help-gnu-emacs

On Fri, Oct 1, 2021 at 12:04 AM Emanuel Berg via Users list for the
GNU Emacs text editor <help-gnu-emacs@gnu.org> wrote:
>
> Hongyi Zhao wrote:
>
> >>> name: NO-BREAK SPACE
> >>> old-name: NON-BREAKING SPACE
> >>
> >> This is to prevent an auto line break.
> >> <https://en.wikipedia.org/wiki/Non-breaking_space>
> >>
> >>> name: EN SPACE
> >>
> >> "A space which has a nominal width of 1 en".
> >> <https://en.wiktionary.org/wiki/en_space>
> >
> > Why are both displayed as red underscores in Emacs?
>
> Maybe there is no better way to display them in your setup?
>
> As for the red color, use this
>
> (defun what-face (pos)
>   (interactive "d")
>   (let ((face (or (get-char-property pos 'face)
>                   (get-char-property pos 'read-cf-name))))
>     (message "%s" (or face "no face!"))))
>
> What does it say?
>
> PS. Suggestion: add that to vanilla Emacs if it isn't there!

I added the above code snippet to my `~/.emacs.d/init.el' and
checked the characters discussed here with `M-x what-face' with point on
them, but I just see the following message in the minibuffer:

no face!

HZ




* Re: [External] : Re: Strange whitespaces.
  2021-09-30 16:12       ` Eli Zaretskii
  2021-09-30 16:32         ` Drew Adams
@ 2021-10-01  1:51         ` Hongyi Zhao
  2021-10-01  2:03           ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01  6:34           ` Eli Zaretskii
  1 sibling, 2 replies; 31+ messages in thread
From: Hongyi Zhao @ 2021-10-01  1:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: help-gnu-emacs

On Fri, Oct 1, 2021 at 12:27 AM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Drew Adams <drew.adams@oracle.com>
> > Date: Thu, 30 Sep 2021 15:20:30 +0000
> >
> > > Why are both displayed as red underscores in Emacs?
> >
> > I don't think the EN SPACE is displayed that way.
> > Not with `emacs -Q' (no init file), at least.  But
> > I don't have an Emacs 28 prerelease build - maybe
> > they changed Emacs 28 to highlight that as well.
>
> They did.  We now highlight any non-ASCII character whose Unicode
> general category is "Space Separator" (or Zs for short).

I fail to see the connection between the abbreviation and the original
representation it stands for.

> That includes EN SPACE.
>
> > Note that `nobreak-char-display' highlights all
> > no-break (aka "hard") characters, not just the
> > no-break space.  IOW, it highlights hard hyphens
> > (code point 8209) as well as hard spaces.
>
> It highlights much more, see above.




* Re: Strange whitespaces.
  2021-10-01  1:45       ` Hongyi Zhao
@ 2021-10-01  1:56         ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 0 replies; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-10-01  1:56 UTC (permalink / raw)
  To: help-gnu-emacs

Hongyi Zhao wrote:

> I added the above code snippet to my `~/.emacs.d/init.el' and
> checked the characters discussed here with `M-x what-face' with point on
> them, but I just see the following message in the minibuffer:
>
> no face!

That code is 5 lines. Pause here and read it - what do you
think that means? I think you can figure it out ...

Remember the method from school when you couldn't solve a problem:

  1) think
  
  2) ask a friend
  
  3) ask the teacher

The "teacher" here is gmane.emacs.help, collectively.

But oh, well ... "no face!" means it has no face. It must have
lost it somehow, you know how many sensitive characters there
are in a school.

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [External] : Re: Strange whitespaces.
  2021-10-01  1:51         ` Hongyi Zhao
@ 2021-10-01  2:03           ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01  6:34           ` Eli Zaretskii
  1 sibling, 0 replies; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-10-01  2:03 UTC (permalink / raw)
  To: help-gnu-emacs

Hongyi Zhao wrote:

>> They did. We now highlight any non-ASCII character whose
>> Unicode general category is "Space Separator" (or Zs for
>> short).
>
> I fail to see the connection between the abbreviation and
> the original representation it stands for.

Maybe S was already taken by "Symbol" as in the category
"Math Symbol", key Sm.

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [External] : Re: Strange whitespaces.
  2021-10-01  1:51         ` Hongyi Zhao
  2021-10-01  2:03           ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-10-01  6:34           ` Eli Zaretskii
  2021-10-01  7:26             ` Hongyi Zhao
  1 sibling, 1 reply; 31+ messages in thread
From: Eli Zaretskii @ 2021-10-01  6:34 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Hongyi Zhao <hongyi.zhao@gmail.com>
> Date: Fri, 1 Oct 2021 09:51:22 +0800
> Cc: help-gnu-emacs <help-gnu-emacs@gnu.org>
> 
> > We now highlight any non-ASCII character whose Unicode
> > general category is "Space Separator" (or Zs for short).
> 
> I fail to see the connection between the abbreviation and the original
> representation it stands for.

You mean, Zs vs "Space Separator"?  Please complain to the Unicode
Consortium about any of that, Emacs just uses the names and
nomenclature they invented.  See

  https://www.unicode.org/reports/tr44/#General_Category_Values




* Re: [External] : Re: Strange whitespaces.
  2021-10-01  6:34           ` Eli Zaretskii
@ 2021-10-01  7:26             ` Hongyi Zhao
  2021-10-01  7:56               ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 1 reply; 31+ messages in thread
From: Hongyi Zhao @ 2021-10-01  7:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: help-gnu-emacs

On Fri, Oct 1, 2021 at 2:36 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Hongyi Zhao <hongyi.zhao@gmail.com>
> > Date: Fri, 1 Oct 2021 09:51:22 +0800
> > Cc: help-gnu-emacs <help-gnu-emacs@gnu.org>
> >
> > > We now highlight any non-ASCII character whose Unicode
> > > general category is "Space Separator" (or Zs for short).
> >
> > I fail to see the connection between the abbreviation and the original
> > representation it stands for.
>
> You mean, Zs vs "Space Separator"?

Yes.

> Please complain to the Unicode Consortium about any of that, Emacs just uses the names and
> nomenclature they invented.  See
>
>   https://www.unicode.org/reports/tr44/#General_Category_Values

I think I have basically figured out the logic behind the nomenclature,
based on the Description given at the above URL:

a space character (of various non-zero widths)

So, the Z <---> non-zero, and s <---> space.

This is like the naming rules used in regular expression
metacharacters, say, in python [1]:

\s
For Unicode (str) patterns:

Matches Unicode whitespace characters (which includes [ \t\n\r\f\v],
and also many other characters, for example the non-breaking spaces
mandated by typography rules in many languages). If the ASCII flag is
used, only [ \t\n\r\f\v] is matched.

For 8-bit (bytes) patterns:

Matches characters considered whitespace in the ASCII character set;
this is equivalent to [ \t\n\r\f\v].

\S

Matches any character which is not a whitespace character. This is the
opposite of \s. If the ASCII flag is used this becomes the equivalent
of [^ \t\n\r\f\v].


[1] https://docs.python.org/3/library/re.html

HZ




* Re: [External] : Re: Strange whitespaces.
  2021-10-01  7:26             ` Hongyi Zhao
@ 2021-10-01  7:56               ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01 10:10                 ` Hongyi Zhao
  0 siblings, 1 reply; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-10-01  7:56 UTC (permalink / raw)
  To: help-gnu-emacs

Hongyi Zhao wrote:

> I think I have basically figured out the logic behind the
> nomenclature, based on the Description given at the above
> URL:
>
> a space character (of various non-zero widths)
>
> So, the Z <---> non-zero, and s <---> space.

The logic is it must be something and it can't be S since

S    Symbol                Sm | Sc | Sk | So

Zo why not Z inztead? Zimple az that! Mozt likely anyway.

Zs   Space_Separator       a space character (of various non-zero widths)     
Zl   Line_Separator        U+2028 LINE SEPARATOR only                         
Zp   Paragraph_Separator   U+2029 PARAGRAPH SEPARATOR only                    
Z    Separator             Zs | Zl | Zp
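
The table can be checked from Emacs itself. A minimal sketch, assuming
`get-char-code-property' with the `general-category' property (the same
data `describe-char' prints); the wrapper name is made up for illustration:

```elisp
;; Hypothetical wrapper (not part of Emacs): return the Unicode general
;; category of CHAR as a symbol such as Zs, Zl or Zp.
(defun my/general-category (char)
  "Return the Unicode general category of CHAR as a symbol."
  (get-char-code-property char 'general-category))

;; SPACE, NO-BREAK SPACE, EN SPACE, LINE SEPARATOR, PARAGRAPH SEPARATOR
(mapcar #'my/general-category '(#x20 #xA0 #x2002 #x2028 #x2029))
;; => (Zs Zs Zs Zl Zp)
```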

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [External] : Re: Strange whitespaces.
  2021-10-01  7:56               ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-10-01 10:10                 ` Hongyi Zhao
  2021-10-01 10:24                   ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 1 reply; 31+ messages in thread
From: Hongyi Zhao @ 2021-10-01 10:10 UTC (permalink / raw)
  To: Emanuel Berg, help-gnu-emacs

On Fri, Oct 1, 2021 at 3:56 PM Emanuel Berg via Users list for the GNU
Emacs text editor <help-gnu-emacs@gnu.org> wrote:
>
> Hongyi Zhao wrote:
>
> > I think I have basically figured out the logic behind the
> > nomenclature, based on the Description given at the above
> > URL:
> >
> > a space character (of various non-zero widths)
> >
> > So, the Z <---> non-zero, and s <---> space.
>
> The logic is it must be something and it can't be S since
>
> S    Symbol                Sm | Sc | Sk | So
>
> Zo why not Z inztead? Zimple az that! Mozt likely anyway.
>
> Zs   Space_Separator       a space character (of various non-zero widths)
> Zl   Line_Separator        U+2028 LINE SEPARATOR only
> Zp   Paragraph_Separator   U+2029 PARAGRAPH SEPARATOR only
> Z    Separator             Zs | Zl | Zp

OMG. I didn't notice the last line above.




* Re: Strange whitespaces.
  2021-09-30 15:41     ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01  1:45       ` Hongyi Zhao
@ 2021-10-01 10:15       ` Eric S Fraga
  2021-10-01 10:28         ` Emanuel Berg via Users list for the GNU Emacs text editor
  1 sibling, 1 reply; 31+ messages in thread
From: Eric S Fraga @ 2021-10-01 10:15 UTC (permalink / raw)
  To: help-gnu-emacs

On Thursday, 30 Sep 2021 at 17:41, Emanuel Berg via Users list for the GNU Emacs text editor wrote:
> As for the red color, use this

Or just "C-u C-x =" (what-cursor-position) or "4ga" in evil-mode?

-- 
Eric S Fraga via Emacs 28.0.50 & org 9.5 on Debian 11.0





* Re: [External] : Re: Strange whitespaces.
  2021-10-01 10:10                 ` Hongyi Zhao
@ 2021-10-01 10:24                   ` Emanuel Berg via Users list for the GNU Emacs text editor
  0 siblings, 0 replies; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-10-01 10:24 UTC (permalink / raw)
  To: help-gnu-emacs

Hongyi Zhao wrote:

>>> I think I have basically figured out the logic behind the
>>> nomenclature, based on the description given at the above
>>> URL:
>>>
>>> a space character (of various non-zero widths)
>>>
>>> So, the Z <---> non-zero, and s <---> space.
>>
>> The logic is it must be something and it can't be S since
>>
>> S    Symbol                Sm | Sc | Sk | So
>>
>> Zo why not Z inztead? Zimple az that! Mozt likely anyway.
>>
>> Zs   Space_Separator       a space character (of various non-zero widths)
>> Zl   Line_Separator        U+2028 LINE SEPARATOR only
>> Zp   Paragraph_Separator   U+2029 PARAGRAPH SEPARATOR only
>> Z    Separator             Zs | Zl | Zp
>
> OMG. I didn't notice the last line above.

'z' is actually the separator between the English alphabet and
what comes after it ...

Maybe we can have a cyclic alphabet in Lisp - no need for
Z then! Z in English isn't even voiced (or is it? Mask of
Zorro - no!) so it could even be dropped from the English ABC.
Mask of Sorro - hm, doesn't have the same elan to it ...

;;; -*- lexical-binding: t -*-
;;;
;;; this file:
;;;   http://user.it.uu.se/~embe8573/emacs-init/abc.el
;;;   https://dataswamp.org/~incal/emacs-init/abc.el

(require 'cl-lib)

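(defun alphabet (&optional as-list)
  "Return the English alphabet as a space-separated string.
If AS-LIST is non-nil, return a list of character codes instead."
  (let ((abc "a b c d e f g h i j k l m n o p q r s t u v w x y z"))
    (if as-list
        (cl-remove ?\s (string-to-list abc))
      abc)))
;; (alphabet)   ; a b c d e f g h i j k l m n o p q r s t u v w x y z
;; (alphabet t) ; (97 98 99 100 101 102 103 104 105 106 107 108 ...)

(defun echo-alphabet (&optional number)
  "Echo the first NUMBER letters of the alphabet, separated by spaces.
A negative NUMBER drops that many letters from the end; nil means the
whole alphabet.  Interactively, NUMBER is the numeric prefix argument
\(1 when no prefix is given)."
  (interactive "p")
  (let* ((num  (or number (length (alphabet t))))
         (part (cl-subseq (alphabet t) 0 num))
         ;; Join the letters directly instead of printing the list and
         ;; trimming its parentheses.
         (str  (mapconcat #'char-to-string part " ")))
    ;; Pass STR as an argument, not as the format string.
    (message "%s" str)))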
;; (echo-alphabet)     ; a b c d e f g h i j k l m n o p q r s t u v w x y z
;; (echo-alphabet  10) ; a b c d e f g h i j
;; (echo-alphabet -10) ; a b c d e f g h i j k l m n o p
(defalias 'abc #'echo-alphabet)

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Strange whitespaces.
  2021-10-01 10:15       ` Eric S Fraga
@ 2021-10-01 10:28         ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01 10:57           ` Eric S Fraga
  0 siblings, 1 reply; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-10-01 10:28 UTC (permalink / raw)
  To: help-gnu-emacs

Eric S Fraga wrote:

>> As for the red color, use this
>
> Or just "C-u C-x =" (what-cursor-position) or "4ga" in
> evil-mode?

Maybe but it brings up a fullscreen buffer with a bunch of
data that (here) isn't asked for ... twice disruptive. This is
what the echo area is for.
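
A minimal sketch of the echo-area approach, assuming one only wants the
`face' property at point. The command name `my-echo-face-at-point' is
hypothetical and this is not the code posted earlier in the thread;
`get-char-property' is the standard function that consults overlays as
well as text properties.

```elisp
;; Hypothetical command (the name is an assumption, not from the thread).
(defun my-echo-face-at-point (&optional pos)
  "Show the `face' property at POS (default: point) in the echo area.
Return the property value, or nil if there is none."
  (interactive)
  (let ((face (get-char-property (or pos (point)) 'face)))
    (message "Face: %s" (or face "none"))
    face))
```

One caveat: a face applied by the display engine itself, such as the
nobreak-space face that `describe-char' listed as a "hardcoded face" in
the original report, is neither a text property nor an overlay property,
so a lookup like this will not report it.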

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Strange whitespaces.
  2021-10-01 10:28         ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-10-01 10:57           ` Eric S Fraga
  2021-10-01 11:18             ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01 12:20             ` Yuri Khan
  0 siblings, 2 replies; 31+ messages in thread
From: Eric S Fraga @ 2021-10-01 10:57 UTC (permalink / raw)
  To: help-gnu-emacs

On Friday,  1 Oct 2021 at 12:28, Emanuel Berg wrote:
> Maybe but it brings up a fullscreen buffer with a bunch of
> data that (here) isn't asked for 

Sure but it is available "out of the box" and will work for everybody so
worth noting, I thought.

-- 
Eric S Fraga via Emacs 28.0.50 & org 9.5 on Debian 11.0





* Re: Strange whitespaces.
  2021-10-01 10:57           ` Eric S Fraga
@ 2021-10-01 11:18             ` Emanuel Berg via Users list for the GNU Emacs text editor
  2021-10-01 12:20             ` Yuri Khan
  1 sibling, 0 replies; 31+ messages in thread
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-10-01 11:18 UTC (permalink / raw)
  To: help-gnu-emacs

Eric S Fraga wrote:

>> Maybe but it brings up a fullscreen buffer with a bunch of
>> data that (here) isn't asked for
>
> Sure but it is available "out of the box" and will work for
> everybody so worth noting, I thought.

Not nice noting nothing ...

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Strange whitespaces.
  2021-10-01 10:57           ` Eric S Fraga
  2021-10-01 11:18             ` Emanuel Berg via Users list for the GNU Emacs text editor
@ 2021-10-01 12:20             ` Yuri Khan
  1 sibling, 0 replies; 31+ messages in thread
From: Yuri Khan @ 2021-10-01 12:20 UTC (permalink / raw)
  To: Eric S Fraga; +Cc: help-gnu-emacs

On Fri, 1 Oct 2021 at 17:58, Eric S Fraga <e.fraga@ucl.ac.uk> wrote:

> Sure but it is available "out of the box" and will work for everybody so
> worth noting, I thought.

If you don’t use hl-line-mode (or its global equivalent), you can also
do M-x describe-face RET or M-x customize-face RET. They then suggest
one of the faces at point as the default value for their argument.
With some luck, the one suggested is the one you need, and you just
C-g the command away.

(With hl-line-mode, hl-line is one of the faces at point every time,
and has a non-negligible chance of being chosen as the suggested
default.)
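
To see the candidate faces without going through a prompt, something
like the sketch below may help. It assumes that the suggested default
comes from `face-at-point', which should be checked against your Emacs
version; the wrapper name `my-faces-at-point' is hypothetical.

```elisp
;; Hypothetical wrapper (the name is an assumption, not from the thread).
(defun my-faces-at-point ()
  "Echo and return the named faces that `face-at-point' finds at point."
  (interactive)
  ;; A non-nil second argument asks for all faces found, not just one.
  (let ((faces (face-at-point nil t)))
    (message "Faces at point: %s" (or faces "none"))
    faces))
```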




