unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
       [not found] <20201026111348.773761-1-bpeeluk.ref@yahoo.co.uk>
@ 2020-10-26 11:13 ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-10-26 16:29   ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-10-26 11:13 UTC (permalink / raw)
  To: 44236

U+202F is like the normal non-breaking space character except that it
is slightly narrower. In the French language, this character is
supposed to be used before most punctuation marks such as question
marks and quote characters. For people using the BÉPO keyboard layout,
this character is typed with just shift+space, so it’s quite easy to
accidentally type it. For that reason it would be nice if it was
displayed differently like the regular non-breaking space. This patch
makes that change.

* src/character.h: Add an enum for the U+202F character.
* src/xdisp.c (get_next_display_element): Use nobreak_space face also
for U+202F.
---
 src/character.h | 1 +
 src/xdisp.c     | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git src/character.h src/character.h
index af5023f77c..90708c8d38 100644
--- src/character.h
+++ src/character.h
@@ -69,6 +69,7 @@ #define EMACS_CHARACTER_H
 enum
 {
   NO_BREAK_SPACE = 0x00A0,
+  NARROW_NO_BREAK_SPACE = 0x202F,
   SOFT_HYPHEN = 0x00AD,
   ZERO_WIDTH_NON_JOINER = 0x200C,
   ZERO_WIDTH_JOINER = 0x200D,
diff --git src/xdisp.c src/xdisp.c
index 5a62cd6eb5..0772066f8a 100644
--- src/xdisp.c
+++ src/xdisp.c
@@ -7555,7 +7555,7 @@ get_next_display_element (struct it *it)
 	     non-ASCII spaces and hyphens specially.  */
 	  if (! ASCII_CHAR_P (c) && ! NILP (Vnobreak_char_display))
 	    {
-	      if (c == NO_BREAK_SPACE)
+	      if (c == NO_BREAK_SPACE || c == NARROW_NO_BREAK_SPACE)
 		nonascii_space_p = true;
 	      else if (c == SOFT_HYPHEN || c == HYPHEN
 		       || c == NON_BREAKING_HYPHEN)
-- 
2.25.4







* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-10-26 11:13 ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-10-26 16:29   ` Eli Zaretskii
  2020-10-26 16:55     ` Drew Adams
                       ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Eli Zaretskii @ 2020-10-26 16:29 UTC (permalink / raw)
  To: Neil Roberts; +Cc: 44236

> Date: Mon, 26 Oct 2020 12:13:48 +0100
> From: Neil Roberts via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> U+202F is like the normal non-breaking space character except that it
> is slightly narrower. In the French language, this character is
> supposed to be used before most punctuation marks such as question
> marks and quote characters. For people using the BÉPO keyboard layout,
> this character is typed with just shift+space, so it’s quite easy to
> accidentally type it. For that reason it would be nice if it was
> displayed differently like the regular non-breaking space. This patch
> makes that change.

Thanks.

But what is the purpose of showing this character like we do with
NBSP?  We do that with NBSP because otherwise it will be easy to
interpret NBSP as a SPC: they have the same width and appearance on
display.  By contrast, U+202F NARROW NO-BREAK SPACE is much thinner,
and cannot be mistaken to be SPC.

OTOH, if we make U+202F stand out, then why not others, for example
U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.

IOW, we need to decide on the rationale for displaying these
specially, and then we can decide which ones should have this applied.






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-10-26 16:29   ` Eli Zaretskii
@ 2020-10-26 16:55     ` Drew Adams
  2020-10-27  9:17     ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-11-01  8:20     ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F Juri Linkov
  2 siblings, 0 replies; 26+ messages in thread
From: Drew Adams @ 2020-10-26 16:55 UTC (permalink / raw)
  To: Eli Zaretskii, Neil Roberts; +Cc: 44236

> But what is the purpose of showing this character like we do with
> NBSP?  We do that with NBSP because otherwise it will be easy to
> interpret NBSP as a SPC: they have the same width and appearance on
> display.  By contrast, U+202F NARROW NO-BREAK SPACE is much thinner,
> and cannot be mistaken to be SPC.
> 
> OTOH, if we make U+202F stand out, then why not others, for example
> U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.
> 
> IOW, we need to decide on the rationale for displaying these
> specially, and then we can decide which ones should have this applied.

I agree, both (1) that the main purpose of the current highlighting is to make no-break (aka hard) space stand out from ordinary space, and (2) that any additional highlighting needs a rationale.
___

FWIW -

My library highlight-chars.el lets you highlight particular chars in different ways, au choix.  And in particular, you can highlight just hard spaces or just hard hyphens (they need not be treated the same way).

https://www.emacswiki.org/emacs/download/highlight-chars.el






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-10-26 16:29   ` Eli Zaretskii
  2020-10-26 16:55     ` Drew Adams
@ 2020-10-27  9:17     ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-10-27 15:24       ` Eli Zaretskii
  2020-11-01  8:20     ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F Juri Linkov
  2 siblings, 1 reply; 26+ messages in thread
From: Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-10-27  9:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44236

Eli Zaretskii <eliz@gnu.org> writes:

> But what is the purpose of showing this character like we do with
> NBSP?  We do that with NBSP because otherwise it will be easy to
> interpret NBSP as a SPC: they have the same width and appearance on
> display.  By contrast, U+202F NARROW NO-BREAK SPACE is much thinner,
> and cannot be mistaken to be SPC.

Most people use Emacs with a monospace font, as is the default if you
don’t change it, so in practice U+202F looks identical to NBSP and the
regular space. I would assume that most people using these characters
would be editing the source code for a document that would be displayed
in something else, such as editing an HTML document. In that case you
want to make sure that you got the right spaces in the source code and
without the visual indication it is really hard to do.

I guess ideally in my case it would be even better if U+202F had a
different face than NBSP so that I could also make sure I picked the
right non-breaking space when typing a document in French.

The other use case, which is probably more common for me, is that I am
editing some source code and I don’t want any non-breaking spaces at
all. With the bépo keyboard layout it’s kind of easy to accidentally
type them, so I just want to be able to recognise either of them. In
that case having the same face for both characters is still helpful.

> OTOH, if we make U+202F stand out, then why not others, for example
> U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.

I think it would make sense to highlight all of the spaces that look
exactly the same as a regular space. That would exclude U+2060 because
that is zero-width. Maybe we could use all of the characters from the
“space separator” Unicode class except U+0020.

https://www.compart.com/en/unicode/category/Zs

- Neil
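
As a rough illustration of the set proposed above, the following sketch
hard-codes the code points that Unicode places in general category Zs
("space separator") and leaves out U+0020.  It is a standalone example,
not Emacs code: the name is_nonascii_space_separator is made up here,
and the table is written from the Zs list in current Unicode versions,
so treat it as an assumption rather than a reference.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of Emacs: true if C is in Unicode
   general category Zs (space separator) and is not ASCII SPC.  */
static bool
is_nonascii_space_separator (unsigned int c)
{
  return (c == 0x00A0                       /* NO-BREAK SPACE */
          || c == 0x1680                    /* OGHAM SPACE MARK */
          || (c >= 0x2000 && c <= 0x200A)   /* EN QUAD .. HAIR SPACE */
          || c == 0x202F                    /* NARROW NO-BREAK SPACE */
          || c == 0x205F                    /* MEDIUM MATHEMATICAL SPACE */
          || c == 0x3000);                  /* IDEOGRAPHIC SPACE */
}
```

Under this predicate U+2007, U+2002, U+2003 and U+2009 from the list
quoted above would be highlighted, while U+2060 (WORD JOINER, which is
in category Cf rather than Zs) would not.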






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-10-27  9:17     ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-10-27 15:24       ` Eli Zaretskii
  2020-10-28 11:37         ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display to all characters of blankp Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2020-10-27 15:24 UTC (permalink / raw)
  To: Neil Roberts; +Cc: 44236

> From: Neil Roberts <bpeeluk@yahoo.co.uk>
> Cc: 44236@debbugs.gnu.org
> Date: Tue, 27 Oct 2020 10:17:35 +0100
> 
> > OTOH, if we make U+202F stand out, then why not others, for example
> > U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.
> 
> I think it would make sense to highlight all of the spaces that look
> exactly the same as a regular space. That would exclude U+2060 because
> that is zero-width. Maybe we could use all of the characters from the
> “space separator” Unicode class except U+0020.

I'm okay with displaying all "space" characters that way.  We already
have a function which tests this category: blankp.  Would you like to
submit a patch which implements the above?  Please also include a NEWS
entry which calls out this new behavior.

Thanks.
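
To make the shape of the requested change concrete, here is a small
standalone model of the branch touched by the first patch.  It is only
a sketch: classify_nobreak, the enum, and the two space_test_*
functions are hypothetical names, and the real code in
get_next_display_element works on Lisp objects and display iterators.
The point is that the space test is the only part that varies, so a
blankp-based version would simply be a third predicate plugged into the
same slot.

```c
#include <stdbool.h>

enum nobreak_kind { NOBREAK_NONE, NOBREAK_SPACE, NOBREAK_HYPHEN };

typedef bool (*space_test_fn) (unsigned int c);

/* The test before any patch: only U+00A0 NO-BREAK SPACE.  */
static bool
space_test_original (unsigned int c)
{
  return c == 0x00A0;
}

/* The test from the first patch: also U+202F NARROW NO-BREAK SPACE.  */
static bool
space_test_first_patch (unsigned int c)
{
  return c == 0x00A0 || c == 0x202F;
}

/* Hypothetical model of the branch; ENABLED stands in for a non-nil
   nobreak-char-display.  */
static enum nobreak_kind
classify_nobreak (unsigned int c, bool enabled, space_test_fn is_space)
{
  if (c < 0x80 || !enabled)     /* ASCII is never treated specially */
    return NOBREAK_NONE;
  if (is_space (c))
    return NOBREAK_SPACE;
  /* Soft hyphen, hyphen, non-breaking hyphen.  */
  if (c == 0x00AD || c == 0x2010 || c == 0x2011)
    return NOBREAK_HYPHEN;
  return NOBREAK_NONE;
}
```

For example, classify_nobreak (0x202F, true, space_test_original)
yields NOBREAK_NONE, while the same call with space_test_first_patch
yields NOBREAK_SPACE.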






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display to all characters of blankp
  2020-10-27 15:24       ` Eli Zaretskii
@ 2020-10-28 11:37         ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-10-30 12:14           ` bug#44236: (no subject) Lars Ingebrigtsen
  0 siblings, 1 reply; 26+ messages in thread
From: Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-10-28 11:37 UTC (permalink / raw)
  To: 44236

nobreak-char-display is documented as making Emacs display all
non-ASCII chars that have the same appearance as an ASCII space using
a special face. In practice however, this was limited to nbsp and the
hyphen characters. When using a monospace font, there are many other
characters that resemble an ASCII space, such as U+202F NARROW
NO-BREAK SPACE. That is like the normal non-breaking space character
except that it is slightly narrower. In the French language, this
character is supposed to be used before most punctuation marks such as
question marks and quote characters, so it is quite prevalent. For
that reason it would be nice if it was displayed differently like the
regular non-breaking space.

This patch makes it show all non-ASCII characters from the Unicode
horizontal space class using the special face.

* src/xdisp.c (get_next_display_element): Use blankp to test whether
to use the nobreak_space face.
---
 doc/emacs/display.texi | 3 ++-
 etc/NEWS               | 8 ++++++++
 src/xdisp.c            | 5 +++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git doc/emacs/display.texi doc/emacs/display.texi
index 6f1bc802b8..ccc945c3af 100644
--- doc/emacs/display.texi
+++ doc/emacs/display.texi
@@ -1605,7 +1605,8 @@ Text Display
 realization, e.g., by yanking; for instance, source code compilers
 typically do not treat non-@acronym{ASCII} spaces as whitespace
 characters.  To deal with this problem, Emacs displays such characters
-specially: it displays @code{U+00A0} (no-break space) with the
+specially: it displays @code{U+00A0} (no-break space) and other
+characters from the Unicode horizontal space class with the
 @code{nobreak-space} face, and it displays @code{U+00AD} (soft
 hyphen), @code{U+2010} (hyphen), and @code{U+2011} (non-breaking
 hyphen) with the @code{nobreak-hyphen} face.  To disable this, change
diff --git etc/NEWS etc/NEWS
index 7dbd3d51fa..dcf9a75723 100644
--- etc/NEWS
+++ etc/NEWS
@@ -163,6 +163,14 @@ your init file:
     (setq frame-title-format '(multiple-frames "%b"
                               ("" invocation-name "@" system-name)))
 
++++
+** 'nobreak-char-display' now also affects all non-ASCII Unicode horizontal space characters.
+The documented intention of this variable is to cause Emacs to display
+characters that could be confused with a space character using a
+different face. Previously this was limited only to NBSP and hyphen
+characters. Now it covers all of the Unicode space characters,
+including narrow NBSP, which has the same appearance.
+
 \f
 * Editing Changes in Emacs 28.1
 
diff --git src/xdisp.c src/xdisp.c
index 5a62cd6eb5..cf30ba9479 100644
--- src/xdisp.c
+++ src/xdisp.c
@@ -7555,7 +7555,7 @@ get_next_display_element (struct it *it)
 	     non-ASCII spaces and hyphens specially.  */
 	  if (! ASCII_CHAR_P (c) && ! NILP (Vnobreak_char_display))
 	    {
-	      if (c == NO_BREAK_SPACE)
+	      if (blankp (c))
 		nonascii_space_p = true;
 	      else if (c == SOFT_HYPHEN || c == HYPHEN
 		       || c == NON_BREAKING_HYPHEN)
@@ -34740,7 +34740,8 @@ syms_of_xdisp (void)
 same appearance as an ASCII space or hyphen, using the `nobreak-space'
 or `nobreak-hyphen' face respectively.
 
-U+00A0 (no-break space), U+00AD (soft hyphen), U+2010 (hyphen), and
+All of the non-ASCII characters in the Unicode horizontal whitespace
+character class, as well as U+00AD (soft hyphen), U+2010 (hyphen), and
 U+2011 (non-breaking hyphen) are affected.
 
 Any other non-nil value means to display these characters as an escape
-- 
2.25.4







* bug#44236: (no subject)
  2020-10-28 11:37         ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display to all characters of blankp Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-10-30 12:14           ` Lars Ingebrigtsen
  0 siblings, 0 replies; 26+ messages in thread
From: Lars Ingebrigtsen @ 2020-10-30 12:14 UTC (permalink / raw)
  To: Neil Roberts; +Cc: 44236

Neil Roberts <bpeeluk@yahoo.co.uk> writes:

> This patch makes it show all non-ASCII characters from the Unicode
> horizontal space class using the special face.

Thanks; applied to Emacs 28.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-10-26 16:29   ` Eli Zaretskii
  2020-10-26 16:55     ` Drew Adams
  2020-10-27  9:17     ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-11-01  8:20     ` Juri Linkov
  2020-11-01  8:30       ` Juri Linkov
  2020-11-01 13:12       ` Lars Ingebrigtsen
  2 siblings, 2 replies; 26+ messages in thread
From: Juri Linkov @ 2020-11-01  8:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Neil Roberts, 44236

> IOW, we need to decide on the rationale for displaying these
> specially, and then we can decide which ones should have this applied.

For a long time my customization contained

  (setq dired-listing-switches "-Alv --block-size='1")

that in Dired buffers displays file sizes using nice space
as the thousands separator between groups of 3 digits.

But now this clean space between numbers is polluted by visual garbage
of unrequested highlighted underlines.

Using 'C-u C-x =' on the character shows that it's NARROW NO-BREAK SPACE
with the nobreak-space face on it.

The intention of nobreak-space is to warn the user about confusable
characters in writable buffers.  But why highlight such characters
in read-only Dired buffers?






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01  8:20     ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F Juri Linkov
@ 2020-11-01  8:30       ` Juri Linkov
  2020-11-01 13:12       ` Lars Ingebrigtsen
  1 sibling, 0 replies; 26+ messages in thread
From: Juri Linkov @ 2020-11-01  8:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Neil Roberts, 44236

> For a long time my customization contained
>
>   (setq dired-listing-switches "-Alv --block-size='1")
>
> that in Dired buffers displays file sizes using nice space
> as the thousands separator between groups of 3 digits.
>
> But now this clean space between numbers is polluted by visual garbage
> of unrequested highlighted underlines.

For example, gnus-article-mode disables this highlighting
in read-only buffers with:

  ;; Prevent Emacs from displaying non-break space with
  ;; `nobreak-space' face.
  (set (make-local-variable 'nobreak-char-display) nil)

But still in Dired buffers this highlighting is useful to see
bad characters in file names.  Whereas such highlighting makes no sense
in file sizes.

> Using 'C-u C-x =' on the character shows that it's NARROW NO-BREAK SPACE
> with the nobreak-space face on it.

It displays this information with this patch:

diff --git a/lisp/descr-text.el b/lisp/descr-text.el
index ec9a968013..075cb21c21 100644
--- a/lisp/descr-text.el
+++ b/lisp/descr-text.el
@@ -687,7 +687,8 @@ describe-char
                                   (save-excursion (goto-char pos)
                                                   (looking-at-p "[ \t]+$")))
                              'trailing-whitespace)
-                            ((and nobreak-char-display char (eq char '#xa0))
+                            ((and nobreak-char-display char
+                                  (eq (get-char-code-property char 'general-category) 'Zs))
                              'nobreak-space)
                             ((and nobreak-char-display char
 				  (memq char '(#xad #x2010 #x2011)))






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01  8:20     ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F Juri Linkov
  2020-11-01  8:30       ` Juri Linkov
@ 2020-11-01 13:12       ` Lars Ingebrigtsen
  2020-11-01 15:16         ` Eli Zaretskii
  2020-11-01 18:53         ` Juri Linkov
  1 sibling, 2 replies; 26+ messages in thread
From: Lars Ingebrigtsen @ 2020-11-01 13:12 UTC (permalink / raw)
  To: Juri Linkov; +Cc: Neil Roberts, 44236

Juri Linkov <juri@linkov.net> writes:

> The intention of nobreak-space is to warn the user about confusable
> characters in writable buffers.  But why highlight such characters
> in read-only Dired buffers?

Perhaps `special-mode' should switch this highlighting off?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 13:12       ` Lars Ingebrigtsen
@ 2020-11-01 15:16         ` Eli Zaretskii
  2020-11-01 18:51           ` Juri Linkov
  2020-11-01 18:53         ` Juri Linkov
  1 sibling, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2020-11-01 15:16 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: bpeeluk, 44236, juri

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Neil Roberts <bpeeluk@yahoo.co.uk>,
>   44236@debbugs.gnu.org
> Date: Sun, 01 Nov 2020 14:12:49 +0100
> 
> Juri Linkov <juri@linkov.net> writes:
> 
> > The intention of nobreak-space is to warn the user about confusable
> > characters in writable buffers.  But why highlight such characters
> > in read-only Dired buffers?
> 
> Perhaps `special-mode' should switch this highlighting off?

That sounds too drastic to me.  But perhaps we should highlight
this character and other "thin" spaces only on TTY frames, where they
really look like a SPC?  Because on GUI frames it is quite easy to
understand that they are not a SPC character.






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 15:16         ` Eli Zaretskii
@ 2020-11-01 18:51           ` Juri Linkov
  2020-11-01 19:29             ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: Juri Linkov @ 2020-11-01 18:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bpeeluk, Lars Ingebrigtsen, 44236

>> > The intention of nobreak-space is to warn the user about confusable
>> > characters in writable buffers.  But why highlight such characters
>> > in read-only Dired buffers?
>>
>> Perhaps `special-mode' should switch this highlighting off?
>
> That sounds too drastic to me.

I agree.

> But perhaps we should highlight this character and other "thin"
> spaces only on TTY frames, where they really look like a SPC?
> Because on GUI frames it is quite easy to understand that they are not
> a SPC character.

Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
still has the same width as all other space characters.  So there is
no visual difference between them on GUI frames.






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 13:12       ` Lars Ingebrigtsen
  2020-11-01 15:16         ` Eli Zaretskii
@ 2020-11-01 18:53         ` Juri Linkov
  2020-11-01 19:30           ` Eli Zaretskii
  2020-11-01 19:41           ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 2 replies; 26+ messages in thread
From: Juri Linkov @ 2020-11-01 18:53 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Neil Roberts, 44236

>> The intention of nobreak-space is to warn the user about confusable
>> characters in writable buffers.  But why highlight such characters
>> in read-only Dired buffers?
>
> Perhaps `special-mode' should switch this highlighting off?

In Dired it's still useful to be able to spot unusual characters
in file names, but such highlighting is useless in file sizes.

Maybe highlighting should check for some text properties,
and not highlight nobreak-chars in text with these properties?






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 18:51           ` Juri Linkov
@ 2020-11-01 19:29             ` Eli Zaretskii
  2020-11-01 19:40               ` Juri Linkov
  0 siblings, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2020-11-01 19:29 UTC (permalink / raw)
  To: Juri Linkov; +Cc: bpeeluk, larsi, 44236

> From: Juri Linkov <juri@linkov.net>
> Cc: Lars Ingebrigtsen <larsi@gnus.org>,  bpeeluk@yahoo.co.uk,
>   44236@debbugs.gnu.org
> Date: Sun, 01 Nov 2020 20:51:33 +0200
> 
> > But perhaps we should highlight this character and other "thin"
> > spaces only on TTY frames, where they really look like a SPC?
> > Because on GUI frames it is quite easy to understand that they are not
> > a SPC character.
> 
> Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
> still has the same width as all other space characters.  So there is
> no visual difference between them on GUI frames.

In which case the special face is entirely appropriate, and I don't
think I understand your complaint.






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 18:53         ` Juri Linkov
@ 2020-11-01 19:30           ` Eli Zaretskii
  2020-11-01 19:41             ` Juri Linkov
  2020-11-01 19:41           ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2020-11-01 19:30 UTC (permalink / raw)
  To: Juri Linkov; +Cc: bpeeluk, larsi, 44236

> From: Juri Linkov <juri@linkov.net>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Neil Roberts <bpeeluk@yahoo.co.uk>,
>   44236@debbugs.gnu.org
> Date: Sun, 01 Nov 2020 20:53:59 +0200
> 
> Maybe highlighting should check for some text properties,
> and not highlight nobreak-chars in text with these properties?

That would mean an entirely different implementation from what we have
now.






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 19:29             ` Eli Zaretskii
@ 2020-11-01 19:40               ` Juri Linkov
  2020-11-01 19:52                 ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: Juri Linkov @ 2020-11-01 19:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bpeeluk, larsi, 44236

>> Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
>> still has the same width as all other space characters.  So there is
>> no visual difference between them on GUI frames.
>
> In which case the special face is entirely appropriate, and I don't
> think I understand your complaint.

Seeing hundreds of red underlines in Dired buffers is a horrible experience.






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 19:30           ` Eli Zaretskii
@ 2020-11-01 19:41             ` Juri Linkov
  2020-11-01 19:59               ` Eli Zaretskii
  0 siblings, 1 reply; 26+ messages in thread
From: Juri Linkov @ 2020-11-01 19:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bpeeluk, larsi, 44236

>> Maybe highlighting should check for some text properties,
>> and not highlight nobreak-chars in text with these properties?
>
> That would mean an entirely different implementation from what we have
> now.

get_next_display_element has no access to text properties?






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 18:53         ` Juri Linkov
  2020-11-01 19:30           ` Eli Zaretskii
@ 2020-11-01 19:41           ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-11-01 20:00             ` Juri Linkov
  1 sibling, 1 reply; 26+ messages in thread
From: Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-11-01 19:41 UTC (permalink / raw)
  To: Juri Linkov, Lars Ingebrigtsen; +Cc: Eli Zaretskii, 44236

Juri Linkov <juri@linkov.net> writes:

> In Dired it's still useful to be able to spot unusual characters
> in file names, but such highlighting is useless in file sizes.

If the idea is to spot malicious filenames that have confusable
characters then I think the problem is much larger than just confusing
space characters. For example “.аl‌ias”, with a letter from the Cyrillic
alphabet and a zero-width space. I think that particular problem is out
of scope for the nobreak-char-display feature.

Regards,
- Neil






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 19:40               ` Juri Linkov
@ 2020-11-01 19:52                 ` Eli Zaretskii
  2020-11-01 20:12                   ` Juri Linkov
  0 siblings, 1 reply; 26+ messages in thread
From: Eli Zaretskii @ 2020-11-01 19:52 UTC (permalink / raw)
  To: Juri Linkov; +Cc: bpeeluk, larsi, 44236

> From: Juri Linkov <juri@linkov.net>
> Cc: larsi@gnus.org,  bpeeluk@yahoo.co.uk,  44236@debbugs.gnu.org
> Date: Sun, 01 Nov 2020 21:40:20 +0200
> 
> >> Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
> >> still has the same width as all other space characters.  So there is
> >> no visual difference between them on GUI frames.
> >
> > In which case the special face is entirely appropriate, and I don't
> > think I understand your complaint.
> 
> Seeing hundreds of red underlines in Dired buffers is a horrible experience.

You can turn the feature off locally in your Dired buffers, no?

I mean, you've created this situation by customizing the Dired
display, so customizing it a bit more should not be a grave problem,
IMO.






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 19:41             ` Juri Linkov
@ 2020-11-01 19:59               ` Eli Zaretskii
  0 siblings, 0 replies; 26+ messages in thread
From: Eli Zaretskii @ 2020-11-01 19:59 UTC (permalink / raw)
  To: Juri Linkov; +Cc: bpeeluk, larsi, 44236

> From: Juri Linkov <juri@linkov.net>
> Cc: larsi@gnus.org,  bpeeluk@yahoo.co.uk,  44236@debbugs.gnu.org
> Date: Sun, 01 Nov 2020 21:41:10 +0200
> 
> >> Maybe highlighting should check for some text properties,
> >> and not highlight nobreak-chars in text with these properties?
> >
> > That would mean an entirely different implementation from what we have
> > now.
> 
> get_next_display_element has no access to text properties?

Text properties are handled by the display code on a level above
get_next_display_element.

But that's not what I meant.  I meant that if we want to base this on
text properties, we should do this via hi-lock or similar, not in the
display engine which treats all characters the same.

Alternatively, if this new feature is so annoying, and people are
unwilling to customize their Emacs to get the old behavior back, maybe
we should make nobreak-char-display more than just a simple boolean,
so that people could control which characters are and aren't
emphasized?
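
One way to picture "more than just a simple boolean" is an explicit,
user-chosen list of code points.  The sketch below is purely
illustrative: struct emphasis_list, should_emphasize, and old_behavior
are hypothetical names, nothing like this exists in Emacs as far as
this thread shows, and a real design would live in Lisp-visible
variables rather than a C array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: a user-chosen set of code points to emphasize.  */
struct emphasis_list
{
  const unsigned int *chars;
  size_t count;
};

/* Return true if C appears in LIST.  */
static bool
should_emphasize (unsigned int c, const struct emphasis_list *list)
{
  for (size_t i = 0; i < list->count; i++)
    if (list->chars[i] == c)
      return true;
  return false;
}

/* Example setting that mirrors the behavior before this change:
   NBSP plus the three hyphen characters, and no other spaces.  */
static const unsigned int old_behavior_chars[] =
  { 0x00A0, 0x00AD, 0x2010, 0x2011 };
static const struct emphasis_list old_behavior =
  { old_behavior_chars,
    sizeof old_behavior_chars / sizeof old_behavior_chars[0] };
```

With the old_behavior list, U+00A0 would still be emphasized and U+202F
would not, which is the result wanted for the Dired file-size case
described earlier in the thread.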






* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 19:41           ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-11-01 20:00             ` Juri Linkov
  0 siblings, 0 replies; 26+ messages in thread
From: Juri Linkov @ 2020-11-01 20:00 UTC (permalink / raw)
  To: Neil Roberts; +Cc: Lars Ingebrigtsen, 44236


>> In Dired it's still useful to be able to spot unusual characters
>> in file names, but such highlighting is useless in file sizes.
>
> If the idea is to spot malicious filenames that have confusable
> characters then I think the problem is much larger than just confusing
> space characters. For example “.аl‌ias”, with a letter from the Cyrillic
> alphabet and a zero-width space. I think that particular problem is out
> of scope for the nobreak-char-display feature.

You tried to sneak in confusable characters, but I see them clearly,
heh heh :-)


[-- Attachment #2: confusables.png --]
[-- Type: image/png, Size: 24625 bytes --]



These characters are revealed thanks to the 'markchars' package from GNU ELPA,
and glyphless-char face customized to :background "red".


* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 19:52                 ` Eli Zaretskii
@ 2020-11-01 20:12                   ` Juri Linkov
  2020-11-03 18:44                     ` Juri Linkov
  0 siblings, 1 reply; 26+ messages in thread
From: Juri Linkov @ 2020-11-01 20:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bpeeluk, larsi, 44236

>> Seeing hundreds of red underlines in Dired buffers is a horrible experience.
>
> You can turn the feature off locally in your Dired buffers, no?
>
> I mean, you've created this situation by customizing the Dired
> display, so customizing it a bit more should not be a grave problem,
> IMO.

Indeed, this should not be hard to do if a more general solution
can't be found.

> Text properties are handled by the display code on a level above
> get_next_display_element.
>
> But that's not what I meant.  I meant that if we want to base this on
> text properties, we should do this via hi-lock or similar, not in the
> display engine which treats all characters the same.

Or markchars.el, or uni-confusables.el.  Maybe it would be better to create
another package like these, e.g. nobreak.el, based on font-lock-mode?

> Alternatively, if this new feature is so annoying, and people are
> unwilling to customize their Emacs to get the old behavior back, maybe
> we should make nobreak-char-display more than just a simple boolean,
> so that people could control which characters are and aren't
> emphasized?

This would complicate the core functions.





* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
       [not found]               ` <<83lffketlt.fsf@gnu.org>
@ 2020-11-01 22:43                 ` Drew Adams
  0 siblings, 0 replies; 26+ messages in thread
From: Drew Adams @ 2020-11-01 22:43 UTC (permalink / raw)
  To: Eli Zaretskii, Juri Linkov; +Cc: bpeeluk, larsi, 44236

> Alternatively, if this new feature is so annoying, and people are
> unwilling to customize their Emacs to get the old behavior back, maybe
> we should make nobreak-char-display more than just a simple boolean,
> so that people could control which characters are and aren't
> emphasized?

(Not speaking to how annoying anything might be, here.)

`nobreak-char-display' is what it is.  It can't be
expected to do more than it does, IMO.  Its aim is
to highlight non-ASCII chars that look similar to
ASCII space and hyphen.

That's already too much, IMO.  I've said before that
it's a weakness that users can't separate those two
(highlighting look-alikes for SPC and hyphen).  They
shouldn't be hard-coupled together (IMO).

And I mentioned my library `highlight-chars.el',
which lets you highlight different sets of chars.
And code can control that.  It sounds like that's
maybe what's being looked for here: highlight certain
chars in certain contexts (not just everywhere).
___

Description:

https://www.emacswiki.org/emacs/ShowWhiteSpace#HighlightChars

Code:

https://www.emacswiki.org/emacs/download/highlight-chars.el





* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-01 20:12                   ` Juri Linkov
@ 2020-11-03 18:44                     ` Juri Linkov
  2020-11-03 21:07                       ` Basil L. Contovounesios
  0 siblings, 1 reply; 26+ messages in thread
From: Juri Linkov @ 2020-11-03 18:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bpeeluk, larsi, 44236

[-- Attachment #1: Type: text/plain, Size: 838 bytes --]

>> But that's not what I meant.  I meant that if we want to base this on
>> text properties, we should do this via hi-lock or similar, not in the
>> display engine which treats all characters the same.
>
> Or markchars.el, or uni-confusables.el.  Maybe it would be better to create
> another package like these, e.g. nobreak.el, based on font-lock-mode?

I have now extended markchars.el to highlight exactly the same characters
that nobreak-char-display highlights, and additionally to highlight them
only in file names in Dired.  This can be configured with a hook like this:

  (add-hook 'dired-mode-hook
            (lambda ()
              (setq-local nobreak-char-display nil)
              (setq-local markchars-what '(markchars-nobreak-space
                                           markchars-nobreak-hyphen))
              (markchars-mode 1)))


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: markchars-nobreak.patch --]
[-- Type: text/x-diff, Size: 4898 bytes --]

diff --git a/packages/markchars/markchars.el b/packages/markchars/markchars.el
index 7d7fe2982..bd902f7c7 100644
--- a/packages/markchars/markchars.el
+++ b/packages/markchars/markchars.el
@@ -31,6 +31,12 @@
 ;; `markchars-face-confusable' or `markchars-face-pattern'
 ;; respectively.
 ;;
+;; You can set `nobreak-char-display' to nil, and use
+;; `markchars-nobreak-space' and `markchars-nobreak-hyphen'
+;; in Dired buffers to highlight `nobreak-space' and `nobreak-hyphen'
+;; only in file names, not `nobreak-space' used as thousands separators
+;; in file sizes (bug#44236).
+;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;;; Change log:
@@ -79,6 +85,16 @@ markchars-white
   "White face for `markchars-mode' char marking."
   :group 'markchars)
 
+(defface markchars-nobreak-space
+  '((t (:inherit nobreak-space)))
+  "Face for displaying nobreak space."
+  :group 'markchars)
+
+(defface markchars-nobreak-hyphen
+  '((t (:inherit nobreak-hyphen)))
+  "Face for displaying nobreak hyphens."
+  :group 'markchars)
+
 (defcustom markchars-face-pattern 'markchars-heavy
   "Pointer to face used for marking matched patterns."
   :type 'face
@@ -101,12 +117,40 @@ markchars-simple-pattern
   :type 'regexp
   :group 'markchars)
 
+(defvar markchars-nobreak-space-pattern
+  (rx (any ;; ?\N{SPACE}
+           ?\N{NO-BREAK SPACE}
+           ?\N{OGHAM SPACE MARK}
+           ?\N{EN QUAD}
+           ?\N{EM QUAD}
+           ?\N{EN SPACE}
+           ?\N{EM SPACE}
+           ?\N{THREE-PER-EM SPACE}
+           ?\N{FOUR-PER-EM SPACE}
+           ?\N{SIX-PER-EM SPACE}
+           ?\N{FIGURE SPACE}
+           ?\N{PUNCTUATION SPACE}
+           ?\N{THIN SPACE}
+           ?\N{HAIR SPACE}
+           ?\N{NARROW NO-BREAK SPACE}
+           ?\N{MEDIUM MATHEMATICAL SPACE}
+           ?\N{IDEOGRAPHIC SPACE}))
+  "Regexp matching non-ASCII characters with general-category `Zs' (Separator, Space).")
+
+(defvar markchars-nobreak-hyphen-pattern
+  (rx (any ?\N{SOFT HYPHEN} ?\N{HYPHEN} ?\N{NON-BREAKING HYPHEN}))
+  "Regexp matching non-ASCII hyphen characters.")
+
 (defcustom markchars-what
   `(markchars-simple-pattern
     markchars-confusables
     ,@(when (fboundp 'idn-is-recommended) '(markchars-nonidn-fun)))
   "Things to mark, a list of regular expressions or symbols."
   :type `(repeat (choice :tag "Marking choices"
+                         (const :tag "Non-ASCII space chars"
+                                markchars-nobreak-space)
+                         (const :tag "Non-ASCII hyphen chars"
+                                markchars-nobreak-hyphen)
                          (const
                           :tag "Non IDN chars (Unicode.org tr39 suggestions)"
                           markchars-nonidn-fun)
@@ -129,6 +173,18 @@ markchars-set-keywords
                            (when (eq what 'markchars-simple-pattern)
                              (setq what markchars-simple-pattern))
                            (cond
+                            ((eq what 'markchars-nobreak-space)
+                             (list
+                              markchars-nobreak-space-pattern
+                              (list 0 '(markchars--render-nobreak-space
+                                        (match-beginning 0)
+                                        (match-end 0)))))
+                            ((eq what 'markchars-nobreak-hyphen)
+                             (list
+                              markchars-nobreak-hyphen-pattern
+                              (list 0 '(markchars--render-nobreak-hyphen
+                                        (match-beginning 0)
+                                        (match-end 0)))))
                             ((eq what 'markchars-nonidn-fun)
                              (list
                               "\\<\\w+\\>"
@@ -184,6 +240,22 @@ markchars--render-nonidn
           (put-text-property (point) (1+ (point)) 'face markchars-face-nonidn)))
       (forward-char))))
 
+(defun markchars--render-nobreak-space (beg end)
+  "Assign markchars pattern properties between BEG and END.
+In Dired/WDired buffers, highlight nobreak-space characters
+only in file names, not anywhere else, so it doesn't highlight
+nobreak-space characters used as thousands separators in file sizes."
+  (when (or (not (derived-mode-p 'dired-mode 'wdired-mode))
+            (or (get-text-property beg 'dired-filename)
+                (get-text-property end 'dired-filename)))
+    (put-text-property beg end 'face 'markchars-nobreak-space)
+    (put-text-property beg end 'markchars 'nobreak-space)))
+
+(defun markchars--render-nobreak-hyphen (beg end)
+  "Assign markchars pattern properties between BEG and END."
+  (put-text-property beg end 'face 'markchars-nobreak-hyphen)
+  (put-text-property beg end 'markchars 'nobreak-hyphen))
+
 ;;;###autoload
 (define-minor-mode markchars-mode
   "Mark special characters.

* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-03 18:44                     ` Juri Linkov
@ 2020-11-03 21:07                       ` Basil L. Contovounesios
  2020-11-04 19:54                         ` Juri Linkov
  0 siblings, 1 reply; 26+ messages in thread
From: Basil L. Contovounesios @ 2020-11-03 21:07 UTC (permalink / raw)
  To: Juri Linkov; +Cc: bpeeluk, larsi, 44236

Juri Linkov <juri@linkov.net> writes:

> +(defface markchars-nobreak-space
> +  '((t (:inherit nobreak-space)))

> +(defface markchars-nobreak-hyphen
> +  '((t (:inherit nobreak-hyphen)))

AKA '((t :inherit ...)).
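
Spelled out for the first face, the shorter form would look roughly
like this (a sketch; the docstring and group are copied from the patch
quoted above).  Each entry in a face spec can be written as
(DISPLAY . PLIST), so the inner parentheses around the attribute list
are optional:

```elisp
(defface markchars-nobreak-space
  '((t :inherit nobreak-space))
  "Face for displaying nobreak space."
  :group 'markchars)
```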

-- 
Basil





* bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
  2020-11-03 21:07                       ` Basil L. Contovounesios
@ 2020-11-04 19:54                         ` Juri Linkov
  0 siblings, 0 replies; 26+ messages in thread
From: Juri Linkov @ 2020-11-04 19:54 UTC (permalink / raw)
  To: Basil L. Contovounesios; +Cc: bpeeluk, larsi, 44236

>> +(defface markchars-nobreak-space
>> +  '((t (:inherit nobreak-space)))
>
>> +(defface markchars-nobreak-hyphen
>> +  '((t (:inherit nobreak-hyphen)))
>
> AKA '((t :inherit ...)).

Thanks, I also fixed the existing deffaces where I copied this from,
and pushed to GNU ELPA.





end of thread, other threads:[~2020-11-04 19:54 UTC | newest]

Thread overview: 26+ messages
     [not found] <20201026111348.773761-1-bpeeluk.ref@yahoo.co.uk>
2020-10-26 11:13 ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
2020-10-26 16:29   ` Eli Zaretskii
2020-10-26 16:55     ` Drew Adams
2020-10-27  9:17     ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
2020-10-27 15:24       ` Eli Zaretskii
2020-10-28 11:37         ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display to all characters of blankp Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
2020-10-30 12:14           ` bug#44236: (no subject) Lars Ingebrigtsen
2020-11-01  8:20     ` bug#44236: [PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F Juri Linkov
2020-11-01  8:30       ` Juri Linkov
2020-11-01 13:12       ` Lars Ingebrigtsen
2020-11-01 15:16         ` Eli Zaretskii
2020-11-01 18:51           ` Juri Linkov
2020-11-01 19:29             ` Eli Zaretskii
2020-11-01 19:40               ` Juri Linkov
2020-11-01 19:52                 ` Eli Zaretskii
2020-11-01 20:12                   ` Juri Linkov
2020-11-03 18:44                     ` Juri Linkov
2020-11-03 21:07                       ` Basil L. Contovounesios
2020-11-04 19:54                         ` Juri Linkov
2020-11-01 18:53         ` Juri Linkov
2020-11-01 19:30           ` Eli Zaretskii
2020-11-01 19:41             ` Juri Linkov
2020-11-01 19:59               ` Eli Zaretskii
2020-11-01 19:41           ` Neil Roberts via Bug reports for GNU Emacs, the Swiss army knife of text editors
2020-11-01 20:00             ` Juri Linkov
     [not found] <<20201026111348.773761-1-bpeeluk.ref@yahoo.co.uk>
     [not found] ` <<20201026111348.773761-1-bpeeluk@yahoo.co.uk>
     [not found]   ` <<837drdeyss.fsf@gnu.org>
     [not found]     ` <<87h7q98p4q.fsf@mail.linkov.net>
     [not found]       ` <<87imapcjam.fsf@gnus.org>
     [not found]         ` <<87wnz4zz5k.fsf@mail.linkov.net>
     [not found]           ` <<83wnz4euxk.fsf@gnu.org>
     [not found]             ` <<87r1pcyib0.fsf@mail.linkov.net>
     [not found]               ` <<83lffketlt.fsf@gnu.org>
2020-11-01 22:43                 ` Drew Adams
