unofficial mirror of emacs-devel@gnu.org 
* minor mode for highlighting character classes such as non-ascii (bug 47455)
@ 2021-06-01 16:16 Roland Winkler
  2021-06-01 16:31 ` Eli Zaretskii
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Roland Winkler @ 2021-06-01 16:16 UTC
  To: emacs-devel

In the context of bug 47455 the idea emerged that it can be useful
to have a minor mode that highlights non-ascii characters.  I want
to ask here for feedback: does such a mode already exist?  Can it be
useful to have this minor mode more general such that the highlighted
character classes become configurable?  Are there possible use cases
for this beyond non-ascii characters?




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:16 minor mode for highlighting character classes such as non-ascii (bug 47455) Roland Winkler
@ 2021-06-01 16:31 ` Eli Zaretskii
  2021-06-01 16:32 ` Óscar Fuentes
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2021-06-01 16:31 UTC
  To: Roland Winkler; +Cc: emacs-devel

> Date: Tue, 1 Jun 2021 11:16:53 -0500
> From: "Roland Winkler" <winkler@gnu.org>
> 
> In the context of bug 47455 the idea emerged that it can be useful
> to have a minor mode that highlights non-ascii characters.  I want
> to ask here for feedback: does such a mode already exist?  Can it be
> useful to have this minor mode more general such that the highlighted
> character classes become configurable?  Are there possible use cases
> for this beyond non-ascii characters?

We already highlight some potentially confusing characters, see
nobreak-char-display and faces nobreak-space and nobreak-hyphen.
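
For anyone who wants to try this, here is a minimal init-file sketch.  It assumes the variable is spelled `nobreak-char-display' (the spelling used in Emacs itself) and that the `nobreak-hyphen' face is available in the running Emacs; the face attributes are arbitrary choices for illustration only.

```elisp
;; Ask the display engine to show no-break characters with the
;; dedicated faces.
(setq nobreak-char-display t)

;; Make the two faces stand out more.  The colors are illustrative
;; assumptions, not recommended defaults.
(set-face-attribute 'nobreak-space nil :background "yellow" :underline t)
(set-face-attribute 'nobreak-hyphen nil :foreground "red")
```

This only covers the characters the display engine already treats specially; it does not extend to arbitrary non-ASCII characters.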




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:16 minor mode for highlighting character classes such as non-ascii (bug 47455) Roland Winkler
  2021-06-01 16:31 ` Eli Zaretskii
@ 2021-06-01 16:32 ` Óscar Fuentes
  2021-06-01 16:48   ` Eli Zaretskii
  2021-06-01 16:49   ` [External] : " Drew Adams
  2021-06-01 16:53 ` Daniel Martín
  2021-06-01 20:51 ` Juri Linkov
  3 siblings, 2 replies; 14+ messages in thread
From: Óscar Fuentes @ 2021-06-01 16:32 UTC
  To: emacs-devel

"Roland Winkler" <winkler@gnu.org> writes:

> In the context of bug 47455 the idea emerged that it can be useful
> to have a minor mode that highlights non-ascii characters.  I want
> to ask here for feedback: does such a mode already exist?  Can it be
> useful to have this minor mode more general such that the highlighted
> character classes become configurable?  Are there possible use cases
> for this beyond non-ascii characters?

Highlighting characters that do not belong to a certain coding system
would be useful to me.

Sometimes I work on source code encoded as UTF-8 that must not
contain characters not encodable by windows-1252. More precisely, string
literals must be composed of characters encodable by windows-1252.





* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:32 ` Óscar Fuentes
@ 2021-06-01 16:48   ` Eli Zaretskii
  2021-06-01 16:49   ` [External] : " Drew Adams
  1 sibling, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2021-06-01 16:48 UTC
  To: Óscar Fuentes; +Cc: emacs-devel

> From: Óscar Fuentes <ofv@wanadoo.es>
> Date: Tue, 01 Jun 2021 18:32:59 +0200
> 
> "Roland Winkler" <winkler@gnu.org> writes:
> 
> > In the context of bug 47455 the idea emerged that it can be useful
> > to have a minor mode that highlights non-ascii characters.  I want
> > to ask here for feedback: does such a mode already exist?  Can it be
> > useful to have this minor mode more general such that the highlighted
> > character classes become configurable?  Are there possible use cases
> > for this beyond non-ascii characters?
> 
> Highlighting characters that do not belong to a certain coding system
> would be useful to me.

Characters cannot belong to a coding system.  I think you mean
"characters that cannot be safely encoded by a coding system".

> Sometimes I work on source code encoded as UTF-8 that must not
> contain characters not encodable by windows-1252. More precisely, string
> literals must be composed of characters encodable by windows-1252.

We already make such tests when you save the buffer, so the code to do
that exists and can be reused.  But this highlighting might turn out
to be somewhat expensive.
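
To make the idea concrete, here is a rough sketch of a one-shot command built on `unencodable-char-position', a built-in function that reports where a region cannot be encoded by a given coding system.  The command name `my-highlight-unencodable-chars' and the overlay property `my-unencodable' are hypothetical names chosen for illustration.  The sketch scans the whole buffer once and does not track later edits, so it sidesteps rather than settles the cost question.

```elisp
(defun my-highlight-unencodable-chars (coding-system)
  "Highlight characters in the buffer that CODING-SYSTEM cannot encode.
Hypothetical helper for illustration; it scans the whole buffer once
and does not update the highlighting as the buffer changes."
  (interactive
   (list (read-coding-system "Coding system: " 'windows-1252)))
  ;; Drop overlays left over from a previous run.
  (remove-overlays (point-min) (point-max) 'my-unencodable t)
  (let ((pos (point-min))
        (count 0))
    ;; `unencodable-char-position' returns the position of the first
    ;; unencodable character in the region, or nil if there is none.
    (while (setq pos (unencodable-char-position
                      pos (point-max) coding-system))
      (let ((ov (make-overlay pos (1+ pos))))
        (overlay-put ov 'my-unencodable t)
        (overlay-put ov 'face 'highlight))
      ;; Resume the search just after the character we marked.
      (setq pos (1+ pos)
            count (1+ count)))
    (message "%d unencodable character(s) highlighted" count)))
```

Calling `M-x my-highlight-unencodable-chars' and accepting the default would mark everything windows-1252 cannot represent.  Restricting the check to string literals, as in the use case above, would need extra logic that this sketch does not attempt.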




* RE: [External] : Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:32 ` Óscar Fuentes
  2021-06-01 16:48   ` Eli Zaretskii
@ 2021-06-01 16:49   ` Drew Adams
  2021-06-01 16:53     ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Drew Adams @ 2021-06-01 16:49 UTC
  To: Óscar Fuentes, emacs-devel@gnu.org

> > In the context of bug 47455 the idea emerged that it can be useful
> > to have a minor mode that highlights non-ascii characters.  I want
> > to ask here for feedback: does such a mode already exist?  Can it be
> > useful to have this minor mode more general such that the highlighted
> > character classes become configurable?  Are there possible use cases
> > for this beyond non-ascii characters?

Eli> We already highlight some potentially confusing characters,
Eli> see nobreak-char-display and faces nobreak-space and
Eli> nobreak-hyphen.

OF> Highlighting characters that do not belong to a certain
OF> coding system would be useful to me.
OF>
OF> Sometimes I work on source code encoded as UTF-8 that
OF> must not contain characters not encodable by windows-1252.
OF> More precisely, string literals must be composed of
OF> characters encodable by windows-1252.

That highlighting of no-break chars is limited.

1. You can't specify which no-break chars you want to highlight,
   and you can't highlight different such chars differently.

2. It is low-level highlighting, which doesn't use font-lock.

`whitespace.el' is similarly limited: for example, it couples space
and hard-space chars for highlighting.

More useful is something like what library `highlight-chars.el'
offers.  You can highlight any chars you like, in any way you
like.  You can highlight arbitrary chars, chars in char ranges,
char classes, or charsets.

https://www.emacswiki.org/emacs/ShowWhiteSpace#HighlightChars

https://www.emacswiki.org/emacs/download/highlight-chars.el





* Re: [External] : Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:49   ` [External] : " Drew Adams
@ 2021-06-01 16:53     ` Eli Zaretskii
  2021-06-01 18:07       ` Drew Adams
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2021-06-01 16:53 UTC
  To: Drew Adams; +Cc: ofv, emacs-devel

> From: Drew Adams <drew.adams@oracle.com>
> Date: Tue, 1 Jun 2021 16:49:21 +0000
> 
> More useful is something like what library `highlight-chars.el'
> offers.  You can highlight any chars you like, in any way you
> like.  You can highlight arbitrary chars, chars in char ranges,
> char classes, or charsets.

You mean, like highlight-regexp?




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:16 minor mode for highlighting character classes such as non-ascii (bug 47455) Roland Winkler
  2021-06-01 16:31 ` Eli Zaretskii
  2021-06-01 16:32 ` Óscar Fuentes
@ 2021-06-01 16:53 ` Daniel Martín
  2021-06-01 17:34   ` Roland Winkler
  2021-06-01 20:51 ` Juri Linkov
  3 siblings, 1 reply; 14+ messages in thread
From: Daniel Martín @ 2021-06-01 16:53 UTC
  To: Roland Winkler; +Cc: emacs-devel

"Roland Winkler" <winkler@gnu.org> writes:

> In the context of bug 47455 the idea emerged that it can be useful
> to have a minor mode that highlights non-ascii characters.  I want
> to ask here for feedback: does such a mode already exist?  Can it be
> useful to have this minor mode more general such that the highlighted
> character classes become configurable?  Are there possible use cases
> for this beyond non-ascii characters?

There's already a feature in the display engine that shows certain
non-ASCII characters with a special face.  For example, U+00A0 (no-break
space) is shown with the ‘nobreak-space’ face, U+00AD (soft hyphen),
‘U+2010’ (hyphen), and ‘U+2011’ (non-breaking hyphen) are shown with the
‘nobreak-hyphen’ face.  Perhaps your suggestion could be a
backwards-compatible extension to this hardcoded display mechanism.

Another option is to simply instruct users to use Hi Lock mode with
"[^[:ascii:]]".
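
As a small sketch of the Hi Lock route: `highlight-regexp' is the Hi Lock command (an alias of `hi-lock-face-buffer'), and `hi-yellow' is one of the faces that hi-lock.el defines.  The choice of face is arbitrary here.

```elisp
;; Evaluate in the buffer of interest, e.g. with M-:, or run the
;; command interactively with M-s h r.
(highlight-regexp "[^[:ascii:]]" 'hi-yellow)

;; Remove the highlighting again.
(unhighlight-regexp "[^[:ascii:]]")
```

Both calls act on the current buffer only, so they are meant to be run in the buffer being checked rather than placed at top level in an init file.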




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:53 ` Daniel Martín
@ 2021-06-01 17:34   ` Roland Winkler
  2021-06-01 17:39     ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Roland Winkler @ 2021-06-01 17:34 UTC
  To: Daniel Martín; +Cc: Eli Zaretskii, emacs-devel

On Tue Jun 1 2021 Daniel Martín wrote:
> There's already a feature in the display engine that shows certain
> non-ASCII characters with a special face.  For example, U+00A0 (no-break
> space) is shown with the ‘nobreak-space’ face, U+00AD (soft hyphen),
> ‘U+2010’ (hyphen), and ‘U+2011’ (non-breaking hyphen) are shown with the
> ‘nobreak-hyphen’ face.  Perhaps your suggestion could be a
> backwards-compatible extension to this hardcoded display mechanism.
> 
> Another option is to simply instruct users to use Hi Lock mode with
> "[^[:ascii:]]".

Thanks, Hi Lock mode seems to be already a close match.  I'll look
at this more closely.

The bug report is related to the fact that \(Bib\|La\)?TeX chokes on
any non-ascii characters.  Nonetheless, non-ascii characters can end
up in such files for all kinds of reasons.  So the idea is to have
something like a minor mode that highlights any non-ascii characters
so that this effectively warns the user about the presence of such
characters.

Personally, I have been fooled in particular by unicode 'ZERO WIDTH
SPACE' which probably requires special treatment.  Could it also
make sense to give this particular character a (possibly configurable)
treatment by the display engine?  (I know nothing about the display
engine.)




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 17:34   ` Roland Winkler
@ 2021-06-01 17:39     ` Eli Zaretskii
  2021-06-01 17:56       ` Roland Winkler
  0 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2021-06-01 17:39 UTC
  To: Roland Winkler; +Cc: emacs-devel, mardani29

> Date: Tue, 1 Jun 2021 12:34:56 -0500
> From: "Roland Winkler" <winkler@gnu.org>
> Cc: emacs-devel@gnu.org, Eli Zaretskii <eliz@gnu.org>
> 
> Personally, I have been fooled in particular by unicode 'ZERO WIDTH
> SPACE' which probably requires special treatment.  Could it also
> make sense to give this particular character a (possibly configurable)
> treatment by the display engine?

It already does, see glyphless-char-display.  You can customize it to
your needs.
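
A possible customization, as a sketch: the user-facing option is `glyphless-char-display-control', an alist mapping character groups to display methods.  This assumes ZERO WIDTH SPACE (U+200B) falls into the `format-control' group, which matches its Unicode general category but is worth verifying in your Emacs version.

```elisp
;; Show format-control characters as a boxed hex code so they cannot
;; be overlooked; entries for all other groups are kept unchanged.
;; `customize-set-variable' is used so that the option's setter runs
;; and the underlying char-table is updated.
(customize-set-variable
 'glyphless-char-display-control
 (cons '(format-control . hex-code)
       (assq-delete-all 'format-control
                        (copy-alist glyphless-char-display-control))))
```

Other documented display methods include `acronym', `empty-box', `thin-space' and `zero-width'.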




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 17:39     ` Eli Zaretskii
@ 2021-06-01 17:56       ` Roland Winkler
  0 siblings, 0 replies; 14+ messages in thread
From: Roland Winkler @ 2021-06-01 17:56 UTC
  To: Eli Zaretskii; +Cc: emacs-devel, mardani29

On Tue Jun 1 2021 Eli Zaretskii wrote:
> > Personally, I have been fooled in particular by unicode 'ZERO
> > WIDTH SPACE' which probably requires special treatment.  Could
> > it also make sense to give this particular character a (possibly
> > configurable) treatment by the display engine?
> 
> It already does, see glyphless-char-display.  You can customize it
> to your needs.

Excellent, I knew I may be reinventing the wheel with these ideas.




* RE: [External] : Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:53     ` Eli Zaretskii
@ 2021-06-01 18:07       ` Drew Adams
  2021-06-01 18:20         ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Drew Adams @ 2021-06-01 18:07 UTC
  To: Eli Zaretskii; +Cc: ofv@wanadoo.es, emacs-devel@gnu.org

> > More useful is something like what library `highlight-chars.el'
> > offers.  You can highlight any chars you like, in any way you
> > like.  You can highlight arbitrary chars, chars in char ranges,
> > char classes, or charsets.
> 
> You mean, like highlight-regexp?

No.  I mean like what I linked to.  There you'll
find a succinct description.  I think you'll be
able to see a real difference from
`highlight-regexp' (`hi-lock-face-buffer').

https://www.emacswiki.org/emacs/ShowWhiteSpace#HighlightChars

And the Commentary in highlight-chars.el offers
quite a bit more detail.

https://www.emacswiki.org/emacs/download/highlight-chars.el

But yes, hi-lock.el offers some improvement over
the hard-coded behavior of `nobreak-char-display'
and `whitespace-mode'.




* Re: [External] : Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 18:07       ` Drew Adams
@ 2021-06-01 18:20         ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2021-06-01 18:20 UTC
  To: Drew Adams; +Cc: ofv, emacs-devel

> From: Drew Adams <drew.adams@oracle.com>
> CC: "ofv@wanadoo.es" <ofv@wanadoo.es>,
>         "emacs-devel@gnu.org" <emacs-devel@gnu.org>
> Date: Tue, 1 Jun 2021 18:07:33 +0000
> 
> But yes, hi-lock.el offers some improvement over
> the hard-coded behavior of `nobreak-char-display'
> and `whitespace-mode'.

How do you write a regexp or a character class that matches characters
which cannot be encoded by a given coding-system?




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 16:16 minor mode for highlighting character classes such as non-ascii (bug 47455) Roland Winkler
                   ` (2 preceding siblings ...)
  2021-06-01 16:53 ` Daniel Martín
@ 2021-06-01 20:51 ` Juri Linkov
  2021-06-01 21:45   ` Roland Winkler
  3 siblings, 1 reply; 14+ messages in thread
From: Juri Linkov @ 2021-06-01 20:51 UTC
  To: Roland Winkler; +Cc: emacs-devel

> In the context of bug 47455 the idea emerged that it can be useful
> to have a minor mode that highlights non-ascii characters.  I want
> to ask here for feedback: does such a mode already exist?  Can it be
> useful to have this minor mode more general such that the highlighted
> character classes become configurable?  Are there possible use cases
> for this beyond non-ascii characters?

The GNU ELPA package 'markchars' has options to highlight
non-ASCII space chars, non-ASCII hyphen chars,
and other customizable patterns, e.g. "[[:nonascii:]]+".




* Re: minor mode for highlighting character classes such as non-ascii (bug 47455)
  2021-06-01 20:51 ` Juri Linkov
@ 2021-06-01 21:45   ` Roland Winkler
  0 siblings, 0 replies; 14+ messages in thread
From: Roland Winkler @ 2021-06-01 21:45 UTC
  To: Juri Linkov; +Cc: emacs-devel

On Tue Jun 1 2021 Juri Linkov wrote:
> The GNU ELPA package 'markchars' has options to highlight
> non-ASCII space chars, non-ASCII hyphen chars,
> and other customizable patterns, e.g. "[[:nonascii:]]+".

Wonderful, thanks for the feedback!  It seems that the only thing
that's missing is the proper advertisement of these existing
features and packages for the new use case.



