all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Ted Zlatanov <tzz@lifelogs.com>
To: emacs-devel@gnu.org
Subject: Re: highlighting non-ASCII characters
Date: Mon, 29 Mar 2010 16:05:29 -0500	[thread overview]
Message-ID: <87aatqj0ti.fsf@lifelogs.com> (raw)
In-Reply-To: jwv39ziubob.fsf-monnier+emacs@gnu.org

On Mon, 29 Mar 2010 16:19:07 -0400 Stefan Monnier <monnier@iro.umontreal.ca> wrote: 

>> Yes, probably.  But that's accidental.  I still think the character
>> classes [:idn:] (revised name from before) and [:confusable:] (or
>> [:homoglyph:]) would make sense as a first step, then we can decide how
>> to highlight them.

SM> The homoglyph data would be a useful starting point for the feature
SM> I imagine, indeed.  But from the message that started this thread, "K"
SM> is a homoglyph, yet highlighting it everywhere doesn't sound like a good
SM> idea, so basically we need to associate with each homoglyph char
SM> a context where it is expected and only highlight it when it appears in
SM> a different context (or maybe rather when it appears in the context of
SM> its peer).

(I had a "lightbulb moment" I should have had long ago: "confusable" is
a character property, while "homoglyph" is a glyph property; thus the
character class should be [:confusable:] and "homoglyph" should be used
in the face name as long as it's not, er, confusing.)

I know the goal is to match in context and I may take whitespace.el as a
guide in this regard, but I have to start with a [:confusable:]
character class.  I'll also add a [:idn:] class as discussed.  Is that
OK or are you concerned about code bloat in regexp.c?

Afterwards we can set up the map between each confusable character and
the set of characters it can match; this is also in the data file.  That
lets us look in context and apply the rules I proposed.  So for example
if Cyrillic K is confusable with Roman K and we see Roman characters
around, that's suspicious.  But Cyrillic "zhe" is not confusable with
any Roman characters so it wouldn't be as suspicious.

On Mon, 29 Mar 2010 11:48:28 -0700 "Drew Adams" <drew.adams@oracle.com> wrote: 

DA> But it occurred to me that besides different categories of such critters there
DA> might be different levels of fontification details that users might want to see.

DA> For example, for some users or for some purposes, it might be useful to see
DA> different kinds of quote marks distinguished (e.g. different kinds of curly
DA> quotes that might be homoglyphs or curly vs straight quotes, which are not
DA> homoglyphs). For other users or for other purposes such highlighting would be a
DA> distraction.

I'll set up a flexible mechanism, probably patterned after
whitespace.el, to do this kind of highlighting.  So the users will be
able to extend it if needed.

I don't know about curly vs. straight quotes.  I don't think that's a
significant problem, whereas a Cyrillic K in Roman text can actually
cause problems and security compromises.  I'm not against the idea, I
have just never seen it become an issue, and there's a million ways to
combine quotation marks depending on the context.  What's the specific
case that you're thinking of?

Ted





  parent reply	other threads:[~2010-03-29 21:05 UTC|newest]

Thread overview: 182+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-18 19:11 Translation of http status code to text Lennart Borgman
2010-03-22  1:19 ` Juri Linkov
2010-03-22 13:17   ` Ted Zlatanov
2010-03-22 14:01     ` Stefan Monnier
2010-03-22 14:25       ` Ted Zlatanov
2010-03-22 17:06         ` Ted Zlatanov
2010-03-22 17:55           ` Sven Joachim
2010-03-22 19:23             ` Ted Zlatanov
2010-03-22 20:32               ` Sven Joachim
2010-03-22 21:31                 ` Ted Zlatanov
2010-03-23  9:55                   ` Juri Linkov
2010-03-23 13:08                     ` Lennart Borgman
2010-03-23 14:26                       ` face for non-ASCII characters (was: Translation of http status code to text) Ted Zlatanov
2010-03-23 16:28                         ` Lennart Borgman
2010-03-23 18:18                           ` face for non-ASCII characters Ted Zlatanov
2011-04-15 22:41                             ` Ted Zlatanov
2011-04-15 23:07                               ` Lennart Borgman
2011-04-16  0:51                                 ` Ted Zlatanov
2011-04-16  9:10                                   ` Lennart Borgman
2011-04-16 15:05                                     ` Ted Zlatanov
2011-04-16 15:28                                       ` Lennart Borgman
2011-04-16 15:42                                         ` Ted Zlatanov
2011-04-16 15:50                                           ` Lennart Borgman
2011-04-16 15:57                                             ` Ted Zlatanov
2011-04-16 16:01                                               ` Lennart Borgman
2011-04-16 16:13                                                 ` Ted Zlatanov
2011-04-16 16:22                                                   ` Lennart Borgman
2011-04-16 16:27                                                   ` Drew Adams
2011-04-16 16:45                                                     ` Ted Zlatanov
2011-04-16 16:48                                                       ` Lennart Borgman
2011-04-16 16:55                                                         ` Ted Zlatanov
2011-04-16 17:11                                                           ` Lennart Borgman
2011-04-18 15:48                                                             ` Ted Zlatanov
2011-04-18 15:53                                                               ` Lennart Borgman
2011-04-18 16:20                                                                 ` Ted Zlatanov
2011-04-18 17:03                                                                   ` Lennart Borgman
2011-04-19 13:07                                                                     ` Ted Zlatanov
2011-04-19 18:56                                                                       ` Lennart Borgman
2011-04-20 14:49                                                                         ` Ted Zlatanov
2011-04-20 21:38                                                                           ` Lennart Borgman
2011-04-21 17:35                                                                             ` Ted Zlatanov
2011-04-21 18:42                                                                               ` Lennart Borgman
2011-04-21 19:14                                                                                 ` Ted Zlatanov
2011-04-21 20:00                                                                                   ` Lennart Borgman
2011-04-21 20:35                                                                                     ` Ted Zlatanov
2011-04-21 20:53                                                                                       ` Lennart Borgman
2011-04-21 21:18                                                                                         ` Ted Zlatanov
2011-04-22 12:20                                                                                           ` Lennart Borgman
2011-04-22 12:49                                                                                             ` Stephen J. Turnbull
2011-04-22 13:23                                                                                               ` Lennart Borgman
2011-04-23  0:50                                                                                                 ` Richard Stallman
2011-04-23  7:13                                                                                                   ` Lennart Borgman
2011-04-25 17:54                                                                                                     ` Richard Stallman
2011-04-26 18:26                                                                                                       ` Chong Yidong
2011-04-26 19:05                                                                                                         ` Ted Zlatanov
2011-04-26 20:29                                                                                                           ` Chong Yidong
2011-04-27  3:45                                                                                                             ` Ted Zlatanov
2011-04-27  4:42                                                                                                               ` Stephen J. Turnbull
2011-05-02 18:18                                                                                                                 ` Ted Zlatanov
2011-05-03  1:50                                                                                                                   ` Stephen J. Turnbull
2011-05-03 14:45                                                                                                                     ` Ted Zlatanov
2011-05-03 21:21                                                                                                                       ` Lennart Borgman
2011-05-04 14:41                                                                                                                         ` Stephen J. Turnbull
2011-04-27 12:41                                                                                                         ` Lennart Borgman
2011-04-22 14:20                                                                                             ` Ted Zlatanov
2011-04-22 17:12                                                                                               ` Lennart Borgman
2011-04-26  3:14                                                                                                 ` package management proposals for Emacs (was: face for non-ASCII characters) Ted Zlatanov
2011-04-26  8:10                                                                                                   ` Lennart Borgman
2011-04-26 21:46                                                                                                     ` Richard Stallman
2011-04-27  1:19                                                                                                       ` package management proposals for Emacs Stefan Monnier
2011-04-27  3:36                                                                                                         ` Ted Zlatanov
2011-04-27 21:14                                                                                                         ` Richard Stallman
2011-04-26  3:09                                                                                               ` markchars.el 0.2.0 and idn.el (was: face for non-ASCII characters) Ted Zlatanov
2011-04-26  8:13                                                                                                 ` Lennart Borgman
2011-04-26 15:28                                                                                                   ` idn.el and confusables.txt (was: markchars.el 0.2.0 and idn.el) Ted Zlatanov
2011-05-13 19:42                                                                                                     ` idn.el and confusables.txt Stefan Monnier
2011-05-13 20:19                                                                                                       ` Ted Zlatanov
2011-05-14  8:13                                                                                                         ` Eli Zaretskii
2011-05-14  8:06                                                                                                       ` Eli Zaretskii
2011-05-14  8:56                                                                                                         ` Lennart Borgman
2011-05-14  9:36                                                                                                           ` Eli Zaretskii
2011-05-14 13:40                                                                                                         ` Ted Zlatanov
2011-05-14 14:38                                                                                                           ` Eli Zaretskii
2011-05-14 15:30                                                                                                             ` Ted Zlatanov
2011-05-14 16:42                                                                                                               ` Eli Zaretskii
2011-05-14 17:06                                                                                                                 ` Ted Zlatanov
2011-05-14 20:59                                                                                                                   ` Eli Zaretskii
2011-05-15  1:22                                                                                                                     ` Ted Zlatanov
2011-05-15  5:56                                                                                                                       ` Eli Zaretskii
2011-05-15 12:14                                                                                                                         ` Ted Zlatanov
2011-05-16 12:38                                                                                                                           ` Eli Zaretskii
2011-05-16 18:31                                                                                                                             ` Ted Zlatanov
2011-05-17 17:59                                                                                                                               ` Eli Zaretskii
2011-05-17 15:32                                                                                                                           ` Ted Zlatanov
2011-05-18 18:15                                                                                                                             ` Ted Zlatanov
2011-05-14 17:25                                                                                                             ` Stefan Monnier
2011-05-15 13:06                                                                                                         ` Kenichi Handa
2011-05-15 17:34                                                                                                           ` Eli Zaretskii
2011-05-18  5:23                                                                                                             ` handa
2011-05-18  7:38                                                                                                               ` Eli Zaretskii
2011-05-18  7:59                                                                                                                 ` handa
2011-05-18  8:13                                                                                                                   ` Eli Zaretskii
2011-06-17  8:15                                                                                                                   ` Kenichi Handa
2011-06-17 15:12                                                                                                                     ` Eli Zaretskii
2011-06-21  2:07                                                                                                                       ` Kenichi Handa
2011-06-21  2:53                                                                                                                         ` Eli Zaretskii
2011-06-21  3:29                                                                                                                           ` Kenichi Handa
2011-06-21  6:11                                                                                                                             ` Eli Zaretskii
2011-06-21  7:22                                                                                                                               ` Kenichi Handa
2011-06-21  7:34                                                                                                                                 ` Eli Zaretskii
2011-06-21  8:02                                                                                                                                   ` Kenichi Handa
2011-06-21 10:30                                                                                                                                     ` bidi at startup (was: idn.el and confusables.txt) Eli Zaretskii
2011-06-21 15:12                                                                                                                                       ` bidi at startup Stefan Monnier
2011-06-21 17:13                                                                                                                                         ` Eli Zaretskii
2011-06-22 15:32                                                                                                                                           ` Stefan Monnier
2011-07-07  6:10                                                                                                                         ` C interface to Unicode character property char-tables Kenichi Handa
2011-08-06 16:52                                                                                                                         ` Using uniprop_table_lookup (was: idn.el and confusables.txt) Eli Zaretskii
2011-08-09  0:55                                                                                                                           ` Kenichi Handa
2011-08-09  1:32                                                                                                                             ` Using uniprop_table_lookup Stefan Monnier
2011-08-09  4:31                                                                                                                               ` Kenichi Handa
2011-08-15  8:57                                                                                                                                 ` Eli Zaretskii
2011-05-31 10:42                                                                                                     ` uni-confusables 0.1 is on the Emacs ELPA branch (was: idn.el and confusables.txt) Ted Zlatanov
2011-06-08 10:42                                                                                                       ` uni-confusables 0.1 is on the Emacs ELPA branch Ted Zlatanov
2011-06-08 15:22                                                                                                         ` Stefan Monnier
2011-04-16 16:00                                             ` face for non-ASCII characters Drew Adams
2010-03-23 19:40                         ` Florian Beck
2010-03-23 14:35                       ` Translation of http status code to text Miles Bader
2010-03-23 14:22                     ` highlighting non-ASCII characters (was: Translation of http status code to text) Ted Zlatanov
2010-03-23 16:50                       ` highlighting non-ASCII characters (was: Translation of http statuscode " Drew Adams
2010-03-23 21:49                         ` highlighting non-ASCII characters Stefan Monnier
2010-03-23 21:53                           ` Drew Adams
2010-03-24  0:45                             ` Stefan Monnier
2010-03-24  1:03                               ` Ted Zlatanov
2010-03-24  2:47                                 ` Stefan Monnier
2010-03-24  4:20                                   ` Eli Zaretskii
2010-03-24  5:14                                     ` Jason Rumney
2010-03-24 13:25                                       ` Stefan Monnier
2010-03-24 15:06                                         ` Jason Rumney
2010-03-24 19:47                                           ` Ted Zlatanov
2010-03-24 10:05                                   ` Ted Zlatanov
2010-03-24 16:21                                     ` Lennart Borgman
2010-03-24 19:34                                       ` Lennart Borgman
2010-03-26 17:35                                         ` Ted Zlatanov
2010-03-26 20:43                                           ` Ted Zlatanov
2010-03-26 22:50                                             ` Lennart Borgman
2010-03-29 18:38                                               ` Ted Zlatanov
2010-03-29 18:48                                                 ` Drew Adams
2010-03-29 20:20                                                   ` Stefan Monnier
2010-03-29 20:19                                                 ` Stefan Monnier
2010-03-29 20:51                                                   ` Lennart Borgman
2010-03-30 13:22                                                     ` Ted Zlatanov
2010-03-29 21:05                                                   ` Ted Zlatanov [this message]
2010-03-29 21:31                                                     ` Lennart Borgman
2010-03-29 21:32                                                     ` Drew Adams
2010-03-30 13:15                                                       ` Ted Zlatanov
2010-03-30 14:04                                                         ` Drew Adams
2010-03-30 14:17                                                           ` Lennart Borgman
2010-03-30 14:42                                                           ` Ted Zlatanov
2010-03-30 16:18                                                         ` Juri Linkov
2010-03-30  1:45                                                     ` Stefan Monnier
2010-03-25  7:11                                       ` Juri Linkov
2010-03-25 14:07                                         ` Lennart Borgman
2010-03-25 17:32                                           ` Juri Linkov
2010-03-26  0:32                                             ` Lennart Borgman
2010-03-26 13:38                                               ` Stephen Berman
2010-03-26 22:44                                                 ` Lennart Borgman
2010-03-25  7:12                                     ` Juri Linkov
2010-03-24  2:09                               ` Drew Adams
2010-03-24  5:00                               ` Stephen J. Turnbull
2010-03-24  9:28                               ` Juri Linkov
2010-03-24 13:15                                 ` Ted Zlatanov
2010-03-24  9:27                       ` Juri Linkov
2010-03-22 18:41           ` Translation of http status code to text Stefan Monnier
2010-03-22 19:15             ` Ted Zlatanov
2010-03-23  9:54               ` Juri Linkov
2010-03-23 10:54                 ` joakim
2010-03-23 15:02                 ` Ted Zlatanov
2010-03-24  3:22                   ` Stefan Monnier
2010-03-24 17:35                   ` Glenn Morris
2010-03-24 19:37                     ` Ted Zlatanov
2010-03-25  1:16                       ` Ted Zlatanov
2010-03-23 12:57               ` Stefan Monnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87aatqj0ti.fsf@lifelogs.com \
    --to=tzz@lifelogs.com \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.