unofficial mirror of emacs-devel@gnu.org 
* Change of Lisp syntax for "fancy" quotes in Emacs 27?
@ 2018-02-02 22:24 Noam Postavsky
  2018-02-02 22:52 ` Paul Eggert
                   ` (3 more replies)
  0 siblings, 4 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-02-02 22:24 UTC (permalink / raw)
  To: Emacs developers; +Cc: Drew Adams

In Emacs 26 and earlier the following is valid lisp code:

(setq ’bar 42)
(setq foo ’bar)

In the current master branch, this will signal (invalid-read-syntax
"strange quote" "’"). To write the equivalent the ’ must be backslash
escaped:

(setq \’bar 42)
(setq foo \’bar)

(the backslash escaping also works in earlier Emacs versions).

The point of this change is to give a more straightforward error in
cases where a curved quote is accidentally written instead of a plain
straight one.

In Bug#30217, Drew Adams strongly objects to this change. I don't want
to "sneak" this in, so I'm asking here for people's thoughts on this.

References:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=30217
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=2967

PS In case anyone has trouble reading the example code (e.g., due to
some email encoding failure), evaluating

   (insert "(setq \u2019bar 42)\n(setq foo \u2019bar)")

will write it into your current buffer.
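
To check how a given Emacs reads these forms without touching any
buffer, something like the following sketch may help.  It relies only
on `read-from-string'; the helper name `my-try-read' and the
`rejected' tag are made up for illustration.

```elisp
;; Hypothetical helper: read one form from TEXT, and report a read
;; error as data instead of signaling it.
(defun my-try-read (text)
  "Read one form from TEXT; return it, or (rejected . ERR) on a read error."
  (condition-case err
      (car (read-from-string text))
    (invalid-read-syntax (cons 'rejected err))))

;; Escaped form: reads as a symbol whose name starts with U+2019.
(symbol-name (my-try-read "\\\u2019bar"))

;; Unescaped form: a plain symbol in Emacs 26 and earlier; on a reader
;; that signals "strange quote" this returns a `rejected' cons instead.
(my-try-read "\u2019bar")
```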




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky
@ 2018-02-02 22:52 ` Paul Eggert
  2018-02-03  0:00   ` Drew Adams
  2018-02-03  8:33 ` Eli Zaretskii
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-02-02 22:52 UTC (permalink / raw)
  To: Noam Postavsky, Emacs developers; +Cc: Drew Adams

On 02/02/2018 02:24 PM, Noam Postavsky wrote:
> In Bug#30217, Drew Adams strongly objects to this change. I don't want
> to "sneak" this in, so I'm asking here for people's thoughts on this.

I see two main categories of users here, with different needs. 
Less-expert users are likely to run into problems with quotes and other 
characters (that's why we got bug reports), and appreciate diagnostics 
pinpointing the problems; also, programmers concerned about security are 
likely to want these confusing characters to be diagnosed, to prevent an 
attacker from sending code that is easily read one way but actually 
operates in a different way. On the other hand, programs that generate 
Elisp code might prefer not having to special-case these characters. So 
perhaps there should be a buffer-local variable that controls which 
behavior is selected. The default behavior should be the one that caters 
better to general users and is safer.

While we're on the topic, I suggest using the Unicode confusables list 
<http://www.unicode.org/Public/security/10.0.0/confusables.txt> to come 
up with a list of confusing alternatives for each character that has a 
special meaning in Emacs Lisp. This should be better than our trying to 
come up with our own, ad-hoc list. For example, U+A78C LATIN SMALL 
LETTER SALTILLO (ꞌ) looks almost exactly like an apostrophe on my screen 
and is in the confusables list, but is not a character that Emacs 
currently checks for.
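
As a rough sketch of what such a check could look like, here is a
scan for a few apostrophe lookalikes.  The names are hypothetical and
the three-character list is a hand-picked sample for illustration
only; a real list would be derived from confusables.txt.

```elisp
;; Hypothetical sample list: U+2018, U+2019 and U+A78C only.
(defvar my-quote-lookalikes '(?\u2018 ?\u2019 ?\uA78C)
  "Hand-picked sample of characters that resemble an ASCII apostrophe.")

(defun my-find-quote-lookalikes (string)
  "Return a list of (INDEX . CHAR) for each lookalike found in STRING."
  (let ((index 0)
        (found '()))
    (dolist (ch (string-to-list string))
      (when (memq ch my-quote-lookalikes)
        (push (cons index ch) found))
      (setq index (1+ index)))
    (nreverse found)))

;; The curly quote sits at index 10 and has code point 8217 (#x2019).
(my-find-quote-lookalikes "(setq foo \u2019bar)")   ; => ((10 . 8217))
```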





* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-02 22:52 ` Paul Eggert
@ 2018-02-03  0:00   ` Drew Adams
  2018-02-03  0:09     ` Paul Eggert
  0 siblings, 1 reply; 98+ messages in thread
From: Drew Adams @ 2018-02-03  0:00 UTC (permalink / raw)
  To: Paul Eggert, Noam Postavsky, Emacs developers

> I see two main categories of users here, with different needs.
> Less-expert users are likely to run into problems with quotes
> and other characters (that's why we got bug reports), and
> appreciate diagnostics pinpointing the problems; also,
> programmers concerned about security are likely to want these
> confusing characters to be diagnosed, to prevent an attacker
> from sending code that is easily read one way but actually
> operates in a different way.
>
> On the other hand, programs that generate Elisp code might
> prefer not having to special-case these characters. So
> perhaps there should be a buffer-local variable that controls
> which behavior is selected. The default behavior should be
> the one that caters better to general users and is safer.

The distinction I think needs to be made is between:

1. Trying to _warn users_ (all users, less-expert or not)
   about possible misuse of particularly confusable chars.
   This just warns about possible pilot error.

2. _Changing Lisp_ reading and evaluating, to treat some
   (all?) confusable characters specially, changing their
   syntax and requiring them to be escaped in order to be
   treated normally (i.e., as they have been treated so far).

I object to #2, NOT to #1.

#1: By all means, we should try to help users.  We can
    issue byte-compilation warnings and some interactive
    warnings - provided we can helpfully and unambiguously
    distinguish the right situations.

#2 changes Lisp in non-negligible, non-helpful ways.
   See bug #30217 for more.

----

There are lots more characters to which the same
non-bug "fix" of changing Lisp might be applied (which
means that users will wonder why this confusable char
is treated specially, and not that one).

Such chars include pretty much anything that could be
confused with anything that is ever used as a delimiter
in Emacs Lisp: brackets (in the British sense) of all
sorts: parens, square, angle, curly.  There are really
quite a few such bracket-confusables.

Such chars also include pretty much anything that could
be confused with any other chars that are used specially
in Lisp: period, comma, quote, backquote, colon.  Again:
there are quite a few such confusables.

They even include chars that could be confused with the
directory separators used in Emacs Lisp.

Finally (?), they include chars that could be confused
with the ASCII-digit numerals 0123456789.  There are
lots of these confusables too.

(Even with just ASCII there are confusables.  Think of
what some use in passwords or leet: zero vs uppercase
letter O, digit 1 vs lowercase letter l, etc.  We've
just gotten used to carefully distinguishing such chars.
Now there are many more, and slighter, differences to
get used to.)

----

Beyond the question of which chars to treat specially,
there's the question of where - in which contexts -
to try to distinguish them.

Contexts include such places as sexps being evaluated,
doc strings, and comments.

They can also include fonts: a given character might
be confusable, or more confusable, in one font than
in another.  Even font size can make a difference
(with some fonts I find myself zooming in to see
whether a quote-thingy might really be a curly quote).

The questions of which chars and where (context) are
both relevant even if we only warn users (#1) and do
not change Lisp syntax (#2).

----

At the very least, I would hope that if we do anything
at all about this we would start by only warning.
I really hope we will not change Lisp syntax for this,
i.e., I hope we revert the change that has been made so
far for Emacs 27.

> While we're on the topic, I suggest using the Unicode
> confusables list ... to come up with a list of confusing
> alternatives for each character that has a special meaning
> in Emacs Lisp. This should be better than our trying to
> come up with our own, ad-hoc list.
>
> For example, U+A78C LATIN SMALL LETTER SALTILLO (ꞌ) looks
> almost exactly like an apostrophe on my screen and is in
> the confusables list, but is not a character that Emacs
> currently checks for.

Yup, and that's just one tiny tip of this terribly
tippy iceberg.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03  0:00   ` Drew Adams
@ 2018-02-03  0:09     ` Paul Eggert
  2018-02-03  0:39       ` Drew Adams
  0 siblings, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-02-03  0:09 UTC (permalink / raw)
  To: Drew Adams, Noam Postavsky, Emacs developers

On 02/02/2018 04:00 PM, Drew Adams wrote:
> The distinction I think needs to be made is between:
>
> 1. Trying to _warn users_ (all users, less-expert or not)
>     about possible misuse of particularly confusable chars.
>     This just warns about possible pilot error.
>
> 2. _Changing Lisp_ reading and evaluating, to treat some
>     (all?) confusable characters specially, changing their
>     syntax and requiring them to be escaped in order to be
>     treated normally (i.e., as they have been treated so far).
>
> I object to #2, NOT to #1.

I don't see a clear distinction between #1 and #2. For example, in an 
adversarial environment, users who get warned about suspicious 
characters in their incoming source files will most likely type "no" 
when asked to run such code. In that case, if you want your audience to 
include users who care even a smidgen about security, you'll need to 
escape confusable characters in the business parts of your Emacs Lisp 
code. Effectively that will be a change to Emacs Lisp, even if its 
formal syntax does not change.





* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03  0:09     ` Paul Eggert
@ 2018-02-03  0:39       ` Drew Adams
  0 siblings, 0 replies; 98+ messages in thread
From: Drew Adams @ 2018-02-03  0:39 UTC (permalink / raw)
  To: Paul Eggert, Noam Postavsky, Emacs developers

> > The distinction I think needs to be made is between:
> >
> > 1. Trying to _warn users_ (all users, less-expert or not)
> >     about possible misuse of particularly confusable chars.
> >     This just warns about possible pilot error.
> >
> > 2. _Changing Lisp_ reading and evaluating, to treat some
> >     (all?) confusable characters specially, changing their
> >     syntax and requiring them to be escaped in order to be
> >     treated normally (i.e., as they have been treated so far).
> >
> > I object to #2, NOT to #1.
> 
> I don't see a clear distinction between #1 and #2.

That's too bad.  They are really quite different.

In the first case, you get a warning.  In the second case
your code breaks.

> For example, in an adversarial environment...

I don't think that's the reason for this change at all.
It was not mentioned in the bug thread, AFAIK.

The motivation was to prevent confusion on the part of
users, not to prevent or avoid malevolent behavior.
Please see the bug thread (#30217).

The idea was to improve convenience and reduce confusion
by someone who copy+pastes code from a web page (for
example), when (for example) that page renders a normal
quote as a curly quote.

You want to introduce a security aspect here.  I can't
speak much to that.  I'll simply ask whether other Lisps
(e.g. Common Lisp) worry about such a risk?  What does
Clojure do about confusables in Lisp symbols?  Does any
other Lisp change the Lisp syntax and behavior to require
special escaping of such chars in symbols (or elsewhere)?

Sure, even if no other Lisp worries about this or takes
the same approach as that proposed, that's not a proof
that Emacs Lisp shouldn't.  Still...

Given enough motivation, you can already, today, create
Lisp code (confusing, confusable, or otherwise) that is
evil, even without using any confusable Unicode chars.

When I was a kid we would play tricks on each other,
changing a character somewhere in a friend's large deck
of punched Hollerith cards - e.g., insert or remove a
decimal point.  You had to wait a full day to get back
the result of your program run, and the result would
only be a pretty cryptic error msg.  Argggh!

It was just good-natured fun - a game among friends.
And that was only with assembler and Fortran, and we
were just newbie kids.  Imagine what you can do today,
without bothering to rely on close Unicode confusables.

Sorry, but your "security" argument just doesn't pass
muster, for me.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky
  2018-02-02 22:52 ` Paul Eggert
@ 2018-02-03  8:33 ` Eli Zaretskii
  2018-02-03 16:16   ` Drew Adams
  2018-02-03 18:13 ` Aaron Ecay
  2018-10-05  0:03 ` Noam Postavsky
  3 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-03  8:33 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: drew.adams, emacs-devel

> From: Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Fri, 2 Feb 2018 17:24:43 -0500
> Cc: Drew Adams <drew.adams@oracle.com>
> 
> In Emacs 26 and earlier the following is valid lisp code:
> 
> (setq ’bar 42)
> (setq foo ’bar)
> 
> In the current master branch, this will signal (invalid-read-syntax
> "strange quote" "’"). To write the equivalent the ’ must be backslash
> escaped:
> 
> (setq \’bar 42)
> (setq foo \’bar)
> 
> (the backslash escaping also works in earlier Emacs versions).
> 
> The point of this change is to give a more straightforward error in
> cases where a curved quote is accidentally written instead of a plain
> straight one.

The bug reports which triggered the above changes are bug#2967 and
bug#23425.  So any proposal to remove those changes should also
suggest an alternative for handling those bug reports.




* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03  8:33 ` Eli Zaretskii
@ 2018-02-03 16:16   ` Drew Adams
  2018-02-03 17:05     ` Eli Zaretskii
  0 siblings, 1 reply; 98+ messages in thread
From: Drew Adams @ 2018-02-03 16:16 UTC (permalink / raw)
  To: Eli Zaretskii, Noam Postavsky; +Cc: emacs-devel

> The bug reports which triggered the above changes are bug#2967 and
> bug#23425.  So any proposal to remove those changes should also
> suggest an alternative for handling those bug reports.

For "handling those bug reports"?  Are we to add
more cans of worms to this question, obscuring it?

AFAICT, no alternatives to handling those bugs
are needed because of reverting the Lisp syntax
change made for bug #30217.  Can you point to
how/why reverting that change would necessitate
alternative fixes for those bugs?

Bug #2967 just asked for a warning, e.g. during
byte-compilation or loading.  There's no
objection here to warning.

Bug #2967 did not ask for (or get) a change in
Lisp syntax.  I see no negative impact on #2967
from reverting the Lisp-syntax "fix" to #30217.

Even #30217 did not ask for such a syntax change.
Warning is sufficient for fixing #30217 too.

Bug #23425, on the other hand, is a gigantic
stream-of-consciousness about anything and
everything to do with Paul's changes to Emacs
over the last few years wrt curly quotes.
It's not a single bug report thread - it's
all over the map.

In any case, #23425, like #2967 (and even
#30217), is not about what was done to "fix"
#30217 - changing Lisp syntax for fancy quotes.

How is it helpful to throw all of #23425 into
this Lisp syntax-change question, as if the
present issue puts into question everything
ever discussed about curly quotes?

Or do you have something specific in mind here
wrt #23425 - some part of it?  Something that
would actually be impacted negatively by
reverting the Lisp syntax changes for #30217?
If so, please identify it.

But if you mean only the ability to get confused
by copy+pasting Lisp code that has a fancy quote
mark somewhere in place of ordinary ASCII
apostrophe ('), e.g., (setq foo ’bar), then
that's just the same pilot-error gotcha as for
bug #30217.

There are many gotchas in Lisp.  You can see
repeated postings of some at various places
(e.g., help-gnu-emacs, emacs.stackexchange).
E.g., the error that a given Lisp function is
not defined (because its library was not loaded).

The pilot error described in bug #30217 is not
even a commonly reported one.  The "fix" made
in #30217 is an overreaction.

So one solution to #30217 is to do nothing - just
revert the misguided Lisp syntax change.  Users
will learn that gotcha the same way they learn
others.  Not every report of a gotcha needs to
lead to changes to Emacs.

If we do nothing there will continue to be some
such pilot errors, of course.  But we already
raise an error if the code leads to a problem.

And the original error message from bug #23425
is _more_ meaningful and helpful, not less,
than the new one after the "fix".

The original error msg of #23425:
  (wrong-number-of-arguments setq 31)

tells you pretty much that setq is missing an
argument or it has too many, which makes you
look at its arguments.  Not so obscure.  And
accurate.

The new error msg:
  (invalid-read-syntax "strange quote" "’")

is obscure.  Invalid read syntax when reading
what?  What's invalid about it?

Confusion - not understanding an accurate error
msg, is not the same thing as Lisp itself having
a bug because such a character is included in a
symbol name.

Another solution is to try to warn users about
the use of confusables.

That's actually many solutions, because it
requires handling different chars and different
gotcha contexts differently, and carefully.
But unlike a syntax change it's not an
all-or-nothing thing: we could add warnings here
and there, as something might be better than
nothing.

Either doing nothing or trying to warn about such
gotchas is right.  Changing Lisp syntax here is
not right.  Lisp doesn't have a bug here.

This is all about pilot error - the same kind of
thing that happens when someone mistypes `,' for
`.' for dotted-pair syntax, or types `.' in `a.b'
intending dotted-pair syntax but getting a symbol
instead, or quotes a sexp expecting the sexp to
be evaluated.

Yes, a user might scratch her head when seeing
the error message from such a mistake, but the
error message is right, not wrong, and eventually
the light turns on.

And this enlightenment is aided by the fact that
Lisp syntax is so simple.  The "fix" for bug
#30217 goes in the opposite direction.  It makes
Lisp syntax more complex and makes understanding
syntax mistakes more difficult.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03 16:16   ` Drew Adams
@ 2018-02-03 17:05     ` Eli Zaretskii
  2018-02-04  1:16       ` Michael Heerdegen
                         ` (3 more replies)
  0 siblings, 4 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-03 17:05 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel, npostavs

> Date: Sat, 3 Feb 2018 08:16:15 -0800 (PST)
> From: Drew Adams <drew.adams@oracle.com>
> Cc: emacs-devel@gnu.org
> 
> > The bug reports which triggered the above changes are bug#2967 and
> > bug#23425.  So any proposal to remove those changes should also
> > suggest an alternative for handling those bug reports.
> 
> For "handling those bug reports"?  Are we to add
> more cans of worms to this question, obscuring it?
> 
> AFAICT, no alternatives to handling those bugs
> are needed because of reverting the Lisp syntax
> change made for bug #30217.  Can you point to
> how/why reverting that change would necessitate
> alternative fixes for those bugs?

Those bug reports complained about obscure error messages that are
unhelpful when a Lisp programmer tries to figure out the root cause.
I'm saying that we should find an alternative way of making clear,
helpful error messages in those special cases where characters which
display similarly might make the error message confusing if it just
cites the symbol's name.

For example, suppose you have a Lisp program that produces the
following error message when compiled/executed:

  Symbol's value as variable is void: 'аbbrevs-changed

You then type "C-h v abbrevs-changed RET" and get the expected result,
meaning that the variable is known to Emacs.  How quickly will you be
able to spot the cause of the error message?

The change that got reverted from the emacs-26 branch was about a
similar case, but for a character that's much more important for Lisp
than 'a': it's about the character used to quote symbol names.  But
the essence is the same: due to how characters are displayed, some
characters can be confused for others.

We want to find a way of identifying such situations and telling the
Lisp programmer about them in clear and easily understandable ways.
One way, perhaps too radical a one, is to reject such "confusable"
characters outright.  We could decide that we don't want such a
radical solution, but that doesn't mean we should give up on the
attempt to find some other solution for the problem.  Neither does it
mean we should proclaim people who installed the change as enemies of
the society.

> Bug #23425, on the other hand, is a gigantic
> stream-of-consciousness about anything and
> everything [...]
> [...]
> How is it helpful to throw all of #23425 into
> this Lisp syntax-change question, as if the
> present issue puts into question everything
> ever discussed about curly quotes?

I could turn the table and ask you how is it helpful to dump on us all
your random thoughts about this, instead of simply saying you didn't
understand the relevance and asking for more explanations.  Which I
just provided.

I hope now the issue is clear enough.

> And the original error message from bug #23425
> is _more_ meaningful and helpful, not less,
> than the new one after the "fix".
> 
> The original error msg of #23425:
>   (wrong-number-of-arguments setq 31)
> 
> tells you pretty much that setq is missing an
> argument or it has too many, which makes you
> look at its arguments.  Not so obscure.  And
> accurate.
> 
> The new error msg:
>   (invalid-read-syntax "strange quote" "’")
> 
> is obscure.  Invalid read syntax when reading
> what?  What's invalid about it?

I think you are so eager to make your point that you are willing to
claim that black is white and vice versa.  Any objective person would
agree that the new error message is more directly pointing to the root
cause, which is the syntax of specifying a quoted symbol name using a
"strange quote".  If we are good in writing and indexing our ELisp
manual, then I'd expect to find there an index entry for "strange
quote", which will land me where this issue is explained.  Case
closed.

Once again, I can agree that this measure might be too harsh, but I
would still like to see clear diagnostics of such typos, and like
Paul, I think we should take our inspiration from the Unicode
Standard's notion of "confusables".  Ideas and proposals for patches
along those lines are welcome.  Ignoring the problem, or trying to
convince us that it doesn't exist, is not.

> Either doing nothing or trying to warn about such
> gotchas is right.  Changing Lisp syntax here is
> not right.

Doing nothing would be ignoring the problem.  That changing Lisp
syntax is not right is your opinion: legitimate, but clearly not
shared by at least some.

> Lisp doesn't have a bug here.

That's a strawman, and you know it.  We are talking about diagnostics
for bugs in Lisp programs.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky
  2018-02-02 22:52 ` Paul Eggert
  2018-02-03  8:33 ` Eli Zaretskii
@ 2018-02-03 18:13 ` Aaron Ecay
  2018-02-04  2:05   ` Drew Adams
  2018-02-04  4:51   ` Paul Eggert
  2018-10-05  0:03 ` Noam Postavsky
  3 siblings, 2 replies; 98+ messages in thread
From: Aaron Ecay @ 2018-02-03 18:13 UTC (permalink / raw)
  To: Noam Postavsky, Emacs developers; +Cc: Drew Adams

Hi Noam,

On 2 February 2018, Noam Postavsky wrote:
> 
> In Emacs 26 and earlier the following is valid lisp code:
> 
> (setq ’bar 42)
> (setq foo ’bar)

I was surprised to learn that this is the case, in light of what is
said in the Elisp reference about symbol names: “A symbol name can
contain any characters whatever. Most symbol names are written with
letters, digits, and the punctuation characters ‘-+=*/’. Such names
require no special punctuation; the characters of the name suffice as
long as the name does not look like a number. (If it does, write a ‘\’
at the beginning of the name to force interpretation as a symbol.) The
characters ‘_~!@$%^&:<>{}?’  are less often used but also require no
special punctuation. Any other characters may be included in a symbol's
name by escaping them with a backslash.”  (info "(elisp) Symbol Type")

Would it be worth considering making the reader enforce this
specification fully, as an alternative to your patch?  That would solve
this problem with curly quotes in symbol names (which also bit me at
one point), as well as the potential problems with other confusable
characters raised by Paul.

(It might still be desirable to add a special user-friendly error message
when the illegal characters are confusable with an ASCII single quote, as
an additional user-friendliness measure.)
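
The one escaping rule from that passage that the reader clearly does
honor, the leading backslash for number-like names, can be checked
with a small sketch like this (only `read-from-string' is used; the
sample strings are arbitrary):

```elisp
;; Without the backslash the text reads as a number; with a leading
;; backslash the same characters read as a symbol.
(let ((plain   (car (read-from-string "123")))
      (escaped (car (read-from-string "\\123"))))
  (list (numberp plain)          ; t
        (symbolp escaped)        ; t
        (symbol-name escaped)))  ; "123"
```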

Aaron

PS if this approach is not taken, the manual should at least be changed
to match the actual behavior of the reader.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03 17:05     ` Eli Zaretskii
@ 2018-02-04  1:16       ` Michael Heerdegen
  2018-02-04  1:25         ` Clément Pit-Claudel
                           ` (2 more replies)
  2018-02-04  1:55       ` Drew Adams
                         ` (2 subsequent siblings)
  3 siblings, 3 replies; 98+ messages in thread
From: Michael Heerdegen @ 2018-02-04  1:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, Drew Adams, emacs-devel

Hello,

Helpfulness of error messages surely depends on the beholder, and on
expectations.  In my eyes,

> Symbol's value as variable is void: 'аbbrevs-changed

is quite clear: you think this        ^^^^^^^^^^^^^^^^ is a quoted
thing, but the error message calls it a symbol.  So there must be a
problem with that quote, it has obviously gotten read as part of the
symbol.  Sure, you have still to find out why.  OTOH

> >   (invalid-read-syntax "strange quote" "’")

also doesn't say what's wrong with that quote.  It even calls something
a quote where there is none.  The error message is confusing.  Repeating
the pseudo quote character in the error message doesn't make it look
less like a quote.

> I think you are so eager to make your point that you are willing to
> claim that black is white and vice versa.  Any objective person would
> agree that the new error message is more directly pointing to the root
> cause

Are you really sure that every Emacs user would expect that we modify
the Lisp reader to catch typos?

FWIW, we already modified the Lisp reader to catch another style issue
(to get rid of old-style backquotes) and made it error.  It broke my
stuff (el-search) horribly - though I don't use old-style backquotes,
and for code that also doesn't use them.  Now I need to work around
`read' and define my own `read' function.  I also need to remember for a
long time that using `read' is forbidden in my library.  I even
implemented a minor mode to warn me just about that: it warns me that I
use `read' and it's forbidden.  Otherwise, I would get strange errors
when using my stuff, from time to time, whenever I added a `read' by
accident.  All other users of my package, too.  And believe me, _these_
error messages are then less understandable than

> Symbol's value as variable is void: 'аbbrevs-changed.

Misusing something fundamental as the Lisp reader to catch such stuff
should be the very last resort.  The result can get much more confusing
in situations we now don't think about.

> > Lisp doesn't have a bug here.
> That's a strawman, and you know it.  We are talking about diagnostics
> for bugs in Lisp programs.

I think it's a legitimate argument.  Drew just thinks it's the wrong fix.
He may also think that no fix would maybe suffice.  That's ok, and I
think he made some good points.

We should discuss alternative approaches to move forward.  People
often paste stuff into scratch or the M-: prompt that they copied from
elsewhere.  Maybe we could make M-: and C-x C-e check for this problem.
These could also check for other, similar frequent problems.  Any better
suggestions?
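
To make that idea concrete, here is a very rough sketch of such a
check.  The function name is invented, it only looks for U+2018 and
U+2019, and it is not hooked into M-: or C-x C-e; it merely counts the
characters in a region and mentions them in the echo area.

```elisp
;; Hypothetical command: count curly single quotes between BEG and END
;; and tell the user about them.  It never modifies the buffer.
(defun my-count-curly-quotes (beg end)
  "Return the number of U+2018/U+2019 characters between BEG and END."
  (interactive "r")
  (save-excursion
    (goto-char beg)
    (let ((count 0))
      (while (re-search-forward "[\u2018\u2019]" end t)
        (setq count (1+ count)))
      (when (> count 0)
        (message "Found %d curly quote(s); did you mean an ASCII apostrophe?"
                 count))
      count)))
```

A caller could run this on pasted text before evaluating it and leave
the decision to the user, without any change to `read'.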


Michael.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  1:16       ` Michael Heerdegen
@ 2018-02-04  1:25         ` Clément Pit-Claudel
  2018-02-04  2:05           ` Drew Adams
                             ` (2 more replies)
  2018-02-04 11:15         ` Alan Mackenzie
  2018-02-04 14:47         ` Noam Postavsky
  2 siblings, 3 replies; 98+ messages in thread
From: Clément Pit-Claudel @ 2018-02-04  1:25 UTC (permalink / raw)
  To: Michael Heerdegen, Eli Zaretskii; +Cc: emacs-devel, Drew Adams, npostavs

On 2018-02-03 20:16, Michael Heerdegen wrote:
> Helpfulness of error messages surely depends on the beholder, and on
> expectations.  In my eyes,
> 
>> Symbol's value as variable is void: 'аbbrevs-changed
> is quite clear: you think this        ^^^^^^^^^^^^^^^^ is a quoted
> thing, but the error message calls it a symbol.  So there must be a
> problem with that quote, it has obviously gotten read as part of the
> symbol.  Sure, you have still to find out why.

I think you're making Eli's point, actually :)

The problem isn't the quote: it's the CYRILLIC SMALL LETTER A instead of LATIN SMALL LETTER A.  IOW, (string= "аbbrevs-changed" "abbrevs-changed") is nil.

I think Eli was illustrating the confusion that can stem from Unicode confusables (and I must agree that the error message could be much better ^^)
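
For anyone who wants to see the difference rather than squint at it, a small sketch along these lines should do (the variable names are just for illustration):

```elisp
(let ((typed  "abbrevs-changed")          ; all ASCII
      (pasted "\u0430bbrevs-changed"))    ; first letter is U+0430
  (list (string= typed pasted)                    ; the names differ
        (string-match "[^[:ascii:]]" pasted)      ; index of first non-ASCII char
        (get-char-code-property (aref pasted 0) 'name)))
;; => (nil 0 "CYRILLIC SMALL LETTER A")
```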

Clément.




* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03 17:05     ` Eli Zaretskii
  2018-02-04  1:16       ` Michael Heerdegen
@ 2018-02-04  1:55       ` Drew Adams
  2018-02-04  2:10         ` Noam Postavsky
  2018-02-05  1:06       ` Why "symbol's value" error about a list? Richard Stallman
  2018-02-05  1:06       ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Richard Stallman
  3 siblings, 1 reply; 98+ messages in thread
From: Drew Adams @ 2018-02-04  1:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, npostavs

> Those bug reports complained about obscure error messages that are
> unhelpful when a Lisp programmer tries to figure out the root cause.
> I'm saying that we should find an alternative way of making clear,
> helpful error messages in those special cases where characters which
> display similarly might make the error message confusing if it just
> cites the symbol's name.

OK.  Except I would say warnings, not error messages, at
least in most cases.

But even if we have an error message, that's not a call
to change the syntax of Lisp.  User errors happen.  We
should just want to help users avoid making such errors.

> For example, suppose you have a Lisp program that produces
> the following error message when compiled/executed:
> 
>   Symbol's value as variable is void: 'аbbrevs-changed
> 
> You then type "C-h v abbrevs-changed RET" and get the expected result,
> meaning that the variable is known to Emacs.  How quickly will you be
> able to spot the cause of the error message?

Some people will wonder for a while.  Others, perhaps
already bitten by this gotcha, will notice the quote
mark there right away.

One thing that would help, I think, and which should
be done in general, would be to put the offending
thingie between `...':

 Symbol's value as variable is void: `'аbbrevs-changed'

That makes it more obvious that the symbol name
includes that fancy quote char.

Still, all of this is pilot error, where "pilot" can
include the user who wrote the code but more likely
means a user who copy+pasted it.

> The change that got reverted from the emacs-26 branch was about a
> similar case, but for a character that's much more important for Lisp
> than 'a': it's about the character used to quote symbol names.  But
> the essence is the same: due to how characters are displayed, some
> characters can be confused for others.
> 
> We want to find a way of identifying such situations and telling the
> Lisp programmer about them in clear and easily understandable ways.
> One way, perhaps too radical a one, is to reject such "confusable"
> characters outright.  We could decide that we don't want such a
> radical solution, but that doesn't mean we should give up on the
> attempt to find some other solution for the problem.  Neither does it
> mean we should proclaim people who installed the change as enemies of
> the society.

Agreed.  As I've said, I'm in favor of providing
friendly warnings/reminders that point out that
such a character is present.

I think that should be enough.

There are lots of potential confusables, and lots
of different use contexts.  But if we start with
just one or two such chars and one or two common
and clear contexts where a warning might help, that
would be good.  We can always add more such warnings
as cases come up (get reported or otherwise become
obvious).

It would be an overreaction, IMO, to jump to
changing the existing Lisp syntax to raise errors
when someone uses such a character in, say, a symbol
name.  We should not require such chars to be
escaped in a symbol name.  Such chars have no special
meaning for Lisp (unlike `.', `,', `'', ``', `(', `)',
`[', `]', `"', `<', `>', `#', `;', and perhaps some more).

> > Bug #23425, on the other hand, is a gigantic
> > stream-of-consciousness about anything and
> > everything [...]
> > [...]
> > How is it helpful to throw all of #23425 into
> > this Lisp syntax-change question, as if the
> > present issue puts into question everything
> > ever discussed about curly quotes?
> 
> I could turn the table and ask you how is it helpful
> to dump on us all your random thoughts about this,
> instead of simply saying you didn't understand the
> relevance and asking for more explanations.  Which I
> just provided.

Whoa!  I don't see a connection between the current
issue and the many things discussed in #23425.  And
I don't think I dumped any random thoughts on anyone.

> I hope now the issue is clear enough.

No idea what your point is there.

If there is some part of bug #23425 that you think
is relevant here, and you think it will be UNfixed by
reverting the Lisp-syntax change made for bug #30217,
please tell us what that part is.

I don't see anything in #23425 that needs the change
in Lisp syntax made for #30217.  And I don't see that
Lisp change being necessary to fix #30217 either.
It wasn't requested by the bug filer, AFAIK.  Same
for the other bugs you mentioned.  The filers just
asked for warnings, AFAICT.

> > And the original error message from bug #23425
> > is _more_ meaningful and helpful, not less,
> > than the new one after the "fix".
>
> I think you are so eager to make your point that you are willing to
> claim that black is white and vice versa.  Any objective person would
> agree that the new error message is more directly pointing to the root
> cause, which is the syntax of specifying a quoted symbol name using a
> "strange quote".  If we are good in writing and indexing our ELisp
> manual, then I'd expect to find there an index entry for "strange
> quote", which will land me where this issue is explained.  Case
> closed.

We can perhaps agree to disagree about that.
But of course if you say the case is closed then
it's closed.

> Once again, I can agree that this measure might be too harsh, but I
> would still like to see clear diagnostics of such typos, and like
> Paul, I think we should take our inspiration from the Unicode
> Standard's notion of "confusables".

I've agreed about that from the beginning.  It can
be helpful to warn users about possible confusion
when they use confusables.  And I agree that clear
diagnostics are needed - that was one of my points.

That's different from changing the syntax of Lisp.

> Ideas and proposals for patches along those lines
> are welcome.

Ditto.

> Ignoring the problem, or trying to convince us
> that it doesn't exist, is not.

I recognize the problems of confusable characters.
Not all such possible confusions are equally likely,
in practice.

Recognizing contexts where something might well be
a typo, and warrants a helpful reminder/warning, is
what's needed - case by case.

What's not needed, IMO (and probably the only place
where I differ from you on this, even if you don't
want to recognize it) is a change in Lisp syntax,
making it a read error not to escape such a character.

> > Either doing nothing or trying to warn about such
> > gotchas is right.  Changing Lisp syntax here is
> > not right.
> 
> Doing nothing would be ignoring the problem.

Yes.  It's maybe not the best help for users, but
it would be one way to handle those few reports of
confusion.  We get a lot more questions due to
other confusions wrt Lisp than we do such questions
due to confusing one char for another.

I didn't, and don't, say that doing nothing is the
best approach.  I said it's one way to deal with
such reports.  Unlike changing Lisp syntax, it at
least doesn't introduce new problems.

> That changing Lisp syntax is not right is your
> opinion: legitimate, but clearly not shared by at
> least some.

That's why we're having this discussion.

I have yet to hear a reason why it is right to
change Lisp syntax for this - why a simple warning
is not sufficient and we need to also make Lisp
raise an error.

> > Lisp doesn't have a bug here.
> 
> That's a strawman, and you know it.  We are talking
> about diagnostics for bugs in Lisp programs.

I have no objection to diagnostics.  Add warnings
for byte-compilation, loading, whatever.

Make sure the warnings are clear.  Say, for instance
that a curly quote was used in sexp `...'.  Don't
just say that invalid syntax was read (somewhere).
Clearly pointing out the confusable char in the
possibly confused sexp should go a long way to
making things clear.

My objection is to making such chars be escaped to
prevent Lisp from raising an error.  I don't put
`a’b' in the same class as, say, `a,b'.

`,' is special in Lisp, and (setq a,b 42) should
(and does) raise an error.  `’' is not special in
Lisp, and (setq a’b 42) should not raise an error (IMO).
Likewise, (setq ,b 42) (yes) and (setq ’b 42) (no).

If you want to argue for this syntax change, why
not address some of my arguments against it?  Where
will you draw the line, for instance?  There are
_lots_ of possible confusables.

I'd say start with only the few that have actually
been reported (is there only one reported?), trying
to come up with reasonable warnings in particular
contexts of use.  That would be a good start.

We might even have a user option that lists the
confusables to check/warn for, with whatever
default value people here think is best (it might
be only `’', to start with - or both left and
right curly quotes).
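
To make the user-option idea concrete, here is a rough C sketch.  It is
not Emacs code: `warn_chars', `n_warn_chars', `first_warned_char' and
the two-entry default list are hypothetical names and values chosen
only for illustration.  It scans a symbol name, given as code points,
and reports the first character found in a configurable list, so that a
caller could issue a warning rather than a read error:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical default for the option: only the two curly single
   quotes, U+2018 and U+2019.  */
const uint_least32_t warn_chars[] = { 0x2018, 0x2019 };
const size_t n_warn_chars = sizeof warn_chars / sizeof warn_chars[0];

/* Return the index of the first code point of NAME (terminated by 0)
   that is listed in LIST[0..N-1], or -1 if none is listed.  */
long
first_warned_char (const uint_least32_t *name,
                   const uint_least32_t *list, size_t n)
{
  assert (name != NULL && (list != NULL || n == 0));
  for (long i = 0; name[i] != 0; i++)
    for (size_t j = 0; j < n; j++)
      if (name[i] == list[j])
        return i;
  return -1;
}
```

With the default list, the name ’bar yields index 0, a’b yields 1, and
bar yields -1.  Extending the list changes what is reported without
touching Lisp syntax.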

Are you thinking instead (since both you and Paul
mentioned the Unicode list of confusables) of
starting with _all_ characters in that list?

http://www.unicode.org/Public/security/8.0.0/confusables.txt

I won't argue about which chars should be warned
about, though I might be interested to see what
contexts we warn for and what the messages will be.

My objection is not about detecting this or that
use of this or that character and warning/reminding
users about it.

My objection is to making Lisp require escaping of
such characters.  That's all.  I think I've made
that as clear as I possibly can.

But you seem to want to paint my objection as
being against helping users know about accidental
use of confusables, e.g., `’' instead of `''.  Why?




* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03 18:13 ` Aaron Ecay
@ 2018-02-04  2:05   ` Drew Adams
  2018-02-04  4:51   ` Paul Eggert
  1 sibling, 0 replies; 98+ messages in thread
From: Drew Adams @ 2018-02-04  2:05 UTC (permalink / raw)
  To: Aaron Ecay, Noam Postavsky, Emacs developers

> I was surprised to learn that this is the case, in light of what is
> said in the Elisp reference about symbol names:
>
> “A symbol name can
> contain any characters whatever. Most symbol names are written with
> letters, digits, and the punctuation characters ‘-+=*/’. Such names
> require no special punctuation; the characters of the name suffice as
> long as the name does not look like a number. (If it does, write a ‘\’
> at the beginning of the name to force interpretation as a symbol.) The
> characters ‘_~!@$%^&:<>{}?’  are less often used but also require no
> special punctuation. Any other characters may be included in a symbol's
> name by escaping them with a backslash.”  (info "(elisp) Symbol Type")

Thank you very much for that.  I guess I wasn't aware of
that text.  I thought that there were only a very few
chars that needed to be escaped in symbol names - `,',
`(', etc.: only chars that have special syntactic
meaning in Lisp.

I suppose that invalidates my objection, though I wonder
_why_ we would require escaping so many ordinary chars.

And like you I wonder whether that text is accurate.
I wonder whether that is the intended design (why?) or it
is just an inaccurate description of the real behavior.

Trying various chars from confusables.txt, it does not
seem like they require escaping (at least not yet).
That text appears to be wrong.

I'd prefer it if escaping was _not_ required for chars
other than those mentioned in that text, including
chars in confusables.txt.  I think it makes more sense
to require escaping only for characters that have
special Lisp significance, syntactically.

IOW, I prefer the actual behavior to the behavior
described in that text.  I don't think someone using
Hebrew or Arabic or Chinese or Korean letters in a
symbol name should need to escape each one (or any
of them).

But if the design described there has already been
decided on as best for Emacs then I guess my
argument is moot.  In that case, the implementation
is currently waaaaaay out of whack wrt the design.

And if that's the design to be implemented then I
agree with you that implementing it as described
in that text would at least have an advantage of
consistency.

> Would it be worth considering making the reader enforce this full
> specification, as an alternative to your patch?  That would solve
> this problem with curly quotes in symbol names (which also bit me at
> one point), as well as the potential problems with other confusable
> characters raised by Paul.
> 
> (It might still be desirable to add a special user-friendly error
> message when the illegal characters are confusable with an ASCII
> single quote, as an additional user-friendliness measure.)
> 
> If this approach is not taken, the manual should at least
> be changed to match the actual behavior of the reader.

That's the approach I'd prefer.  Let chars be used in
symbol names without escaping, except for those with
special Lisp syntax.

But add warnings in contexts where we think someone
might have inadvertently used a confusable in place
of a common character.




* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  1:25         ` Clément Pit-Claudel
@ 2018-02-04  2:05           ` Drew Adams
  2018-02-04  2:06           ` Michael Heerdegen
  2018-02-04 10:34           ` Alan Third
  2 siblings, 0 replies; 98+ messages in thread
From: Drew Adams @ 2018-02-04  2:05 UTC (permalink / raw)
  To: Clément Pit-Claudel, Michael Heerdegen, Eli Zaretskii
  Cc: emacs-devel, npostavs

> > Helpfulness of error messages surely depends on the beholder, and on
> > expectations.  In my eyes,
> >
> >> Symbol's value as variable is void: 'аbbrevs-changed
> > is quite clear: you think this        ^^^^^^^^^^^^^^^^ is a quoted
> > thing, but the error message calls it a symbol.  So there must be a
> > problem with that quote, it has obviously gotten read as part of the
> > symbol.  Sure, you have still to find out why.
> 
> I think you're making Eli's point, actually :)
> 
> The problem isn't the quote: it's the CYRILLIC SMALL LETTER A instead of
> LATIN SMALL LETTER A.  IOW, (string= "аbbrevs-changed" "abbrevs-
> changed") is nil.
> 
> I think Eli was illustrating the confusion that can stem from Unicode
> confusables (and I must agree that the error message could be much
> better ^^)

I too misread Eli's example as being about using a
curly quote instead of an apostrophe.  You're right
that it's an ordinary apostrophe and the first `a'
is the letter you mention.

But then why would anyone ever see the quote mark
in such a message?  Was the message artificially
configured?

In any case, if that example, without the quote, say,
is trying to make Eli's point, then he must be arguing
for warning about using such confusables also - `а'
as a confusable for `a'.

That's a monumental undertaking (take a look at the
confusables.txt list).  And the messages (warning or
error) would need to be pretty darn clear about just
what char was used and where, in order not to sow
even more confusion.  It sure won't cut the mustard
to just say "Invalid read syntax"!




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  1:25         ` Clément Pit-Claudel
  2018-02-04  2:05           ` Drew Adams
@ 2018-02-04  2:06           ` Michael Heerdegen
  2018-02-04 10:34           ` Alan Third
  2 siblings, 0 replies; 98+ messages in thread
From: Michael Heerdegen @ 2018-02-04  2:06 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: Eli Zaretskii, emacs-devel, Drew Adams, npostavs

Clément Pit-Claudel <cpitclaudel@gmail.com> writes:

> On 2018-02-03 20:16, Michael Heerdegen wrote:
> > Helpfulness of error messages surely depends on the beholder, and on
> > expectations.  In my eyes,
> > 
> >> Symbol's value as variable is void: 'аbbrevs-changed
> > is quite clear: you think this        ^^^^^^^^^^^^^^^^ is a quoted
> > thing, but the error message calls it a symbol.  So there must be a
> > problem with that quote, it has obviously gotten read as part of the
> > symbol.  Sure, you have still to find out why.
>
> I think you're making Eli's point, actually :)
>
> The problem isn't the quote: it's the CYRILLIC SMALL LETTER A instead
> of LATIN SMALL LETTER A.  IOW, (string= "аbbrevs-changed"
> "abbrevs-changed") is nil.

Oh.  Why then is there a quote in this error message?

FWIW, I'm not against doing something that helps the user in such
situations.  But these are problems in the interaction between the user
and Emacs, so we should care about it on that (the interface) level.
And keep Lisp, the language, simple.


Michael.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  1:55       ` Drew Adams
@ 2018-02-04  2:10         ` Noam Postavsky
  0 siblings, 0 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-02-04  2:10 UTC (permalink / raw)
  To: Drew Adams; +Cc: Eli Zaretskii, Emacs developers

On Sat, Feb 3, 2018 at 8:55 PM, Drew Adams <drew.adams@oracle.com> wrote:

> My objection is to making Lisp require escaping of
> such characters.  That's all.  I think I've made
> that as clear as I possibly can.

I think your position is indeed quite clear by now. In fact, I think
the length and frequency of your posts are going to make it harder for
other people to participate, so could you dial it back a bit.
Please?




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03 18:13 ` Aaron Ecay
  2018-02-04  2:05   ` Drew Adams
@ 2018-02-04  4:51   ` Paul Eggert
  2018-02-04  9:47     ` Andreas Schwab
  2018-02-04 15:04     ` Noam Postavsky
  1 sibling, 2 replies; 98+ messages in thread
From: Paul Eggert @ 2018-02-04  4:51 UTC (permalink / raw)
  To: Aaron Ecay, Noam Postavsky, Emacs developers; +Cc: Drew Adams

[-- Attachment #1: Type: text/plain, Size: 1230 bytes --]

Aaron Ecay wrote:
> I was surprised to learn that this is the case, in light of what is
> said in the Elisp reference about symbol names

Good point; thanks. In the spirit of "be strict about what you generate", the 
Emacs printer should escape any character that is not in the list of characters 
documented in the Elisp manual as being safe (i.e., as not requiring escaping). 
This is elementary future-proofing, and is independent of whether we want Emacs 
to warn about or disallow confusable chars in symbols.

Proposed patches against 'master' attached. The first merely simplifies the code
without changing its effect. The second fixes a bug in the manual, which
incorrectly states that '?' never needs escaping in symbol names. These two
patches are routine. (I assume the second one should be applied to emacs-26
instead of to master.)

The third patch changes the Lisp printer to escape characters as suggested above.

The fourth patch changes the Lisp printer to escape '?' only at the start of a 
symbol. This is nicer for programs using Scheme-style naming conventions in 
Emacs Lisp, e.g., 'fooish?' rather than 'fooishp'. I discovered the need for 
this patch when I wrote the second patch.
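
For readers who want the gist of the third and fourth patches without
reading the diffs, the printer's rule can be sketched as a standalone C
predicate. This is an illustration only: `symbol_char_needs_escape' is a
hypothetical name that does not appear in the patches, the sketch handles
Unicode code points only, and it leaves out the separate check for names
that look like numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return true if code point C should be preceded by a backslash when
   printed as part of a symbol name.  AT_START says whether C is the
   first character of the name.  */
bool
symbol_char_needs_escape (uint_least32_t c, bool at_start)
{
  /* Punctuation the Elisp manual documents as needing no escape.  */
  static const char safe_punct[] = "!$%&*-+/:<=>@^_{}~";

  assert (c <= 0x10FFFF);	/* Sketch: Unicode code points only.  */

  if (c == '?')
    return at_start;		/* Patch 4: only a leading '?' is escaped.  */
  if (c == 0 || c > 0x7F)
    return true;		/* NUL and every non-ASCII character.  */
  if (('0' <= c && c <= '9') || ('A' <= c && c <= 'Z')
      || ('a' <= c && c <= 'z'))
    return false;		/* ASCII letters and digits are safe.  */
  return strchr (safe_punct, (int) c) == NULL;
}
```

Under this rule the Cyrillic 'і' (U+0456) from the NEWS example is escaped,
'list?' prints unchanged, and '?x' gains a leading backslash.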

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Simplify-print_object-a-bit.patch --]
[-- Type: text/x-patch; name="0001-Simplify-print_object-a-bit.patch", Size: 2780 bytes --]

From c03b816016f8cc2f15d275e7ad23448366489277 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:29:00 -0800
Subject: [PATCH 1/4] Simplify print_object a bit

* src/print.c (print_object): Simplify by using C99
constructs, and by taking advantage of the fact that Lisp
strings are followed by null bytes.
---
 src/print.c | 40 ++++++++++++++++------------------------
 1 file changed, 16 insertions(+), 24 deletions(-)

diff --git a/src/print.c b/src/print.c
index b3c0f6f..d3eb49d 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1916,38 +1916,29 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     case Lisp_Symbol:
       {
 	bool confusing;
-	unsigned char *p = SDATA (SYMBOL_NAME (obj));
-	unsigned char *end = p + SBYTES (SYMBOL_NAME (obj));
-	int c;
-	ptrdiff_t i, i_byte;
-	ptrdiff_t size_byte;
-	Lisp_Object name;
-
-	name = SYMBOL_NAME (obj);
-
-	if (p != end && (*p == '-' || *p == '+')) p++;
-	if (p == end)
-	  confusing = 0;
+	Lisp_Object name = SYMBOL_NAME (obj);
+	ptrdiff_t size_byte = SBYTES (name);
+	unsigned char *p = SDATA (name);
+	unsigned char *end = p + size_byte;
+
 	/* If symbol name begins with a digit, and ends with a digit,
 	   and contains nothing but digits and `e', it could be treated
 	   as a number.  So set CONFUSING.
 
-	   Symbols that contain periods could also be taken as numbers,
-	   but periods are always escaped, so we don't have to worry
-	   about them here.  */
-	else if (*p >= '0' && *p <= '9'
-		 && end[-1] >= '0' && end[-1] <= '9')
+	   Symbols that contain '.' or '#' could also be taken as
+	   numbers, but these are always escaped so don't worry about
+	   them here.  */
+	if (c_isdigit (p[*p == '-' || *p == '+']) && c_isdigit (end[-1]))
 	  {
-	    while (p != end && ((*p >= '0' && *p <= '9')
-				/* Needed for \2e10.  */
-				|| *p == 'e' || *p == 'E'))
+	    /* Check for 'e' too; needed for \2e10.  */
+	    do
 	      p++;
+	    while (c_isdigit (*p) || *p == 'e' || *p == 'E');
+
 	    confusing = (end == p);
 	  }
 	else
-	  confusing = 0;
-
-	size_byte = SBYTES (name);
+	  confusing = false;
 
 	if (! NILP (Vprint_gensym)
             && !SYMBOL_INTERNED_IN_INITIAL_OBARRAY_P (obj))
@@ -1958,10 +1949,11 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 	    break;
 	  }
 
-	for (i = 0, i_byte = 0; i_byte < size_byte;)
+	for (ptrdiff_t i = 0, i_byte = 0; i_byte < size_byte;)
 	  {
 	    /* Here, we must convert each multi-byte form to the
 	       corresponding character code before handing it to PRINTCHAR.  */
+	    int c;
 	    FETCH_STRING_CHAR_ADVANCE (c, name, i, i_byte);
 	    maybe_quit ();
 
-- 
2.7.4


[-- Attachment #3: 0002-Say-needs-escaping-at-start-of-symbol.patch --]
[-- Type: text/x-patch; name="0002-Say-needs-escaping-at-start-of-symbol.patch", Size: 1210 bytes --]

From 4b945a3fcbf6ff2bde4595dd8b8f472d1b3d17af Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:30:21 -0800
Subject: [PATCH 2/4] Say ? needs escaping at start of symbol.

* doc/lispref/objects.texi: ? is also special.
---
 doc/lispref/objects.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/lispref/objects.texi b/doc/lispref/objects.texi
index af74062..f0420e6 100644
--- a/doc/lispref/objects.texi
+++ b/doc/lispref/objects.texi
@@ -557,7 +557,8 @@ Symbol Type
 of the name suffice as long as the name does not look like a number.
 (If it does, write a @samp{\} at the beginning of the name to force
 interpretation as a symbol.)  The characters @samp{_~!@@$%^&:<>@{@}?} are
-less often used but also require no special punctuation.  Any other
+less often used but also require no special punctuation, except that
+@samp{\} must precede @samp{?} at the start of a symbol.  Any other
 characters may be included in a symbol's name by escaping them with a
 backslash.  In contrast to its use in strings, however, a backslash in
 the name of a symbol simply quotes the single character that follows the
-- 
2.7.4


[-- Attachment #4: 0003-prin1-etc.-now-escape-more-chars-in-symbols.patch --]
[-- Type: text/x-patch; name="0003-prin1-etc.-now-escape-more-chars-in-symbols.patch", Size: 3147 bytes --]

From 2add3a1595f709bb071e2b775970038470b2fab2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:30:48 -0800
Subject: [PATCH 3/4] prin1 etc. now escape more chars in symbols

Inspired by email from Aaron Ecay in:
https://lists.gnu.org/r/emacs-devel/2018-02/msg00125.html
* etc/NEWS: Mention this.
* src/print.c (print_object): Escape any character that is not
documented to not require escaping.
---
 etc/NEWS    |  7 +++++++
 src/print.c | 37 +++++++++++++++++++++++++++++++------
 2 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index afd0fba..2a46002 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -87,6 +87,13 @@ regular expression was previously invalid, but is now accepted:
 
    x\{32768\}
 
+** 'print' and related functions now escape more chars in symbols.
+They now escape any symbol character that is outside the documented
+set of characters that do not need escaping.  For example, (print
+(intern "n\u0456l")) now outputs "n\іl" instead of "nіl", as a hint to
+the reader that the "і" is not the usual U+0069 LATIN SMALL LETTER I,
+but is instead U+0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I.
+
 \f
 * Editing Changes in Emacs 27.1
 
diff --git a/src/print.c b/src/print.c
index d3eb49d..7eca36a 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1959,12 +1959,37 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 
 	    if (escapeflag)
 	      {
-		if (c == '\"' || c == '\\' || c == '\''
-		    || c == ';' || c == '#' || c == '(' || c == ')'
-		    || c == ',' || c == '.' || c == '`'
-		    || c == '[' || c == ']' || c == '?' || c <= 040
-                    || confusing
-		    || (i == 1 && confusable_symbol_character_p (c)))
+		switch (c)
+		  {
+		    /* The Emacs Lisp manual lists these characters as
+		       not requiring escaping in symbols.  Although some
+		       other characters might also work, play it safe
+		       and escape all but these characters.  */
+		  case '!': case '$': case '%': case '&':
+		  case '*': case '-': case '+': case '/':
+		  case '0': case '1': case '2': case '3': case '4':
+		  case '5': case '6': case '7': case '8': case '9':
+		  case ':': case '<': case '=': case '>': case '@':
+		  case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
+		  case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
+		  case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
+		  case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
+		  case 'Y': case 'Z':
+		  case '^': case '_':
+		  case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
+		  case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
+		  case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
+		  case 's': case 't': case 'u': case 'v': case 'w': case 'x':
+		  case 'y': case 'z':
+		  case '{': case '}': case '~':
+		    break;
+
+		  default:
+		    confusing = true;
+		    break;
+		  }
+
+		if (confusing)
 		  {
 		    printchar ('\\', printcharfun);
 		    confusing = false;
-- 
2.7.4


[-- Attachment #5: 0004-Escape-only-at-start-of-symbol.patch --]
[-- Type: text/x-patch; name="0004-Escape-only-at-start-of-symbol.patch", Size: 1913 bytes --]

From 4289ea136de4876b5dfc20d83b5a2556d1b5d8e6 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:39:48 -0800
Subject: [PATCH 4/4] Escape ? only at start of symbol

* src/print.c (print_object): Do it.
---
 etc/NEWS    | 4 ++++
 src/print.c | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 2a46002..c435136 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -94,6 +94,10 @@ set of characters that do not need escaping.  For example, (print
 the reader that the "і" is not the usual U+0069 LATIN SMALL LETTER I,
 but is instead U+0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I.
 
+** 'print' etc. no longer escape non-initial '?' in symbols.
+For example, the symbol 'list?' is now printed as-is.  Initial '?'
+is still escaped, e.g., (print (intern "?x")) still outputs "\?x".
+
 \f
 * Editing Changes in Emacs 27.1
 
diff --git a/src/print.c b/src/print.c
index 7eca36a..dfd6c50 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1938,7 +1938,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 	    confusing = (end == p);
 	  }
 	else
-	  confusing = false;
+	  confusing = *p == '?';
 
 	if (! NILP (Vprint_gensym)
             && !SYMBOL_INTERNED_IN_INITIAL_OBARRAY_P (obj))
@@ -1969,7 +1969,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 		  case '*': case '-': case '+': case '/':
 		  case '0': case '1': case '2': case '3': case '4':
 		  case '5': case '6': case '7': case '8': case '9':
-		  case ':': case '<': case '=': case '>': case '@':
+		  case ':': case '<': case '=': case '>': case '?': case '@':
 		  case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
 		  case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
 		  case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
-- 
2.7.4



* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  4:51   ` Paul Eggert
@ 2018-02-04  9:47     ` Andreas Schwab
  2018-02-04 15:04     ` Noam Postavsky
  1 sibling, 0 replies; 98+ messages in thread
From: Andreas Schwab @ 2018-02-04  9:47 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Aaron Ecay, Emacs developers, Drew Adams, Noam Postavsky

On Feb 03 2018, Paul Eggert <eggert@cs.ucla.edu> wrote:

> Good point; thanks. In the spirit of "be strict about what you generate",
> the Emacs printer should escape any character that is not in the list of
> characters documented in the Elisp manual as being safe (i.e., as not
> requiring escaping).

No!

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  1:25         ` Clément Pit-Claudel
  2018-02-04  2:05           ` Drew Adams
  2018-02-04  2:06           ` Michael Heerdegen
@ 2018-02-04 10:34           ` Alan Third
  2018-02-04 15:36             ` Clément Pit-Claudel
  2 siblings, 1 reply; 98+ messages in thread
From: Alan Third @ 2018-02-04 10:34 UTC (permalink / raw)
  To: Clément Pit-Claudel
  Cc: Michael Heerdegen, Eli Zaretskii, npostavs, Drew Adams,
	emacs-devel

On Sat, Feb 03, 2018 at 08:25:01PM -0500, Clément Pit-Claudel wrote:
> On 2018-02-03 20:16, Michael Heerdegen wrote:
> > Helpfulness of error messages surely depends on the beholder, and on
> > expectations.  In my eyes,
> > 
> >> Symbol's value as variable is void: 'аbbrevs-changed
> > is quite clear: you think this        ^^^^^^^^^^^^^^^^ is a quoted
> > thing, but the error message calls it a symbol.  So there must be a
> > problem with that quote, it has obviously gotten read as part of the
> > symbol.  Sure, you have still to find out why.
> 
> I think you're making Eli's point, actually :)
> 
> The problem isn't the quote: it's the CYRILLIC SMALL LETTER A
> instead of LATIN SMALL LETTER A. IOW, (string= "аbbrevs-changed"
> "abbrevs-changed") is nil.
> 
> I think Eli was illustrating the confusion that can stem from
> Unicode confusables (and I must agree that the error message could
> be much better ^^)

Something like:

Symbol's value as variable is void: 'аbbrevs-changed
Did you mean `abbrevs-changed'?
Symbol contains `а' (CYRILLIC SMALL LETTER A) at character 0, did you
mean `a' (LATIN SMALL LETTER A)?

The middle line would require Emacs to do a fuzzy search for similar
symbols, which may be too much. Something like that could be helpful
even in cases where the name has been mistyped (abbrev-changed instead
of abbrevs-changed, for example).
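
The third line of that message could be produced from a small lookup
table. The following C sketch is only an illustration: the two-entry
table, the struct, and the function names are hypothetical, and a real
implementation would presumably draw on Unicode's confusables data
rather than a hand-written list. It prints the code point as U+XXXX
instead of the character itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One confusable character and the ASCII character it resembles.  */
struct confusable
{
  uint_least32_t code;
  const char *name;
  char ascii;
  const char *ascii_name;
};

/* Hypothetical sample table, not taken from Emacs.  */
static const struct confusable confusables[] = {
  { 0x0430, "CYRILLIC SMALL LETTER A", 'a', "LATIN SMALL LETTER A" },
  { 0x0456, "CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I",
    'i', "LATIN SMALL LETTER I" },
};

/* Find the first confusable in NAME (code points, terminated by 0).
   Store its index in *POS and return its table entry; return NULL
   and leave *POS alone if NAME contains none.  */
const struct confusable *
find_confusable (const uint_least32_t *name, size_t *pos)
{
  assert (name != NULL && pos != NULL);
  for (size_t i = 0; name[i] != 0; i++)
    for (size_t j = 0; j < sizeof confusables / sizeof confusables[0]; j++)
      if (name[i] == confusables[j].code)
        {
          *pos = i;
          return &confusables[j];
        }
  return NULL;
}

/* Write the hint for entry C found at index POS into BUF of SIZE
   bytes.  Return whatever snprintf returns.  */
int
format_hint (char *buf, size_t size, const struct confusable *c, size_t pos)
{
  assert (buf != NULL && c != NULL);
  return snprintf (buf, size,
                   "Symbol contains U+%04lX (%s) at character %zu, "
                   "did you mean `%c' (%s)?",
                   (unsigned long) c->code, c->name, pos,
                   c->ascii, c->ascii_name);
}
```

The "Did you mean" line is deliberately left out, since it needs the
fuzzy search over known symbols mentioned above.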

-- 
Alan Third




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  1:16       ` Michael Heerdegen
  2018-02-04  1:25         ` Clément Pit-Claudel
@ 2018-02-04 11:15         ` Alan Mackenzie
  2018-02-04 15:54           ` Drew Adams
  2018-02-04 14:47         ` Noam Postavsky
  2 siblings, 1 reply; 98+ messages in thread
From: Alan Mackenzie @ 2018-02-04 11:15 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Eli Zaretskii, emacs-devel, Drew Adams, npostavs

Hello, Michael.

On Sun, Feb 04, 2018 at 02:16:52 +0100, Michael Heerdegen wrote:
> Hello,

> Helpfulness of error messages surely depends on the beholder, and on
> expectations.  In my eyes,

> > Symbol's value as variable is void: 'аbbrevs-changed

> is quite clear: you think this        ^^^^^^^^^^^^^^^^ is a quoted
> thing, but the error message calls it a symbol.  So there must be a
> problem with that quote, it has obviously gotten read as part of the
> symbol.  Sure, you have still to find out why.  OTOH

This has actually happened to me.  In the error message, I didn't see
the quote as part of the symbol, I subconsciously dismissed it as a
quoting convention in the error message.  So what my brain saw was

    Symbol's value as variable is void: abbrevs-changed

.  This puzzled me a long time.

> > >   (invalid-read-syntax "strange quote" "’")

> also doesn't say what's wrong with that quote.  It even calls something
> a quote where there is none.

Perhaps "strange quasi quote" would be more emphatic and clearer.

> The error message is confusing.  Repeating the pseudo quote character
> in the error message doesn't make it look less like a quote.

Agreed, on both points.

> > I think you are so eager to make your point that you are willing to
> > claim that black is white and vice versa.  Any objective person would
> > agree that the new error message is more directly pointing to the root
> > cause

> Are you really sure that every Emacs user would expect that we modify
> the Lisp reader to catch typos?

We're not talking about typos here.  The curly quotes aren't present on
typical keyboard layouts (though I'm informed they are present on
Finnish keyboards), so nobody who isn't Finnish will type one of these
characters by accident.  We're talking about Emacs itself corrupting
ASCII quotes into curly quotes in a `message' call because of the
default setting of `text-quoting-style', and so on.

Because of this, the error message should concentrate on that quote, not
the strange symbol, which Emacs itself created.

[ .... ]

> > Symbol's value as variable is void: 'аbbrevs-changed.

> Misusing something fundamental as the Lisp reader to catch such stuff
> should be the very last resort.  The result can get much more confusing
> in situations we now don't think about.

Maybe we're already at the last resort for this problem.  Or maybe not.
Maybe an error message for unknown symbols should check for them
beginning with a curly quote.

> > > Lisp doesn't have a bug here.
> > That's a strawman, and you know it.  We are talking about diagnostics
> > for bugs in Lisp programs.

> I think it's an eligible argument.  Drew just thinks it's the wrong fix.
> He may also think that no fix would maybe suffice.  That's ok, and I
> think he made some good points.

> We should discuss alternative approaches to move forward.  People
> often paste stuff into scratch or the M-: prompt that they copied from
> elsewhere.  Maybe we could make M-: and C-x C-e check for this problem.
> These could also check for other, similar frequent problems.  Any better
> suggestions?

I think that's a good suggestion.

> Michael.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  1:16       ` Michael Heerdegen
  2018-02-04  1:25         ` Clément Pit-Claudel
  2018-02-04 11:15         ` Alan Mackenzie
@ 2018-02-04 14:47         ` Noam Postavsky
  2 siblings, 0 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-02-04 14:47 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Eli Zaretskii, Drew Adams, Emacs developers

On Sat, Feb 3, 2018 at 8:16 PM, Michael Heerdegen
<michael_heerdegen@web.de> wrote:

> FWIW, we already modified the Lisp reader to catch another style issue
> (to get rid of old-style backquotes) and made it error.  It broke my
> stuff (el-search) horribly - though I don't use old-style backquotes,
> and for code that also doesn't use them.

That backquote change made `read' signal errors when reading
subexpressions of otherwise valid code.

The change under discussion changes what is valid code, so you won't
have the problem of getting read errors for valid code.
(changing what is valid Lisp has other drawbacks, as Drew has
repeatedly pointed out)




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04  4:51   ` Paul Eggert
  2018-02-04  9:47     ` Andreas Schwab
@ 2018-02-04 15:04     ` Noam Postavsky
  2018-02-04 17:33       ` Eli Zaretskii
  1 sibling, 1 reply; 98+ messages in thread
From: Noam Postavsky @ 2018-02-04 15:04 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Aaron Ecay, Drew Adams, Emacs developers

On Sat, Feb 3, 2018 at 11:51 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> Aaron Ecay wrote:
>>
>> I was surprised to learn that this is the case, in light of what is
>> said in the Elisp reference about symbol names

      Most symbol names are written with letters, digits, and the
    punctuation characters `-+=*/'.  Such names require no special
    punctuation...

> Good point; thanks. In the spirit of "be strict about what you generate",
> the Emacs printer should escape any character that is not in the list of
> characters documented in the Elisp manual as being safe (i.e., as not
> requiring escaping). This is elementary future-proofing, and is independent
> of whether we want Emacs to warn about or disallow confusable chars in
> symbols.

My impression is that manual passage was written with only ASCII
characters in mind.  But since Emacs has allowed Unicode characters in
symbol names for a long time now, I don't think we should all of a
sudden declare "letters" to mean just [a-zA-Z].




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 10:34           ` Alan Third
@ 2018-02-04 15:36             ` Clément Pit-Claudel
  2018-02-04 17:37               ` Eli Zaretskii
  0 siblings, 1 reply; 98+ messages in thread
From: Clément Pit-Claudel @ 2018-02-04 15:36 UTC (permalink / raw)
  To: Alan Third
  Cc: Michael Heerdegen, Eli Zaretskii, npostavs, Drew Adams,
	emacs-devel

On 2018-02-04 05:34, Alan Third wrote:
> Symbol's value as variable is void: 'аbbrevs-changed
> Did you mean `abbrevs-changed'?
> Symbol contains `а' (CYRILLIC SMALL LETTER A) at character 0, did you
> mean `a' (LATIN SMALL LETTER A)?

That would be pretty cool.

> The middle line would require Emacs to do a fuzzy search for similar
> symbols, which may be too much.

OCaml does this (but at compile time).  Do we have a way to delay the fuzzy search to the point when the error message is displayed?  Otherwise we'll pay the price of the search even if the error is then swallowed by a condition-case.




* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 11:15         ` Alan Mackenzie
@ 2018-02-04 15:54           ` Drew Adams
  0 siblings, 0 replies; 98+ messages in thread
From: Drew Adams @ 2018-02-04 15:54 UTC (permalink / raw)
  To: Alan Mackenzie, Michael Heerdegen; +Cc: Eli Zaretskii, emacs-devel, npostavs

> We're not talking about typos here.  The curly quotes aren't present on
> typical keyboard layouts (though I'm informed they are present on
> Finnish keyboards), so nobody who isn't Finnish will type one of these
> characters by accident.  We're talking about Emacs itself corrupting
> ASCII quotes into curly quotes in a `message' call because of the
> default setting of `text-quoting-style', and so on.
> 
> Because of this, the error message should concentrate on that quote, not
> the strange symbol, which Emacs itself created.

Not necessarily.  Although I share your concern about
Emacs promulgating curly quotes, there is a real usage
problem akin to "typos": users copying text, including
Lisp code, from a web page or elsewhere, and pasting
it into Emacs as code to be evaluated at some point.

If the source of the copy has already changed simple
apostrophe to a curly quote (as one example) then that's
what gets passed to Emacs.  The person copy+pasting
may well have no clue about just which characters are
being copied.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 15:04     ` Noam Postavsky
@ 2018-02-04 17:33       ` Eli Zaretskii
  2018-02-04 19:36         ` Paul Eggert
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-04 17:33 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: aaronecay, eggert, drew.adams, emacs-devel

> From: Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Sun, 4 Feb 2018 10:04:26 -0500
> Cc: Aaron Ecay <aaronecay@gmail.com>, Drew Adams <drew.adams@oracle.com>,
> 	Emacs developers <emacs-devel@gnu.org>
> 
> I don't think we should all of a sudden declare "letters" to mean
> just [a-zA-Z].

We don't: [:alpha:] nowadays matches much more than just [a-zA-Z].

So indeed, restricting Lisp symbols that way sounds too harsh, almost
arbitrary.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 15:36             ` Clément Pit-Claudel
@ 2018-02-04 17:37               ` Eli Zaretskii
  2018-02-04 21:31                 ` Noam Postavsky
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-04 17:37 UTC (permalink / raw)
  To: Clément Pit-Claudel
  Cc: michael_heerdegen, alan, npostavs, drew.adams, emacs-devel

> Cc: Michael Heerdegen <michael_heerdegen@web.de>, Eli Zaretskii
>  <eliz@gnu.org>, emacs-devel@gnu.org, Drew Adams <drew.adams@oracle.com>,
>  npostavs@users.sourceforge.net
> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Sun, 4 Feb 2018 10:36:49 -0500
> 
> > The middle line would require Emacs to do a fuzzy search for similar
> > symbols, which may be too much.
> 
> OCaml does this (but at compile time).  Do we have a way to delay the fuzzy search to the point when the error message is displayed?  Otherwise we'll pay the price of the search even if the error is then swallowed by a condition-case.

Isn't this premature optimization?  We aren't even sure yet that such
a fuzzy search will be too expensive.  We could, for example,
implement the confusables as a char-table, which would make it fast
enough, I think.
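
A rough sketch of the char-table idea, for illustration only: the table
`my-confusables-table', its three entries, and the helper
`my-confusables-in' are made up here; a real table would presumably be
generated from the Unicode confusables data.

```elisp
;; Hypothetical table mapping a confusable character to the ASCII
;; character it resembles.  Every other character maps to nil.
(defvar my-confusables-table
  (let ((table (make-char-table 'my-confusables)))
    (aset table #x2018 ?\')   ; LEFT SINGLE QUOTATION MARK
    (aset table #x2019 ?\')   ; RIGHT SINGLE QUOTATION MARK
    (aset table #x0430 ?a)    ; CYRILLIC SMALL LETTER A
    table)
  "Illustrative char-table of confusable characters.")

(defun my-confusables-in (name)
  "Return a list of (POSITION CHAR LOOKALIKE) for confusables in NAME."
  (let ((pos 0)
        hits)
    (dolist (ch (string-to-list name))
      ;; One `aref' per character, so the cost is linear in the
      ;; length of the symbol name.
      (let ((lookalike (aref my-confusables-table ch)))
        (when lookalike
          (push (list pos ch lookalike) hits)))
      (setq pos (1+ pos)))
    (nreverse hits)))
```

For an all-ASCII name such as "abbrevs-changed" every lookup yields
nil, so the function returns nil and no hint would be shown.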




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 17:33       ` Eli Zaretskii
@ 2018-02-04 19:36         ` Paul Eggert
  2018-02-04 19:55           ` Philipp Stephani
  2018-02-04 20:10           ` Eli Zaretskii
  0 siblings, 2 replies; 98+ messages in thread
From: Paul Eggert @ 2018-02-04 19:36 UTC (permalink / raw)
  To: Eli Zaretskii, Noam Postavsky; +Cc: aaronecay, drew.adams, emacs-devel

Eli Zaretskii wrote:
> restricting Lisp symbols that way sounds too harsh

OK, I'll omit that patch.

We still have a problem with Emacs Lisp code containing confusable characters 
that can make the code exceedingly hard to review. These characters are 
currently not caught or checked by anything. We really should do better.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 19:36         ` Paul Eggert
@ 2018-02-04 19:55           ` Philipp Stephani
  2018-02-04 20:10           ` Eli Zaretskii
  1 sibling, 0 replies; 98+ messages in thread
From: Philipp Stephani @ 2018-02-04 19:55 UTC (permalink / raw)
  To: Paul Eggert
  Cc: aaronecay, Eli Zaretskii, emacs-devel, drew.adams, Noam Postavsky


Paul Eggert <eggert@cs.ucla.edu> wrote on Sun, Feb 4, 2018 at 20:36:

> Eli Zaretskii wrote:
> > restricting Lisp symbols that way sounds too harsh
>
> OK, I'll omit that patch.
>
> We still have a problem with Emacs Lisp code containing confusable
> characters
> that can make the code exceedingly hard to review. These characters are
> currently not caught or checked by anything. We really should do better.
>
>
The following should be unintrusive and not too hard: Let the reader push
all confusable symbols (i.e. symbols that contain at least one unescaped
character from the Unicode confusables list that maps to a sequence of
ASCII characters) onto an internal dynamic variable. The byte compiler can
then emit warnings if that variable becomes non-nil.



* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 19:36         ` Paul Eggert
  2018-02-04 19:55           ` Philipp Stephani
@ 2018-02-04 20:10           ` Eli Zaretskii
  2018-02-04 20:36             ` Eli Zaretskii
  1 sibling, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-04 20:10 UTC (permalink / raw)
  To: Paul Eggert; +Cc: aaronecay, emacs-devel, drew.adams, npostavs

> Cc: aaronecay@gmail.com, drew.adams@oracle.com, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 4 Feb 2018 11:36:05 -0800
> 
> We still have a problem with Emacs Lisp code containing confusable characters 
> that can make the code exceedingly hard to review. These characters are 
> currently not caught or checked by anything. We really should do better.

I agree that we need a solid solution to that.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 20:10           ` Eli Zaretskii
@ 2018-02-04 20:36             ` Eli Zaretskii
  2018-02-04 20:48               ` Paul Eggert
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-04 20:36 UTC (permalink / raw)
  To: eggert; +Cc: emacs-devel

> Date: Sun, 04 Feb 2018 22:10:25 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: aaronecay@gmail.com, emacs-devel@gnu.org, drew.adams@oracle.com,
> 	npostavs@users.sourceforge.net
> 
> > Cc: aaronecay@gmail.com, drew.adams@oracle.com, emacs-devel@gnu.org
> > From: Paul Eggert <eggert@cs.ucla.edu>
> > Date: Sun, 4 Feb 2018 11:36:05 -0800
> > 
> > We still have a problem with Emacs Lisp code containing confusable characters 
> > that can make the code exceedingly hard to review. These characters are 
> > currently not caught or checked by anything. We really should do better.
> 
> I agree that we need a solid solution to that.

How about if we start by warning about any Lisp symbol whose name uses
characters from more than one script for non-punctuation characters?
puny.el has a function that solves a similar problem, which could be
used as a starting point.

We could make this an opt-in feature if the warnings are deemed to be
a potential annoyance.
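
A minimal sketch of such a check, assuming `char-script-table' and the
`general-category' character property are good enough to tell scripts
and letters apart.  The function names are hypothetical, and this is
not the puny.el function.

```elisp
(defun my-symbol-name-scripts (name)
  "Return the distinct scripts used by the letters in NAME.
Scripts are listed in order of first appearance.  Digits and
punctuation are ignored."
  (let (scripts)
    (dolist (ch (string-to-list name))
      ;; Only look at characters whose Unicode general category is
      ;; one of the letter categories.
      (when (memq (get-char-code-property ch 'general-category)
                  '(Lu Ll Lt Lm Lo))
        (let ((script (aref char-script-table ch)))
          (unless (memq script scripts)
            (push script scripts)))))
    (nreverse scripts)))

(defun my-mixed-script-symbol-p (name)
  "Return non-nil if the letters in NAME come from more than one script."
  (> (length (my-symbol-name-scripts name)) 1))
```

With a name that starts with CYRILLIC SMALL LETTER A and continues in
ASCII, `my-symbol-name-scripts' reports both `cyrillic' and `latin', so
the predicate would trigger a warning.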




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 20:36             ` Eli Zaretskii
@ 2018-02-04 20:48               ` Paul Eggert
  2018-02-04 20:59                 ` Clément Pit-Claudel
  0 siblings, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-02-04 20:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:
> How about if we start by warning about any Lisp symbol whose name uses
> characters from more than one script for non-punctuation characters?

This problem can occur even in one-character symbols. It might be better to 
establish a default script for the file, and warn about any characters from a 
different script.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 20:48               ` Paul Eggert
@ 2018-02-04 20:59                 ` Clément Pit-Claudel
  0 siblings, 0 replies; 98+ messages in thread
From: Clément Pit-Claudel @ 2018-02-04 20:59 UTC (permalink / raw)
  To: emacs-devel

On 2018-02-04 15:48, Paul Eggert wrote:
> Eli Zaretskii wrote:
>> How about if we start by warning about any Lisp symbol whose name uses
>> characters from more than one script for non-punctuation characters?
> 
> This problem can occur even in one-character symbols. It might be better to establish a default script for the file, and warn about any characters from a different script.

We could also default to warning about any characters in the confusables list and not in ascii.  And we'd make it easy to turn the check off using a file-local variable.
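
Something along these lines, as a sketch only: `my-confusable-chars' is a three-element stand-in for the real confusables list, and `my-check-confusables' is a hypothetical switch that a file could set to nil as a file-local variable.

```elisp
(defvar my-confusable-chars '(#x2018 #x2019 #x0430)
  "Illustrative stand-in for the non-ASCII part of the confusables list.")

(defvar my-check-confusables t
  "When nil, skip the confusables check (e.g. set as a file-local variable).")

(defun my-confusable-name-p (name)
  "Return non-nil if NAME contains a character from `my-confusable-chars'."
  (delq nil (mapcar (lambda (ch) (memq ch my-confusable-chars)) name)))

(defun my-confusable-symbols (form)
  "Return the symbols in FORM whose names contain confusable characters.
Return nil when `my-check-confusables' is nil."
  (cond
   ((not my-check-confusables) nil)
   ((symbolp form)
    (and (my-confusable-name-p (symbol-name form))
         (list form)))
   ((consp form)
    ;; Walk both halves of the cons so nested and dotted lists work.
    (append (my-confusable-symbols (car form))
            (my-confusable-symbols (cdr form))))))
```

A byte-compiler warning could then be emitted for each symbol returned; strings and vectors are deliberately left alone in this sketch.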





* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-04 17:37               ` Eli Zaretskii
@ 2018-02-04 21:31                 ` Noam Postavsky
  0 siblings, 0 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-02-04 21:31 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Michael Heerdegen, Clément Pit-Claudel, Alan Third,
	Drew Adams, Emacs developers


On Sun, Feb 4, 2018 at 12:37 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
>> Date: Sun, 4 Feb 2018 10:36:49 -0500
>>
>> > The middle line would require Emacs to do a fuzzy search for similar
>> > symbols, which may be too much.
>>
>> OCaml does this (but at compile time).  Do we have a way to delay the fuzzy search to the point when the error message is displayed?  Otherwise we'll pay the price of the search even if the error is then swallowed by a condition-case.
>
> Isn't this premature optimization?

I think the check fits nicely into command-error-default-function
though. Attaching a quick proof-of-concept (handles only a single
curved quote at the beginning of symbol name). We would want something
also for the byte-compiler.
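
The same check can be sketched in Lisp, under the assumption that the
error data has the usual (void-variable SYMBOL) shape.  The function
name is hypothetical, and like the attached patch it only handles a
single curved quote at the start of the name.

```elisp
(defun my-strange-quote-hint (data)
  "Return a hint string if DATA is a void-variable error caused by a
symbol whose name starts with RIGHT SINGLE QUOTATION MARK, else nil."
  (when (eq (car-safe data) 'void-variable)
    (let ((sym (car-safe (cdr data))))
      (when (symbolp sym)
        (let ((name (symbol-name sym)))
          (when (and (> (length name) 0)
                     (eq (aref name 0) #x2019))
            ;; Plain `format' is used so the quotes in the hint are
            ;; not themselves translated.
            (format "Symbol starts with %c (%s), did you mean %c (%s)?"
                    #x2019 "RIGHT SINGLE QUOTATION MARK"
                    ?\' "APOSTROPHE")))))))
```

Because the function only runs when an error is actually being
reported, an error swallowed by a condition-case would not pay for it.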

[-- Attachment #2: v1-0001-sketch-Catch-strange-quotes-on-error-time.patch --]
[-- Type: text/x-diff, Size: 3166 bytes --]

From c9d1e761cea56e94d9ad3d783c8ed7fcf448b082 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 4 Feb 2018 16:20:32 -0500
Subject: [PATCH v1] [sketch] Catch strange quotes on error time

* src/keyboard.c (Fcommand_error_default_function): Check for RIGHT
SINGLE QUOTATION MARK and give a more detailed message.  TODO: check
for other confusables.
* src/lread.c (read1): Don't signal error on strange quotes.
* src/eval.c (Fsetq): Pass full arglist in error data.  TODO: the same
for all the other Qwrong_number_of_arguments cases.
---
 src/eval.c     |  3 ++-
 src/keyboard.c | 20 ++++++++++++++++++++
 src/lread.c    |  7 -------
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/src/eval.c b/src/eval.c
index 7db4dbcf18..db61c0421f 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -507,7 +507,8 @@ DEFUN ("setq", Fsetq, Ssetq, 0, UNEVALLED, 0,
       Lisp_Object sym = XCAR (tail), lex_binding;
       tail = XCDR (tail);
       if (!CONSP (tail))
-	xsignal2 (Qwrong_number_of_arguments, Qsetq, make_number (nargs + 1));
+        xsignal3 (Qwrong_number_of_arguments, Qsetq,
+                  make_number (nargs + 1), args);
       Lisp_Object arg = XCAR (tail);
       tail = XCDR (tail);
       val = eval_sub (arg);
diff --git a/src/keyboard.c b/src/keyboard.c
index 4324991da4..24c5f66934 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -1047,6 +1047,26 @@ DEFUN ("command-error-default-function", Fcommand_error_default_function,
       bitch_at_user ();
 
       print_error_message (data, Qt, SSDATA (context), signal);
+
+      Lisp_Object errname = Fcar (data);
+      /* TODO: Add arglist to Qwrong_number_of_arguments errors, and
+         check those too.  */
+      if (EQ (errname, Qvoid_variable))
+        {
+          Lisp_Object void_symname = Fsymbol_name (Fnth (make_number (1), data));
+          if (SCHARS (void_symname) > 0 &&
+              /* TODO: check all confusables.  */
+              EQ (Faref (void_symname, make_number (0)), make_number (0x2019)))
+            {
+              Lisp_Object msg = CALLN
+                (Fformat_message,
+                 build_string ("\nSymbol starts with `%c' (%s) at character 0,"
+                               " did you mean `%c' (%s)?"),
+                 make_number (0x2019), build_string ("RIGHT SINGLE QUOTATION MARK"),
+                 make_number ('\''), build_string ("APOSTROPHE"));
+              Fprinc (msg, Qt);
+            }
+        }
     }
   return Qnil;
 }
diff --git a/src/lread.c b/src/lread.c
index 3b0a17c90b..ee08902f81 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3470,13 +3470,6 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
 	    if (! NILP (result))
 	      return unbind_to (count, result);
 	  }
-        if (!quoted && multibyte)
-          {
-            int ch = STRING_CHAR ((unsigned char *) read_buffer);
-            if (confusable_symbol_character_p (ch))
-              xsignal2 (Qinvalid_read_syntax, build_string ("strange quote"),
-                        CALLN (Fstring, make_number (ch)));
-          }
 	{
 	  Lisp_Object result;
 	  ptrdiff_t nbytes = p - read_buffer;
-- 
2.11.0



* Why "symbol's value" error about a list?
  2018-02-03 17:05     ` Eli Zaretskii
  2018-02-04  1:16       ` Michael Heerdegen
  2018-02-04  1:55       ` Drew Adams
@ 2018-02-05  1:06       ` Richard Stallman
  2018-02-05 20:35         ` Alan Mackenzie
  2018-02-06 11:27         ` Noam Postavsky
  2018-02-05  1:06       ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Richard Stallman
  3 siblings, 2 replies; 98+ messages in thread
From: Richard Stallman @ 2018-02-05  1:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, drew.adams, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > For example, suppose you have a Lisp program that produces the
  > following error message when compiled/executed:

  >   Symbol's value as variable is void: 'аbbrevs-changed

Does that error message really happen?  If so, how can I reproduce it?

I understand that the character 'а' is not ASCII a.  That explains why
'аbbrevs-changed' is not known as a variable.  But I'm talking about
a different issue, which has nothing to do with character coding.

Suppose it were 'foobaz', all ASCII, and we got an error such as

  >   Symbol's value as variable is void: 'foobaz

That still seems wrong.

If the error was that foobaz was void, the error message should not
include a quote.  It should say

  >   Symbol's value as variable is void: foobaz

Or if the error was that 'foobaz is used instead of a symbol, the error
message should say

  >   Wrong type argument: symbolp, (quote foobaz)


-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.





* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-03 17:05     ` Eli Zaretskii
                         ` (2 preceding siblings ...)
  2018-02-05  1:06       ` Why "symbol's value" error about a list? Richard Stallman
@ 2018-02-05  1:06       ` Richard Stallman
  3 siblings, 0 replies; 98+ messages in thread
From: Richard Stallman @ 2018-02-05  1:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, drew.adams, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > We want to find a way of identifying such situations and telling the
  > Lisp programmer about that in clear and easily understandable ways.
  > One way, perhaps too radical one, is to reject such "confusable"
  > characters outright.

I think that makes sense for Lisp symbols.  Lisp has had strings for
40 years now, so it isn't customary to use symbols to represent
arbitrary text.

We could have a mode where intern converts all these character codes
to a single canonical set, but the default could be to give an error for all
but the preferred one.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.





* Re: Why "symbol's value" error about a list?
  2018-02-05  1:06       ` Why "symbol's value" error about a list? Richard Stallman
@ 2018-02-05 20:35         ` Alan Mackenzie
  2018-02-05 21:46           ` Drew Adams
  2018-02-06 14:51           ` Richard Stallman
  2018-02-06 11:27         ` Noam Postavsky
  1 sibling, 2 replies; 98+ messages in thread
From: Alan Mackenzie @ 2018-02-05 20:35 UTC (permalink / raw)
  To: Richard Stallman; +Cc: Eli Zaretskii, emacs-devel, drew.adams, npostavs

Hello, Richard.

On Sun, Feb 04, 2018 at 20:06:39 -0500, Richard Stallman wrote:
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]

>   > For example, suppose you have a Lisp program that produces the
>   > following error message when compiled/executed:

>   >   Symbol's value as variable is void: 'аbbrevs-changed

> Does that error message really happen?  If so, how can I reproduce it?

In Emacs-25.3 -Q, do

    M-: (message "(setq foo 'bar)") RET

, followed by getting the output from *Messages* into the kill ring with
M-w, followed by

    M-: C-y RET

.  You might think you are executing (setq foo 'bar).  You're not.
You're executing (setq foo ’bar), where the ’ is a Unicode curly quote.

The error message given out is:

    Symbol's value as variable is void: ’bar

.  If you're like me, you will read that as the symbol "bar" is void,
rather than the symbol "’bar" is void.

This is a result of the change in `message', silently to convert ' to a
curly quote, by default.  Some of us were unhappy at this change and
protested against it.
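
The conversion can be seen without going through *Messages*.  The
following is only an illustration: `my-quoting-demo' is a made-up name,
and it calls `format-message', which `message' applies to its format
string.

```elisp
(defun my-quoting-demo (style)
  "Return the sample form as `format-message' renders it under STYLE.
STYLE is a value for `text-quoting-style', such as `curve' or `grave'."
  ;; Binding the variable makes the result independent of the user's
  ;; settings and of what the display can show.
  (let ((text-quoting-style style))
    (format-message "(setq foo 'bar)")))
```

With `curve' the apostrophe comes back as RIGHT SINGLE QUOTATION MARK;
with `grave' the string comes back unchanged.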

> I understand that the character 'а' is not ASCII a.  That explains why
> 'аbbrevs-changed' is not known as a variable.  But I'm talking about
> a different issue, which has nothing to do with character coding.

> Suppose it were 'foobaz', all ASCII, and we got an error such as

>   >   Symbol's value as variable is void: 'foobaz

> That still seems wrong.

Again "’foobaz", not "foobaz" is the symbol, here.

> If the error was that foobaz was void, the error message should not
> include a quote.  It should say

>   >   Symbol's value as variable is void: foobaz

Yes.

> Or if the error was that 'foobaz is used instead of a symbol, the error
> message should say

>   >   Wrong type argument: symbolp, (quote foobaz)

In the recent pretest, Emacs-26.0.91, when a curly quote appears at the
start of a symbol, the reader rejects it, giving the error message:

    read--expression: Invalid read syntax: "strange quote", "'"

.  This is somewhat controversial, and is what the recent discussion
has been about.

> -- 
> Dr Richard Stallman
> President, Free Software Foundation (https://gnu.org, https://fsf.org)
> Internet Hall-of-Famer (https://internethalloffame.org)
> Skype: No way! See https://stallman.org/skype.html.

-- 
Alan Mackenzie (Nuremberg, Germany).




* RE: Why "symbol's value" error about a list?
  2018-02-05 20:35         ` Alan Mackenzie
@ 2018-02-05 21:46           ` Drew Adams
  2018-02-06  4:13             ` Eli Zaretskii
  2018-02-06 14:51           ` Richard Stallman
  1 sibling, 1 reply; 98+ messages in thread
From: Drew Adams @ 2018-02-05 21:46 UTC (permalink / raw)
  To: Alan Mackenzie, Richard Stallman; +Cc: Eli Zaretskii, emacs-devel, npostavs

> >   > For example, suppose you have a Lisp program that produces the
> >   > following error message when compiled/executed:
> >   >   Symbol's value as variable is void: 'аbbrevs-changed
> 
> > Does that error message really happen?  If so, how can I reproduce it?
> 
> In Emacs-25.3 -Q, do M-: (message "(setq foo 'bar)") RET
> followed by getting the output from *Messages* into the kill ring with
> M-w, followed by M-: C-y RET.
> 
> You might think you are executing (setq foo 'bar).  You're not.
> You're executing (setq foo ’bar), where the ’ is a Unicode curly quote.
> 
> The error message given out is:
>     Symbol's value as variable is void: ’bar

That was the old, and legitimate, error message, yes.  It
accurately describes what is really going on (as you describe
well, below).

Now the message is instead (invalid-read-syntax "strange quote"
"’").  Is that better?  That's part of what this discussion is
about.

I suggested that the variable name be enclosed in `...'.  That
would make the original message clearer, I think:

  Symbol's value as variable is void: `’bar'

At least it could make it more likely that you would think about
looking at that quote mark.

> This is a result of the change in `message', silently to convert ' to a
> curly quote, by default.  Some of us were unhappy at this change and
> protested against it.

Count me as one of those "some of us".  Echoing Lisp code
should do just that - no fiddling to "prettify" apostrophe to
curly quote etc.

> > Suppose it were 'foobaz', all ASCII, and we got an error such as
> >   Symbol's value as variable is void: 'foobaz
> > That still seems wrong.

Here's the thing: There _is_ a Lisp error - no doubt.

But for Lisp the error is not that a curly quote was read
as part of a symbol name.  That's not a Lisp error (at
least it has not been, until now.)

The error is using a symbol as a variable, when it is not
defined as a variable.  Which is exactly what the original
error message said.

That's the LISP error.  Is there a _user error_ here?
Yes, it's the mistake of copying and pasting what was
printed in *Messages*.

That user mistake is excusable.  And we would want to
inform the user about it, if we can't prevent it.  But
changing Lisp read syntax to guess what might be the
most helpful thing to tell a user here is NOT the solution.

Should this Lisp syntax change be reverted?  That's the
question being discussed here.

Changing the read syntax is a general, Lisp-level change.
We should instead prevent this user mistake by removing
its cause.

The real error here is (IMO) a design error by Emacs: The
expression read and copied to *Messages* should not have
been "helpfully" translated to use a curly quote instead
of an apostrophe.

Emacs shot Lisp in the foot on this one.  It's not the
fault of Lisp and its reader (syntax).  It's the fault of
some misguided "modernization" of Emacs gone amuck.  Users
should not find the input (setq foo 'bar) transformed to
(setq foo ’bar), i.e., APOSTROPHE replaced by RIGHT SINGLE
QUOTATION MARK.

> Again "’foobaz", not "foobaz" is the symbol, here.

Yes, and that's a legitimate symbol name.  Nothing wrong
with Lisp telling us that that symbol is undefined as a
variable.  That's exactly what the _LISP_ problem is here.

That's just not the symbol that was passed to Lisp
originally.  That's a non-variable symbol name copied from
*Messages*.  The mistake was putting that in *Messages* in
the first place.

Where was the mistake?  Lisp claiming that you used a
symbol as an undefined variable?  The user copying that
symbol name from *Messages* and trying to evaluate its
symbol as a var?  Or Emacs inserting a different symbol
name in *Messages*, by substituting the text "’bar" for
the text "'bar"?

The original Lisp expression was a Lisp expression, not
just text.  A quote mark (apostrophe) in Lisp has special
meaning, special syntax.  That shouldn't be ignored by
some dumb (yes) substitution of curly quotes for straight
quotes.

> > If the error was that foobaz was void, the error message
> > should not include a quote.  It should say
> >   Symbol's value as variable is void: foobaz
> 
> Yes.

No.  Only if `foobaz' were indeed the symbol that was an
undefined variable.  But that's NOT the case here.  The
undefined variable here is the symbol `’foobaz' (from
*Messages*) - it really is.

The underlying mistake took place long before Lisp
evaluation of the pasted sexp.




* Re: Why "symbol's value" error about a list?
  2018-02-05 21:46           ` Drew Adams
@ 2018-02-06  4:13             ` Eli Zaretskii
  2018-02-06  7:32               ` Tim Cross
  2018-02-06 15:45               ` Drew Adams
  0 siblings, 2 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-06  4:13 UTC (permalink / raw)
  To: Drew Adams; +Cc: acm, emacs-devel, rms, npostavs

> Date: Mon, 5 Feb 2018 13:46:38 -0800 (PST)
> From: Drew Adams <drew.adams@oracle.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, npostavs@users.sourceforge.net,
>         emacs-devel@gnu.org
> 
> > The error message given out is:
> >     Symbol's value as variable is void: ’bar
> 
> That was the old, and legitimate, error message, yes.  It
> accurately describes what is really going on (as you describe
> well, below).
> 
> Now the message is instead (invalid-read-syntax "strange quote"
> "’").  Is that better?

I think it's somewhat better, because it talks about "strange quote",
which is a hint for the user about the actual problem.

> I suggested that the variable name be enclosed in `...'.  That
> would make the original message clearer, I think:
> 
>   Symbol's value as variable is void: `’bar'

That might make things even more confusing, because the text actually
displayed will be this:

    Symbol’s value as variable is void: ‘’bar’

which loses all hints of what is being quoted here.

> > This is a result of the change in `message', silently to convert ' to a
> > curly quote, by default.  Some of us were unhappy at this change and
> > protested against it.
> 
> Count me as one of those "some of us".  Echoing Lisp code
> should do just that - no fiddling to "prettify" apostrophe to
> curly quote etc.

That ship has sailed two Emacs releases ago.  We are trying to fix the
fallout.

And strange quotes is only one situation where confusingly similar
characters can be presented in error messages, making it hard for
users to spot the real problem.  We are trying to find ways of making
such "typos" more evident in error messages.

> The error is using a symbol as a variable, when it is not
> defined as a variable.  Which is exactly what the original
> error message said.
> 
> That's the LISP error.  Is there a _user error_ here?
> Yes, it's the mistake of copying and pasting what was
> printed in *Messages*.
> 
> That user mistake is excusable.  And we would want to
> inform the user about it, if we can't prevent it.  But
> changing Lisp read syntax to guess what might be the
> most helpful thing to tell a user here is NOT the solution.

The issue is what _would_ be a helpful message in these cases.  You
are just saying what should _not_ be done (repeatedly), but that
doesn't advance us towards the solution.

> Should this Lisp syntax change be reverted?  That's the
> question being discussed here.

No, that's only part of the question.  The other, no less important
part is if we revert that change, how to make the confusing error
message less so and more helpful in understanding the user error.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-06  4:13             ` Eli Zaretskii
@ 2018-02-06  7:32               ` Tim Cross
  2018-02-06  7:40                 ` Eli Zaretskii
  2018-02-06 15:45                 ` Drew Adams
  2018-02-06 15:45               ` Drew Adams
  1 sibling, 2 replies; 98+ messages in thread
From: Tim Cross @ 2018-02-06  7:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, npostavs, rms, Drew Adams, Emacs developers

It seems there are two issues here - they are not completely separate, but
do seem to be distinct and probably need to be addressed in two steps.

If the statement

> Count me as one of those "some of us".  Echoing Lisp code
> should do just that - no fiddling to "prettify" apostrophe to
> curly quote etc.

is correct, then I would agree it was a bad design decision. The *Messages*
buffer should display lisp code exactly as it is read and not try to
'prettify' it.

The second issue seems to be more about how to make the error message more
informative. I suspect this is a much harder problem to resolve. I don't
know what the right solution is for that, but I do know that I would have
more chance of recognising my error if the message displayed in the buffer
displays the lisp code exactly as it was read by the reader.

Tim


On 6 February 2018 at 15:13, Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Mon, 5 Feb 2018 13:46:38 -0800 (PST)
> > From: Drew Adams <drew.adams@oracle.com>
> > Cc: Eli Zaretskii <eliz@gnu.org>, npostavs@users.sourceforge.net,
> >         emacs-devel@gnu.org
> >
> > > The error message given out is:
> > >     Symbol's value as variable is void: ’bar
> >
> > That was the old, and legitimate, error message, yes.  It
> > accurately describes what is really going on (as you describe
> > well, below).
> >
> > Now the message is instead (invalid-read-syntax "strange quote"
> > "’").  Is that better?
>
> I think it's somewhat better, because it talks about "strange quote",
> which is a hint for the user about the actual problem.
>
> > I suggested that the variable name be enclosed in `...'.  That
> > would make the original message clearer, I think:
> >
> >   Symbol's value as variable is void: `’bar'
>
> That might make things even more confusing, because the text actually
> displayed will be this:
>
>     Symbol’s value as variable is void: ‘’bar’
>
> which loses all hints of what is being quoted here.
>
> > > This is a result of the change in `message', silently to convert ' to a
> > > curly quote, by default.  Some of us were unhappy at this change and
> > > protested against it.
> >
> > Count me as one of those "some of us".  Echoing Lisp code
> > should do just that - no fiddling to "prettify" apostrophe to
> > curly quote etc.
>
> That ship has sailed two Emacs releases ago.  We are trying to fix the
> fallout.
>
> And strange quotes is only one situation where confusingly similar
> characters can be presented in error messages, making it hard for
> users to spot the real problem.  We are trying to find ways of making
> such "typos" more evident in error messages.
>
> > The error is using a symbol as a variable, when it is not
> > defined as a variable.  Which is exactly what the original
> > error message said.
> >
> > That's the LISP error.  Is there a _user error_ here?
> > Yes, it's the mistake of copying and pasting what was
> > printed in *Messages*.
> >
> > That user mistake is excusable.  And we would want to
> > inform the user about it, if we can't prevent it.  But
> > changing Lisp read syntax to guess what might be the
> > most helpful thing to tell a user here is NOT the solution.
>
> The issue is what _would_ be a helpful message in these cases.  You
> are just saying what should _not_ be done (repeatedly), but that
> doesn't advance us towards the solution.
>
> > Should this Lisp syntax change be reverted?  That's the
> > question being discussed here.
>
> No, that's only part of the question.  The other, no less important
> part is if we revert that change, how to make the confusing error
> message less so and more helpful in understanding the user error.
>
>


-- 
regards,

Tim

--
Tim Cross

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-06  7:32               ` Tim Cross
@ 2018-02-06  7:40                 ` Eli Zaretskii
  2018-02-06 15:45                 ` Drew Adams
  1 sibling, 0 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-06  7:40 UTC (permalink / raw)
  To: emacs-devel, Tim Cross; +Cc: acm, Emacs developers, rms, Drew Adams, npostavs

On February 6, 2018 9:32:21 AM GMT+02:00, Tim Cross <theophilusx@gmail.com> wrote:
> It seems there are two issues here - they are not completely separate, but
> do seem to be distinct and probably need to be addressed in two steps.
> 
> If the statement
> 
> > Count me as one of those "some of us".  Echoing Lisp code
> > should do just that - no fiddling to "prettify" apostrophe to
> > curly quote etc.
> 
> is correct, then I would agree it was a bad design decision. The *Messages*
> buffer should display lisp code exactly as it is read and not try to
> 'prettify' it.
> 
> The second issue seems to be more about how to make the error message more
> informative. I suspect this is a much harder problem to resolve. I don't
> know what the right solution is for that, but I do know that I would have
> more chance of recognising my error if the message displayed in the buffer
> displays the lisp code exactly as it was read by the reader.
> 
> Tim
> 
> 
> On 6 February 2018 at 15:13, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > > Date: Mon, 5 Feb 2018 13:46:38 -0800 (PST)
> > > From: Drew Adams <drew.adams@oracle.com>
> > > Cc: Eli Zaretskii <eliz@gnu.org>, npostavs@users.sourceforge.net,
> > >         emacs-devel@gnu.org
> > >
> > > > The error message given out is:
> > > >     Symbol's value as variable is void: ’bar
> > >
> > > That was the old, and legitimate, error message, yes.  It
> > > accurately describes what is really going on (as you describe
> > > well, below).
> > >
> > > Now the message is instead (invalid-read-syntax "strange quote"
> > > "’").  Is that better?
> >
> > I think it's somewhat better, because it talks about "strange quote",
> > which is a hint for the user about the actual problem.
> >
> > > I suggested that the variable name be enclosed in `...'.  That
> > > would make the original message clearer, I think:
> > >
> > >   Symbol's value as variable is void: `’bar'
> >
> > That might make things even more confusing, because the text actually
> > displayed will be this:
> >
> >     Symbol’s value as variable is void: ‘’bar’
> >
> > which loses all hints of what is being quoted here.
> >
> > > > This is a result of the change in `message', silently to convert ' to a
> > > > curly quote, by default.  Some of us were unhappy at this change and
> > > > protested against it.
> > >
> > > Count me as one of those "some of us".  Echoing Lisp code
> > > should do just that - no fiddling to "prettify" apostrophe to
> > > curly quote etc.
> >
> > That ship has sailed two Emacs releases ago.  We are trying to fix the
> > fallout.
> >
> > And strange quotes is only one situation where confusingly similar
> > characters can be presented in error messages, making it hard for
> > users to spot the real problem.  We are trying to find ways of making
> > such "typos" more evident in error messages.
> >
> > > The error is using a symbol as a variable, when it is not
> > > defined as a variable.  Which is exactly what the original
> > > error message said.
> > >
> > > That's the LISP error.  Is there a _user error_ here?
> > > Yes, it's the mistake of copying and pasting what was
> > > printed in *Messages*.
> > >
> > > That user mistake is excusable.  And we would want to
> > > inform the user about it, if we can't prevent it.  But
> > > changing Lisp read syntax to guess what might be the
> > > most helpful thing to tell a user here is NOT the solution.
> >
> > The issue is what _would_ be a helpful message in these cases.  You
> > are just saying what should _not_ be done (repeatedly), but that
> > doesn't advance us towards the solution.
> >
> > > Should this Lisp syntax change be reverted?  That's the
> > > question being discussed here.
> >
> > No, that's only part of the question.  The other, no less important
> > part is if we revert that change, how to make the confusing error
> > message less so and more helpful in understanding the user error.
> >
> >

Lisp code is not changed in messages; only quoted plain text is.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-05  1:06       ` Why "symbol's value" error about a list? Richard Stallman
  2018-02-05 20:35         ` Alan Mackenzie
@ 2018-02-06 11:27         ` Noam Postavsky
  2018-02-06 14:53           ` Richard Stallman
  2018-02-06 18:52           ` Eli Zaretskii
  1 sibling, 2 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-02-06 11:27 UTC (permalink / raw)
  To: Richard Stallman; +Cc: Eli Zaretskii, Drew Adams, Emacs developers

On Sun, Feb 4, 2018 at 8:06 PM, Richard Stallman <rms@gnu.org> wrote:

>   > For example, suppose you have a Lisp program that produces the
>   > following error message when compiled/executed:
>
>   >   Symbol's value as variable is void: 'аbbrevs-changed
>
> Does that error message really happen?  If so, how can I reproduce it?

I don't think there is a way to get this particular message (with an
ascii apostrophe and cyrillic a).



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-05 20:35         ` Alan Mackenzie
  2018-02-05 21:46           ` Drew Adams
@ 2018-02-06 14:51           ` Richard Stallman
  1 sibling, 0 replies; 98+ messages in thread
From: Richard Stallman @ 2018-02-06 14:51 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: eliz, emacs-devel, drew.adams, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > >   >   Symbol's value as variable is void: 'foobaz

  > > That still seems wrong.

  > Again "’foobaz", not "foobaz" is the symbol, here.

This makes sense, as an issue about quotes.

But why, then, did the other message talk about using a confusable
letter and say that was the ONLY problem?

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-06 11:27         ` Noam Postavsky
@ 2018-02-06 14:53           ` Richard Stallman
  2018-02-06 18:59             ` Eli Zaretskii
  2018-02-06 18:52           ` Eli Zaretskii
  1 sibling, 1 reply; 98+ messages in thread
From: Richard Stallman @ 2018-02-06 14:53 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: eliz, drew.adams, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > >   > For example, suppose you have a Lisp program that produces the
  > >   > following error message when compiled/executed:
  > >
  > >   >   Symbol's value as variable is void: 'аbbrevs-changed
  > >
  > > Does that error message really happen?  If so, how can I reproduce it?

  > I don't think there is a way to get this particular message (with an
  > ascii apostrophe and cyrillic a).

So it was purely hypothetical?

In that case, I wish the person who wrote that had made it clear
it was not a real, existing problem.  The failure to do this
made us waste our time.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 98+ messages in thread

* RE: Why "symbol's value" error about a list?
  2018-02-06  4:13             ` Eli Zaretskii
  2018-02-06  7:32               ` Tim Cross
@ 2018-02-06 15:45               ` Drew Adams
  2018-02-06 19:17                 ` Eli Zaretskii
  1 sibling, 1 reply; 98+ messages in thread
From: Drew Adams @ 2018-02-06 15:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, rms, npostavs

> > > The error message given out is:
> > >     Symbol's value as variable is void: ’bar
> >
> > That was the old, and legitimate, error message, yes.  It
> > accurately describes what is really going on (as you describe
> > well, below).
> >
> > Now the message is instead (invalid-read-syntax "strange quote"
> > "’").  Is that better?
> 
> I think it's somewhat better, because it talks about "strange quote",
> which is a hint for the user about the actual problem.

The actual problem is the use of a non-variable symbol
as a variable.  At least that has been the problem in
this example, until the recent change in Lisp syntax.

There's no problem using a symbol as a variable if its
name is ’bar.  You've just made it necessary now to
escape that curly quote when defining and using the
symbol:

 (defvar \’bar 42 "...")

And if a variable in fact has that name you still raise
a Lisp read-syntax error if the quote is not escaped.
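
To make the escaped spelling concrete, here is a minimal sketch.  The
helper name `demo-read-symbol-name' is hypothetical, made up for this
illustration; it only calls `read' and `symbol-name'.  It relies on
the backslash escape, which, as noted at the start of this thread,
also works in earlier Emacs versions.

```elisp
;; Hypothetical helper (not part of Emacs): read one Lisp object from
;; the string SOURCE and return the name of the symbol it denotes.
(defun demo-read-symbol-name (source)
  "Read one Lisp object from SOURCE and return its symbol name."
  (symbol-name (read source)))

;; The source text \’bar (backslash, U+2019, b, a, r) reads as a symbol
;; whose name begins with the curved quote itself.
(demo-read-symbol-name "\\’bar")

;; The escaped spelling denotes the same interned symbol as this one.
(eq (read "\\’bar") (intern "’bar"))
```

The backslash is only part of the read syntax; it is not part of the
symbol's name.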

> > I suggested that the variable name be enclosed in `...'.  That
> > would make the original message clearer, I think:
> >   Symbol's value as variable is void: `’bar'
> 
> That might make things even more confusing, because the text actually
> displayed will be this:
>     Symbol’s value as variable is void: ‘’bar’
> which loses all hints of what is being quoted here.

I wrote `’bar'.

> > > This is a result of the change in `message', silently to
> > > convert ' to a curly quote, by default.  Some of us were
> > > unhappy at this change and protested against it.
> >
> > Count me as one of those "some of us".  Echoing Lisp code
> > should do just that - no fiddling to "prettify" apostrophe to
> > curly quote etc.
> 
> That ship has sailed two Emacs releases ago.  We are trying to
> fix the fallout.

Two releases ago and still reaping the fallout rewards...
Time to call back that ship or try to redirect it?

> And strange quotes is only one situation where confusingly similar
> characters can be presented in error messages, making it hard for
> users to spot the real problem.  We are trying to find ways of making
> such "typos" more evident in error messages.

Where's the error in  (defvar ’bar 42 "...")?
You've introduced Lisp read errors where there were
none.

The error here is the automatic translation of a Lisp
sexp that uses an ordinary quote mark (apostrophe) to
a curly quote by `message' (?), so that the wrong sexp
gets logged to *Messages*.

The second error is trying to fix that error by changing
Lisp syntax so that an error is raised, instead of just
(optionally) displaying a warning message.  There's no
reason to stop Lisp evaluation just because we want to
inform a user about a possible misunderstanding (gotcha).

> > The error is using a symbol as a variable, when it is not
> > defined as a variable.  Which is exactly what the original
> > error message said.
> >
> > That's the LISP error.  Is there a _user error_ here?
> > Yes, it's the mistake of copying and pasting what was
> > printed in *Messages*.
> >
> > That user mistake is excusable.  And we would want to
> > inform the user about it, if we can't prevent it.  But
> > changing Lisp read syntax to guess what might be the
> > most helpful thing to tell a user here is NOT the solution.
> 
> The issue is what _would_ be a helpful message in these cases.  You
> are just saying what should _not_ be done (repeatedly), but that
> doesn't advance us towards the solution.

I've said (repeatedly, as you like to repeat) that we can
display all the warnings you like.  What we should not do
is change Lisp syntax to raise an artificial error.

There is no Lisp error in evaluating (setq ’bar 42),
regardless of how or why someone might do that.  It's
fine to let someone know that s?he did it, pointing
to the curly quote.  It's wrong to raise a Lisp error.

> > Should this Lisp syntax change be reverted?  That's the
> > question being discussed here.
> 
> No, that's only part of the question.  The other, no less important
> part is if we revert that change, how to make the confusing error
> message less so and more helpful in understanding the user error.

Agreed.  The first step is to revert the change in Lisp
syntax.  The second step is to design aids for users to
recognize such gotchas.

The zeroth step is to realize that the Lisp change should
be reverted.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* RE: Why "symbol's value" error about a list?
  2018-02-06  7:32               ` Tim Cross
  2018-02-06  7:40                 ` Eli Zaretskii
@ 2018-02-06 15:45                 ` Drew Adams
  1 sibling, 0 replies; 98+ messages in thread
From: Drew Adams @ 2018-02-06 15:45 UTC (permalink / raw)
  To: Tim Cross, Eli Zaretskii; +Cc: acm, npostavs, rms, Emacs developers

> It seems there are two issues here - they are not completely
> separate, but do seem to be distinct and probably need to be
> addressed in two steps. 
>
> If the statement 
>
> > Count me as one of those "some of us".  Echoing Lisp code
> > should do just that - no fiddling to "prettify" apostrophe to
> > curly quote etc.
>
> is correct, then I would agree it was a bad design decision.
> The *Messages* buffer should display lisp code exactly as it
> is read and not try to 'prettify' it.

Yes.

> The second issue seems to be more about how to make the error
> message more informative.

Yes, but it's not just about that error message.  That Lisp
error is about an undefined variable.  But there are plenty
of other contexts where users can be confused by such a
gotcha.

If code or a user did in fact define a variable named ’bar
then there would be no Lisp error (prior to the recent change).

Such a symbol name could nevertheless be confusing in some
contexts.  But it's not about how best to present that
undefined-variable error message.

That message was telling the truth, even if in the particular
context presented it might not be immediately clear to a user
what the undefined symbol name is (i.e., that the name contains
a curly quote).

> I suspect this is a much harder problem to resolve. I don't
> know what the right solution is for that, but I do know that
> I would have more chance of recognising my error if the
> message displayed in the buffer displays the lisp code
> exactly as it was read by the reader.

Precisely.

That Lisp error really is about using symbol `’bar' as
a variable.  Such code can exist in different contexts,
only some of which have anything to do with a user
mistaking a curly quote for an apostrophe.

Just as some Lisp code can mistakenly use symbol `abc'
as a variable apart from any binding of it as a variable,
and so provoking the undefined-variable error, so can
code mistakenly use symbol `’bar' as an undefined variable.
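
A rough sketch of that parallel, under the assumption that neither
symbol below has a value in your session.  The helper name
`demo-void-variable-error' and both symbol names are hypothetical.
The symbol is built with `intern' so that the reader change under
discussion does not come into play.

```elisp
;; Hypothetical helper (not part of Emacs): evaluate SYMBOL as a
;; variable and return the error object if it is void, nil otherwise.
(defun demo-void-variable-error (symbol)
  "Return the `void-variable' error object for SYMBOL, or nil if bound."
  (condition-case err
      (progn (symbol-value symbol) nil)
    (void-variable err)))

;; An ordinary unbound symbol and one whose name starts with U+2019
;; signal the same condition; only the symbol in the error data differs.
(demo-void-variable-error 'demo-unbound-abc)
(demo-void-variable-error (intern "’demo-unbound"))
```

In both cases the error object is a list whose car is `void-variable'
and whose second element is the offending symbol.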

Such a context would have nothing to do with the newly
fabricated error (invalid-read-syntax "strange quote" "’").
That's just the wrong error for Lisp to raise here.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-06 11:27         ` Noam Postavsky
  2018-02-06 14:53           ` Richard Stallman
@ 2018-02-06 18:52           ` Eli Zaretskii
  1 sibling, 0 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-06 18:52 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: rms, drew.adams, emacs-devel

> From: Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Tue, 6 Feb 2018 06:27:33 -0500
> Cc: Eli Zaretskii <eliz@gnu.org>, Drew Adams <drew.adams@oracle.com>, 
> 	Emacs developers <emacs-devel@gnu.org>
> 
> On Sun, Feb 4, 2018 at 8:06 PM, Richard Stallman <rms@gnu.org> wrote:
> 
> >   > For example, suppose you have a Lisp program that produces the
> >   > following error message when compiled/executed:
> >
> >   >   Symbol's value as variable is void: 'аbbrevs-changed
> >
> > Does that error message really happen?  If so, how can I reproduce it?
> 
> I don't think there is a way to get this particular message (with an
> ascii apostrophe and cyrillic a).

The point I wanted to make stands if you remove the apostrophe.
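
A minimal sketch of the confusable-letter case without the
apostrophe.  The helper name `demo-confusable-report' is hypothetical;
it replaces the first letter of a name with CYRILLIC SMALL LETTER A
(U+0430) and compares the two resulting symbols.

```elisp
;; Hypothetical helper (not part of Emacs): compare LATIN-NAME with a
;; copy whose first character is replaced by U+0430.
(defun demo-confusable-report (latin-name)
  "Return (SAME-STRING SAME-SYMBOL LATIN-BOUND CYRILLIC-BOUND) for LATIN-NAME."
  (let ((cyrillic-name (concat (string #x430) (substring latin-name 1))))
    (list (string= latin-name cyrillic-name)
          (eq (intern latin-name) (intern cyrillic-name))
          (boundp (intern latin-name))
          (boundp (intern cyrillic-name)))))

;; The two names look alike on screen but are different strings and
;; different symbols; only the Latin one is expected to be a variable.
(demo-confusable-report "abbrevs-changed")
```

Evaluating the Cyrillic spelling as a variable would then give the
void-variable message, with nothing in it to show which letter is the
odd one out.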



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-06 14:53           ` Richard Stallman
@ 2018-02-06 18:59             ` Eli Zaretskii
  2018-02-07  2:40               ` Richard Stallman
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-06 18:59 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel, drew.adams, npostavs

> From: Richard Stallman <rms@gnu.org>
> CC: eliz@gnu.org, drew.adams@oracle.com, emacs-devel@gnu.org
> Date: Tue, 06 Feb 2018 09:53:16 -0500
> 
>   > I don't think there is a way to get this particular message (with an
>   > ascii apostrophe and cyrillic a).
> 
> So it was purely hypothetical?
> 
> In that case, I wish the person who wrote that had made it clear
> it was not a real, existing problem.

That person did:

  For example, suppose you have a Lisp program that produces the
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  following error message when compiled/executed:

    Symbol's value as variable is void: 'аbbrevs-changed



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-06 15:45               ` Drew Adams
@ 2018-02-06 19:17                 ` Eli Zaretskii
  0 siblings, 0 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-06 19:17 UTC (permalink / raw)
  To: Drew Adams; +Cc: acm, emacs-devel, rms, npostavs

> Date: Tue, 6 Feb 2018 07:45:55 -0800 (PST)
> From: Drew Adams <drew.adams@oracle.com>
> Cc: acm@muc.de, rms@gnu.org, npostavs@users.sourceforge.net,
>         emacs-devel@gnu.org
> 
> > > > The error message given out is:
> > > >     Symbol's value as variable is void: ’bar
> > >
> > > That was the old, and legitimate, error message, yes.  It
> > > accurately describes what is really going on (as you describe
> > > well, below).
> > >
> > > Now the message is instead (invalid-read-syntax "strange quote"
> > > "’").  Is that better?
> > 
> > I think it's somewhat better, because it talks about "strange quote",
> > which is a hint for the user about the actual problem.
> 
> The actual problem is the use of a non-variable symbol
> as a variable.

That's one possibility, yes.  But a much more probable possibility is
that the user mistyped the quote, either because she copy/pasted it
from some text, or because she turned on the Electric Quote mode, or
for some other reason.

A useful error message should consider this latter probable cause and
help the user correct it, if indeed that was the reason.  Many tools
do similar second-guessing for frequent mistakes.  For example, GNU
Make detects when a line in a Makefile starts with 8 SPC characters
instead of a mandatory TAB, and says:

 *** missing separator (did you mean TAB instead of 8 spaces?).

The first part is the "dumb" error message, based on the syntax error,
the part in parentheses is a helpful hint for the user, based on many
such user errors seen in the past.

Latest versions of GCC also provide similar hints.
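
To illustrate the shape of such a message, here is a rough sketch,
not a proposal for the actual implementation.  The function name
`demo-void-variable-hint' and the wording of the hint are made up for
this example.

```elisp
;; Hypothetical helper (not part of Emacs): build the plain
;; void-variable text for SYMBOL and append a parenthesized hint when
;; the symbol's name starts with a curved single quote.
(defun demo-void-variable-hint (symbol)
  "Return void-variable text for SYMBOL, with a hint for a leading curved quote."
  (let* ((name (symbol-name symbol))
         ;; `format' (unlike `format-message') leaves quotes untouched.
         (base (format "Symbol's value as variable is void: %s" name)))
    (if (and (> (length name) 0)
             ;; U+2018 and U+2019 are the left and right curved quotes.
             (memq (aref name 0) '(#x2018 #x2019)))
        (concat base " (did you mean ' instead of a curved quote?)")
      base)))

(demo-void-variable-hint (intern "’bar"))  ; "dumb" part plus the hint
(demo-void-variable-hint 'bar)             ; "dumb" part only
```

The first part stays the same as today; the parenthesized part is
only added for the suspicious case.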

> You've just made it necessary now to escape that curly quote when
> defining and using the symbol:

You are changing the subject.  I just wrote that an error message
which mentions "strange quotes" is somewhat better than one which just
states the syntax error.  I said nothing about anything else.

> > >   Symbol's value as variable is void: `’bar'
> > 
> > That might make things even more confusing, because the text actually
> > displayed will be this:
> >     Symbol’s value as variable is void: ‘’bar’
> > which loses all hints of what is being quoted here.
> 
> I wrote `’bar'.

Yes, but the Lisp function 'message' has its own ideas regarding
quoting text, as you well know.  Try evaluating this:

  (message "Symbol's value as variable is void: `%s'" "’bar")
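
A small sketch of the same thing that returns the string instead of
echoing it, assuming an Emacs that has `format-message' and
`text-quoting-style' (Emacs 25 or later).  The helper name
`demo-void-message' is hypothetical.

```elisp
;; Hypothetical helper (not part of Emacs): format the message text
;; for NAME with `text-quoting-style' bound to STYLE.
(defun demo-void-message (style name)
  "Return the void-variable text for NAME under quoting STYLE."
  (let ((text-quoting-style style))
    ;; Only quotes in the format string are translated; the %s
    ;; argument is inserted as is.
    (format-message "Symbol's value as variable is void: `%s'" name)))

(demo-void-message 'curve "’bar")
;; => "Symbol’s value as variable is void: ‘’bar’"
(demo-void-message 'grave "’bar")
;; => "Symbol's value as variable is void: `’bar'"
```

With the `curve' style the quote that belongs to the symbol's name
sits directly next to the quote added by the format string.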

> > That ship has sailed two Emacs releases ago.  We are trying to
> > fix the fallout.
> 
> Two releases ago and still reaping the fallout rewards...
> Time to call back that ship or try to redirect it?

You can try fighting this Quixotic battle on and on, but I don't
recommend that.

> You've introduced Lisp read errors where there were
> none.

No, I didn't do anything of the kind.

> > > Should this Lisp syntax change be reverted?  That's the
> > > question being discussed here.
> > 
> > No, that's only part of the question.  The other, no less important
> > part is if we revert that change, how to make the confusing error
> > message less so and more helpful in understanding the user error.
> 
> Agreed.  The first step is to revert the change in Lisp
> syntax.  The second step is to design aids for users to
> recognize such gotchas.

I think both steps should be made together, otherwise we would be
making a change for the worse.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-06 18:59             ` Eli Zaretskii
@ 2018-02-07  2:40               ` Richard Stallman
  2018-02-07  3:42                 ` Eli Zaretskii
  0 siblings, 1 reply; 98+ messages in thread
From: Richard Stallman @ 2018-02-07  2:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, drew.adams, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > In that case, I wish the person who wrote that had made it clear
  > > it was not a real, existing problem.

  > That person did:

  >   For example, suppose you have a Lisp program that produces the
  >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  >   following error message when compiled/executed:

Those words do NOT say that this is an unreal example.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Why "symbol's value" error about a list?
  2018-02-07  2:40               ` Richard Stallman
@ 2018-02-07  3:42                 ` Eli Zaretskii
  0 siblings, 0 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-02-07  3:42 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel, drew.adams, npostavs

> From: Richard Stallman <rms@gnu.org>
> CC: npostavs@users.sourceforge.net, drew.adams@oracle.com,
> 	emacs-devel@gnu.org
> Date: Tue, 06 Feb 2018 21:40:50 -0500
> 
>   > > In that case, I wish the person who wrote that had made it clear
>   > > it was not a real, existing problem.
> 
>   > That person did:
> 
>   >   For example, suppose you have a Lisp program that produces the
>   >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   >   following error message when compiled/executed:
> 
> Those words do NOT say that this is an unreal example.

They do for me.  Of course, my command of English is not perfect, so
maybe I'm missing something.



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky
                   ` (2 preceding siblings ...)
  2018-02-03 18:13 ` Aaron Ecay
@ 2018-10-05  0:03 ` Noam Postavsky
  2018-10-05  1:01   ` Paul Eggert
                     ` (2 more replies)
  3 siblings, 3 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-10-05  0:03 UTC (permalink / raw)
  To: Emacs developers; +Cc: Drew Adams

On Fri, 2 Feb 2018 at 17:24, Noam Postavsky
<npostavs@users.sourceforge.net> wrote:
>
> In Emacs 26 and earlier the following is valid lisp code:
>
> (setq ’bar 42)
> (setq foo ’bar)
>
> In the current master branch, this will signal (invalid-read-syntax
> "strange quote" "’").

I've posted a patch which removes the error in this case, and instead
just adds to the error message if evaluating an expression with a
fancy quote leads to an error, see Bug#32939
<https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32939>.

Archive link to previous discussion:
https://lists.gnu.org/archive/html/emacs-devel/2018-02/msg00093.html



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05  0:03 ` Noam Postavsky
@ 2018-10-05  1:01   ` Paul Eggert
  2018-10-05  8:43     ` Eli Zaretskii
  2018-10-06 15:40   ` eval-last-sexp / C-x C-e, and punctuation like `?’' [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?)] Garreau, Alexandre
  2018-10-16 12:48   ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Garreau, Alexandre
  2 siblings, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-10-05  1:01 UTC (permalink / raw)
  To: Noam Postavsky, Emacs developers; +Cc: Drew Adams

I'm afraid this patch is heading in the wrong direction, as we should be 
more vigilant about confusables, not less.

Consider this example, abstracted from the auth-source-secrets-create 
source code:

     (if (eq r 'secret)
         (let ((data data))
           (lambda () data))
       data)

The intent of the (let ((data data)) ...) code is to create a thunk 
which, when evaluated, yields the current value of 'data' (not the value 
of 'data' when the thunk is called), and that is what any human reading 
the code will see. However, that is not what the code actually does. In 
the (let ((data data)) ...), the space between the two instances of 
'data' is really an EN SPACE (U+2002) so the 'let' is declaring an 
identifier 'data data' whose name contains an EN SPACE and whose value 
is nil, an identifier that is never used; so the thunk yields the later 
value of 'data', not the earlier one.

Because humans cannot reliably review source code containing characters 
that are easily confusable with the ASCII symbols that are a basic part 
of Elisp syntax, we should not be relaxing the reader to encourage 
developers to use these characters in their identifiers. On the 
contrary, we should be discouraging their use even more than we do now.
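
A reviewer working outside Emacs could look for such spaces with a few
lines of script.  The following is only a minimal sketch in Python,
assuming its standard unicodedata module; the function name
find_odd_spaces is made up for illustration and is not part of any
Emacs tooling.

```python
import unicodedata

def find_odd_spaces(source):
    """Return (line, column, code point, name) for each non-ASCII space."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Category "Zs" is "space separator"; U+0020 is the ordinary one.
            if ch != " " and unicodedata.category(ch) == "Zs":
                hits.append((lineno, col, f"U+{ord(ch):04X}", unicodedata.name(ch)))
    return hits

# The binding from the example above, with the EN SPACE written as an escape.
sample = "(let ((data\u2002data))\n  (lambda () data))"
for hit in find_odd_spaces(sample):
    print(hit)
```

The sketch only reports space separators; other look-alikes (quotes,
dots, parentheses) would need a separate check.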





* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05  1:01   ` Paul Eggert
@ 2018-10-05  8:43     ` Eli Zaretskii
  2018-10-05 23:02       ` Paul Eggert
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-05  8:43 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, drew.adams, npostavs

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Thu, 4 Oct 2018 18:01:26 -0700
> Cc: Drew Adams <drew.adams@oracle.com>
> 
> I'm afraid this patch is heading in the wrong direction, as we should be 
> more vigilant about confusables, not less.
> 
> Consider this example, abstracted from the auth-source-secrets-create 
> source code:
> 
>      (if (eq r 'secret)
>          (let ((data data))
>            (lambda () data))
>        data)

Is this example relevant to the proposed changes?  The latter only
change what we do for quote-like symbols that are not interpreted as
quotes by the Lisp reader.  You, OTOH, are raising a different
problem, one for which AFAIK we currently have no solution.

The general issue of "confusable" characters, both in Lisp code and in
user interaction, is an issue that still awaits a proper solution in
Emacs.  (Many moons ago, I was seduced to write a couple of primitives
to allow detection of confusable text that played tricks with bidi
reordering, but AFAICT those primitives are still not used, which is a
pity, IMO.)  I'd encourage people to work on this.

However, the much more narrow issue brought up by this bug report is
specifically about quote characters.  It is related to changes in our
messages, which now by default produce non-ASCII quotes, something
that made this particular problem more probable than it was before.  I
think as long as we don't disallow such characters in Lisp symbols,
the proposed treatment, via evaluation-time warning, is a reasonable
solution, slightly better than the somewhat confusing error message we
present now.

We could also augment that by displaying the confusable characters in
a distinct face, something we already do for some of them.

IOW, I disagree with the "discourage" part of your opinion: there's
nothing wrong with using such characters as long as we don't formally
cease to support them.  And the commonly accepted mechanism of
pointing out potentially wrong constructs is by visual cues and
warning messages, not by erroring out.  Compare how we treat something
like this in C programs:

  if (a = b) { do something; }




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05  8:43     ` Eli Zaretskii
@ 2018-10-05 23:02       ` Paul Eggert
  2018-10-06  0:20         ` Drew Adams
                           ` (3 more replies)
  0 siblings, 4 replies; 98+ messages in thread
From: Paul Eggert @ 2018-10-05 23:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, drew.adams, npostavs

On 10/5/18 1:43 AM, Eli Zaretskii wrote:
> the commonly accepted mechanism of
> pointing out potentially wrong constructs is by visual cues and
> warning messages

If we decide that Elisp source code must be able to abuse confusable 
characters, then of course we should allow such abuse and support it as 
best we can, including selective highlighting and whatnot to try to warn 
readers of the abuse. Such support won't work outside Emacs, but people 
using non-Emacs programs to look at Elisp code will simply be out of luck.

However, that would be heading in the wrong direction, because we 
shouldn't assume that Elisp code is reviewed only via Emacs. I regularly 
use Savannah's web interface to look at Elisp source code diffs, for 
example, and there's lots of other ways I and other developers use 
non-Emacs programs to look at Elisp source. Because reading source code 
is an essential property of free software, and because it would set a 
bad precedent if we said or implied that one really should use only 
Emacs to read Elisp code, we can't sufficiently address the problem 
merely by highlighting characters when Emacs is viewing them in a 
certain way and saying or implying that people should use only Emacs to 
review Elisp code.

I'm not arguing that Elisp should prohibit symbols from containing 
confusing characters, only that these characters should be easily 
recognizable in plain-text source code, without requiring Emacs itself 
(configured a certain way) to view the source. For example, if we 
required a backslash before every confusable character in a symbol, that 
would go a long way toward addressing the problem.





* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05 23:02       ` Paul Eggert
@ 2018-10-06  0:20         ` Drew Adams
  2018-10-06  9:14           ` Alan Mackenzie
  2018-10-06 16:17           ` Paul Eggert
  2018-10-06 10:11         ` Eli Zaretskii
                           ` (2 subsequent siblings)
  3 siblings, 2 replies; 98+ messages in thread
From: Drew Adams @ 2018-10-06  0:20 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel, npostavs

> I'm not arguing that Elisp should prohibit symbols from containing
> confusing characters, only that these characters should be easily
> recognizable in plain-text source code, without requiring Emacs itself
> (configured a certain way) to view the source. For example, if we
> required a backslash before every confusable character in a symbol, that
> would go a long way toward addressing the problem.

The right approach is to let Lisp tell you about its syntax.
Lisp should raise an error only for, well, an actual Lisp error.

If Emacs wants to highlight something that it can guess
(accurately) might be a typo (e.g. a copy+paste gotcha)
then fine. And even that highlighting should be optional
(which it is, if from font-lock).

There are more and more such copy+paste gotchas, for
at least a couple reasons: (1) Emacs has now moved to
using curly quotes more gratuitously, and (2) users copy
code from both Emacs and other sources, and some such
code uses curly quotes (even sometimes mistakenly in
place of apostrophes in Lisp), no-break space chars, and
other confusables.

And maybe also (3) users can sometimes type a curly
quote, no-break space, etc. more easily now, in some
editors, maybe even sometimes by just hitting an
ordinary keyboard key, such as ' or space bar.

We have to live with this now, like it or not. That's not a
reason to tell Lisp to treat a curly quote as an apostrophe,
and it's not a reason to tell it to raise an error when a
curly quote is used in a place where an apostrophe could
be used.

Similarly, it's not a reason for Lisp to guess that you really
meant SPC instead of no-break space. And so on.

This is a judgment call, but we should _let Lisp judge_
about syntax errors, based on, well, its own syntax. If you
use (let (foo  foo)...), where there is a no-break space
between foo and foo, so be it. That's a single symbol,
`foo foo'.

(My mail client doesn't even let me paste a no-break
space char there, it seems, so you'll have to pretend.
That's the kind of second-guessing behavior we should
be avoiding, FWIW.)




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06  0:20         ` Drew Adams
@ 2018-10-06  9:14           ` Alan Mackenzie
  2018-10-06 14:34             ` Stefan Monnier
                               ` (2 more replies)
  2018-10-06 16:17           ` Paul Eggert
  1 sibling, 3 replies; 98+ messages in thread
From: Alan Mackenzie @ 2018-10-06  9:14 UTC (permalink / raw)
  To: Drew Adams; +Cc: npostavs, Eli Zaretskii, Paul Eggert, emacs-devel

Hello, Drew.

Just a quick point.

On Sat, Oct 06, 2018 at 00:20:27 +0000, Drew Adams wrote:

[ .... ]

> This is a judgment call, but we should _let Lisp judge_
> about syntax errors, based on, well, its own syntax. If you
> use (let (foo  foo)...), where there is a no-break space
> between foo and foo, so be it. That's a single symbol,
> `foo foo'.

Do we even allow the syntax (let ((foo))...)?  If we do, then why?
There's (let (foo)...) and (let ((foo nil))...) for binding a symbol to
nil.

We made (setq foo) invalid some while ago.  Why not similarly make (let
((foo))...) invalid?  That would solve at least part of this problem,
is easy to do, and is almost certainly harmless.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05 23:02       ` Paul Eggert
  2018-10-06  0:20         ` Drew Adams
@ 2018-10-06 10:11         ` Eli Zaretskii
  2018-10-06 15:51           ` Paul Eggert
  2018-10-06 11:22         ` Garreau, Alexandre
  2018-10-09 14:43         ` Noam Postavsky
  3 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-06 10:11 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, drew.adams, npostavs

> Cc: npostavs@users.sourceforge.net, emacs-devel@gnu.org, drew.adams@oracle.com
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 5 Oct 2018 16:02:09 -0700
> 
> However, that would be heading in the wrong direction, because we 
> shouldn't assume that Elisp code is reviewed only via Emacs. I regularly 
> use Savannah's web interface to look at Elisp source code diffs, for 
> example, and there's lots of other ways I and other developers use 
> non-Emacs programs to look at Elisp source. Because reading source code 
> is an essential property of free software, and because it would set a 
> bad precedent if we said or implied that one really should use only 
> Emacs to read Elisp code, we can't sufficiently address the problem 
> merely by highlighting characters when Emacs is viewing them in a 
> certain way and saying or implying that people should use only Emacs to 
> review Elisp code.
> 
> I'm not arguing that Elisp should prohibit symbols from containing 
> confusing characters, only that these characters should be easily 
> recognizable in plain-text source code, without requiring Emacs itself 
> (configured a certain way) to view the source. For example, if we 
> required a backslash before every confusable character in a symbol, that 
> would go a long way toward addressing the problem.

I agree that viewing ELisp code outside of Emacs is a valid use case.
But I don't think a backslash before these non-ASCII quotes will
significantly lower the confusion potential when those characters are
used in the source.

Basically, there's a contradiction here between our desire not to
confuse relatively inexperienced users of ELisp and help them avoid
problems which might be hard to figure out, and our desire not to
annoy experienced users.  Personally, I think that using faces strikes
a good balance between these contradictory motives.  I don't see how
we can be harsh to uses of these characters without actually
prohibiting their use in symbols.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05 23:02       ` Paul Eggert
  2018-10-06  0:20         ` Drew Adams
  2018-10-06 10:11         ` Eli Zaretskii
@ 2018-10-06 11:22         ` Garreau, Alexandre
  2018-10-06 11:50           ` Eli Zaretskii
  2018-10-06 16:24           ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Paul Eggert
  2018-10-09 14:43         ` Noam Postavsky
  3 siblings, 2 replies; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-06 11:22 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, npostavs, drew.adams, emacs-devel

On 2018-10-05 at 16:02, Paul Eggert wrote:
> However, that would be heading in the wrong direction, because we
> shouldn't assume that Elisp code is reviewed only via Emacs. I
> regularly use Savannah's web interface

In a world where unicode is increasingly present and confusion about its
characters increasingly problematic (typosquatting, etc.) wouldn’t it be
reasonable to expect unicode-related semantic functions to be provided
in most frameworks, systems and languages to allow better handling of
such problems, thus making that problem the interface’s one?

Maybe if ever this problem occurs more and more in domain names,
internet addresses, and such, interfaces such as web ones, or other
editors, will inevitably need to support features to avoid confusion the
same way emacs currently could?




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 11:22         ` Garreau, Alexandre
@ 2018-10-06 11:50           ` Eli Zaretskii
  2018-10-06 12:10             ` Garreau, Alexandre
  2018-10-06 13:15             ` Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] Garreau, Alexandre
  2018-10-06 16:24           ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Paul Eggert
  1 sibling, 2 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-06 11:50 UTC (permalink / raw)
  To: Garreau, Alexandre; +Cc: npostavs, eggert, drew.adams, emacs-devel

> From: "Garreau\, Alexandre" <galex-713@galex-713.eu>
> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org,  drew.adams@oracle.com,  npostavs@users.sourceforge.net
> Date: Sat, 06 Oct 2018 13:22:14 +0200
> 
> In a world where unicode is increasingly present and confusion about its
> characters increasingly problematic (typosquatting, etc.) wouldn’t it be
> reasonable to expect unicode-related semantic functions to be provided
> in most frameworks, systems and languages to allow better handling of
> such problems, thus making that problem the interface’s one?

I don't think I understand what this means in practice; please
elaborate.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 11:50           ` Eli Zaretskii
@ 2018-10-06 12:10             ` Garreau, Alexandre
  2018-10-06 14:00               ` Eli Zaretskii
  2018-10-06 13:15             ` Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] Garreau, Alexandre
  1 sibling, 1 reply; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-06 12:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, drew.adams, emacs-devel

On 06/10/2018 at 14:50, Eli Zaretskii wrote:
>> From: "Garreau\, Alexandre" <galex-713@galex-713.eu>
>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org,
>> drew.adams@oracle.com, npostavs@users.sourceforge.net
>> Date: Sat, 06 Oct 2018 13:22:14 +0200
>> 
>> In a world where unicode is increasingly present and confusion about its
>> characters increasingly problematic (typosquatting, etc.) wouldn’t it be
>> reasonable to expect unicode-related semantic functions to be provided
>> in most frameworks, systems and languages to allow better handling of
>> such problems, thus making that problem the interface’s one?
>
> I don't think I understand what this means in practice; please
> elaborate.

AFAIK indistinguishable Unicode characters also cause problems in
content other than source code: the Latin ?o and the Cyrillic ?о (the
first example of Unicode-powered typosquatting I ever heard of), the
different spaces (sometimes not distinguishable in a monospace font),
or, to stay with monospacing problems, the dashes: I have great trouble
writing correct French text, as I must always check outside Emacs which
of ?– and ?— is the medium dash and which the long one (I normally
remember by their position on my keyboard, but as they are next to each
other I often forget).  Not to mention the various bidirectionality
hacks you highlighted earlier.  I have also heard of emails that
confuse semantic Bayesian spam filters by including non-spammy words
which, thanks to some Unicode tricks, are never displayed to the user.

These problems aren’t local to source code, nor to Emacs (many people
use something other than Emacs to read mail, websites and news, and to
read domain names).  AFAIK there are canonicalization functions and
functions for semantic Unicode categories that help in knowing what is
punctuation, what is combining, what is displayed and how much space
it takes, and maybe, but I’m unsure, which characters are difficult or
even impossible to distinguish.  Canonicalization lets two differently
encoded strings compare equal, for instance "é" and "é" (the latter
made of ?e followed by the combining ?́; it’s fun to see how that last
one is strangely displayed yet correctly evaluated by Emacs).  There
may also be functions that make two strings with different but
look-alike characters compare equal.
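
To make the canonicalization point concrete, here is a minimal sketch
in Python, assuming its standard unicodedata module; the helper name
same_text is made up for illustration.  It compares the precomposed
and the combining forms of "é" after NFC normalization.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def same_text(a, b):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(precomposed == decomposed)           # False: the code points differ
print(same_text(precomposed, decomposed))  # True: canonically equivalent
print(same_text("o", "\u043e"))            # False: Latin o vs Cyrillic o
```

Note that canonical normalization only unifies different encodings of
the same character; it does not make look-alike letters from different
scripts compare equal, as the last line shows.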

I guess this issue is going to be even less of a problem in free
software, where in theory the writers are well-intentioned and don’t
try to trick readers about what the software does (and/or it should at
least be reviewed with capable tools and/or knowledge), than in cases
where it is abusable and profitable, such as typosquatting.
"google.com" and "gооgle.com" are not the same (it’s interesting to
notice, too, how Emacs forward/backward-word detects the script switch
and stops at the "оо"; I’m astonished by these capabilities, and I have
to thank you guys for such a great piece of software!).  Google could
afford (and took care) to buy both, but not everyone can do the same
(and nobody has yet reserved "amazоn.com"), and people might crack,
steal or blackmail using something like that.
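
A mixed-script check of the kind that would catch "gооgle.com" can be
sketched as follows.  This is only an illustration in Python: its
standard library has no Unicode script property, so the sketch uses
the first word of each letter's Unicode name as a rough stand-in, and
the function names are made up.

```python
import unicodedata

def scripts_used(label):
    """Rough script guess: the first word of each letter's Unicode name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in label if ch.isalpha()}

def is_mixed_script(label):
    """True if the letters in LABEL appear to come from more than one script."""
    return len(scripts_used(label)) > 1

print(is_mixed_script("google.com"))            # all Latin letters
print(is_mixed_script("g\u043e\u043egle.com"))  # two Cyrillic letters inside
```

A real implementation would use the script data published with the
Unicode standard rather than character names.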




* Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?]
  2018-10-06 11:50           ` Eli Zaretskii
  2018-10-06 12:10             ` Garreau, Alexandre
@ 2018-10-06 13:15             ` Garreau, Alexandre
  2018-10-06 14:01               ` Eli Zaretskii
  1 sibling, 1 reply; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-06 13:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, drew.adams, emacs-devel

On 06/10/2018 at 14:50, Eli Zaretskii wrote:
>> From: "Garreau\, Alexandre" <galex-713@galex-713.eu>
>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org,
>> drew.adams@oracle.com, npostavs@users.sourceforge.net
>> Date: Sat, 06 Oct 2018 13:22:14 +0200
>> 
>> In a world where unicode is increasingly present and confusion about its
>> characters increasingly problematic (typosquatting, etc.) wouldn’t it be
>> reasonable to expect unicode-related semantic functions to be provided
>> in most frameworks, systems and languages to allow better handling of
>> such problems, thus making that problem the interface’s one?
>
> I don't think I understand what this means in practice; please
> elaborate.

The point I wanted to make is that, as I highlighted, this problem is
of greater importance in interfaces other than source code, typically
browsers and web sites, as these are becoming the most used interfaces
for everything nowadays.  So I guess these Unicode anti-confusion
functions, and higher-level functions based on them, already are or
will become present in browsers and in languages such as Perl and PHP,
and will end up in the frameworks written in those languages, so that
in the end “interfaces other than Emacs” such as web browsers or
websites may come to support features such as coloring mixed-script
text or unusual spaces differently, etc.

The other option being “ban unicode as much as possible” or “disallow
mixed-script”, and “ban all unicode punctuation characters (or all
non-letters (or non-alphanumeric?) characters, or something like that)
unless they’re inside ascii”.

I believe that with increased Unicode support, most languages,
frameworks and software will end up with features that allow these
characters without creating too many problems (at least not many more
than in Emacs).




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 12:10             ` Garreau, Alexandre
@ 2018-10-06 14:00               ` Eli Zaretskii
  2018-10-24 22:25                 ` Noam Postavsky
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-06 14:00 UTC (permalink / raw)
  To: Garreau, Alexandre; +Cc: npostavs, eggert, drew.adams, emacs-devel

> From: "Garreau\, Alexandre" <galex-713@galex-713.eu>
> Cc: eggert@cs.ucla.edu,  emacs-devel@gnu.org,  drew.adams@oracle.com,  npostavs@users.sourceforge.net
> Date: Sat, 06 Oct 2018 14:10:17 +0200
> 
> AFAIK indistinguishable Unicode characters also cause problems in
> content other than source code: the Latin ?o and the Cyrillic ?о (the
> first example of Unicode-powered typosquatting I ever heard of), the
> different spaces (sometimes not distinguishable in a monospace font),
> or, to stay with monospacing problems, the dashes: I have great trouble
> writing correct French text, as I must always check outside Emacs which
> of ?– and ?— is the medium dash and which the long one (I normally
> remember by their position on my keyboard, but as they are next to each
> other I often forget).  Not to mention the various bidirectionality
> hacks you highlighted earlier.  I have also heard of emails that
> confuse semantic Bayesian spam filters by including non-spammy words
> which, thanks to some Unicode tricks, are never displayed to the user.

This is the more general problem I mentioned up-thread:

  http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00052.html

I agree that we should improve Emacs in that area, but I think it
would be wrong to hold off fixing the issue with quotes until the
more general problem is solved.




* Re: Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?]
  2018-10-06 13:15             ` Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] Garreau, Alexandre
@ 2018-10-06 14:01               ` Eli Zaretskii
  0 siblings, 0 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-06 14:01 UTC (permalink / raw)
  To: Garreau, Alexandre; +Cc: eggert, emacs-devel, drew.adams, npostavs

> From: "Garreau\, Alexandre" <galex-713@galex-713.eu>
> Date: Sat, 06 Oct 2018 15:15:34 +0200
> Cc: npostavs@users.sourceforge.net, eggert@cs.ucla.edu, drew.adams@oracle.com,
> 	emacs-devel@gnu.org
> 
> The other option being “ban unicode as much as possible” or “disallow
> mixed-script”, and “ban all unicode punctuation characters (or all
> non-letters (or non-alphanumeric?) characters, or something like that)
> unless they’re inside ascii”.

I don't think this is feasible in Emacs, we cannot limit non-ASCII
punctuation to ASCII text.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06  9:14           ` Alan Mackenzie
@ 2018-10-06 14:34             ` Stefan Monnier
  2018-10-06 14:57             ` Drew Adams
  2018-10-06 16:10             ` Paul Eggert
  2 siblings, 0 replies; 98+ messages in thread
From: Stefan Monnier @ 2018-10-06 14:34 UTC (permalink / raw)
  To: emacs-devel

> We made (setq foo) invalid some while ago.  Why not similarly make (let
> ((foo))...) invalid?

I assume you mean that the byte-compiler should signal a warning?
If so, I'm fully in favor.

Signaling an actual error would be problematic at this stage, because
it'd break too much code.


        Stefan





* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06  9:14           ` Alan Mackenzie
  2018-10-06 14:34             ` Stefan Monnier
@ 2018-10-06 14:57             ` Drew Adams
  2018-10-06 15:42               ` Garreau, Alexandre
  2018-10-06 16:10             ` Paul Eggert
  2 siblings, 1 reply; 98+ messages in thread
From: Drew Adams @ 2018-10-06 14:57 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: npostavs, Eli Zaretskii, Paul Eggert, emacs-devel

> > This is a judgment call, but we should _let Lisp judge_
> > about syntax errors, based on, well, its own syntax. If you
> > use (let (foo  foo)...), where there is a no-break space
> > between foo and foo, so be it. That's a single symbol,
> > `foo foo'.
> 
> Do we even allow the syntax (let ((foo))...)?  If we do, then why?
> There's (let (foo)...) and (let ((foo nil))...) for binding a symbol to
> nil.

Yes, sorry. I wasn't paying attention to the parens in that
example.

My point was only that use of `foo foo' (with a no-break
space between the two foo's) as a mistake/typo for an
intended `foo foo' (with a normal space) should not be
signaled by Lisp as an error. But the no-break space could
be highlighted as sometimes helpful info. `foo foo' (with
no-break space) is just a symbol, for Lisp - not a syntax
error.

E.g. (changing the example):

(let (foo foo)...) binds symbol `foo foo' (with a no-break
space) to nil. It doesn't bind symbol `foo' to the current
value of symbol `foo'.

So, e.g., if symbol `foo' happens to be unbound then
even evaluation of that binding won't raise an error
(e.g. unbound variable `foo').




* eval-last-sexp / C-x C-e, and punctuation like `?’' [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?)]
  2018-10-05  0:03 ` Noam Postavsky
  2018-10-05  1:01   ` Paul Eggert
@ 2018-10-06 15:40   ` Garreau, Alexandre
  2018-10-16 12:48   ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Garreau, Alexandre
  2 siblings, 0 replies; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-06 15:40 UTC (permalink / raw)
  To: Emacs developers; +Cc: Drew Adams, Noam Postavsky

On 2018-10-04 at 20:03, Noam Postavsky wrote:
> On Fri, 2 Feb 2018 at 17:24, Noam Postavsky
> <npostavs@users.sourceforge.net> wrote:
>>
>> In Emacs 26 and earlier the following is valid lisp code:
>>
>> (setq ’bar 42)
>> (setq foo ’bar)

I just noticed: in Emacs 25, evaluating `’bar' with `eval-last-sexp'
/ C-x C-e gives an error, as it ignores the ?’ and evaluates only
`bar'.  In the same way, if point is placed right after the ?’, it
tries to evaluate “setq”…

Maybe I do not know enough Elisp, but why is that?  Are there other
punctuation characters triggering this behavior?  And are they all
okay for the reader to accept unescaped in symbols (except ? , ?\",
?\(, ?\), ?,, ?`, and maybe some other ASCII ones I forgot)?

Why does `eval-last-sexp' treat this differently than the reader?




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 14:57             ` Drew Adams
@ 2018-10-06 15:42               ` Garreau, Alexandre
  0 siblings, 0 replies; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-06 15:42 UTC (permalink / raw)
  To: Drew Adams
  Cc: Alan Mackenzie, Eli Zaretskii, Paul Eggert, emacs-devel, npostavs

On 2018-10-06 at 14:57, Drew Adams wrote:
> My point was only that use of `foo foo' (with a no-break
> space between the two foo's) as a mistake/typo for an
> intended `foo foo' (with a normal space) should not be
> signaled by Lisp as an error. But the no-break space could
> be highlighted as sometimes helpful info. `foo foo' (with
> no-break space) is just a symbol, for Lisp - not a syntax
> error.

The no-break space is already displayed as a sort of colored
underscore.  The problem is that there are a bunch of other kinds of
spaces.  I personally only use the no-break, non-justifying and/or
“fine” spaces (I don’t know whether the latter matches an en space) in
French, but I commonly do so in strings within Emacs, which correctly
highlights only the normal no-break space and not the others.  I’d
like to see those highlighted as well (perhaps, or even preferably, in
a different way), or should that be customizable according to user
preferences (or language)?

> E.g. (changing the example):
>
> (let (foo foo)...) binds symbol `foo foo' (with a no-break
> space) to nil. It doesn't bind symbol `foo' to the current
> value of symbol `foo'.

I would have expected it to bind foo to nil twice (or to signal an
error or a warning), but it seems you used a normal space: Emacs
already highlights the no-break space, so I would have noticed it.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 10:11         ` Eli Zaretskii
@ 2018-10-06 15:51           ` Paul Eggert
  2018-10-06 16:45             ` Eli Zaretskii
  0 siblings, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-10-06 15:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Noam Postavsky, emacs-devel

Eli Zaretskii wrote:
> I agree that viewing ELisp code outside of Emacs is a valid use case.
> But I don't think a backslash before these non-ASCII quotes will
> significantly lower the confusion potential when those characters are
> used in the source.

I don't follow. If someone writes '(let ((foo\ bar)) baz)' then a human reader 
is put immediately and obviously on notice that there's something odd about that 
code. We already require a backslash for that ordinary space (U+0020); why not 
also require it for EN SPACE (U+2002)? That will significantly lower confusion here.

The point is not to distinguish 'foo\ bar' (with ordinary space) from 'foo\ bar' 
(with en space); the point is to distinguish both from the 'foo bar' (two 
identifiers) that a reader would ordinarily expect here, because that's the main 
way a malicious hacker could confuse even experienced reviewers.

> Basically, there's a contradiction here between our desire not to
> confuse relatively inexperienced users of ELisp and help them avoid
> problems which might be hard to figure out, and our desire not to
> annoy experienced users.

That's not the point I was making. I'm an experienced Elisp user, and I am 
*extremely annoyed* (to put it mildly) that malicious users can put one over on 
us by using characters that look like spaces, or parentheses, or whatever, 
characters that are not what they look like. This has nothing to do with 
confusing inexperienced users. I *really want* Elisp to be relatively immune to 
this problem, at least for programs that I help maintain. And I don't want the 
immunity to work only when I'm using Emacs on a nice display: I often read code 
with Emacs highlighting unavailable or turned off, or without using Emacs at all.

At the very least there should be an option whereby the Emacs source code itself 
is routinely verified to be free of confusable characters in identifiers, to 
help prevent malicious code from sneaking into Emacs itself. Even if we give 
users the ability to let others shoot them, we should at least improve our own 
defenses.

> I don't see how
> we can be harsh to uses of these characters without actually
> prohibiting their use in symbols.

I already gave one proposal for doing just that: require that characters 
confusable with ASCII be escaped. Initially we can merely warn about any 
unescaped confusables; as long as the warning is prominent enough that should be 
OK for starters. This proposal does not prohibit their use in symbols, as one 
can simply escape the characters.

There are other ways to skin this cat as well. We should be heading in this 
direction, not removing the (admittedly inadequate) protection we already have.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06  9:14           ` Alan Mackenzie
  2018-10-06 14:34             ` Stefan Monnier
  2018-10-06 14:57             ` Drew Adams
@ 2018-10-06 16:10             ` Paul Eggert
  2 siblings, 0 replies; 98+ messages in thread
From: Paul Eggert @ 2018-10-06 16:10 UTC (permalink / raw)
  To: Alan Mackenzie, Drew Adams; +Cc: Eli Zaretskii, npostavs, emacs-devel

Alan Mackenzie wrote:
> Do we even allow the syntax (let ((foo))...)?

Although that's a reasonable question it doesn't address the main problem, as 
there are lots of other opportunities for confusion like this one. For example, 
if I see '(foo ․ bar) in Elisp source code, I'll naturally think it yields a 
cons of two symbols. It doesn't: it yields a list of three symbols, because that 
dot is not a FULL STOP (U+002E); it's a ONE DOT LEADER (U+2024).
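
(Illustrative sketch, not part of the original message.) The claim can be
checked by reading both spellings. The function name `my-read-dot-demo` is
hypothetical; the leader is written as an escape so the two dots cannot be
mixed up in transit.

```elisp
;; ONE DOT LEADER (U+2024) is an ordinary symbol constituent, so the
;; first form is a proper list of three symbols.  FULL STOP (U+002E)
;; surrounded by whitespace is dotted-pair syntax and yields a cons.
(defun my-read-dot-demo ()
  "Return facts about reading a list with each kind of dot."
  (let ((with-leader (read "(foo \u2024 bar)"))
        (with-full-stop (read "(foo . bar)")))
    (list (length with-leader)           ; element count of the list
          (symbolp (nth 1 with-leader))  ; the leader reads as a symbol
          (cdr with-full-stop))))        ; cdr of the cons cell

(my-read-dot-demo)
```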




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06  0:20         ` Drew Adams
  2018-10-06  9:14           ` Alan Mackenzie
@ 2018-10-06 16:17           ` Paul Eggert
  2018-10-07  1:13             ` Drew Adams
  2018-10-08  3:51             ` Richard Stallman
  1 sibling, 2 replies; 98+ messages in thread
From: Paul Eggert @ 2018-10-06 16:17 UTC (permalink / raw)
  To: Drew Adams, Eli Zaretskii; +Cc: emacs-devel, npostavs

Drew Adams wrote:
> The right approach is to let Lisp tell you about its syntax... we should _let Lisp judge_

This seems to be assuming that there is something out there called "Lisp" that 
tells us what to do. That's entirely backwards. Lisp is our servant, not our 
master. Lisp syntax should be the best syntax we can come up with, to 
help us and our users do our work, and this should be the way we think and write 
about it.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 11:22         ` Garreau, Alexandre
  2018-10-06 11:50           ` Eli Zaretskii
@ 2018-10-06 16:24           ` Paul Eggert
  2018-10-06 16:40             ` Stefan Monnier
  1 sibling, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-10-06 16:24 UTC (permalink / raw)
  To: Garreau, Alexandre; +Cc: Eli Zaretskii, npostavs, drew.adams, emacs-devel

Garreau, Alexandre wrote:
> ouldn’t it be
> reasonable to expect unicode-related semantic functions to be provided
> in most frameworks, systems and languages to allow better handling of
> such problems

What a wonderful world that would be! But it's not the world we live in. Even 
Emacs misdisplays many of these characters now. And other tools that I routinely 
use (Firefox, Thunderbird, Gnome terminal) are even worse. We can't reasonably 
assume that confusable characters will be displayed nicely everywhere, not even 
five or ten years from now.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 16:24           ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Paul Eggert
@ 2018-10-06 16:40             ` Stefan Monnier
  0 siblings, 0 replies; 98+ messages in thread
From: Stefan Monnier @ 2018-10-06 16:40 UTC (permalink / raw)
  To: emacs-devel

> in. Even Emacs misdisplays many of these characters now. And other tools

FWIW, we have the `uni-confusables` in GNU ELPA for that purpose.
Maybe we should integrate more tightly.


        Stefan





* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 15:51           ` Paul Eggert
@ 2018-10-06 16:45             ` Eli Zaretskii
  2018-10-06 18:03               ` Paul Eggert
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-06 16:45 UTC (permalink / raw)
  To: Paul Eggert; +Cc: npostavs, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: emacs-devel@gnu.org, Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Sat, 6 Oct 2018 08:51:18 -0700
> 
> Eli Zaretskii wrote:
> > I agree that viewing ELisp code outside of Emacs is a valid use case.
> > But I don't think a backslash before these non-ASCII quotes will
> > significantly lower the confusion potential when those characters are
> > used in the source.
> 
> I don't follow. If someone writes '(let ((foo\ bar)) baz)' then a human reader 
> is put immediately and obviously on notice that there's something odd about that 
> code. We already require a backslash for that ordinary space (U+0020); why not 
> also require it for EN SPACE (U+2002)? That will significantly lower confusion here.

How will it lower the confusion, when the same is required for a
space?

And once again, these examples are not relevant to the issue at hand,
which is only about quotes.

> > I don't see how
> > we can be harsh to uses of these characters without actually
> > prohibiting their use in symbols.
> 
> I already gave one proposal for doing just that: require that characters 
> confusable with ASCII be escaped.

That's an annoyance, IMO.  This is why this bug report exists, right?

And again, please don't bring up the more general issue with any other
confusable character, as those require a more general solution about
which we don't yet have a clear idea.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 16:45             ` Eli Zaretskii
@ 2018-10-06 18:03               ` Paul Eggert
  2018-10-06 18:29                 ` Eli Zaretskii
  0 siblings, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-10-06 18:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, emacs-devel

Eli Zaretskii wrote:

> these examples are not relevant to the issue at hand,
> which is only about quotes

Quotes are part of the same problem. For example, here's some code in Gnus:

    (ignore-errors (gnus-get-function method 'open-server))

Change that APOSTROPHE (U+0027) to RIGHT SINGLE QUOTATION MARK (U+2019) and the 
code will look the same but do something quite different, with no diagnostic. 
This sort of code is reasonably common and can easily be security-relevant. If 
Emacs stops diagnosing this abuse of confusable characters, we're opening 
ourselves up more to malicious code.

> How will it lower the confusion, when the same is required for a space?

Let me rephrase my point, with apostrophe rather than space.

The point is not to distinguish ´open-server (with U+00B4 ACUTE ACCENT) from 
՚open-server (with U+055A ARMENIAN APOSTROPHE); the point is to distinguish both 
of these from the 'open-server (apostrophe followed by symbol) that a reader 
would ordinarily expect here. We need to give an obvious way for human readers 
to see that something odd is going on. Readers can then use C-x = (or whatever) 
to find out exactly what the oddness is.
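
(Illustrative sketch, not part of the original message.) The distinction can
be made concrete by checking what each spelling reads as. The predicate name
is hypothetical, and the sketch assumes that U+00B4 and U+055A are read as
plain symbol constituents. U+2019 is deliberately left out, since its
treatment is exactly what is being debated here.

```elisp
(defun my-reads-as-quoted-symbol-p (source)
  "Return t if SOURCE reads as the form (quote SYMBOL)."
  (let ((form (read source)))
    (and (consp form)
         (eq (car form) 'quote)
         (symbolp (cadr form)))))

;; APOSTROPHE (U+0027): reads as (quote open-server).
(my-reads-as-quoted-symbol-p "'open-server")
;; ACUTE ACCENT (U+00B4): reads as one symbol, so the value of a
;; variable with that odd name would be used instead.
(my-reads-as-quoted-symbol-p "\u00b4open-server")
;; ARMENIAN APOSTROPHE (U+055A): likewise a single symbol.
(my-reads-as-quoted-symbol-p "\u055aopen-server")
```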




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 18:03               ` Paul Eggert
@ 2018-10-06 18:29                 ` Eli Zaretskii
  2018-10-06 19:18                   ` Paul Eggert
                                     ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-06 18:29 UTC (permalink / raw)
  To: Paul Eggert; +Cc: npostavs, emacs-devel

> Cc: emacs-devel@gnu.org, npostavs@users.sourceforge.net
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 6 Oct 2018 11:03:25 -0700
> 
> Eli Zaretskii wrote:
> 
> > these examples are not relevant to the issue at hand,
> > which is only about quotes
> 
> Quotes are part of the same problem.

Yes, but a much smaller part.  And solving the problem with quotes
doesn't require solving the more general one.

>     (ignore-errors (gnus-get-function method 'open-server))
> 
> Change that APOSTROPHE (U+0027) to RIGHT SINGLE QUOTATION MARK (U+2019) and the 
> code will look the same but do something quite different, with no diagnostic. 

I understand.  I'm just saying that adding a backslash before the
U+2019 quote will not significantly improve the situation, because
Emacs Lisp uses backslashes in many other situations, like ?\", and
therefore the mere fact that there is a backslash doesn't necessarily
alert the human reader to the existence of an unusual character.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 18:29                 ` Eli Zaretskii
@ 2018-10-06 19:18                   ` Paul Eggert
  2018-10-06 19:30                   ` Paul Eggert
  2018-10-06 19:32                   ` Garreau, Alexandre
  2 siblings, 0 replies; 98+ messages in thread
From: Paul Eggert @ 2018-10-06 19:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, npostavs

Eli Zaretskii wrote:
> Emacs Lisp uses backslashes in many other situations, like ?\", and
> therefore the mere fact that there is a backslash doesn't necessarily
> alert the human reader to the existence of an unusual character.

Yes, the solution I proposed addresses only symbols, not the more-general 
problem of confusable characters in strings and characters. Although symbols are 
a more significant issue since they occur more often, it would be nice to 
address strings and character constants too. We can do that by adding support 
for a new string escape \cX for the confusable character X (e.g., ?\c՚ would 
mean U+055A ARMENIAN APOSTROPHE), and by diagnosing the use of unescaped 
confusable characters in strings and character constants.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 18:29                 ` Eli Zaretskii
  2018-10-06 19:18                   ` Paul Eggert
@ 2018-10-06 19:30                   ` Paul Eggert
  2018-10-06 19:32                   ` Garreau, Alexandre
  2 siblings, 0 replies; 98+ messages in thread
From: Paul Eggert @ 2018-10-06 19:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, npostavs

Eli Zaretskii wrote:
>>      (ignore-errors (gnus-get-function method 'open-server))
>>
>> Change that APOSTROPHE (U+0027) to RIGHT SINGLE QUOTATION MARK (U+2019) and the
>> code will look the same but do something quite different, with no diagnostic.

> ... adding a backslash before the
> U+2019 quote will not significantly improve the situation, because
> Emacs Lisp uses backslashes in many other situations, like ?\", and
> therefore the mere fact that there is a backslash doesn't necessarily
> alert the human reader to the existence of an unusual character.

True, a backslash within a string or character is not a sufficient alert. 
However, a backslash within a symbol is. For example, although it's relatively 
common to see strings like this in Elisp source code:

    "color=\"blue\""

it's extremely uncommon to see symbols like this:

    color=\"blue\"

and so if one sees such a symbol (possibly with some other character in place of 
the " marks, possibly not) one will easily know that something odd is going on.
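
(Illustrative sketch, not part of the original message.) For readers unsure
how a backslash behaves inside a symbol, the following shows that it makes
the next character part of the symbol's name. The helper name is
hypothetical.

```elisp
(defun my-backslashed-symbol-name (source)
  "Read SOURCE as a symbol and return its name, escapes removed."
  (symbol-name (read source)))

;; The Lisp string below holds the text  color=\"blue\"  and reads as
;; one symbol whose name contains two double-quote characters.
(my-backslashed-symbol-name "color=\\\"blue\\\"")
;; A backslashed space likewise becomes part of a single symbol.
(my-backslashed-symbol-name "foo\\ bar")
```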




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 18:29                 ` Eli Zaretskii
  2018-10-06 19:18                   ` Paul Eggert
  2018-10-06 19:30                   ` Paul Eggert
@ 2018-10-06 19:32                   ` Garreau, Alexandre
  2 siblings, 0 replies; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-06 19:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel, npostavs

On 2018-10-06 at 21:29, Eli Zaretskii wrote:
> I understand.  I'm just saying that adding a backslash before the
> U+2019 quote will not significantly improve the situation, because
> Emacs Lisp uses backslashes in many other situations, like ?\", and
> therefore the mere fact that there is a backslash doesn't necessarily
> alert the human reader to the existence of an unusual character.

I think what is wanted here is not an alert about an unusual
character, but an alert about a character that is not syntactically
relevant to the reader yet looks like one that is (such as ?\', which
quotes, or ?\", ?\(, ?\), etc.): a character that appears to have
syntactic consequences (like quoting) while actually being a “normal”
character that is part of the symbol.  That is, the goal is not to
flag that two symbols may look alike (the ՚open-server/´open-server
case), but to tell apart one thing that is an unquoted symbol (like
\'open-server, which is evidently a symbol because of the backslash,
since afaik a character preceded by a backslash can do nothing
syntactically except be part of a symbol, so the behavior is clear)
from another that is *not* an unquoted symbol at all (like
'open-server, which just returns the symbol).

I guess the same issue indeed arises with ?¸, or anything alike, that
won’t unquote the next symbol.




* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 16:17           ` Paul Eggert
@ 2018-10-07  1:13             ` Drew Adams
  2018-10-08  3:51             ` Richard Stallman
  1 sibling, 0 replies; 98+ messages in thread
From: Drew Adams @ 2018-10-07  1:13 UTC (permalink / raw)
  To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel, npostavs

> > The right approach is to let Lisp tell you about its syntax... we should _let
> Lisp judge_
> 
> This seems to be assuming that there is something out there called "Lisp" that
> tells us what to do. That's entirely backwards. Lisp is our servant, not our
> master. Lisp syntax should be the best syntax we can come up with,
> to help us and our users do our work, and this should be the way we think and
> write about it.

Such characters have symbol syntax in Lisp (Elisp, for instance).
That's the Lisp syntax in question. Lisp doesn't require you to
escape them in symbols. Hasn't done so before and shouldn't
do so now (IMHO).

Yes, we have liberty to change Lisp, including Lisp syntax, in
any number of ways. That doesn't mean that we should.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 16:17           ` Paul Eggert
  2018-10-07  1:13             ` Drew Adams
@ 2018-10-08  3:51             ` Richard Stallman
  1 sibling, 0 replies; 98+ messages in thread
From: Richard Stallman @ 2018-10-08  3:51 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, npostavs, drew.adams, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

We can make any change in Emacs Lisp syntax that we decide to make,
but incoherence with the spirit of Lisp will lead to trouble.

The spirit of Lisp is not something that can be precisely defined,
but experienced Lispers will mostly agree about what it says.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05 23:02       ` Paul Eggert
                           ` (2 preceding siblings ...)
  2018-10-06 11:22         ` Garreau, Alexandre
@ 2018-10-09 14:43         ` Noam Postavsky
  2018-10-09 15:30           ` Paul Eggert
  2018-10-10  3:57           ` Richard Stallman
  3 siblings, 2 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-10-09 14:43 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, Drew Adams, Emacs developers

On Fri, 5 Oct 2018 at 19:02, Paul Eggert <eggert@cs.ucla.edu> wrote:
>
> On 10/5/18 1:43 AM, Eli Zaretskii wrote:
> > the commonly accepted mechanism of
> > pointing out potentially wrong constructs is by visual cues and
> > warning messages
>
> If we decide that Elisp source code must be able to abuse confusable
> characters, then of course we should allow such abuse and support it as
> best we can, including selective highlighting and whatnot to try to warn
> readers of the abuse. Such support won't work outside Emacs, but people
> using non-Emacs programs to look at Elisp code will simply be out of luck.

The problem is that deciding which characters are confusable and hence
require backslash escaping is based on a shifting mess of heuristics.
So I don't think it's workable to signal a hard error for this. Both
in terms of false positives which could mean possibly breaking code,
and false negatives which means we would be giving a false sense of
security. That's why I proposed adding highlighting and enhancing
existing error messages instead. Of course adding warnings would also
make sense.

By the way, your EN SPACE example already gives a compile warning:

Warning: Unused lexical variable ‘data data’




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 14:43         ` Noam Postavsky
@ 2018-10-09 15:30           ` Paul Eggert
  2018-10-09 16:13             ` Eli Zaretskii
  2018-10-10  3:57           ` Richard Stallman
  1 sibling, 1 reply; 98+ messages in thread
From: Paul Eggert @ 2018-10-09 15:30 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Eli Zaretskii, Drew Adams, Emacs developers

Noam Postavsky wrote:
> deciding which characters are confusable and hence
> require backslash escaping is based on a shifting mess of heuristics.

No more than the "shifting mess of heuristics" inevitable in any choice of 
syntax. Quite possibly the confusables list from the Unicode consortium will do. 
The list won't shift much once it's established.

We can start merely by warning about confusable characters and seeing how often 
those warnings are triggered in real (as opposed to malicious or 
purposely-tricky) code. If the warnings are quite rare, in a later Emacs version 
we can change the manual from "confusable characters should be escaped" to 
"confusable characters must be escaped".




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 15:30           ` Paul Eggert
@ 2018-10-09 16:13             ` Eli Zaretskii
  2018-10-09 17:07               ` Paul Eggert
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-09 16:13 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, drew.adams, npostavs

> Cc: Eli Zaretskii <eliz@gnu.org>, Emacs developers <emacs-devel@gnu.org>,
>  Drew Adams <drew.adams@oracle.com>
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 9 Oct 2018 08:30:22 -0700
> 
> Noam Postavsky wrote:
> > deciding which characters are confusable and hence
> > require backslash escaping is based on a shifting mess of heuristics.
> 
> No more than the "shifting mess of heuristics" inevitable in any choice of 
> syntax. Quite possibly the confusables list from the Unicode consortium will do. 
> The list won't shift much once it's established.
> 
> We can start merely by warning about confusable characters and seeing how often 
> those warnings are triggered in real (as opposed to malicious or 
> purposely-tricky) code. If the warnings are quite rare, in a later Emacs version 
> we can change the manual from "confusable characters should be escaped" to 
> "confusable characters must be escaped".

Confusable characters are confusable only when they are surrounded by
ASCII characters or by characters that look like ASCII.  By
themselves, many of them are entirely legitimate.  For
example, I see no reason to warn about a symbol named "сталин", even
though the characters с and а will be considered confusables if the
symbol would be named something like "саn".

So I think we cannot go by characters here, we need to examine the
context.

That is why I think we shouldn't link this particular issue, of quote
characters, with the more general problem: the latter is much more
complicated to solve correctly.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 16:13             ` Eli Zaretskii
@ 2018-10-09 17:07               ` Paul Eggert
  2018-10-09 19:18                 ` Andreas Schwab
  2018-10-10  3:58                 ` Richard Stallman
  0 siblings, 2 replies; 98+ messages in thread
From: Paul Eggert @ 2018-10-09 17:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, drew.adams, npostavs

Eli Zaretskii wrote:
> I see no reason to warn about a symbol named "сталин", even
> though the characters с and а will be considered confusables if the
> symbol would be named something like "саn".

Sure, that's fine. We can limit symbol warnings to the symbols containing 
non-ASCII chars all of which are confusable with ASCII. This will warn about 
"саn" (with Cyrillic "с" and "а") but not about "сталин" (with Cyrillic "a").

The point of the guideline is not to warn about every possible confusable 
character; it's to defend against malicious code.
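
(Illustrative sketch, not part of the original message.) The rule proposed
above could be prototyped roughly as follows. The names
`my-confusable-chars` and `my-suspicious-symbol-name-p` are hypothetical, and
the short character list is only a stand-in for a real confusables table such
as the one published by the Unicode consortium.

```elisp
(require 'seq)

;; Hypothetical stand-in for a real confusables table: Cyrillic
;; letters that resemble Latin a, e, o, p, c, y and x.
(defvar my-confusable-chars
  '(?\u0430 ?\u0435 ?\u043e ?\u0440 ?\u0441 ?\u0443 ?\u0445))

(defun my-suspicious-symbol-name-p (name)
  "Return t if NAME has non-ASCII chars and all of them are confusable."
  (let ((non-ascii (seq-filter (lambda (c) (> c 127)) name)))
    (and non-ascii
         (seq-every-p (lambda (c) (memq c my-confusable-chars))
                      non-ascii)
         t)))

;; Cyrillic es and a followed by Latin n: flagged.
(my-suspicious-symbol-name-p "\u0441\u0430n")
;; The six-letter Cyrillic word from the message above: not flagged,
;; because four of its letters are not in the confusables list.
(my-suspicious-symbol-name-p "\u0441\u0442\u0430\u043b\u0438\u043d")
;; Pure ASCII: not flagged.
(my-suspicious-symbol-name-p "can")
```

Note that a name spelled entirely with letters from the list would also be
flagged under this rule, whether or not it is a real word.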




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 17:07               ` Paul Eggert
@ 2018-10-09 19:18                 ` Andreas Schwab
  2018-10-10  9:39                   ` Aaron Ecay
  2018-10-10 15:18                   ` Eli Zaretskii
  2018-10-10  3:58                 ` Richard Stallman
  1 sibling, 2 replies; 98+ messages in thread
From: Andreas Schwab @ 2018-10-09 19:18 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, npostavs, drew.adams, emacs-devel

On Oct 09 2018, Paul Eggert <eggert@cs.ucla.edu> wrote:

> Sure, that's fine. We can limit symbol warnings to the symbols containing
> non-ASCII chars all of which are confusable with ASCII. This will warn
> about "саn" (with Cyrillic "с" and "а") but not about "сталин" (with
> Cyrillic "a").

I'm pretty sure you can find many Russian words that are written with
only Latin-alike letters.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 14:43         ` Noam Postavsky
  2018-10-09 15:30           ` Paul Eggert
@ 2018-10-10  3:57           ` Richard Stallman
  2018-10-10 14:41             ` Eli Zaretskii
  1 sibling, 1 reply; 98+ messages in thread
From: Richard Stallman @ 2018-10-10  3:57 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: eliz, eggert, drew.adams, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > The problem is that deciding which characters are confusable and hence
  > require backslash escaping is based on a shifting mess of heuristics.
  > So I don't think it's workable to signal a hard error for this. Both
  > in terms of false positives which could mean possibly breaking code,
  > and false negatives which means we would be giving a false sense of
  > security.

In principle, that's a valid point.  But can't we assemble a fixed list
of characters that are confusable with the usual fonts, and base the
warning on that list?

We could conceivably have a feature that would check any fontset for
confusable characters, and warn the user if it has confusable pairs
that are not in the usual list of confusable pairs.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 17:07               ` Paul Eggert
  2018-10-09 19:18                 ` Andreas Schwab
@ 2018-10-10  3:58                 ` Richard Stallman
  1 sibling, 0 replies; 98+ messages in thread
From: Richard Stallman @ 2018-10-10  3:58 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, npostavs, drew.adams, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Sure, that's fine. We can limit symbol warnings to the symbols containing 
  > non-ASCII chars all of which are confusable with ASCII. This will warn about 
  > "саn" (with Cyrillic "с" and "а") but not about "сталин" (with Cyrillic "a").

I agree, but we need to extend this protection to things other than
program code.

  > The point of the guideline is not to warn about every possible confusable 
  > character; it's to defend against malicious code.

Confusables are dangerous in host names, too.

The same principle could apply: if the host name contains, as well as
the confusables, some non-confusable non-ASCII characters from the
same Unicode page, there is no need to warn about it.

Perhaps there are other cases, too.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 19:18                 ` Andreas Schwab
@ 2018-10-10  9:39                   ` Aaron Ecay
  2018-10-10 11:18                     ` Garreau, Alexandre
  2018-10-10 15:18                   ` Eli Zaretskii
  1 sibling, 1 reply; 98+ messages in thread
From: Aaron Ecay @ 2018-10-10  9:39 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel

On 9 October 2018, Andreas Schwab wrote:
> 
> On Oct 09 2018, Paul Eggert <eggert@cs.ucla.edu> wrote:
> 
>> Sure, that's fine. We can limit symbol warnings to the symbols containing
>> non-ASCII chars all of which are confusable with ASCII. This will warn
>> about "саn" (with Cyrillic "с" and "а") but not about "сталин" (with
>> Cyrillic "a").
> 
> I'm pretty sure you can find many Russian words that are written with
> only Latin-alike letters.

Should this be a warning?

(let ((с cyrillic-ess)) ...)

What about this?

(let ((c latin-c)
      (с cyrillic-ess))
  ...)

IMO the answer to both questions is yes (because Latin letters are
used for elisp special forms like “let,” so they should be inherently
privileged) – but I only use Latin letters in programs I write, so I
probably donʼt have the perspective to know how annoying such warnings
could be to regular users of other scripts.

However, since warnings are only (potentially) annoying rather than
changing the behavior of programs, it makes sense to be aggressive with
them, in order to gauge how disruptive it would be to actually change
the way text is interpreted as code.

PS An issue that seems related is that it is presently possible to bind
the symbols ö (one character, U+00F6 LATIN SMALL LETTER O WITH DIAERESIS)
and ö (two characters, U+006F LATIN SMALL LETTER O followed by U+0308
COMBINING DIAERESIS) to different values.  This seems like the kind of
thing that should be (at least) warned about and (probably) disallowed.
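
(Illustrative sketch, not part of the original message.) The following shows
the two spellings interning as distinct symbols, and how normalizing names to
NFC first would merge them. The helper names are hypothetical;
`ucs-normalize-NFC-string` comes from the ucs-normalize library shipped with
Emacs.

```elisp
(require 'ucs-normalize)

(defun my-same-symbol-p (name1 name2)
  "Return t if interning NAME1 and NAME2 gives the same symbol."
  (eq (intern name1) (intern name2)))

(defun my-same-symbol-after-nfc-p (name1 name2)
  "Like `my-same-symbol-p', but normalize both names to NFC first."
  (my-same-symbol-p (ucs-normalize-NFC-string name1)
                    (ucs-normalize-NFC-string name2)))

;; Precomposed U+00F6 versus o followed by U+0308: different symbols.
(my-same-symbol-p "\u00f6" "o\u0308")
;; After NFC normalization both names are the single character U+00F6.
(my-same-symbol-after-nfc-p "\u00f6" "o\u0308")
```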

-- 
Aaron Ecay




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-10  9:39                   ` Aaron Ecay
@ 2018-10-10 11:18                     ` Garreau, Alexandre
  2018-10-10 14:31                       ` Eli Zaretskii
  0 siblings, 1 reply; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-10 11:18 UTC (permalink / raw)
  To: Aaron Ecay; +Cc: Andreas Schwab, emacs-devel

On 2018-10-10 at 10:39, Aaron Ecay wrote:
> PS An issue that seems related is that it is presently possible to bind
> the symbols ö (one character, U+00F6 LATIN SMALL LETTER O WITH DIAERESIS)
> and ö (two characters, U+006F LATIN SMALL LETTER O followed by U+0308
> COMBINING DIAERESIS) to different values.  This seems like the kind of
> thing that should be (at least) warned about and (probably) disallowed.

Oh yes… I confirm, this is the case here too (but my version is only
25.1.1, so maybe it changed), it really shouldn’t be that way at all
(tried with é and é (btw these two display quite strangely slightly
differently…))…




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-10 11:18                     ` Garreau, Alexandre
@ 2018-10-10 14:31                       ` Eli Zaretskii
  0 siblings, 0 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-10 14:31 UTC (permalink / raw)
  To: Garreau, Alexandre; +Cc: aaronecay, schwab, emacs-devel

> From: "Garreau\, Alexandre" <galex-713@galex-713.eu>
> Date: Wed, 10 Oct 2018 13:18:09 +0200
> Cc: Andreas Schwab <schwab@linux-m68k.org>, emacs-devel@gnu.org
> 
> (tried with é and é (btw these two display quite strangely slightly
> differently…))…

It depends on your fonts.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-10  3:57           ` Richard Stallman
@ 2018-10-10 14:41             ` Eli Zaretskii
  2018-10-11  5:01               ` Richard Stallman
  0 siblings, 1 reply; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-10 14:41 UTC (permalink / raw)
  To: rms; +Cc: eggert, emacs-devel, drew.adams, npostavs

> From: Richard Stallman <rms@gnu.org>
> Cc: eggert@cs.ucla.edu, eliz@gnu.org, drew.adams@oracle.com,
> 	emacs-devel@gnu.org
> Date: Tue, 09 Oct 2018 23:57:01 -0400
> 
> We could conceivably have a feature that would check any fontset for
> confusable characters

I'm not an expert on fonts, but I don't think this is reliable enough.
Are you saying that a font might use the same glyph for similarly
looking characters from different scripts?  If that is true, then yes,
we could detect that.  But the fact that we have such a font doesn't
yet mean there is a problem worth warning the user about, since these
characters need to be _used_ in a certain context to cause confusion.




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-09 19:18                 ` Andreas Schwab
  2018-10-10  9:39                   ` Aaron Ecay
@ 2018-10-10 15:18                   ` Eli Zaretskii
  2018-10-10 15:43                     ` Drew Adams
  2018-10-10 16:08                     ` Yuri Khan
  1 sibling, 2 replies; 98+ messages in thread
From: Eli Zaretskii @ 2018-10-10 15:18 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: eggert, emacs-devel, drew.adams, npostavs

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Tue, 09 Oct 2018 21:18:51 +0200
> Cc: Eli Zaretskii <eliz@gnu.org>, npostavs@users.sourceforge.net,
> 	drew.adams@oracle.com, emacs-devel@gnu.org
> 
> I'm pretty sure you can find many Russian words that are written with
> only Latin-alike letters.

I wrote the following toy program:

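  (require 'seq)  ; for `seq-difference'

  ;; Run in a buffer holding one word per line, with point at the start.
  (let ((confusing '(?а ?е ?о ?р ?с ?у ?х))
        (buf (get-buffer-create "*confusing*")))
    (while (not (eobp))
      (let* ((word (buffer-substring (line-beginning-position)
                                     (line-end-position)))
             (chars (append word nil)))
        ;; Keep the word only if every character is in `confusing'.
        (if (null (seq-difference chars confusing))
            (with-current-buffer buf
              (insert word ?\n))))
      (forward-line)))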

and ran it on a list of 174800 words from a Russian dictionary.  The
result was 60 words that used only Latin-alike letters.  So, not too
many, but not just a few, either.

Of course, there are many more non-word combinations of the above
letters that might look like Latin words.

Also note that "confusability" sometimes depends on letter-case.  For
example, the lower-case "вор" doesn't look like a Latin word, but the
upper-case "ВОР" does.




* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-10 15:18                   ` Eli Zaretskii
@ 2018-10-10 15:43                     ` Drew Adams
  2018-10-10 16:08                     ` Yuri Khan
  1 sibling, 0 replies; 98+ messages in thread
From: Drew Adams @ 2018-10-10 15:43 UTC (permalink / raw)
  To: Eli Zaretskii, Andreas Schwab; +Cc: eggert, emacs-devel, npostavs

It sounds like the contexts where a char might be confused
with another are varied and depend on things that can even
include user attention and intention.

If we want to help users be aware of character-confusion
possibilities then I think whatever we offer them in this
regard needs to be (1) optional and (2) configurable
(granularity, specifying contexts/uses/conditions, etc.).

I think we can offer to help by highlighting characters (or
their surrounding contexts, e.g., when a char is tiny or
otherwise unobtrusive or invisible).

I think we should avoid raising errors, but that could be
an option that some users might want to choose in some
contexts. We could perhaps offer a range of help
responses, from a range of highlighting possibilities to
outright error-raising.

We can have code that tries to be clever, but that should
only be used if asked for by a user. We should not try to
second-guess text or users by default.

The last thing we should want is to bother users by
default, or systematically, warning them left and right
about possibilities of confusion. Such warnings or
notifications or highlights need to be opt-in, IMHO.

Above all, Emacs, and especially Emacs Lisp, should
continue to be an environment where you can do what
you want without obstruction or unnecessary
hand-holding or helicopter-parenting.

(Note that I qualified that with "unnecessary". If there
is some real, strong, unambiguous danger that we can
identify then of course we need to offer protection up
front. That help would not be "unnecessary".)
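
The opt-in highlighting described above could look roughly like the
following. This is only a minimal sketch: the names
my-confusable-quote-regexp and my-confusable-quote-mode are
hypothetical, the character set is limited to four curved quotes for
illustration, and nothing here is code proposed in this thread.

```elisp
;; Hypothetical names; an illustration only, not an Emacs feature.
(defvar my-confusable-quote-regexp "[\u2018\u2019\u201C\u201D]"
  "Regexp matching curved quotes that can be mistaken for ASCII ' or \".")

(define-minor-mode my-confusable-quote-mode
  "Highlight curved quotes in the current buffer.
Off by default, so nothing is flagged unless the user asks for it."
  :lighter " CQ"
  (let ((keywords
         ;; Override existing fontification so the quotes stand out
         ;; even inside strings and comments.
         `((,my-confusable-quote-regexp 0 'font-lock-warning-face t))))
    (if my-confusable-quote-mode
        (font-lock-add-keywords nil keywords)
      (font-lock-remove-keywords nil keywords))
    (font-lock-flush)))
```

A user who wants this would enable it explicitly, for example with
(add-hook 'emacs-lisp-mode-hook #'my-confusable-quote-mode); users who
do nothing see no change, and no error is ever raised.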




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-10 15:18                   ` Eli Zaretskii
  2018-10-10 15:43                     ` Drew Adams
@ 2018-10-10 16:08                     ` Yuri Khan
  2018-10-15 20:30                       ` Juri Linkov
  1 sibling, 1 reply; 98+ messages in thread
From: Yuri Khan @ 2018-10-10 16:08 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Noam Postavsky, Paul Eggert, Andreas Schwab, Drew Adams,
	Emacs developers

On Wed, Oct 10, 2018 at 10:20 PM Eli Zaretskii <eliz@gnu.org> wrote:

>   (let ((confusing '(?а ?е ?о ?р ?с ?у ?х))

> Also note that "confusability" sometimes depends on letter-case.  For
> example, the lower-case "вор" doesn't look like a Latin word, but the
> upper-case "ВОР" does.

Yes. In uppercase: ?А ?В ?Е ?К ?М ?Н ?О ?Р ?С ?Т ?Х. (Coincidentally,
Russian car license plates use confusable letters exclusively, so that
people who are not fluent in the Cyrillic script could still report
violations.)

Also, confusability depends on font style. In lowercase italic, these
pairs are also confusable: д/g, з/z, и/u, п/n, т/m, ч/r.
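
The lower-case list from the toy program and the upper-case list above
can be combined into a case-aware check. A minimal sketch, assuming
hypothetical names (my-confusable-lower, my-confusable-upper,
my-latin-lookalike-p); it ignores the italic-only pairs and is not
code from either message.

```elisp
(require 'seq)  ; for `seq-every-p'

;; Lower-case Cyrillic letters from the toy program (hypothetical name).
(defconst my-confusable-lower '(?а ?е ?о ?р ?с ?у ?х))
;; Upper-case Cyrillic letters listed above (hypothetical name).
(defconst my-confusable-upper '(?А ?В ?Е ?К ?М ?Н ?О ?Р ?С ?Т ?Х))

(defun my-latin-lookalike-p (word)
  "Return non-nil if WORD is non-empty and uses only confusable letters.
The check is case-sensitive: each character must appear in
`my-confusable-lower' or `my-confusable-upper'."
  (and (> (length word) 0)
       (seq-every-p (lambda (ch)
                      (or (memq ch my-confusable-lower)
                          (memq ch my-confusable-upper)))
                    word)))
```

With these lists, (my-latin-lookalike-p "ВОР") is non-nil while
(my-latin-lookalike-p "вор") is nil, because lower-case "в" is not in
the lower-case list.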




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-10 14:41             ` Eli Zaretskii
@ 2018-10-11  5:01               ` Richard Stallman
  0 siblings, 0 replies; 98+ messages in thread
From: Richard Stallman @ 2018-10-11  5:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, drew.adams, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I'm not an expert on fonts, but I don't think this is reliable enough.
  > Are you saying that a font might use the same glyph for similarly
  > looking characters from different scripts?  If that is true, then yes,
  > we could detect that.  But the fact that we have such a font doesn't
  > yet mean there is a problem worth warning the user about, since these
  > characters need to be _used_ in a certain context to cause confusion.

Which characters are confusable is one question,
and which contexts they can cause confusion in is another question.

We need to distinguish those, to factor the problem.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-10 16:08                     ` Yuri Khan
@ 2018-10-15 20:30                       ` Juri Linkov
  0 siblings, 0 replies; 98+ messages in thread
From: Juri Linkov @ 2018-10-15 20:30 UTC (permalink / raw)
  To: Yuri Khan
  Cc: Paul Eggert, Noam Postavsky, Emacs developers, Andreas Schwab,
	Eli Zaretskii, Drew Adams

>>   (let ((confusing '(?а ?е ?о ?р ?с ?у ?х))
>
>> Also note that "confusability" sometimes depends on letter-case.  For
>> example, the lower-case "вор" doesn't look like a Latin word, but the
>> upper-case "ВОР" does.
>
> Yes. In uppercase: ?А ?В ?Е ?К ?М ?Н ?О ?Р ?С ?Т ?Х. (Coincidentally,
> Russian car license plates use confusable letters exclusively, so that
> people who are not fluent in the Cyrillic script could still report
> violations.)

There are programs composed completely of confusable characters like
http://compuhumour.narod.ru/listing/prog_tormoz.html




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-05  0:03 ` Noam Postavsky
  2018-10-05  1:01   ` Paul Eggert
  2018-10-06 15:40   ` eval-last-sexp / C-x C-e, and punctuation like `?’' [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?)] Garreau, Alexandre
@ 2018-10-16 12:48   ` Garreau, Alexandre
  2 siblings, 0 replies; 98+ messages in thread
From: Garreau, Alexandre @ 2018-10-16 12:48 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Drew Adams, Emacs developers

On 2018-10-04 at 20:03, Noam Postavsky wrote:
> On Fri, 2 Feb 2018 at 17:24, Noam Postavsky
> <npostavs@users.sourceforge.net> wrote:
>>
>> In Emacs 26 and earlier the following is valid lisp code:
>>
>> (setq ’bar 42)
>> (setq foo ’bar)
>>
>> In the current master branch, this will signal (invalid-read-syntax
>> "strange quote" "’").

Btw, isn’t there any way to extend or redefine, at least locally,
reader behavior such as that of “'”, “,”/“,@”, “`”, “.”, “:”, etc.?

For instance, to experiment with such fancy and strange quotes in
source code: people who really want to use them *might* want to use
them as quotes rather than as symbol constituents.  Within ASCII,
punctuation often cannot be part of a symbol without escaping (for
instance “"#'(),;\[]`”; the set is small, but the rules are not that
simple), though there are many exceptions: “!”, “?”, “:” (though it
can have a special meaning), “.” (though it does not work alone), and
other non-human-text “punctuation” (also called “special characters”)
such as “%&*+/<>=@^_|”.
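
For reference, the escaping behavior quoted at the top of this message
can be tried directly with `read'. A minimal sketch, assuming a
hypothetical helper name (my-read-symbol-name); what the unescaped
form returns depends on the Emacs version, as described in the quoted
text.

```elisp
;; Hypothetical helper for experimenting with the reader.
(defun my-read-symbol-name (text)
  "Read one Lisp object from TEXT and return its name if it is a symbol.
If the reader rejects TEXT, return (rejected . DATA) instead, where
DATA is the data of the `invalid-read-syntax' error."
  (condition-case err
      (let ((obj (read text)))
        (and (symbolp obj) (symbol-name obj)))
    (invalid-read-syntax (cons 'rejected (cdr err)))))

;; "\u2019" is RIGHT SINGLE QUOTATION MARK.  The doubled backslash puts
;; one literal backslash in front of it in the text given to the reader.
(my-read-symbol-name "\\\u2019bar")  ; escaped: a symbol whose name starts with the quote
(my-read-symbol-name "\u2019bar")    ; unescaped: accepted or rejected, depending on version
```

The escaped form reads as an ordinary symbol, which is why backslash
escaping is the portable way to keep such a character as a symbol
constituent.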




* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
  2018-10-06 14:00               ` Eli Zaretskii
@ 2018-10-24 22:25                 ` Noam Postavsky
  0 siblings, 0 replies; 98+ messages in thread
From: Noam Postavsky @ 2018-10-24 22:25 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Paul Eggert, Drew Adams, Garreau, Alexandre, Emacs developers

On Sat, 6 Oct 2018 at 10:00, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > afaik there are also problems in other contents than source code about
> > undistinguishable unicode character, such as the latin ?o and the
> > cyrillic ?о (the first example of unicode-powered typosquatting I ever
> > heard), the different spaces (sometimes not distinguishable in monospace
[...]
>
> This is the more general problem I mentioned up-thread:
>
>   http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00052.html
>
> I agree that we should improve Emacs in that area, but I think it
> would be wrong to hold off fixing the issue with quotes because the
> more general problem is not yet solved.

Right, I do see the quotes thing as an obvious place to start, and
given how difficult the more general issue is, starting with
enhancements to existing warnings and errors makes sense. It addresses
the immediate problem of confusion from curly quotes, and we can add
more confusable characters later without fear of breaking anything.




end of thread, other threads:[~2018-10-24 22:25 UTC | newest]

Thread overview: 98+ messages
2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky
2018-02-02 22:52 ` Paul Eggert
2018-02-03  0:00   ` Drew Adams
2018-02-03  0:09     ` Paul Eggert
2018-02-03  0:39       ` Drew Adams
2018-02-03  8:33 ` Eli Zaretskii
2018-02-03 16:16   ` Drew Adams
2018-02-03 17:05     ` Eli Zaretskii
2018-02-04  1:16       ` Michael Heerdegen
2018-02-04  1:25         ` Clément Pit-Claudel
2018-02-04  2:05           ` Drew Adams
2018-02-04  2:06           ` Michael Heerdegen
2018-02-04 10:34           ` Alan Third
2018-02-04 15:36             ` Clément Pit-Claudel
2018-02-04 17:37               ` Eli Zaretskii
2018-02-04 21:31                 ` Noam Postavsky
2018-02-04 11:15         ` Alan Mackenzie
2018-02-04 15:54           ` Drew Adams
2018-02-04 14:47         ` Noam Postavsky
2018-02-04  1:55       ` Drew Adams
2018-02-04  2:10         ` Noam Postavsky
2018-02-05  1:06       ` Why "symbol's value" error about a list? Richard Stallman
2018-02-05 20:35         ` Alan Mackenzie
2018-02-05 21:46           ` Drew Adams
2018-02-06  4:13             ` Eli Zaretskii
2018-02-06  7:32               ` Tim Cross
2018-02-06  7:40                 ` Eli Zaretskii
2018-02-06 15:45                 ` Drew Adams
2018-02-06 15:45               ` Drew Adams
2018-02-06 19:17                 ` Eli Zaretskii
2018-02-06 14:51           ` Richard Stallman
2018-02-06 11:27         ` Noam Postavsky
2018-02-06 14:53           ` Richard Stallman
2018-02-06 18:59             ` Eli Zaretskii
2018-02-07  2:40               ` Richard Stallman
2018-02-07  3:42                 ` Eli Zaretskii
2018-02-06 18:52           ` Eli Zaretskii
2018-02-05  1:06       ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Richard Stallman
2018-02-03 18:13 ` Aaron Ecay
2018-02-04  2:05   ` Drew Adams
2018-02-04  4:51   ` Paul Eggert
2018-02-04  9:47     ` Andreas Schwab
2018-02-04 15:04     ` Noam Postavsky
2018-02-04 17:33       ` Eli Zaretskii
2018-02-04 19:36         ` Paul Eggert
2018-02-04 19:55           ` Philipp Stephani
2018-02-04 20:10           ` Eli Zaretskii
2018-02-04 20:36             ` Eli Zaretskii
2018-02-04 20:48               ` Paul Eggert
2018-02-04 20:59                 ` Clément Pit-Claudel
2018-10-05  0:03 ` Noam Postavsky
2018-10-05  1:01   ` Paul Eggert
2018-10-05  8:43     ` Eli Zaretskii
2018-10-05 23:02       ` Paul Eggert
2018-10-06  0:20         ` Drew Adams
2018-10-06  9:14           ` Alan Mackenzie
2018-10-06 14:34             ` Stefan Monnier
2018-10-06 14:57             ` Drew Adams
2018-10-06 15:42               ` Garreau, Alexandre
2018-10-06 16:10             ` Paul Eggert
2018-10-06 16:17           ` Paul Eggert
2018-10-07  1:13             ` Drew Adams
2018-10-08  3:51             ` Richard Stallman
2018-10-06 10:11         ` Eli Zaretskii
2018-10-06 15:51           ` Paul Eggert
2018-10-06 16:45             ` Eli Zaretskii
2018-10-06 18:03               ` Paul Eggert
2018-10-06 18:29                 ` Eli Zaretskii
2018-10-06 19:18                   ` Paul Eggert
2018-10-06 19:30                   ` Paul Eggert
2018-10-06 19:32                   ` Garreau, Alexandre
2018-10-06 11:22         ` Garreau, Alexandre
2018-10-06 11:50           ` Eli Zaretskii
2018-10-06 12:10             ` Garreau, Alexandre
2018-10-06 14:00               ` Eli Zaretskii
2018-10-24 22:25                 ` Noam Postavsky
2018-10-06 13:15             ` Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] Garreau, Alexandre
2018-10-06 14:01               ` Eli Zaretskii
2018-10-06 16:24           ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Paul Eggert
2018-10-06 16:40             ` Stefan Monnier
2018-10-09 14:43         ` Noam Postavsky
2018-10-09 15:30           ` Paul Eggert
2018-10-09 16:13             ` Eli Zaretskii
2018-10-09 17:07               ` Paul Eggert
2018-10-09 19:18                 ` Andreas Schwab
2018-10-10  9:39                   ` Aaron Ecay
2018-10-10 11:18                     ` Garreau, Alexandre
2018-10-10 14:31                       ` Eli Zaretskii
2018-10-10 15:18                   ` Eli Zaretskii
2018-10-10 15:43                     ` Drew Adams
2018-10-10 16:08                     ` Yuri Khan
2018-10-15 20:30                       ` Juri Linkov
2018-10-10  3:58                 ` Richard Stallman
2018-10-10  3:57           ` Richard Stallman
2018-10-10 14:41             ` Eli Zaretskii
2018-10-11  5:01               ` Richard Stallman
2018-10-06 15:40   ` eval-last-sexp / C-x C-e, and punctuation like `?’' [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?)] Garreau, Alexandre
2018-10-16 12:48   ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Garreau, Alexandre
