unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
@ 2018-01-22 22:17 Alan Mackenzie
  2018-01-22 22:42 ` Drew Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Mackenzie @ 2018-01-22 22:17 UTC (permalink / raw)
  To: 30217

Hello, Emacs.

In the new NEWS in the recent pretest, at L1381 we have:

    ** To avoid confusion caused by "smart quotes", the reader no longer
    accepts Lisp symbols which begin with the following quotation
    characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
                                     ^^^^

, which leaves it unclear whether it's the "smart quotes" or the Lisp
symbols which need escaping.

I suggest replacing "they" with either "these quotes" or "these symbols"
depending on the desired meaning.

-- 
Alan Mackenzie (Nuremberg, Germany).






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-22 22:17 bug#30217: Ambiguity in NEWS in emacs-26.0.91 Alan Mackenzie
@ 2018-01-22 22:42 ` Drew Adams
  2018-01-23  0:42   ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-01-22 22:42 UTC (permalink / raw)
  To: Alan Mackenzie, 30217

> In the new NEWS in the recent pretest, at L1381 we have:
> 
>     ** To avoid confusion caused by "smart quotes", the reader no longer
>     accepts Lisp symbols which begin with the following quotation
>     characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
>                                      ^^^^
> 
> , which leaves it unclear whether it's the "smart quotes" or the Lisp
> symbols which need escaping.
> 
> I suggest replacing "they" with either "these quotes" or "these symbols"
> depending on the desired meaning.

Even if that ambiguity gets resolved, I have no idea what
the text means.  What does it mean for the Lisp reader to
"accept a Lisp symbol"?

Please describe exactly what the reader does when it reads
one of those characters followed by Lisp-symbol syntax, in
both cases: char escaped and char not escaped.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-22 22:42 ` Drew Adams
@ 2018-01-23  0:42   ` Noam Postavsky
  2018-01-23  0:56     ` Drew Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-01-23  0:42 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

Drew Adams <drew.adams@oracle.com> writes:

>> In the new NEWS in the recent pretest, at L1381 we have:
>> 
>>     ** To avoid confusion caused by "smart quotes", the reader no longer
>>     accepts Lisp symbols which begin with the following quotation
>>     characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
>>                                      ^^^^
>> 
>> , which leaves it unclear whether it's the "smart quotes" or the Lisp
>> symbols which need escaping.

> Even if that ambiguity gets resolved, I have no idea what
> the text means.  What does it mean for the Lisp reader to
> "accept a Lisp symbol"?
>
> Please describe exactly what the reader does when it reads
> one of those characters followed by Lisp-symbol syntax, in
> both cases: char escaped and char not escaped.

How about this:

    ** To avoid confusion caused by "smart quotes", the reader signals an
    error when reading Lisp symbols which begin with one of the following
    quotation characters: ‘’‛“”‟〞＂＇.  A symbol beginning with such a
    character can be written by escaping the quotation character with a
    backslash.  For example:

        (read "‘smart") => Lisp error: (invalid-read-syntax "strange quote" "‘")
        (read "\\‘smart") == (intern "‘smart")
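
The rule in the proposed wording can also be modeled outside Emacs.
What follows is only a minimal standalone sketch in C, not the reader
code itself: the names `confusable_p', `first_code_point' and
`symbol_token_rejected' are hypothetical, the UTF-8 decoder assumes
well-formed input, and the code points are the ones listed in the
`confusable_symbol_character_p' patch later in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* The quotation characters named in the NEWS entry (same code points
   as the switch in the patch later in this thread).  */
bool
confusable_p (unsigned long ch)
{
  switch (ch)
    {
    case 0x2018: case 0x2019: case 0x201B:
    case 0x201C: case 0x201D: case 0x201F:
    case 0x301E: case 0xFF02: case 0xFF07:
      return true;
    default:
      return false;
    }
}

/* Decode the first code point of the UTF-8 string S.  Hypothetical
   helper: assumes well-formed input and does no validation.  */
unsigned long
first_code_point (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  if (p[0] < 0x80)
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)
    return ((unsigned long) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
  if ((p[0] & 0xF0) == 0xE0)
    return ((unsigned long) (p[0] & 0x0F) << 12)
           | ((unsigned long) (p[1] & 0x3F) << 6) | (p[2] & 0x3F);
  return ((unsigned long) (p[0] & 0x07) << 18)
         | ((unsigned long) (p[1] & 0x3F) << 12)
         | ((unsigned long) (p[2] & 0x3F) << 6) | (p[3] & 0x3F);
}

/* Model of the rule: an unescaped symbol token is rejected when its
   first character is one of the quotes above.  A leading backslash
   escapes that character, so the token is accepted.  */
bool
symbol_token_rejected (const char *token)
{
  if (token[0] == '\\')
    return false;
  return confusable_p (first_code_point (token));
}
```

In this model, symbol_token_rejected ("\xE2\x80\x99" "bar") is true
(the token starts with U+2019), while the same token with a leading
backslash is accepted.  Only the first character is examined.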






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23  0:42   ` Noam Postavsky
@ 2018-01-23  0:56     ` Drew Adams
  2018-01-23  1:40       ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-01-23  0:56 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > Please describe exactly what the reader does when it reads
> > one of those characters followed by Lisp-symbol syntax, in
> > both cases: char escaped and char not escaped.
> 
> How about this:
> 
>     ** To avoid confusion caused by "smart quotes", the reader signals an
>     error when reading Lisp symbols which begin with one of the following
>     quotation characters: ‘’‛“”‟〞＂＇.  A symbol beginning with such a
>     character can be written by escaping the quotation character with a
>     backslash.  For example:
> 
>         (read "‘smart") => Lisp error: (invalid-read-syntax "strange quote" "‘")
>         (read "\\‘smart") == (intern "‘smart")

Yes, that's clear (to me).  I would never have guessed that
the previous description meant that.

But may I ask why such "strange quote" characters are not
taken as lisp-symbol constituent characters?  Why the need
to escape them?  Why are they treated specially?

That description describes a workaround "to avoid confusion",
but it's not clear why we need "to avoid confusion".  What
good is the error behavior in the first place?  If such chars
are not to be treated as normal symbol chars it should be
because they have some special treatment/behavior/interpretation
for Lisp, no?  If the only non-escaped behavior is to raise an
error then that just sounds like a bug, to me.

I'm probably missing something important, but whatever that
is it does not seem to be conveyed by the NEWS description.
At all.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23  0:56     ` Drew Adams
@ 2018-01-23  1:40       ` Noam Postavsky
  2018-01-23  6:07         ` Drew Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-01-23  1:40 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

Drew Adams <drew.adams@oracle.com> writes:

> But may I ask why such "strange quote" characters are not
> taken as lisp-symbol constituent characters?  Why the need
> to escape them?  Why are they treated specially?
>
> That description describes a workaround "to avoid confusion",
> but it's not clear why we need "to avoid confusion".

To give a less confusing error in cases like Bug#2967 and Bug#23425.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23  1:40       ` Noam Postavsky
@ 2018-01-23  6:07         ` Drew Adams
  2018-01-23  6:21           ` Drew Adams
  2018-01-23 12:54           ` Noam Postavsky
  0 siblings, 2 replies; 18+ messages in thread
From: Drew Adams @ 2018-01-23  6:07 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > But may I ask why such "strange quote" characters are not
> > taken as lisp-symbol constituent characters?  Why the need
> > to escape them?  Why are they treated specially?
> >
> > That description describes a workaround "to avoid confusion",
> > but it's not clear why we need "to avoid confusion".
> 
> To give a less confusing error in cases like Bug#2967 and Bug#23425.

Seriously?  This is an absolutely horrible "fix" for each
of those problems.  This "cure" is worse than either of
those diseases, and as we all know, I think such diseases
are pretty awful.

The error message seems to be _super_ confusing.  It gives
no indication of problems such as those bugs, and it does
not begin to enlighten anyone about the confusion at their
heart.

If no one has a real fix for such bugs yet then please just
leave them open until someone comes up with a good idea.
This "fix" is not a good idea - for those bugs at least.

If this fix has some other purpose, then let's please
know what that is and talk about it.

But if such problems are the only reason for this "fix"
then please consider getting rid of such silly and useless
escaping and just change the error message to make clear
just what confusion it is meant to address: say that the
character is not an ascii apostrophe or whatever, if that
confusion is the real problem this is trying to solve.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23  6:07         ` Drew Adams
@ 2018-01-23  6:21           ` Drew Adams
  2018-01-23 12:54           ` Noam Postavsky
  1 sibling, 0 replies; 18+ messages in thread
From: Drew Adams @ 2018-01-23  6:21 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

And besides - where do you stop doing this kind of thing?

Do we do something similar for characters that can
be mistaken for a period, in case you use one in an
attempt at dotted-pair syntax?

Do we do something similar for chars that can be
mistaken for a comma, inside backquoted sexps?

Do we do something similar for chars that can be
mistaken for a backquote?  An at-sign?  Ordinary
parentheses?

I really hope you reconsider this.  To me it looks
like an ugly hack that can bring only harm (including
more, not less, confusion), not good.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23  6:07         ` Drew Adams
  2018-01-23  6:21           ` Drew Adams
@ 2018-01-23 12:54           ` Noam Postavsky
  2018-01-23 15:53             ` Drew Adams
  1 sibling, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-01-23 12:54 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

Drew Adams <drew.adams@oracle.com> writes:

>> To give a less confusing error in cases like Bug#2967 and Bug#23425.
>
> Seriously?  This is an absolutely horrible "fix" for each
> of those problems.  This "cure" is worse than either of
> those diseases, and as we all know, I think such diseases
> are pretty awful.
>
> The error message seems to be _super_ confusing.  It gives
> no indication of problems such as those bugs, and it does
> not begin to enlighten anyone about the confusion at their
> heart.

The OP of Bug#2967 says

    I think it would be good if emacs looked for smart quotes in .emacs
    files and gave a warning or notice if it detected them. This would
    help troubleshooting.

Which is exactly what's being done now.  The OP of Bug#23425 says

    When this output is fed back into Emacs with M-:, it produces an obscure
    error message.

The Emacs 25 error for the expression in question is

    (wrong-number-of-arguments setq 31)

In Emacs 26.0.91, it is

    (invalid-read-syntax "strange quote" "’")

I think this is an improvement, since it does, in fact, indicate there
is a problematic use of ’.

Why do you think signalling an error in this case is a bad idea?

> If no one has a real fix for such bugs yet then please just
> leave them open until someone comes up with a good idea.
> This "fix" is not a good idea - for those bugs at least.
>
> If this fix has some other purpose, then let's please
> know what that is and talk about it.
>
> But if such problems are the only reason for this "fix"
> then please consider getting rid of such silly and useless
> escaping and just change the error message

I don't quite understand what you mean by "getting rid of... escaping"
but keeping the error message.  It sounds like you are contradicting
yourself.

> to make clear just what confusion it is meant to address: say that the
> character is not an ascii apostrophe or whatever, if that confusion is
> the real problem this is trying to solve.

Changing the error message is always possible, of course.  I'm not sure
if bringing "ascii" into it would make things clearer though.  Concrete
suggestions welcome.

> And besides - where do you stop doing this kind of thing?
>
> Do we do something similar for characters that can
> be mistaken for a period, in case you use one in an
> attempt at dotted-pair syntax?
>
> Do we do something similar for chars that can be
> mistaken for a comma, inside backquoted sexps?
>
> Do we do something similar for chars that can be
> mistaken for a backquote?  An at-sign?  Ordinary
> parentheses?

Maybe everything in the "Unicode confusables" listing?  Practically
speaking, I've never heard of problems with other characters, except
perhaps in programming "puzzles", obfuscated code contents and the like.

> I really hope you reconsider this.  To me it looks
> like an ugly hack that can bring only harm (including
> more, not less, confusion), not good.

Do you have any specific harms/confusion in mind?







* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23 12:54           ` Noam Postavsky
@ 2018-01-23 15:53             ` Drew Adams
  2018-01-23 23:00               ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-01-23 15:53 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> The OP of Bug#2967 says
>     I think it would be good if emacs looked for smart
>     quotes in .emacs files and gave a warning or notice
>     if it detected them. This would help troubleshooting.
>
> Which is exactly what's being done now.

It's not necessarily appropriate to satisfy every
suggestion offered in every bug report. ;-)

I'm not sure such a warning is good, rather than bad.
I think not.  If a warning was printed for each "smart
quote" occurrence in a file that would surely be bad,
IMO.

An Emacs-Lisp file can contain pretty much anything,
including lots of natural-language text.  Are we now
issuing warnings even for "smart quotes" in comments
and strings?  That would definitely be a mistake.

In any case, I don't care much about byte-compiler
warnings - they are not the problem I responded to.
They can be ignored when they are not particularly
relevant.  The fact that they can sometimes represent
noise is at most an annoyance, not a real problem.

> The OP of Bug#23425 says
>   When this output is fed back into Emacs with M-:,

That represents pilot error, no doubt.

For `M-:' we _might_ try to provide an error message
that says you included a "smart quote" and say you
might want to check that that's what you intended.

I'm not suggesting we do that - I prefer not.  But
it's conceivable, if someone is really gung ho about
solving the purported problem.

I doubt that would be a good idea even for `M-:'.
But it would surely not be appropriate for other
contexts.

And even for `M-:' it's not obvious that we would
come up with a good test for the cases where it
would be helpful rather than confusing.

>   it produces an obscure error message.
> 
> The Emacs 25 error for the expression in question is
>   (wrong-number-of-arguments setq 31)

Which tells you pretty much that setq is missing an
argument or has too many, which makes you look at its
arguments.  Not so obscure.  And accurate.

> In Emacs 26.0.91, it is
>   (invalid-read-syntax "strange quote" "’")

Which is completely obscure, IMO.  Invalid read
syntax when reading what?  What's invalid about it?

In fact, it is not invalid.  It has never been invalid,
and it shouldn't suddenly be considered invalid now.

Confusion, not understanding an accurate error msg,
is not the same thing as Lisp itself having a bug
because such a character is included in a symbol.

> I think this is an improvement, since it does, in fact,
> indicate there is a problematic use of ’.

There is NOT any problematic use of ’ there.  The
user's understanding might be problematic, but that
read syntax is not problematic for Lisp.

Help users if we can, but don't screw Lisp in the
process.

(setq ’bar 42)
(setq foo ’bar)

That's perfectly fine Lisp, even if it might not be
what some might expect.  But now, after your "fix",
the first sexp raises an error - at read time, no less.

This is just wrong, IMO.  You've redefined Lisp
evaluation, taking away some of the importance of
symbols.  And this still raises no error:
(setq a’bar 42).

> Why do you think signalling an error in this case
> is a bad idea?

Because it is.  Ms Lisp and all her users are being
treated unfairly.  See above, and see my previous msg.

’bar is a fine symbol.  ’ has NO special meaning in
Lisp - it is NOT like ' or ` or ( or ) or . or , or @.

Now you've given it a special meaning: when in a
context where ' is special, raise an error because
it is not '.

That's plain wrong and confusing, and it subtracts
from Lisp (while adding nonsense to it).

> > If no one has a real fix for such bugs yet then please just
> > leave them open until someone comes up with a good idea.
> > This "fix" is not a good idea - for those bugs at least.
> >
> > If this fix has some other purpose, then let's please
> > know what that is and talk about it.
> >
> > But if such problems are the only reason for this "fix"
> > then please consider getting rid of such silly and useless
> > escaping and just change the error message
> 
> I don't quite understand what you mean by "getting rid of... escaping"
> but keeping the error message.  It sounds like you are contradicting
> yourself.

I didn't say keep the error msg.  I said that if you
really think that some warning or error msg is important
here then fine.  But then improve the msg.

But I do NOT think that an error msg or warning is good
here.  A warning maybe, but not an error, which prevents
evaluation.  (But I doubt that warnings can be used here
accurately and without sowing ever more confusion.) 

Aside from the error/warning, such _escaping_ is another
bad idea.  It too subtracts from Lisp (while adding
nonsense to it).

IMHO, this "fix" - all of its parts - should be reverted
ASAP.  If you want to add some better error messaging
where we already raise an error, and if we can really
distinguish the cases where the better messaging should
be used, fine.  And if you want to add some warnings,
and if we can really distinguish the cases where the
warnings would be appropriate - accurately, fine.

To be clear, though, I'm in favor of neither of those
things.  Just leave it alone.  Using (mistakenly or
purposefully) such characters in symbol names is just
another potential gotcha.

There are plenty of them.  Users need to learn, e.g.,
that . is a symbol-constituent char in Lisp - so you
can have a symbol `a.b'.  And (a.b) is not the same
as (a . b).  Will you start requiring users to escape
the . in the symbol `a.b'?

To be really clear, the fix proposed should be removed.
Such characters, even if perhaps sometimes confusing
to some users, are legitimate symbol characters.  They
should just be left alone.  At _most_, and only if the
analysis were super-accurate and crystal clear, we
could consider adding warnings here or there.  We must
certainly not change Lisp here - no error-raising.

Starting to special-case such characters will get us
in a world of trouble - mark my words.

And as I said, there's no limit to the supply of such
chars.

> Changing the error message is always possible, of course.  I'm not sure
> if bringing "ascii" into it would make things clearer though.  Concrete
> suggestions welcome.

See above.  Please drop this attempted "fix" altogether.
It's just misguided, IMO.

At most, if you are persuaded that something needs to be
done about such "bugs" (warning pilots about such possible
pilot error) then please bring it up in emacs-devel.  You
are modifying Lisp itself in a basic way.  This should be
a no-no.

> > And besides - where do you stop doing this kind of thing?
> >
> > Do we do something similar for characters that can
> > be mistaken for a period, in case you use one in an
> > attempt at dotted-pair syntax?
> >
> > Do we do something similar for chars that can be
> > mistaken for a comma, inside backquoted sexps?
> >
> > Do we do something similar for chars that can be
> > mistaken for a backquote?  An at-sign?  Ordinary
> > parentheses?
> 
> Maybe everything in the "Unicode confusables" listing?  Practically
> speaking, I've never heard of problems with other characters, except
> perhaps in programming "puzzles", obfuscated code contents and the like.

There are lots of chars that can be confused, especially
given the possibility of different fonts.  I didn't even
mention other variants of brackets (aka square brackets),
braces (aka curly brackets), angle brackets, etc.

Would you try to protect a user from the confusion of
copy+pasting FULLWIDTH LEFT CURLY BRACKET FF5B ｛ in place
of LEFT CURLY BRACKET 7B { in a doc string ("... \\{...}")
or in a regexp?  Or of using LEFT WHITE SQUARE BRACKET
301A 〚 in place of [ in a vector?

Lisp is simple - and its use can be complicated.  You are
complicating Lisp itself immensely here.  Will you provide
fancy analysis for all of the possible contexts where such
char confusion could arise?

This is a big mistake - a crack in the foundation, IMO,
even if you think of it now only as helping a user with
a copy+paste error (pilot error).

> > I really hope you reconsider this.  To me it looks
> > like an ugly hack that can bring only harm (including
> > more, not less, confusion), not good.
> 
> Do you have any specific harms/confusion in mind?

See above.

This is *harmful* for our nice, clean Lisp - and YAGNI.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23 15:53             ` Drew Adams
@ 2018-01-23 23:00               ` Noam Postavsky
  2018-01-23 23:19                 ` Drew Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-01-23 23:00 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

On Tue, Jan 23, 2018 at 10:53 AM, Drew Adams <drew.adams@oracle.com> wrote:

> An Emacs-Lisp file can contain pretty much anything,
> including lots of natural-language text.  Are we now
> issuing warnings even for "smart quotes" in comments
> and strings?

Errors will be issued, but only for those occurring at the beginning of
a symbol.  String and comment contents will remain unaffected.

>>   it produces an obscure error message.
>>
>> The Emacs 25 error for the expression in question is
>>   (wrong-number-of-arguments setq 31)
>
> Which tells you pretty much that setq is missing an
> argument or has too many, which makes you look at its
> arguments.  Not so obscure.  And accurate.

And yet, Alan said

    This has wasted a lot of time identifying the problem, and
    fruitlessly searching for a solution in the Emacs and Elisp manuals,
    etc.

So maybe it's accurate in a narrow technical sense, but not in a
practically useful one.

> (setq ’bar 42)
> (setq foo ’bar)
>
> That's perfectly fine Lisp, even if it might not be
> what some might expect.  But now, after your "fix",
> the first sexp raises an error - at read time, no less.

Yes, that code no longer works, you would have to write

(setq \’bar 42)
(setq foo \’bar)

I don't consider this a big loss.  As far as I can see, this will just
make it harder to write obfuscated lisp code (although there will remain
plenty of other ways to obfuscate lisp code).

>  And this still raises no error:
> (setq a’bar 42).

Yes, it would be more difficult implementation-wise to catch that case,
and it seems much less likely to come up in practice.
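
For comparison only, here is what examining every character of a name
(rather than just the first) could look like.  This is a hypothetical
standalone sketch in C, not anything the reader does or that is being
proposed here: `decode_utf8' and `name_contains_confusable' are
made-up names, the decoder assumes well-formed UTF-8, and the real
reader may well have constraints this model ignores.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same code points as the switch in the patch in this thread.  */
bool
confusable_p (unsigned long ch)
{
  switch (ch)
    {
    case 0x2018: case 0x2019: case 0x201B:
    case 0x201C: case 0x201D: case 0x201F:
    case 0x301E: case 0xFF02: case 0xFF07:
      return true;
    default:
      return false;
    }
}

/* Decode one code point at P into *CH and return the number of bytes
   consumed.  Hypothetical helper; assumes well-formed UTF-8.  */
size_t
decode_utf8 (const unsigned char *p, unsigned long *ch)
{
  if (p[0] < 0x80)
    {
      *ch = p[0];
      return 1;
    }
  if ((p[0] & 0xE0) == 0xC0)
    {
      *ch = ((unsigned long) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
      return 2;
    }
  if ((p[0] & 0xF0) == 0xE0)
    {
      *ch = ((unsigned long) (p[0] & 0x0F) << 12)
            | ((unsigned long) (p[1] & 0x3F) << 6) | (p[2] & 0x3F);
      return 3;
    }
  *ch = ((unsigned long) (p[0] & 0x07) << 18)
        | ((unsigned long) (p[1] & 0x3F) << 12)
        | ((unsigned long) (p[2] & 0x3F) << 6) | (p[3] & 0x3F);
  return 4;
}

/* Return true if any character of NAME, not only the first, is one
   of the confusable quotes.  */
bool
name_contains_confusable (const char *name)
{
  const unsigned char *p = (const unsigned char *) name;
  while (*p)
    {
      unsigned long ch;
      p += decode_utf8 (p, &ch);
      if (confusable_p (ch))
        return true;
    }
  return false;
}
```

With this check a name such as a’bar would be flagged, whereas a
first-character test lets it through.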

> Aside from the error/warning, such _escaping_ is another
> bad idea.  It too subtracts from Lisp (while adding
> nonsense to it).

Nothing about escaping has changed.

> IMHO, this "fix" - all of its parts - should be reverted
[...]
> To be clear, though, I'm in favor of neither of those
[...]
> To be really clear, the fix proposed should be removed.

Thanks for trying to be clear, but repeating yourself like this just
makes your message longer, and therefore harder to comprehend.

I would really appreciate it if you would write shorter and more focused
messages, with less emotional rhetoric.  Keep the "emotional
temperature" low (see https://freenode.net/changuide, which is about
IRC, but the same principles apply to email conversations).

>> Maybe everything in the "Unicode confusables" listing?  Practically
>> speaking, I've never heard of problems with other characters, except
>> perhaps in programming "puzzles", obfuscated code contents and the like.
>
> There are lots of chars that can be confused, especially
> given the possibility of different fonts.  I didn't even
> mention other variants of brackets (aka square brackets),
> braces (aka curly brackets), angle brackets, etc.
>
> Would you try to protect a user from the confusion of
> copy+pasting FULLWIDTH LEFT CURLY BRACKET FF5B ｛ in place
> of LEFT CURLY BRACKET 7B { in a doc string ("... \\{...}")
> or in a regexp?  Or of using LEFT WHITE SQUARE BRACKET
> 301A 〚 in place of [ in a vector?

I don't plan to spend any effort towards that, no, although I wouldn't
necessarily be opposed to it.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23 23:00               ` Noam Postavsky
@ 2018-01-23 23:19                 ` Drew Adams
  2018-01-24  0:02                   ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-01-23 23:19 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

I won't reply to each thing you wrote, as I think
I've already spoken to each of those things and
made myself clear.

> > (setq ’bar 42)
> > (setq foo ’bar)
> 
> Yes, that code no longer works, you would have to write
> 
> (setq \’bar 42)
> (setq foo \’bar)
>
> > such _escaping_ is another bad idea.  It too subtracts
> > from Lisp (while adding nonsense to it).
> 
> Nothing about escaping has changed.

Of course something about escaping has changed.
\’bar is now read differently from ’bar.


[But \﴾bar is not (yet) read differently from ﴾bar.
That char is ORNATE LEFT PARENTHESIS, code point 64830.]






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-23 23:19                 ` Drew Adams
@ 2018-01-24  0:02                   ` Noam Postavsky
  2018-01-28 15:52                     ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-01-24  0:02 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

[-- Attachment #1: Type: text/plain, Size: 624 bytes --]

Drew Adams <drew.adams@oracle.com> writes:

> Of course something about escaping has changed.
> \’bar is now read differently from ’bar.

Oh, I see.  I was considering that since the meaning of \’bar hasn't
changed, then escaping hasn't changed (though non-escaped syntax has).

Anyway, thinking about this made me realize I broke read->print
round-tripping for these symbols, because I didn't change print to add
the backslash.  Attached is a patch which does this, but I'm not sure if
it can go into emacs-26.  If not, then I think we should at least delay
introduction of the reader change to Emacs 27.
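
The round-trip problem can be shown with a small model.  This is a
standalone sketch in C under stated assumptions, not the code in the
attached patch: `print_symbol_name' and `read_symbol_name' are
hypothetical names, only the leading-quote escape is modeled (none of
the other characters print escapes), and the input is assumed to be
well-formed UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same code points as the switch in the attached patch.  */
bool
confusable_p (unsigned long ch)
{
  switch (ch)
    {
    case 0x2018: case 0x2019: case 0x201B:
    case 0x201C: case 0x201D: case 0x201F:
    case 0x301E: case 0xFF02: case 0xFF07:
      return true;
    default:
      return false;
    }
}

/* Decode the first code point of S (assumes well-formed UTF-8).  */
unsigned long
first_code_point (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  if (p[0] < 0x80)
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)
    return ((unsigned long) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
  if ((p[0] & 0xF0) == 0xE0)
    return ((unsigned long) (p[0] & 0x0F) << 12)
           | ((unsigned long) (p[1] & 0x3F) << 6) | (p[2] & 0x3F);
  return ((unsigned long) (p[0] & 0x07) << 18)
         | ((unsigned long) (p[1] & 0x3F) << 12)
         | ((unsigned long) (p[2] & 0x3F) << 6) | (p[3] & 0x3F);
}

/* Model of printing: copy NAME to OUT, adding a backslash when the
   first character is a confusable quote.  Return false if OUT, of
   size OUTSIZE, is too small.  */
bool
print_symbol_name (const char *name, char *out, size_t outsize)
{
  bool escape = confusable_p (first_code_point (name));
  size_t need = strlen (name) + (escape ? 1 : 0) + 1;
  if (need > outsize)
    return false;
  if (escape)
    *out++ = '\\';
  strcpy (out, name);
  return true;
}

/* Model of reading: return the symbol name within TEXT, skipping one
   leading backslash, or NULL when an unescaped leading quote would be
   rejected as a "strange quote".  */
const char *
read_symbol_name (const char *text)
{
  if (text[0] == '\\')
    return text + 1;
  if (confusable_p (first_code_point (text)))
    return NULL;
  return text;
}
```

Reading back what print_symbol_name wrote yields the original name.
Feeding the unescaped name straight to read_symbol_name returns NULL,
which corresponds to the situation before print adds the backslash.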


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 6255 bytes --]

From c661d622d7109dcddd957524c4dd4457b41c1561 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Tue, 23 Jan 2018 18:50:23 -0500
Subject: [PATCH] Fix round tripping of read->print for symbols with strange
 quotes

Since 2017-07-22 "Signal error for symbol names with strange
quotes (Bug#2967)", symbol names beginning with certain quote
characters require an escaping backslash.  However, the corresponding
change for printing was missed, so that (eq (read (prin1-to-string SYM))
SYM) does not give `t' for such symbols.
* src/character.c (confusable_symbol_character_p): New function,
extracted from test `read1'.
* src/lread.c (read1): Use it.
* src/print.c (print_object): Use it to print a backslash for symbols
starting with characters that `read1' requires to be escaped.
* test/src/print-tests.el (print-read-roundtrip): New test.
* etc/NEWS: Clarify the announcement for the earlier reader
change (Bug#30217).
---
 etc/NEWS                | 12 +++++++++---
 src/character.c         | 26 ++++++++++++++++++++++++++
 src/character.h         |  2 ++
 src/lread.c             | 17 +++--------------
 src/print.c             |  3 ++-
 test/src/print-tests.el |  4 ++++
 6 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index f5859d7a60..c760738105 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1385,9 +1385,15 @@ renamed to 'lread--old-style-backquotes'.  No user code should use
 this variable.
 
 ---
-** To avoid confusion caused by "smart quotes", the reader no longer
-accepts Lisp symbols which begin with the following quotation
-characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
+** To avoid confusion caused by "smart quotes", the reader signals an
+error when reading Lisp symbols which begin with one of the following
+quotation characters: ‘’‛“”‟〞＂＇.  A symbol beginning with such a
+character can be written by escaping the quotation character with a
+backslash.  For example:
+
+    (read "‘smart") => (invalid-read-syntax "strange quote" "‘")
+    (read "\\‘smart") == (intern "‘smart")
+
 
 +++
 ** 'default-file-name-coding-system' now defaults to a coding system
diff --git a/src/character.c b/src/character.c
index fa817a5031..4a934c7801 100644
--- a/src/character.c
+++ b/src/character.c
@@ -1050,6 +1050,32 @@ blankp (int c)
   return XINT (category) == UNICODE_CATEGORY_Zs; /* separator, space */
 }
 
+
+/* Return true for characters that would read as symbol characters,
+   but graphically may be confused with some kind of punctuation.  We
+   require an escaping backslash, when such characters begin a
+   symbol.  */
+bool
+confusable_symbol_character_p (int ch)
+{
+  switch (ch)
+    {
+    case 0x2018: /* LEFT SINGLE QUOTATION MARK */
+    case 0x2019: /* RIGHT SINGLE QUOTATION MARK */
+    case 0x201B: /* SINGLE HIGH-REVERSED-9 QUOTATION MARK */
+    case 0x201C: /* LEFT DOUBLE QUOTATION MARK */
+    case 0x201D: /* RIGHT DOUBLE QUOTATION MARK */
+    case 0x201F: /* DOUBLE HIGH-REVERSED-9 QUOTATION MARK */
+    case 0x301E: /* DOUBLE PRIME QUOTATION MARK */
+    case 0xFF02: /* FULLWIDTH QUOTATION MARK */
+    case 0xFF07: /* FULLWIDTH APOSTROPHE */
+      return true;
+
+    default:
+      return false;
+    }
+}
+
 signed char HEXDIGIT_CONST hexdigit[UCHAR_MAX + 1] =
   {
 #if HEXDIGIT_IS_CONST
diff --git a/src/character.h b/src/character.h
index c716885d46..d9e2d7bfc6 100644
--- a/src/character.h
+++ b/src/character.h
@@ -682,6 +682,8 @@ char_surrogate_p (int c)
 extern bool printablep (int);
 extern bool blankp (int);
 
+extern bool confusable_symbol_character_p (int ch);
+
 /* Return a translation table of id number ID.  */
 #define GET_TRANSLATION_TABLE(id) \
   (XCDR (XVECTOR (Vtranslation_table_vector)->contents[(id)]))
diff --git a/src/lread.c b/src/lread.c
index 45d60647be..82731781f0 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3482,20 +3482,9 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
         if (!quoted && multibyte)
           {
             int ch = STRING_CHAR ((unsigned char *) read_buffer);
-            switch (ch)
-              {
-              case 0x2018: /* LEFT SINGLE QUOTATION MARK */
-              case 0x2019: /* RIGHT SINGLE QUOTATION MARK */
-              case 0x201B: /* SINGLE HIGH-REVERSED-9 QUOTATION MARK */
-              case 0x201C: /* LEFT DOUBLE QUOTATION MARK */
-              case 0x201D: /* RIGHT DOUBLE QUOTATION MARK */
-              case 0x201F: /* DOUBLE HIGH-REVERSED-9 QUOTATION MARK */
-              case 0x301E: /* DOUBLE PRIME QUOTATION MARK */
-              case 0xFF02: /* FULLWIDTH QUOTATION MARK */
-              case 0xFF07: /* FULLWIDTH APOSTROPHE */
-                xsignal2 (Qinvalid_read_syntax, build_string ("strange quote"),
-                          CALLN (Fstring, make_number (ch)));
-              }
+            if (confusable_symbol_character_p (ch))
+              xsignal2 (Qinvalid_read_syntax, build_string ("strange quote"),
+                        CALLN (Fstring, make_number (ch)));
           }
 	{
 	  Lisp_Object result;
diff --git a/src/print.c b/src/print.c
index 47cb33deeb..b0741531f7 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1971,7 +1971,8 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 		    || c == ';' || c == '#' || c == '(' || c == ')'
 		    || c == ',' || c == '.' || c == '`'
 		    || c == '[' || c == ']' || c == '?' || c <= 040
-		    || confusing)
+                    || confusing
+		    || (i == 1 && confusable_symbol_character_p (c)))
 		  {
 		    printchar ('\\', printcharfun);
 		    confusing = false;
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 46368c69ad..01e65028bc 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -58,5 +58,9 @@
                        (buffer-string))
                      "--------\n"))))
 
+(ert-deftest print-read-roundtrip ()
+  (let ((sym '\’bar))
+    (should (eq (read (prin1-to-string sym)) sym))))
+
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.11.0



* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-24  0:02                   ` Noam Postavsky
@ 2018-01-28 15:52                     ` Noam Postavsky
  2018-02-02 18:52                       ` Drew Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-01-28 15:52 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

tags 30217 fixed
close 30217 27.1
quit

Noam Postavsky <npostavs@users.sourceforge.net> writes:

> Anyway, thinking about this made me realize I broke read->print
> round-tripping for these symbols, because I didn't change print to add
> the backslash.  Attached is a patch which does this, but I'm not sure if
> it can go into emacs-26.  If not, then I think we should at least delay
> introduction of the reader change to Emacs 27.

I've reverted the reader change from emacs-26 [1: 0510a78da5], and made
the printer change in master [2: 36c8128e74].

[1: 0510a78da5]: 2018-01-28 10:49:51 -0500
  Revert "Signal error for symbol names with strange quotes (Bug#2967)"
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=0510a78da5faaa40ebfdf59d0ac6107a72c1be1d

[2: 36c8128e74]: 2018-01-28 10:43:01 -0500
  Fix round tripping of read->print for symbols with strange quotes
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=36c8128e740ce91af10769bef46a21a72dafc56c
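
The printer change described above can be sketched in isolation. This is a minimal illustration, not the code in print.c: the predicate mirrors the case list that the patch moves out of lread.c, while escape_symbol_name is a hypothetical helper that works on a plain array of code points instead of a Lisp string. The patch itself tests i == 1, presumably because its index has already advanced past the fetched character; the sketch uses index 0 for the first character.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same code points as the switch removed from read1 in the patch:
   quote look-alikes that are easy to confuse with ASCII ' and ".  */
static bool
confusable_symbol_character_p (int ch)
{
  switch (ch)
    {
    case 0x2018: /* LEFT SINGLE QUOTATION MARK */
    case 0x2019: /* RIGHT SINGLE QUOTATION MARK */
    case 0x201B: /* SINGLE HIGH-REVERSED-9 QUOTATION MARK */
    case 0x201C: /* LEFT DOUBLE QUOTATION MARK */
    case 0x201D: /* RIGHT DOUBLE QUOTATION MARK */
    case 0x201F: /* DOUBLE HIGH-REVERSED-9 QUOTATION MARK */
    case 0x301E: /* DOUBLE PRIME QUOTATION MARK */
    case 0xFF02: /* FULLWIDTH QUOTATION MARK */
    case 0xFF07: /* FULLWIDTH APOSTROPHE */
      return true;

    default:
      return false;
    }
}

/* Hypothetical helper (not an Emacs function): copy the LEN code
   points of NAME into OUT, writing a backslash before the first
   character when it is a confusable quote.  OUT must have room for
   LEN + 1 code points.  Return the number of code points written.  */
static size_t
escape_symbol_name (const int *name, size_t len, int *out)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    {
      /* Only the leading character is escaped, matching the reader
         check, which looks only at the start of the symbol name.  */
      if (i == 0 && confusable_symbol_character_p (name[i]))
        out[n++] = '\\';
      out[n++] = name[i];
    }
  return n;
}
```

Under these assumptions, a name whose first code point is U+2019 comes out with a leading backslash, which is the form the print-read-roundtrip test in the patch expects the reader to accept; a name with U+2019 in any later position is copied unchanged.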







* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-28 15:52                     ` Noam Postavsky
@ 2018-02-02 18:52                       ` Drew Adams
  2018-02-02 19:08                         ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-02-02 18:52 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > Anyway, thinking about this made me realize I broke read->print
> > round-tripping for these symbols, because I didn't change print to add
> > the backslash.  Attached is a patch which does this, but I'm not sure
> > if it can go into emacs-26.  If not, then I think we should at least
> > delay introduction of the reader change to Emacs 27.
> 
> I've reverted the reader change from emacs-26 [1: 0510a78da5], and made
> the printer change in master [2: 36c8128e74].
> 
> [1: 0510a78da5]: 2018-01-28 10:49:51 -0500
>   Revert "Signal error for symbol names with strange quotes (Bug#2967)"
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=0510a78da5faaa40ebfdf59d0ac6107a72c1be1d
> 
> [2: 36c8128e74]: 2018-01-28 10:43:01 -0500
>   Fix round tripping of read->print for symbols with strange quotes
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=36c8128e740ce91af10769bef46a21a72dafc56c

Sorry, but it's not clear to me.  Is this being abandoned
completely (I hope so), or is it just being postponed to
Emacs 27?

I came across this in the latest emacs-tangents@gnu.org
message for 2018-01-29:

http://git.savannah.gnu.org/cgit/emacs.git/commit/etc/NEWS?id=36c8128e740ce91af10769bef46a21a72dafc56c






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 18:52                       ` Drew Adams
@ 2018-02-02 19:08                         ` Noam Postavsky
  2018-02-02 21:37                           ` Drew Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-02-02 19:08 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

On Fri, Feb 2, 2018 at 1:52 PM, Drew Adams <drew.adams@oracle.com> wrote:

> Sorry, but it's not clear to me.  Is this being abandoned
> completely (I hope so), or is it just being postponed to
> Emacs 27?

It's currently only postponed to Emacs 27.  I suggest you bring it up
in emacs-devel if you think we should get rid of it. Since we simply
disagree about this, I don't think further dialogue here will help
anything.






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 19:08                         ` Noam Postavsky
@ 2018-02-02 21:37                           ` Drew Adams
  2018-02-02 22:14                             ` Ista Zahn
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-02-02 21:37 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > Sorry, but it's not clear to me.  Is this being abandoned
> > completely (I hope so), or is it just being postponed to
> > Emacs 27?
> 
> It's currently only postponed to Emacs 27, I suggest you bring it up
> in emacs-devel if you think we should get rid of it. Since we simply
> disagree about this, I don't think further dialogue here will help
> anything.

I think you should bring it up, and I think you should have
from the beginning.  This is not just about fixing a bug.

You're the one who is, in effect, proposing a change to Lisp.

This is not normal Lisp behavior.  This is a far cry from
quote and backquote, comma and period, all of which are quite
traditional for Lisp.

These are ordinary symbol-constituent characters, and should
not be handled in the way you've implemented.  (I wanted to
say "suggested", but you didn't suggest it to emacs-devel;
you just implemented it - in a bug thread, no less.)






* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 21:37                           ` Drew Adams
@ 2018-02-02 22:14                             ` Ista Zahn
  2018-02-02 22:35                               ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ista Zahn @ 2018-02-02 22:14 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217, Noam Postavsky

On Feb 2, 2018 4:39 PM, "Drew Adams" <drew.adams@oracle.com> wrote:

> > > Sorry, but it's not clear to me.  Is this being abandoned
> > > completely (I hope so), or is it just being postponed to
> > > Emacs 27?
> >
> > It's currently only postponed to Emacs 27, I suggest you bring it up
> > in emacs-devel if you think we should get rid of it. Since we simply
> > disagree about this, I don't think further dialogue here will help
> > anything.
>
> I think you should bring it up, and I think you should have
> from the beginning.  This is not just about fixing a bug.
>
> You're the one who is, in effect, proposing a change to Lisp.
>
> This is not normal Lisp behavior.  This is a far cry from
> quote and backquote, comma and period, all of which are quite
> traditional for Lisp.
>
> These are ordinary symbol-constituent characters, and should
> not be handled in the way you've implemented.  (I wanted to
> say "suggested", but you didn't suggest it to emacs-devel;
> you just implemented it - in a bug thread, no less.)


I'm nobody in this community, but in case it means anything I agree
completely with Drew. This isn't a bug fix, but a language change that
needs to be carefully thought out and discussed.


* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 22:14                             ` Ista Zahn
@ 2018-02-02 22:35                               ` Noam Postavsky
  0 siblings, 0 replies; 18+ messages in thread
From: Noam Postavsky @ 2018-02-02 22:35 UTC (permalink / raw)
  To: Ista Zahn; +Cc: Alan Mackenzie, 30217

On Fri, Feb 2, 2018 at 5:14 PM, Ista Zahn <istazahn@gmail.com> wrote:

> I'm nobody in this community, but in case it means anything I agree
> completely with Drew. This isn't a bug fix, but a language change that needs
> to be carefully thought out and discussed.

Thanks. I'm honestly not sure how much careful thought will be needed
beyond a tally of yes/no votes, but I've posted to emacs-devel now.

https://lists.gnu.org/archive/html/emacs-devel/2018-02/msg00093.html




