* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Alan Mackenzie @ 2018-01-22 22:17 UTC
To: 30217
18+ messages in thread

Hello, Emacs.

In the new NEWS in the recent pretest, at L1381 we have:

    ** To avoid confusion caused by "smart quotes", the reader no longer
    accepts Lisp symbols which begin with the following quotation
    characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
                                     ^^^^

, which leaves it unclear whether it's the "smart quotes" or the Lisp
symbols which need escaping.

I suggest replacing "they" with either "these quotes" or "these symbols"
depending on the desired meaning.

-- 
Alan Mackenzie (Nuremberg, Germany).
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Drew Adams @ 2018-01-22 22:42 UTC
To: Alan Mackenzie, 30217

> In the new NEWS in the recent pretest, at L1381 we have:
>
> ** To avoid confusion caused by "smart quotes", the reader no longer
> accepts Lisp symbols which begin with the following quotation
> characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
>                                  ^^^^
>
> , which leaves it unclear whether it's the "smart quotes" or the Lisp
> symbols which need escaping.
>
> I suggest replacing "they" with either "these quotes" or "these symbols"
> depending on the desired meaning.

Even if that ambiguity gets resolved, I have no idea what
the text means. What does it mean for the Lisp reader to
"accept a Lisp symbol"?

Please describe exactly what the reader does when it reads
one of those characters followed by Lisp-symbol syntax, in
both cases: char escaped and char not escaped.
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Noam Postavsky @ 2018-01-23 0:42 UTC
To: Drew Adams; +Cc: Alan Mackenzie, 30217

Drew Adams <drew.adams@oracle.com> writes:

>> In the new NEWS in the recent pretest, at L1381 we have:
>>
>> ** To avoid confusion caused by "smart quotes", the reader no longer
>> accepts Lisp symbols which begin with the following quotation
>> characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
>>                                  ^^^^
>>
>> , which leaves it unclear whether it's the "smart quotes" or the Lisp
>> symbols which need escaping.

> Even if that ambiguity gets resolved, I have no idea what
> the text means. What does it mean for the Lisp reader to
> "accept a Lisp symbol"?
>
> Please describe exactly what the reader does when it reads
> one of those characters followed by Lisp-symbol syntax, in
> both cases: char escaped and char not escaped.

How about this:

** To avoid confusion caused by "smart quotes", the reader signals an
error when reading Lisp symbols which begin with one of the following
quotation characters: ‘’‛“”‟〞＂＇. A symbol beginning with such a
character can be written by escaping the quotation character with a
backslash. For example:

    (read "‘smart") => Lisp error: (invalid-read-syntax "strange quote" "‘")
    (read "\\‘smart") == (intern "‘smart")
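
The rule proposed above can be sketched outside Emacs. What follows is
a minimal standalone C illustration, not the Emacs reader: it assumes
UTF-8 input and looks only at the first character of a token. The
helper names decode_utf8_first, is_strange_quote and read_would_signal
are hypothetical, chosen for this sketch. The code points are those of
the nine quotation characters named in the NEWS text.

```c
#include <stdbool.h>

/* Hypothetical helper: decode the first UTF-8 character of S.
   Returns its code point, or -1 for empty or malformed input.  */
static int
decode_utf8_first (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  int c, n;

  if (p[0] == 0)
    return -1;
  if (p[0] < 0x80)
    return p[0];                          /* plain ASCII */
  else if ((p[0] & 0xE0) == 0xC0) { c = p[0] & 0x1F; n = 1; }
  else if ((p[0] & 0xF0) == 0xE0) { c = p[0] & 0x0F; n = 2; }
  else if ((p[0] & 0xF8) == 0xF0) { c = p[0] & 0x07; n = 3; }
  else
    return -1;

  for (int i = 1; i <= n; i++)
    {
      /* Every continuation byte must look like 10xxxxxx; a NUL
         terminator fails this test, so we never read past the end.  */
      if ((p[i] & 0xC0) != 0x80)
        return -1;
      c = (c << 6) | (p[i] & 0x3F);
    }
  return c;
}

/* True for the nine quotation characters listed in the NEWS entry.  */
static bool
is_strange_quote (int ch)
{
  switch (ch)
    {
    case 0x2018: case 0x2019: case 0x201B:   /* single quotes */
    case 0x201C: case 0x201D: case 0x201F:   /* double quotes */
    case 0x301E:                             /* double prime quotation mark */
    case 0xFF02: case 0xFF07:                /* fullwidth " and ' */
      return true;
    default:
      return false;
    }
}

/* Hypothetical model of the proposed rule: would reading TOKEN as a
   symbol signal "strange quote"?  A leading backslash escapes the
   first character, so escaped tokens are always accepted.  */
static bool
read_would_signal (const char *token)
{
  if (token[0] == '\\')
    return false;
  return is_strange_quote (decode_utf8_first (token));
}
```

Under these assumptions the unescaped token ‘smart is rejected, the
same token with a leading backslash is accepted, and a’bar is accepted
because the quote is not the first character.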
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Drew Adams @ 2018-01-23 0:56 UTC
To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > Please describe exactly what the reader does when it reads
> > one of those characters followed by Lisp-symbol syntax, in
> > both cases: char escaped and char not escaped.
>
> How about this:
>
> ** To avoid confusion caused by "smart quotes", the reader signals an
> error when reading Lisp symbols which begin with one of the following
> quotation characters: ‘’‛“”‟〞＂＇. A symbol beginning with such a
> character can be written by escaping the quotation character with a
> backslash. For example:
>
> (read "‘smart") => Lisp error: (invalid-read-syntax "strange quote" "‘")
> (read "\\‘smart") == (intern "‘smart")

Yes, that's clear (to me). I would never have guessed
that the previous description meant that.

But may I ask why such "strange quote" characters are not
taken as lisp-symbol constituent characters? Why the need
to escape them? Why are they treated specially?

That description describes a workaround "to avoid confusion",
but it's not clear why we need "to avoid confusion". What
good is the error behavior in the first place?

If such chars are not to be treated as normal symbol chars
it should be because they have some special
treatment/behavior/interpretation for Lisp, no? If the only
non-escaped behavior is to raise an error then that just
sounds like a bug, to me.

I'm probably missing something important, but whatever that
is it does not seem to be conveyed by the NEWS description.
At all.
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Noam Postavsky @ 2018-01-23 1:40 UTC
To: Drew Adams; +Cc: Alan Mackenzie, 30217

Drew Adams <drew.adams@oracle.com> writes:

> But may I ask why such "strange quote" characters are not
> taken as lisp-symbol constituent characters? Why the need
> to escape them? Why are they treated specially?
>
> That description describes a workaround "to avoid confusion",
> but it's not clear why we need "to avoid confusion".

To give a less confusing error in cases like Bug#2967 and Bug#23425.
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Drew Adams @ 2018-01-23 6:07 UTC
To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > But may I ask why such "strange quote" characters are not
> > taken as lisp-symbol constituent characters? Why the need
> > to escape them? Why are they treated specially?
> >
> > That description describes a workaround "to avoid confusion",
> > but it's not clear why we need "to avoid confusion".
>
> To give a less confusing error in cases like Bug#2967 and Bug#23425.

Seriously? This is an absolutely horrible "fix" for each
of those problems. This "cure" is worse than either of
those diseases, and as we all know, I think such diseases
are pretty awful.

The error message seems to be _super_ confusing. It gives
no indication of problems such as those bugs, and it does
not begin to enlighten anyone about the confusion at their
heart.

If no one has a real fix for such bugs yet then please just
leave them open until someone comes up with a good idea.
This "fix" is not a good idea - for those bugs at least.

If this fix has some other purpose, then let's please
know what that is and talk about it.

But if such problems are the only reason for this "fix"
then please consider getting rid of such silly and useless
escaping and just change the error message to make clear
just what confusion it is meant to address: say that the
character is not an ascii apostrophe or whatever, if that
confusion is the real problem this is trying to solve.
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Drew Adams @ 2018-01-23 6:21 UTC
To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

And besides - where do you stop doing this kind of thing?

Do we do something similar for characters that can
be mistaken for a period, in case you use one in an
attempt at dotted-pair syntax?

Do we do something similar for chars that can be
mistaken for a comma, inside backquoted sexps?

Do we do something similar for chars that can be
mistaken for a backquote? An at-sign? Ordinary
parentheses?

I really hope you reconsider this. To me it looks
like an ugly hack that can bring only harm (including
more, not less, confusion), not good.
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Noam Postavsky @ 2018-01-23 12:54 UTC
To: Drew Adams; +Cc: Alan Mackenzie, 30217

Drew Adams <drew.adams@oracle.com> writes:

>> To give a less confusing error in cases like Bug#2967 and Bug#23425.
>
> Seriously? This is an absolutely horrible "fix" for each
> of those problems. This "cure" is worse than either of
> those diseases, and as we all know, I think such diseases
> are pretty awful.
>
> The error message seems to be _super_ confusing. It gives
> no indication of problems such as those bugs, and it does
> not begin to enlighten anyone about the confusion at their
> heart.

The OP of Bug#2967 says

    I think it would be good if emacs looked for smart quotes in .emacs
    files and gave a warning or notice if it detected them. This would
    help troubleshooting.

Which is exactly what's being done now. The OP of Bug#23425 says

    When this output is fed back into Emacs with M-:, it produces an
    obscure error message.

The Emacs 25 error for the expression in question is

    (wrong-number-of-arguments setq 31)

In Emacs 26.0.91, it is

    (invalid-read-syntax "strange quote" "’")

I think this is an improvement, since it does, in fact, indicate there
is a problematic use of ’. Why do you think signalling an error in
this case is a bad idea?

> If no one has a real fix for such bugs yet then please just
> leave them open until someone comes up with a good idea.
> This "fix" is not a good idea - for those bugs at least.
>
> If this fix has some other purpose, then let's please
> know what that is and talk about it.
>
> But if such problems are the only reason for this "fix"
> then please consider getting rid of such silly and useless
> escaping and just change the error message

I don't quite understand what you mean by "getting rid of... escaping"
but keeping the error message. It sounds like you are contradicting
yourself.

> to make clear just what confusion it is meant to address: say that the
> character is not an ascii apostrophe or whatever, if that confusion is
> the real problem this is trying to solve.

Changing the error message is always possible, of course. I'm not sure
if bringing "ascii" into it would make things clearer though. Concrete
suggestions welcome.

> And besides - where do you stop doing this kind of thing?
>
> Do we do something similar for characters that can
> be mistaken for a period, in case you use one in an
> attempt at dotted-pair syntax?
>
> Do we do something similar for chars that can be
> mistaken for a comma, inside backquoted sexps?
>
> Do we do something similar for chars that can be
> mistaken for a backquote? An at-sign? Ordinary
> parentheses?

Maybe everything in the "Unicode confusables" listing? Practically
speaking, I've never heard of problems with other characters, except
perhaps in programming "puzzles", obfuscated code contests and the like.

> I really hope you reconsider this. To me it looks
> like an ugly hack that can bring only harm (including
> more, not less, confusion), not good.

Do you have any specific harms/confusion in mind?
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Drew Adams @ 2018-01-23 15:53 UTC
To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> The OP of Bug#2967 says
> I think it would be good if emacs looked for smart
> quotes in .emacs files and gave a warning or notice
> if it detected them. This would help troubleshooting.
> Which is exactly what's being done now.

It's not necessarily appropriate to satisfy every
suggestion offered in every bug report. ;-)

I'm not sure such a warning is good, rather than bad.
I think not. If a warning was printed for each "smart
quote" occurrence in a file that would surely be bad,
IMO.

An Emacs-Lisp file can contain pretty much anything,
including lots of natural-language text. Are we now
issuing warnings even for "smart quotes" in comments
and strings? That would definitely be a mistake.

In any case, I don't care much about byte-compiler
warnings - they are not the problem I responded to.
They can be ignored when they are not particularly
relevant. The fact that they can sometimes represent
noise is at most an annoyance, not a real problem.

> The OP of Bug#23425 says
> When this output is fed back into Emacs with M-:,

That represents pilot error, no doubt.

For `M-:' we _might_ try to provide an error message
that says you included a "smart quote" and say you
might want to check that that's what you intended.

I'm not suggesting we do that - I prefer not. But
it's conceivable, if someone is really gung ho about
solving the purported problem.

I doubt that would be a good idea even for `M-:'.
But it would surely not be appropriate for other
contexts. And even for `M-:' it's not obvious that
we would come up with a good test for the cases
where it would be helpful rather than confusing.

> it produces an obscure error message.
>
> The Emacs 25 error for the expression in question is
> (wrong-number-of-arguments setq 31)

Which tells you pretty much that setq is missing an
argument or has too many, which makes you look at its
arguments. Not so obscure. And accurate.

> In Emacs 26.0.91, it is
> (invalid-read-syntax "strange quote" "’")

Which is completely obscure, IMO. Invalid read syntax
when reading what? What's invalid about it? In fact,
it is not invalid. It has never been invalid, and it
shouldn't suddenly be considered invalid now.

Confusion, not understanding an accurate error msg,
is not the same thing as Lisp itself having a bug
because such a character is included in a symbol.

> I think this is an improvement, since it does, in fact,
> indicate there is a problematic use of ’.

There is NOT any problematic use of ’ there. The
user's understanding might be problematic, but that
read syntax is not problematic for Lisp. Help users
if we can, but don't screw Lisp in the process.

    (setq ’bar 42)
    (setq foo ’bar)

That's perfectly fine Lisp, even if it might not be
what some might expect. But now, after your "fix",
the first sexp raises an error - at read time, no less.

This is just wrong, IMO. You've redefined Lisp
evaluation, taking away some of the importance of
symbols.

And this still raises no error:
(setq a’bar 42).

> Why do you think the signalling an error in this case
> is a bad idea?

Because it is. Ms Lisp and all her users are being
treated unfairly. See above, and see my previous msg.

’bar is a fine symbol. ’ has NO special meaning in
Lisp - it is NOT like ' or ` or ( or ) or . or , or @.
Now you've given it a special meaning: when in a
context where ' is special, raise an error because
it is not '. That's plain wrong and confusing, and it
subtracts from Lisp (while adding nonsense to it).

> > If no one has a real fix for such bugs yet then please just
> > leave them open until someone comes up with a good idea.
> > This "fix" is not a good idea - for those bugs at least.
> >
> > If this fix has some other purpose, then let's please
> > know what that is and talk about it.
> >
> > But if such problems are the only reason for this "fix"
> > then please consider getting rid of such silly and useless
> > escaping and just change the error message
>
> I don't quite understand what you mean by "getting rid of... escaping"
> but keeping the error message. It sounds like you are contradicting
> yourself.

I didn't say keep the error msg. I said that if you
really think that some warning or error msg is important
here then fine. But then improve the msg.

But I do NOT think that an error msg or warning is good
here. A warning maybe, but not an error, which prevents
evaluation. (But I doubt that warnings can be used here
accurately and without sowing ever more confusion.)

Aside from the error/warning, such _escaping_ is another
bad idea. It too subtracts from Lisp (while adding
nonsense to it).

IMHO, this "fix" - all of its parts - should be reverted
ASAP. If you want to add some better error messaging where
we already raise an error, and if we can really distinguish
the cases where the better messaging should be used, fine.
And if you want to add some warnings, and if we can really
distinguish the cases where the warnings would be
appropriate - accurately, fine.

To be clear, though, I'm in favor of neither of those
things. Just leave it alone. Using (mistakenly or
purposefully) such characters in symbol names is just
another potential gotcha. There are plenty of them.

Users need to learn, e.g., that . is a symbol-constituent
char in Lisp - so you can have a symbol `a.b'. And (a.b)
is not the same as (a . b). Will you start requiring users
to escape the . in the symbol `a.b'?

To be really clear, the fix proposed should be removed.
Such characters, even if perhaps sometimes confusing to
some users, are legitimate symbol characters. They should
just be left alone.

At _most_, and only if the analysis were super-accurate
and crystal clear, we could consider adding warnings here
or there. We must certainly not change Lisp here - no
error-raising.

Starting to special-case such characters will get us in
a world of trouble - mark my words. And as I said, there's
no limit to the supply of such chars.

> Changing the error message is always possible, of course. I'm not sure
> if bringing "ascii" into it would make things clearer though. Concrete
> suggestions welcome.

See above. Please drop this attempted "fix" altogether.
It's just misguided, IMO. At most, if you are persuaded
that something needs to be done about such "bugs" (warning
pilots about such possible pilot error) then please bring
it up in emacs-devel. You are modifying Lisp itself in a
basic way. This should be a no-no.

> > And besides - where do you stop doing this kind of thing?
> >
> > Do we do something similar for characters that can
> > be mistaken for a period, in case you use one in an
> > attempt at dotted-pair syntax?
> >
> > Do we do something similar for chars that can be
> > mistaken for a comma, inside backquoted sexps?
> >
> > Do we do something similar for chars that can be
> > mistaken for a backquote? An at-sign? Ordinary
> > parentheses?
>
> Maybe everything in the "Unicode confusables" listing? Practically
> speaking, I've never heard of problems with other characters, except
> perhaps in programming "puzzles", obfuscated code contests and the like.

There are lots of chars that can be confused, especially
given the possibility of different fonts. I didn't even
mention other variants of brackets (aka square brackets),
braces (aka curly brackets), angle brackets, etc.

Would you try to protect a user from the confusion of
copy+pasting FULLWIDTH LEFT CURLY BRACKET FF5B ｛ in place
of LEFT CURLY BRACKET 7B { in a doc string ("... \\{...}")
or in a regexp? Or of using LEFT WHITE SQUARE BRACKET
301A 〚 in place of [ in a vector?

Lisp is simple - and its use can be complicated. You
are complicating Lisp itself immensely here. Will you
provide fancy analysis for all of the possible contexts
where such char confusion could arise?

This is a big mistake - a crack in the foundation, IMO,
even if you think of it now only as helping a user with
a copy+paste error (pilot error).

> > I really hope you reconsider this. To me it looks
> > like an ugly hack that can bring only harm (including
> > more, not less, confusion), not good.
>
> Do you have any specific harms/confusion in mind?

See above. This is *harmful* for our nice, clean Lisp -
and YAGNI.
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Noam Postavsky @ 2018-01-23 23:00 UTC
To: Drew Adams; +Cc: Alan Mackenzie, 30217

On Tue, Jan 23, 2018 at 10:53 AM, Drew Adams <drew.adams@oracle.com> wrote:

> An Emacs-Lisp file can contain pretty much anything,
> including lots of natural-language text. Are we now
> issuing warnings even for "smart quotes" in comments
> and strings?

Errors will be issued, but only for those occurring at the beginning
of a symbol. String and comment contents will remain unaffected.

>> it produces an obscure error message.
>>
>> The Emacs 25 error for the expression in question is
>> (wrong-number-of-arguments setq 31)
>
> Which tells you pretty much that setq is missing an
> argument or has too many, which makes you look at its
> arguments. Not so obscure. And accurate.

And yet, Alan said

    This has wasted a lot of time identifying the problem, and
    fruitlessly searching for a solution in the Emacs and Elisp
    manuals, etc.

So maybe it's accurate in a narrow technical sense, but not in a
practically useful one.

> (setq ’bar 42)
> (setq foo ’bar)
>
> That's perfectly fine Lisp, even if it might not be
> what some might expect. But now, after your "fix",
> the first sexp raises an error - at read time, no less.

Yes, that code no longer works, you would have to write

    (setq \’bar 42)
    (setq foo \’bar)

I don't consider this a big loss. As far as I can see, this will just
make it harder to write obfuscated lisp code (although there will
remain plenty of other ways to obfuscate lisp code).

> And this still raises no error:
> (setq a’bar 42).

Yes, it would be more difficult implementation-wise to catch that
case, and it seems much less likely to come up in practice.

> Aside from the error/warning, such _escaping_ is another
> bad idea. It too subtracts from Lisp (while adding
> nonsense to it).

Nothing about escaping has changed.

> IMHO, this "fix" - all of its parts - should be reverted
[...]
> To be clear, though, I'm in favor of neither of those
[...]
> To be really clear, the fix proposed should be removed.

Thanks for trying to be clear, but repeating yourself like this just
makes your message longer, and therefore harder to comprehend. I
would really appreciate it if you would write shorter and more focused
messages, with less emotional rhetoric. Keep the "emotional
temperature" low (see https://freenode.net/changuide, which is about
IRC, but the same principles apply to email conversations).

>> Maybe everything in the "Unicode confusables" listing? Practically
>> speaking, I've never heard of problems with other characters, except
>> perhaps in programming "puzzles", obfuscated code contests and the like.
>
> There are lots of chars that can be confused, especially
> given the possibility of different fonts. I didn't even
> mention other variants of brackets (aka square brackets),
> braces (aka curly brackets), angle brackets, etc.
>
> Would you try to protect a user from the confusion of
> copy+pasting FULLWIDTH LEFT CURLY BRACKET FF5B ｛ in place
> of LEFT CURLY BRACKET 7B { in a doc string ("... \\{...}")
> or in a regexp? Or of using LEFT WHITE SQUARE BRACKET
> 301A 〚 in place of [ in a vector?

I don't plan to spend any effort towards that, no, although I wouldn't
necessarily be opposed to it.
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Drew Adams @ 2018-01-23 23:19 UTC
To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

I won't reply to each thing you wrote, as I think I've
already spoken to each of those things and made myself
clear.

> > (setq ’bar 42)
> > (setq foo ’bar)
>
> Yes, that code no longer works, you would have to write
>
> (setq \’bar 42)
> (setq foo \’bar)
>
> > such _escaping_ is another bad idea. It too subtracts
> > from Lisp (while adding nonsense to it).
>
> Nothing about escaping has changed.

Of course something about escaping has changed.
\’bar is now read differently from ’bar.

[But \﴾bar is not (yet) read differently from ﴾bar.
That char is ORNATE LEFT PARENTHESIS, code point 64830.]
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Noam Postavsky @ 2018-01-24 0:02 UTC
To: Drew Adams; +Cc: Alan Mackenzie, 30217

[-- Attachment #1: Type: text/plain, Size: 624 bytes --]

Drew Adams <drew.adams@oracle.com> writes:

> Of course something about escaping has changed.
> \’bar is now read differently from ’bar.

Oh, I see. I was considering that since the meaning of \’bar hasn't
changed, then escaping hasn't changed (though non-escaped syntax has).

Anyway, thinking about this made me realize I broke read->print
round-tripping for these symbols, because I didn't change print to add
the backslash. Attached is a patch which does this, but I'm not sure if
it can go into emacs-26. If not, then I think we should at least delay
introduction of the reader change to Emacs 27.

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 6255 bytes --]

From c661d622d7109dcddd957524c4dd4457b41c1561 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Tue, 23 Jan 2018 18:50:23 -0500
Subject: [PATCH] Fix round tripping of read->print for symbols with strange
 quotes

Since 2017-07-22 "Signal error for symbol names with strange quotes
(Bug#2967)", symbol names beginning with certain quote characters
require an escaping backslash. However, the corresponding change for
printing was missed, so that (eq (read (prin1-to-string SYM)) SYM) does
not give `t' for such symbols.

* src/character.c (confusable_symbol_character_p): New function,
extracted from test `read1'.
* src/lread.c (read1): Use it.
* src/print.c (print_object): Use it to print a backslash for symbols
starting with characters that `read1' requires to be escaped.
* test/src/print-tests.el (print-read-roundtrip): New test.
* etc/NEWS: Clarify the announcement for the earlier reader
change (Bug#30217).
---
 etc/NEWS                | 12 +++++++++---
 src/character.c         | 26 ++++++++++++++++++++++++++
 src/character.h         |  2 ++
 src/lread.c             | 17 +++--------------
 src/print.c             |  3 ++-
 test/src/print-tests.el |  4 ++++
 6 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index f5859d7a60..c760738105 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1385,9 +1385,15 @@ renamed to 'lread--old-style-backquotes'. No user code should use
 this variable.
 
 ---
-** To avoid confusion caused by "smart quotes", the reader no longer
-accepts Lisp symbols which begin with the following quotation
-characters: ‘’‛“”‟〞＂＇, unless they are escaped with backslash.
+** To avoid confusion caused by "smart quotes", the reader signals an
+error when reading Lisp symbols which begin with one of the following
+quotation characters: ‘’‛“”‟〞＂＇. A symbol beginning with such a
+character can be written by escaping the quotation character with a
+backslash. For example:
+
+    (read "‘smart") => (invalid-read-syntax "strange quote" "‘")
+    (read "\\‘smart") == (intern "‘smart")
+
 
 +++
 ** 'default-file-name-coding-system' now defaults to a coding system
diff --git a/src/character.c b/src/character.c
index fa817a5031..4a934c7801 100644
--- a/src/character.c
+++ b/src/character.c
@@ -1050,6 +1050,32 @@ blankp (int c)
   return XINT (category) == UNICODE_CATEGORY_Zs; /* separator, space */
 }
 
+
+/* Return true for characters that would read as symbol characters,
+   but graphically may be confused with some kind of punctuation. We
+   require an escaping backslash, when such characters begin a
+   symbol. */
+bool
+confusable_symbol_character_p (int ch)
+{
+  switch (ch)
+    {
+    case 0x2018: /* LEFT SINGLE QUOTATION MARK */
+    case 0x2019: /* RIGHT SINGLE QUOTATION MARK */
+    case 0x201B: /* SINGLE HIGH-REVERSED-9 QUOTATION MARK */
+    case 0x201C: /* LEFT DOUBLE QUOTATION MARK */
+    case 0x201D: /* RIGHT DOUBLE QUOTATION MARK */
+    case 0x201F: /* DOUBLE HIGH-REVERSED-9 QUOTATION MARK */
+    case 0x301E: /* DOUBLE PRIME QUOTATION MARK */
+    case 0xFF02: /* FULLWIDTH QUOTATION MARK */
+    case 0xFF07: /* FULLWIDTH APOSTROPHE */
+      return true;
+
+    default:
+      return false;
+    }
+}
+
 signed char HEXDIGIT_CONST hexdigit[UCHAR_MAX + 1] =
   {
 #if HEXDIGIT_IS_CONST
diff --git a/src/character.h b/src/character.h
index c716885d46..d9e2d7bfc6 100644
--- a/src/character.h
+++ b/src/character.h
@@ -682,6 +682,8 @@ char_surrogate_p (int c)
 extern bool printablep (int);
 extern bool blankp (int);
 
+extern bool confusable_symbol_character_p (int ch);
+
 /* Return a translation table of id number ID. */
 #define GET_TRANSLATION_TABLE(id) \
   (XCDR (XVECTOR (Vtranslation_table_vector)->contents[(id)]))
diff --git a/src/lread.c b/src/lread.c
index 45d60647be..82731781f0 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3482,20 +3482,9 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
           if (!quoted && multibyte)
             {
               int ch = STRING_CHAR ((unsigned char *) read_buffer);
-              switch (ch)
-                {
-                case 0x2018: /* LEFT SINGLE QUOTATION MARK */
-                case 0x2019: /* RIGHT SINGLE QUOTATION MARK */
-                case 0x201B: /* SINGLE HIGH-REVERSED-9 QUOTATION MARK */
-                case 0x201C: /* LEFT DOUBLE QUOTATION MARK */
-                case 0x201D: /* RIGHT DOUBLE QUOTATION MARK */
-                case 0x201F: /* DOUBLE HIGH-REVERSED-9 QUOTATION MARK */
-                case 0x301E: /* DOUBLE PRIME QUOTATION MARK */
-                case 0xFF02: /* FULLWIDTH QUOTATION MARK */
-                case 0xFF07: /* FULLWIDTH APOSTROPHE */
-                  xsignal2 (Qinvalid_read_syntax, build_string ("strange quote"),
-                            CALLN (Fstring, make_number (ch)));
-                }
+              if (confusable_symbol_character_p (ch))
+                xsignal2 (Qinvalid_read_syntax, build_string ("strange quote"),
+                          CALLN (Fstring, make_number (ch)));
             }
           {
             Lisp_Object result;
diff --git a/src/print.c b/src/print.c
index 47cb33deeb..b0741531f7 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1971,7 +1971,8 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
                 || c == ';' || c == '#' || c == '(' || c == ')'
                 || c == ',' || c == '.' || c == '`'
                 || c == '[' || c == ']' || c == '?' || c <= 040
-                || confusing)
+                || confusing
+                || (i == 1 && confusable_symbol_character_p (c)))
               {
                 printchar ('\\', printcharfun);
                 confusing = false;
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 46368c69ad..01e65028bc 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -58,5 +58,9 @@
                      (buffer-string))
                    "--------\n"))))
 
+(ert-deftest print-read-roundtrip ()
+  (let ((sym '\’bar))
+    (should (eq (read (prin1-to-string sym)) sym))))
+
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.11.0
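
The round-trip property the patch is after, that reading what print
wrote gives back the same symbol, can be illustrated with a toy model.
This is only a sketch under loud assumptions: it is standalone C, not
Emacs code; it works on UTF-8 strings rather than Lisp objects; it
handles only the leading-quote case and none of the other characters
that print escapes; and the names starts_with_strange_quote,
print_symbol and read_symbol are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* UTF-8 encodings of the nine quotation characters from the patch.
   All of them happen to be three bytes long.  */
static const char *const strange_quotes[] = {
  "\xE2\x80\x98",  /* U+2018 LEFT SINGLE QUOTATION MARK */
  "\xE2\x80\x99",  /* U+2019 RIGHT SINGLE QUOTATION MARK */
  "\xE2\x80\x9B",  /* U+201B SINGLE HIGH-REVERSED-9 QUOTATION MARK */
  "\xE2\x80\x9C",  /* U+201C LEFT DOUBLE QUOTATION MARK */
  "\xE2\x80\x9D",  /* U+201D RIGHT DOUBLE QUOTATION MARK */
  "\xE2\x80\x9F",  /* U+201F DOUBLE HIGH-REVERSED-9 QUOTATION MARK */
  "\xE3\x80\x9E",  /* U+301E DOUBLE PRIME QUOTATION MARK */
  "\xEF\xBC\x82",  /* U+FF02 FULLWIDTH QUOTATION MARK */
  "\xEF\xBC\x87",  /* U+FF07 FULLWIDTH APOSTROPHE */
};

/* True if NAME begins with one of the characters above.  */
static bool
starts_with_strange_quote (const char *name)
{
  size_t count = sizeof strange_quotes / sizeof strange_quotes[0];
  for (size_t i = 0; i < count; i++)
    if (strncmp (name, strange_quotes[i], 3) == 0)
      return true;
  return false;
}

/* Toy printer: copy NAME into OUT, adding a backslash in front when
   the first character is a strange quote.  Returns false if OUT is
   too small to hold the result.  */
static bool
print_symbol (const char *name, char *out, size_t outsize)
{
  int n = snprintf (out, outsize, "%s%s",
                    starts_with_strange_quote (name) ? "\\" : "", name);
  return n >= 0 && (size_t) n < outsize;
}

/* Toy reader: return the symbol name denoted by TEXT, or NULL where
   the real reader would signal the "strange quote" error.  */
static const char *
read_symbol (const char *text)
{
  if (text[0] == '\\')
    return text + 1;            /* escaped first character */
  if (starts_with_strange_quote (text))
    return NULL;                /* unescaped strange quote */
  return text;
}
```

In this model, printing ’bar yields \’bar, and reading that back yields
’bar again, while reading the unescaped ’bar gives NULL. Without the
backslash added in print_symbol the round trip would fail, which is
the gap the patch closes.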
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
From: Noam Postavsky @ 2018-01-28 15:52 UTC
To: Drew Adams; +Cc: Alan Mackenzie, 30217

tags 30217 fixed
close 30217 27.1
quit

Noam Postavsky <npostavs@users.sourceforge.net> writes:

> Anyway, thinking about this made me realize I broke read->print
> round-tripping for these symbols, because I didn't change print to add
> the backslash. Attached is a patch which does this, but I'm not sure if
> it can go into emacs-26. If not, then I think we should at least delay
> introduction of the reader change to Emacs 27.

I've reverted the reader change from emacs-26 [1: 0510a78da5], and made
the printer change in master [2: 36c8128e74].

[1: 0510a78da5]: 2018-01-28 10:49:51 -0500
  Revert "Signal error for symbol names with strange quotes (Bug#2967)"
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=0510a78da5faaa40ebfdf59d0ac6107a72c1be1d

[2: 36c8128e74]: 2018-01-28 10:43:01 -0500
  Fix round tripping of read->print for symbols with strange quotes
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=36c8128e740ce91af10769bef46a21a72dafc56c
* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-01-28 15:52 ` Noam Postavsky
@ 2018-02-02 18:52 ` Drew Adams
  2018-02-02 19:08 ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-02-02 18:52 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > Anyway, thinking about this made me realize I broke read->print
> > round-tripping for these symbols, because I didn't change print to add
> > the backslash.  Attached is a patch which does this, but I'm not sure
> > if it can go into emacs-26.  If not, then I think we should at least
> > delay introduction of the reader change to Emacs 27.
>
> I've reverted the reader change from emacs-26 [1: 0510a78da5], and made
> the printer change in master [2: 36c8128e74].
>
> [1: 0510a78da5]: 2018-01-28 10:49:51 -0500
>   Revert "Signal error for symbol names with strange quotes (Bug#2967)"
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=0510a78da5faaa40ebfdf59d0ac6107a72c1be1d
>
> [2: 36c8128e74]: 2018-01-28 10:43:01 -0500
>   Fix round tripping of read->print for symbols with strange quotes
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=36c8128e740ce91af10769bef46a21a72dafc56c

Sorry, but it's not clear to me.  Is this being abandoned
completely (I hope so), or is it just being postponed to
Emacs 27?

I came across this in the latest emacs-tangents@gnu.org
message for 2018-01-29:

http://git.savannah.gnu.org/cgit/emacs.git/commit/etc/NEWS?id=36c8128e740ce91af10769bef46a21a72dafc56c

* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 18:52 ` Drew Adams
@ 2018-02-02 19:08 ` Noam Postavsky
  2018-02-02 21:37 ` Drew Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Noam Postavsky @ 2018-02-02 19:08 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217

On Fri, Feb 2, 2018 at 1:52 PM, Drew Adams <drew.adams@oracle.com> wrote:

> Sorry, but it's not clear to me.  Is this being abandoned
> completely (I hope so), or is it just being postponed to
> Emacs 27?

It's currently only postponed to Emacs 27, I suggest you bring it up
in emacs-devel if you think we should get rid of it.  Since we simply
disagree about this, I don't think further dialogue here will help
anything.

* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 19:08 ` Noam Postavsky
@ 2018-02-02 21:37 ` Drew Adams
  2018-02-02 22:14 ` Ista Zahn
  0 siblings, 1 reply; 18+ messages in thread
From: Drew Adams @ 2018-02-02 21:37 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Alan Mackenzie, 30217

> > Sorry, but it's not clear to me.  Is this being abandoned
> > completely (I hope so), or is it just being postponed to
> > Emacs 27?
>
> It's currently only postponed to Emacs 27, I suggest you bring it up
> in emacs-devel if you think we should get rid of it.  Since we simply
> disagree about this, I don't think further dialogue here will help
> anything.

I think you should bring it up, and I think you should
have from the beginning.  This is not just about fixing
a bug.

You're the one who is, in effect, proposing a change to
Lisp.  This is not normal Lisp behavior.

This is a far cry from quote and backquote, comma and
period, all of which are quite traditional for Lisp.
These are ordinary symbol-constituent characters, and
should not be handled in the way you've implemented.
(I wanted to say "suggested", but you didn't suggest
it to emacs-devel; you just implemented it - in a bug
thread, no less.)

* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 21:37 ` Drew Adams
@ 2018-02-02 22:14 ` Ista Zahn
  2018-02-02 22:35 ` Noam Postavsky
  0 siblings, 1 reply; 18+ messages in thread
From: Ista Zahn @ 2018-02-02 22:14 UTC (permalink / raw)
  To: Drew Adams; +Cc: Alan Mackenzie, 30217, Noam Postavsky

On Feb 2, 2018 4:39 PM, "Drew Adams" <drew.adams@oracle.com> wrote:

> > > Sorry, but it's not clear to me.  Is this being abandoned
> > > completely (I hope so), or is it just being postponed to
> > > Emacs 27?
> >
> > It's currently only postponed to Emacs 27, I suggest you bring it up
> > in emacs-devel if you think we should get rid of it.  Since we simply
> > disagree about this, I don't think further dialogue here will help
> > anything.
>
> I think you should bring it up, and I think you should
> have from the beginning.  This is not just about fixing
> a bug.
>
> You're the one who is, in effect, proposing a change to
> Lisp.  This is not normal Lisp behavior.
>
> This is a far cry from quote and backquote, comma and
> period, all of which are quite traditional for Lisp.
> These are ordinary symbol-constituent characters, and
> should not be handled in the way you've implemented.
> (I wanted to say "suggested", but you didn't suggest
> it to emacs-devel; you just implemented it - in a bug
> thread, no less.)

I'm nobody in this community, but in case it means anything I agree
completely with Drew.  This isn't a bug fix, but a language change that
needs to be carefully thought out and discussed.

* bug#30217: Ambiguity in NEWS in emacs-26.0.91
  2018-02-02 22:14 ` Ista Zahn
@ 2018-02-02 22:35 ` Noam Postavsky
  0 siblings, 0 replies; 18+ messages in thread
From: Noam Postavsky @ 2018-02-02 22:35 UTC (permalink / raw)
  To: Ista Zahn; +Cc: Alan Mackenzie, 30217

On Fri, Feb 2, 2018 at 5:14 PM, Ista Zahn <istazahn@gmail.com> wrote:

> I'm nobody in this community, but in case it means anything I agree
> completely with Drew. This isn't a bug fix, but a language change that needs
> to be carefully thought out and discussed.

Thanks.  I'm honestly not sure how much careful thought will be needed
beyond a tally of yes/no votes, but I've posted to emacs-devel now.

https://lists.gnu.org/archive/html/emacs-devel/2018-02/msg00093.html

end of thread, other threads:[~2018-02-02 22:35 UTC | newest]

Thread overview: 18+ messages
2018-01-22 22:17 bug#30217: Ambiguity in NEWS in emacs-26.0.91 Alan Mackenzie
2018-01-22 22:42 ` Drew Adams
2018-01-23  0:42 ` Noam Postavsky
2018-01-23  0:56 ` Drew Adams
2018-01-23  1:40 ` Noam Postavsky
2018-01-23  6:07 ` Drew Adams
2018-01-23  6:21 ` Drew Adams
2018-01-23 12:54 ` Noam Postavsky
2018-01-23 15:53 ` Drew Adams
2018-01-23 23:00 ` Noam Postavsky
2018-01-23 23:19 ` Drew Adams
2018-01-24  0:02 ` Noam Postavsky
2018-01-28 15:52 ` Noam Postavsky
2018-02-02 18:52 ` Drew Adams
2018-02-02 19:08 ` Noam Postavsky
2018-02-02 21:37 ` Drew Adams
2018-02-02 22:14 ` Ista Zahn
2018-02-02 22:35 ` Noam Postavsky