* Change of Lisp syntax for "fancy" quotes in Emacs 27? @ 2018-02-02 22:24 Noam Postavsky 2018-02-02 22:52 ` Paul Eggert ` (3 more replies) 0 siblings, 4 replies; 98+ messages in thread From: Noam Postavsky @ 2018-02-02 22:24 UTC (permalink / raw) To: Emacs developers; +Cc: Drew Adams In Emacs 26 and earlier the following is valid Lisp code: (setq ’bar 42) (setq foo ’bar) In the current master branch, this will signal (invalid-read-syntax "strange quote" "’"). To write the equivalent the ’ must be backslash escaped: (setq \’bar 42) (setq foo \’bar) (the backslash escaping also works in earlier Emacs versions). The point of this change is to give a more straightforward error in cases where a curved quote is accidentally written instead of a plain straight one. In Bug#30217, Drew Adams strongly objects to this change. I don't want to "sneak" this in, so I'm asking here for people's thoughts on this. References: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=30217 https://debbugs.gnu.org/cgi/bugreport.cgi?bug=2967 PS In case anyone has trouble reading the example code (e.g., due to some email encoding failure), evaluating (insert "(setq \u2019bar 42)\n(setq foo \u2019bar)") will write it into your current buffer. ^ permalink raw reply [flat|nested] 98+ messages in thread
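To see which behavior a given Emacs build has, the two spellings from the message above can be fed to the reader directly. This is only a minimal sketch: `my-probe-curved-quote' is a hypothetical helper name, not part of Emacs, and the result for the unescaped form depends on the version being run.

```elisp
;; Sketch: probe how the running Emacs reads a symbol that starts with
;; U+2019 RIGHT SINGLE QUOTATION MARK.  `my-probe-curved-quote' is a
;; hypothetical helper, not an Emacs function.
(defun my-probe-curved-quote (text)
  "Read TEXT and return either the form read or the reader error data."
  (condition-case err
      (list 'read-ok (car (read-from-string text)))
    (invalid-read-syntax (list 'read-error (cdr err)))))

;; Unescaped: accepted by Emacs 26 and earlier; the master branch
;; discussed in this thread is reported to signal `invalid-read-syntax'.
(my-probe-curved-quote "(setq foo \u2019bar)")

;; Backslash-escaped: reported in this thread to read in all versions.
(my-probe-curved-quote "(setq foo \\\u2019bar)")
```

In both cases where the read succeeds, the second argument of `setq' is a symbol whose name begins with the curved quote; nothing is quoted.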
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky @ 2018-02-02 22:52 ` Paul Eggert 2018-02-03 0:00 ` Drew Adams 2018-02-03 8:33 ` Eli Zaretskii ` (2 subsequent siblings) 3 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-02-02 22:52 UTC (permalink / raw) To: Noam Postavsky, Emacs developers; +Cc: Drew Adams On 02/02/2018 02:24 PM, Noam Postavsky wrote: > In Bug#30217, Drew Adams strongly objects to this change. I don't want > to "sneak" this in, so I'm asking here for people's thoughts on this. I see two main categories of users here, with different needs. Less-expert users are likely to run into problems with quotes and other characters (that's why we got bug reports), and appreciate diagnostics pinpointing the problems; also, programmers concerned about security are likely to want these confusing characters to be diagnosed, to prevent an attacker from sending code that is easily read one way but actually operates in a different way. On the other hand, programs that generate Elisp code might prefer not having to special-case these characters. So perhaps there should be a buffer-local variable that controls which behavior is selected. The default behavior should be the one that caters better to general users and is safer. While we're on the topic, I suggest using the Unicode confusables list <http://www.unicode.org/Public/security/10.0.0/confusables.txt> to come up with a list of confusing alternatives for each character that has a special meaning in Emacs Lisp. This should be better than our trying to come up with our own, ad-hoc list. For example, U+A78C LATIN SMALL LETTER SALTILLO (ꞌ) looks almost exactly like an apostrophe on my screen and is in the confusables list, but is not a character that Emacs currently checks for. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-02 22:52 ` Paul Eggert @ 2018-02-03 0:00 ` Drew Adams 2018-02-03 0:09 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Drew Adams @ 2018-02-03 0:00 UTC (permalink / raw) To: Paul Eggert, Noam Postavsky, Emacs developers > I see two main categories of users here, with different needs. > Less-expert users are likely to run into problems with quotes > and other characters (that's why we got bug reports), and > appreciate diagnostics pinpointing the problems; also, > programmers concerned about security are likely to want these > confusing characters to be diagnosed, to prevent an attacker > from sending code that is easily read one way but actually > operates in a different way. > > On the other hand, programs that generate Elisp code might > prefer not having to special-case these characters. So > perhaps there should be a buffer-local variable that controls > which behavior is selected. The default behavior should be > the one that caters better to general users and is safer. The distinction I think needs to be made is between: 1. Trying to _warn users_ (all users, less-expert or not) about possible misuse of particularly confusable chars. This just warns about possible pilot error. 2. _Changing Lisp_ reading and evaluating, to treat some (all?) confusable characters specially, changing their syntax and requiring them to be escaped in order to be treated normally (i.e., as they have been treated so far). I object to #2, NOT to #1. #1: By all means, we should try to help users. We can issue byte-compilation warnings and some interactive warnings - provided we can helpfully and unambiguously distinguish the right situations. #2 changes Lisp in non-negligible, non-helpful ways. See bug #30217 for more. 
---- There are lots more characters to which the same non-bug "fix" of changing Lisp might be applied (which means that users will wonder why this confusable char is treated specially, and not that one). Such chars include pretty much anything that could be confused with anything that is ever used as a delimiter in Emacs Lisp: brackets (in the British sense) of all sorts: parens, square, angle, curly. There are really quite a few such bracket-confusables. Such chars also include pretty much anything that could be confused with any other chars that are used specially in Lisp: period, comma, quote, backquote, colon. Again: there are quite a few such confusables. They even include chars that could be confused with the directory separators used in Emacs Lisp. Finally (?), they include chars that could be confused with the ASCII-digit numerals 0123456789. There are lots of these confusables too. (Even with just ASCII there are confusables. Think of what some use in passwords or leet: zero vs uppercase letter O, digit 1 vs lowercase letter l, etc. We've just gotten used to carefully distinguishing such chars. Now there are many more, and slighter, differences to get used to.) ---- Beyond the question of which chars to treat specially, there's the question of where - in which contexts - to try to distinguish them. Contexts include such places as sexps being evaluated, doc strings, and comments. They can also include fonts: a given character might be confusable, or more confusable, in one font than in another. Even font size can make a difference (with some fonts I find myself zooming in to see whether a quote-thingy might really be a curly quote). The questions of which chars and where (context) are both relevant even if we only warn users (#1) and do not change Lisp syntax (#2). ---- At the very least, I would hope that if we do anything at all about this we would start by only warning. 
I really hope we will not change Lisp syntax for this, i.e., I hope we revert the change that has been made so far for Emacs 27. > While we're on the topic, I suggest using the Unicode > confusables list ... to come up with a list of confusing > alternatives for each character that has a special meaning > in Emacs Lisp. This should be better than our trying to > come up with our own, ad-hoc list. > > For example, U+A78C LATIN SMALL LETTER SALTILLO (ꞌ) looks > almost exactly like an apostrophe on my screen and is in > the confusables list, but is not a character that Emacs > currently checks for. Yup, and that's just one tiny tip of this terribly tippy iceberg. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 0:00 ` Drew Adams @ 2018-02-03 0:09 ` Paul Eggert 2018-02-03 0:39 ` Drew Adams 0 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-02-03 0:09 UTC (permalink / raw) To: Drew Adams, Noam Postavsky, Emacs developers On 02/02/2018 04:00 PM, Drew Adams wrote: > The distinction I think needs to be made is between: > > 1. Trying to _warn users_ (all users, less-expert or not) > about possible misuse of particularly confusable chars. > This just warns about possible pilot error. > > 2. _Changing Lisp_ reading and evaluating, to treat some > (all?) confusable characters specially, changing their > syntax and requiring them to be escaped in order to be > treated normally (i.e., as they have been treated so far). > > I object to #2, NOT to #1. I don't see a clear distinction between #1 and #2. For example, in an adversarial environment, users who get warned about suspicious characters in their incoming source files will most likely type "no" when asked to run such code. In that case, if you want your audience to include users who care even a smidgen about security, you'll need to escape confusable characters in the business parts of your Emacs Lisp code. Effectively that will be a change to Emacs Lisp, even if its formal syntax does not change. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 0:09 ` Paul Eggert @ 2018-02-03 0:39 ` Drew Adams 0 siblings, 0 replies; 98+ messages in thread From: Drew Adams @ 2018-02-03 0:39 UTC (permalink / raw) To: Paul Eggert, Noam Postavsky, Emacs developers > > The distinction I think needs to be made is between: > > > > 1. Trying to _warn users_ (all users, less-expert or not) > > about possible misuse of particularly confusable chars. > > This just warns about possible pilot error. > > > > 2. _Changing Lisp_ reading and evaluating, to treat some > > (all?) confusable characters specially, changing their > > syntax and requiring them to be escaped in order to be > > treated normally (i.e., as they have been treated so far). > > > > I object to #2, NOT to #1. > > I don't see a clear distinction between #1 and #2. That's too bad. They are really quite different. In the first case, you get a warning. In the second case your code breaks. > For example, in an adversarial environment... I don't think that's the reason for this change at all. It was not mentioned in the bug thread, AFAIK. The motivation was to prevent confusion on the part of users, not to prevent or avoid malevolent behavior. Please see the bug thread (#30217). The idea was to improve convenience and reduce confusion by someone who copy+pastes code from a web page (for example), when (for example) that page renders a normal quote as a curly quote. You want to introduce a security aspect here. I can't speak much to that. I'll simply ask whether other Lisps (e.g. Common Lisp) worry about such a risk? What does Clojure do about confusables in Lisp symbols? Does any other Lisp change the Lisp syntax and behavior to require special escaping of such chars in symbols (or elsewhere)? Sure, even if no other Lisp worries about this or takes the same approach as that proposed, that's not a proof that Emacs Lisp shouldn't. Still... 
Given enough motivation, you can already, today, create Lisp code (confusing, confusable, or otherwise) that is evil, even without using any confusable Unicode chars. When I was a kid we would play tricks on each other, changing a character somewhere in a friend's large deck of punched Hollerith cards - e.g., insert or remove a decimal point. You had to wait a full day to get back the result of your program run, and the result would only be a pretty cryptic error msg. Argggh! It was just good-natured fun - a game among friends. And that was only with assembler and Fortran, and we were just newbie kids. Imagine what you can do today, without bothering to rely on close Unicode confusables. Sorry, but your "security" argument just doesn't pass muster, for me. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky 2018-02-02 22:52 ` Paul Eggert @ 2018-02-03 8:33 ` Eli Zaretskii 2018-02-03 16:16 ` Drew Adams 2018-02-03 18:13 ` Aaron Ecay 2018-10-05 0:03 ` Noam Postavsky 3 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-02-03 8:33 UTC (permalink / raw) To: Noam Postavsky; +Cc: drew.adams, emacs-devel > From: Noam Postavsky <npostavs@users.sourceforge.net> > Date: Fri, 2 Feb 2018 17:24:43 -0500 > Cc: Drew Adams <drew.adams@oracle.com> > > In Emacs 26 and earlier the following is valid lisp code: > > (setq ’bar 42) > (setq foo ’bar) > > In the current master branch, this will signal (invalid-read-syntax > "strange quote" "’"). To write the equivalent the ’ must be backslash > escaped: > > (setq \’bar 42) > (setq foo \’bar) > > (the backslash escaping also works in earlier Emacs versions). > > The point of this change is to give a more straightforward error in > cases where a curved quote is accidentally written instead of > a plain straight one. The bug reports which triggered the above changes are bug#2967 and bug#23425. So any proposal to remove those changes should also suggest an alternative for handling those bug reports. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 8:33 ` Eli Zaretskii @ 2018-02-03 16:16 ` Drew Adams 2018-02-03 17:05 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Drew Adams @ 2018-02-03 16:16 UTC (permalink / raw) To: Eli Zaretskii, Noam Postavsky; +Cc: emacs-devel > The bug reports which triggered the above changes are bug#2967 and > bug#23425. So any proposal to remove those changes should also > suggest an alternative for handling those bug reports. For "handling those bug reports"? Are we to add more cans of worms to this question, obscuring it? AFAICT, no alternatives to handling those bugs are needed because of reverting the Lisp syntax change made for bug #30217. Can you point to how/why reverting that change would necessitate alternative fixes for those bugs? Bug #2967 just asked for a warning, e.g. during byte-compilation or loading. There's no objection here to warning. Bug #2967 did not ask for (or get) a change in Lisp syntax. I see no negative impact on #2967 from reverting the Lisp-syntax "fix" to #30217. Even #30217 did not ask for such a syntax change. Warning is sufficient for fixing #30217 too. Bug #23425, on the other hand, is a gigantic stream-of-consciousness about anything and everything to do with Paul's changes to Emacs over the last few years wrt curly quotes. It's not a single bug report thread - it's all over the map. In any case, #23425, like #2967 (and even #30217), is not about what was done to "fix" #30217 - changing Lisp syntax for fancy quotes. How is it helpful to throw all of #23425 into this Lisp syntax-change question, as if the present issue puts into question everything ever discussed about curly quotes? Or do you have something specific in mind here wrt #23425 - some part of it? Something that would actually be impacted negatively by reverting the Lisp syntax changes for #30217? If so, please identify it. 
But if you mean only the ability to get confused by copy+pasting Lisp code that has a fancy quote mark somewhere in place of ordinary ASCII apostrophe ('), e.g., (setq foo ’bar), then that's just the same pilot-error gotcha as for bug #30217. There are many gotchas in Lisp. You can see repeated postings of some at various places (e.g., help-gnu-emacs, emacs.stackexchange). E.g., the error that a given Lisp function is not defined (because its library was not loaded). The pilot error described in bug #30217 is not even a commonly reported one. The "fix" made in #30217 is an overreaction. So one solution to #30217 is to do nothing - just revert the misguided Lisp syntax change. Users will learn that gotcha the same way they learn others. Not every report of a gotcha needs to lead to changes to Emacs. If we do nothing there will continue to be some such pilot errors, of course. But we already raise an error if the code leads to a problem. And the original error message from bug #23425 is _more_ meaningful and helpful, not less, than the new one after the "fix". The original error msg of #23425: (wrong-number-of-arguments setq 31) tells you pretty much that setq is missing an argument or it has too many, which makes you look at its arguments. Not so obscure. And accurate. The new error msg: (invalid-read-syntax "strange quote" "’") is obscure. Invalid read syntax when reading what? What's invalid about it? Confusion - not understanding an accurate error msg, is not the same thing as Lisp itself having a bug because such a character is included in a symbol name. Another solution is to try to warn users about the use of confusables. That's actually many solutions, because it requires handling different chars and different gotcha contexts differently, and carefully. But unlike a syntax change it's not an all-or-nothing thing: we could add warnings here and there, as something might be better than nothing. Either doing nothing or trying to warn about such gotchas is right. 
Changing Lisp syntax here is not right. Lisp doesn't have a bug here. This is all about pilot error - the same kind of thing that happens when someone mistypes `,' for `.' for dotted-pair syntax, or types `.' in `a.b' intending dotted-pair syntax but getting a symbol instead, or quotes a sexp expecting the sexp to be evaluated. Yes, a user might scratch her head when seeing the error message from such a mistake, but the error message is right, not wrong, and eventually the light turns on. And this enlightenment is aided by the fact that Lisp syntax is so simple. The "fix" for bug #30217 goes in the opposite direction. It makes Lisp syntax more complex and makes understanding syntax mistakes more difficult. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 16:16 ` Drew Adams @ 2018-02-03 17:05 ` Eli Zaretskii 2018-02-04 1:16 ` Michael Heerdegen ` (3 more replies) 0 siblings, 4 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-02-03 17:05 UTC (permalink / raw) To: Drew Adams; +Cc: emacs-devel, npostavs > Date: Sat, 3 Feb 2018 08:16:15 -0800 (PST) > From: Drew Adams <drew.adams@oracle.com> > Cc: emacs-devel@gnu.org > > > The bug reports which triggered the above changes are bug#2967 and > > bug#23425. So any proposal to remove those changes should also > > suggest an alternative for handling those bug reports. > > For "handling those bug reports"? Are we to add > more cans of worms to this question, obscuring it? > > AFAICT, no alternatives to handling those bugs > are needed because of reverting the Lisp syntax > change made for bug #30217. Can you point to > how/why reverting that change would necessitate > alternative fixes for those bugs? Those bug reports complained about obscure error messages that are unhelpful when a Lisp programmer tries to figure out the root cause. I'm saying that we should find an alternative way of making clear, helpful error messages in those special cases where characters which display similarly might make the error message confusing if it just cites the symbol's name. For example, suppose you have a Lisp program that produces the following error message when compiled/executed: Symbol's value as variable is void: 'аbbrevs-changed You then type "C-h v abbrevs-changed RET" and get the expected result, meaning that the variable is known to Emacs. How quickly will you be able to spot the cause of the error message? The change that got reverted from the emacs-26 branch was about a similar case, but for a character that's much more important for Lisp than 'a': it's about the character used to quote symbol names. 
But the essence is the same: due to how characters are displayed, some characters can be confused for others. We want to find a way of identifying such situations and telling the Lisp programmer about them in clear and easily understandable ways. One way, perhaps too radical a one, is to reject such "confusable" characters outright. We could decide that we don't want such a radical solution, but that doesn't mean we should give up on the attempt to find some other solution for the problem. Neither does it mean we should proclaim people who installed the change as enemies of society. > Bug #23425, on the other hand, is a gigantic > stream-of-consciousness about anything and > everything [...] > [...] > How is it helpful to throw all of #23425 into > this Lisp syntax-change question, as if the > present issue puts into question everything > ever discussed about curly quotes? I could turn the tables and ask you how it is helpful to dump on us all your random thoughts about this, instead of simply saying you didn't understand the relevance and asking for more explanations. Which I just provided. I hope now the issue is clear enough. > And the original error message from bug #23425 > is _more_ meaningful and helpful, not less, > than the new one after the "fix". > > The original error msg of #23425: > (wrong-number-of-arguments setq 31) > > tells you pretty much that setq is missing an > argument or it has too many, which makes you > look at its arguments. Not so obscure. And > accurate. > > The new error msg: > (invalid-read-syntax "strange quote" "’") > > is obscure. Invalid read syntax when reading > what? What's invalid about it? I think you are so eager to make your point that you are willing to claim that black is white and vice versa. Any objective person would agree that the new error message is more directly pointing to the root cause, which is the syntax of specifying a quoted symbol name using a "strange quote". 
If we are good at writing and indexing our ELisp manual, then I'd expect to find there an index entry for "strange quote", which will land me where this issue is explained. Case closed. Once again, I can agree that this measure might be too harsh, but I would still like to see clear diagnostics of such typos, and like Paul, I think we should take our inspiration from the Unicode Standard's notion of "confusables". Ideas and proposals for patches along those lines are welcome. Ignoring the problem, or trying to convince us that it doesn't exist, is not. > Either doing nothing or trying to warn about such > gotchas is right. Changing Lisp syntax here is > not right. Doing nothing would be ignoring the problem. That changing Lisp syntax is not right is your opinion: legitimate, but clearly not shared by at least some. > Lisp doesn't have a bug here. That's a strawman, and you know it. We are talking about diagnostics for bugs in Lisp programs. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 17:05 ` Eli Zaretskii @ 2018-02-04 1:16 ` Michael Heerdegen 2018-02-04 1:25 ` Clément Pit-Claudel ` (2 more replies) 2018-02-04 1:55 ` Drew Adams ` (2 subsequent siblings) 3 siblings, 3 replies; 98+ messages in thread From: Michael Heerdegen @ 2018-02-04 1:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, Drew Adams, emacs-devel Hello, Helpfulness of error messages surely depends on the beholder, and on expectations. In my eyes, > Symbol's value as variable is void: 'аbbrevs-changed is quite clear: you think this ^^^^^^^^^^^^^^^^ is a quoted thing, but the error message calls it a symbol. So there must be a problem with that quote, it has obviously gotten read as part of the symbol. Sure, you have still to find out why. OTOH > > (invalid-read-syntax "strange quote" "’") also doesn't say what's wrong with that quote. It even calls something a quote where there is none. The error message is confusing. Repeating the pseudo quote character in the error message doesn't make it look less like a quote. > I think you are so eager to make your point that you are willing to > claim that black is white and vice versa. Any objective person would > agree that the new error message is more directly pointing to the root > cause Are you really sure that every Emacs user would expect that we modify the Lisp reader to catch typos? FWIW, we already modified the Lisp reader to catch another style issue (to get rid of old-style backquotes) and made it error. It broke my stuff (el-search) horribly - though I don't use old-style backquotes, and for code that also doesn't use them. Now I need to work around `read' and define my own `read' function. I also need to remember for a long time that using `read' is forbidden in my library. I even implemented a minor mode to warn me just about that: it warns me that I use `read' and it's forbidden. 
Otherwise, I would get strange errors when using my stuff, from time to time, whenever I added a `read' by accident. All other users of my package, too. And believe me, _these_ error messages are then less understandable than > Symbol's value as variable is void: 'аbbrevs-changed. Misusing something as fundamental as the Lisp reader to catch such stuff should be the very last resort. The result can get much more confusing in situations we aren't thinking about now. > > Lisp doesn't have a bug here. > That's a strawman, and you know it. We are talking about diagnostics > for bugs in Lisp programs. I think it's a legitimate argument. Drew just thinks it's the wrong fix. He may also think that no fix would maybe suffice. That's ok, and I think he made some good points. We should discuss alternative approaches to move forward. People often paste stuff into scratch or the M-: prompt that they copied from elsewhere. Maybe we could make M-: and C-x C-e check for this problem. These could also check for other, similar frequent problems. Any better suggestions? Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 1:16 ` Michael Heerdegen @ 2018-02-04 1:25 ` Clément Pit-Claudel 2018-02-04 2:05 ` Drew Adams ` (2 more replies) 2018-02-04 11:15 ` Alan Mackenzie 2018-02-04 14:47 ` Noam Postavsky 2 siblings, 3 replies; 98+ messages in thread From: Clément Pit-Claudel @ 2018-02-04 1:25 UTC (permalink / raw) To: Michael Heerdegen, Eli Zaretskii; +Cc: emacs-devel, Drew Adams, npostavs On 2018-02-03 20:16, Michael Heerdegen wrote: > Helpfulness of error messages surely depends on the beholder, and on > expectations. In my eyes, > >> Symbol's value as variable is void: 'аbbrevs-changed > is quite clear: you think this ^^^^^^^^^^^^^^^^ is a quoted > thing, but the error message calls it a symbol. So there must be a > problem with that quote, it has obviously gotten read as part of the > symbol. Sure, you have still to find out why. I think you're making Eli's point, actually :) The problem isn't the quote: it's the CYRILLIC SMALL LETTER A instead of LATIN SMALL LETTER A. IOW, (string= "аbbrevs-changed" "abbrevs-changed") is nil. I think Eli was illustrating the confusion that can stem from Unicode confusables (and I must agree that the error message could be much better ^^) Clément. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 1:25 ` Clément Pit-Claudel @ 2018-02-04 2:05 ` Drew Adams 2018-02-04 2:06 ` Michael Heerdegen 2018-02-04 10:34 ` Alan Third 2 siblings, 0 replies; 98+ messages in thread From: Drew Adams @ 2018-02-04 2:05 UTC (permalink / raw) To: Clément Pit-Claudel, Michael Heerdegen, Eli Zaretskii Cc: emacs-devel, npostavs > > Helpfulness of error messages surely depends on the beholder, and on > > expectations. In my eyes, > > > >> Symbol's value as variable is void: 'аbbrevs-changed > > is quite clear: you think this ^^^^^^^^^^^^^^^^ is a quoted > > thing, but the error message calls it a symbol. So there must be a > > problem with that quote, it has obviously gotten read as part of the > > symbol. Sure, you have still to find out why. > > I think you're making Eli's point, actually :) > > The problem isn't the quote: it's the CYRILLIC SMALL LETTER A instead of > LATIN SMALL LETTER A. IOW, (string= "аbbrevs-changed" "abbrevs-changed") > is nil. > > I think Eli was illustrating the confusion that can stem from Unicode > confusables (and I must agree that the error message could be much > better ^^) I too misread Eli's example as being about using a curly quote instead of an apostrophe. You're right that it's an ordinary apostrophe and the first `a' is the letter you mention. But then why would anyone ever see the quote mark in such a message? Was the message artificially configured? In any case, if that example, without the quote, say, is trying to make Eli's point, then he must be arguing for warning about using such confusables also - `а' as a confusable for `a'. That's a monumental undertaking (take a look at the confusables.txt list). And the messages (warning or error) would need to be pretty darn clear about just what char was used and where, in order not to sow even more confusion. It sure won't cut the mustard to just say "Invalid read syntax"! 
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 1:25 ` Clément Pit-Claudel 2018-02-04 2:05 ` Drew Adams @ 2018-02-04 2:06 ` Michael Heerdegen 2018-02-04 10:34 ` Alan Third 2 siblings, 0 replies; 98+ messages in thread From: Michael Heerdegen @ 2018-02-04 2:06 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: Eli Zaretskii, emacs-devel, Drew Adams, npostavs Clément Pit-Claudel <cpitclaudel@gmail.com> writes: > On 2018-02-03 20:16, Michael Heerdegen wrote: > > Helpfulness of error messages surely depends on the beholder, and on > > expectations. In my eyes, > > > >> Symbol's value as variable is void: 'аbbrevs-changed > > is quite clear: you think this ^^^^^^^^^^^^^^^^ is a quoted > > thing, but the error message calls it a symbol. So there must be a > > problem with that quote, it has obviously gotten read as part of the > > symbol. Sure, you have still to find out why. > > I think you're making Eli's point, actually :) > > The problem isn't the quote: it's the CYRILLIC SMALL LETTER A instead > of LATIN SMALL LETTER A. IOW, (string= "аbbrevs-changed" > "abbrevs-changed") is nil. Oh. Why is there a quote in this error message, then? FWIW, I'm not against doing something that helps the user in such situations. But these are problems in the interaction between the user and Emacs, so we should care about them on that (the interface) level. And keep Lisp, the language, simple. Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 1:25 ` Clément Pit-Claudel 2018-02-04 2:05 ` Drew Adams 2018-02-04 2:06 ` Michael Heerdegen @ 2018-02-04 10:34 ` Alan Third 2018-02-04 15:36 ` Clément Pit-Claudel 2 siblings, 1 reply; 98+ messages in thread From: Alan Third @ 2018-02-04 10:34 UTC (permalink / raw) To: Clément Pit-Claudel Cc: Michael Heerdegen, Eli Zaretskii, npostavs, Drew Adams, emacs-devel On Sat, Feb 03, 2018 at 08:25:01PM -0500, Clément Pit-Claudel wrote: > On 2018-02-03 20:16, Michael Heerdegen wrote: > > Helpfulness of error messages surely depends on the beholder, and on > > expectations. In my eyes, > > > >> Symbol's value as variable is void: 'аbbrevs-changed > > is quite clear: you think this ^^^^^^^^^^^^^^^^ is a quoted > > thing, but the error message calls it a symbol. So there must be a > > problem with that quote, it has obviously gotten read as part of the > > symbol. Sure, you have still to find out why. > > I think you're making Eli's point, actually :) > > The problem isn't the quote: it's the CYRILLIC SMALL LETTER A > instead of LATIN SMALL LETTER A. IOW, (string= "аbbrevs-changed" > "abbrevs-changed") is nil. > > I think Eli was illustrating the confusion that can stem from > Unicode confusables (and I must agree that the error message could > be much better ^^) Something like: Symbol's value as variable is void: 'аbbrevs-changed Did you mean `abbrevs-changed'? Symbol contains `а' (CYRILLIC SMALL LETTER A) at character 0, did you mean `a' (LATIN SMALL LETTER A)? The middle line would require Emacs to do a fuzzy search for similar symbols, which may be too much. Something like that could be helpful even in cases where the name has been mistyped (abbrev-changed instead of abbrevs-changed, for example). -- Alan Third ^ permalink raw reply [flat|nested] 98+ messages in thread
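The last line of the message proposed above (the one naming the character and its position) needs no fuzzy search; it can be built from the Unicode character name alone. The following is a rough sketch under that assumption. `my-describe-non-ascii' is a hypothetical name, and treating every non-ASCII character as suspicious is a simplification made only for illustration.

```elisp
;; Sketch: describe each non-ASCII character in a symbol name, in the
;; style of the message proposed above.  `my-describe-non-ascii' is a
;; hypothetical helper, not an Emacs function.
(defun my-describe-non-ascii (name)
  "Return a list of strings describing each non-ASCII character in NAME."
  (let ((pos 0)
        notes)
    (dolist (ch (string-to-list name))
      (when (> ch 127)
        ;; `get-char-code-property' with `name' gives the Unicode name.
        (push (format "`%c' (%s) at character %d"
                      ch (get-char-code-property ch 'name) pos)
              notes))
      (setq pos (1+ pos)))
    (nreverse notes)))

;; The symbol name from the example, with CYRILLIC SMALL LETTER A first.
(my-describe-non-ascii "\u0430bbrevs-changed")
```

A real diagnostic would presumably restrict itself to characters that are actually confusable with ASCII, rather than flagging every non-ASCII character as this sketch does.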
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 10:34 ` Alan Third @ 2018-02-04 15:36 ` Clément Pit-Claudel 2018-02-04 17:37 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Clément Pit-Claudel @ 2018-02-04 15:36 UTC (permalink / raw) To: Alan Third Cc: Michael Heerdegen, Eli Zaretskii, npostavs, Drew Adams, emacs-devel On 2018-02-04 05:34, Alan Third wrote: > Symbol's value as variable is void: 'аbbrevs-changed > Did you mean `abbrevs-changed'? > Symbol contains `а' (CYRILLIC SMALL LETTER A) at character 0, did you > mean `a' (LATIN SMALL LETTER A)? That would be pretty cool. > The middle line would require Emacs to do a fuzzy search for similar > symbols, which may be too much. OCaml does this (but at compile time). Do we have a way to delay the fuzzy search to the point when the error message is displayed? Otherwise we'll pay the price of the search even if the error is then swallowed by a condition-case. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 15:36 ` Clément Pit-Claudel @ 2018-02-04 17:37 ` Eli Zaretskii 2018-02-04 21:31 ` Noam Postavsky 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-02-04 17:37 UTC (permalink / raw) To: Clément Pit-Claudel Cc: michael_heerdegen, alan, npostavs, drew.adams, emacs-devel > Cc: Michael Heerdegen <michael_heerdegen@web.de>, Eli Zaretskii > <eliz@gnu.org>, emacs-devel@gnu.org, Drew Adams <drew.adams@oracle.com>, > npostavs@users.sourceforge.net > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Sun, 4 Feb 2018 10:36:49 -0500 > > > The middle line would require Emacs to do a fuzzy search for similar > > symbols, which may be too much. > > OCaml does this (but at compile time). Do we have a way to delay the fuzzy search to the point when the error message is displayed? Otherwise we'll pay the price of the search even if the error is then swallowed by a condition-case. Isn't this premature optimization? We aren't even sure yet that such a fuzzy search will be too expensive. We could, for example, implement the confusables as a char-table, which would make it fast enough, I think. ^ permalink raw reply [flat|nested] 98+ messages in thread
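For illustration, a minimal Emacs Lisp sketch of the char-table idea mentioned above. The table `my-confusable-table' and the function `my-first-confusable' are hypothetical names, and the three entries are hand-picked rather than taken from the Unicode confusables file. The point is only that each character costs one `aref' on the table, so scanning a symbol name is cheap.

```elisp
;; Hypothetical table, not part of Emacs: each confusable character
;; maps to the ASCII character it resembles; all other entries are nil.
(defvar my-confusable-table
  (let ((table (make-char-table 'my-confusable)))
    (aset table #x2019 ?\')   ; RIGHT SINGLE QUOTATION MARK
    (aset table #x2018 ?\`)   ; LEFT SINGLE QUOTATION MARK
    (aset table #x0430 ?a)    ; CYRILLIC SMALL LETTER A
    table)
  "Char-table of confusable characters; illustration only.")

(defun my-first-confusable (name)
  "Return (POSITION CHAR LOOKALIKE) for the first confusable in NAME, or nil."
  (let ((i 0)
        (len (length name))
        found)
    (while (and (not found) (< i len))
      (let ((plain (aref my-confusable-table (aref name i))))
        (when plain
          (setq found (list i (aref name i) plain))))
      (setq i (1+ i)))
    found))
```

Whether a lookup like this is fast enough in practice, and whether it should run at signal time or only when the message is displayed, is exactly the open question in this subthread.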
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 17:37 ` Eli Zaretskii @ 2018-02-04 21:31 ` Noam Postavsky 0 siblings, 0 replies; 98+ messages in thread From: Noam Postavsky @ 2018-02-04 21:31 UTC (permalink / raw) To: Eli Zaretskii Cc: Michael Heerdegen, Clément Pit-Claudel, Alan Third, Drew Adams, Emacs developers [-- Attachment #1: Type: text/plain, Size: 804 bytes --] On Sun, Feb 4, 2018 at 12:37 PM, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Sun, 4 Feb 2018 10:36:49 -0500 >> >> > The middle line would require Emacs to do a fuzzy search for similar >> > symbols, which may be too much. >> >> OCaml does this (but at compile time). Do we have a way to delay the fuzzy search to the point when the error message is displayed? Otherwise we'll pay the price of the search even if the error is then swallowed by a condition-case. > > Isn't this premature optimization? I think the check fits nicely into command-error-default-function though. Attaching a quick proof-of-concept (handles only a single curved quote at the beginning of symbol name). We would want something also for the byte-compiler. [-- Attachment #2: v1-0001-sketch-Catch-strange-quotes-on-error-time.patch --] [-- Type: text/x-diff, Size: 3166 bytes --] From c9d1e761cea56e94d9ad3d783c8ed7fcf448b082 Mon Sep 17 00:00:00 2001 From: Noam Postavsky <npostavs@gmail.com> Date: Sun, 4 Feb 2018 16:20:32 -0500 Subject: [PATCH v1] [sketch] Catch strange quotes on error time * src/keyboard.c (Fcommand_error_default_function): Check for RIGHT SINGLE QUOTATION MARK and give a more detailed message. TODO: check for other confusables. * src/lread.c (read1): Don't signal error on strange quotes. * src/eval.c (Fsetq): Pass full arglist in error data. TODO: the same for all the other Qwrong_number_of_arguments cases. 
--- src/eval.c | 3 ++- src/keyboard.c | 20 ++++++++++++++++++++ src/lread.c | 7 ------- 3 files changed, 22 insertions(+), 8 deletions(-) diff --git a/src/eval.c b/src/eval.c index 7db4dbcf18..db61c0421f 100644 --- a/src/eval.c +++ b/src/eval.c @@ -507,7 +507,8 @@ DEFUN ("setq", Fsetq, Ssetq, 0, UNEVALLED, 0, Lisp_Object sym = XCAR (tail), lex_binding; tail = XCDR (tail); if (!CONSP (tail)) - xsignal2 (Qwrong_number_of_arguments, Qsetq, make_number (nargs + 1)); + xsignal3 (Qwrong_number_of_arguments, Qsetq, + make_number (nargs + 1), args); Lisp_Object arg = XCAR (tail); tail = XCDR (tail); val = eval_sub (arg); diff --git a/src/keyboard.c b/src/keyboard.c index 4324991da4..24c5f66934 100644 --- a/src/keyboard.c +++ b/src/keyboard.c @@ -1047,6 +1047,26 @@ DEFUN ("command-error-default-function", Fcommand_error_default_function, bitch_at_user (); print_error_message (data, Qt, SSDATA (context), signal); + + Lisp_Object errname = Fcar (data); + /* TODO: Add arglist to Qwrong_number_of_arguments errors, and + check those too. */ + if (EQ (errname, Qvoid_variable)) + { + Lisp_Object void_symname = Fsymbol_name (Fnth (make_number (1), data)); + if (SCHARS (void_symname) > 0 && + /* TODO: check all confusables. */ + EQ (Faref (void_symname, make_number (0)), make_number (0x2019))) + { + Lisp_Object msg = CALLN + (Fformat_message, + build_string ("\nSymbol starts with `%c' (%s) at character 0," + " did you mean `%c' (%s)?"), + make_number (0x2019), build_string ("RIGHT SINGLE QUOTATION MARK"), + make_number ('\''), build_string ("APOSTROPHE")); + Fprinc (msg, Qt); + } + } } return Qnil; } diff --git a/src/lread.c b/src/lread.c index 3b0a17c90b..ee08902f81 100644 --- a/src/lread.c +++ b/src/lread.c @@ -3470,13 +3470,6 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list) if (!
NILP (result)) return unbind_to (count, result); } - if (!quoted && multibyte) - { - int ch = STRING_CHAR ((unsigned char *) read_buffer); - if (confusable_symbol_character_p (ch)) - xsignal2 (Qinvalid_read_syntax, build_string ("strange quote"), - CALLN (Fstring, make_number (ch))); - } { Lisp_Object result; ptrdiff_t nbytes = p - read_buffer; -- 2.11.0 ^ permalink raw reply related [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 1:16 ` Michael Heerdegen 2018-02-04 1:25 ` Clément Pit-Claudel @ 2018-02-04 11:15 ` Alan Mackenzie 2018-02-04 15:54 ` Drew Adams 2018-02-04 14:47 ` Noam Postavsky 2 siblings, 1 reply; 98+ messages in thread From: Alan Mackenzie @ 2018-02-04 11:15 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Eli Zaretskii, emacs-devel, Drew Adams, npostavs Hello, Michael. On Sun, Feb 04, 2018 at 02:16:52 +0100, Michael Heerdegen wrote: > Hello, > Helpfulness of error messages surely depends on the beholder, and on > expectations. In my eyes, > > Symbol's value as variable is void: 'аbbrevs-changed > is quite clear: you think this ^^^^^^^^^^^^^^^^ is a quoted > thing, but the error message calls it a symbol. So there must be a > problem with that quote, it has obviously gotten read as part of the > symbol. Sure, you have still to find out why. OTOH This has actually happened to me. In the error message, I didn't see the quote as part of the symbol, I subconsciously dismissed it as a quoting convention in the error message. So what my brain saw was Symbol's value as variable is void: abbrevs-changed . This puzzled me a long time. > > > (invalid-read-syntax "strange quote" "’") > also doesn't say what's wrong with that quote. It even calls something > a quote where there is none. Perhaps "strange quasi quote" would be more emphatic and clearer. > The error message is confusing. Repeating the pseudo quote character > in the error message doesn't make it look less like a quote. Agreed, on both points. > > I think you are so eager to make your point that you are willing to > > claim that black is white and vice versa. Any objective person would > > agree that the new error message is more directly pointing to the root > > cause > Are you really sure that every Emacs user would expect that we modify > the Lisp reader to catch typos? We're not talking about typos here. 
The curly quotes aren't present on typical keyboard layouts (though I'm informed they are present on Finnish keyboards), so nobody who isn't Finnish will type one of these characters by accident. We're talking about Emacs itself corrupting ASCII quotes into curly quotes in a `message' call because of the default setting of `text-quoting-style', and so on. Because of this, the error message should concentrate on that quote, not the strange symbol, which Emacs itself created. [ .... ] > > Symbol's value as variable is void: 'аbbrevs-changed. > Misusing something as fundamental as the Lisp reader to catch such stuff > should be the very last resort. The result can get much more confusing > in situations we don't think about now. Maybe we're already at the last resort for this problem. Or maybe not. Maybe an error message for unknown symbols should check for them beginning with a curly quote. > > > Lisp doesn't have a bug here. > > That's a strawman, and you know it. We are talking about diagnostics > > for bugs in Lisp programs. > I think it's an eligible argument. Drew just thinks it's the wrong fix. > He may also think that no fix would maybe suffice. That's ok, and I > think he made some good points. > We should discuss alternative approaches to move forward. People > often paste stuff into scratch or the M-: prompt that they copied from > elsewhere. Maybe we could make M-: and C-x C-e check for this problem. > These could also check for other, similar frequent problems. Any better > suggestions? I think that's a good suggestion. > Michael. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 11:15 ` Alan Mackenzie @ 2018-02-04 15:54 ` Drew Adams 0 siblings, 0 replies; 98+ messages in thread From: Drew Adams @ 2018-02-04 15:54 UTC (permalink / raw) To: Alan Mackenzie, Michael Heerdegen; +Cc: Eli Zaretskii, emacs-devel, npostavs > We're not talking about typos here. The curly quotes aren't present on > typical keyboard layouts (though I'm informed they are present on > Finnish keyboards), so nobody who isn't Finnish will type one of these > characters by accident. We're talking about Emacs itself corrupting > ASCII quotes into curly quotes in a `message' call because of the > default setting of `text-quoting-style', and so on. > > Because of this, the error message should concentrate on that quote, not > the strange symbol, which Emacs itself created. Not necessarily. Although I share your concern about Emacs promulgating curly quotes, there is a real usage problem akin to "typos": users copying text, including Lisp code, from a web page or elsewhere, and pasting it into Emacs as code to be evaluated at some point. If the source of the copy has already changed simple apostrophe to a curly quote (as one example) then that's what gets passed to Emacs. The person copy+pasting may well have no clue about just which characters are being copied. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 1:16 ` Michael Heerdegen 2018-02-04 1:25 ` Clément Pit-Claudel 2018-02-04 11:15 ` Alan Mackenzie @ 2018-02-04 14:47 ` Noam Postavsky 2 siblings, 0 replies; 98+ messages in thread From: Noam Postavsky @ 2018-02-04 14:47 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Eli Zaretskii, Drew Adams, Emacs developers On Sat, Feb 3, 2018 at 8:16 PM, Michael Heerdegen <michael_heerdegen@web.de> wrote: > FWIW, we already modified the Lisp reader to catch another style issue > (to get rid of old-style backquotes) and made it error. It broke my > stuff (el-search) horribly - though I don't use old-style backquotes, > and for code that also doesn't use them. That backquote change made `read' signal errors when reading subexpressions of otherwise valid code. The change under discussion changes what is valid code, so you won't have the problem of getting read errors for valid code. (changing what is valid Lisp has other drawbacks, as Drew has repeatedly pointed out) ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 17:05 ` Eli Zaretskii 2018-02-04 1:16 ` Michael Heerdegen @ 2018-02-04 1:55 ` Drew Adams 2018-02-04 2:10 ` Noam Postavsky 2018-02-05 1:06 ` Why "symbol's value" error about a list? Richard Stallman 2018-02-05 1:06 ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Richard Stallman 3 siblings, 1 reply; 98+ messages in thread From: Drew Adams @ 2018-02-04 1:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, npostavs > Those bug reports complained about obscure error messages that are > unhelpful when a Lisp programmer tries to figure out the root cause. > I'm saying that we should find an alternative way of making clear, > helpful error messages in those special cases where characters which > display similarly might make the error message confusing if it just > cites the symbol's name. OK. Except I would say warnings, not error messages, at least in most cases. But even if we have an error message, that's not a call to change the syntax of Lisp. User errors happen. We should just want to help users avoid making such errors. > For example, suppose you have a Lisp program that produces > the following error message when compiled/executed: > > Symbol's value as variable is void: 'аbbrevs-changed > > You then type "C-h v abbrevs-changed RET" and get the expected result, > meaning that the variable is known to Emacs. How quickly will you be > able to spot the cause of the error message? Some people will wonder for a while. Others, perhaps already bitten by this gotcha, will notice the quote mark there right away. One thing that would help, I think, and which should be done in general, would be to put the offending thingie between `...': Symbol's value as variable is void: `'аbbrevs-changed' That makes it more obvious that the symbol name includes that fancy quote char. 
Still, all of this is pilot error, where "pilot" can include the user who wrote the code but more likely means a user who copy+pasted it. > The change that got reverted from the emacs-26 branch was about a > similar case, but for a character that's much more important for Lisp > than 'a': it's about the character used to quote symbol names. But > the essence is the same: due to how characters are displayed, some > characters can be confused for others. > > We want to find a way of identifying such situation and telling the > Lisp programmer about that in clear and easily understandable ways. > One way, perhaps too radical one, is to reject such "confusable" > characters outright. We could decide that we don't want such a > radical solution, but that doesn't mean we should give up on the > attempt to find some other solution for the problem. Neither does it > mean we should proclaim people who installed the change as enemies of > the society. Agreed. As I've said, I'm in favor of providing friendly warnings/reminders that point out that such a character is present. I think that should be enough. There are lots of potential confusables, and lots of different use contexts. But if we start with just one or two such chars and one or two common and clear contexts where a warning might help, that would be good. We can always add more such warnings as cases come up (get reported or otherwise become obvious). It would be an overreaction, IMO, to jump to changing the existing Lisp syntax to raise errors when someone uses such a character in, say, a symbol name. We should not require such chars to be escaped in a symbol name. Such chars have no special meaning for Lisp (unlike `.', `,' `'', ``', `(', `)', `[', `]', `"', `<', `>', `#' `;', and perhaps some more). > > Bug #23425, on the other hand, is a gigantic > > stream-of-consciousness about anything and > > everything [...] > > [...] 
> > How is it helpful to throw all of #23425 into > > this Lisp syntax-change question, as if the > > present issue puts into question everything > > ever discussed about curly quotes? > > I could turn the table and ask you how is it helpful > to dump on us all your random thoughts about this, > instead of simply saying you didn't understand the > relevance and asking for more explanations. Which I > just provided. Whoa! I don't see a connection between the current issue and the many things discussed in #23425. And I don't think I dumped any random thoughts on anyone. > I hope now the issue is clear enough. No idea what your point is there. If there is some part of bug #23425 that you think is relevant here, and you think it will be UNfixed by reverting the Lisp-syntax change made for bug #30217, please tell us what that part is. I don't see anything in #23425 that needs the change in Lisp syntax made for #30217. And I don't see that Lisp change being necessary to fix #30217 either. It wasn't requested by the bug filer, AFAIK. Same for the other bugs you mentioned. The filers just asked for warnings, AFAICT. > > And the original error message from bug #23425 > > is _more_ meaningful and helpful, not less, > > than the new one after the "fix". > > I think you are so eager to make your point that you are willing to > claim that black is white and vice versa. Any objective person would > agree that the new error message is more directly pointing to the root > cause, which is the syntax of specifying a quoted symbol name using a > "strange quote". If we are good in writing and indexing our ELisp > manual, then I'd expect to find there an index entry for "strange > quote", which will land me where this issue is explained. Case > closed. We can perhaps agree to disagree about that. But of course if you say the case is closed then it's closed. 
> Once again, I can agree that this measure might be too harsh, but I > would still like to see clear diagnostics of such typos, and like > Paul, I think we should take our inspiration from the Unicode > Standard's notion of "confusables". I've agreed about that from the beginning. It can be helpful to warn users about possible confusion when they use confusables. And I agree that clear diagnostics are needed - that was one of my points. That's different from changing the syntax of Lisp. > Ideas and proposals for patches along those lines > are welcome. Ditto. > Ignoring the problem, or trying to convince us > that it doesn't exist, is not. I recognize the problems of confusable characters. Not all such possible confusions are equally likely, in practice. Recognizing contexts where something might well be a typo, and warrants a helpful reminder/warning, is what's needed - case by case. What's not needed, IMO (and probably the only place where I differ from you on this, even if you don't want to recognize it) is a change in Lisp syntax, making it a read error not to escape such a character. > > Either doing nothing or trying to warn about such > > gotchas is right. Changing Lisp syntax here is > > not right. > > Doing nothing would be ignoring the problem. Yes. It's maybe not the best help for users, but it would be one way to handle those few reports of confusion. We get a lot more questions due to other confusions wrt Lisp than we do such questions due to confusing one char for another. I didn't, and don't, say that doing nothing is the best approach. I said it's one way to deal with such reports. Unlike changing Lisp syntax, it at least doesn't introduce new problems. > That changing Lisp syntax is not right is your > opinion: legitimate, but clearly not shared by at > least some. That's why we're having this discussion.
I have yet to hear a reason why it is right to change Lisp syntax for this - why a simple warning is not sufficient and we need to also make Lisp raise an error. > > Lisp doesn't have a bug here. > > That's a strawman, and you know it. We are talking > about diagnostics for bugs in Lisp programs. I have no objection to diagnostics. Add warnings for byte-compilation, loading, whatever. Make sure the warnings are clear. Say, for instance that a curly quote was used in sexp `...'. Don't just say that invalid syntax was read (somewhere). Clearly pointing out the confusable char in the possibly confused sexp should go a long way to making things clear. My objection is to making such chars be escaped to prevent Lisp from raising an error. I don't put `a’b' in the same class as, say, `a,b'. `,' is special in Lisp, and (setq a,b 42) should (and does) raise an error. `’' is not special in Lisp, and (setq a’b 42) should not raise an error (IMO). Likewise, (setq ,b 42) (yes) and (setq ’b 42) (no). If you want to argue for this syntax change, why not address some of my arguments against it? Where will you draw the line, for instance? There are _lots_ of possible confusables. I'd say start with only the few that have actually been reported (is there only one reported?), trying to come up with reasonable warnings in particular contexts of use. That would be a good start. We might even have a user option that lists the confusables to check/warn for, with whatever default value people here think is best (it might be only `’', to start with - or both left and right curly quotes). Are you thinking instead (since both you and Paul mentioned the Unicode list of confusables) of starting with _all_ characters in that list? http://www.unicode.org/Public/security/8.0.0/confusables.txt I won't argue about which chars should be warned about, though I might be interested to see what contexts we warn for and what the messages will be. 
My objection is not about detecting this or that use of this or that character and warning/reminding users about it. My objection is to making Lisp require escaping of such characters. That's all. I think I've made that as clear as I possibly can. But you seem to want to paint my objection as being against helping users know about accidental use of confusables, e.g., `’' instead of `''. Why? ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 1:55 ` Drew Adams @ 2018-02-04 2:10 ` Noam Postavsky 0 siblings, 0 replies; 98+ messages in thread From: Noam Postavsky @ 2018-02-04 2:10 UTC (permalink / raw) To: Drew Adams; +Cc: Eli Zaretskii, Emacs developers On Sat, Feb 3, 2018 at 8:55 PM, Drew Adams <drew.adams@oracle.com> wrote: > My objection is to making Lisp require escaping of > such characters. That's all. I think I've made > that as clear as I possibly can. I think your position is indeed quite clear by now. In fact, I think the length and frequency of your posts are going to make it harder for other people to participate, so could you dial it back a bit, please? ^ permalink raw reply [flat|nested] 98+ messages in thread
* Why "symbol's value" error about a list? 2018-02-03 17:05 ` Eli Zaretskii 2018-02-04 1:16 ` Michael Heerdegen 2018-02-04 1:55 ` Drew Adams @ 2018-02-05 1:06 ` Richard Stallman 2018-02-05 20:35 ` Alan Mackenzie 2018-02-06 11:27 ` Noam Postavsky 2018-02-05 1:06 ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Richard Stallman 3 siblings, 2 replies; 98+ messages in thread From: Richard Stallman @ 2018-02-05 1:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, drew.adams, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > For example, suppose you have a Lisp program that produces the > following error message when compiled/executed: > Symbol's value as variable is void: 'аbbrevs-changed Does that error message really happen? If so, how can I reproduce it? I understand that the character 'а' is not ASCII a. That explains why 'аbbrevs-changed' is not known as a variable. But I'm talking about a different issue, which has nothing to do with character coding. Suppose it were 'foobaz', all ASCII, and we got an error such as > Symbol's value as variable is void: 'foobaz That still seems wrong. If the error was that foobaz was void, the error message should not include a quote. It should say > Symbol's value as variable is void: foobaz Or if the error was that 'foobaz is used instead of a symbol, the error message should say > Wrong type argument: symbolp, (quote foobaz) -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) Skype: No way! See https://stallman.org/skype.html. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-05 1:06 ` Why "symbol's value" error about a list? Richard Stallman @ 2018-02-05 20:35 ` Alan Mackenzie 2018-02-05 21:46 ` Drew Adams 2018-02-06 14:51 ` Richard Stallman 2018-02-06 11:27 ` Noam Postavsky 1 sibling, 2 replies; 98+ messages in thread From: Alan Mackenzie @ 2018-02-05 20:35 UTC (permalink / raw) To: Richard Stallman; +Cc: Eli Zaretskii, emacs-devel, drew.adams, npostavs Hello, Richard. On Sun, Feb 04, 2018 at 20:06:39 -0500, Richard Stallman wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > For example, suppose you have a Lisp program that produces the > > following error message when compiled/executed: > > Symbol's value as variable is void: 'аbbrevs-changed > Does that error message really happen? If so, how can I reproduce it? In Emacs-25.3 -Q, do M-: (message "(setq foo 'bar)") RET , followed by getting the output from *Messages* into the kill ring with M-w, followed by M-: C-y RET . You might think you are executing (setq foo 'bar). You're not. You're executing (setq foo ’bar), where the ’ is a Unicode curly quote. The error message given out is: Symbol's value as variable is void: ’bar . If you're like me, you will read that as the symbol "bar" is void, rather than the symbol "’bar" is void. This is a result of the change in `message', silently to convert ' to a curly quote, by default. Some of us were unhappy at this change and protested against it. > I understand that the character 'а' is not ASCII a. That explains why > 'аbbrevs-changed' is not known as a variable. But I'm talking about > a different issue, which has nothing to do with character coding. > Suppose it were 'foobaz', all ASCII, and we got an error such as > > Symbol's value as variable is void: 'foobaz > That still seems wrong. 
Again "’foobaz", not "foobaz" is the symbol, here. > If the error was that foobaz was void, the error message should not > include a quote. It should say > > Symbol's value as variable is void: foobaz Yes. > Or if the error was that 'foobaz is used instead of a symbol, the error > message should say > > Wrong type argument: symbolp, (quote foobaz) In the recent pretest, Emacs-26.0.91, when a curly quote appears at the start of a symbol, the reader rejects it, giving the error message: read--expression: Invalid read syntax: "strange quote", "'" . This is somewhat controversial, and is what the recent discussion has been about. > -- > Dr Richard Stallman > President, Free Software Foundation (https://gnu.org, https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > Skype: No way! See https://stallman.org/skype.html. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 98+ messages in thread
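The quote conversion behind this recipe can be observed without copying from *Messages*. A small Emacs Lisp sketch, relying on the fact that `message' formats its arguments the way `format-message' does (available since Emacs 25); `text-quoting-style' is bound explicitly so the result does not depend on what the terminal can display. The function name `my-quote-demo' is made up for this illustration.

```elisp
(defun my-quote-demo (style)
  "Return the text (setq foo \\='bar) as `format-message' renders it under STYLE.
STYLE is a value for `text-quoting-style', e.g. `curve' or `grave'."
  (let ((text-quoting-style style))
    (format-message "(setq foo 'bar)")))

;; Under the `curve' style the APOSTROPHE (U+0027) is replaced by
;; RIGHT SINGLE QUOTATION MARK (U+2019), so the echoed text no longer
;; quotes `bar' when pasted back into M-:.
(my-quote-demo 'curve)

;; Under the `grave' style the text is passed through unchanged.
(my-quote-demo 'grave)
```

Comparing the two results with `string=' against the original string shows that only the `grave' rendering round-trips as Lisp code.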
* RE: Why "symbol's value" error about a list? 2018-02-05 20:35 ` Alan Mackenzie @ 2018-02-05 21:46 ` Drew Adams 2018-02-06 4:13 ` Eli Zaretskii 2018-02-06 14:51 ` Richard Stallman 1 sibling, 1 reply; 98+ messages in thread From: Drew Adams @ 2018-02-05 21:46 UTC (permalink / raw) To: Alan Mackenzie, Richard Stallman; +Cc: Eli Zaretskii, emacs-devel, npostavs > > > For example, suppose you have a Lisp program that produces the > > > following error message when compiled/executed: > > > Symbol's value as variable is void: 'аbbrevs-changed > > > Does that error message really happen? If so, how can I reproduce it? > > In Emacs-25.3 -Q, do M-: (message "(setq foo 'bar)") RET > followed by getting the output from *Messages* into the kill ring with > M-w, followed by M-: C-y RET. > > You might think you are executing (setq foo 'bar). You're not. > You're executing (setq foo ’bar), where the ’ is a Unicode curly quote. > > The error message given out is: > Symbol's value as variable is void: ’bar That was the old, and legitimate, error message, yes. It accurately describes what is really going on (as you describe well, below). Now the message is instead (invalid-read-syntax "strange quote" "’"). Is that better? That's part of what this discussion is about. I suggested that the variable name be enclosed in `...'. That would make the original message clearer, I think: Symbol's value as variable is void: `’bar' At least it could make it more likely that you would think about looking at that quote mark. > This is a result of the change in `message', silently to convert ' to a > curly quote, by default. Some of us were unhappy at this change and > protested against it. Count me as one of those "some of us". Echoing Lisp code should do just that - no fiddling to "prettify" apostrophe to curly quote etc. > > Suppose it were 'foobaz', all ASCII, and we got an error such as > > Symbol's value as variable is void: 'foobaz > > That still seems wrong. 
Here's the thing: There _is_ a Lisp error - no doubt. But for Lisp the error is not that a curly quote was read as part of a symbol name. That's not a Lisp error (at least it has not been, until now.) The error is using a symbol as a variable, when it is not defined as a variable. Which is exactly what the original error message said. That's the LISP error. Is there a _user error_ here? Yes, it's the mistake of copying and pasting what was printed in *Messages*. That user mistake is excusable. And we would want to inform the user about it, if we can't prevent it. But changing Lisp read syntax to guess what might be the most helpful thing to tell a user here is NOT the solution. Should this Lisp syntax change be reverted? That's the question being discussed here. Changing the read syntax is a general, Lisp-level change. We should instead prevent this user mistake by removing its cause. The real error here is (IMO) a design error by Emacs: The expression read and copied to *Messages* should not have been "helpfully" translated to use a curly quote instead of an apostrophe. Emacs shot Lisp in the foot on this one. It's not the fault of Lisp and its reader (syntax). It's the fault of some misguided "modernization" of Emacs gone amuck. Users should not find the input (setq foo 'bar) transformed to (setq foo ’bar), i.e., APOSTROPHE replaced by RIGHT SINGLE QUOTATION MARK. > Again "’foobaz", not "foobaz" is the symbol, here. Yes, and that's a legitimate symbol name. Nothing wrong with Lisp telling us that that symbol is undefined as a variable. That's exactly what the _LISP_ problem is here. That's just not the symbol that was passed to Lisp originally. That's a non-variable symbol name copied from *Messages*. The mistake was putting that in *Messages* in the first place. Where was the mistake? Lisp claiming that you used a symbol as an undefined variable? The user copying that symbol name from *Messages* and trying to evaluate its symbol as a var? 
Or Emacs inserting a different symbol name in *Messages*, by substituting the text "’bar" for the text "'bar"? The original Lisp expression was a Lisp expression, not just text. A quote mark (apostrophe) in Lisp has special meaning, special syntax. That shouldn't be ignored by some dumb (yes) substitution of curly quotes for straight quotes. > > If the error was that foobaz was void, the error message > > should not include a quote. It should say > > Symbol's value as variable is void: foobaz > > Yes. No. Only if `foobaz' were indeed the symbol that was an undefined variable. But that's NOT the case here. The undefined variable here is the symbol `’foobaz' (from *Messages*) - it really is. The underlying mistake took place long before Lisp evaluation of the pasted sexp. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-05 21:46 ` Drew Adams @ 2018-02-06 4:13 ` Eli Zaretskii 2018-02-06 7:32 ` Tim Cross 2018-02-06 15:45 ` Drew Adams 0 siblings, 2 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-02-06 4:13 UTC (permalink / raw) To: Drew Adams; +Cc: acm, emacs-devel, rms, npostavs > Date: Mon, 5 Feb 2018 13:46:38 -0800 (PST) > From: Drew Adams <drew.adams@oracle.com> > Cc: Eli Zaretskii <eliz@gnu.org>, npostavs@users.sourceforge.net, > emacs-devel@gnu.org > > > The error message given out is: > > Symbol's value as variable is void: ’bar > > That was the old, and legitimate, error message, yes. It > accurately describes what is really going on (as you describe > well, below). > > Now the message is instead (invalid-read-syntax "strange quote" > "’"). Is that better? I think it's somewhat better, because it talks about "strange quote", which is a hint for the user about the actual problem. > I suggested that the variable name be enclosed in `...'. That > would make the original message clearer, I think: > > Symbol's value as variable is void: `’bar' That might make things even more confusing, because the text actually displayed will be this: Symbol’s value as variable is void: ‘’bar’ which loses all hints of what is being quoted here. > > This is a result of the change in `message', silently to convert ' to a > > curly quote, by default. Some of us were unhappy at this change and > > protested against it. > > Count me as one of those "some of us". Echoing Lisp code > should do just that - no fiddling to "prettify" apostrophe to > curly quote etc. That ship has sailed two Emacs releases ago. We are trying to fix the fallout. And strange quotes is only one situation where confusingly similar characters can be presented in error messages, making it hard for users to spot the real problem. We are trying to find ways of making such "typos" more evident in error messages. 
> The error is using a symbol as a variable, when it is not > defined as a variable. Which is exactly what the original > error message said. > > That's the LISP error. Is there a _user error_ here? > Yes, it's the mistake of copying and pasting what was > printed in *Messages*. > > That user mistake is excusable. And we would want to > inform the user about it, if we can't prevent it. But > changing Lisp read syntax to guess what might be the > most helpful thing to tell a user here is NOT the solution. The issue is what _would_ be a helpful message in these cases. You are just saying what should _not_ be done (repeatedly), but that doesn't advance us towards the solution. > Should this Lisp syntax change be reverted? That's the > question being discussed here. No, that's only part of the question. The other, no less important part is if we revert that change, how to make the confusing error message less so and more helpful in understanding the user error. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-06 4:13 ` Eli Zaretskii @ 2018-02-06 7:32 ` Tim Cross 2018-02-06 7:40 ` Eli Zaretskii 2018-02-06 15:45 ` Drew Adams 2018-02-06 15:45 ` Drew Adams 1 sibling, 2 replies; 98+ messages in thread From: Tim Cross @ 2018-02-06 7:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, npostavs, rms, Drew Adams, Emacs developers It seems there are two issues here - they are not completely separate, but do seem to be distinct and probably need to be addressed in two steps. If the statement > Count me as one of those "some of us". Echoing Lisp code > should do just that - no fiddling to "prettify" apostrophe to > curly quote etc. is correct, then I would agree it was a bad design decision. The *Messages* buffer should display lisp code exactly as it is read and not try to 'prettify' it. The second issue seems to be more about how to make the error message more informative. I suspect this is a much harder problem to resolve. I don't know what the right solution is for that, but I do know that I would have more chance of recognising my error if the message displayed in the buffer displays the lisp code exactly as it was read by the reader. Tim ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-06 7:32 ` Tim Cross @ 2018-02-06 7:40 ` Eli Zaretskii 2018-02-06 15:45 ` Drew Adams 1 sibling, 0 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-02-06 7:40 UTC (permalink / raw) To: emacs-devel, Tim Cross; +Cc: acm, Emacs developers, rms, Drew Adams, npostavs On February 6, 2018 9:32:21 AM GMT+02:00, Tim Cross <theophilusx@gmail.com> wrote: > If the statement > > > Count me as one of those "some of us". Echoing Lisp code > > should do just that - no fiddling to "prettify" apostrophe to > > curly quote etc. > > is correct, then I would agree it was a bad design decision. The > *Messages* buffer should display lisp code exactly as it is read and > not try to 'prettify' it. Lisp code is not changed in messages; only quoted plain text is. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Why "symbol's value" error about a list? 2018-02-06 7:32 ` Tim Cross 2018-02-06 7:40 ` Eli Zaretskii @ 2018-02-06 15:45 ` Drew Adams 1 sibling, 0 replies; 98+ messages in thread From: Drew Adams @ 2018-02-06 15:45 UTC (permalink / raw) To: Tim Cross, Eli Zaretskii; +Cc: acm, npostavs, rms, Emacs developers > It seems there are two issues here - they are not completely > separate, but do seem to be distinct and probably need to be > addressed in two steps. > > If the statement > > > Count me as one of those "some of us". Echoing Lisp code > > should do just that - no fiddling to "prettify" apostrophe to > > curly quote etc. > > is correct, then I would agree it was a bad design decision. > The *Messages* buffer should display lisp code exactly as it > is read and not try to 'prettify' it. Yes. > The second issue seems to be more about how to make the error > message more informative. Yes, but it's not just about that error message. That Lisp error is about an undefined variable. But there are plenty of other contexts where users can be confused by such a gotcha. If code or a user did in fact define a variable named ’bar then there would be no Lisp error (prior to the recent change). Such a symbol name could nevertheless be confusing in some contexts. But it's not about how best to present that undefined-variable error message. That message was telling the truth, even if in the particular context presented it might not be immediately clear to a user what the undefined symbol name is (i.e., that the name contains a curly quote). > I suspect this is a much harder problem to resolve. I don't > know what the right solution is for that, but I do know that > I would have more chance of recognising my error if the > message displayed in the buffer displays the lisp code > exactly as it was read by the reader. Precisely. That Lisp error really is about using symbol `’bar' as a variable. 
Such code can exist in different contexts, only some of which have anything to do with a user mistaking a curly quote for an apostrophe. Just as some Lisp code can mistakenly use symbol `abc' as a variable apart from any binding of it as a variable, and so provoking the undefined-variable error, so can code mistakenly use symbol `’bar' as an undefined variable. Such a context would have nothing to do with the newly fabricated error (invalid-read-syntax "strange quote" "’"). That's just the wrong error for Lisp to raise here. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Why "symbol's value" error about a list? 2018-02-06 4:13 ` Eli Zaretskii 2018-02-06 7:32 ` Tim Cross @ 2018-02-06 15:45 ` Drew Adams 2018-02-06 19:17 ` Eli Zaretskii 1 sibling, 1 reply; 98+ messages in thread From: Drew Adams @ 2018-02-06 15:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, emacs-devel, rms, npostavs > > > The error message given out is: > > > Symbol's value as variable is void: ’bar > > > > That was the old, and legitimate, error message, yes. It > > accurately describes what is really going on (as you describe > > well, below). > > > > Now the message is instead (invalid-read-syntax "strange quote" > > "’"). Is that better? > > I think it's somewhat better, because it talks about "strange quote", > which is a hint for the user about the actual problem. The actual problem is the use of a non-variable symbol as a variable. At least that has been the problem in this example, until the recent change in Lisp syntax. There's no problem using a symbol as a variable if its name is ’bar. You've just made it necessary now to escape that curly quote when defining and using the symbol: (defvar \’bar 42 "...") And if a variable in fact has that name you still raise a Lisp read-syntax error if the quote is not escaped. > > I suggested that the variable name be enclosed in `...'. That > > would make the original message clearer, I think: > > Symbol's value as variable is void: `’bar' > > That might make things even more confusing, because the text actually > displayed will be this: > Symbol’s value as variable is void: ‘’bar’ > which loses all hints of what is being quoted here. I wrote `’bar'. > > > This is a result of the change in `message', silently to > > > convert ' to a curly quote, by default. Some of us were > > > unhappy at this change and protested against it. > > > > Count me as one of those "some of us". Echoing Lisp code > > should do just that - no fiddling to "prettify" apostrophe to > > curly quote etc. 
> > That ship has sailed two Emacs releases ago. We are trying to > fix the fallout. Two releases ago and still reaping the fallout rewards... Time to call back that ship or try to redirect it? > And strange quotes is only one situation where confusingly similar > characters can be presented in error messages, making it hard for > users to spot the real problem. We are trying to find ways of making > such "typos" more evident in error messages. Where's the error in (defvar ’bar 42 "...")? You've introduced Lisp read errors where there were none. The error here is the automatic translation of a Lisp sexp that uses an ordinary quote mark (apostrophe) to a curly quote by `message' (?), so that the wrong sexp gets logged to *Messages*. The second error is trying to fix that error by changing Lisp syntax so that an error is raised, instead of just (optionally) displaying a warning message. There's no reason to stop Lisp evaluation just because we want to inform a user about a possible misunderstanding (gotcha). > > The error is using a symbol as a variable, when it is not > > defined as a variable. Which is exactly what the original > > error message said. > > > > That's the LISP error. Is there a _user error_ here? > > Yes, it's the mistake of copying and pasting what was > > printed in *Messages*. > > > > That user mistake is excusable. And we would want to > > inform the user about it, if we can't prevent it. But > > changing Lisp read syntax to guess what might be the > > most helpful thing to tell a user here is NOT the solution. > > The issue is what _would_ be a helpful message in these cases. You > are just saying what should _not_ be done (repeatedly), but that > doesn't advance us towards the solution. I've said (repeatedly, as you like to repeat) that we can display all the warnings you like. What we should not do is change Lisp syntax to raise an artificial error. 
There is no Lisp error in evaluating (setq ’bar 42), regardless of how or why someone might do that. It's fine to let someone know that s?he did it, pointing to the curly quote. It's wrong to raise a Lisp error. > > Should this Lisp syntax change be reverted? That's the > > question being discussed here. > > No, that's only part of the question. The other, no less important > part is if we revert that change, how to make the confusing error > message less so and more helpful in understanding the user error. Agreed. The first step is to revert the change in Lisp syntax. The second step is to design aids for users to recognize such gotchas. The zeroth step is to realize that the Lisp change should be reverted. ^ permalink raw reply [flat|nested] 98+ messages in thread
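As a side note for readers following the escaping discussion in the message above, here is a minimal sketch of what the backslash-escaped spelling does. It relies only on the statement in the opening message of this thread that backslash-escaping the curly quote works both before and after the reader change; the variable name and docstring are purely illustrative and do not come from any real package.

```elisp
;; Hypothetical variable whose name starts with U+2019 RIGHT SINGLE
;; QUOTATION MARK.  The backslash only escapes the character for the
;; Lisp reader; it does not become part of the symbol's name.
(defvar \’bar 42 "Illustrative variable; an assumption made for this sketch.")

;; The escaped spelling and `intern' refer to the same symbol.
(eq '\’bar (intern "’bar"))   ; → t

;; The name stored in the symbol has no backslash.
(symbol-name '\’bar)          ; → "’bar"

;; Evaluating the escaped symbol yields its value.
\’bar                         ; → 42
```

The disagreement in the thread is only about the unescaped spelling: whether reading `’bar` without the backslash should keep producing this same symbol or signal a read error.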
* Re: Why "symbol's value" error about a list? 2018-02-06 15:45 ` Drew Adams @ 2018-02-06 19:17 ` Eli Zaretskii 0 siblings, 0 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-02-06 19:17 UTC (permalink / raw) To: Drew Adams; +Cc: acm, emacs-devel, rms, npostavs > Date: Tue, 6 Feb 2018 07:45:55 -0800 (PST) > From: Drew Adams <drew.adams@oracle.com> > Cc: acm@muc.de, rms@gnu.org, npostavs@users.sourceforge.net, > emacs-devel@gnu.org > > > > > The error message given out is: > > > > Symbol's value as variable is void: ’bar > > > > > > That was the old, and legitimate, error message, yes. It > > > accurately describes what is really going on (as you describe > > > well, below). > > > > > > Now the message is instead (invalid-read-syntax "strange quote" > > > "’"). Is that better? > > > > I think it's somewhat better, because it talks about "strange quote", > > which is a hint for the user about the actual problem. > > The actual problem is the use of a non-variable symbol > as a variable. That's one possibility, yes. But a much more probable possibility is that the user mistyped the quote, either because she copy/pasted it from some text, or because she turned on the Electric Quote mode, or for some other reason. A useful error message should consider this latter probable cause and help the user correct it, if indeed that was the reason. Many tools do similar second-guessing for frequent mistakes. For example, GNU Make detects when a line in a Makefile starts with 8 SPC characters instead of a mandatory TAB, and says: *** missing separator (did you mean TAB instead of 8 spaces?). The first part is the "dumb" error message, based on the syntax error; the part in parentheses is a helpful hint for the user, based on many such user errors seen in the past. Latest versions of GCC also provide similar hints. > You've just made it necessary now to escape that curly quote when > defining and using the symbol: You are changing the subject. 
I just wrote that an error message which mentions "strange quotes" is somewhat better than one which just states the syntax error. I said nothing about anything else. > > > Symbol's value as variable is void: `’bar' > > > > That might make things even more confusing, because the text actually > > displayed will be this: > > Symbol’s value as variable is void: ‘’bar’ > > which loses all hints of what is being quoted here. > > I wrote `’bar'. Yes, but the Lisp function 'message' has its own ideas regarding quoting text, as you well know. Try evaluating this: (message "Symbol's value as variable is void: `%s'" "’bar") > > That ship has sailed two Emacs releases ago. We are trying to > > fix the fallout. > > Two releases ago and still reaping the fallout rewards... > Time to call back that ship or try to redirect it? You can try fighting this Quixotic battle on and on, but I don't recommend that. > You've introduced Lisp read errors where there were > none. No, I didn't do anything of the kind. > > > Should this Lisp syntax change be reverted? That's the > > > question being discussed here. > > > > No, that's only part of the question. The other, no less important > > part is if we revert that change, how to make the confusing error > > message less so and more helpful in understanding the user error. > > Agreed. The first step is to revert the change in Lisp > syntax. The second step is to design aids for users to > recognize such gotchas. I think both steps should be made together, otherwise we would be making a change for the worse. ^ permalink raw reply [flat|nested] 98+ messages in thread
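The quoting behaviour described in the message above can be checked without looking at *Messages*. This is a minimal sketch, assuming Emacs 25 or later, where `format-message' and the variable `text-quoting-style' are available. Quote characters in the format string are translated according to `text-quoting-style', while the argument substituted for %s is inserted unchanged.

```elisp
;; With curved quoting, the ` and ' characters in the FORMAT STRING
;; (including the apostrophe in "Symbol's") become ‘ and ’.  The
;; argument "’bar" is inserted as-is.
(let ((text-quoting-style 'curve))
  (format-message "Symbol's value as variable is void: `%s'" "’bar"))
;; → "Symbol’s value as variable is void: ‘’bar’"

;; With the `grave' style the format string is left untouched, so the
;; curly quote that belongs to the symbol name stands out.
(let ((text-quoting-style 'grave))
  (format-message "Symbol's value as variable is void: `%s'" "’bar"))
;; → "Symbol's value as variable is void: `’bar'"
```

Binding `text-quoting-style' with `let' keeps the experiment local; whether any particular style should be the default is a separate question that this sketch does not try to answer.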
* Re: Why "symbol's value" error about a list? 2018-02-05 20:35 ` Alan Mackenzie 2018-02-05 21:46 ` Drew Adams @ 2018-02-06 14:51 ` Richard Stallman 1 sibling, 0 replies; 98+ messages in thread From: Richard Stallman @ 2018-02-06 14:51 UTC (permalink / raw) To: Alan Mackenzie; +Cc: eliz, emacs-devel, drew.adams, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > Symbol's value as variable is void: 'foobaz > > That still seems wrong. > Again "’foobaz", not "foobaz" is the symbol, here. This makes sense, as an issue about quotes. But why, then, did the other message talk about using a confusable letter and say that was the ONLY problem? -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) Skype: No way! See https://stallman.org/skype.html. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-05 1:06 ` Why "symbol's value" error about a list? Richard Stallman 2018-02-05 20:35 ` Alan Mackenzie @ 2018-02-06 11:27 ` Noam Postavsky 2018-02-06 14:53 ` Richard Stallman 2018-02-06 18:52 ` Eli Zaretskii 1 sibling, 2 replies; 98+ messages in thread From: Noam Postavsky @ 2018-02-06 11:27 UTC (permalink / raw) To: Richard Stallman; +Cc: Eli Zaretskii, Drew Adams, Emacs developers On Sun, Feb 4, 2018 at 8:06 PM, Richard Stallman <rms@gnu.org> wrote: > > For example, suppose you have a Lisp program that produces the > > following error message when compiled/executed: > > > Symbol's value as variable is void: 'аbbrevs-changed > > Does that error message really happen? If so, how can I reproduce it? I don't think there is a way to get this particular message (with an ascii apostrophe and cyrillic a). ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-06 11:27 ` Noam Postavsky @ 2018-02-06 14:53 ` Richard Stallman 2018-02-06 18:59 ` Eli Zaretskii 2018-02-06 18:52 ` Eli Zaretskii 1 sibling, 1 reply; 98+ messages in thread From: Richard Stallman @ 2018-02-06 14:53 UTC (permalink / raw) To: Noam Postavsky; +Cc: eliz, drew.adams, emacs-devel > > > For example, suppose you have a Lisp program that produces the > > > following error message when compiled/executed: > > > > > Symbol's value as variable is void: 'аbbrevs-changed > > > > Does that error message really happen? If so, how can I reproduce it? > I don't think there is a way to get this particular message (with an > ascii apostrophe and cyrillic a). So it was purely hypothetical? In that case, I wish the person who wrote that had made it clear it was not a real, existing problem. The failure to do this made us waste our time. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-06 14:53 ` Richard Stallman @ 2018-02-06 18:59 ` Eli Zaretskii 2018-02-07 2:40 ` Richard Stallman 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-02-06 18:59 UTC (permalink / raw) To: rms; +Cc: emacs-devel, drew.adams, npostavs > From: Richard Stallman <rms@gnu.org> > CC: eliz@gnu.org, drew.adams@oracle.com, emacs-devel@gnu.org > Date: Tue, 06 Feb 2018 09:53:16 -0500 > > > I don't think there is a way to get this particular message (with an > > ascii apostrophe and cyrillic a). > > So it was purely hypothetical? > > In that case, I wish the person who wrote that had made it clear > it was not a real, existing problem. That person did: For example, suppose you have a Lisp program that produces the ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ following error message when compiled/executed: Symbol's value as variable is void: 'аbbrevs-changed ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-06 18:59 ` Eli Zaretskii @ 2018-02-07 2:40 ` Richard Stallman 2018-02-07 3:42 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Richard Stallman @ 2018-02-07 2:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, drew.adams, npostavs > > In that case, I wish the person who wrote that had made it clear > > it was not a real, existing problem. > That person did: > For example, suppose you have a Lisp program that produces the > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > following error message when compiled/executed: Those words do NOT say that this is an unreal example. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-07 2:40 ` Richard Stallman @ 2018-02-07 3:42 ` Eli Zaretskii 0 siblings, 0 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-02-07 3:42 UTC (permalink / raw) To: rms; +Cc: emacs-devel, drew.adams, npostavs > From: Richard Stallman <rms@gnu.org> > CC: npostavs@users.sourceforge.net, drew.adams@oracle.com, > emacs-devel@gnu.org > Date: Tue, 06 Feb 2018 21:40:50 -0500 > > > > In that case, I wish the person who wrote that had made it clear > > > it was not a real, existing problem. > > > That person did: > > > For example, suppose you have a Lisp program that produces the > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > following error message when compiled/executed: > > Those words do NOT say that this is an unreal example. They do for me. Of course, my command of English is not perfect, so maybe I'm missing something. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Why "symbol's value" error about a list? 2018-02-06 11:27 ` Noam Postavsky 2018-02-06 14:53 ` Richard Stallman @ 2018-02-06 18:52 ` Eli Zaretskii 1 sibling, 0 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-02-06 18:52 UTC (permalink / raw) To: Noam Postavsky; +Cc: rms, drew.adams, emacs-devel > From: Noam Postavsky <npostavs@users.sourceforge.net> > Date: Tue, 6 Feb 2018 06:27:33 -0500 > Cc: Eli Zaretskii <eliz@gnu.org>, Drew Adams <drew.adams@oracle.com>, > Emacs developers <emacs-devel@gnu.org> > > On Sun, Feb 4, 2018 at 8:06 PM, Richard Stallman <rms@gnu.org> wrote: > > > > For example, suppose you have a Lisp program that produces the > > > following error message when compiled/executed: > > > > > Symbol's value as variable is void: 'аbbrevs-changed > > > > Does that error message really happen? If so, how can I reproduce it? > > I don't think there is a way to get this particular message (with an > ascii apostrophe and cyrillic a). The point I wanted to make stands if you remove the apostrophe. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 17:05 ` Eli Zaretskii ` (2 preceding siblings ...) 2018-02-05 1:06 ` Why "symbol's value" error about a list? Richard Stallman @ 2018-02-05 1:06 ` Richard Stallman 3 siblings, 0 replies; 98+ messages in thread From: Richard Stallman @ 2018-02-05 1:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, drew.adams, emacs-devel > We want to find a way of identifying such situation and telling the > Lisp programmer about that in clear and easily understandable ways. > One way, perhaps too radical one, is to reject such "confusable" > characters outright. I think that makes sense for Lisp symbols. Lisp has had strings for 40 years now, so it isn't customary to use symbols to represent arbitrary text. We could have a mode where intern converts all these character codes to a single canonical set, but the default could be to give an error for all but the preferred one. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky 2018-02-02 22:52 ` Paul Eggert 2018-02-03 8:33 ` Eli Zaretskii @ 2018-02-03 18:13 ` Aaron Ecay 2018-02-04 2:05 ` Drew Adams 2018-02-04 4:51 ` Paul Eggert 2018-10-05 0:03 ` Noam Postavsky 3 siblings, 2 replies; 98+ messages in thread From: Aaron Ecay @ 2018-02-03 18:13 UTC (permalink / raw) To: Noam Postavsky, Emacs developers; +Cc: Drew Adams Hi Noam, On 2 February 2018, Noam Postavsky wrote: > > In Emacs 26 and earlier the following is valid lisp code: > > (setq ’bar 42) > (setq foo ’bar) I was surprised to learn that this is the case, in light of what is said in the Elisp reference about symbol names: “A symbol name can contain any characters whatever. Most symbol names are written with letters, digits, and the punctuation characters ‘-+=*/’. Such names require no special punctuation; the characters of the name suffice as long as the name does not look like a number. (If it does, write a ‘\’ at the beginning of the name to force interpretation as a symbol.) The characters ‘_~!@$%^&:<>{}?’ are less often used but also require no special punctuation. Any other characters may be included in a symbol's name by escaping them with a backslash.” (info "(elisp) Symbol Type") Would it be worth considering making the reader enforce this specification fully, as an alternative to your patch? That would solve this problem with curly quotes in symbol names (which also bit me at one point), as well as the potential problems with other confusable characters raised by Paul. (It might still be desirable to add a special user-friendly error message when the illegal characters are confusable with an ASCII single quote, as an additional user-friendliness measure.) Aaron PS if this approach is not taken, the manual should at least be changed to match the actual behavior of the reader. 
^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 18:13 ` Aaron Ecay @ 2018-02-04 2:05 ` Drew Adams 2018-02-04 4:51 ` Paul Eggert 1 sibling, 0 replies; 98+ messages in thread From: Drew Adams @ 2018-02-04 2:05 UTC (permalink / raw) To: Aaron Ecay, Noam Postavsky, Emacs developers > I was surprised to learn that this is the case, in light of what is > said in the Elisp reference about symbol names: > > “A symbol name can > contain any characters whatever. Most symbol names are written with > letters, digits, and the punctuation characters ‘-+=*/’. Such names > require no special punctuation; the characters of the name suffice as > long as the name does not look like a number. (If it does, write a ‘\’ > at the beginning of the name to force interpretation as a symbol.) The > characters ‘_~!@$%^&:<>{}?’ are less often used but also require no > special punctuation. Any other characters may be included in a symbol's > name by escaping them with a backslash.” (info "(elisp) Symbol Type") Thank you very much for that. I guess I wasn't aware of that text. I thought that there were only a very few chars that needed to be escaped in symbol names - `,', `(', etc.: only chars that have special syntactic meaning in Lisp. I suppose that invalidates my objection, though I wonder _why_ we would require escaping so many ordinary chars. And like you I wonder whether that text is accurate. I wonder whether that is the intended design (why?) or it is just an inaccurate description of the real behavior. Trying various chars from confusables.txt, it does not seem like they require escaping (at least not yet). That text appears to be wrong. I'd prefer it if escaping was _not_ required for chars other than those mentioned in that text, including chars in confusables.txt. I think it makes more sense to require escaping only for characters that have special Lisp significance, syntactically. IOW, I prefer the actual behavior to the behavior described in that text. 
I don't think someone using Hebrew or Arabic or Chinese or Korean letters in a symbol name should need to escape each one (or any of them). But if the design described there has already been decided on as best for Emacs, then I guess my argument is moot. In that case, the implementation is currently waaaaaay out of whack wrt the design. And if that's the design to be implemented then I agree with you that implementing it as described in that text would at least have an advantage of consistency. > Would it be worth considering making the reader enforce this > specification fully, as an alternative to your patch? That would solve > this problem with curly quotes in symbol names (which also bit me at > one point), as well as the potential problems with other confusable > characters raised by Paul. > > (It might still be desirable to add a special user-friendly error > message when the illegal characters are confusable with an ASCII > single quote, as an additional user-friendliness measure.) > > if this approach is not taken, the manual should at least > be changed to match the actual behavior of the reader. That's the approach I'd prefer. Let chars be used in symbol names without escaping, except for those with special Lisp syntax. But add warnings in contexts where we think someone might have inadvertently used a confusable in place of a common character. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-03 18:13 ` Aaron Ecay 2018-02-04 2:05 ` Drew Adams @ 2018-02-04 4:51 ` Paul Eggert 2018-02-04 9:47 ` Andreas Schwab 2018-02-04 15:04 ` Noam Postavsky 1 sibling, 2 replies; 98+ messages in thread From: Paul Eggert @ 2018-02-04 4:51 UTC (permalink / raw) To: Aaron Ecay, Noam Postavsky, Emacs developers; +Cc: Drew Adams [-- Attachment #1: Type: text/plain, Size: 1230 bytes --] Aaron Ecay wrote: > I was surprised to learn that this is the case, in light of what is > said in the Elisp reference about symbol names Good point; thanks. In the spirit of "be strict about what you generate", the Emacs printer should escape any character that is not in the list of characters documented in the Elisp manual as being safe (i.e., as not requiring escaping). This is elementary future-proofing, and is independent of whether we want Emacs to warn about or disallow confusable chars in symbols. Proposed patches against 'master' attached. The first merely simplifies the code without changing its effect. The second fixes a bug in the manual, which incorrectly states that '?' never needs escaping in symbol names. These two patches are routine. (I assume the second one should be applied to emacs26 instead of to master.) The third patch changes the Lisp printer to escape characters as suggested above. The fourth patch changes the Lisp printer to escape '?' only at the start of a symbol. This is nicer for programs using Scheme-style naming conventions in Emacs Lisp, e.g., 'fooish?' rather than 'fooishp'. I discovered the need for this patch when I wrote the second patch. 
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Simplify-print_object-a-bit.patch --]
[-- Type: text/x-patch; name="0001-Simplify-print_object-a-bit.patch", Size: 2780 bytes --]

From c03b816016f8cc2f15d275e7ad23448366489277 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:29:00 -0800
Subject: [PATCH 1/4] Simplify print_object a bit

* src/print.c (print_object): Simplify by using C99 constructs,
and by taking advantage of the fact that Lisp strings are
followed by null bytes.
---
 src/print.c | 40 ++++++++++++++++------------------------
 1 file changed, 16 insertions(+), 24 deletions(-)

diff --git a/src/print.c b/src/print.c
index b3c0f6f..d3eb49d 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1916,38 +1916,29 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     case Lisp_Symbol:
       {
         bool confusing;
-        unsigned char *p = SDATA (SYMBOL_NAME (obj));
-        unsigned char *end = p + SBYTES (SYMBOL_NAME (obj));
-        int c;
-        ptrdiff_t i, i_byte;
-        ptrdiff_t size_byte;
-        Lisp_Object name;
-
-        name = SYMBOL_NAME (obj);
-
-        if (p != end && (*p == '-' || *p == '+')) p++;
-        if (p == end)
-          confusing = 0;
+        Lisp_Object name = SYMBOL_NAME (obj);
+        ptrdiff_t size_byte = SBYTES (name);
+        unsigned char *p = SDATA (name);
+        unsigned char *end = p + size_byte;
+
         /* If symbol name begins with a digit, and ends with a digit,
            and contains nothing but digits and `e', it could be treated
            as a number. So set CONFUSING.
 
-           Symbols that contain periods could also be taken as numbers,
-           but periods are always escaped, so we don't have to worry
-           about them here. */
-        else if (*p >= '0' && *p <= '9'
-                 && end[-1] >= '0' && end[-1] <= '9')
+           Symbols that contain '.' or '#' could also be taken as
+           numbers, but these are always escaped so don't worry about
+           them here. */
+        if (c_isdigit (p[*p == '-' || *p == '+']) && c_isdigit (end[-1]))
           {
-            while (p != end && ((*p >= '0' && *p <= '9')
-                                /* Needed for \2e10. */
-                                || *p == 'e' || *p == 'E'))
+            /* Check for 'e' too; needed for \2e10. */
+            do
               p++;
+            while (c_isdigit (*p) || *p == 'e' || *p == 'E');
+
             confusing = (end == p);
           }
         else
-          confusing = 0;
-
-        size_byte = SBYTES (name);
+          confusing = false;
 
         if (! NILP (Vprint_gensym)
             && !SYMBOL_INTERNED_IN_INITIAL_OBARRAY_P (obj))
@@ -1958,10 +1949,11 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
             break;
           }
 
-        for (i = 0, i_byte = 0; i_byte < size_byte;)
+        for (ptrdiff_t i = 0, i_byte = 0; i_byte < size_byte;)
           {
             /* Here, we must convert each multi-byte form to the
                corresponding character code before handing it to
                PRINTCHAR. */
+            int c;
             FETCH_STRING_CHAR_ADVANCE (c, name, i, i_byte);
             maybe_quit ();
-- 
2.7.4

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Say-needs-escaping-at-start-of-symbol.patch --]
[-- Type: text/x-patch; name="0002-Say-needs-escaping-at-start-of-symbol.patch", Size: 1210 bytes --]

From 4b945a3fcbf6ff2bde4595dd8b8f472d1b3d17af Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:30:21 -0800
Subject: [PATCH 2/4] Say ? needs escaping at start of symbol.

* doc/lispref/objects.texi: ? is also special.
---
 doc/lispref/objects.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/lispref/objects.texi b/doc/lispref/objects.texi
index af74062..f0420e6 100644
--- a/doc/lispref/objects.texi
+++ b/doc/lispref/objects.texi
@@ -557,7 +557,8 @@ Symbol Type
 of the name suffice as long as the name does not look like a number.
 (If it does, write a @samp{\} at the beginning of the name to force
 interpretation as a symbol.) The characters @samp{_~!@@$%^&:<>@{@}?} are
-less often used but also require no special punctuation. Any other
+less often used but also require no special punctuation, except that
+@samp{\} must precede @samp{?} at the start of a symbol. Any other
 characters may be included in a symbol's name by escaping them with a
 backslash. In contrast to its use in strings, however, a backslash in
 the name of a symbol simply quotes the single character that follows the
-- 
2.7.4

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0003-prin1-etc.-now-escape-more-chars-in-symbols.patch --]
[-- Type: text/x-patch; name="0003-prin1-etc.-now-escape-more-chars-in-symbols.patch", Size: 3147 bytes --]

From 2add3a1595f709bb071e2b775970038470b2fab2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:30:48 -0800
Subject: [PATCH 3/4] prin1 etc. now escape more chars in symbols

Inspired by email from Aaron Ecay in:
https://lists.gnu.org/r/emacs-devel/2018-02/msg00125.html
* etc/NEWS: Mention this.
* src/print.c (print_object): Escape any character that is not
documented to not require escaping.
---
 etc/NEWS    |  7 +++++++
 src/print.c | 37 +++++++++++++++++++++++++++++++------
 2 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index afd0fba..2a46002 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -87,6 +87,13 @@ regular expression was previously invalid, but is now accepted:
 
    x\{32768\}
 
+** 'print' and related functions now escape more chars in symbols.
+They now escape any symbol character that is outside the documented
+set of characters that do not need escaping. For example, (print
+(intern "n\u0456l")) now outputs "n\іl" instead of "nіl", as a hint to
+the reader that the "і" is not the usual U+0069 LATIN SMALL LETTER I,
+but is instead U+0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I.
+
 \f
 * Editing Changes in Emacs 27.1
 
diff --git a/src/print.c b/src/print.c
index d3eb49d..7eca36a 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1959,12 +1959,37 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 
             if (escapeflag)
               {
-                if (c == '\"' || c == '\\' || c == '\''
-                    || c == ';' || c == '#' || c == '(' || c == ')'
-                    || c == ',' || c == '.' || c == '`'
-                    || c == '[' || c == ']' || c == '?' || c <= 040
-                    || confusing
-                    || (i == 1 && confusable_symbol_character_p (c)))
+                switch (c)
+                  {
+                    /* The Emacs Lisp manual lists these characters as
+                       not requiring escaping in symbols. Although some
+                       other characters might also work, play it safe
+                       and escape all but these characters. */
+                  case '!': case '$': case '%': case '&':
+                  case '*': case '-': case '+': case '/':
+                  case '0': case '1': case '2': case '3': case '4':
+                  case '5': case '6': case '7': case '8': case '9':
+                  case ':': case '<': case '=': case '>': case '@':
+                  case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
+                  case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
+                  case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
+                  case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
+                  case 'Y': case 'Z':
+                  case '^': case '_':
+                  case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
+                  case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
+                  case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
+                  case 's': case 't': case 'u': case 'v': case 'w': case 'x':
+                  case 'y': case 'z':
+                  case '{': case '}': case '~':
+                    break;
+
+                  default:
+                    confusing = true;
+                    break;
+                  }
+
+                if (confusing)
                   {
                     printchar ('\\', printcharfun);
                     confusing = false;
-- 
2.7.4

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: 0004-Escape-only-at-start-of-symbol.patch --]
[-- Type: text/x-patch; name="0004-Escape-only-at-start-of-symbol.patch", Size: 1913 bytes --]

From 4289ea136de4876b5dfc20d83b5a2556d1b5d8e6 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 3 Feb 2018 20:39:48 -0800
Subject: [PATCH 4/4] Escape ? only at start of symbol

* src/print.c (print_object): Do it.
---
 etc/NEWS    | 4 ++++
 src/print.c | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 2a46002..c435136 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -94,6 +94,10 @@ set of characters that do not need escaping. For example, (print
 the reader that the "і" is not the usual U+0069 LATIN SMALL LETTER I,
 but is instead U+0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I.
 
+** 'print' etc. no longer escape non-initial '?' in symbols.
+For example, the symbol 'list?' is now printed as-is. Initial '?'
+is still escaped, e.g., (print (intern "?x")) still outputs "\?x".
+
 \f
 * Editing Changes in Emacs 27.1
 
diff --git a/src/print.c b/src/print.c
index 7eca36a..dfd6c50 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1938,7 +1938,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
             confusing = (end == p);
           }
         else
-          confusing = false;
+          confusing = *p == '?';
 
         if (! NILP (Vprint_gensym)
             && !SYMBOL_INTERNED_IN_INITIAL_OBARRAY_P (obj))
@@ -1969,7 +1969,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
                   case '*': case '-': case '+': case '/':
                   case '0': case '1': case '2': case '3': case '4':
                   case '5': case '6': case '7': case '8': case '9':
-                  case ':': case '<': case '=': case '>': case '@':
+                  case ':': case '<': case '=': case '>': case '?': case '@':
                   case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
                   case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
                   case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
-- 
2.7.4

^ permalink raw reply related [flat|nested] 98+ messages in thread
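As a rough illustration of the escaping rule that patches 3 and 4 propose, the following standalone C sketch models it outside Emacs. It is not the code in src/print.c: the names ascii_digit, looks_like_number, documented_safe and escape_symbol_name are hypothetical, the input is assumed to be well-formed UTF-8, and a non-ASCII character is escaped once by writing the backslash before its lead byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Punctuation the manual lists as needing no escape, plus '?' (patch 4). */
static const char safe_punct[] = "-+=*/_~!@$%^&:<>{}?";

/* Locale-independent ASCII digit test. */
static bool ascii_digit(unsigned char c) { return c >= '0' && c <= '9'; }

/* True if NAME would read back as a number such as 12, -7 or 2e10.
   Like the patch, this relies on the terminating NUL byte. */
static bool looks_like_number(const char *name)
{
    const unsigned char *p = (const unsigned char *) name;
    const unsigned char *end = p + strlen(name);
    if (p == end)
        return false;
    p += (*p == '-' || *p == '+');      /* skip an optional sign */
    if (!ascii_digit(*p) || !ascii_digit(end[-1]))
        return false;
    do
        p++;
    while (ascii_digit(*p) || *p == 'e' || *p == 'E');
    return p == end;
}

/* True if the byte C belongs to the documented no-escape set. */
static bool documented_safe(unsigned char c)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || ascii_digit(c))
        return true;
    return c != '\0' && strchr(safe_punct, c) != NULL;
}

/* Copy NAME to OUT, adding a backslash wherever the proposed printer
   would.  OUT needs room for 2 * strlen (NAME) + 1 bytes.
   Returns the number of bytes written, not counting the NUL. */
static size_t escape_symbol_name(const char *name, char *out)
{
    assert(name != NULL && out != NULL);
    size_t n = 0;
    /* The first character is escaped for number lookalikes and for
       names that start with '?'. */
    bool confusing = looks_like_number(name) || name[0] == '?';
    for (const unsigned char *p = (const unsigned char *) name; *p; p++) {
        bool continuation = (*p & 0xC0) == 0x80;  /* inside a UTF-8 char */
        if (!continuation && (confusing || !documented_safe(*p)))
            out[n++] = '\\';
        confusing = false;
        out[n++] = (char) *p;
    }
    out[n] = '\0';
    return n;
}
```

Under these assumptions, escape_symbol_name leaves list? unchanged, writes ?x as \?x and 2e10 as \2e10, and puts a backslash before the Cyrillic letter in the NEWS example.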
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 4:51 ` Paul Eggert @ 2018-02-04 9:47 ` Andreas Schwab 2018-02-04 15:04 ` Noam Postavsky 1 sibling, 0 replies; 98+ messages in thread From: Andreas Schwab @ 2018-02-04 9:47 UTC (permalink / raw) To: Paul Eggert; +Cc: Aaron Ecay, Emacs developers, Drew Adams, Noam Postavsky On Feb 03 2018, Paul Eggert <eggert@cs.ucla.edu> wrote: > Good point; thanks. In the spirit of "be strict about what you generate", > the Emacs printer should escape any character that is not in the list of > characters documented in the Elisp manual as being safe (i.e., as not > requiring escaping). No! Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 4:51 ` Paul Eggert 2018-02-04 9:47 ` Andreas Schwab @ 2018-02-04 15:04 ` Noam Postavsky 2018-02-04 17:33 ` Eli Zaretskii 1 sibling, 1 reply; 98+ messages in thread From: Noam Postavsky @ 2018-02-04 15:04 UTC (permalink / raw) To: Paul Eggert; +Cc: Aaron Ecay, Drew Adams, Emacs developers On Sat, Feb 3, 2018 at 11:51 PM, Paul Eggert <eggert@cs.ucla.edu> wrote: > Aaron Ecay wrote: >> >> I was surprised to learn that this is the case, in light of what is >> said in the Elisp reference about symbol names Most symbol names are written with letters, digits, and the punctuation characters `-+=*/'. Such names require no special punctuation... > Good point; thanks. In the spirit of "be strict about what you generate", > the Emacs printer should escape any character that is not in the list of > characters documented in the Elisp manual as being safe (i.e., as not > requiring escaping). This is elementary future-proofing, and is independent > of whether we want Emacs to warn about or disallow confusable chars in > symbols. My impression is that manual passage was written with only ASCII characters in mind. But since Emacs has allowed Unicode characters in symbol names for a long time now, I don't think we should all of a sudden declare "letters" to mean just [a-zA-Z]. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 15:04 ` Noam Postavsky @ 2018-02-04 17:33 ` Eli Zaretskii 2018-02-04 19:36 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-02-04 17:33 UTC (permalink / raw) To: Noam Postavsky; +Cc: aaronecay, eggert, drew.adams, emacs-devel > From: Noam Postavsky <npostavs@users.sourceforge.net> > Date: Sun, 4 Feb 2018 10:04:26 -0500 > Cc: Aaron Ecay <aaronecay@gmail.com>, Drew Adams <drew.adams@oracle.com>, > Emacs developers <emacs-devel@gnu.org> > > I don't think we should all of a sudden declare "letters" to mean > just [a-zA-Z]. We don't: [:alpha:] nowadays matches much more than just [a-zA-Z]. So indeed, restricting Lisp symbols that way sounds too harsh, almost arbitrary. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 17:33 ` Eli Zaretskii @ 2018-02-04 19:36 ` Paul Eggert 2018-02-04 19:55 ` Philipp Stephani 2018-02-04 20:10 ` Eli Zaretskii 0 siblings, 2 replies; 98+ messages in thread From: Paul Eggert @ 2018-02-04 19:36 UTC (permalink / raw) To: Eli Zaretskii, Noam Postavsky; +Cc: aaronecay, drew.adams, emacs-devel Eli Zaretskii wrote: > restricting Lisp symbols that way sounds too harsh OK, I'll omit that patch. We still have a problem with Emacs Lisp code containing confusable characters that can make the code exceedingly hard to review. These characters are currently not caught or checked by anything. We really should do better. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 19:36 ` Paul Eggert @ 2018-02-04 19:55 ` Philipp Stephani 2018-02-04 20:10 ` Eli Zaretskii 1 sibling, 0 replies; 98+ messages in thread From: Philipp Stephani @ 2018-02-04 19:55 UTC (permalink / raw) To: Paul Eggert Cc: aaronecay, Eli Zaretskii, emacs-devel, drew.adams, Noam Postavsky [-- Attachment #1: Type: text/plain, Size: 768 bytes --] On Sun, 4 Feb 2018 at 20:36, Paul Eggert <eggert@cs.ucla.edu> wrote: > Eli Zaretskii wrote: > > restricting Lisp symbols that way sounds too harsh > > OK, I'll omit that patch. > > We still have a problem with Emacs Lisp code containing confusable > characters > that can make the code exceedingly hard to review. These characters are > currently not caught or checked by anything. We really should do better. > > The following should be unintrusive and not too hard: Let the reader push all confusable symbols (i.e. symbols that contain at least one unescaped character from the Unicode confusables list that maps to a sequence of ASCII characters) onto an internal dynamic variable. The byte compiler can then emit warnings if that variable becomes non-nil. [-- Attachment #2: Type: text/html, Size: 1071 bytes --] ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 19:36 ` Paul Eggert 2018-02-04 19:55 ` Philipp Stephani @ 2018-02-04 20:10 ` Eli Zaretskii 2018-02-04 20:36 ` Eli Zaretskii 1 sibling, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-02-04 20:10 UTC (permalink / raw) To: Paul Eggert; +Cc: aaronecay, emacs-devel, drew.adams, npostavs > Cc: aaronecay@gmail.com, drew.adams@oracle.com, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sun, 4 Feb 2018 11:36:05 -0800 > > We still have a problem with Emacs Lisp code containing confusable characters > that can make the code exceedingly hard to review. These characters are > currently not caught or checked by anything. We really should do better. I agree that we need a solid solution to that. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 20:10 ` Eli Zaretskii @ 2018-02-04 20:36 ` Eli Zaretskii 2018-02-04 20:48 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-02-04 20:36 UTC (permalink / raw) To: eggert; +Cc: emacs-devel > Date: Sun, 04 Feb 2018 22:10:25 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: aaronecay@gmail.com, emacs-devel@gnu.org, drew.adams@oracle.com, > npostavs@users.sourceforge.net > > > Cc: aaronecay@gmail.com, drew.adams@oracle.com, emacs-devel@gnu.org > > From: Paul Eggert <eggert@cs.ucla.edu> > > Date: Sun, 4 Feb 2018 11:36:05 -0800 > > > > We still have a problem with Emacs Lisp code containing confusable characters > > that can make the code exceedingly hard to review. These characters are > > currently not caught or checked by anything. We really should do better. > > I agree that we need a solid solution to that. How about if we start by warning about any Lisp symbol whose name uses characters from more than one script for non-punctuation characters? puny.el has a function that solves a similar problem, which could be used as a starting point. We could make this an opt-in feature if the warnings are deemed to be a potential annoyance. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 20:36 ` Eli Zaretskii @ 2018-02-04 20:48 ` Paul Eggert 2018-02-04 20:59 ` Clément Pit-Claudel 0 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-02-04 20:48 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii wrote: > How about if we start by warning about any Lisp symbol whose name uses > characters from more than one script for non-punctuation characters? This problem can occur even in one-character symbols. It might be better to establish a default script for the file, and warn about any characters from a different script. ^ permalink raw reply [flat|nested] 98+ messages in thread
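The two checks discussed here (mixed scripts inside one symbol, and a per-file default script) can be sketched in standalone C. This is only an illustration under stated assumptions: the script ranges are rough hand-picked blocks rather than real Unicode script data, the input is assumed to be well-formed UTF-8, and every name below (next_code_point, script_of, scripts_in, mixes_scripts, uses_other_script) is hypothetical rather than anything in Emacs or puny.el.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Script bits; the ranges used below are rough approximations. */
enum { SCRIPT_LATIN = 1, SCRIPT_GREEK = 2, SCRIPT_CYRILLIC = 4 };

/* Decode one UTF-8 sequence at *S and advance *S past it.
   Assumes well-formed input; stray bytes are returned unchanged. */
static uint32_t next_code_point(const unsigned char **s)
{
    const unsigned char *p = *s;
    uint32_t c = *p++;
    int extra = c >= 0xF0 ? 3 : c >= 0xE0 ? 2 : c >= 0xC0 ? 1 : 0;
    if (extra > 0)
        c &= 0x3Fu >> extra;            /* keep the lead byte's payload */
    while (extra-- > 0 && (*p & 0xC0) == 0x80)
        c = (c << 6) | (*p++ & 0x3Fu);
    *s = p;
    return c;
}

/* Script bit for C, or 0 for digits, punctuation and anything unlisted. */
static unsigned script_of(uint32_t c)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= 0x00C0 && c <= 0x024F))
        return SCRIPT_LATIN;
    if (c >= 0x0370 && c <= 0x03FF)
        return SCRIPT_GREEK;
    if (c >= 0x0400 && c <= 0x052F)
        return SCRIPT_CYRILLIC;
    return 0;
}

/* Union of the script bits of every character in NAME. */
static unsigned scripts_in(const char *name)
{
    assert(name != NULL);
    unsigned mask = 0;
    const unsigned char *p = (const unsigned char *) name;
    while (*p != '\0')
        mask |= script_of(next_code_point(&p));
    return mask;
}

/* First proposal: more than one script inside a single symbol. */
static bool mixes_scripts(const char *name)
{
    unsigned mask = scripts_in(name);
    return (mask & (mask - 1)) != 0;    /* more than one bit set */
}

/* Second proposal: any script other than the file's default one. */
static bool uses_other_script(const char *name, unsigned default_script)
{
    return (scripts_in(name) & ~default_script) != 0;
}
```

In this model a symbol spelled n, U+0456, l is flagged by mixes_scripts, while a one-character symbol consisting only of U+0456 is not; it is caught only by uses_other_script with SCRIPT_LATIN as the file default, which is the gap the reply points out.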
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-04 20:48 ` Paul Eggert @ 2018-02-04 20:59 ` Clément Pit-Claudel 0 siblings, 0 replies; 98+ messages in thread From: Clément Pit-Claudel @ 2018-02-04 20:59 UTC (permalink / raw) To: emacs-devel On 2018-02-04 15:48, Paul Eggert wrote: > Eli Zaretskii wrote: >> How about if we start by warning about any Lisp symbol whose name uses >> characters from more than one script for non-punctuation characters? > > This problem can occur even in one-character symbols. It might be better to establish a default script for the file, and warn about any characters from a different script. We could also default to warning about any characters in the confusables list and not in ascii. And we'd make it easy to turn the check off using a file-local variable. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-02-02 22:24 Change of Lisp syntax for "fancy" quotes in Emacs 27? Noam Postavsky ` (2 preceding siblings ...) 2018-02-03 18:13 ` Aaron Ecay @ 2018-10-05 0:03 ` Noam Postavsky 2018-10-05 1:01 ` Paul Eggert ` (2 more replies) 3 siblings, 3 replies; 98+ messages in thread From: Noam Postavsky @ 2018-10-05 0:03 UTC (permalink / raw) To: Emacs developers; +Cc: Drew Adams On Fri, 2 Feb 2018 at 17:24, Noam Postavsky <npostavs@users.sourceforge.net> wrote: > > In Emacs 26 and earlier the following is valid lisp code: > > (setq ’bar 42) > (setq foo ’bar) > > In the current master branch, this will signal (invalid-read-syntax > "strange quote" "’"). I've posted a patch which removes the error in this case, and instead just adds to the error message if evaluating an expression with a fancy quote leads to an error, see Bug#32939 <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32939>. Archive link to previous discussion: https://lists.gnu.org/archive/html/emacs-devel/2018-02/msg00093.html ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 0:03 ` Noam Postavsky @ 2018-10-05 1:01 ` Paul Eggert 2018-10-05 8:43 ` Eli Zaretskii 2018-10-06 15:40 ` eval-last-sexp / C-x C-e, and punctuation like `?’' [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?)] Garreau, Alexandre 2018-10-16 12:48 ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Garreau, Alexandre 2 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-10-05 1:01 UTC (permalink / raw) To: Noam Postavsky, Emacs developers; +Cc: Drew Adams I'm afraid this patch is heading in the wrong direction, as we should be more vigilant about confusables, not less. Consider this example, abstracted from the auth-source-secrets-create source code: (if (eq r 'secret) (let ((data data)) (lambda () data)) data) The intent of the (let ((data data)) ...) code is to create a thunk which, when evaluated, yields the current value of 'data' (not the value of 'data' when the thunk is called), and that is what any human reading the code will see. However, that is not what the code actually does. In the (let ((data data)) ...), the space between the two instances of 'data' is really an EN SPACE (U+2002) so the 'let' is declaring an identifier 'data data' whose name contains an EN SPACE and whose value is nil, an identifier that is never used; so the thunk yields the later value of 'data', not the earlier one. Because humans cannot reliably review source code containing characters that are easily confusable with the ASCII symbols that are a basic part of Elisp syntax, we should not be relaxing the reader to encourage developers to use these characters in their identifiers. On the contrary, we should be discouraging their use even more than we do now. ^ permalink raw reply [flat|nested] 98+ messages in thread
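The pitfall in this example can be reproduced with a toy tokenizer. The C sketch below is a simplified model, not the Emacs reader: it assumes that only ASCII blanks and parentheses end a symbol, so the three UTF-8 bytes of EN SPACE (U+2002) stay inside the token. The names is_delimiter and count_tokens are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Delimiters known to this toy tokenizer: ASCII blanks and parentheses.
   EN SPACE (U+2002, UTF-8 bytes E2 80 82) is deliberately not included. */
static int is_delimiter(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '(' || c == ')';
}

/* Count the symbol-like tokens in TEXT. */
static size_t count_tokens(const char *text)
{
    assert(text != NULL);
    size_t count = 0;
    int in_token = 0;
    for (const unsigned char *p = (const unsigned char *) text; *p; p++) {
        if (is_delimiter(*p))
            in_token = 0;               /* a delimiter closes the token */
        else if (!in_token) {
            in_token = 1;               /* first byte of a new token */
            count++;
        }
    }
    return count;
}
```

With an ASCII space, the binding list ((data data)) yields two tokens, a variable and its initial value. With an EN SPACE in the same position it yields a single token, which matches the message's description of one oddly named identifier bound to nil.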
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 1:01 ` Paul Eggert @ 2018-10-05 8:43 ` Eli Zaretskii 2018-10-05 23:02 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-10-05 8:43 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, drew.adams, npostavs > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Thu, 4 Oct 2018 18:01:26 -0700 > Cc: Drew Adams <drew.adams@oracle.com> > > I'm afraid this patch is heading in the wrong direction, as we should be > more vigilant about confusables, not less. > > Consider this example, abstracted from the auth-source-secrets-create > source code: > > (if (eq r 'secret) > (let ((data data)) > (lambda () data)) > data) Is this example relevant to the proposed changes? The latter only change what we do for quote-like symbols that are not interpreted as quotes by the Lisp reader. You, OTOH, are raising a different problem, one for which AFAIK we currently have no solution. The general issue of "confusable" characters, both in Lisp code and in user interaction, is an issue that still awaits a proper solution in Emacs. (Many moons ago, I was seduced to write a couple of primitives to allow detection of confusable text that played tricks with bidi reordering, but AFAICT those primitives are still not used, which is a pity, IMO.) I'd encourage people to work on this. However, the much more narrow issue brought up by this bug report is specifically about quote characters. It is related to changes in our messages, which now by default produce non-ASCII quotes, something that made this particular problem more probable than it was before. I think as long as we don't disallow such characters in Lisp symbols, the proposed treatment, via evaluation-time warning, is a reasonable solution, slightly better than the somewhat confusing error message we present now. We could also augment that by displaying the confusable characters in a distinct face, something we already do for some of them. 
IOW, I disagree with "discourage" part of your opinion: there's nothing wrong with using such characters as long as we don't formally cease to support them. And the commonly accepted mechanism of pointing out potentially wrong constructs is by visual cues and warning messages, not by erroring out. Compare how we treat something like this in C programs: if (a = b) { do something; } ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 8:43 ` Eli Zaretskii @ 2018-10-05 23:02 ` Paul Eggert 2018-10-06 0:20 ` Drew Adams ` (3 more replies) 0 siblings, 4 replies; 98+ messages in thread From: Paul Eggert @ 2018-10-05 23:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, drew.adams, npostavs On 10/5/18 1:43 AM, Eli Zaretskii wrote: > the commonly accepted mechanism of > pointing out potentially wrong constructs is by visual cues and > warning messages If we decide that Elisp source code must be able to abuse confusable characters, then of course we should allow such abuse and support it as best we can, including selective highlighting and whatnot to try to warn readers of the abuse. Such support won't work outside Emacs, but people using non-Emacs programs to look at Elisp code will simply be out of luck. However, that would be heading in the wrong direction, because we shouldn't assume that Elisp code is reviewed only via Emacs. I regularly use Savannah's web interface to look at Elisp source code diffs, for example, and there's lots of other ways I and other developers use non-Emacs programs to look at Elisp source. Because reading source code is an essential property of free software, and because it would set a bad precedent if we said or implied that one really should use only Emacs to read Elisp code, we can't sufficiently address the problem merely by highlighting characters when Emacs is viewing them in a certain way and saying or implying that people should use only Emacs to review Elisp code. I'm not arguing that Elisp should prohibit symbols from containing confusing characters, only that these characters should be easily recognizable in plain-text source code, without requiring Emacs itself (configured a certain way) to view the source. For example, if we required a backslash before every confusable character in a symbol, that would go a long way toward addressing the problem. 
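The backslash rule suggested at the end of this message could be checked by a tool that never runs Emacs. The C sketch below is one possible shape for such a check, under loud assumptions: the table holds only a handful of characters mentioned in this thread rather than the Unicode confusables list, strings and comments are not treated specially, and the names confusables, is_confusable_at and first_unescaped_confusable are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A few confusable characters from this thread, as UTF-8 byte strings.
   Illustrative only; the real confusables list is far longer. */
static const char *const confusables[] = {
    "\xE2\x80\x98",   /* U+2018 LEFT SINGLE QUOTATION MARK */
    "\xE2\x80\x99",   /* U+2019 RIGHT SINGLE QUOTATION MARK */
    "\xE2\x80\x82",   /* U+2002 EN SPACE */
    "\xC2\xA0",       /* U+00A0 NO-BREAK SPACE */
    "\xD1\x96",       /* U+0456 CYRILLIC BYELORUSSIAN-UKRAINIAN I */
    "\xEA\x9E\x8C",   /* U+A78C LATIN SMALL LETTER SALTILLO */
};

/* True if one of the listed characters starts at P. */
static bool is_confusable_at(const char *p)
{
    for (size_t i = 0; i < sizeof confusables / sizeof confusables[0]; i++)
        if (strncmp(p, confusables[i], strlen(confusables[i])) == 0)
            return true;
    return false;
}

/* Byte offset of the first listed character in TEXT that is not
   preceded by a backslash, or -1 if there is none. */
static long first_unescaped_confusable(const char *text)
{
    assert(text != NULL);
    const char *p = text;
    while (*p != '\0') {
        if (*p == '\\' && p[1] != '\0') {
            p += 2;                     /* backslash and first byte */
            while (((unsigned char) *p & 0xC0) == 0x80)
                p++;                    /* rest of an escaped UTF-8 char */
        } else if (is_confusable_at(p)) {
            return (long) (p - text);
        } else {
            p++;
        }
    }
    return -1;
}
```

A plain-text reviewer's tool built this way would accept a curly quote written with a preceding backslash and report the byte offset of one written bare, which is the kind of source-level visibility the message asks for.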
^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 23:02 ` Paul Eggert @ 2018-10-06 0:20 ` Drew Adams 2018-10-06 9:14 ` Alan Mackenzie 2018-10-06 16:17 ` Paul Eggert 2018-10-06 10:11 ` Eli Zaretskii ` (2 subsequent siblings) 3 siblings, 2 replies; 98+ messages in thread From: Drew Adams @ 2018-10-06 0:20 UTC (permalink / raw) To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel, npostavs > I'm not arguing that Elisp should prohibit symbols from containing > confusing characters, only that these characters should be easily > recognizable in plain-text source code, without requiring Emacs itself > (configured a certain way) to view the source. For example, if we > required a backslash before every confusable character in a symbol, that > would go a long way toward addressing the problem. The right approach is to let Lisp tell you about its syntax. Lisp should raise an error only for, well, an actual Lisp error. If Emacs wants to highlight something that it can guess (accurately) might be a typo (e.g. a copy+paste gotcha) then fine. And even that highlighting should be optional (which it is, if from font-lock). There are more and more such copy+paste gotchas, for at least a couple reasons: (1) Emacs has now moved to using curly quotes more gratuitously, and (2) users copy code from both Emacs and other sources, and some such code uses curly quotes (even sometimes mistakenly in place of apostrophes in Lisp), no-break space chars, and other confusables. And maybe also (3) users can sometimes type a curly quote, no-break space, etc. more easily now, in some editors, maybe even sometimes by just hitting an ordinary keyboard key, such as ' or space bar. We have to live with this now, like it or not. That's not a reason to tell Lisp to treat a curly quote as an apostrophe, and it's not a reason to tell it to raise an error when a curly quote is used in a place where an apostrophe could be used. 
Similarly, it's not a reason for Lisp to guess that you really meant SPC instead of no-break space. And so on. This is a judgment call, but we should _let Lisp judge_ about syntax errors, based on, well, its own syntax. If you use (let (foo foo)...), where there is a no-break space between foo and foo, so be it. That's a single symbol, `foo foo'. (My mail client doesn't even let me paste a no-break space char there, it seems, so you'll have to pretend. That's the kind of second-guessing behavior we should be avoiding, FWIW.) ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 0:20 ` Drew Adams @ 2018-10-06 9:14 ` Alan Mackenzie 2018-10-06 14:34 ` Stefan Monnier ` (2 more replies) 2018-10-06 16:17 ` Paul Eggert 1 sibling, 3 replies; 98+ messages in thread From: Alan Mackenzie @ 2018-10-06 9:14 UTC (permalink / raw) To: Drew Adams; +Cc: npostavs, Eli Zaretskii, Paul Eggert, emacs-devel Hello, Drew. Just a quick point. On Sat, Oct 06, 2018 at 00:20:27 +0000, Drew Adams wrote: [ .... ] > This is a judgment call, but we should _let Lisp judge_ > about syntax errors, based on, well, its own syntax. If you > use (let (foo foo)...), where there is a no-break space > between foo and foo, so be it. That's a single symbol, > `foo foo'. Do we even allow the syntax (let ((foo))...)? If we do, then why? There's (let (foo)...) and (let ((foo nil))...) for binding a symbol to nil. We made (setq foo) invalid some while ago. Why not similarly make (let ((foo))...) invalid? That would solve at least part of this problem, is easy to do, and is almost certainly harmless. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 9:14 ` Alan Mackenzie @ 2018-10-06 14:34 ` Stefan Monnier 2018-10-06 14:57 ` Drew Adams 2018-10-06 16:10 ` Paul Eggert 2 siblings, 0 replies; 98+ messages in thread From: Stefan Monnier @ 2018-10-06 14:34 UTC (permalink / raw) To: emacs-devel > We made (setq foo) invalid some while ago. Why not similarly make (let > ((foo))...) invalid? I assume you mean that the byte-compiler should signal a warning? If so, I'm fully in favor. Signaling an actual error would be problematic at this stage, because it'd break too much code. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 9:14 ` Alan Mackenzie 2018-10-06 14:34 ` Stefan Monnier @ 2018-10-06 14:57 ` Drew Adams 2018-10-06 15:42 ` Garreau, Alexandre 2018-10-06 16:10 ` Paul Eggert 2 siblings, 1 reply; 98+ messages in thread From: Drew Adams @ 2018-10-06 14:57 UTC (permalink / raw) To: Alan Mackenzie; +Cc: npostavs, Eli Zaretskii, Paul Eggert, emacs-devel > > This is a judgment call, but we should _let Lisp judge_ > > about syntax errors, based on, well, its own syntax. If you > > use (let (foo foo)...), where there is a no-break space > > between foo and foo, so be it. That's a single symbol, > > `foo foo'. > > Do we even allow the syntax (let ((foo))...)? If we do, then why? > There's (let (foo)...) and (let ((foo nil))...) for binding a symbol to > nil. Yes, sorry. I wasn't paying attention to the parens in that example. My point was only that use of `foo foo' (with a no-break space between the two foo's) as a mistake/typo for an intended `foo foo' (with a normal space) should not be signaled by Lisp as an error. But the no-break space could be highlighted as sometimes helpful info. `foo foo' (with no-break space) is just a symbol, for Lisp - not a syntax error. E.g. (changing the example): (let (foo foo)...) binds symbol `foo foo' (with a no-break space) to nil. It doesn't bind symbol `foo' to the current value of symbol `foo'. So, e.g., if symbol `foo' happens to be unbound then even evaluation of that binding won't raise an error (e.g. unbound variable `foo'). ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 14:57 ` Drew Adams @ 2018-10-06 15:42 ` Garreau, Alexandre 0 siblings, 0 replies; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-06 15:42 UTC (permalink / raw) To: Drew Adams Cc: Alan Mackenzie, Eli Zaretskii, Paul Eggert, emacs-devel, npostavs On 2018-10-06 at 14:57, Drew Adams wrote: > My point was only that use of `foo foo' (with a no-break > space between the two foo's) as a mistake/typo for an > intended `foo foo' (with a normal space) should not be > signaled by Lisp as an error. But the no-break space could > be highlighted as sometimes helpful info. `foo foo' (with > no-break space) is just a symbol, for Lisp - not a syntax > error. The no-break space is already displayed as a sort of colored underscore. The problem is that there are a bunch of other kinds of spaces. I personally only use the no-break space and the non-justifying and/or “fine” space (dunno if it matches an en space) in French, but I do so commonly in strings in emacs, which correctly highlights only the normal no-break space, not the others. I’d like to see those highlighted as well (perhaps, or even preferably, in a different way), or maybe it should be customizable according to user preferences (or language?). > E.g. (changing the example): > > (let (foo foo)...) binds symbol `foo foo' (with a no-break > space) to nil. It doesn't bind symbol `foo' to the current > value of symbol `foo'. I would have expected it to bind foo to nil twice (or to signal an error or a warning). It seems you used a normal space, though: emacs already highlights the no-break space, so I would have noticed it. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 9:14 ` Alan Mackenzie 2018-10-06 14:34 ` Stefan Monnier 2018-10-06 14:57 ` Drew Adams @ 2018-10-06 16:10 ` Paul Eggert 2 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2018-10-06 16:10 UTC (permalink / raw) To: Alan Mackenzie, Drew Adams; +Cc: Eli Zaretskii, npostavs, emacs-devel Alan Mackenzie wrote: > Do we even allow the syntax (let ((foo))...)? Although that's a reasonable question it doesn't address the main problem, as there are lots of other opportunities for confusion like this one. For example, if I see '(foo ․ bar) in Elisp source code, I'll naturally think it yields a cons of two symbols. It doesn't: it yields a list of three symbols, because that dot is not a FULL STOP (U+002E); it's a ONE DOT LEADER (U+2024). ^ permalink raw reply [flat|nested] 98+ messages in thread
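The reading Paul describes can be checked with the reader itself. What follows is only a minimal sketch: the helper name `my/read-shape` is hypothetical, and it assumes a reader that treats U+2024 as an ordinary symbol constituent, as the message above says it does.

```elisp
;; Classify what the Lisp reader builds from a piece of source text.
(defun my/read-shape (text)
  "Return a rough description of the object read from TEXT."
  (let ((obj (read text)))
    (cond ((not (consp obj)) 'other)
          ;; A cons whose cdr is itself a list is treated as an ordinary list.
          ((listp (cdr obj)) (list 'list-of (safe-length obj)))
          ;; Otherwise the cdr is an atom: a dotted pair.
          (t 'dotted-pair))))

(my/read-shape "(foo . bar)")       ; FULL STOP (U+002E)
(my/read-shape "(foo \u2024 bar)")  ; ONE DOT LEADER (U+2024)
```

Under that assumption the first call should report a dotted pair and the second a list of three elements, even though the two source strings look nearly identical in most fonts.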
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 0:20 ` Drew Adams 2018-10-06 9:14 ` Alan Mackenzie @ 2018-10-06 16:17 ` Paul Eggert 2018-10-07 1:13 ` Drew Adams 2018-10-08 3:51 ` Richard Stallman 1 sibling, 2 replies; 98+ messages in thread From: Paul Eggert @ 2018-10-06 16:17 UTC (permalink / raw) To: Drew Adams, Eli Zaretskii; +Cc: emacs-devel, npostavs Drew Adams wrote: > The right approach is to let Lisp tell you about its syntax... we should _let Lisp judge_ This seems to be assuming that there is something out there called "Lisp" that tells us what to do. That's entirely backwards. Lisp is our servant, not our master. Lisp syntax should be whatever the best syntax we can come up with, to help us and our users do our work, and this should be the way we think and write about it. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 16:17 ` Paul Eggert @ 2018-10-07 1:13 ` Drew Adams 2018-10-08 3:51 ` Richard Stallman 1 sibling, 0 replies; 98+ messages in thread From: Drew Adams @ 2018-10-07 1:13 UTC (permalink / raw) To: Paul Eggert, Eli Zaretskii; +Cc: emacs-devel, npostavs > > The right approach is to let Lisp tell you about its syntax... we should _let > Lisp judge_ > > This seems to be assuming that there is something out there called "Lisp" that > tells us what to do. That's entirely backwards. Lisp is our servant, not our > master. Lisp syntax should be whatever the best syntax we can come up with, > to help us and our users do our work, and this should be the way we think and > write about it. Such characters have symbol syntax in Lisp (Elisp, for instance). That's the Lisp syntax in question. Lisp doesn't require you to escape them in symbols. Hasn't done so before and shouldn't do so now (IMHO). Yes, we have liberty to change Lisp, including Lisp syntax, in any number of ways. That doesn't mean that we should. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 16:17 ` Paul Eggert 2018-10-07 1:13 ` Drew Adams @ 2018-10-08 3:51 ` Richard Stallman 1 sibling, 0 replies; 98+ messages in thread From: Richard Stallman @ 2018-10-08 3:51 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, npostavs, drew.adams, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] We can make any change in Emacs Lisp syntax that we decide to make, but incoherence with the spirit of Lisp will lead to trouble. The spirit of Lisp is not something that can be precisely defined, but experienced Lispers will mostly agree about what it says. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 23:02 ` Paul Eggert 2018-10-06 0:20 ` Drew Adams @ 2018-10-06 10:11 ` Eli Zaretskii 2018-10-06 15:51 ` Paul Eggert 2018-10-06 11:22 ` Garreau, Alexandre 2018-10-09 14:43 ` Noam Postavsky 3 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-10-06 10:11 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, drew.adams, npostavs > Cc: npostavs@users.sourceforge.net, emacs-devel@gnu.org, drew.adams@oracle.com > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 5 Oct 2018 16:02:09 -0700 > > However, that would be heading in the wrong direction, because we > shouldn't assume that Elisp code is reviewed only via Emacs. I regularly > use Savannah's web interface to look at Elisp source code diffs, for > example, and there's lots of other ways I and other developers use > non-Emacs programs to look at Elisp source. Because reading source code > is an essential property of free software, and because it would set a > bad precedent if we said or implied that one really should use only > Emacs to read Elisp code, we can't sufficiently address the problem > merely by highlighting characters when Emacs is viewing them in a > certain way and saying or implying that people should use only Emacs to > review Elisp code. > > I'm not arguing that Elisp should prohibit symbols from containing > confusing characters, only that these characters should be easily > recognizable in plain-text source code, without requiring Emacs itself > (configured a certain way) to view the source. For example, if we > required a backslash before every confusable character in a symbol, that > would go a long way toward addressing the problem. I agree that viewing ELisp code outside of Emacs is a valid use case. But I don't think a backslash before these non-ASCII quotes will significantly lower the confusion potential when those characters are used in the source. 
Basically, there's a contradiction here between our desire not to confuse relatively inexperienced users of ELisp and help them avoid problems which might be hard to figure out, and our desire not to annoy experienced users. Personally, I think that using faces strikes a good balance between these contradictory motives. I don't see how we can be harsh to uses of these characters without actually prohibiting their use in symbols. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 10:11 ` Eli Zaretskii @ 2018-10-06 15:51 ` Paul Eggert 2018-10-06 16:45 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-10-06 15:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Noam Postavsky, emacs-devel Eli Zaretskii wrote: > I agree that viewing ELisp code outside of Emacs is a valid use case. > But I don't think a backslash before these non-ASCII quotes will > significantly lower the confusion potential when those characters are > used in the source. I don't follow. If someone writes '(let ((foo\ bar)) baz)' then a human reader is put immediately and obviously on notice that there's something odd about that code. We already require a backslash for that ordinary space (U+0020); why not also require it for EN SPACE (U+2002)? That will significantly lower confusion here. The point is not to distinguish 'foo\ bar' (with ordinary space) from 'foo\ bar' (with en space); the point is to distinguish both from the 'foo bar' (two identifiers) that a reader would ordinarily expect here, because that's the main way a malicious hacker could confuse even experienced reviewers. > Basically, there's a contradiction here between our desire not to > confuse relatively inexperienced users of ELisp and help them avoid > problems which might be hard to figure out, and our desire not to > annoy experienced users. That's not the point I was making. I'm an experienced Elisp user, and I am *extremely annoyed* (to put it mildly) that malicious users can put one over on us by using characters that look like spaces, or parentheses, or whatever, characters that are not what they look like. This has nothing to do with confusing inexperienced users. I *really want* Elisp to be relatively immune to this problem, at least for programs that I help maintain. 
And I don't want the immunity to work only when I'm using Emacs on a nice display: I often read code with Emacs highlighting unavailable or turned off, or without using Emacs at all. At the very least there should be an option whereby the Emacs source code itself is routinely verified to be free of confusable characters in identifiers, to help prevent malicious code from sneaking into Emacs itself. Even if we give users the ability to let others shoot them, we should at least improve our own defenses. > I don't see how > we can be harsh to uses of these characters without actually > prohibiting their use in symbols. I already gave one proposal for doing just that: require that characters confusable with ASCII be escaped. Initially we can merely warn about any unescaped confusables; as long as the warning is prominent enough that should be OK for starters. This proposal does not prohibit their use in symbols, as one can simply escape the characters. There are other ways to skin this cat as well. We should be heading in this direction, not removing the (admittedly inadequate) protection we already have. ^ permalink raw reply [flat|nested] 98+ messages in thread
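The kind of routine check Paul asks for could start out as something very small. The sketch below is an illustration only: the names `my/confusable-sample` and `my/find-unescaped-confusables` are hypothetical, the table is a hand-picked sample of characters mentioned in this thread rather than the Unicode confusables list, and the scan works on raw text, so it does not distinguish symbols from strings or comments.

```elisp
;; A hand-picked sample of characters named in this thread.
;; This is NOT the Unicode confusables list.
(defvar my/confusable-sample
  '((#x00B4 . "ACUTE ACCENT")
    (#x055A . "ARMENIAN APOSTROPHE")
    (#x2002 . "EN SPACE")
    (#x2019 . "RIGHT SINGLE QUOTATION MARK")
    (#x2024 . "ONE DOT LEADER")
    (#xA78C . "LATIN SMALL LETTER SALTILLO"))
  "Alist mapping a few confusable characters to their Unicode names.")

(defun my/find-unescaped-confusables (text)
  "Return (INDEX . NAME) pairs for sample confusables in TEXT.
INDEX is zero-based.  A character directly preceded by a backslash
is treated as escaped and skipped; this check is deliberately crude."
  (let ((hits '()))
    (dotimes (i (length text))
      (let ((entry (assq (aref text i) my/confusable-sample)))
        (when (and entry
                   (not (and (> i 0) (eq (aref text (1- i)) ?\\))))
          (push (cons i (cdr entry)) hits))))
    (nreverse hits)))

;; Unescaped curly quote: reported.
(my/find-unescaped-confusables "(gnus-get-function method \u2019open-server)")
;; Backslash-escaped curly quote: not reported.
(my/find-unescaped-confusables "(setq \\\u2019bar 42)")
```

A real check would need the full confusables data and an actual parse of the source, but even this shape shows that the proposal allows the characters in symbols as long as they are escaped.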
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 15:51 ` Paul Eggert @ 2018-10-06 16:45 ` Eli Zaretskii 2018-10-06 18:03 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-10-06 16:45 UTC (permalink / raw) To: Paul Eggert; +Cc: npostavs, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Cc: emacs-devel@gnu.org, Noam Postavsky <npostavs@users.sourceforge.net> > Date: Sat, 6 Oct 2018 08:51:18 -0700 > > Eli Zaretskii wrote: > > I agree that viewing ELisp code outside of Emacs is a valid use case. > > But I don't think a backslash before these non-ASCII quotes will > > significantly lower the confusion potential when those characters are > > used in the source. > > I don't follow. If someone writes '(let ((foo\ bar)) baz)' then a human reader > is put immediately and obviously on notice that there's something odd about that > code. We already require a backslash for that ordinary space (U+0020); why not > also require it for EN SPACE (U+2002)? That will significantly lower confusion here. How will it lower the confusion, when the same is required for a space? And once again, these examples are not relevant to the issue at hand, which is only about quotes. > > I don't see how > > we can be harsh to uses of these characters without actually > > prohibiting their use in symbols. > > I already gave one proposal for doing just that: require that characters > confusable with ASCII be escaped. That's an annoyance, IMO. This is why this bug report exists, right? And again, please don't bring up the more general issue with any other confusable character, as those require a more general solution about which we don't yet have a clear idea. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 16:45 ` Eli Zaretskii @ 2018-10-06 18:03 ` Paul Eggert 2018-10-06 18:29 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-10-06 18:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, emacs-devel Eli Zaretskii wrote: > these examples are not relevant to the issue at hand, > which is only about quotes Quotes are part of the same problem. For example, here's some code in Gnus: (ignore-errors (gnus-get-function method 'open-server)) Change that APOSTROPHE (U+0027) to RIGHT SINGLE QUOTATION MARK (U+2019) and the code will look the same but do something quite different, with no diagnostic. This sort of code is reasonably common and can easily be security-relevant. If Emacs stops diagnosing this abuse of confusable characters, we're opening ourselves up more to malicious code. > How will it lower the confusion, when the same is required for a space? Let me rephrase my point, with apostrophe rather than space. The point is not to distinguish ´open-server (with U+00B4 ACUTE ACCENT) from ՚open-server (with U+055A ARMENIAN APOSTROPHE); the point is to distinguish both of these from the 'open-server (apostrophe followed by symbol) that a reader would ordinarily expect here. We need to give an obvious way for human readers to see that something odd is going on. Readers can then use C-x = (or whatever) to find out exactly what the oddness is. ^ permalink raw reply [flat|nested] 98+ messages in thread
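The difference between the apostrophe and its look-alikes can be observed by handing each spelling to the reader. This is a hedged sketch: `my/try-read` is a hypothetical helper, and the result for the unescaped curly quote depends on the Emacs version (a plain symbol in Emacs 26, an `invalid-read-syntax` error on the master branch described at the start of this thread).

```elisp
(defun my/try-read (text)
  "Read TEXT with the Lisp reader; return the object or (error . DATA)."
  (condition-case err
      (read text)
    (invalid-read-syntax (cons 'error (cdr err)))))

(let ((curly (string #x2019)))          ; RIGHT SINGLE QUOTATION MARK
  (list
   ;; Real apostrophe: reads as (quote open-server).
   (my/try-read "'open-server")
   ;; Unescaped curly quote: version-dependent, see above.
   (my/try-read (concat curly "open-server"))
   ;; Backslash-escaped curly quote: a symbol whose name starts with
   ;; the curly quote.
   (my/try-read (concat "\\" curly "open-server"))))
```

Only the first form quotes anything; the other two, where they read at all, name a variable that Emacs would then try to evaluate.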
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 18:03 ` Paul Eggert @ 2018-10-06 18:29 ` Eli Zaretskii 2018-10-06 19:18 ` Paul Eggert ` (2 more replies) 0 siblings, 3 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-10-06 18:29 UTC (permalink / raw) To: Paul Eggert; +Cc: npostavs, emacs-devel > Cc: emacs-devel@gnu.org, npostavs@users.sourceforge.net > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 6 Oct 2018 11:03:25 -0700 > > Eli Zaretskii wrote: > > > these examples are not relevant to the issue at hand, > > which is only about quotes > > Quotes are part of the same problem. Yes, but a much smaller part. And solving the problem with quotes doesn't require solving the more general one. > (ignore-errors (gnus-get-function method 'open-server)) > > Change that APOSTROPHE (U+0027) to RIGHT SINGLE QUOTATION MARK (U+2019) and the > code will look the same but do something quite different, with no diagnostic. I understand. I'm just saying that adding a backslash before the U+2019 quote will not significantly improve the situation, because Emacs Lisp uses backslashes in many other situations, like ?\", and therefore the mere fact that there is a backslash doesn't necessarily alert the human reader to the existence of an unusual character. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 18:29 ` Eli Zaretskii @ 2018-10-06 19:18 ` Paul Eggert 2018-10-06 19:30 ` Paul Eggert 2018-10-06 19:32 ` Garreau, Alexandre 2 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2018-10-06 19:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, npostavs Eli Zaretskii wrote: > Emacs Lisp uses backslashes in many other situation, like ?\", and > therefore the mere fact that there is a backslash doesn't necessarily > alert the human reader to the existence of an unusual character. Yes, the solution I proposed addresses only symbols, not the more-general problem of confusable characters in strings and characters. Although symbols are a more significant issue since they occur more often, it would be nice to address strings and character constants too. We can do that by adding support for a new string escape \cX for the confusable character X (e.g., ?\c՚ would mean U+055A ARMENIAN APOSTROPHE), and by diagnosing the use of unescaped confusable characters in strings and character constants. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 18:29 ` Eli Zaretskii 2018-10-06 19:18 ` Paul Eggert @ 2018-10-06 19:30 ` Paul Eggert 2018-10-06 19:32 ` Garreau, Alexandre 2 siblings, 0 replies; 98+ messages in thread From: Paul Eggert @ 2018-10-06 19:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, npostavs Eli Zaretskii wrote: >> (ignore-errors (gnus-get-function method 'open-server)) >> >> Change that APOSTROPHE (U+0027) to RIGHT SINGLE QUOTATION MARK (U+2019) and the >> code will look the same but do something quite different, with no diagnostic. > ... adding a backslash between the > U+2019 quote will not significantly improve the situation, because > Emacs Lisp uses backslashes in many other situation, like ?\", and > therefore the mere fact that there is a backslash doesn't necessarily > alert the human reader to the existence of an unusual character. True, a backslash within a string or character is not a sufficient alert. However, a backslash within a symbol is. For example, although it's relatively common to see strings like this in Elisp source code: "color=\"blue\"" it's extremely uncommon to see symbols like this: color=\"blue\" and so if one sees such a symbol (possibly with some other character in place of the " marks, possibly not) one will easily know that something odd is going on. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 18:29 ` Eli Zaretskii 2018-10-06 19:18 ` Paul Eggert 2018-10-06 19:30 ` Paul Eggert @ 2018-10-06 19:32 ` Garreau, Alexandre 2 siblings, 0 replies; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-06 19:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel, npostavs On 2018-10-06 at 21:29, Eli Zaretskii wrote: > I understand. I'm just saying that adding a backslash between the > U+2019 quote will not significantly improve the situation, because > Emacs Lisp uses backslashes in many other situation, like ?\", and > therefore the mere fact that there is a backslash doesn't necessarily > alert the human reader to the existence of an unusual character. I think what is wanted here is not to alert the reader to an unusual character, but to alert them to a character that is not syntactically relevant (for the Lisp reader): one that looks like a character with syntactic consequences, such as ?\' (which quotes), ?\", ?\( or ?\), while actually being a “normal” character that is part of the symbol. The point is not to flag that two symbols may look alike (the ՚open-server/´open-server case), but to tell apart two different things: on one hand an unquoted symbol (like \'open-server, where the backslash makes it evident that it’s a symbol, since afaik a character preceded by a backslash cannot do anything syntactic other than be part of a symbol, so the behavior is clear), and on the other hand something that is not an unquoted symbol at all (like 'open-server, which just returns the symbol). I guess the same issue arises with ?¸, or anything similar, which won’t unquote the next symbol. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 23:02 ` Paul Eggert 2018-10-06 0:20 ` Drew Adams 2018-10-06 10:11 ` Eli Zaretskii @ 2018-10-06 11:22 ` Garreau, Alexandre 2018-10-06 11:50 ` Eli Zaretskii 2018-10-06 16:24 ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Paul Eggert 2018-10-09 14:43 ` Noam Postavsky 3 siblings, 2 replies; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-06 11:22 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, npostavs, drew.adams, emacs-devel On 2018-10-05 at 16:02, Paul Eggert wrote: > However, that would be heading in the wrong direction, because we > shouldn't assume that Elisp code is reviewed only via Emacs. I > regularly use Savannah's web interface In a world where unicode is increasingly present and confusion about its characters increasingly problematic (typosquatting, etc.) wouldn’t it be reasonable to expect unicode-related semantic functions to be provided in most frameworks, systems and languages to allow better handling of such problems, thus making that problem the interface’s one? Maybe if ever this problem occurs more and more in domain names, internet addresses, and such, interfaces such as web ones, or other editors, will inevitably need to support features to avoid confusion the same way emacs currently could? ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 11:22 ` Garreau, Alexandre @ 2018-10-06 11:50 ` Eli Zaretskii 2018-10-06 12:10 ` Garreau, Alexandre 2018-10-06 13:15 ` Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] Garreau, Alexandre 2018-10-06 16:24 ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Paul Eggert 1 sibling, 2 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-10-06 11:50 UTC (permalink / raw) To: Garreau, Alexandre; +Cc: npostavs, eggert, drew.adams, emacs-devel > From: "Garreau\, Alexandre" <galex-713@galex-713.eu> > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, drew.adams@oracle.com, npostavs@users.sourceforge.net > Date: Sat, 06 Oct 2018 13:22:14 +0200 > > In a world where unicode is increasingly present and confusion about its > characters increasingly problematic (typosquatting, etc.) wouldn’t it be > reasonable to expect unicode-related semantic functions to be provided > in most frameworks, systems and languages to allow better handling of > such problems, thus making that problem the interface’s one? I don't think I understand what this means in practice; please elaborate. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 11:50 ` Eli Zaretskii @ 2018-10-06 12:10 ` Garreau, Alexandre 2018-10-06 14:00 ` Eli Zaretskii 2018-10-06 13:15 ` Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] Garreau, Alexandre 1 sibling, 1 reply; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-06 12:10 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, drew.adams, emacs-devel On 06/10/2018 at 14:50, Eli Zaretskii wrote: >> From: "Garreau\, Alexandre" <galex-713@galex-713.eu> >> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, >> drew.adams@oracle.com, npostavs@users.sourceforge.net >> Date: Sat, 06 Oct 2018 13:22:14 +0200 >> >> In a world where unicode is increasingly present and confusion about its >> characters increasingly problematic (typosquatting, etc.) wouldn’t it be >> reasonable to expect unicode-related semantic functions to be provided >> in most frameworks, systems and languages to allow better handling of >> such problems, thus making that problem the interface’s one? > > I don't think I understand what this means in practice; please > elaborate. afaik there are also problems, in content other than source code, with indistinguishable Unicode characters, such as the Latin ?o and the Cyrillic ?о (the first example of Unicode-powered typosquatting I ever heard of), the different spaces (sometimes not distinguishable in a monospace font), or, to stay on monospacing problems: I have great pain writing correct French text, as I must always check in something other than emacs which of ?– and ?— is the medium dash and which is the long one (I normally recall it from their position on my keyboard, but as they’re side by side I often forget), not to mention the various bidirectionality hacks you highlighted earlier. 
I also heard about emails confusing semantic-based Bayesian anti-spam filters by including non-spammy words that, because of some Unicode tricks, wouldn’t be displayed to the user. These problems aren’t local to source code, nor to emacs (as many people use something other than emacs to read mail, websites and news, and to read domain names). Afaik there are canonicalization functions and functions for semantic Unicode categories that help in knowing what is punctuation, what is combining, what is displayed and how much space it takes, and maybe, but I’m unsure, which characters are difficult or even impossible to distinguish. There may also be canonicalization functions that make two differently encoded strings compare equal (this relates to combining characters, such as the difference between "é" and "é", the latter made of ?e then ?́; it’s fun to see how this last one is strangely displayed yet correctly evaluated by emacs), or that make two strings with different but look-alike characters compare equal too. I guess this issue is going to be even less of a problem in free software, where in theory the writers should be well-intentioned and shouldn’t try to trick readers about what the software does (and/or the code should at least be reviewed with capable tools and/or knowledge), compared to cases where it is going to be abusable and profitable, such as typosquatting: "google.com" and "gооgle.com" are not the same, but google could afford (and took care) to buy both, while not everyone could do the same (and nobody has yet reserved "amazоn.com"), and people might crack, steal or blackmail using something like that. (It’s also interesting to notice how emacs forward/backward-word detects and uses the language switch to stop at the "оо"; I’m astonished by these capabilities, and I have to thank you guys for such a great piece of software!) ^ permalink raw reply [flat|nested] 98+ messages in thread
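Emacs itself already ships some of the canonicalization and category functions mentioned above. The following is a small sketch rather than a recommended check: it uses `ucs-normalize-NFC-string` from the bundled `ucs-normalize` library and the built-in `char-script-table`, while the helper name `my/scripts-in` is hypothetical.

```elisp
(require 'ucs-normalize)

;; "é" as one code point versus "e" followed by COMBINING ACUTE ACCENT.
(let ((precomposed (string #xE9))
      (decomposed  (string ?e #x301)))
  (list
   ;; Raw comparison of the two encodings.
   (string= precomposed decomposed)
   ;; Comparison after normalizing the decomposed form to NFC.
   (string= precomposed (ucs-normalize-NFC-string decomposed))))

(defun my/scripts-in (text)
  "Return the distinct scripts of the characters in TEXT.
Scripts are looked up in `char-script-table'."
  (let ((scripts '()))
    (dotimes (i (length text))
      (let ((script (aref char-script-table (aref text i))))
        (unless (memq script scripts)
          (push script scripts))))
    (nreverse scripts)))

;; "google" spelled with two CYRILLIC SMALL LETTER O (U+043E).
(my/scripts-in (concat "g" (string #x43E #x43E) "gle"))
```

Normalization handles the combining-character case, but it does not make Latin o and Cyrillic о compare equal; a word that mixes scripts is merely a hint that something may be off, which is why a script check is shown separately.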
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 12:10 ` Garreau, Alexandre @ 2018-10-06 14:00 ` Eli Zaretskii 2018-10-24 22:25 ` Noam Postavsky 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-10-06 14:00 UTC (permalink / raw) To: Garreau, Alexandre; +Cc: npostavs, eggert, drew.adams, emacs-devel > From: "Garreau\, Alexandre" <galex-713@galex-713.eu> > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org, drew.adams@oracle.com, npostavs@users.sourceforge.net > Date: Sat, 06 Oct 2018 14:10:17 +0200 > > afaik there are also problems in other contents than source code about > undistinguishable unicode character, such as the latin ?o and the > cyrillic ?о (the first example of unicode-powered typosquatting I ever > heard), the different spaces (sometimes not distinguishable in monospace > font), or, to stay on monospacing problems: I have great pain in writing > correct french text as I must always check in something not-emacs about > which one between ?– and ?— is the medium and the long dash (I normally > recall through their position on my keyboard but as they’re aside I > often forget), not to recall the different hacks about bidirectionality > you highlighted earlier. I also heard about emails confusing > semantic-based bayesian anti-spam by putting not-spammy words in mails > that, because of some unicode tricks, wouldn’t be displayed to user. This is the more general problem I mentioned up-thread: http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00052.html I agree that we should improve Emacs in that area, but I think it would be wrong to hold off fixing the issue with quotes because the more general problem is not yet solved. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 14:00 ` Eli Zaretskii @ 2018-10-24 22:25 ` Noam Postavsky 0 siblings, 0 replies; 98+ messages in thread From: Noam Postavsky @ 2018-10-24 22:25 UTC (permalink / raw) To: Eli Zaretskii Cc: Paul Eggert, Drew Adams, Garreau, Alexandre, Emacs developers On Sat, 6 Oct 2018 at 10:00, Eli Zaretskii <eliz@gnu.org> wrote: > > > afaik there are also problems in other contents than source code about > > undistinguishable unicode character, such as the latin ?o and the > > cyrillic ?о (the first example of unicode-powered typosquatting I ever > > heard), the different spaces (sometimes not distinguishable in monospace [...] > > This is the more general problem I mentioned up-thread: > > http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00052.html > > I agree that we should improve Emacs in that area, but I think it > would be wrong to hold off fixing the issue with quotes because the > more general problem is not yet solved. Right, I do see the quotes thing as an obvious place to start, and given how difficult the more general issue is, starting with enhancement of existing warnings and existing errors makes sense. It addresses the immediate problem of confusion from curly quotes, and we can add more confusable characters later without fear of breaking anything. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] 2018-10-06 11:50 ` Eli Zaretskii 2018-10-06 12:10 ` Garreau, Alexandre @ 2018-10-06 13:15 ` Garreau, Alexandre 2018-10-06 14:01 ` Eli Zaretskii 1 sibling, 1 reply; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-06 13:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, drew.adams, emacs-devel On 06/10/2018 at 14:50, Eli Zaretskii wrote: >> From: "Garreau\, Alexandre" <galex-713@galex-713.eu> >> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, >> drew.adams@oracle.com, npostavs@users.sourceforge.net >> Date: Sat, 06 Oct 2018 13:22:14 +0200 >> >> In a world where unicode is increasingly present and confusion about its >> characters increasingly problematic (typosquatting, etc.) wouldn’t it be >> reasonable to expect unicode-related semantic functions to be provided >> in most frameworks, systems and languages to allow better handling of >> such problems, thus making that problem the interface’s one? > > I don't think I understand what this means in practice; please > elaborate. The point I wanted to make is that, as I highlighted, this problem matters even more in interfaces other than source code, typically browsers and websites, since these are becoming the most used interfaces for everything nowadays. So I guess these Unicode anti-confusion functions, and higher-level functions built on them, already are or will become available in browsers and in languages such as Perl and PHP, and will end up in frameworks written in those languages, so that in the end “interfaces other than Emacs”, such as web browsers or websites, may support features such as coloring mixed-script text or unusual spaces differently, etc. The other options are “ban Unicode as much as possible”, “disallow mixed-script”, or “ban all Unicode punctuation characters (or all non-letter (or non-alphanumeric?) characters, or something like that) unless they’re inside ASCII”. I believe that with increased Unicode support, most languages, frameworks and software should end up with features that allow these characters without creating too many problems (at least not many more than in Emacs). ^ permalink raw reply [flat|nested] 98+ messages in thread
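The “mixed-script” check mentioned in this message could be sketched roughly as follows. This is only an illustration, not something proposed in the thread: `my-letter-scripts` and `my-mixed-script-p` are hypothetical names, and the sketch assumes that Emacs's built-in `char-script-table` and the `general-category` character property classify the letters involved as `latin` and `cyrillic`.

```elisp
(defun my-letter-scripts (string)
  "Return the list of distinct scripts used by the letters in STRING.
Hypothetical helper: scripts come from the built-in `char-script-table'."
  (let (scripts)
    (dolist (ch (string-to-list string))
      ;; Only look at letters, so that digits and punctuation are ignored.
      (when (memq (get-char-code-property ch 'general-category)
                  '(Lu Ll Lt Lm Lo))
        (let ((script (aref char-script-table ch)))
          (unless (memq script scripts)
            (push script scripts)))))
    (nreverse scripts)))

(defun my-mixed-script-p (string)
  "Return non-nil if the letters of STRING come from more than one script."
  (> (length (my-letter-scripts string)) 1))

;; The second string spells "google.com" with two Cyrillic letters (U+043E).
(my-mixed-script-p "google.com")
(my-mixed-script-p "g\u043e\u043egle.com")
```

A check like this only flags a mix of scripts; it says nothing about whether the mix is malicious, so it seems better suited to highlighting than to rejecting text.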
* Re: Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] 2018-10-06 13:15 ` Unicode security-issues workarounds elsewhere [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?] Garreau, Alexandre @ 2018-10-06 14:01 ` Eli Zaretskii 0 siblings, 0 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-10-06 14:01 UTC (permalink / raw) To: Garreau, Alexandre; +Cc: eggert, emacs-devel, drew.adams, npostavs > From: "Garreau\, Alexandre" <galex-713@galex-713.eu> > Date: Sat, 06 Oct 2018 15:15:34 +0200 > Cc: npostavs@users.sourceforge.net, eggert@cs.ucla.edu, drew.adams@oracle.com, > emacs-devel@gnu.org > > The other option being “ban unicode as much as possible” or “disallow > mixed-script”, and “ban all unicode punctuation characters (or all > non-letters (or non-alphanumeric?) characters, or something like that) > unless they’re inside ascii”. I don't think this is feasible in Emacs, we cannot limit non-ASCII punctuation to ASCII text. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 11:22 ` Garreau, Alexandre 2018-10-06 11:50 ` Eli Zaretskii @ 2018-10-06 16:24 ` Paul Eggert 2018-10-06 16:40 ` Stefan Monnier 1 sibling, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-10-06 16:24 UTC (permalink / raw) To: Garreau, Alexandre; +Cc: Eli Zaretskii, npostavs, drew.adams, emacs-devel Garreau, Alexandre wrote: > wouldn’t it be > reasonable to expect unicode-related semantic functions to be provided > in most frameworks, systems and languages to allow better handling of > such problems What a wonderful world that would be! But it's not the world we live in. Even Emacs misdisplays many of these characters now. And other tools that I routinely use (Firefox, Thunderbird, Gnome terminal) are even worse. We can't reasonably assume that confusable characters will be displayed nicely everywhere, not even five or ten years from now. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-06 16:24 ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Paul Eggert @ 2018-10-06 16:40 ` Stefan Monnier 0 siblings, 0 replies; 98+ messages in thread From: Stefan Monnier @ 2018-10-06 16:40 UTC (permalink / raw) To: emacs-devel > in. Even Emacs misdisplays many of these characters now. And other tools FWIW, we have the `uni-confusables` in GNU ELPA for that purpose. Maybe we should integrate more tightly. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 23:02 ` Paul Eggert ` (2 preceding siblings ...) 2018-10-06 11:22 ` Garreau, Alexandre @ 2018-10-09 14:43 ` Noam Postavsky 2018-10-09 15:30 ` Paul Eggert 2018-10-10 3:57 ` Richard Stallman 3 siblings, 2 replies; 98+ messages in thread From: Noam Postavsky @ 2018-10-09 14:43 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, Drew Adams, Emacs developers On Fri, 5 Oct 2018 at 19:02, Paul Eggert <eggert@cs.ucla.edu> wrote: > > On 10/5/18 1:43 AM, Eli Zaretskii wrote: > > the commonly accepted mechanism of > > pointing out potentially wrong constructs is by visual cues and > > warning messages > > If we decide that Elisp source code must be able to abuse confusable > characters, then of course we should allow such abuse and support it as > best we can, including selective highlighting and whatnot to try to warn > readers of the abuse. Such support won't work outside Emacs, but people > using non-Emacs programs to look at Elisp code will simply be out of luck. The problem is that deciding which characters are confusable and hence require backslash escaping is based on a shifting mess of heuristics. So I don't think it's workable to signal a hard error for this. Both in terms of false positives which could mean possibly breaking code, and false negatives which means we would be giving a false sense of security. That's why I proposed adding highlighting and enhancing existing error messages instead. Of course adding warnings would also make sense. By the way, your EN SPACE example already gives a compile warning: Warning: Unused lexical variable ‘data data’ ^ permalink raw reply [flat|nested] 98+ messages in thread
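The “report rather than signal” approach described in this message can be illustrated with a small scanner. This is a minimal sketch under stated assumptions, not the code that was proposed for Emacs: `my-confusable-quotes` and `my-find-confusable-quotes` are hypothetical names, and the list deliberately contains only the two curved single quotes so that more characters can be appended later.

```elisp
(defvar my-confusable-quotes '(?\u2018 ?\u2019)
  "Hypothetical, deliberately short list of characters to report.
U+2018 and U+2019 are the curved single quotes discussed in this thread;
more characters could be appended later.")

(defun my-find-confusable-quotes (string)
  "Return a list of (INDEX . CHAR) for each listed character in STRING.
Nothing is signaled; the caller decides whether to highlight or warn."
  (let ((index 0)
        hits)
    (dolist (ch (string-to-list string))
      (when (memq ch my-confusable-quotes)
        (push (cons index ch) hits))
      (setq index (1+ index)))
    (nreverse hits)))

;; The string is "(setq foo \u2019bar)" with U+2019 written as an escape.
(my-find-confusable-quotes "(setq foo \u2019bar)")
```

Because the function only returns positions, a false positive costs a stray highlight rather than broken code, which is the trade-off argued for above.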
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 14:43 ` Noam Postavsky @ 2018-10-09 15:30 ` Paul Eggert 2018-10-09 16:13 ` Eli Zaretskii 2018-10-10 3:57 ` Richard Stallman 1 sibling, 1 reply; 98+ messages in thread From: Paul Eggert @ 2018-10-09 15:30 UTC (permalink / raw) To: Noam Postavsky; +Cc: Eli Zaretskii, Drew Adams, Emacs developers Noam Postavsky wrote: > deciding which characters are confusable and hence > require backslash escaping is based on a shifting mess of heuristics. No more than the "shifting mess of heuristics" inevitable in any choice of syntax. Quite possibly the confusables list from the Unicode consortium will do. The list won't shift much once it's established. We can start merely by warning about confusable characters and seeing how often those warnings are triggered in real (as opposed to malicious or purposely-tricky) code. If the warnings are quite rare, in a later Emacs version we can change the manual from "confusable characters should be escaped" to "confusable characters must be escaped". ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 15:30 ` Paul Eggert @ 2018-10-09 16:13 ` Eli Zaretskii 2018-10-09 17:07 ` Paul Eggert 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-10-09 16:13 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, drew.adams, npostavs > Cc: Eli Zaretskii <eliz@gnu.org>, Emacs developers <emacs-devel@gnu.org>, > Drew Adams <drew.adams@oracle.com> > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Tue, 9 Oct 2018 08:30:22 -0700 > > Noam Postavsky wrote: > > deciding which characters are confusable and hence > > require backslash escaping is based on a shifting mess of heuristics. > > No more than the "shifting mess of heuristics" inevitable in any choice of > syntax. Quite possibly the confusables list from the Unicode consortium will do. > The list won't shift much once it's established. > > We can start merely by warning about confusable characters and seeing how often > those warnings are triggered in real (as opposed to malicious or > purposely-tricky) code. If the warnings are quite rare, in a later Emacs version > we can change the manual from "confusable characters should be escaped" to > "confusable characters must be escaped". Confusable characters are confusable only when they are surrounded by ASCII characters or by characters that look like ASCII. By themselves, many of them are entirely legitimate. For example, I see no reason to warn about a symbol named "сталин", even though the characters с and а would be considered confusables if the symbol were named something like "саn". So I think we cannot go by characters here, we need to examine the context. That is why I think we shouldn't link this particular issue, of quote characters, with the more general problem: the latter is much more complicated to solve correctly. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 16:13 ` Eli Zaretskii @ 2018-10-09 17:07 ` Paul Eggert 2018-10-09 19:18 ` Andreas Schwab 2018-10-10 3:58 ` Richard Stallman 0 siblings, 2 replies; 98+ messages in thread From: Paul Eggert @ 2018-10-09 17:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, drew.adams, npostavs Eli Zaretskii wrote: > I see no reason to warn about a symbol named "сталин", even > though the characters с and а will be considered confusables if the > symbol would be named something like "саn". Sure, that's fine. We can limit symbol warnings to the symbols containing non-ASCII chars all of which are confusable with ASCII. This will warn about "саn" (with Cyrillic "с" and "а") but not about "сталин" (with Cyrillic "a"). The point of the guideline is not to warn about every possible confusable character; it's to defend against malicious code. ^ permalink raw reply [flat|nested] 98+ messages in thread
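The rule suggested in this message (warn only when a symbol contains non-ASCII characters and every one of them is confusable with ASCII) might be sketched like this. The names `my-ascii-lookalikes` and `my-suspicious-symbol-name-p` are hypothetical, and the look-alike list is just a small sample of Cyrillic letters written as escapes; a real list would more likely be derived from the Unicode confusables data.

```elisp
(require 'seq)

(defvar my-ascii-lookalikes
  '(?\u0430 ?\u0435 ?\u043e ?\u0440 ?\u0441 ?\u0443 ?\u0445)
  "Hypothetical sample of Cyrillic letters that resemble a, e, o, p, c, y, x.")

(defun my-suspicious-symbol-name-p (name)
  "Return non-nil if NAME has non-ASCII chars and all of them are look-alikes."
  (let ((non-ascii (seq-remove (lambda (ch) (< ch 128)) name)))
    (and non-ascii
         (seq-every-p (lambda (ch) (memq ch my-ascii-lookalikes))
                      non-ascii))))

;; First call: the "can" look-alike (Cyrillic U+0441 U+0430, then Latin n).
;; Second call: the six-letter Cyrillic name used as the counterexample.
(my-suspicious-symbol-name-p "\u0441\u0430n")
(my-suspicious-symbol-name-p "\u0441\u0442\u0430\u043b\u0438\u043d")
```

Under these assumptions the first name is flagged and the second is not. A genuine word spelled only with letters from the list would also be flagged, so the rule can still produce false positives.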
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 17:07 ` Paul Eggert @ 2018-10-09 19:18 ` Andreas Schwab 2018-10-10 9:39 ` Aaron Ecay 2018-10-10 15:18 ` Eli Zaretskii 2018-10-10 3:58 ` Richard Stallman 1 sibling, 2 replies; 98+ messages in thread From: Andreas Schwab @ 2018-10-09 19:18 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, npostavs, drew.adams, emacs-devel On Oct 09 2018, Paul Eggert <eggert@cs.ucla.edu> wrote: > Sure, that's fine. We can limit symbol warnings to the symbols containing > non-ASCII chars all of which are confusable with ASCII. This will warn > about "саn" (with Cyrillic "с" and "а") but not about "сталин" (with > Cyrillic "a"). I'm pretty sure you can find many Russian words that are written with only Latin-alike letters. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 19:18 ` Andreas Schwab @ 2018-10-10 9:39 ` Aaron Ecay 2018-10-10 11:18 ` Garreau, Alexandre 2018-10-10 15:18 ` Eli Zaretskii 1 sibling, 1 reply; 98+ messages in thread From: Aaron Ecay @ 2018-10-10 9:39 UTC (permalink / raw) To: Andreas Schwab; +Cc: emacs-devel On 9 October 2018, Andreas Schwab wrote: > > On Oct 09 2018, Paul Eggert <eggert@cs.ucla.edu> wrote: > >> Sure, that's fine. We can limit symbol warnings to the symbols containing >> non-ASCII chars all of which are confusable with ASCII. This will warn >> about "саn" (with Cyrillic "с" and "а") but not about "сталин" (with >> Cyrillic "a"). > > I'm pretty sure you can find many Russian words that are written with > only Latin-alike letters. Should this be a warning? (let ((с cyrillic-ess)) ...) What about this? (let ((c latin-c) (с cyrillic-ess)) ...) IMO the answer to both questions is yes (because Latin letters are used for elisp special forms like “let,” so they should be inherently privileged) – but I only use Latin letters in programs I write, so I probably donʼt have the perspective to know how annoying such warnings could be to regular users of other scripts. However, since warnings are only (potentially) annoying rather than changing the behavior of programs, it makes sense to be aggressive with them, in order to gauge how disruptive it would be to actually change the way text is interpreted as code. PS An issue that seems related is that it is presently possible to bind the symbols ö (one character, U+00F6 LATIN SMALL LETTER O WITH DIAERESIS) and ö (two characters, U+006F LATIN SMALL LETTER O followed by U+0308 COMBINING DIAERESIS) to different values. This seems like the kind of thing that should be (at least) warned about and (probably) disallowed. -- Aaron Ecay ^ permalink raw reply [flat|nested] 98+ messages in thread
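The point in the PS can be checked with a few lines of Emacs Lisp. This is a minimal sketch, assuming the `ucs-normalize` library that ships with Emacs; `my-same-after-nfc-p` is a hypothetical helper name, and both spellings are written with escapes so that they cannot be mistaken for each other.

```elisp
(require 'ucs-normalize)

(defun my-same-after-nfc-p (a b)
  "Return non-nil if strings A and B are equal after NFC normalization."
  (string= (ucs-normalize-NFC-string a)
           (ucs-normalize-NFC-string b)))

(let ((precomposed "\u00f6")    ; U+00F6, one character
      (decomposed "o\u0308"))   ; U+006F followed by U+0308
  (list
   ;; The reader interns the two spellings as two different symbols.
   (eq (intern precomposed) (intern decomposed))
   ;; After normalization the two names compare equal.
   (my-same-after-nfc-p precomposed decomposed)))
```

A warning along the lines suggested above could compare normalized symbol names in this way, without changing how the reader interns them.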
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-10 9:39 ` Aaron Ecay @ 2018-10-10 11:18 ` Garreau, Alexandre 2018-10-10 14:31 ` Eli Zaretskii 0 siblings, 1 reply; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-10 11:18 UTC (permalink / raw) To: Aaron Ecay; +Cc: Andreas Schwab, emacs-devel On 2018-10-10 at 10:39, Aaron Ecay wrote: > PS An issue that seems related is that it is presently possible to bind > the symbols ö (one character, U+00F6 LATIN SMALL LETTER O WITH DIAERESIS) > and ö (two characters, U+006F LATIN SMALL LETTER O followed by U+0308 > COMBINING DIAERESIS) to different values. This seems like the kind of > thing that should be (at least) warned about and (probably) disallowed. Oh yes… I confirm, this is the case here too (but my version is only 25.1.1, so maybe it changed), it really shouldn’t be that way at all (tried with é and é (btw these two display quite strangely slightly differently…))… ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-10 11:18 ` Garreau, Alexandre @ 2018-10-10 14:31 ` Eli Zaretskii 0 siblings, 0 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-10-10 14:31 UTC (permalink / raw) To: Garreau, Alexandre; +Cc: aaronecay, schwab, emacs-devel > From: "Garreau\, Alexandre" <galex-713@galex-713.eu> > Date: Wed, 10 Oct 2018 13:18:09 +0200 > Cc: Andreas Schwab <schwab@linux-m68k.org>, emacs-devel@gnu.org > > (tried with é and é (btw these two display quite strangely slightly > differently…))… It depends on your fonts. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 19:18 ` Andreas Schwab 2018-10-10 9:39 ` Aaron Ecay @ 2018-10-10 15:18 ` Eli Zaretskii 2018-10-10 15:43 ` Drew Adams 2018-10-10 16:08 ` Yuri Khan 1 sibling, 2 replies; 98+ messages in thread From: Eli Zaretskii @ 2018-10-10 15:18 UTC (permalink / raw) To: Andreas Schwab; +Cc: eggert, emacs-devel, drew.adams, npostavs > From: Andreas Schwab <schwab@linux-m68k.org> > Date: Tue, 09 Oct 2018 21:18:51 +0200 > Cc: Eli Zaretskii <eliz@gnu.org>, npostavs@users.sourceforge.net, > drew.adams@oracle.com, emacs-devel@gnu.org > > I'm pretty sure you can find many Russian words that are written with > only Latin-alike letters.

I wrote the following toy program:

  (require 'seq)
  (let ((confusing '(?а ?е ?о ?р ?с ?у ?х))
        (buf (get-buffer-create "*confusing*")))
    ;; Scan the current buffer (one word per line) from point to the end.
    (while (not (eobp))
      (let* ((word (buffer-substring (line-beginning-position)
                                     (line-end-position)))
             (chars (append word nil)))
        ;; Keep the word if every character is in the look-alike list.
        (if (null (seq-difference chars confusing))
            (with-current-buffer buf
              (insert word ?\n))))
      (forward-line)))

and ran it on a list of 174800 words from a Russian dictionary. The result was 60 words that used only Latin-alike letters. So, not too many, but not just a few, either. Of course, there are many more non-word combinations of the above letters that might look like Latin words.

Also note that "confusability" sometimes depends on letter-case. For example, the lower-case "вор" doesn't look like a Latin word, but the upper-case "ВОР" does. ^ permalink raw reply [flat|nested] 98+ messages in thread
* RE: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-10 15:18 ` Eli Zaretskii @ 2018-10-10 15:43 ` Drew Adams 2018-10-10 16:08 ` Yuri Khan 1 sibling, 0 replies; 98+ messages in thread From: Drew Adams @ 2018-10-10 15:43 UTC (permalink / raw) To: Eli Zaretskii, Andreas Schwab; +Cc: eggert, emacs-devel, npostavs It sounds like the contexts where a char might be confused with another are varied and depend on things that can even include user attention and intention. If we want to help users be aware of character-confusion possibilities then I think whatever we offer them in this regard needs to be (1) optional and (2) configurable (granularity, specifying contexts/uses/conditions, etc.). I think we can offer to help by highlighting characters (or their surrounding contexts, e.g., when a char is tiny or otherwise unobtrusive or invisible). I think we should avoid raising errors, but that could be an option that some users might want to choose in some contexts. We could perhaps offer a range of help responses, from a range of highlighting possibilities to outright error-raising. We can have code that tries to be clever, but that should only be used if asked for by a user. We should not try to second-guess text or users by default. The last thing we should want is to bother users by default, or systematically, warning them left and right about possibilities of confusion. Such warnings or notifications or highlights need to be opt-in, IMHO. Above all, Emacs, and especially Emacs Lisp, should continue to be an environment where you can do what you want without obstruction or unnecessary hand-holding or helicopter-parenting. (Note that I qualified that with "unnecessary". If there is some real, strong, unambiguous danger that we can identify then of course we need to offer protection up front. That help would not be "unnecessary".) ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-10 15:18 ` Eli Zaretskii 2018-10-10 15:43 ` Drew Adams @ 2018-10-10 16:08 ` Yuri Khan 2018-10-15 20:30 ` Juri Linkov 1 sibling, 1 reply; 98+ messages in thread From: Yuri Khan @ 2018-10-10 16:08 UTC (permalink / raw) To: Eli Zaretskii Cc: Noam Postavsky, Paul Eggert, Andreas Schwab, Drew Adams, Emacs developers On Wed, Oct 10, 2018 at 10:20 PM Eli Zaretskii <eliz@gnu.org> wrote: > (let ((confusing '(?а ?е ?о ?р ?с ?у ?х)) > Also note that "confusability" sometimes depends on letter-case. For > example, the lower-case "вор" doesn't look like a Latin word, but the > upper-case "ВОР" does. Yes. In uppercase: ?А ?В ?Е ?К ?М ?Н ?О ?Р ?С ?Т ?Х. (Coincidentally, Russian car license plates use confusable letters exclusively, so that people who are not fluent in the Cyrillic script could still report violations.) Also, confusability depends on font style. In lowercase italic, these pairs are also confusable: д/g, з/z, и/u, п/n, т/m, ч/r. ^ permalink raw reply [flat|nested] 98+ messages in thread
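The case dependence discussed in the last two messages can be made concrete with the two letter lists they give. This is a rough sketch only: `my-lower-lookalikes`, `my-upper-lookalikes` and `my-looks-latin-p` are hypothetical names, the letters are written as escapes, and the italic pairs mentioned above are left out because they depend on the font style.

```elisp
(defvar my-lower-lookalikes
  '(?\u0430 ?\u0435 ?\u043e ?\u0440 ?\u0441 ?\u0443 ?\u0445)
  "Lowercase Cyrillic look-alikes, as listed in the toy program.")

(defvar my-upper-lookalikes
  '(?\u0410 ?\u0412 ?\u0415 ?\u041a ?\u041c ?\u041d
    ?\u041e ?\u0420 ?\u0421 ?\u0422 ?\u0425)
  "Uppercase Cyrillic look-alikes, as listed in the reply.")

(defun my-looks-latin-p (word)
  "Return non-nil if every character of WORD is in one of the two lists."
  (let ((result t))
    (dolist (ch (string-to-list word))
      (unless (or (memq ch my-lower-lookalikes)
                  (memq ch my-upper-lookalikes))
        (setq result nil)))
    result))

;; "\u0432\u043e\u0440" is the lowercase example word quoted above;
;; `upcase' yields its uppercase form.
(my-looks-latin-p "\u0432\u043e\u0440")
(my-looks-latin-p (upcase "\u0432\u043e\u0440"))
```

With these lists the lowercase word is not flagged, because its first letter has no lowercase look-alike, while the uppercase form is. Any check based on a fixed character list would therefore need separate entries per case.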
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-10 16:08 ` Yuri Khan @ 2018-10-15 20:30 ` Juri Linkov 0 siblings, 0 replies; 98+ messages in thread From: Juri Linkov @ 2018-10-15 20:30 UTC (permalink / raw) To: Yuri Khan Cc: Paul Eggert, Noam Postavsky, Emacs developers, Andreas Schwab, Eli Zaretskii, Drew Adams >> (let ((confusing '(?а ?е ?о ?р ?с ?у ?х)) > >> Also note that "confusability" sometimes depends on letter-case. For >> example, the lower-case "вор" doesn't look like a Latin word, but the >> upper-case "ВОР" does. > > Yes. In uppercase: ?А ?В ?Е ?К ?М ?Н ?О ?Р ?С ?Т ?Х. (Coincidentally, > Russian car license plates use confusable letters exclusively, so that > people who are not fluent in the Cyrillic script could still report > violations.) There are programs composed completely of confusable characters like http://compuhumour.narod.ru/listing/prog_tormoz.html ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 17:07 ` Paul Eggert 2018-10-09 19:18 ` Andreas Schwab @ 2018-10-10 3:58 ` Richard Stallman 1 sibling, 0 replies; 98+ messages in thread From: Richard Stallman @ 2018-10-10 3:58 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, npostavs, drew.adams, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Sure, that's fine. We can limit symbol warnings to the symbols containing > non-ASCII chars all of which are confusable with ASCII. This will warn about > "саn" (with Cyrillic "с" and "а") but not about "сталин" (with Cyrillic "a"). I agree, but we need to extend this protection to things other than program code. > The point of the guideline is not to warn about every possible confusable > character; it's to defend against malicious code. Confusables are dangerous in host names, too. The same principle could apply: if the host name contains, as well as the confusables, some non-confusable non-ASCII characters from the same Unicode page, there is no need to warn about it. Perhaps there are other cases, too. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-09 14:43 ` Noam Postavsky 2018-10-09 15:30 ` Paul Eggert @ 2018-10-10 3:57 ` Richard Stallman 2018-10-10 14:41 ` Eli Zaretskii 1 sibling, 1 reply; 98+ messages in thread From: Richard Stallman @ 2018-10-10 3:57 UTC (permalink / raw) To: Noam Postavsky; +Cc: eliz, eggert, drew.adams, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The problem is that deciding which characters are confusable and hence > require backslash escaping is based on a shifting mess of heuristics. > So I don't think it's workable to signal a hard error for this. Both > in terms of false positives which could mean possibly breaking code, > and false negatives which means we would be giving a false sense of > security. In principle, that's a valid point. But can't we assemble a fixed list of characters that are confusable with the usual fonts, and base the warning on that list? We could conceivably have a feature that would check any fontset for confusable characters, and warn the user if it has confusable pairs that are not in the usual list of confusable pairs. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-10 3:57 ` Richard Stallman @ 2018-10-10 14:41 ` Eli Zaretskii 2018-10-11 5:01 ` Richard Stallman 0 siblings, 1 reply; 98+ messages in thread From: Eli Zaretskii @ 2018-10-10 14:41 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel, drew.adams, npostavs > From: Richard Stallman <rms@gnu.org> > Cc: eggert@cs.ucla.edu, eliz@gnu.org, drew.adams@oracle.com, > emacs-devel@gnu.org > Date: Tue, 09 Oct 2018 23:57:01 -0400 > > We could conceivably have a feature that would check any fontset for > confusable characters I'm not an expert on fonts, but I don't think this is reliable enough. Are you saying that a font might use the same glyph for similarly looking characters from different scripts? If that is true, then yes, we could detect that. But the fact that we have such a font doesn't yet mean there is a problem worth warning the user, since these characters need to be _used_ in a certain context to cause confusion. ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-10 14:41 ` Eli Zaretskii @ 2018-10-11 5:01 ` Richard Stallman 0 siblings, 0 replies; 98+ messages in thread From: Richard Stallman @ 2018-10-11 5:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel, drew.adams, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I'm not an expert on fonts, but I don't think this is reliable enough. > Are you saying that a font might use the same glyph for similarly > looking characters from different scripts? If that is true, then yes, > we could detect that. But the fact that we have such a font doesn't > yet mean there is a problem worth warning the user, since these > characters need to be _used_ in a certain context to cause confusion. Which characters are confusable is one question, and which contexts they can cause confusion in is another question. We need to distinguish those, to factor the problem. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 98+ messages in thread
* eval-last-sexp / C-x C-e, and punctuation like `?’' [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?)] 2018-10-05 0:03 ` Noam Postavsky 2018-10-05 1:01 ` Paul Eggert @ 2018-10-06 15:40 ` Garreau, Alexandre 2018-10-16 12:48 ` Change of Lisp syntax for "fancy" quotes in Emacs 27? Garreau, Alexandre 2 siblings, 0 replies; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-06 15:40 UTC (permalink / raw) To: Emacs developers; +Cc: Drew Adams, Noam Postavsky On 2018-10-04 at 20:03, Noam Postavsky wrote: > On Fri, 2 Feb 2018 at 17:24, Noam Postavsky > <npostavs@users.sourceforge.net> wrote: >> >> In Emacs 26 and earlier the following is valid lisp code: >> >> (setq ’bar 42) >> (setq foo ’bar) I just noticed: in Emacs 25, evaluating `’bar' with `eval-last-sexp' / C-x C-e gives an error, as it ignores the ?’ and evaluates only `bar'. In the same way, if point is placed just after the ?’, it tries to evaluate “setq”… Maybe I do not know enough Elisp, but why is that? Are there other punctuation characters that trigger this behavior? And are they all okay for the reader to accept unescaped in symbols (except ? , ?\", ?\(, ?\), ?,, ?`, and maybe some other ASCII characters I forgot)? Why does `eval-last-sexp' treat this differently from the reader? ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Change of Lisp syntax for "fancy" quotes in Emacs 27? 2018-10-05 0:03 ` Noam Postavsky 2018-10-05 1:01 ` Paul Eggert 2018-10-06 15:40 ` eval-last-sexp / C-x C-e, and punctuation like `?’' [Was: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?)] Garreau, Alexandre @ 2018-10-16 12:48 ` Garreau, Alexandre 2 siblings, 0 replies; 98+ messages in thread From: Garreau, Alexandre @ 2018-10-16 12:48 UTC (permalink / raw) To: Noam Postavsky; +Cc: Drew Adams, Emacs developers On 2018-10-04 at 20:03, Noam Postavsky wrote: > On Fri, 2 Feb 2018 at 17:24, Noam Postavsky > <npostavs@users.sourceforge.net> wrote: >> >> In Emacs 26 and earlier the following is valid lisp code: >> >> (setq ’bar 42) >> (setq foo ’bar) >> >> In the current master branch, this will signal (invalid-read-syntax >> "strange quote" "’"). Btw, isn’t there any way to extend or redefine, at least locally, reader behavior such as that of “'”, “,”/“,@”, “`”, “.”, “:”, etc.? For instance, to experiment with such fancy and strange quotes in source code: people who really want to use them *might* want to use them as quotes, rather than as symbol constituents. Within ASCII, punctuation often cannot be part of a symbol without escaping (for instance “"#'(),;\[]`”), though there are many exceptions: “!”, “?”, “:” (though it can have special meaning), “.” (though this one doesn’t work alone), and other non-human-text “punctuation” (also called “special characters”) such as “%&*+/<>=@^_|”. The set that needs escaping is small, but the rule is not that simple. ^ permalink raw reply [flat|nested] 98+ messages in thread
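As a small illustration of the escaping discussed in this thread, the sketch below reads a symbol whose name starts with the curved quote U+2019, written with a backslash escape. It deliberately avoids the unescaped form, since how the reader treats that depends on the Emacs version; `my-read-symbol-name` is a hypothetical helper name.

```elisp
(defun my-read-symbol-name (text)
  "Read one Lisp object from TEXT and return its name if it is a symbol."
  (let ((object (read text)))
    (and (symbolp object) (symbol-name object))))

;; "\\\u2019bar" is the text: backslash, U+2019, b, a, r.
;; The backslash makes the curved quote an ordinary symbol constituent.
(my-read-symbol-name "\\\u2019bar")

;; `intern' takes the name directly, so no escaping is involved here.
(eq (read "\\\u2019bar") (intern "\u2019bar"))
```

Both forms refer to the same symbol, whose name begins with the curved quote. This is the backslash-escaped spelling that the first message of the thread describes as working in earlier Emacs versions too.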
Code repositories for project(s) associated with this public inbox: https://git.savannah.gnu.org/cgit/emacs.git