* Raw string literals in Emacs lisp. @ 2014-07-25 19:47 Matthew Plant 2014-07-25 19:56 ` Tassilo Horn ` (4 more replies) 0 siblings, 5 replies; 51+ messages in thread From: Matthew Plant @ 2014-07-25 19:47 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 842 bytes --] I think that raw string literals would be a really nice thing to add to Emacs lisp. The most immediate benefit is that writing regexps would be much easier. And since most of the work that goes into major modes is writing regexp, writing major modes would become a lot faster. Obviously it can't be done in any way that's really consistent with the language (it'd be super nice if ``string'' could be used, but alas). However, perhaps I have found a reasonable approach. What if we assume that any string surrounded immediately by parenthesis is a raw string literal? I'm pretty sure every instance of ("...") is currently illegal, and it would be almost certainly trivial to extend the Emacs' lexer/parser to support it. I can do it myself if everyone thinks this is a good idea. Please let me know what your thoughts are on this. -Matt [-- Attachment #2: Type: text/html, Size: 963 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-25 19:47 Raw string literals in Emacs lisp Matthew Plant @ 2014-07-25 19:56 ` Tassilo Horn 2014-07-25 20:06 ` Matthew Plant 2014-07-25 20:33 ` Tom Tromey ` (3 subsequent siblings) 4 siblings, 1 reply; 51+ messages in thread From: Tassilo Horn @ 2014-07-25 19:56 UTC (permalink / raw) To: Matthew Plant; +Cc: emacs-devel Matthew Plant <maplant2@illinois.edu> writes: Hi Matthew, > I think that raw string literals would be a really nice thing to add > to Emacs lisp. Yes, indeed. > What if we assume that any string surrounded immediately by > parenthesis is a raw string literal? I'm pretty sure every instance > of ("...") is currently illegal,... Nope, inside a `cond', ("default") is a short alternative for (t "default"). Bye, Tassilo ^ permalink raw reply [flat|nested] 51+ messages in thread
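To make the `cond' point concrete, here is a minimal sketch, assuming a stock GNU Emacs. A clause that holds only a condition returns that condition's value when it is non-nil, so ("default") is already meaningful code today. The function name `my-cond-demo' is made up for illustration.

```elisp
;; A cond clause with no body forms returns the value of its
;; condition when that value is non-nil.
(defun my-cond-demo (x)
  "Hypothetical helper: classify X, falling back to a default string."
  (cond ((integerp x) "integer")
        ((stringp x) "string")
        ("default")))            ; same effect as (t "default")

(my-cond-demo 42)    ; => "integer"
(my-cond-demo 'foo)  ; => "default"
```

Any reader change that gave ("...") a new meaning would therefore alter the behaviour of existing programs.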
* Re: Raw string literals in Emacs lisp. 2014-07-25 19:56 ` Tassilo Horn @ 2014-07-25 20:06 ` Matthew Plant 2014-07-25 20:15 ` Tassilo Horn 0 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-25 20:06 UTC (permalink / raw) To: Matthew Plant, emacs-devel [-- Attachment #1: Type: text/plain, Size: 714 bytes --] I would argue that is still workable, through various hacks. In the cond case if you wanted to specify a raw string literal you would do (("default")), which I think is still illegal. On Fri, Jul 25, 2014 at 12:56 PM, Tassilo Horn <tsdh@gnu.org> wrote: > Matthew Plant <maplant2@illinois.edu> writes: > > Hi Matthew, > > > I think that raw string literals would be a really nice thing to add > > to Emacs lisp. > > Yes, indeed. > > > What if we assume that any string surrounded immediately by > > parenthesis is a raw string literal? I'm pretty sure every instance > > of ("...") is currently illegal,... > > Nope, inside a `cond', ("default") is a short alternative for (t > "default"). > > Bye, > Tassilo > [-- Attachment #2: Type: text/html, Size: 1223 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-25 20:06 ` Matthew Plant @ 2014-07-25 20:15 ` Tassilo Horn 2014-07-25 20:24 ` Matthew Plant 0 siblings, 1 reply; 51+ messages in thread From: Tassilo Horn @ 2014-07-25 20:15 UTC (permalink / raw) To: Matthew Plant; +Cc: emacs-devel Matthew Plant <maplant2@illinois.edu> writes: > I would argue that is still workable, through various hacks. In the cond > case if you wanted to specify I raw string literal you would do > (("default")), which I think is still illegal. Yes, that's illegal. But why not #"foo" (like in Clojure regexps)? Or SXEmacs version of raw strings #r"foo"? To me, that reads much better than ("foo") and is much less ambiguous. Bye, Tassilo >> > What if we assume that any string surrounded immediately by >> > parenthesis is a raw string literal? I'm pretty sure every instance >> > of ("...") is currently illegal,... >> >> Nope, inside a `cond', ("default") is a short alternative for (t >> "default"). ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-25 20:15 ` Tassilo Horn @ 2014-07-25 20:24 ` Matthew Plant 0 siblings, 0 replies; 51+ messages in thread From: Matthew Plant @ 2014-07-25 20:24 UTC (permalink / raw) To: Matthew Plant, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1046 bytes --] I was under the impression that any ASCII character (with a few exceptions, but not including "#") could be used to define a variable. I see now that was a mistake. I also support #"foo", although #r"foo" seems unnecessarily verbose. On Fri, Jul 25, 2014 at 1:15 PM, Tassilo Horn <tsdh@gnu.org> wrote: > Matthew Plant <maplant2@illinois.edu> writes: > > > I would argue that is still workable, through various hacks. In the cond > > case if you wanted to specify a raw string literal you would do > > (("default")), which I think is still illegal. > > Yes, that's illegal. But why not #"foo" (like in Clojure regexps)? Or > SXEmacs version of raw strings #r"foo"? To me, that reads much better > than ("foo") and is much less ambiguous. > > Bye, > Tassilo > > >> > What if we assume that any string surrounded immediately by > >> > parenthesis is a raw string literal? I'm pretty sure every instance > >> > of ("...") is currently illegal,... > >> > >> Nope, inside a `cond', ("default") is a short alternative for (t > >> "default"). > [-- Attachment #2: Type: text/html, Size: 1692 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-25 19:47 Raw string literals in Emacs lisp Matthew Plant 2014-07-25 19:56 ` Tassilo Horn @ 2014-07-25 20:33 ` Tom Tromey 2014-07-25 21:40 ` Matthew Plant 2014-07-26 1:19 ` Stephen J. Turnbull ` (2 subsequent siblings) 4 siblings, 1 reply; 51+ messages in thread From: Tom Tromey @ 2014-07-25 20:33 UTC (permalink / raw) To: Matthew Plant; +Cc: emacs-devel Matthew> What if we assume that any string surrounded immediately by Matthew> parenthesis is a raw string literal? I'm pretty sure every Matthew> instance of ("...") is currently illegal, and it would be Matthew> almost certainly trivial to extend the Emacs' lexer/parser to Matthew> support it. I can do it myself if everyone thinks this is a Matthew> good idea. That kind of thing is valid in quoted contexts though. (defvar whatever '("hi")) FWIW there was a previous discussion about raw strings: http://comments.gmane.org/gmane.emacs.devel/152132 I think this killed the idea: http://permalink.gmane.org/gmane.emacs.devel/152155 Tom ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-25 20:33 ` Tom Tromey @ 2014-07-25 21:40 ` Matthew Plant 0 siblings, 0 replies; 51+ messages in thread From: Matthew Plant @ 2014-07-25 21:40 UTC (permalink / raw) To: Tom Tromey; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 2156 bytes --] Although I no longer think the idea is useful, quoted contexts would be very easy to detect and avoid. The impression I got from skimming that thread is that the arguments against raw string literals are: 1. they're only really used for regexps 2. it would be better to have a function that adds escapes to regexp strings, rather than make it easier to add them. 3. it would require more work than just updating the reader. This reasoning is solid, but I think it still falls short of justifying leaving the feature out. I can think of a lot of common cases which show number one isn't correct. I think the best case would be for doc strings. It would be a lot nicer to write \[func] each time instead of \\[func]. Number two I disagree with on the grounds that I just don't think it's the case. For example, I do not think escaped parens appear significantly more than non-escaped parens, especially when it comes to writing major modes. Additionally, because number one isn't really the case, this reasoning doesn't work out entirely either. Number three is just hogwash, because the "it would require more work than that" argument only works if the additional work is to make the current code handle old cases that have somehow become harder to handle. And this isn't the case. Sure, there will be some cases where old code cannot handle raw strings properly. But people can just file a bug report. -Matt On Fri, Jul 25, 2014 at 1:33 PM, Tom Tromey <tromey@redhat.com> wrote: > Matthew> What if we assume that any string surrounded immediately by > Matthew> parenthesis is a raw string literal? 
I'm pretty sure every > Matthew> instance of ("...") is currently illegal, and it would be > Matthew> almost certainly trivial to extend the Emacs' lexer/parser to > Matthew> support it. I can do it myself if everyone thinks this is a > Matthew> good idea. > > That kind of thing is valid in quoted contexts though. > > (defvar whatever '("hi")) > > FWIW there was a previous discussion about raw strings: > > http://comments.gmane.org/gmane.emacs.devel/152132 > > I think this killed the idea: > > http://permalink.gmane.org/gmane.emacs.devel/152155 > > Tom > [-- Attachment #2: Type: text/html, Size: 3016 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
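As a concrete illustration of the escaping cost being debated, here is a small sketch assuming current GNU Emacs string syntax (no raw strings). Every backslash meant for the regexp engine, or for docstring key substitution, has to be doubled in the source. `regexp-quote' is the existing function that adds regexp escapes to a literal string; the command name `my-doc-demo' is hypothetical.

```elisp
;; The source text "\\(" is read as the two-character string \(
(length "\\(")                              ; => 2

;; Grouping and alternation need doubled backslashes in source.
(string-match "\\(foo\\|bar\\)" "xbar")     ; => 1

;; regexp-quote escapes regexp-special characters in a literal string.
(regexp-quote "a.b*")                       ; => "a\\.b\\*"

;; Docstrings write \\[command] so that the reader leaves \[command]
;; in the string for later key substitution.
(defun my-doc-demo ()
  "Hypothetical command.  Save with \\[save-buffer]."
  (interactive))
```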
* Raw string literals in Emacs lisp. 2014-07-25 19:47 Raw string literals in Emacs lisp Matthew Plant 2014-07-25 19:56 ` Tassilo Horn 2014-07-25 20:33 ` Tom Tromey @ 2014-07-26 1:19 ` Stephen J. Turnbull 2014-07-26 5:28 ` Matthew Plant 2014-07-26 21:37 ` Thorsten Jolitz 2014-07-29 6:32 ` William Xu 4 siblings, 1 reply; 51+ messages in thread From: Stephen J. Turnbull @ 2014-07-26 1:19 UTC (permalink / raw) To: Matthew Plant; +Cc: emacs-devel Matthew Plant writes: > What if we assume that any string surrounded immediately by > parenthesis is a raw string literal? Please don't. SXEmacs and XEmacs have had rawstring literals for many years using the syntax #r"...". It may not be the best way to do this, but it's (Common) Lisp-y, the prefix notation is familiar from at least one popular non-Lisp language (Python), and it's about as short a notation as you can imagine (I don't recall why #"..." was out, though). Use of parens for this purpose is likely to have wide-ranging implications, as it means that they no longer have unambiguous semantics, but require lookahead to interpret. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-26 1:19 ` Stephen J. Turnbull @ 2014-07-26 5:28 ` Matthew Plant 2014-07-26 5:45 ` chad 0 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-26 5:28 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: emacs-devel@gnu.org [-- Attachment #1: Type: text/plain, Size: 915 bytes --] Yes, I agree. I suggested it due to a misinterpretation of Lisp's rules. I would still like to implement #"..." but I'm not sure it would be accepted. On Friday, July 25, 2014, Stephen J. Turnbull <stephen@xemacs.org> wrote: > Matthew Plant writes: > > > What if we assume that any string surrounded immediately by > > parenthesis is a raw string literal? > > Please don't. SXEmacs and XEmacs have had rawstring literals for many > years using the syntax #r"...". It may not be the best way to do > this, but it's (Common) Lisp-y, the prefix notation is familiar from > at least one popular non-Lisp language (Python), and it's about as > short a notation as you can imagine (I don't recall why #"..." was > out, though). > > Use of parens for this purpose is likely to have wide-ranging > implications, as it means that they no longer have unambiguous > semantics, but require lookahead to interpret. > > > > [-- Attachment #2: Type: text/html, Size: 1206 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-26 5:28 ` Matthew Plant @ 2014-07-26 5:45 ` chad 2014-07-26 19:39 ` Matthew Plant 0 siblings, 1 reply; 51+ messages in thread From: chad @ 2014-07-26 5:45 UTC (permalink / raw) To: Matthew Plant; +Cc: emacs-devel@gnu.org It might be helpful to canvass the use of #r"string" in [S]XEmacs and see if anything especially nifty shows up. I think Stefan's reservations mostly come from a feeling that the obvious problem has a better solution elsewhere, but they have some actual experience which might shed a different light on the topic. ~Chad ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-26 5:45 ` chad @ 2014-07-26 19:39 ` Matthew Plant 2014-07-27 12:27 ` Stephen J. Turnbull 0 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-26 19:39 UTC (permalink / raw) To: chad; +Cc: emacs-devel@gnu.org [-- Attachment #1: Type: text/plain, Size: 1267 bytes --] I did a grep on the latest xemacs code base I could find. Funnily enough, almost half of the instances of #r appeared in test cases. Not in facilitating them, I might add; they were the test case. All of the non-test cases were regexps. Although this data is convincing in some respects, I would like to note that xemacs is dead. The download off their main page did not even have any raw string literals. I will still contend that it is a useful feature to have. The cost of adding it (very minimal) is outweighed by the benefit of having it. And why not? Emacs is also a language, unfortunately. We could all switch to guile and be done with it, but it appears the consensus is that elisp is finely tuned to do text processing. Elisp is a text processing language, and it should have as many features to facilitate the processing of text as possible, this included. -Matt On Friday, July 25, 2014, chad <yandros@gmail.com> wrote: > It might be helpful to canvass the use of #r"string" in [S]XEmacs > and see if anything especially nifty shows up. I think Stefan's > reservations mostly come from a feeling that the obvious problem > has a better solution elsewhere, but they have some actual experience > which might shed a different light on the topic. > > ~Chad > [-- Attachment #2: Type: text/html, Size: 1565 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-26 19:39 ` Matthew Plant @ 2014-07-27 12:27 ` Stephen J. Turnbull 2014-07-27 13:03 ` David Kastrup 0 siblings, 1 reply; 51+ messages in thread From: Stephen J. Turnbull @ 2014-07-27 12:27 UTC (permalink / raw) To: Matthew Plant; +Cc: chad, emacs-devel@gnu.org Matthew Plant writes: > Although this data is convincing in some respects, I would like to note > that xemacs is dead. The reports of the death of XEmacs are premature. > The download off their main page did not even have any > raw string literals. XEmacs 21.4 will never have them. Of course almost all of the uses of raw strings are for regexps. Most non-regexp strings don't use string escapes, except for the occasional TAB or LF. Format strings use an alternative operator character %, so don't have the problem of string escape colliding with the operator character. Sure, you can do a lot for readability as PCRE or Python regexps have done, but regexps are unreadable almost by design, and those regexp syntaxes benefit from rawstrings, too. Almost anything (that doesn't involve changing the meaning of existing legal programs) that improves readability of regexps is worthwhile. Rawstrings are cheap and effective. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-27 12:27 ` Stephen J. Turnbull @ 2014-07-27 13:03 ` David Kastrup 2014-07-27 20:58 ` David Caldwell 2014-07-28 1:29 ` Stephen J. Turnbull 0 siblings, 2 replies; 51+ messages in thread From: David Kastrup @ 2014-07-27 13:03 UTC (permalink / raw) To: emacs-devel

"Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp> writes:

> Matthew Plant writes:
>
> > Although this data is convincing in some respects, I would like to note
> > that xemacs is dead.
>
> The reports of the death of XEmacs are premature.
>
> > The download off their main page did not even have any
> > raw string literals.
>
> XEmacs 21.4 will never have them.

Drawing the following excerpts from
<URL:http://www.xemacs.org/Releases/index.html>:

Arguably, having the last "stable release" made in 2009, having no
"Gamma release" described as

    Note: XEmacs 21.4 has been promoted to stable, and there currently
    is no gamma series. Plans for the next release are in the works.

    The gamma series of releases is satisfactorily stable for most
    sophisticated users. Most Linux or *BSD users should get the best
    results from the gamma series, and we strongly recommend it to the
    ``tester'' distributions like NetBSD current, Debian sid, Mandrake
    Cooker, Red Hat Rawhide, and so on. XEmacs will be ready when they
    are!

    The gamma series of releases is the candidate for promotion to a
    stable series. Although we do not promote the code base to gamma
    while there are known critical bugs in the code base, to attempt
    to meet schedules we also do promote fairly quickly once we've
    fixed the last known critical bug. Everybody does this, and
    everybody knows that despite the best efforts of the developers,
    ``point oh'' releases typically still have bugs in them. The gamma
    concept simply acknowledges this.

at all slated to become stable, and having the current "Beta release"
branch 21.5 started in 2001 with the description

    The beta series of releases is for testers. Users should read the
    XEmacs Beta mailing list, <xemacs-beta@xemacs.org>. Users should
    prepare themselves for crashes, data loss, freezes, and other
    unpleasant events. The beta series contains much experimental
    code, and fairly large changes may be introduced directly into the
    code base. These are announced as they happen on xemacs-beta.
    Wannabe developers may also want to follow the XEmacs Patches
    <xemacs-patches@xemacs.org> and XEmacs CVS Commits
    <xemacs-cvs@xemacs.org> mailing lists for up-to-the-minute details
    about the state of the code base.

is making Debian look like a fast-paced project. Reports of XEmacs
being dead may be exaggerated, but it does look a lot like suspended
animation.

> Sure, you can do a lot for readability as PCRE or Python regexps have
> done, but regexps are unreadable almost by design, and those regexp
> syntaxes benefit from rawstrings, too. Almost anything (that doesn't
> involve changing the meaning of existing legal programs) that improves
> readability of regexps is worthwhile.
>
> Rawstrings are cheap and effective.

When rawstrings are supported, it becomes more expedient to recognize
things like \n and \t, probably also \f in regexps (\b is already
taken). At the current point of time, they just evaluate to n and t.
That makes input of tabs and newlines in raw strings a nuisance and a
potential source of errors.

It's not actually an issue with rawstrings as such, but rather of their
use within regexps.

-- 
David Kastrup

^ permalink raw reply [flat|nested] 51+ messages in thread
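The remark about \t and \n can be checked with a short sketch, assuming the current GNU Emacs regexp engine. In an ordinary string the Lisp reader converts \t to a TAB before the regexp engine sees it; with a doubled backslash (which is what a raw string would deliver) the engine receives backslash-t and treats it as a plain "t".

```elisp
;; Ordinary string: the reader turns \t into a TAB character, so the
;; regexp contains a real TAB and matches a real TAB.
(string-match "a\tb" "a\tb")     ; => 0

;; Doubled backslash: the regexp engine receives backslash-t, which it
;; treats as a literal "t", not as a TAB.
(string-match "a\\tb" "atb")     ; => 0
(string-match "a\\tb" "a\tb")    ; => nil
```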
* Re: Raw string literals in Emacs lisp. 2014-07-27 13:03 ` David Kastrup @ 2014-07-27 20:58 ` David Caldwell 2014-07-27 23:17 ` Matthew Plant ` (2 more replies) 2014-07-28 1:29 ` Stephen J. Turnbull 1 sibling, 3 replies; 51+ messages in thread From: David Caldwell @ 2014-07-27 20:58 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 2340 bytes --] On 7/27/14 6:03 AM, David Kastrup wrote: > "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp> writes: > >> Sure, you can do a lot for readability as PCRE or Python regexps have >> done, but regexps are unreadable almost by design, and those regexp >> syntaxes benefit from rawstrings, too. Almost anything (that doesn't >> involve changing the meaning of existing legal programs) that improves >> readability of regexps is worthwhile. >> >> Rawstrings are cheap and effective. > > When rawstrings are supported, it becomes more expedient to recognize > things like \n and \t, probably also \f in regexps (\b is already > taken). At the current point of time, they just evaluate to n and t. > That makes input of tabs and newlines in raw strings a nuisance and a > potential source of errors. > > It's not actually an issue with rawstrings as such, but rather of their > use within regexps. Why not, then, skip rawstrings completely and go directly to a regular expression reader: #r// (or even just #//) instead of #r""? Then you can add whatever semantics are needed for good regexp reading (ie, let '\n', '\t', and others get escaped in the string reading, but allow '\(' to go through unescaped). This will be just as easy to implement as raw strings. Languages like Javascript, Perl, Ruby, Bash, and Groovy have shown that having a special support for regexps at a language level is a very effective way of dealing with them. Plus it opens the door to extensions: #r//p for PCRE/Perl syntax[1] or #r//x for more readable regexps[2], etc. I think using rawstrings is too generic an answer to the problem. 
Given that so much of Emacs's functionality is reliant on regular expressions, it makes sense to design something specifically for them. Doing that means they can be tailored and tweaked for maximum functionality without worrying about possible other usages that people might come up with (which will undoubtedly happen with rawstrings). -David [1] And practically every other language on the planet. Really, it seems like only Emacs is left in the dark ages of basic POSIX regexps where '(' means literal paren and not matching. [2] Another Perl feature, it allows whitespace and comments in regexps, for much improved readability. See http://perldoc.perl.org/perlre.html#/x [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4219 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
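Footnote [1] can be illustrated with a short sketch, assuming the current GNU Emacs regexp syntax: an unescaped paren matches a literal paren character, while grouping needs a backslash (doubled in source, since there are no raw strings).

```elisp
;; Unescaped parens match literal parenthesis characters.
(string-match "(a)" "x(a)")                 ; => 1

;; Escaped parens form a capture group instead.
(let ((s "x(a)"))
  (when (string-match "\\(a\\)" s)
    (match-string 1 s)))                    ; => "a"
```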
* Re: Raw string literals in Emacs lisp. 2014-07-27 20:58 ` David Caldwell @ 2014-07-27 23:17 ` Matthew Plant 2014-07-28 18:27 ` Richard Stallman 2014-07-28 2:16 ` Stephen J. Turnbull 2014-07-30 20:28 ` Ted Zlatanov 2 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-27 23:17 UTC (permalink / raw) To: David Caldwell; +Cc: emacs-devel@gnu.org [-- Attachment #1: Type: text/plain, Size: 2740 bytes --] I think this is a very good idea. However, agreeing upon which semantics are needed may prove problematic. Do you have any suggestions on this point? The easiest method would probably be to adopt some existing set of rules, like Perl's (but definitely not Perl's). -Matt On Sunday, July 27, 2014, David Caldwell <david@porkrind.org> wrote: > On 7/27/14 6:03 AM, David Kastrup wrote: > > "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp> writes: > > > >> Sure, you can do a lot for readability as PCRE or Python regexps have > >> done, but regexps are unreadable almost by design, and those regexp > >> syntaxes benefit from rawstrings, too. Almost anything (that doesn't > >> involve changing the meaning of existing legal programs) that improves > >> readability of regexps is worthwhile. > >> > >> Rawstrings are cheap and effective. > > > > When rawstrings are supported, it becomes more expedient to recognize > > things like \n and \t, probably also \f in regexps (\b is already > > taken). At the current point of time, they just evaluate to n and t. > > That makes input of tabs and newlines in raw strings a nuisance and a > > potential source of errors. > > > > It's not actually an issue with rawstrings as such, but rather of their > > use within regexps. > > Why not, then, skip rawstrings completely and go directly to a regular > expression reader: #r// (or even just #//) instead of #r""? 
> > Then you can add whatever semantics are needed for good regexp reading > (ie, let '\n', '\t', and others get escaped in the string reading, but > allow '\(' to go through unescaped). This will be just as easy to > implement as raw strings. > > Languages like Javascript, Perl, Ruby, Bash, and Groovy have shown that > having a special support for regexps at a language level is a very > effective way of dealing with them. Plus it opens the door to > extensions: #r//p for PCRE/Perl syntax[1] or #r//x for more readable > regexps[2], etc. > > I think using rawstrings is too generic an answer to the problem. Given > that so much of Emacs's functionality is reliant on regular expressions, > it makes sense to design something specifically for them. Doing that > means they can be tailored and tweaked for maximum functionality without > worrying about possible other usages that people might come up with (which > will undoubtedly happen with rawstrings). > > -David > > [1] And practically every other language on the planet. Really, it seems > like only Emacs is left in the dark ages of basic POSIX regexps where > '(' means literal paren and not matching. > > [2] Another Perl feature, it allows whitespace and comments in regexps, > for much improved readability. See http://perldoc.perl.org/perlre.html#/x > > [-- Attachment #2: Type: text/html, Size: 3426 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-27 23:17 ` Matthew Plant @ 2014-07-28 18:27 ` Richard Stallman 2014-07-28 19:32 ` Matthew Plant 0 siblings, 1 reply; 51+ messages in thread From: Richard Stallman @ 2014-07-28 18:27 UTC (permalink / raw) To: Matthew Plant; +Cc: emacs-devel, david [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] If we introduce a new syntax for regexps, we need to make the sexp parsing code handle it both forwards and backwards. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-28 18:27 ` Richard Stallman @ 2014-07-28 19:32 ` Matthew Plant 2014-07-29 19:15 ` Richard Stallman 0 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-28 19:32 UTC (permalink / raw) To: rms; +Cc: emacs-devel, David Caldwell [-- Attachment #1: Type: text/plain, Size: 860 bytes --] > > If we introduce a new syntax for regexps, we need to make the > sexp parsing code handle it both forwards and backwards. > I don't think this is especially difficult, but it is important to note. On Mon, Jul 28, 2014 at 11:27 AM, Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > If we introduce a new syntax for regexps, we need to make the > sexp parsing code handle it both forwards and backwards. > > -- > Dr Richard Stallman > President, Free Software Foundation > 51 Franklin St > Boston MA 02110 > USA > www.fsf.org www.gnu.org > Skype: No way! That's nonfree (freedom-denying) software. > Use Ekiga or an ordinary phone call. > > [-- Attachment #2: Type: text/html, Size: 1617 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-28 19:32 ` Matthew Plant @ 2014-07-29 19:15 ` Richard Stallman 2014-07-30 0:26 ` Matthew Plant 0 siblings, 1 reply; 51+ messages in thread From: Richard Stallman @ 2014-07-29 19:15 UTC (permalink / raw) To: Matthew Plant; +Cc: emacs-devel, david [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > If we introduce a new syntax for regexps, we need to make the > sexp parsing code handle it both forwards and backwards. > I don't think this is especially difficult, but it is important to note. The point is we would need to design the syntax to make it possible. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-29 19:15 ` Richard Stallman @ 2014-07-30 0:26 ` Matthew Plant 2014-07-30 4:28 ` Richard Stallman 0 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-30 0:26 UTC (permalink / raw) To: rms; +Cc: emacs-devel@gnu.org, David Caldwell [-- Attachment #1: Type: text/plain, Size: 817 bytes --] If it can be parsed forwards it can be parsed backwards, although it might not be immediately possible to do so. The parser might have to check to see if it needs to re-parse a section, but I don't think there's any syntax we could introduce that is impossible to parse backwards. It's a difficulty thing. However, a regex syntax would be much more difficult to parse backwards; I think this is a convincing enough argument that only simple raw strings should be implemented. Pretty much every modern language that has regex has raw string literals. Heck, when regex was added to the C++ standard, raw string literals were added in the same spec. If raw string literals were added, should they allow custom delimiters? This would probably make the strings just as hard to parse backwards as regexps, so I say no. [-- Attachment #2: Type: text/html, Size: 943 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 0:26 ` Matthew Plant @ 2014-07-30 4:28 ` Richard Stallman 2014-07-30 18:54 ` Matthew Plant 0 siblings, 1 reply; 51+ messages in thread From: Richard Stallman @ 2014-07-30 4:28 UTC (permalink / raw) To: Matthew Plant; +Cc: david, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

    If it can be parsed forwards it can be parsed backwards,

Perhaps you mean it is possible in some theoretical sense.
That's not the issue here. The Emacs sexp-scanning functions
scan backwards in a simple way, and the syntax has to be
suitable for them to handle.

    although it might not be immediately possible to do so.

We need the backward scanning to work when the syntax is installed.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 51+ messages in thread
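For readers unfamiliar with the backward scanning in question, here is a minimal sketch, assuming a stock GNU Emacs where `emacs-lisp-mode-syntax-table' is available. It shows `scan-sexps' moving backward over an ordinary string that contains an escaped quote. The helper name `my-scan-back-over' is hypothetical, and the sketch says nothing about how any proposed raw-string syntax would behave; it only shows what the existing scanner already has to get right.

```elisp
(defun my-scan-back-over (text)
  "Hypothetical helper: insert TEXT, scan one sexp backward from the end.
Return the buffer position where that sexp starts."
  (with-temp-buffer
    (with-syntax-table emacs-lisp-mode-syntax-table
      (insert text)
      (scan-sexps (point-max) -1))))

;; The buffer holds the six characters "a\"b".  Scanning backward, the
;; scanner has to recognise that the inner quote is escaped and keep
;; going until it reaches the opening quote at position 1.
(my-scan-back-over "\"a\\\"b\"")   ; => 1
```

A new read syntax whose backslashes do not escape anything would need a rule that this kind of backward scan can apply without first parsing from the start of the buffer.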
* Re: Raw string literals in Emacs lisp. 2014-07-30 4:28 ` Richard Stallman @ 2014-07-30 18:54 ` Matthew Plant 0 siblings, 0 replies; 51+ messages in thread From: Matthew Plant @ 2014-07-30 18:54 UTC (permalink / raw) To: rms; +Cc: David Caldwell, emacs-devel@gnu.org [-- Attachment #1: Type: text/plain, Size: 1152 bytes --] Indeed I do not think that backward scanning will be possible with any kind of string that has different delimiters than double quotes. On Tue, Jul 29, 2014 at 9:28 PM, Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > If it can be parsed forwards it can be parsed backwards, > > Perhaps you mean it is possible in some theoretical sense. > That's not the issue here. The Emacs sexp-scanning functions > scan backwards in a simple way, and the syntax has to be > suitable for them to handle. > > although it > might > not be immediately possible to do so. > > We need the backward scanning to work when the syntax is installed. > > -- > Dr Richard Stallman > President, Free Software Foundation > 51 Franklin St > Boston MA 02110 > USA > www.fsf.org www.gnu.org > Skype: No way! That's nonfree (freedom-denying) software. > Use Ekiga or an ordinary phone call. > > [-- Attachment #2: Type: text/html, Size: 1788 bytes --] ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-27 20:58 ` David Caldwell 2014-07-27 23:17 ` Matthew Plant @ 2014-07-28 2:16 ` Stephen J. Turnbull 2014-07-28 7:43 ` Andreas Schwab 2014-07-30 20:28 ` Ted Zlatanov 2 siblings, 1 reply; 51+ messages in thread From: Stephen J. Turnbull @ 2014-07-28 2:16 UTC (permalink / raw) To: David Caldwell; +Cc: emacs-devel

David Caldwell writes:

> Why not, then, skip rawstrings completely and go directly to a regular
> expression reader: #r// (or even just #//) instead of #r""?

It's unlispy.  Regular expressions *are* strings and can be manipulated
as strings; (almost) any string can be used as a regular expression.
Therefore (in Lisp) we normally define separate functions to deal with
"string" use cases and "regexp" use cases for the same object.  And
they mix and match well:

(defvar xft-xlfd-font-regexp
  (concat
   ;; XLFD specifies ISO 8859-1 encoding, but we can't handle non-ASCII
   ;; in Mule when this function is called.  So use HPC.
   ;; (xe_xlfd_prefix "\\(\\+[\040-\176\240-\377]*\\)?-")
   ;; (xe_xlfd_opt_text "\\([\040-\044\046-\176\240-\377]*\\)")
   ;; (xe_xlfd_text "\\([\040-\044\046-\176\240-\377]+\\)")
   "\\`"
   "\\(\\+[\040-\176]*\\)?-"                ; prefix
   "\\([^-]+\\)"                            ; foundry
   "-"
   "\\([^-]+\\)"                            ; family
   "-"
   "\\([^-]+\\)"                            ; weight
   "-"
   "\\([0-9ior?*][iot]?\\)"                 ; slant
   "-"
   "\\([^-]+\\)"                            ; swidth
   "-"
   "\\([^-]*\\)"                            ; adstyle
   "-"
   "\\([0-9?*]+\\|\\[[ 0-9+~.e?*]+\\]\\)"   ; pixelsize
   "-"
   "\\([0-9?*]+\\|\\[[ 0-9+~.e?*]+\\]\\)"   ; pointsize
   "-"
   "\\([0-9?*]+\\)"                         ; resx
   "-"
   "\\([0-9?*]+\\)"                         ; resy
   "-"
   "\\([cmp?*]\\)"                          ; spacing
   "-"
   "~?"                                     ; avgwidth
   "\\([0-9?*]+\\)"
   "-"
   "\\([^-]+\\)"                            ; registry
   "-"
   "\\([^-]+\\)"                            ; encoding
   "\\'")
  "The regular expression used to match XLFD font names.")

Of course that would be more readable with rawstrings (not used because
this code is shared with XEmacs 21.4), and even more readable with PCRE,
but it shows we don't really need /x to build regexps readably.  If
#r"..." generated something other than strings, you'd have to write code
to deal with issues like building regexps using concat.  I think format
would be a huge can of worms.

> This will be just as easy to implement as raw strings.

No, it won't.  Raw strings are just a different read syntax for strings,
and have exactly the same internal representation.  At present we don't
have a regular expression type (although we do have a compiled regular
expression type internally).

If you're not proposing to define a regular expression type (good luck
getting that past RMS!), then you're just proposing a rawstring syntax
tuned for regexp use.  But there's no reason that couldn't be used for
other purposes.  For example, some people (Python programmers) would
probably appreciate a #r"..."/x rawstring syntax that automatically
dedents -- for use in docstrings.

> Languages like Javascript, Perl, Ruby, Bash, and Groovy have shown that
> having a special support for regexps at a language level is a very
> effective way of dealing with them.

Lisp is not those languages, and in fact it is very unlike those
languages.

> Plus it opens the door to extensions: #r//p for PCRE/Perl syntax[1]
> or #r//x for more readable regexps[2], etc.

(defun emacsify-pcre (s)
  "Convert a PCRE to Emacs notation, properly ;-) ignoring unknown backslash."
  ;; exercise for the reader
  )

or

(require 'pcre)    ; SXEmacs may have implemented this.
(let ((cre (pcre-compile "...")))
  (while (pcre-search-forward cre)
    (do-something)))

and as shown above /x isn't really necessary.

Like it or not, that's the way these things are done in the Emacs Lisp
world.  If you don't like it, there are languages like Javascript, Perl,
Ruby, Bash, and Groovy.  (Python is too much like Lisp for you, I
suspect. ;-)

> I think using rawstrings is too generic an answer to the problem.

I think using rawstrings is the only sane answer to the problem.  You
can call them "regular expressions" as suggested by the #r notation and
their most prominent application, but in Emacs Lisp representing them
internally as a type other than string would be way too much work given
the idioms we have for constructing regexps that would need to be
reimplemented.  Given that internally they are (Just String), why
specialize to regular expressions?  Would you error on #r/*.*/, which is
invalid syntax for a regular expression?

> [1] And practically every other language on the planet. Really, it seems
> like only Emacs is left in the dark ages of basic POSIX regexps where
> '(' means literal paren and not matching.

Sure, but that's a different problem easily solved if anyone wants to do
it.  GNU grep shows how: use egrep.  (POSIX grep with its default to
basic REs and an argument -E to indicate modern syntax is a bad example
for Lisp, I think.)  The analog for Emacs is a suite of "pcre-"
functions.

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-28 2:16 ` Stephen J. Turnbull @ 2014-07-28 7:43 ` Andreas Schwab 0 siblings, 0 replies; 51+ messages in thread From: Andreas Schwab @ 2014-07-28 7:43 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: emacs-devel, David Caldwell There is also the rx macro. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
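[Editorial sketch, not part of the original message.] The `rx` macro mentioned above lets a regexp be written as an s-expression, which avoids doubled backslashes without any new read syntax. The fragment below covers only the foundry and family fields of an XLFD font name; the variable name `my/xlfd-prefix-re` is hypothetical, and the pattern is a simplified illustration rather than a replacement for the full regexp quoted earlier in the thread.

```elisp
;; Hypothetical variable: the start of an XLFD name, i.e. a leading "-"
;; followed by the foundry and family fields, each captured as a run of
;; non-hyphen characters.
(defvar my/xlfd-prefix-re
  (rx string-start
      "-"
      (group (one-or-more (not (any "-"))))   ; foundry
      "-"
      (group (one-or-more (not (any "-")))))  ; family
  "Regexp matching the foundry and family fields of an XLFD font name.")

;; `rx' expands to an ordinary string, so the result works with the
;; usual string-based regexp functions and can be passed to `concat'.
(let ((name "-adobe-courier-medium-r-normal--12-120-75-75-m-70-iso8859-1"))
  (when (string-match my/xlfd-prefix-re name)
    (list (match-string 1 name) (match-string 2 name))))
;; => ("adobe" "courier")
```

Because the value is a plain string, this approach stays compatible with the concat-based idiom shown in Stephen's message.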
* Re: Raw string literals in Emacs lisp. 2014-07-27 20:58 ` David Caldwell 2014-07-27 23:17 ` Matthew Plant 2014-07-28 2:16 ` Stephen J. Turnbull @ 2014-07-30 20:28 ` Ted Zlatanov 2014-07-30 20:41 ` David Caldwell 2 siblings, 1 reply; 51+ messages in thread From: Ted Zlatanov @ 2014-07-30 20:28 UTC (permalink / raw) To: emacs-devel On Sun, 27 Jul 2014 13:58:37 -0700 David Caldwell <david@porkrind.org> wrote: DC> Why not, then, skip rawstrings completely and go directly to a regular DC> expression reader: #r// (or even just #//) instead of #r""? For shell commands, for instance, it would be convenient to have rawstrings because they often have internal backslash escapes. Ted ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 20:28 ` Ted Zlatanov @ 2014-07-30 20:41 ` David Caldwell 2014-07-30 20:54 ` Ted Zlatanov 0 siblings, 1 reply; 51+ messages in thread From: David Caldwell @ 2014-07-30 20:41 UTC (permalink / raw) To: emacs-devel On 7/30/14 1:28 PM, Ted Zlatanov wrote: > On Sun, 27 Jul 2014 13:58:37 -0700 David Caldwell <david@porkrind.org> wrote: > > DC> Why not, then, skip rawstrings completely and go directly to a regular > DC> expression reader: #r// (or even just #//) instead of #r""? > > For shell commands, for instance, it would be convenient to have > rawstrings because they often have internal backslash escapes. That's precisely the point I made later in my email: rawstrings used in shell don't want things like \n escaped, but regexps do (otherwise you have to add "\n" literal support to the regexp engine). There's 2 usages with competing semantics trying to use one generic interface. I still posit that having a syntax directly for regexps would be beneficial. And I think focusing on regexps is more important in Emacs as it happens more than complicated shell commands. Sadly it sounds like the #r// would be a no-go due to the Emacs requirements of parsing it in reverse (I assume because '/' is a valid lisp symbol character). -David ^ permalink raw reply [flat|nested] 51+ messages in thread
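[Editorial sketch, not part of the original message.] The competing semantics described above can be seen with today's reader. In an ordinary string the reader turns \n into a newline before the regexp engine ever sees it; a raw string would instead hand the engine a backslash followed by the letter n. The second pattern below stands in for what such a raw literal would contain. The behaviour shown assumes the current Emacs regexp engine, which, as far as I know, treats an unrecognised backslash sequence as the following character taken literally.

```elisp
;; "a\nb" as read today: the reader converts \n into a newline, so the
;; regexp contains a real newline character and matches one.
(string-match "a\nb" "a\nb")     ; => 0

;; "a\\nb" is the four-character pattern a\nb (a, backslash, n, b),
;; which is what a raw literal spelled a\nb would pass to the engine.
;; The engine does not interpret \n as a newline escape.
(string-match "a\\nb" "a\nb")    ; => nil
(string-match "a\\nb" "anb")     ; => 0
```

So a raw regexp literal would need either engine support for \n or some reader-level exception, whereas a raw shell-command literal wants the backslash left alone.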
* Re: Raw string literals in Emacs lisp. 2014-07-30 20:41 ` David Caldwell @ 2014-07-30 20:54 ` Ted Zlatanov 2014-07-30 21:01 ` Matthew Plant ` (2 more replies) 0 siblings, 3 replies; 51+ messages in thread From: Ted Zlatanov @ 2014-07-30 20:54 UTC (permalink / raw) To: emacs-devel On Wed, 30 Jul 2014 13:41:19 -0700 David Caldwell <david@porkrind.org> wrote: DC> On 7/30/14 1:28 PM, Ted Zlatanov wrote: >> On Sun, 27 Jul 2014 13:58:37 -0700 David Caldwell <david@porkrind.org> wrote: >> DC> Why not, then, skip rawstrings completely and go directly to a regular DC> expression reader: #r// (or even just #//) instead of #r""? >> >> For shell commands, for instance, it would be convenient to have >> rawstrings because they often have internal backslash escapes. DC> That's precisely the point I made later in my email Sorry I didn't see it. DC> rawstrings used in shell don't want things like \n escaped, but DC> regexps do (otherwise you have to add "\n" literal support to the DC> regexp engine). There's 2 usages with competing semantics trying to DC> use one generic interface. I still posit that having a syntax DC> directly for regexps would be beneficial. And I think focusing on DC> regexps is more important in Emacs as it happens more than DC> complicated shell commands. Heredocs are generally useful and popular and would also be supported by this syntax. But please don't take that as a knock against regexp literal support, it's just not something I have needed. DC> Sadly it sounds like the #r// would be a no-go due to the Emacs DC> requirements of parsing it in reverse (I assume because '/' is a valid DC> lisp symbol character). I have no opinion on that, I just want a simple syntax for literal data :) How about using a Unicode character as the marker? (prepares for stoning) Ted ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 20:54 ` Ted Zlatanov @ 2014-07-30 21:01 ` Matthew Plant 2014-07-30 21:16 ` Ted Zlatanov 2014-08-02 8:47 ` Alan Mackenzie 2014-08-02 9:17 ` Andreas Schwab 2 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-30 21:01 UTC (permalink / raw) To: emacs-devel@gnu.org > How about using a Unicode character as the marker? (prepares for stoning) I'm on the fence about this idea. It certainly would make parsing in reverse possible (assuming the reverse parsing functions do not operate on char *) and easy, but it would also possibly mess up formatting. It also might not add much convenience because frankly typing non-ASCII characters is _hard_. One suggestion would be to use the unicode left and right double/single quotation marks. On Wed, Jul 30, 2014 at 1:54 PM, Ted Zlatanov <tzz@lifelogs.com> wrote: > On Wed, 30 Jul 2014 13:41:19 -0700 David Caldwell <david@porkrind.org> > wrote: > > DC> On 7/30/14 1:28 PM, Ted Zlatanov wrote: > >> On Sun, 27 Jul 2014 13:58:37 -0700 David Caldwell <david@porkrind.org> > wrote: > >> > DC> Why not, then, skip rawstrings completely and go directly to a regular > DC> expression reader: #r// (or even just #//) instead of #r""? > >> > >> For shell commands, for instance, it would be convenient to have > >> rawstrings because they often have internal backslash escapes. > > DC> That's precisely the point I made later in my email > > Sorry I didn't see it. > > DC> rawstrings used in shell don't want things like \n escaped, but > DC> regexps do (otherwise you have to add "\n" literal support to the > DC> regexp engine). There's 2 usages with competing semantics trying to > DC> use one generic interface. I still posit that having a syntax > DC> directly for regexps would be beneficial. And I think focusing on > DC> regexps is more important in Emacs as it happens more than > DC> complicated shell commands.
> > Heredocs are generally useful and popular and would also be supported by > this syntax. But please don't take that as a knock against regexp > literal support, it's just not something I have needed. > > DC> Sadly it sounds like the #r// would be a no-go due to the Emacs > DC> requirements of parsing it in reverse (I assume because '/' is a valid > DC> lisp symbol character). > > I have no opinion on that, I just want a simple syntax for literal data :) > > How about using a Unicode character as the marker? (prepares for stoning) > > Ted ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 21:01 ` Matthew Plant @ 2014-07-30 21:16 ` Ted Zlatanov 2014-07-30 21:19 ` Matthew Plant 0 siblings, 1 reply; 51+ messages in thread From: Ted Zlatanov @ 2014-07-30 21:16 UTC (permalink / raw) To: emacs-devel On Wed, 30 Jul 2014 14:01:52 -0700 Matthew Plant <maplant2@illinois.edu> wrote: >> How about using a Unicode character as the marker? (prepares for stoning) MP> I'm on the fence about this idea. It certainly would make parsing in MP> reverse possible (assuming the reverse parsing functions do not operate on MP> char *) and easy, but it would also possibly mess up formatting. It also MP> might not add much convenience because frankly typing non-ASCII MP> characters is _hard_. Eh, it's really not hard. If it was the only problem with this approach, it could be enabled in the default keybindings. MP> One suggestion would be to use the unicode left and right double/single MP> quotation marks. Oh, no. The markers shouldn't look like existing ASCII characters or there will be lynchings. Maybe LEFT DOUBLE ANGLE BRACKET and RIGHT DOUBLE ANGLE BRACKET would work (they look like << and >>). Ted ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 21:16 ` Ted Zlatanov @ 2014-07-30 21:19 ` Matthew Plant 2014-07-31 10:13 ` Ted Zlatanov 0 siblings, 1 reply; 51+ messages in thread From: Matthew Plant @ 2014-07-30 21:19 UTC (permalink / raw) To: emacs-devel@gnu.org > Maybe LEFT DOUBLE ANGLE BRACKET and RIGHT DOUBLE ANGLE BRACKET would work (they look like << and >>). I'm pretty sure that those look like existing ASCII characters as well ;-) On Wed, Jul 30, 2014 at 2:16 PM, Ted Zlatanov <tzz@lifelogs.com> wrote: > On Wed, 30 Jul 2014 14:01:52 -0700 Matthew Plant <maplant2@illinois.edu> > wrote: > > >> How about using a Unicode character as the marker? (prepares for > stoning) > MP> I'm on the fence about this idea. It certainly would make parsing in > MP> reverse possible (assuming the reverse parsing functions do not > operate on > MP> char *) and easy, but it would also possibly mess up formatting. It > also > MP> might not add much convenience because frankly typing non-ASCII > MP> characters is _hard_. > > Eh, it's really not hard. If it was the only problem with this approach, > it could be enabled in the default keybindings. > > MP> One suggestion would be to use the unicode left and right > double/single > MP> quotation marks. > > Oh, no. The markers shouldn't look like existing ASCII characters or > there will be lynchings. Maybe LEFT DOUBLE ANGLE BRACKET and RIGHT > DOUBLE ANGLE BRACKET would work (they look like << and >>). > > Ted ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 21:19 ` Matthew Plant @ 2014-07-31 10:13 ` Ted Zlatanov 0 siblings, 0 replies; 51+ messages in thread From: Ted Zlatanov @ 2014-07-31 10:13 UTC (permalink / raw) To: emacs-devel

On Wed, 30 Jul 2014 14:19:07 -0700 Matthew Plant <maplant2@illinois.edu> wrote:

TZ> Maybe LEFT DOUBLE ANGLE BRACKET and RIGHT
TZ> DOUBLE ANGLE BRACKET would work (they look like << and >>).

MP> I'm pretty sure that those look like existing ASCII characters as well ;-)

(Your quoting is strange, could you please use a better MUA?)

I'm honestly happy with anything that works.  I actually had the
French-style (used in many countries) guillemets in mind, like « for
LEFT-POINTING DOUBLE ANGLE QUOTATION MARK.  Those:

* do not look too much like << in most fonts...

* ...but are close enough to be familiar

* are used in many locales to quote text already

* already have keyboard shortcuts on many platforms (see
  http://en.wikipedia.org/wiki/Guillemet#Typing_.22.C2.AB.22_and_.22.C2.BB.22_on_computers)

* are just extended ASCII, so they are already universally supported,
  both at the font and at the platform level

So, two questions:

- can anyone think of better markers for rawstring / raw regex literals?

- are the markers proposed here going to resolve the backwards-scanning
  issue?  And is that all that's blocking the proposal?

Ted

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 20:54 ` Ted Zlatanov 2014-07-30 21:01 ` Matthew Plant @ 2014-08-02 8:47 ` Alan Mackenzie 2014-08-02 9:14 ` David Kastrup 2014-08-02 9:17 ` Andreas Schwab 2 siblings, 1 reply; 51+ messages in thread From: Alan Mackenzie @ 2014-08-02 8:47 UTC (permalink / raw) To: emacs-devel Hello, Ted. On Wed, Jul 30, 2014 at 04:54:43PM -0400, Ted Zlatanov wrote: > ....., I just want a simple syntax for literal data :) > How about using a Unicode character as the marker? (prepares for stoning) OK, it's taken time, and nobody else looks like they're about to do it, so I will cast the first stone. NO, NO, NO, NO! The only Unicode characters to be used in Emacs are those that are also ASCII characters, with a tiny number of essential exceptions (for example, the non-European characters in the sentence-end regexp, and, of course, people's names in comments). A Non-ASCII character is difficult to type for most people. Not all setups can display it. Adopting such a character would mean a lot of work for a lot of people. And using such characters as delimiters would introduce yet one more incompatibility with XEmacs which, Stephen informs us, uses #r"..." for raw strings. Why not just adapt that convention? Easy to type, easy to read, easy to parse. > Ted -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-02 8:47 ` Alan Mackenzie @ 2014-08-02 9:14 ` David Kastrup 2014-08-02 10:23 ` Alan Mackenzie 2014-08-03 6:50 ` Stephen J. Turnbull 0 siblings, 2 replies; 51+ messages in thread From: David Kastrup @ 2014-08-02 9:14 UTC (permalink / raw) To: emacs-devel Alan Mackenzie <acm@muc.de> writes: > Hello, Ted. > > On Wed, Jul 30, 2014 at 04:54:43PM -0400, Ted Zlatanov wrote: > >> ....., I just want a simple syntax for literal data :) > >> How about using a Unicode character as the marker? (prepares for stoning) > > OK, it's taken time, and nobody else looks like they're about to do it, > so I will cast the first stone. > > NO, NO, NO, NO! The only Unicode characters to be used in Emacs are > those that are also ASCII characters, with a tiny number of essential > exceptions (for example, the non-European characters in the sentence-end > regexp, and, of course, people's names in comments). > > A Non-ASCII character is difficult to type for most people. Not all > setups can display it. Adopting such a character would mean a lot of > work for a lot of people. > > And using such characters as delimiters would introduce yet one more > incompatibility with XEmacs which, Stephen informs us, uses #r"..." for > raw strings. Why not just adapt that convention? Easy to type, easy to > read, easy to parse. Easy to parse? r#"?\" is a complete string. How do you parse it backwards? -- David Kastrup ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-02 9:14 ` David Kastrup @ 2014-08-02 10:23 ` Alan Mackenzie 2014-08-02 15:51 ` Richard Stallman 2014-08-03 6:50 ` Stephen J. Turnbull 1 sibling, 1 reply; 51+ messages in thread From: Alan Mackenzie @ 2014-08-02 10:23 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel 'afternoon, David. On Sat, Aug 02, 2014 at 11:14:08AM +0200, David Kastrup wrote: > Alan Mackenzie <acm@muc.de> writes: > > Hello, Ted. > > On Wed, Jul 30, 2014 at 04:54:43PM -0400, Ted Zlatanov wrote: > >> ....., I just want a simple syntax for literal data :) > >> How about using a Unicode character as the marker? (prepares for stoning) > > OK, it's taken time, and nobody else looks like they're about to do it, > > so I will cast the first stone. > > NO, NO, NO, NO! The only Unicode characters to be used in Emacs are > > those that are also ASCII characters, with a tiny number of essential > > exceptions (for example, the non-European characters in the sentence-end > > regexp, and, of course, people's names in comments). > > A Non-ASCII character is difficult to type for most people. Not all > > setups can display it. Adopting such a character would mean a lot of > > work for a lot of people. > > And using such characters as delimiters would introduce yet one more > > incompatibility with XEmacs which, Stephen informs us, uses #r"..." for > > raw strings. Why not just adapt that convention? Easy to type, easy to > > read, easy to parse. > Easy to parse? > r#"?\" is a complete string. How do you parse it backwards? Parsing practically _anything_ backwards (especially comments) is difficult. There's nothing particularly difficult about #r"?\" that isn't shared by, e.g. /* /* */. Heuristics will be needed for strings, should raw strings come to exist, just as they are for comments. > -- > David Kastrup -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-02 10:23 ` Alan Mackenzie @ 2014-08-02 15:51 ` Richard Stallman 0 siblings, 0 replies; 51+ messages in thread From: Richard Stallman @ 2014-08-02 15:51 UTC (permalink / raw) To: Alan Mackenzie; +Cc: dak, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

    There's nothing particularly difficult about #r"?\" that isn't shared
    by, e.g. /* /* */.

C parsing seems to understand that construction, but I am not sure how
it does so.  (Want to take a look?)

In any case, there is no such problem with comments in Lisp.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-02 9:14 ` David Kastrup 2014-08-02 10:23 ` Alan Mackenzie @ 2014-08-03 6:50 ` Stephen J. Turnbull 2014-08-03 7:29 ` David Kastrup 2014-08-04 1:55 ` Richard Stallman 1 sibling, 2 replies; 51+ messages in thread From: Stephen J. Turnbull @ 2014-08-03 6:50 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup writes: > r#"?\" is a complete string. How do you parse it backwards? By catching the parse error when parsing it as a (normal) string, then reparsing it as a raw string (ie, running backwards over the characters until you hit the second ?"), and check for a leading #r (two tokens of lookahead). Thanks for the example, David, XEmacs is buggy here (or maybe terminating a rawstring with \ will be declared illegal ;-). ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-03 6:50 ` Stephen J. Turnbull @ 2014-08-03 7:29 ` David Kastrup 2014-08-03 13:12 ` Stephen J. Turnbull 2014-08-04 1:55 ` Richard Stallman 1 sibling, 1 reply; 51+ messages in thread From: David Kastrup @ 2014-08-03 7:29 UTC (permalink / raw) To: emacs-devel "Stephen J. Turnbull" <stephen@xemacs.org> writes: > David Kastrup writes: > > > r#"?\" is a complete string. How do you parse it backwards? > > By catching the parse error when parsing it as a (normal) string, then > reparsing it as a raw string (ie, running backwards over the > characters until you hit the second ?"), and check for a leading #r > (two tokens of lookahead). > > Thanks for the example, David, XEmacs is buggy here (or maybe > terminating a rawstring with \ will be declared illegal ;-). Uh, I wasn't planning to trip up XEmacs. At any rate, things can get more complex, like with (format "%s%c\n""r#"?\") => "r#\" " which is valid Elisp right now and would remain so, but which would look like containing a valid string r#"?\" after the syntax change when scanning backwards. It's not like syntax highlighting etc don't revert to heuristics (like with "(" in first column), but it's still obvious that this choice is not conflict-free. And I don't see how one could reasonably get around that without also changing the ending delimiter. -- David Kastrup ^ permalink raw reply [flat|nested] 51+ messages in thread
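[Editorial sketch, not part of the original message.] The example above can be checked against the current reader with `read-from-string`. The seven characters "r#"?\" are read today as two objects: the string "r#" followed by the character literal ?\" (a double quote). A scanner moving backward over the same characters under the proposed syntax could instead take them for a raw string, which is the conflict being described. The variable names in the `let*` are hypothetical.

```elisp
;; TEXT holds the seven characters  "r#"?\"  exactly as they appear
;; inside the `format' call quoted above.
(let* ((text "\"r#\"?\\\"")
       (obj1 (read-from-string text))              ; (OBJECT . NEXT-INDEX)
       (obj2 (read-from-string text (cdr obj1))))  ; continue after obj1
  (list (car obj1) (car obj2)))
;; => ("r#" 34)   ; 34 is the character code of the double quote

;; The whole call, written with spaces between the tokens for clarity.
(format "%s%c\n" "r#" ?\")
;; => "r#\"\n"
```

Nothing here exercises a raw-string reader, since Emacs has none; the sketch only confirms how the existing syntax tokenizes the text.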
* Re: Raw string literals in Emacs lisp. 2014-08-03 7:29 ` David Kastrup @ 2014-08-03 13:12 ` Stephen J. Turnbull 2014-08-03 13:27 ` David Kastrup 2014-08-03 13:40 ` David Kastrup 0 siblings, 2 replies; 51+ messages in thread From: Stephen J. Turnbull @ 2014-08-03 13:12 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup writes: > At any rate, things can get more complex, like with > > (format "%s%c\n""r#"?\") => "r#\" > " I'm not really worried about more complex. I am concerned about whether there's an unambiguous answer to "what is the value -- or error -- of eval-print-last-sexp at point?" In the case of (format "%s%c\n""r#"?\")-!- it's "r#\"\n". But for (format "%s%c\n""r#"?\"-!-) you could argue that it's ?\" (that's XEmacs's opinion) or "?\\". I guess for XEmacs (which already has this syntax in the wild) the rule should be "longest match wins" (because otherwise there's no way to evaluate r#"?\" in an interactive buffer), but for Emacs that looks like a deal-killer, and it's already present with just r#"?\". *sigh* ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-03 13:12 ` Stephen J. Turnbull @ 2014-08-03 13:27 ` David Kastrup 2014-08-03 15:01 ` Stephen J. Turnbull 2014-08-03 13:40 ` David Kastrup 1 sibling, 1 reply; 51+ messages in thread From: David Kastrup @ 2014-08-03 13:27 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: emacs-devel "Stephen J. Turnbull" <stephen@xemacs.org> writes: > I'm not really worried about more complex. I am concerned about > whether there's an unambiguous answer to "what is the value -- or > error -- of eval-print-last-sexp at point?" > > In the case of > > (format "%s%c\n""r#"?\")-!- > > it's "r#\"\n". But for > > (format "%s%c\n""r#"?\"-!-) > > you could argue that it's ?\" (that's XEmacs's opinion) or "?\\". I > guess for XEmacs (which already has this syntax in the wild) the rule > should be "longest match wins" (because otherwise there's no way to > evaluate r#"?\" in an interactive buffer), but for Emacs that looks > like a deal-killer, and it's already present with just r#"?\". I don't understand the reason why this should be a deal-killer for Emacs but not for XEmacs. Is this because of different syntax infrastructure? Or a different tolerance level for conceivable but unlikely problems? Is XEmacs going to run into the same problems when ingesting some of Emacs' highlighting/parsing stuff? -- David Kastrup ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-03 13:27 ` David Kastrup @ 2014-08-03 15:01 ` Stephen J. Turnbull 2014-08-04 1:55 ` Richard Stallman 0 siblings, 1 reply; 51+ messages in thread From: Stephen J. Turnbull @ 2014-08-03 15:01 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup writes: > I don't understand the reason why this should be a deal-killer for Emacs > but not for XEmacs. XEmacs is stuck with it (backward compatibility with user code, as in practice a lot of users are dependent on 21.5 features that we've refused to backport, we can't really get away with saying "you knew it was a beta"). If there's a better alternative (which I'm not sure there is), Emacs has no backwards compatibility problem, and no XEmacs compatibility problem either. > Is XEmacs going to run into the same problems when ingesting some of > Emacs' highlighting/parsing stuff? Technically, yes. I don't expect to see a lot of real-world code that uses rawstrings that end in "\", though, so we can just document this wart (or document that #r rawstrings that end in "\" have undefined behavior). But why should Emacs put up with such a wart? ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-03 15:01 ` Stephen J. Turnbull @ 2014-08-04 1:55 ` Richard Stallman 2014-08-04 6:38 ` David Kastrup 0 siblings, 1 reply; 51+ messages in thread From: Richard Stallman @ 2014-08-04 1:55 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: dak, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] I don't want a wart like this in Emacs. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-04 1:55 ` Richard Stallman @ 2014-08-04 6:38 ` David Kastrup 2014-08-05 1:41 ` Richard Stallman 0 siblings, 1 reply; 51+ messages in thread From: David Kastrup @ 2014-08-04 6:38 UTC (permalink / raw) To: Richard Stallman; +Cc: Stephen J. Turnbull, emacs-devel Richard Stallman <rms@gnu.org> writes: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I don't want a wart like this in Emacs. Well, I am not sure about the size of the wart in practice. It has not apparently caused much of a disturbance for XEmacs. It certainly seems less relevant in practice than our traditional wart (info "(emacs) Left Margin Paren") with regard to reliable detection of strings out of context. The Elisp solution of providing a manual "\(" escape sequence does not work for languages such as Scheme/Guile and various others. I definitely see a use case for raw strings. It's also worth noting that python-mode appears to do a pretty good job finding and highlighting the various Python raw strings, and those should have similar problems. There will probably be outliers like those I constructed, but I have to admit that I have not run into them yet. I most certainly have run into the "Left Margin Paren" problem numerous times, in contrast. -- David Kastrup ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-04 6:38 ` David Kastrup @ 2014-08-05 1:41 ` Richard Stallman 2014-08-05 6:15 ` David Kastrup 0 siblings, 1 reply; 51+ messages in thread From: Richard Stallman @ 2014-08-05 1:41 UTC (permalink / raw) To: David Kastrup; +Cc: stephen, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

    Well, I am not sure about the size of the wart in practice.  It has not
    apparently caused much of a disturbance for XEmacs.  It certainly seems
    less relevant in practice than our traditional wart

    (info "(emacs) Left Margin Paren")

    with regard to reliable detection of strings out of context.

That problem is in a different feature (finding the start of a
function), and we recommend a preventive measure to avoid it.
So it is not a real problem.  In Elisp, it is a solved problem.

But even if it were a real problem, this argument is invalid in form.
The existence of one problem we can't fix does not make it good
to create another.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-05 1:41 ` Richard Stallman @ 2014-08-05 6:15 ` David Kastrup 0 siblings, 0 replies; 51+ messages in thread From: David Kastrup @ 2014-08-05 6:15 UTC (permalink / raw) To: Richard Stallman; +Cc: stephen, emacs-devel Richard Stallman <rms@gnu.org> writes: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Well, I am not sure about the size of the wart in practice. It has not > apparently caused much of a disturbance for XEmacs. It certainly seems > less relevant in practice than our traditional wart > > (info "(emacs) Left Margin Paren") > > with regard to reliable detection of strings out of context. > > That problem is in a different feature (finding the start of a > function), and we recommend a preventive measure to avoid it. The preventive measure does not work in source buffers other than Elisp and it requires manual intervention. M-q seems to avoid _moving_ an opening paren to the front of the line in strings: that is already a big help in keeping them from creeping in when reformatting code. auto-fill-mode however doesn't, so you don't get help against accidentally introducing them. > So it is not a real problem. In Elisp, it is a solved problem. More like a "problem with known manual workarounds". > But even if it were a real problem, this argument is invalid in form. > The existence of one problem we can't fix does not make it good > to create another. Sure. I was just putting it in perspective: in practice the ambiguity of r#"?\" without leading context is not going to cause anywhere near the pain users already have to deal with. I am not saying that this is a non-problem. But in contrast to the paren problem, it is a fringe problem not likely to occur in practice. 
So I consider it likely to be less annoying in its effects to users than a raw string syntax diverging from that of XEmacs which would basically imply that any portable code has to forego raw strings completely. Of course, if Emacs can come up with a significantly better proposal, there is some likelihood that it will eventually _also_ be adopted by XEmacs. But as long as strings and raw strings share the same ending delimiter and/or the ending delimiter of a raw string has a valid other syntactic interpretation on its own, the ambiguity will be there. ASCII does not offer a wealth of delimiter candidates, and having to write something like #r"fa\fa d\fd \fd safa"#r would likely be more annoying than the problem it is supposed to cure. I am not saying that #r"..." is what we should ultimately take, just that I don't see the counterargument as weighing all that strongly. I actually would likely prefer something like #"..." as input but that's even more likely to trip up backward parsing. -- David Kastrup ^ permalink raw reply [flat|nested] 51+ messages in thread
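For readers who have not met the "Left Margin Paren" workaround mentioned above, here is a minimal sketch of it. The function name `my-example` is made up for illustration. It relies only on the fact that the Emacs Lisp reader drops a backslash that precedes an ordinary character inside a string, so the escape changes nothing in the resulting docstring.

```emacs-lisp
;; Hypothetical function, for illustration only.
(defun my-example ()
  "Return nil.
\(Without the backslash this docstring line would start with a paren in column 0.)"
  nil)

;; The reader drops the backslash, so the escape does not alter the text.
(equal "\(" "(")    ; => t
```

As the message points out, this is a manual convention for Elisp sources and has no counterpart in buffers for languages such as Scheme.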
* Re: Raw string literals in Emacs lisp. 2014-08-03 13:12 ` Stephen J. Turnbull 2014-08-03 13:27 ` David Kastrup @ 2014-08-03 13:40 ` David Kastrup 2014-08-03 15:06 ` Stephen J. Turnbull 1 sibling, 1 reply; 51+ messages in thread From: David Kastrup @ 2014-08-03 13:40 UTC (permalink / raw) To: emacs-devel "Stephen J. Turnbull" <stephen@xemacs.org> writes: > In the case of > > (format "%s%c\n""r#"?\")-!- > > it's "r#\"\n". But for > > (format "%s%c\n""r#"?\"-!-) > > you could argue that it's ?\" (that's XEmacs's opinion) Which is correct according to the surrounding syntax. > or "?\\". I guess for XEmacs (which already has this syntax in the > wild) the rule should be "longest match wins" (because otherwise > there's no way to evaluate r#"?\" in an interactive buffer), Longest single-sexp match would be r#"?\", since the correct interpretation "r#"?\" is actually _two_ sexps. So the "correct" single sexp match in _this_ example would indeed be the shortest match ?\" here. Obviously, depending on what transpires before, it is equally easy to have the longer match be correct. -- David Kastrup ^ permalink raw reply [flat|nested] 51+ messages in thread
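To make the "two sexps" reading concrete, the seven characters in question can be handed to the ordinary Emacs Lisp reader, which has no raw-string syntax. This is only a sketch of the forward parse; the variable name `my-text` is made up for illustration, and nothing here shows how XEmacs treats the same text.

```emacs-lisp
;; The seven characters  "r#"?\"  written as a normal Lisp string.
(defvar my-text "\"r#\"?\\\"")

;; Reading forward from the start yields the string "r#" and stops at index 4.
(read-from-string my-text)      ; => ("r#" . 4)

;; Reading on from index 4 yields the character ?\" (code 34).
(read-from-string my-text 4)    ; => (34 . 7)

;; Hence the call from the message formats one string and one character.
(format "%s%c\n""r#"?\")        ; => "r#\"\n"
```

The ambiguity discussed in the thread only appears when scanning backwards from the end, where the same characters could also be taken as the tail of a raw string.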
* Re: Raw string literals in Emacs lisp. 2014-08-03 13:40 ` David Kastrup @ 2014-08-03 15:06 ` Stephen J. Turnbull 0 siblings, 0 replies; 51+ messages in thread From: Stephen J. Turnbull @ 2014-08-03 15:06 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup writes: > Obviously, depending on what transpires before, it is equally easy > to have the longer match be correct. eval-print-last-sexp doesn't (and practically speaking, can't) depend on the buffer or line being syntactically correct in any way -- except for the sexp preceding point. So I don't think that argument holds. Regards, ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-08-03 6:50 ` Stephen J. Turnbull 2014-08-03 7:29 ` David Kastrup @ 2014-08-04 1:55 ` Richard Stallman 1 sibling, 0 replies; 51+ messages in thread From: Richard Stallman @ 2014-08-04 1:55 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: dak, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] By catching the parse error when parsing it as a (normal) string, then reparsing it as a raw string (ie, running backwards over the characters until you hit the second ?"), and check for a leading #r (two tokens of lookahead). That is easier said than done. People can give it a try. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-30 20:54 ` Ted Zlatanov 2014-07-30 21:01 ` Matthew Plant 2014-08-02 8:47 ` Alan Mackenzie @ 2014-08-02 9:17 ` Andreas Schwab 2 siblings, 0 replies; 51+ messages in thread From: Andreas Schwab @ 2014-08-02 9:17 UTC (permalink / raw) To: emacs-devel Ted Zlatanov <tzz@lifelogs.com> writes: > How about using a Unicode character as the marker? (prepares for stoning) That's not possible, since they are valid symbol characters. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
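A quick way to see the objection above: for the Emacs Lisp reader a non-ASCII character is an ordinary symbol constituent, so text that begins with one already reads as a symbol. A small sketch, where the lambda character and the name after it are arbitrary choices made for illustration:

```emacs-lisp
;; A non-ASCII character is an ordinary symbol constituent for the reader.
(symbolp (read "λraw"))        ; => t
(symbol-name (read "λraw"))    ; => "λraw"
```

Using such a character as a raw-string marker would therefore change the meaning of text that is already valid Lisp.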
* Re: Raw string literals in Emacs lisp. 2014-07-27 13:03 ` David Kastrup 2014-07-27 20:58 ` David Caldwell @ 2014-07-28 1:29 ` Stephen J. Turnbull 1 sibling, 0 replies; 51+ messages in thread From: Stephen J. Turnbull @ 2014-07-28 1:29 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel David Kastrup writes: > Reports of XEmacs being dead may be exaggerated, but it does look a > lot like suspended animation. I'm OK with the latter description. > When rawstrings are supported, it becomes more expedient to recognize > things like \n and \t, probably also \f in regexps (\b is already > taken). Sure. AFAIK most of the modern regexp syntaxes do. I guess that it's possible that there are regexps out in the real world that contain "\n" and work because "n" would work there too, so that is a change in semantics. (It's a shame that Emacs doesn't consider that kind of thing an error, because it's almost certainly a bug.) I don't really see a point in \f, though. Emacs users (at least old-timers) are used to seeing "^L" in their code, and I haven't seen an Emacs configured to display that as an actual form feed in at least 20 years. \t is useful because it displays the same as a number (nondeterministic) of spaces, and \n is useful because embedding an actual newline in a string messes up your indentation, often leaving a lone double quote on the next line (when newline terminates the string). ^ permalink raw reply [flat|nested] 51+ messages in thread
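The change in semantics described above can be seen with today's regexp engine, which, as the message says, treats a backslash followed by the letter n as a plain n rather than as an error. A short sketch using only `string-match`:

```emacs-lisp
;; The regexp \n (backslash, letter n), written here as the Lisp string "\\n".
(string-match "\\n" "n")     ; => 0, matches the letter n
(string-match "\\n" "\n")    ; => nil, does not match a newline

;; A newline in a regexp currently comes from the Lisp string escape instead.
(string-match "\n" "a\nb")   ; => 1
```

If the regexp syntax itself started to recognize \n as a newline, the first two results would swap, which is the compatibility concern raised in the message.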
* Re: Raw string literals in Emacs lisp. 2014-07-25 19:47 Raw string literals in Emacs lisp Matthew Plant ` (2 preceding siblings ...) 2014-07-26 1:19 ` Stephen J. Turnbull @ 2014-07-26 21:37 ` Thorsten Jolitz 2014-07-29 6:32 ` William Xu 4 siblings, 0 replies; 51+ messages in thread From: Thorsten Jolitz @ 2014-07-26 21:37 UTC (permalink / raw) To: emacs-devel Matthew Plant <maplant2@illinois.edu> writes: > I think that raw string literals would be a really nice thing to add > to Emacs > lisp. The most immediate benefit is that writing regexps would be much > easier. > And since most of the work that goes into major modes is writing > regexp, writing > major modes would become a lot faster. BTW, I recently wrote a little library called ,---- | drx.el --- declarative dynamic regular expressions `---- available on GitHub (https://github.com/tj64/drx). Its main purpose was enabling one more level of abstraction when writing (org-mode) regexps, i.e. replacing the hardcoded ,---- | "^" (BOL) | "$" (EOL) | "\*" (Org STAR) `---- in regexp strings like ,---- | "^\\* foo$" `---- with variables ,---- | (defvar drx-BOL "^") | (defvar drx-EOL "$") | (defvar drx-STAR (regexp-quote "*")) `---- and building regexps with function calls like ,---- | (drx " foo" t t t) `---- The idea was based on an analysis of what would be needed for a true Org Minor Mode, i.e. the application of Org's core functionality outside of the Org major-mode. At the lowest level, the core obstacle is in the hard-coded regexp snippets spread all over the Org sources that don't match anymore when the org elements are in comment sections of programming major-modes. E.g. 
this would match 'old-school' headers in emacs-lisp-mode: #+begin_src emacs-lisp (let ((drx-BOL "^;;") (drx-STAR ";")) (format "%S" (drx " foo" t t t))) #+end_src #+results: : "^;;; foo$" and this would match 'outshine' (outcommented org-mode) headers: #+begin_src emacs-lisp (let ((drx-BOL "^;; ")) (format "%S" (drx " foo" t t t))) #+end_src #+results: : "^;; \\* foo$" and this would match 'outshine' headers in css-mode: #+begin_src emacs-lisp (let ((drx-BOL "^/\\* ") (drx-EOL "\\*/$")) (format "%S" (drx " foo" t t t))) #+end_src #+results: : "^/\\* \\* foo\\*/$" The idea was rejected by the Org maintainers, but the library does exist now, and the reason I mention it here is that it makes writing regexps much faster and easier (with a different approach than rx.el, the regexps themselves are still written as strings, only the plumbing is done declaratively). Here are a few more complex examples from the drx.el test section: #+begin_src emacs-lisp (format "%S" (let ((drx-BOL "^;;") (drx-STAR ";")) (drx " foo" t '(2 2) nil))) #+end_src #+results: : "^;;\\(;\\{2\\}\\)\\{2\\} foo" #+begin_src emacs-lisp (format "%S" (drx "foo" t t t t)) #+end_src #+results: : "^\\*\\(foo\\)$" #+begin_src emacs-lisp (format "%S" (drx "foo" nil nil nil 'alt "bar")) #+end_src #+results: : "\\(foo\\|bar\\)" #+begin_src emacs-lisp (format "%S" (drx "foo" nil nil nil 'shy "bar")) #+end_src #+results: : "\\(?:foo\\)\\(?:bar\\)" #+begin_src emacs-lisp (format "%S" (drx "foo" t 2 t 'app "\\(bar\\)" "loo")) #+end_src #+results: : "^\\*\\{2\\}\\(foo\\)\\(bar\\)\\(loo\\)$" #+begin_src emacs-lisp (format "%S" (drx "foo" t '(t t t) t '(t t t) "bar" "loo")) #+end_src #+results: : "^\\(\\(\\*\\)\\(\\*\\)\\)\\(foo\\(bar\\)\\(loo\\)\\)$" so even without raw strings, this helps to avoid typing all these parens and backslashes. By nesting 'drx calls one can create really complex regexps that contain only a few simple string literals. 
I don't know (but would be curious to know) how writing regexps this way would affect a library's execution speed, especially if the 'drx calls appear in low level functions that are called all the time. PS For the sake of completeness, here the docstring of `drx': ,----[ C-h f drx RET ] | drx is a Lisp function in `drx.el'. | | (drx RGXP &optional BOLP STARS EOLP ENCLOSING &rest RGXPS) | | Make regexp combining RGXP and optional RGXPS. | | With BOLP non-nil, add 'drx-BOL' at beginning of regexp, with EOLP | non-nil add 'drx-EOL' at end of regexp. | | STARS, when non-nil, uses 'drx-STAR' and encloses and repeats it. | | ENCLOSING, when non-nil, takes RGXP and optional RGXPS and combines, | encloses and repeats them. | | While BOLP and EOLP are switches that do nothing when nil and | insert whatever value 'drx-BOL' and 'drx-EOL' are set to when | non-nil, both arguments STARS and ENCLOSING take either symbols, | numbers, strings or (nested) lists as values and act conditional on | the type. | | All the following 'atomic' argument values are valid for both STARS | and ENCLOSING but with a slightly different meaning: | | STARS: repeat 'drx-STAR' (without enclosing) conditional on argument | value | | ENCLOSING: repeat enclosed combination of RGXP and RGXPS conditional | on argument value | | - nil :: do nothing (no repeater, no enclosing) | | - t :: (and any other symbol w/o special meaning) repeat once | | - n :: (number) repeat n times {n} | | - "n" :: (number-as-string) repeat n times {n} | | - "n," :: (string) repeat >= n times {n,} | | - ",m" :: (string) repeat <= m times {,m} | | - "n,m" :: (string) repeat n to m times {n,m} | | - "?" :: (string) repeat with ? | | - "*" :: (string) repeat with * | | - "+" :: (string) repeat with + | | - "??" :: (string) repeat with ?? | | - "*?" :: (string) repeat with *? | | - "+?" :: (string) repeat with +? 
| | - "xyz" :: (any other string) repeat once | | Note that, when used with STARS and ENCLOSING, t almost always | means 'enclose and repeat once', while 1 and "1" stand for | 'do not enclose, repeat once' - depending on the context. | | These atomic values can be wrapped in a list and change their | meaning then. In a list of length 1 they specify 'enclose element | first, apply repeater then'. In a list of length > 1 the specifier | in the car applies to the combination of all elements, while each of | the specifiers in the cdr applies to one element only. In the case | of argument STAR, an element is always 'drx-STAR'. In the case of | argument ENCLOSING, a non-nil optional argument RGXPS represents the | list of elements, each of them being a regexp string. | | Here are two calls of 'drx' with interchanged list arguments to | STARS and ENCLOSING and their return values, demonstrating the | above: | | ,------------------------------------------------------------ | | (drx "foo" t '(nil t (2)) t '(t nil (2)) | | "bar" "loo") | | "^\(\*\)\(\*\) | Uses keymap `2\', which is not currently defined. | \(foobar\(loo\) | Uses keymap `2\', which is not currently defined. | \)$" | `------------------------------------------------------------ | | ,------------------------------------------------------------ | | (drx "foo" t '(t nil (2)) t '(nil t (2)) | | "bar" "loo") | | "^\(\*\(\*\) | Uses keymap `2\', which is not currently defined. | \)foo\(bar\)\(loo\) | Uses keymap `2\', which is not currently defined. | $" | `------------------------------------------------------------ oops, bug in boxquote.el? 
should look like this: ,------------------------------------------------------------ | (drx \"foo\" t '(nil t (2)) t '(t nil (2)) | \"bar\" \"loo\") | \"^\\(\\*\\)\\(\\*\\)\\{2\\}\\(foobar\\(loo\\)\\{2\\}\\)$\" `------------------------------------------------------------ ,------------------------------------------------------------ | (drx \"foo\" t '(t nil (2)) t '(nil t (2)) | \"bar\" \"loo\") | \"^\\(\\*\\(\\*\\)\\{2\\}\\)foo\\(bar\\)\\(loo\\)\\{2\\}$\" `------------------------------------------------------------ | | Many more usage examples with their expected outcome can be found as | ERT tests in the test-section of drx.el and should be consulted in | doubt. | | There are a few symbols with special meaning as values of the | ENCLOSING argument (when used as atomic argument or as car of a list | argument), namely: | | - alt :: Concat and enclose RGXP and RGXPS as regexp alternatives. | Eventually add drx-BOL/STARS and drx-EOL before | first/after last alternative. | | - grp :: Concat and enclose RGXP and RGXPS. Eventually add | drx-BOL, STARS and drx-EOL as first/second/last group. | | - shy :: Concat and enclose RGXP and RGXPS as shy regexp | groups. Eventually add drx-BOL, STARS and drx-EOL as | first/second/last group. | | - app :: like 'grp', but rather append RGXP and RGXPS instead | of enclosing them if they are already regexp groups | themselves. | | They create regexp groups but don't apply repeaters to them. | | [back] `---- -- cheers, Thorsten ^ permalink raw reply [flat|nested] 51+ messages in thread
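The core idea of the message above, replacing hard-coded regexp snippets with rebindable variables, can be sketched without drx.el itself. All names below (`my-bol`, `my-star`, `my-eol`, `my-heading-regexp`) are hypothetical and are not the variables or functions of drx.el; the sketch only uses `concat` and `regexp-quote`.

```emacs-lisp
;; Hypothetical names; these are not the variables or functions of drx.el.
(defvar my-bol "^" "Regexp snippet matching the beginning of a heading line.")
(defvar my-star (regexp-quote "*") "Regexp snippet matching one heading star.")
(defvar my-eol "$" "Regexp snippet matching the end of a heading line.")

(defun my-heading-regexp (text)
  "Return a regexp matching a level-1 heading whose title is TEXT."
  (concat my-bol my-star (regexp-quote text) my-eol))

;; Default (Org-style) heading.
(my-heading-regexp " foo")            ; => "^\\* foo$"

;; Rebinding the snippets adapts the same code to Lisp comment headings.
(let ((my-bol "^;;")
      (my-star ";"))
  (my-heading-regexp " foo"))         ; => "^;;; foo$"
```

Because the variables are declared with `defvar`, a surrounding `let` rebinds them dynamically, which is what lets one function serve several major modes. Whether the extra function calls matter for speed, as the message asks, is not something this sketch measures.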
* Re: Raw string literals in Emacs lisp. 2014-07-25 19:47 Raw string literals in Emacs lisp Matthew Plant ` (3 preceding siblings ...) 2014-07-26 21:37 ` Thorsten Jolitz @ 2014-07-29 6:32 ` William Xu 2014-07-29 7:40 ` Andreas Schwab 4 siblings, 1 reply; 51+ messages in thread From: William Xu @ 2014-07-29 6:32 UTC (permalink / raw) To: emacs-devel Matthew Plant <maplant2@illinois.edu> writes: > I think that raw string literals would be a really nice thing to add to Emacs > lisp. The most immediate benefit is that writing regexps would be much easier. > And since most of the work that goes into major modes is writing regexp, writing > major modes would become a lot faster. Would love to have this! Here is one of my recent use cases: "quote backslash in a shell command". http://permalink.gmane.org/gmane.emacs.help/98550 The shell command is: echo foo.bar | sed -e 's/\..*//' which will produce "foo" on bash. If I try to pass it to shell-command-to-string: (shell-command-to-string "echo foo.bar | sed -e 's/\..*//'") => "\n" Then I find I need to quote the backslash in Emacs once more: (shell-command-to-string "echo foo.bar | sed -e 's/\\..*//'") => "foo\n" Is there a function or other way that can handle this kind of backslash quoting automatically? -- William http://xwl.appspot.com ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: Raw string literals in Emacs lisp. 2014-07-29 6:32 ` William Xu @ 2014-07-29 7:40 ` Andreas Schwab 0 siblings, 0 replies; 51+ messages in thread From: Andreas Schwab @ 2014-07-29 7:40 UTC (permalink / raw) To: William Xu; +Cc: emacs-devel William Xu <william.xwl@gmail.com> writes: > Here is one of my recent use cases: "quote backslash in a shell command". This is not about shell command syntax, but about Lisp syntax. > http://permalink.gmane.org/gmane.emacs.help/98550 > > The shell command is: > echo foo.bar | sed -e 's/\..*//' > > which will produce "foo" on bash. > > If I try to pass it to shell-command-to-string: > (shell-command-to-string "echo foo.bar | sed -e 's/\..*//'") > => "\n" > > Then I find I need to quote the backslash in Emacs once more: > (shell-command-to-string "echo foo.bar | sed -e 's/\\..*//'") > => "foo\n" > > Is there a function or other way that can handle this kind of backslash > quoting automatically? Since this is part of the Lisp syntax, there is no way to solve that programmatically. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
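The point that this is Lisp syntax rather than shell syntax can be checked without running any command: the backslash is already gone by the time the string exists. A short sketch using only the reader's standard string escapes and the command from the messages above:

```emacs-lisp
;; In a Lisp string, "\." is read as a plain dot; the backslash is lost.
(equal "\." ".")                                ; => t

;; Doubling the backslash keeps exactly one backslash in the string.
(length "\\.")                                  ; => 2

;; So sed only sees  s/\..*//  when the Lisp source doubles the backslash.
;; Per the message above this returns "foo\n"; it needs a shell with sed.
(shell-command-to-string "echo foo.bar | sed -e 's/\\..*//'")
```

No function can restore the lost backslash afterwards, since it receives the string only after the reader has processed it; that is the gap a raw string syntax would close.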