From: Anna Glasgall <anna@crossproduct.net>
To: Alan Mackenzie <acm@muc.de>
Cc: emacs-devel@gnu.org
Subject: Re: "Raw" string literals for elisp
Date: Wed, 08 Sep 2021 10:27:17 -0400
Message-ID: <8ac544527b7f8767cf562fba86fbf19d3414d720.camel@crossproduct.net>
In-Reply-To: <YTie26fKdA+2rWP7@ACM>
On Wed, 2021-09-08 at 11:30 +0000, Alan Mackenzie wrote:
> Hello, Anna.
>
> Just as a matter of context, I implemented C++ raw strings, and
> recently enhanced the code also to handle other CC Mode derived
> languages such as C# and Vala.
>
Great, I'll definitely take a look at that.
> On Tue, Sep 07, 2021 at 21:49:33 -0400, Anna Glasgall wrote:
> > [My previous message appears to have been eaten, or at least it's
> > not showing up in the archive; resending from a different From:
> > address. Apologies for any duplication]
>
> > Hello Emacs developers,
>
> > I've long been annoyed by the number of backslashes needed when
> > using string literals in elisp for certain things (regexes, UNC
> > paths, etc), so I started work on a patch (WIP attached) to
> > implement support for "raw" string literals, a la Python r-strings.
> > These are string literals that work exactly like normal string
> > literals, with the exception that backslash escapes (except for \")
> > are not processed; \ may freely appear in the string without need to
> > escape. I've made good progress, but unfortunately I've run into a
> > roadblock and am not sure what to do next.
>
> One not so small point. How do you put a backslash as the _last_
> character in a raw string?
That is an excellent question. I'll need to take a look at how some
other languages handle that :/
Thanks for giving me another test case!
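
For reference, here is a minimal sketch of how Python, the model for this
proposal, treats the trailing-backslash case. It only probes Python's own
behaviour; the helper name `compiles` is made up for illustration, and
nothing here says how the elisp reader should behave.

```python
# How Python's r-strings treat a backslash before the closing quote.

def compiles(source: str) -> bool:
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# A raw string cannot end in a single backslash: the backslash still
# "protects" the quote, so the literal is never terminated.
trailing_backslash_ok = compiles('r"abc\\"')   # source text: r"abc\"

# Inside a raw string, \" keeps BOTH characters: backslash and quote.
escaped_quote = r"a\""

# Common workaround: let the parser concatenate an ordinary literal that
# holds the final backslash.
unc_dir = r"\\server\share" "\\"

print(trailing_backslash_ok)   # False
print(len(escaped_quote))      # 3
print(unc_dir)                 # \\server\share\
```

So Python does not solve the problem; it forbids the case and leaves the
workaround to the user.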
>
> If this is difficult, it may well be worth comparing other languages
> with raw strings. C++ Mode has a complicated system of identifiers at
> each end of the raw string (I'm sure you know this). C# represents a "
> inside a multi-line string as "". Vala (and, I believe, Python) have
> triple quote delimiters """ and cannot represent three quotes in a row
> inside the multi-line string.
>
> It is probably worth while stating explicitly that Elisp raw strings
> can be continued across line breaks without having to escape the \n.
>
> > I've successfully taught the elisp reader (read1 in lread.c) how to
> > read r-strings. I thought I had managed to make lisp-mode/elisp-mode
> > happy by allowing "r" to be a prefix character (C-x C-e and the
> > underlying forward-sexp/backward-sexp seemed to work fine at first),
> > but realized that I ran into trouble with strings containing the
> > sequence of characters '\\"'.
>
> > The reader correctly reads r"a\\"" as a string containing the
> > sequence of characters 'a', '\', '"', and M-: works. Unfortunately,
> > if I try sexp-based navigation or e.g. C-x C-e, it falls apart. The
> > parser in syntax.c, which afaict is what lisp-mode is using to try
> > and find sexps in buffer text, doesn't seem to know what to do with
> > this expression. I've spent some time staring at syntax.c, but I
> > must confess that I'm entirely defeated in terms of what changes
> > need to be made here to teach this other parser about prefixed
> > strings where the prefix has meaning that affects the
> > interpretation of the characters between string fences.
>
> You probably want to use syntax-table text properties. See the page
> "Syntax Properties" in the Elisp manual. In short, you would put,
> say, a "punctuation" property on most backslashes to nullify their
> normal action. Possibly, you might want such a property on a double
> quote inside the string. You might also want a property on the
> linefeeds inside a raw string. With these properties, C-M-n and
> friends will work properly.
>
> Bear in mind that you will also need to apply and remove these
> properties as the user changes the Lisp text, for example by removing
> a \ before a ". There is an established mechanism in Emacs for this
> sort of action (which CC Mode doesn't use) which I would advise you
> to use.
>
It was unclear to me how much additional processing during typing would
be acceptable here as opposed to just running the existing C code.
Hopefully native compilation support will to some extent nullify any
penalty from adding additional logic in Lisp here?
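
To make the mismatch concrete, here is a toy model in Python (not Emacs
code, and not the patch). It assumes the rule described in this thread:
in a raw string only \" is an escape, every other backslash is a plain
character. `end_normal`, `end_raw` and the `plain` argument are
hypothetical names; `plain` stands in for the backslashes that would
carry a "punctuation" syntax-table property.

```python
# Toy model, not Emacs code: two ways of finding a string's closing quote.

def end_normal(text, start, plain=frozenset()):
    """Closing-quote index when a backslash escapes the next character.

    Backslashes whose index is in `plain` are treated as ordinary
    characters, mimicking a punctuation syntax property on them.
    """
    i = start + 1
    while i < len(text):
        if text[i] == "\\" and i not in plain:
            i += 2                      # skip backslash and escaped char
        elif text[i] == '"':
            return i
        else:
            i += 1
    return -1                           # unterminated

def end_raw(text, start):
    """Closing-quote index under the raw rule, plus the indices of the
    backslashes that are plain characters (not part of an escaped quote)."""
    i = start + 1
    literal = []
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] == '"':
            i += 2                      # escaped quote, stay in the string
        elif text[i] == '"':
            return i, literal
        else:
            if text[i] == "\\":
                literal.append(i)
            i += 1
    return -1, literal

buffer_text = 'r"a\\\\""'               # the seven characters  r"a\\""

print(end_normal(buffer_text, 1))             # 5
print(end_raw(buffer_text, 1))                # (6, [3])
print(end_normal(buffer_text, 1, plain={3}))  # 6
```

The ordinary scanner stops one character early and leaves a stray quote
behind. Once the plain backslash at index 3 is marked, both scanners
agree on the end of the string.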
> > I've attached a copy of my WIP patch; it's definitely not near
> > final code quality and doesn't have documentation yet, all of which
> > I would take care of before submitting for inclusion. I also
> > haven't filled out the copyright assignment paperwork yet, but
> > should this work reach a point where it was likely to be accepted,
> > I'd be happy to do that.
>
> Thanks!
>
> > I'd very much appreciate some pointers on what to try next here, or
> > some explanation of how syntax.c/syntax.el works beyond what's in
> > the reference manual. If this is a fool's errand I'm tilting at
> > here, I'd also appreciate being told that before I sink more time
> > into it :)
>
> It is definitely NOT a fool's errand. There may be some resistance to
> the idea of raw strings from traditionalists, but I hope not. It
> would be worth your while really to understand the section in the
> Elisp manual on syntax and all the things it can (and can't) do.
>
> Help is always available on emacs-devel.
>
> You're going to have quite a bit of Lisp programming to do. For
> example, font-lock needs to be taught how to fontify a raw string.
>
I am already moderately familiar with writing elisp at this point, but
yes, I still have a lot to learn :)
> But at the end of the exercise, you will have learnt so much about
> Emacs that you will qualify as a fully fledged contributor. :-)
>
thanks,
Anna
> > thanks,
>
> > Anna Glasgall
>