From: Daniel Brooks <db48x@db48x.net>
To: Anna Glasgall <anna@crossproduct.net>
Cc: emacs-devel@gnu.org
Subject: Re: "Raw" string literals for elisp
Date: Sat, 02 Oct 2021 14:03:57 -0700
Message-ID: <87v92ft9z6.fsf@db48x.net>
In-Reply-To: <c539acf4a29f85e82c138706e7146dd537eb8534.camel@crossproduct.net> (Anna Glasgall's message of "Wed, 08 Sep 2021 16:40:09 -0400")
Anna Glasgall <anna@crossproduct.net> writes:
> Alan (Dr. Mackenzie? Forgive me, not sure what standards are here),
> your point about strings ending in \ is very well taken and I'm frankly
> not sure what the easiest path forward here is. Having "raw literals
> cannot end in a \" is a weird and unpleasant restriction, although the
> fact that it is one that Python places on r-strings (to my considerable
> surprise; I've been using Python since the mid-00s and have never run
> across this particular syntax oddity before) may mean that it is
> perhaps not so bad. The C++ concept of allowing r-strings to specify
> their own delimiters is perhaps maximally flexible, but is definitely
> going to be a heavier lift to implement than any of the above. I'd love
> to hear people's opinions on the merits of the various possible
> approaches here.
I’ve written a little about raw strings on this mailing list. You might
read 87zgzqz6mu.fsf@db48x.net, but I can summarize or restate the parts
dealing with delimiters.
I happen to love Raku’s choice: you can use any matched pair of
nonalphanumeric Unicode characters. U+2603 SNOWMAN is a perfectly
cromulent choice of delimiter as far as Raku is concerned; an example
would be q☃foo☃. Since you can always choose a character that will not
appear in your string, this essentially eliminates all need for escaping
of the delimiter. Raku also lets you use characters that come in left–
and right–handed versions, as long as you order them correctly. For
example q«foo» is allowed, while q»foo« is not. There are Unicode
properties that allow this to work without enumerating all of the
possibilities, making it future–proof. (There are only a couple of dozen
pairs, so enumerating them is not hard either.)
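To make the delimiter rule concrete, here is a minimal sketch in Python
(standing in for a real reader; it is not Raku's implementation and not
a proposed Emacs patch). The function name read_q_string and the PAIRS
table are hypothetical, the table is deliberately incomplete, and the
sketch does not handle nested bracket pairs.

```python
import unicodedata

# Hypothetical, deliberately incomplete table of left/right pairs. A real
# reader would derive this from Unicode data rather than list pairs by hand.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">", "«": "»", "“": "”", "‘": "’"}
CLOSERS = set(PAIRS.values())


def read_q_string(src, pos=0):
    """Read q<delim>text<delim> starting at src[pos]; return (text, next_pos)."""
    if pos + 1 >= len(src) or src[pos] != "q":
        raise ValueError("expected 'q' followed by a delimiter")
    opener = src[pos + 1]
    # Letters (L*), numbers (N*) and whitespace cannot serve as delimiters.
    if unicodedata.category(opener)[0] in "LN" or opener.isspace():
        raise ValueError(f"{opener!r} cannot be used as a delimiter")
    # A right-handed character may only close a string, never open one.
    if opener in CLOSERS:
        raise ValueError(f"{opener!r} is a right-handed delimiter")
    # Paired characters close with their partner; anything else closes itself.
    closer = PAIRS.get(opener, opener)
    end = src.find(closer, pos + 2)
    if end == -1:
        raise ValueError("unterminated string")
    # The text is returned verbatim: no escape processing of any kind.
    return src[pos + 2:end], end + 1
```

With this sketch, read_q_string("q☃foo☃") and read_q_string("q«foo»")
both yield the text foo, while read_q_string("q»foo«") is rejected
because » is right-handed.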
Then of course there are languages where the delimiters can be chosen by
the programmer but from a much more constrained set of
possibilities. C++ and Rust seem like good ones that we could mimic.
All of these delimiter styles are quite easy to implement in the reader,
but as Alan points out they can cause some complexity in the
corresponding language modes:
Alan Mackenzie <acm@muc.de> writes:
> When implementing the C++ raw strings, that flexibility caused me a lot
> of grief. For example, changing text in the middle of a C++ raw string,
> I had to check the new text didn't, by chance, form a closing delimiter
> matching the opening one. I would recommend not implementing anything
> like the C++ raw string identifiers.
As such, if we go this route I would recommend Rust–style over C++ style
raw strings. The Rust style is a lot like the C++ style, except that the
extra delimiter must be a sequence of # characters, matching on both
sides, rather than arbitrary source characters. Modes that want to check
for this will have an easier time with Rust–style than C++–style raw
strings.
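As a rough illustration of why the Rust style is easy on both the reader
and the modes, here is a minimal sketch in Python (again only a stand-in,
not the Emacs reader). The names read_raw_string and edit_closes_string
are hypothetical, and the r prefix is borrowed from Rust purely for the
example; no particular elisp syntax is being proposed here.

```python
def read_raw_string(src, pos=0):
    """Read r"..." , r#"..."# , r##"..."## at src[pos]; return (text, next_pos)."""
    if not src.startswith("r", pos):
        raise ValueError("expected 'r'")
    # Count the run of '#' characters between the 'r' and the opening quote.
    i = pos + 1
    while i < len(src) and src[i] == "#":
        i += 1
    hashes = i - (pos + 1)
    if i >= len(src) or src[i] != '"':
        raise ValueError("expected '\"' after the # run")
    # The string ends at the first quote followed by the same number of '#'.
    closer = '"' + "#" * hashes
    end = src.find(closer, i + 1)
    if end == -1:
        raise ValueError("unterminated raw string")
    # Backslashes are ordinary characters, so the text may end in one.
    return src[i + 1:end], end + len(closer)


def edit_closes_string(body, hashes):
    """Return True if the edited body now contains a closing delimiter."""
    return '"' + "#" * hashes in body
```

Under these assumptions a string such as r#"C:\dir\"# reads back as
C:\dir\ with its trailing backslash intact, and the check a mode has to
make after an edit is a single fixed substring search determined by the
number of # characters in the opener.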
But ultimately I prefer the exuberance and whimsy of Raku’s approach
over the more staid and pedestrian approaches taken by C++ and Rust.
db48x