From: Daniel Brooks <db48x@db48x.net>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Naoya Yamashita <conao3@gmail.com>, emacs-devel@gnu.org
Subject: Re: [PATCH] Interpret #r"..." as a raw string
Date: Fri, 26 Feb 2021 16:39:05 -0800
Message-ID: <87zgzqz6mu.fsf@db48x.net>
In-Reply-To: <83pn0mppjd.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 26 Feb 2021 22:00:54 +0200")

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Sat, 27 Feb 2021 03:18:57 +0900 (JST)
>> From: Naoya Yamashita <conao3@gmail.com>
>> 
>> I write a patch to allow Emacs reader interpret raw string.
>
> What is a "raw string", and how does it differ from regular Lisp
> strings?
>
> Thanks.

Many languages have multiple string types because they simplify the
process of writing strings that contain quotation characters,
backslashes, or other syntax such as interpolation.

Think of sh, where double–quoted strings allow substitutions, while
single–quoted strings do not. The single–quoted strings are similar to
raw strings. Or Perl, where similar but more complex rules apply,
including strings that look like q{foo} and can be delimited by any
punctuation characters. Or Raku, which allows Unicode punctuation as
delimiters, such as q«foo». Or Rust, where r"foo" is a raw string that
can be delimited not just by double quotes, but also double quotes plus
an arbitrary number of # characters.

For example, suppose I am writing a shell script and I want to print out
an HTML anchor:

    echo "<a href=\"https://example.com/\">click here for an example</a>"

vs:

    echo '<a href="https://example.com/">click here for an example</a>'

The single–quoted string is nicer because I don’t have to escape the
quotes. Of course, HTML also allows me to use single quotes in place of
double quotes (and with no change of the semantics of the HTML), so
changing them would also be an option. Perhaps an even better example
would be a shell script that emits elisp, where strings must be
double–quoted.
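
To make that last case concrete, here is a minimal sketch of a shell
script emitting an elisp form. The (message ...) form and the variable
names are illustrative assumptions, not taken from any real script.

```shell
#!/bin/sh
# Emit the same elisp form from two kinds of shell string.

# Double-quoted: every inner double quote needs a backslash.
escaped="(message \"Hello from %s\" \"sh\")"

# Single-quoted: the elisp text appears exactly as it will be printed.
plain='(message "Hello from %s" "sh")'

# printf '%s\n' prints its arguments verbatim, one per line.
printf '%s\n' "$escaped"
printf '%s\n' "$plain"
```

Both variables hold the same text; only the amount of escaping in the
script differs.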

Of course the primary difference between single– and double–quoted
strings in Shell and Perl is interpolation, rather than escape
characters. In Raku this is extended so that there are half a dozen
different features that can be independently turned on or off for any
given quoted item. Q"foo" is a raw string. q"foo" adds the backslash
escape mechanism for concisely representing various characters such as
tabs, newlines, and so on. qq"foo" adds interpolation on top of
escaping. qw"foo bar" and qqw"foo bar" add word splitting, so that you
get not a single string but a list of the words in the string. qx"foo"
is like the backtick syntax in Shell; it runs the quoted item in a
subshell. qqx"foo" does interpolation on it before running it in the
subshell. Heredocs allow for multiline strings. All of these forms allow
you to use arbitrary punctuation characters as delimiters. Then there is
a whole thing with adjectives where you can pick and choose those
features using an even more uniform syntax. And finally regexes are yet
more fun on top of all of that. Raku even has an unquoting mechanism
that is rather similar to the Lisp unquote; it allows the nesting of
different string types.
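
For comparison, the shell counterparts of a few of these features can
be sketched as follows. This is only a rough analogy to the Raku forms,
assuming a POSIX sh; the variable names are made up for the example.

```shell
#!/bin/sh
name='world'

# Double quotes interpolate (compare qq); single quotes do not.
interpolated="hello, $name"
literal='hello, $name'

# Command substitution runs its contents in a subshell (compare qx).
# $(...) is the modern spelling of the backtick syntax.
captured=$(printf '%s' "$interpolated")

# A heredoc with a quoted delimiter is a multiline string with
# interpolation and backslash processing turned off.
heredoc=$(cat <<'EOF'
line one: $name
line two: \n stays as typed
EOF
)

printf '%s\n' "$interpolated" "$literal" "$captured" "$heredoc"
```

Leaving the heredoc delimiter unquoted (<<EOF) would turn interpolation
back on, which is roughly the q versus qq distinction.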

Most languages don’t go to this extreme, but in languages that have raw
strings they are a way to turn off complicated features that you don’t
want to use in every instance.

As written, Naoya’s raw string patch allows the user to turn off string
escaping, but not to choose alternative delimiters (which has little or
no precedent in elisp) or to turn off string interpolation (which isn’t
built into the elisp syntax, but is instead implemented by library
functions such as format).
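
As a rough illustration of what turning off escaping buys, the sketch
below prints the same regexp written as a regular elisp string and in
the #r"..." form proposed by the patch. The #r form is only the syntax
under discussion in this thread, shown here as plain text; the regexp
itself is an arbitrary example.

```shell
#!/bin/sh
# Inside single quotes the shell leaves backslashes alone, so these
# variables hold the elisp source text exactly as written.

# Regular elisp string: each backslash of the regexp is doubled.
escaped='"\\(foo\\|bar\\)"'

# Proposed raw string (hypothetical syntax from the patch): the regexp
# \(foo\|bar\) appears as-is between the quotes.
raw='#r"\(foo\|bar\)"'

printf '%s\n' "$escaped" "$raw"
```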

Naoya, your patch looks fairly good to my unpractised eye, but you might
consider adding an error message for malformed expressions such as
#r'foo', where the character after the r isn’t a double quote character.

Probably best to start thinking about how to document the syntax in the
elisp manual too.

Personally, I quite like the idea. Raw strings are useful for a lot more
than just regular expressions.

db48x


