From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Matthew Plant <maplant2@illinois.edu>
Newsgroups: gmane.emacs.devel
Subject: Re: Raw string literals in Emacs lisp.
Date: Sun, 27 Jul 2014 18:17:46 -0500
Message-ID: <CAMbiG3-6e2wvOFpM8-LDjVbRkSqb4O24CQ6sfbCqSsF5yB6-EA@mail.gmail.com>
References: <CAMbiG3_eorJe+71ZGaM33w+BqS12izYex4NdD_bMtORqb+x+Vg@mail.gmail.com>
	<878ungor1v.fsf@uwakimon.sk.tsukuba.ac.jp>
	<CAMbiG39qUuq3daUqMbKjDRaakSceU1FhsyhOvNvNqv0wErX1BQ@mail.gmail.com>
	<AECFD120-4664-485C-89AB-B1D367013BB0@gmail.com>
	<CAMbiG3_As8YQpLQec9oMwR4vOytLdG8jff2M5Yy3kitD9zQ5Rw@mail.gmail.com>
	<8761ijng08.fsf@uwakimon.sk.tsukuba.ac.jp>
	<871tt7lzro.fsf@fencepost.gnu.org> <53D567FD.4030708@porkrind.org>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=001a1132f322b0467604ff35039b
X-Trace: ger.gmane.org 1406503087 15622 80.91.229.3 (27 Jul 2014 23:18:07 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Sun, 27 Jul 2014 23:18:07 +0000 (UTC)
Cc: "emacs-devel@gnu.org" <emacs-devel@gnu.org>
To: David Caldwell <david@porkrind.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Jul 28 01:18:01 2014
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1XBXhZ-0008Hw-JG
	for ged-emacs-devel@m.gmane.org; Mon, 28 Jul 2014 01:18:01 +0200
Original-Received: from localhost ([::1]:36659 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1XBXhY-0001oU-Ud
	for ged-emacs-devel@m.gmane.org; Sun, 27 Jul 2014 19:18:00 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:39148)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <maplant2@illinois.edu>) id 1XBXhS-0001oN-9k
	for emacs-devel@gnu.org; Sun, 27 Jul 2014 19:17:58 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <maplant2@illinois.edu>) id 1XBXhM-0004Qx-VQ
	for emacs-devel@gnu.org; Sun, 27 Jul 2014 19:17:54 -0400
Original-Received: from mail-lb0-f170.google.com ([209.85.217.170]:43227)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <maplant2@illinois.edu>) id 1XBXhM-0004Qd-KN
	for emacs-devel@gnu.org; Sun, 27 Jul 2014 19:17:48 -0400
Original-Received: by mail-lb0-f170.google.com with SMTP id w7so5313162lbi.29
	for <emacs-devel@gnu.org>; Sun, 27 Jul 2014 16:17:47 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:mime-version:in-reply-to:references:date
	:message-id:subject:from:to:cc:content-type;
	bh=rDy4IaXm+6doT8wvMI8cpJxNYAFRMGaIbAj9mCwkw4k=;
	b=bKWvzJ2viWIxeiKDs+vNPvLlFO7BnL2VJu+tZG2O0xTHSlZf0M6lV4CKsKa3i6LxN2
	5cJKFswj+BXZ/XYH2vaZwk4f3yEFmDYc6TNmcYuiJP7UWMEg76e1ki4gsTlJK88NiDW+
	K827roeWPF0yQ8Pm7mKrxKEwnjiEXP4Dhm+ctsMN3l+sX3/HbPjnddlvjEFsMBBNwRlN
	qYIXKnqPBp/xsKG3vvG1SPJHHB90hiS+75kfvFLzXvFJMqMxlSmEpyCTuTjUCXfemNs1
	NUlQWsmQoiBni3+8t2geZJd3qR5lYLkaaoZgLi/p4TmNYGa1c4rQgKYrup0rm5yGmixn
	7TFA==
X-Gm-Message-State: ALoCoQlWQ3E09jAcAAsfNwTNRX6JDde8Tuhp9YkqkjhbDtQNAWpwki4adQ/mEpBEZH8Xv4lui+MS
X-Received: by 10.152.9.233 with SMTP id d9mr6087887lab.66.1406503067068; Sun,
	27 Jul 2014 16:17:47 -0700 (PDT)
Original-Received: by 10.112.185.99 with HTTP; Sun, 27 Jul 2014 16:17:46 -0700 (PDT)
In-Reply-To: <53D567FD.4030708@porkrind.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
	[fuzzy]
X-Received-From: 209.85.217.170
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:173191
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/173191>

--001a1132f322b0467604ff35039b
Content-Type: text/plain; charset=UTF-8

I think this is a very good idea. However, agreeing on which semantics
are needed may prove problematic. Do you have any suggestions on this
point? The easiest approach would probably be to adopt some other
language's predefined rules, like Perl's (but definitely not Perl's).
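
To make the question concrete, here is a minimal sketch of one possible
rule set, modeled in Python only to keep it short. Everything in it is an
assumption for discussion: the name read_regexp_literal, the escape table,
and the choice of "\/" as the way to write a literal delimiter are all
hypothetical, and nothing like this exists in the Emacs reader today. It
follows the rules quoted below: the reader translates \n, \t and \f, and
passes every other backslash sequence (such as \( or \b) through to the
regexp engine untouched.

```python
# Hypothetical escape table: the only sequences the reader itself translates.
# "/" is included on the assumption that the literal is delimited by slashes.
READER_ESCAPES = {"n": "\n", "t": "\t", "f": "\f", "/": "/"}


def read_regexp_literal(body: str) -> str:
    """Return the string a regexp engine would receive for the text found
    between the delimiters of a hypothetical #r/.../ literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Translate the known escapes; keep the backslash on anything
            # else (e.g. \( \| \b \\) so the regexp engine still sees it.
            out.append(READER_ESCAPES.get(nxt, ch + nxt))
            i += 2
        else:
            # Ordinary character (or a lone trailing backslash): copy as-is.
            out.append(ch)
            i += 1
    return "".join(out)
```

Under these rules the literal text \(foo\|bar\)\n would reach the regexp
engine as \(foo\|bar\) followed by a real newline, with no doubled
backslashes in the source. Whether the table should be this small is
exactly the part that would need agreement.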

-Matt

On Sunday, July 27, 2014, David Caldwell <david@porkrind.org> wrote:

> On 7/27/14 6:03 AM, David Kastrup wrote:
> > "Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp> writes:
> >
> >> Sure, you can do a lot for readability as PCRE or Python regexps have
> >> done, but regexps are unreadable almost by design, and those regexp
> >> syntaxes benefit from rawstrings, too.  Almost anything (that doesn't
> >> involve changing the meaning of existing legal programs) that improves
> >> readability of regexps is worthwhile.
> >>
> >> Rawstrings are cheap and effective.
> >
> > When rawstrings are supported, it becomes more expedient to recognize
> > things like \n and \t, probably also \f in regexps (\b is already
> > taken).  At the current point of time, they just evaluate to n and t.
> > That makes input of tabs and newlines in raw strings a nuisance and a
> > potential source of errors.
> >
> > It's not actually an issue with rawstrings as such, but rather with
> > their use within regexps.
>
> Why not, then, skip rawstrings completely and go directly to a regular
> expression reader: #r// (or even just #//) instead of #r""?
>
> Then you can add whatever semantics are needed for good regexp reading
> (i.e., let '\n', '\t', and others be translated by the string reader,
> but allow '\(' to pass through untouched). This would be just as easy
> to implement as raw strings.
>
> Languages like JavaScript, Perl, Ruby, Bash, and Groovy have shown that
> having special support for regexps at the language level is a very
> effective way of dealing with them. Plus it opens the door to
> extensions: #r//p for PCRE/Perl syntax[1] or #r//x for more readable
> regexps[2], etc.
>
> I think using rawstrings is too generic an answer to the problem. Given
> that so much of Emacs's functionality is reliant on regular expressions,
> it makes sense to design something specifically for them. Doing that
> means they can be tailored and tweaked for maximum functionality without
> worrying about other possible uses that people might come up with (which
> will undoubtedly happen with rawstrings).
>
> -David
>
> [1] And practically every other language on the planet. Really, it seems
> like only Emacs is left in the dark ages of basic POSIX regexps, where
> '(' means a literal paren rather than the start of a group.
>
> [2] Another Perl feature: it allows whitespace and comments in regexps,
> for much improved readability. See http://perldoc.perl.org/perlre.html#/x
>
>

--001a1132f322b0467604ff35039b--