From: tomas@tuxteam.de
To: Stefan Monnier
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: "Raw" string literals for elisp
Date: Thu, 9 Sep 2021 09:04:31 +0200
Message-ID: <20210909070431.GB16259@tuxteam.de>

On Wed, Sep 08, 2021 at 04:18:23PM -0400, Stefan Monnier wrote:
> > I just think these are two separate dimensions which happen to align
> > in the "regexp and backslash" case.
>
> BTW, they can align in somewhat funny ways sometimes.
> E.g. the raw-string version of the regexp "[ \t\n]" turns into
> something like
>
> #r"[ 
> ]"

Actually I was using "align" in a rather metaphorical sense, but you
are making a very good point: one might want to have some of the
"classical C escapes" (\n, \r and some of their ilk, perhaps even \b),
but then `raw' wouldn't be raw anymore.

> which is not ideal in terms of clarity. Similarly a regexp that matches
> the NUL character will be problematic when written as a raw string
> because it will need to embed the NUL character in the source code,
> which in turn will cause tools like `grep` to treat the file as binary.
>
> For the first problem above we can/should extend our regexp syntax to
> include \t and \n as regexps that match TAB and LF respectively (that
> would also be handy when writing regexps in the minibuffer).

For regexps proper there's an escape hatch, since there is a language
"on top" that could be extended a bit (e.g. via the [:...:] character
class notation or something). But that would be unwieldy indeed.

> But \0 is already used for other things so there's no such "obvious"
> workaround for the second case :-(

Yes, the very handy `\x' notation has much history. Hard to move
within that cupboard without breaking anything :-)

Cheers
 - t