From: Pierre Neidhardt
Newsgroups: gmane.emacs.devel
Subject: Re: rx.el sexp regexp syntax (WAS: Off Topic)
Date: Fri, 25 May 2018 10:52:03 +0200
Message-ID: <87h8mw3yoc.fsf@gmail.com>
To: rms@gnu.org
Cc: van@scratch.space, eliz@gnu.org, Noam Postavsky, emacs-devel@gnu.org

rx.el is one of the best concepts I've discovered in a long time.
It's another instance of "Don't come up with a new (mini)language when
Lisp can do better": it's easier to learn, more flexible, easier to
write, much easier to read and, as a consequence, much more
maintainable.

> Some people, when confronted with a problem, think "I know, I'll use
> regular expressions."  Now they have two problems.
> -- Jamie Zawinski

It's also much more "programmable" thanks to its `eval' expression.
(It's possible to count!)

See http://francismurillo.github.io/2017-03-30-Exploring-Emacs-rx-Macro/
for some nice examples.

I think it's high time we moved away from traditional regexps and
embraced the concept of rx.el.  I'm thinking of implementing it for
Guile.

At the moment the rx.el implementation is built on top of Emacs
regexps, which are implemented in C.  I believe this does not use the
power of Lisp as much as it could.

Traditional regexps work in two steps: first build a black-box
automaton from the string expression, then test whether the input
matches.  Building the automaton is costly.  In C, we build it once
and save the result in a variable so that later matches do not
rebuild it.  In high-level languages, automatons are usually cached
automatically to save the cost of building them.

The rx.el library/concept could alleviate this issue altogether:
because we express the automaton directly in Lisp, the parsing step is
not needed and thus the building cost could be tremendously reduced.
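
As a rough illustration of the two routes contrasted here, the sketch
below translates a structured form to a regexp string that the engine
must parse again, and alternatively builds a matcher straight from the
same form.  It is written in Python only for illustration: nested
tuples stand in for Lisp forms, and every name in it (DIGIT,
to_regex_string, compile_direct, FORM) is hypothetical.  It is not how
rx.el or the Emacs regexp engine work, and the "direct" matcher is a
small backtracking matcher made of closures, not a true automaton.

```python
import re

DIGIT = ("digit",)  # hypothetical stand-in for a symbol such as rx's `digit'


def to_regex_string(form):
    """Route 1: translate the form to a regexp string for `re` to parse."""
    if isinstance(form, str):
        return re.escape(form)
    op, *args = form
    if op == "digit":
        return "[0-9]"
    if op == "seq":
        return "".join(to_regex_string(a) for a in args)
    if op == "or":
        return "(?:" + "|".join(to_regex_string(a) for a in args) + ")"
    if op == "one-or-more":
        return "(?:" + to_regex_string(args[0]) + ")+"
    raise ValueError(f"unknown operator: {op!r}")


def compile_direct(form):
    """Route 2: build a matcher from the form with no string in between.

    The result is a function (text, start) -> iterator over the end
    positions of every way the form can match at `start`.
    """
    if isinstance(form, str):
        def literal(text, i):
            if text.startswith(form, i):
                yield i + len(form)
        return literal
    op, *args = form
    if op == "digit":
        def digit(text, i):
            if i < len(text) and text[i] in "0123456789":
                yield i + 1
        return digit
    subs = [compile_direct(a) for a in args]
    if op == "seq":
        def seq(text, i, k=0):
            if k == len(subs):
                yield i
            else:
                for j in subs[k](text, i):
                    yield from seq(text, j, k + 1)
        return seq
    if op == "or":
        def alt(text, i):
            for matcher in subs:
                yield from matcher(text, i)
        return alt
    if op == "one-or-more":
        (inner,) = subs

        def plus(text, i):
            for j in inner(text, i):
                if j > i:  # only repeat after progress, to avoid looping
                    yield from plus(text, j)  # longer matches first
                yield j
        return plus
    raise ValueError(f"unknown operator: {op!r}")


def full_match_direct(matcher, text):
    """True if some way of matching consumes the whole text."""
    return any(end == len(text) for end in matcher(text, 0))


# A version-like string: "v" or "V", digits, a dot, digits.
FORM = ("seq", ("or", "v", "V"), ("one-or-more", DIGIT), ".",
        ("one-or-more", DIGIT))

if __name__ == "__main__":
    via_string = re.compile(to_regex_string(FORM))
    direct = compile_direct(FORM)
    for text in ["v1.20", "V10.3", "v.3", "1.2"]:
        print(text, bool(via_string.fullmatch(text)),
              full_match_direct(direct, text))
```

Both routes should agree on every input; the only difference is
whether a string has to be generated and parsed on the way.  Whether
the direct route is actually faster would have to be measured.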
So the rx.el building steps

  rx expression -> regexp string -> C regexp automaton

could boil down to simply

  rx automaton

It would be interesting to compare the performance.

This also means that there would be no need for caching on the part of
the supporting language.

What do you think?

-- 
Pierre Neidhardt