unofficial mirror of emacs-devel@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: emacs-devel <emacs-devel@gnu.org>
Subject: New rx implementation with extension constructs
Date: Mon, 2 Sep 2019 23:19:47 +0200
Message-ID: <DCF786B9-E536-48C3-9C8D-92E948A90183@acm.org>

The rx regexp notation is nice to use, but the implementation isn't wonderful. I propose a replacement, rewritten from the ground up: it is cleaner, has fewer bugs, and is maybe twice as fast.

Most importantly, there is now a proper extension mechanism: for global definitions,

 (rx-define snobol-identifier (seq alpha (0+ alnum)))

which are available anywhere, and local ones,

 (rx-let ((natnum (1+ digit))
          (integer (seq (opt "-") natnum)))
   ...body...)

where the set of definitions is only available within a lexical scope. This zero-cost construct can be placed inside a function, or at top level, enclosing multiple variable and function definitions that all share the same named rx forms.
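
As a minimal sketch of how the two constructs might be used, assume the new implementation is loaded under the name `rx' (as it is in Emacs 27 and later). The function names my-snobol-identifier-p and my-integer-p are hypothetical, chosen only for this illustration:

```elisp
;; Assumes the new implementation is available as `rx'.
(require 'rx)

;; Global definition: usable in any rx form expanded after this point.
(rx-define snobol-identifier (seq alpha (0+ alnum)))

;; Hypothetical helper, not part of the proposal.
(defun my-snobol-identifier-p (string)
  "Return non-nil if STRING as a whole matches `snobol-identifier'."
  (string-match-p (rx bos snobol-identifier eos) string))

;; Local definitions: visible only to rx forms inside the rx-let body.
;; Here rx-let sits at top level and encloses a function definition.
(rx-let ((natnum (1+ digit))
         (integer (seq (opt "-") natnum)))
  ;; Hypothetical helper, not part of the proposal.
  (defun my-integer-p (string)
    "Return non-nil if STRING is a decimal integer with an optional minus sign."
    (string-match-p (rx bos integer eos) string)))
```

Outside the rx-let body, natnum and integer are not defined as rx forms, while snobol-identifier remains available everywhere.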

Both rx-define and rx-let admit two kinds of definitions:

 NAME RX-FORM
 NAME (ARGS...) RX-FORM

for plain rx symbols and for parametrised forms, respectively. For example:

 (rx-let ((name (1+ letter))
          (comma-separated (x) (seq x (0+ "," x))))
   (rx (comma-separated name)))

works just as expected: it yields the same regexp as (rx (seq (1+ letter) (0+ "," (1+ letter)))). &rest arguments are permitted, and expand to implicit (seq ...) forms.
No provision was made for macros able to execute arbitrary Lisp code; I just couldn't find a use for them, and decided to wait until someone would tell me otherwise. Thus, all parametrised forms work by plain substitution.
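
The following sketch illustrates a parametrised form with an &rest argument alongside an ordinary one. The form name bracketed and the constant my-name-list-regexp are hypothetical, introduced only for this example, and the code again assumes the new implementation is loaded as `rx':

```elisp
;; Assumes the new implementation is available as `rx'.
(require 'rx)

(rx-let ((name (1+ letter))
         ;; One parameter: X is substituted wherever it appears.
         (comma-separated (x) (seq x (0+ "," x)))
         ;; &rest parameter (hypothetical form): all arguments are
         ;; collected into ITEMS and substituted as a sequence.
         (bracketed (&rest items) (seq "[" items "]")))
  ;; Hypothetical constant, not part of the proposal.
  (defconst my-name-list-regexp
    (rx bos (bracketed (comma-separated name)) eos)
    "Regexp matching a bracketed, comma-separated list of names."))

;; Non-nil: the whole string is a bracketed list of names.
(string-match-p my-name-list-regexp "[foo,bar,baz]")
```

Since everything is done by substitution, the rx form above is equivalent to writing (seq "[" (seq (1+ letter) (0+ "," (1+ letter))) "]") by hand between bos and eos.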

The code currently resides at https://gitlab.com/mattiase/ry; it will naturally be renamed to `rx' once it's in the Emacs tree. It can be integrated in a separate branch of the Emacs source repo if you wish, or as patches if you prefer that for reviewing. The diffs don't make much sense since it is a reimplementation with very little in common with the old code.

The exact form of the extension mechanism isn't set in stone, and I'd welcome any suggestions for improvement.




