From: Noam Postavsky <npostavs@gmail.com>
To: "Mattias Engdegård" <mattiase@acm.org>
Cc: emacs-devel <emacs-devel@gnu.org>
Subject: Re: New rx implementation with extension constructs
Date: Thu, 5 Sep 2019 11:38:23 -0400
Message-ID: <CAM-tV-8jURNdweBRp+wS0U5dARVwUNUsKAOccFWqYDzBeLELMg@mail.gmail.com>
In-Reply-To: <1C71289F-C5D5-4F9C-947C-374110C1D572@acm.org>
> works just as expected. &rest arguments are permitted, and expand to
> implicit (seq ...) forms. No provision was made for macros able to
> execute arbitrary Lisp code; I just couldn't find a use for them, and
> decided to wait until someone would tell me otherwise. Thus, all
> parametrised forms work by plain substitution.
Do you mean that macros don't support (literal LISP-FORM) and (regexp
LISP-FORM)? Or something else?
> +;; The `rx--translate...' functions below return (REGEXP . PRECEDENCE),
> +;; where REGEXP is a list of string expressions that will be
> +;; concatenated into a regexp, and PRECEDENCE is one of
> +;;
> +;;  t    -- can be used as argument to postfix operators
> +;;  seq  -- can be concatenated in sequence with other seq or higher
> +;;  lseq -- can be concatenated to the left of rseq or higher
> +;;  rseq -- can be concatenated to the right of lseq or higher
> +;;  nil  -- can only be used in alternatives
> +;;
> +;; They form a lattice:
> +;;
> +;;           t          highest precedence
> +;;           |
> +;;          seq
> +;;         /   \
> +;;      lseq   rseq
> +;;         \   /
> +;;          nil         lowest precedence
It would help to add some concrete examples (e.g., regexps that
would count as `t', `seq', etc.) to this abstract explanation.
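
To make the request concrete, here is a toy sketch of what I understand
the top of the lattice to mean. The classifications in the comments are
my own reading of the definitions quoted above, not something taken
from the patch, and `demo-star' is a made-up helper that is not part of
rx:

```elisp
(defun demo-star (regexp precedence)
  "Return REGEXP with a postfix `*' applied.
Bracket REGEXP first unless PRECEDENCE is t, since only t may be
used directly as the argument of a postfix operator."
  (if (eq precedence t)
      (concat regexp "*")
    (concat "\\(?:" regexp "\\)*")))

;; My guesses at one regexp string per precedence level:
;;   "a"      t     a single character takes a postfix operator as-is
;;   "ab"     seq   fine inside a sequence, needs brackets before `*'
;;   "^a"     lseq  only safe as the left end of a sequence
;;   "a$"     rseq  only safe as the right end of a sequence
;;   "a\\|b"  nil   needs brackets before it can be concatenated
(demo-star "a" t)      ; => "a*"
(demo-star "ab" 'seq)  ; => "\\(?:ab\\)*"
```

Something along the lines of that comment table, with whatever the
correct examples are, is what I had in mind.
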
> +(defun rx--translate-symbol (sym)
> +  "Translate an rx symbol. Return (REGEXP . PRECEDENCE)."
> +  (pcase sym
> +    ((or 'nonl 'not-newline 'any) (cons (list ".") t))
Is there a reason not to use '((".") . t) here (and similarly for the
rest of the alternatives)? If so, it's probably worth mentioning in a
comment.
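
The only behavioural difference I can think of is object identity: the
quoted form returns the same literal object on every call, while
`cons'/`list' build fresh ones, which matters only if a caller modifies
the result destructively. A minimal sketch, with made-up function names
that are not in the patch:

```elisp
(defun demo-fresh ()
  "Return a newly allocated (REGEXP . PRECEDENCE) pair on each call."
  (cons (list ".") t))

(defun demo-const ()
  "Return the same quoted constant on each call."
  '((".") . t))

(equal (demo-fresh) (demo-const))  ; => t, the values are structurally equal
(eq (demo-fresh) (demo-fresh))     ; => nil, each call allocates new conses
(eq (demo-const) (demo-const))     ; => t, the literal is shared
```

Whether any caller in the patch actually mutates the returned list, I
haven't checked.
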
> +(defun rx--string-to-intervals (str)
> + "Decode STR as intervals: A-Z becomes (?A . ?Z), and the single
> +character X becomes (?X . ?X). Return the intervals in a list."
> +  ;; We could just do string-to-multibyte on the string and work with
> +  ;; that instead of this `decode-char' workaround.
>    (let ((decode-char
> -         ;; Make sure raw bytes are decoded as such, to avoid confusion with
> -         ;; U+0080..U+00FF.
>           (if (multibyte-string-p str)
>               #'identity
>             (lambda (c) (if (<= #x80 c #xff)
> @@ -483,477 +280,657 @@ rx-check-any-string
>                           c))))
If not using string-to-multibyte, I think this lambda can be replaced
with #'unibyte-char-to-multibyte.
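
A quick check of that claim, as a sketch: `demo-decode-raw-byte' is a
made-up stand-in for the lambda, and since the hunk above cuts off its
middle line, the #x3fff00 offset is my assumption about what it does
(it is the usual offset of the raw-byte character range):

```elisp
(require 'cl-lib)

(defun demo-decode-raw-byte (c)
  "Map a raw byte C in #x80..#xff to its raw-byte character code.
Return any other value of C unchanged.  The offset is an assumption."
  (if (<= #x80 c #xff)
      (+ c #x3fff00)
    c))

;; Compare the stand-in with the built-in over every unibyte value.
(cl-every (lambda (c)
            (= (demo-decode-raw-byte c)
               (unibyte-char-to-multibyte c)))
          (number-sequence 0 #xff))  ; => t
```

Note that `unibyte-char-to-multibyte' only accepts values up to #xff,
which should be fine here since it would only be applied to the
characters of a unibyte string.
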