unofficial mirror of emacs-devel@gnu.org 
From: Noam Postavsky <npostavs@gmail.com>
To: "Mattias Engdegård" <mattiase@acm.org>
Cc: emacs-devel <emacs-devel@gnu.org>
Subject: Re: New rx implementation with extension constructs
Date: Thu, 5 Sep 2019 11:38:23 -0400
Message-ID: <CAM-tV-8jURNdweBRp+wS0U5dARVwUNUsKAOccFWqYDzBeLELMg@mail.gmail.com>
In-Reply-To: <1C71289F-C5D5-4F9C-947C-374110C1D572@acm.org>

> works just as expected. &rest arguments are permitted, and expand to
> implicit (seq ...) forms.  No provision was made for macros able to
> execute arbitrary Lisp code; I just couldn't find a use for them, and
> decided to wait until someone would tell me otherwise. Thus, all
> parametrised forms work by plain substitution.

Do you mean that macros don't support (literal LISP-FORM) and (regexp
LISP-FORM)?  Or something else?

> +;; The `rx--translate...' functions below return (REGEXP . PRECEDENCE),
> +;; where REGEXP is a list of string expressions that will be
> +;; concatenated into a regexp, and PRECEDENCE is one of
> +;;
> +;;  t    -- can be used as argument to postfix operators
> +;;  seq  -- can be concatenated in sequence with other seq or higher
> +;;  lseq -- can be concatenated to the left of rseq or higher
> +;;  rseq -- can be concatenated to the right of lseq or higher
> +;;  nil  -- can only be used in alternatives
> +;;
> +;; They form a lattice:
> +;;
> +;;           t          highest precedence
> +;;           |
> +;;          seq
> +;;         /   \
> +;;      lseq   rseq
> +;;         \   /
> +;;          nil         lowest precedence

It would help to add some concrete examples (e.g., regexps that would
count as `t', `seq', etc.) to this abstract explanation.
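
To illustrate the kind of example meant, here is a minimal sketch using
plain regexp strings. It relies only on ordinary Emacs regexp syntax; the
helper name is hypothetical, and the mapping of each case to a precedence
level is my reading of the comment, not something the patch states. It
does not attempt to cover `lseq' or `rseq'.

```emacs-lisp
;; Hypothetical helper: non-nil if REGEXP matches all of STR.
(defun my-full-match-p (regexp str)
  (and (string-match-p (concat "\\`\\(?:" regexp "\\)\\'") str) t))

;; "a" can take a postfix operator directly: "a*" means zero or more "a".
;; This looks like the `t' case.
(my-full-match-p "a*" "aaa")

;; "ab" cannot: "ab*" means "a" followed by any number of "b", so a
;; bracket is needed before applying `*'.  This looks like the `seq' case.
(my-full-match-p "ab*" "abab")            ; naive concatenation fails
(my-full-match-p "\\(?:ab\\)*" "abab")    ; bracketed form succeeds

;; "a\\|b" cannot even be concatenated: "a\\|bc" means "a" or "bc".
;; This looks like the `nil' case.
(my-full-match-p "a\\|bc" "ac")           ; naive concatenation fails
(my-full-match-p "\\(?:a\\|b\\)c" "ac")   ; bracketed form succeeds
```

One or two such strings next to each level in the comment would probably
be enough.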

> +(defun rx--translate-symbol (sym)
> +  "Translate an rx symbol.  Return (REGEXP . PRECEDENCE)."
> +  (pcase sym
> +    ((or 'nonl 'not-newline 'any) (cons (list ".") t))

Is there a reason not to use '((".") . t) here (and similarly for the
rest of the alternatives)?  If so, then it's probably worth mentioning
in a comment.
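
For context, a minimal sketch of the one behavioral difference between the
two spellings: a quoted constant is the same object on every call, while
`cons' and `list' allocate fresh structure each time. The function names
are hypothetical. Whether this matters for rx depends on whether any
caller modifies the returned list destructively, which this sketch does
not establish.

```emacs-lisp
;; Hypothetical stand-ins for one branch of `rx--translate-symbol'.
(defun my-fresh-result ()
  ;; Allocates a new cons and a new list on every call.
  (cons (list ".") t))

(defun my-quoted-result ()
  ;; Returns the same constant object on every call.
  '((".") . t))

(defun my-same-object-p (fn)
  "Return t if two calls to FN return the very same (`eq') object."
  (eq (funcall fn) (funcall fn)))
```

Both functions return `equal' values, but only the quoted one returns an
`eq' object across calls, so a destructive operation on its result would
alter the constant itself.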

> +(defun rx--string-to-intervals (str)
> +  "Decode STR as intervals: A-Z becomes (?A . ?Z), and the single
> +character X becomes (?X . ?X).  Return the intervals in a list."
> +  ;; We could just do string-to-multibyte on the string and work with
> +  ;; that instead of this `decode-char' workaround.
>    (let ((decode-char
> -         ;; Make sure raw bytes are decoded as such, to avoid confusion with
> -         ;; U+0080..U+00FF.
>           (if (multibyte-string-p str)
>               #'identity
>             (lambda (c) (if (<= #x80 c #xff)
> @@ -483,477 +280,657 @@ rx-check-any-string
>                           c))))

If not using string-to-multibyte, I think this lambda can be replaced
with #'unibyte-char-to-multibyte.
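
A minimal sketch of the equivalence I have in mind. The hunk above cuts off
the lambda body, so the `(+ c #x3fff00)' branch below is my assumption
about the elided part, and the helper names are hypothetical. Note that
`unibyte-char-to-multibyte' signals an error for arguments above #xff, so
it only fits the unibyte branch, which is where the lambda is used.

```emacs-lisp
;; Hypothetical helper mirroring the lambda in the quoted hunk.
;; ASSUMPTION: the elided branch maps bytes #x80..#xff to raw-byte
;; characters by adding #x3fff00.
(defun my-decode-byte (c)
  (if (<= #x80 c #xff)
      (+ c #x3fff00)
    c))

(defun my-decode-agrees-p ()
  "Return t if `my-decode-byte' and `unibyte-char-to-multibyte' agree
on every unibyte character code 0..255."
  (let ((ok t))
    (dotimes (c 256)
      (unless (= (my-decode-byte c) (unibyte-char-to-multibyte c))
        (setq ok nil)))
    ok))
```

Evaluating (my-decode-agrees-p) is a quick way to check the assumption
before swapping the lambda out.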
