From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Alan Mackenzie <acm@muc.de>
Cc: emacs-devel@gnu.org
Subject: Re: Fixing ill-conditioned regular expressions. Proof of concept.
Date: Fri, 13 Mar 2015 18:53:38 -0400 [thread overview]
Message-ID: <jwvh9topm9f.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <20150223181205.GA2861@acm.fritz.box> (Alan Mackenzie's message of "Mon, 23 Feb 2015 18:12:05 +0000")
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> ;;
> ;; f i x - r e . e l
This is not using the standard file format. I strongly suggest you use
something like auto-insert-mode which will get that right for you.
> ;; The AST is an internal representation of the regular expression string.
> ;; Elements are represented as follows:
Why invent a new format, instead of using rx? I'm pretty sure someone
has already provided a parser from string to rx format (and of course
we also already have a dumper from rx to string).
> ;; PTR and AD
> ;; ----------
> ;; Whilst building and transforming trees, what we really need is pointers to
> ;; either the car or cdr of a cons cell, to enable us to
> ;; overwrite them.
I think your code will gain a lot in readability if you made it
functional: take an rx expression and return a *new* rx expression
without modifying the original one.
As it stands, I find it rather impenetrable.
> Return 'shortened, t or nil.
Our conventions say this should be
Return `shortened', t or nil.
> (if (eq ad 'cdr) (error "fix-re--R+R*->R+ got 'cdr"))
> (let ((e0 (if (eq ad 'car) (car ptr) (cadr ptr)))
> (e1 (if (eq ad 'car) (cadr ptr) (caddr ptr))))
> (when (and (consp e0) (consp e1))
> (let ((op0 (car e0))
> (op1 (car e1))
> ;; (link (if (eq ad 'car) (cddr ptr) (cdddr ptr)))
> op
> )
> (when
> (and (memq op0 '(+ * \?))
> (memq op1 '(+ * \?))
> (equal (cdr e0) (cdr e1))) ; Both Rs are the same.
> (cond
> ((and (eq op0 '\?) (eq op1 '\?)) ; Cant combine R?R?
> nil)
> ((and (eq op0 '+) (eq op1 '+)) ; R+R+ -> RR+
> (fix-re--chop-+* ptr 'car)
> t)
> (t
> (setq op
> (cond
> ((or (eq op0 '+) (eq op1 '+)) '+)
> ((or (eq op0 '*) (eq op1 '*)) '*)))
> (setcar e0 op)
> (fix-re--chop (if (eq ad 'car) ptr (cdr ptr)) 'cadr)
> 'shortened))
> )))))
For example, I think that if you make it functional rather than
imperative, the above could turn into something along the lines of
(pcase ptr ;We don't have `ad' any more.
(`((,op0 . ,r0)
(,op1 . ,r1)
. (and ,rest
(guard (and (equal r0 r1)
(memq op0 '(+ * \?))
(memq op1 '(+ * \?)))))))
(cond
((and (eq op0 '\?) (eq op1 '\?)) ; Cant combine R?R?
ptr)
((and (eq op0 '+) (eq op1 '+)) ; R+R+ -> RR+
`(,(car r0) ,(car ptr) . ,rest))
(t
`((,(cond
((or (eq op0 '+) (eq op1 '+)) '+)
((or (eq op0 '*) (eq op1 '*)) '*)) . r0)
. ,rest)))))
-- Stefan>
prev parent reply other threads:[~2015-03-13 22:53 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-23 18:12 Fixing ill-conditioned regular expressions. Proof of concept Alan Mackenzie
2015-02-23 19:55 ` Paul Eggert
2015-02-23 20:21 ` Alan Mackenzie
2015-02-23 22:19 ` Paul Eggert
2015-02-23 22:42 ` Alan Mackenzie
2015-02-23 23:07 ` Artur Malabarba
2015-02-23 23:37 ` Paul Eggert
2015-02-25 10:08 ` Alan Mackenzie
2015-02-26 1:11 ` Stephen J. Turnbull
2015-02-26 8:46 ` Paul Eggert
2015-02-26 10:11 ` Alan Mackenzie
2015-02-26 11:05 ` Tassilo Horn
2015-02-26 13:09 ` Alan Mackenzie
2015-02-26 13:46 ` Stefan Monnier
2015-02-26 16:21 ` Alan Mackenzie
2015-02-26 19:12 ` Stefan Monnier
2015-02-26 20:01 ` Alan Mackenzie
2015-02-27 13:45 ` Stefan Monnier
2015-02-24 16:29 ` Stefan Monnier
2015-02-24 6:20 ` Philipp Stephani
2015-03-13 22:53 ` Stefan Monnier [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jwvh9topm9f.fsf-monnier+emacs@gnu.org \
--to=monnier@iro.umontreal.ca \
--cc=acm@muc.de \
--cc=emacs-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.