From: Alan Mackenzie <acm@muc.de>
To: Tom Tromey <tom@tromey.com>
Cc: rms@gnu.org, Pierre Neidhardt <ambrevar@gmail.com>,
Noam Postavsky <npostavs@gmail.com>,
emacs-devel@gnu.org, van@scratch.space, eliz@gnu.org
Subject: Re: rx.el sexp regexp syntax
Date: Sun, 27 May 2018 20:16:29 +0000
Message-ID: <20180527201629.GC11447@ACM>
In-Reply-To: <87a7slyr3v.fsf@tromey.com>
Hello, Tom.
On Sun, May 27, 2018 at 10:56:36 -0600, Tom Tromey wrote:
> >>>>> "Alan" == Alan Mackenzie <acm@muc.de> writes:
> >> Building the automaton is costly. In C, we build it once and save the
> >> result in a variable so that every regexp match does not rebuild the
> >> automaton each time.
> Alan> Emacs has a (moderately large) cache of regexps, so that building the
> Alan> automatons is done very rarely. Possibly just once each for each
> Alan> session of Emacs.
> I wonder about both of these statements.
> On the one hand, AFAICT the regex cache is 20 items. From search.c:
> #define REGEXP_CACHE_SIZE 20
> That seems pretty small to me, given how prevalent regexps are in elisp.
Hmm. I must have misremembered. I thought the cache size was 60, for
some reason. Now that RAM is measured in gigabytes, we could probably
increase that 20 (if there's any need).
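For illustration only, a small fixed-size cache of compiled regexps
might look roughly like the sketch below. It uses POSIX regcomp()
rather than Emacs's own regexp engine, evicts the least recently used
entry, and every name in it (cached_regcomp, struct cache_entry,
compile_count) is made up for the example. It is not the code in
search.c; only the size of 20 comes from there.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#define REGEXP_CACHE_SIZE 20    /* the value quoted from search.c */

/* Hypothetical cache slot; not the structure Emacs uses. */
struct cache_entry {
    char *pattern;        /* owned copy of the source string, NULL if unused */
    regex_t compiled;     /* result of regcomp() */
    unsigned long stamp;  /* last-use tick; 0 means the slot is empty */
};

static struct cache_entry cache[REGEXP_CACHE_SIZE];
static unsigned long tick;
static unsigned long compile_count;  /* number of regcomp() calls so far */

/* Return a compiled regexp for PATTERN, compiling only on a cache miss.
   Returns NULL if PATTERN does not compile or memory runs out. */
static regex_t *cached_regcomp(const char *pattern)
{
    assert(pattern != NULL);
    struct cache_entry *victim = &cache[0];

    for (size_t i = 0; i < REGEXP_CACHE_SIZE; i++) {
        struct cache_entry *e = &cache[i];
        if (e->pattern != NULL && strcmp(e->pattern, pattern) == 0) {
            e->stamp = ++tick;          /* hit: mark as recently used */
            return &e->compiled;
        }
        /* Track the smallest stamp: an empty slot (stamp 0) if there
           is one, otherwise the least recently used entry. */
        if (e->stamp < victim->stamp)
            victim = e;
    }

    /* Miss: release whatever the chosen slot held. */
    if (victim->pattern != NULL) {
        regfree(&victim->compiled);
        free(victim->pattern);
        victim->pattern = NULL;
        victim->stamp = 0;
    }

    char *copy = malloc(strlen(pattern) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, pattern);

    compile_count++;
    if (regcomp(&victim->compiled, pattern, REG_EXTENDED) != 0) {
        free(copy);                     /* slot stays empty on failure */
        return NULL;
    }
    victim->pattern = copy;
    victim->stamp = ++tick;
    return &victim->compiled;
}
```

With something like this, the cache size only matters when more than
REGEXP_CACHE_SIZE distinct regexps are in rotation at once; a counter
such as compile_count would show how often that actually happens.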
> On the other hand, in the past when I have tried to profile Emacs, I
> haven't seen regexp compilation show up too much. IIRC I did see regexp
> matching and the GC. Maybe this just points out the efficacy of the
> cache -- maybe 20 items is plenty.
Maybe. I just don't know.
> Perhaps the regexp matcher could use some micro-optimizations, like the
> token-threading the bytecode interpreter does.
> Alan> Are you suggesting here building an interpreter in Lisp directly to
> Alan> execute rx expressions?
> It's interesting, IMO, to consider compiling rx (or regexps generally)
> to lisp bytecode. Perhaps with the JIT, it would boost performance in
> some cases. (It may be slower, but it's worthwhile to do the
> experiment.)
> For other work in this area see Stefan's lex-parse-re package. I think
> it includes a regexp matcher in elisp.
I'll need to have a look at that.
> Tom
--
Alan Mackenzie (Nuremberg, Germany).