From: Alan Mackenzie <acm@muc.de>
To: Tom Tromey <tom@tromey.com>
Cc: rms@gnu.org, Pierre Neidhardt <ambrevar@gmail.com>,
	Noam Postavsky <npostavs@gmail.com>,
	emacs-devel@gnu.org, van@scratch.space, eliz@gnu.org
Subject: Re: rx.el sexp regexp syntax
Date: Sun, 27 May 2018 20:16:29 +0000
Message-ID: <20180527201629.GC11447@ACM>
In-Reply-To: <87a7slyr3v.fsf@tromey.com>

Hello, Tom.

On Sun, May 27, 2018 at 10:56:36 -0600, Tom Tromey wrote:
> >>>>> "Alan" == Alan Mackenzie <acm@muc.de> writes:

> >> Building the automaton is costly.  In C, we build it once and save the
> >> result in a variable so that every regexp match does not rebuild the
> >> automaton each time.

> Alan> Emacs has a (moderately large) cache of regexps, so that building the
> Alan> automatons is done very rarely.  Possibly just once each for each
> Alan> session of Emacs.

> I wonder about both of these statements.

> On the one hand, AFAICT the regex cache is 20 items.  From search.c:

>    #define REGEXP_CACHE_SIZE 20

> That seems pretty small to me, given how prevalent regexps are in elisp.

Hmm.  I must have misremembered.  I thought the cache size was 60, for
some reason.  Now that RAM is measured in gigabytes, we could probably
increase that 20 (if there's any need).
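
To make the trade-off concrete, here is a minimal sketch in C of a
fixed-size cache of compiled regexps.  It is not the code from search.c:
only REGEXP_CACHE_SIZE and its value of 20 come from the quote above.
The names cache_entry, cache_init, cache_get and compile_count are
hypothetical, POSIX regcomp stands in for Emacs's own regexp compiler,
and the move-to-front eviction policy is an assumption made for
illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define REGEXP_CACHE_SIZE 20   /* value quoted from search.c */

/* Hypothetical cache entry; the layout is illustrative, not Emacs's. */
struct cache_entry {
    char *pattern;             /* regexp source text, NULL if slot unused */
    regex_t compiled;          /* valid only while pattern != NULL */
    struct cache_entry *next;  /* next entry, most recently used first */
};

static struct cache_entry slots[REGEXP_CACHE_SIZE];
static struct cache_entry *head;
static unsigned long compile_count;  /* how often we really compiled */

/* Empty the cache and chain all slots into one list. */
static void cache_init(void)
{
    for (int i = 0; i < REGEXP_CACHE_SIZE; i++) {
        if (slots[i].pattern != NULL) {
            regfree(&slots[i].compiled);
            free(slots[i].pattern);
            slots[i].pattern = NULL;
        }
        slots[i].next = (i + 1 < REGEXP_CACHE_SIZE) ? &slots[i + 1] : NULL;
    }
    head = &slots[0];
    compile_count = 0;
}

/* Return the compiled form of PATTERN, compiling only on a cache miss.
   Returns NULL if PATTERN is invalid or memory runs out. */
static regex_t *cache_get(const char *pattern)
{
    if (head == NULL)
        cache_init();

    struct cache_entry **link = &head;
    struct cache_entry *e = head;

    /* Stop at a match, at the first unused slot, or at the last entry. */
    while (e->pattern != NULL && strcmp(e->pattern, pattern) != 0
           && e->next != NULL) {
        link = &e->next;
        e = e->next;
    }

    if (e->pattern == NULL || strcmp(e->pattern, pattern) != 0) {
        /* Miss: reuse this slot, evicting the least recently used
           entry if the cache is full. */
        size_t n = strlen(pattern) + 1;
        char *copy = malloc(n);
        if (copy == NULL)
            return NULL;
        memcpy(copy, pattern, n);
        if (e->pattern != NULL) {
            regfree(&e->compiled);
            free(e->pattern);
            e->pattern = NULL;
        }
        if (regcomp(&e->compiled, pattern, REG_EXTENDED) != 0) {
            free(copy);        /* slot stays unused at the tail */
            return NULL;
        }
        e->pattern = copy;
        compile_count++;
    }

    /* Move the entry to the front so hot patterns are found quickly. */
    *link = e->next;
    e->next = head;
    head = e;
    return &e->compiled;
}
```

With a cache like this, a loop that alternates between more than
REGEXP_CACHE_SIZE distinct patterns recompiles on every call, while
anything cycling through fewer patterns compiles each of them once.
Whether real Emacs workloads fall on the expensive side of that line is
exactly the open question here.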

> On the other hand, in the past when I have tried to profile Emacs, I
> haven't seen regexp compilation show up too much.  IIRC I did see regexp
> matching and the GC.  Maybe this just points out the efficacy of the
> cache -- maybe 20 items is plenty.

Maybe.  I just don't know.

> Perhaps the regexp matcher could use some micro-optimizations, like the
> token-threading the bytecode interpreter does.

> Alan> Are you suggesting here building an interpreter in Lisp directly to
> Alan> execute rx expressions?

> It's interesting, IMO, to consider compiling rx (or regexps generally)
> to lisp bytecode.  Perhaps with the JIT, it would boost performance in
> some cases.  (It may be slower, but it's worthwhile to do the
> experiment.)

> For other work in this area see Stefan's lex-parse-re package.  I think
> it includes a regexp matcher in elisp.

I'll need to have a look at that.

> Tom

-- 
Alan Mackenzie (Nuremberg, Germany).



