unofficial mirror of emacs-devel@gnu.org 
From: Helmut Eller <eller.helmut@gmail.com>
To: Danny McClanahan <dmcc2@hypnicjerk.ai>
Cc: Ihor Radchenko <yantar92@posteo.net>,
	 "emacs-devel@gnu.org" <emacs-devel@gnu.org>
Subject: Re: prior work on non-backtracking regex engine?
Date: Mon, 08 Apr 2024 14:19:13 +0200
Message-ID: <87edbgt5by.fsf@gmail.com>
In-Reply-To: <8eHWNAHASGpm2Qhs7mldB0VTdFaRW0NZJWi9ppNiCPZQ-QO8E7YLUXzJfT7LXYe1mTnwFGACouy7YroB8qsCFDbRuWYHr2gvOTagQY3TW7w=@hypnicjerk.ai> (Danny McClanahan's message of "Sun, 07 Apr 2024 04:42:13 +0000")

On Sun, Apr 07 2024, Danny McClanahan wrote:

> And I was also *super*
> pleased to see that regex-emacs.h itself doesn't expose any dependency
> on the gap buffer or other internal emacs representations (except
> regarding multibyte encoding). So in my amateur evaluation, emacs
> actually seems very well-placed to take advantage of high-performance
> regex engine techniques without any big structural changes.

What's the history of regex-emacs.h?  It seems like in the past it was
regex.h from Gnulib.  That would explain why the regex engine is
relatively well decoupled from the rest.  But it also leads to the
question: why does Emacs no longer use Gnulib's regex engine?

My guess is that it has something to do with the way Emacs performs
case-insensitive matches.  Another complication may be that Gnulib
doesn't support the non-greedy variants of some operators.

It seems that these days Gnulib uses a DFA-based algorithm when possible
and falls back to backtracking for backrefs (and presumably for
POSIX-compatible sub-match rules).  So one could argue that Gnulib
already has a lot of what we want and would be the natural place to add
a clean API for the features that Emacs needs.

So does anybody know the details of why Gnulib and Emacs went separate
ways in the past?

Helmut



