On Mon, Dec 12, 2022 at 04:19:19PM -0500, Stefan Monnier wrote:
> > The linked article sounded a bit like "only idiots implement regexen
> > in the naive way", and I was pretty sure Emacs devs are not idiots - but
> > now I understand the reasons better.

There comes my favourite motto: "all generalizations suck" (Michael
Heerdegen enjoyed it a couple of threads back).

Whoever uses a library these days uses PCRE, and this is, AFAIK, a
DFA-with-backtracking thingy. Note that I haven't read the code, so
I might well be wrong.

> I fully agree that it doesn't make sense to *start* with
> a backtracking implementation, yes.
> But once you've invested in one, it's harder to move to
> something better.
>
> This said, it *would* be better. Not only in terms of eliminating the
> pathological blow ups, but it also offers opportunity to get new
> functionality, such as the ability to capture the state of a regexp
> match at a specific buffer position (so you can perform a multiline
> regexp match one line at a time). It could also make it much more
> reasonable to add the possibility to run ELisp code from within the
> regexp match engine (e.g. add a \p(NAME) entry which calls the NAME
> ELisp function).

Perl does the latter. That has saved my bacon from time to time. So
that seems possible with a backtracker, too. Don't ask me how, though :-)

Saving state at any point would be cool -- you could easily invert
control (feeding the regexp machine a spoonful at a time). Think
network or an abstract buffer.

Cheers
-- 
t
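
P.S. To make the "spoonful at a time" idea concrete, here is a rough
sketch in Python. It is not how Emacs's (or anybody's) engine works: I
swapped in Brzozowski derivatives because they fit in a few lines, and
every name in it (Matcher, feed, snapshot, restore, ...) is made up for
illustration. It only does anchored whole-input matching, has no
captures, and does too little simplification to keep the state small
in general. The point is just that the whole match state is one plain
value, so the caller drives the loop and can stop, save and resume at
any buffer position.

```python
# Regexps as nested tuples. All names here are hypothetical, not a real API.
EMPTY = ("empty",)  # matches nothing at all
EPS = ("eps",)      # matches only the empty string

def char(c):
    return ("char", c)

def seq(a, b):
    if a == EMPTY or b == EMPTY:
        return EMPTY
    if a == EPS:
        return b
    if b == EPS:
        return a
    return ("seq", a, b)

def alt(a, b):
    if a == EMPTY:
        return b
    if b == EMPTY or a == b:
        return a
    return ("alt", a, b)

def star(a):
    if a in (EMPTY, EPS):
        return EPS
    return a if a[0] == "star" else ("star", a)

def nullable(r):
    """True if r matches the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return False  # "empty" and "char"

def deriv(r, c):
    """The regexp that matches whatever r matches after consuming c."""
    tag = r[0]
    if tag == "char":
        return EPS if r[1] == c else EMPTY
    if tag == "seq":
        d = seq(deriv(r[1], c), r[2])
        return alt(d, deriv(r[2], c)) if nullable(r[1]) else d
    if tag == "alt":
        return alt(deriv(r[1], c), deriv(r[2], c))
    if tag == "star":
        return seq(deriv(r[1], c), r)
    return EMPTY  # "eps" and "empty"

class Matcher:
    """Push-style matcher: the caller feeds input whenever it has some."""
    def __init__(self, regex):
        self.state = regex

    def feed(self, chunk):
        for c in chunk:
            self.state = deriv(self.state, c)
        return self

    def snapshot(self):
        return self.state          # immutable tuple, safe to keep around

    def restore(self, saved):
        self.state = saved

    def matched(self):
        return nullable(self.state)

    def dead(self):
        return self.state == EMPTY  # no further input can ever match

# a(b|c)*d, fed in two pieces (think: two lines, or two network packets)
regex = seq(char("a"), seq(star(alt(char("b"), char("c"))), char("d")))
m = Matcher(regex).feed("abc")
saved = m.snapshot()               # state at this "buffer position"
m.feed("bd")
print(m.matched())                 # True
m.restore(saved)
m.feed("x")
print(m.dead())                    # True
```

A backtracker has a harder time with this because part of its state
lives in the backtrack stack (and possibly the C stack), which is
awkward to hand back to the caller between chunks.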