unofficial mirror of emacs-devel@gnu.org 
* rosie/libpexl library for regex pattern composition
From: Danny McClanahan @ 2024-07-27 13:04 UTC
  To: emacs-devel@gnu.org

Hello emacs-devel,

I have recently become familiar with the Rosie Pattern Language (https://rosie-lang.org/) by Prof. Jamie Jennings at NCSU. I pinged this list a few months ago about improving the performance and worst-case behavior of regex-emacs.c. While I'm still working on a prototype for that, others on this list responded that a method to compose patterns in Lisp code might be quite useful as well.

While I understand that tree-sitter tends to be the more accepted way to parse program source, there remain many use cases where regex or something like it still applies, especially parsing the output of external processes (like the built-in M-x grep, or my extension https://github.com/cosmicexplorer/helm-rg). I became especially interested after reading https://rosie-lang.org/about/, which describes Rosie's focus on enabling maintainable/testable libraries of patterns; that seemed to correspond to my vision of what pattern composition might look like for Emacs extension developers.

While I believe Rosie has a build-time (and possibly run-time) dependency on Lua, PEXL (https://gitlab.com/pexlang/libpexl) is the author's new implementation and is written in very portable C99. It also has several new features and implementation techniques compared to Rosie. I'm still getting familiar with the project, so I can't speak to any standout features yet, but on its face it seems like a potential substrate on which we could build Lisp-level composable pattern abstractions.

Rosie/PEXL's goals are explicitly focused more on maintainability than sheer performance, so I'm thinking it might make sense to introduce Rosie as an interface separate from the regex engine, while keeping the regex engine narrowly focused on patterns that we can more easily optimize. For example, I was glad to hear in my previous communications with emacs-devel that there was some receptiveness to deprecating features like runtime lookup of mode-specific word boundaries from the regex engine if it would ease optimization (I'm not sure if that's necessary yet). One way we could avoid taking away more complex functionality that extension devs depend on, such as backrefs, is to direct them to a Lisp interface wrapping Rosie, which supports backrefs. It actually supports a strictly more powerful formalization of backrefs than regex engines do; see the author's post on it at https://jamiejennings.com/posts/2023-10-01-dont-look-back-3/.

As I said, I'm still becoming familiar with Rosie/PEXL, so I don't quite have enough info yet to make a more thorough proposal. But I'd love to know if others are familiar with this project and whether it might fit the use cases for Lisp-level pattern composition brought up in response to my previous communications about improving the regex engine.

Thanks,
Danny
