From: Danny McClanahan <dmcc2@hypnicjerk.ai>
To: "emacs-devel@gnu.org" <emacs-devel@gnu.org>
Subject: rosie/libpexl library for regex pattern composition
Date: Sat, 27 Jul 2024 13:04:28 +0000
Message-ID: <disFmiXATNq6Fywm_Y7OS4MzSw81H4xm0k1UQ_aAvvC2GsXEAYUXa94WCtJI5TMMhjXnyI_89vovpGAHB-G-kg1WA2-wssPGokj8LWpdAgo=@hypnicjerk.ai>

Hello emacs-devel,

I have recently become familiar with the Rosie Pattern Language (https://rosie-lang.org/) by Prof. Jamie Jennings at NCSU. I pinged this list a few months ago about improving the performance and worst-case behavior of regex-emacs.c, and I'm still working on a prototype for that. In the meantime, others on this list responded that a method to compose patterns in lisp code might be quite useful as well.

While I understand that tree-sitter tends to be the more accepted way to parse program source, there are still many use cases where regex or something like it applies, especially parsing the output of external processes (like the built-in M-x grep, or my extension https://github.com/cosmicexplorer/helm-rg). I became especially interested on reading https://rosie-lang.org/about/, which describes Rosie's focus on enabling maintainable, testable libraries of patterns; that seemed to correspond to my vision of what pattern composition might look like for Emacs extension developers.
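
As a rough illustration of what "composing patterns from named, testable parts" could mean for grep-style output, here is a minimal sketch. It is written in Python with ordinary regular expressions purely for illustration; it does not use Rosie or PEXL, and every name in it (FILE, LINE, TEXT, seq, parse_hit) is hypothetical.

```python
import re

# Hypothetical named building blocks for "file:line:text" output lines.
# These are plain regex fragments, not Rosie/PEXL patterns.
FILE = r"(?P<file>[^:\n]+)"
LINE = r"(?P<line>\d+)"
TEXT = r"(?P<text>.*)"


def seq(*parts: str) -> str:
    """Concatenate sub-patterns, isolating each in a non-capturing group."""
    return "".join(f"(?:{p})" for p in parts)


# The full pattern is assembled from the named parts above.
GREP_HIT = re.compile(seq(FILE, ":", LINE, ":", TEXT))


def parse_hit(output_line: str):
    """Return (file, line number, text) for a matching line, else None."""
    m = GREP_HIT.fullmatch(output_line)
    if m is None:
        return None
    return m["file"], int(m["line"]), m["text"]
```

Each part can be tested on its own and reused in other patterns, which is the maintainability property at stake here; a lisp-level interface would presumably express the same idea with lisp forms rather than string concatenation.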

While I believe Rosie has a build-time (and possibly run-time) dependency on Lua, PEXL (https://gitlab.com/pexlang/libpexl) is the author's new implementation and is written in very portable C99. It also has several new features and implementation techniques over Rosie. I'm still getting familiar with the project, so I can't speak to any standout features yet, but on its face it seems like a potential substrate we could build lisp-level composable pattern abstractions on top of.

Rosie/PEXL's goals are explicitly focused more on maintainability than sheer performance, so I'm thinking it might make sense to introduce Rosie as a separate interface alongside the regex engine, while keeping the regex engine narrowly focused on patterns that we can more easily optimize. For example, I was glad to hear in my previous communications with emacs-devel that there was some receptiveness to deprecating features like runtime lookup of mode-specific word boundaries from the regex engine if it would ease optimization (I'm not sure yet whether that's necessary). One way we could avoid taking away more complex functionality that extension devs depend on, such as backrefs, is to direct them to a lisp interface wrapping Rosie, which supports backrefs. It actually supports a strictly more powerful formalization of backrefs than regex engines do; see the author's post on it at https://jamiejennings.com/posts/2023-10-01-dont-look-back-3/.
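
For readers less familiar with backrefs, the following minimal sketch shows the conventional regex form of the feature: a later part of the pattern must match exactly the text captured by an earlier group. It uses Python's re module for illustration only, not Rosie/PEXL or regex-emacs.c, and the names SAME_QUOTE and quoted are hypothetical.

```python
import re

# Group 1 captures the opening quote character; \1 then requires the
# closing quote to be the same character that group 1 matched.
SAME_QUOTE = re.compile(r"""(['"])(.*?)\1""")


def quoted(s: str):
    """Return the contents of the first consistently quoted span, else None."""
    m = SAME_QUOTE.search(s)
    return m.group(2) if m else None
```

Because the match depends on text captured earlier in the same match, this kind of pattern is harder for an engine to optimize than one without backrefs.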

Like I said, I'm still becoming familiar with Rosie/PEXL, so I don't quite have enough info yet to make a more thorough proposal. But I'd love to know if others are familiar with this project and whether it might correspond to the use cases for lisp-level pattern composition brought up in response to my previous communications about improving the regex engine.

Thanks,
Danny



