Re: Regexp bytecode disassembler

all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed

From: Eli Zaretskii <eliz@gnu.org>
To: "Mattias Engdegård" <mattiase@acm.org>
Cc: emacs-devel@gnu.org
Subject: Re: Regexp bytecode disassembler
Date: Sat, 21 Mar 2020 21:19:16 +0200	[thread overview]
Message-ID: <834kuhecsr.fsf@gnu.org> (raw)
In-Reply-To: <68FB4EC3-3C67-4D07-8473-5FC671024515@acm.org> (message from Mattias Engdegård on Sat, 21 Mar 2020 17:52:51 +0100)

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Sat, 21 Mar 2020 17:52:51 +0100
> Cc: emacs-devel@gnu.org
> 
> > First, please document this in NEWS and in the ELisp manual.  IMNSHO,
> > this feature will be much less useful without documentation.
> 
> Sorry, I should have been clear on the point that this is primarily a debug and maintenance aid for the regexp-engine developer and not intended as a user-facing feature. Nobody is barred from using it, but they are expected to read the circuit schematics that comes with Emacs (ie, the source code).
> 
> In particular, there is no user interface to the regexp bytecode at all; users can't write program in it and have Emacs run them. It is also not stable in the slightest. Documenting the inner workings of the regexp engine would only put a burden on its maintainers.

I didn't mean the user manual, I meant the ELisp manual.  I don't
agree that this command should remain undocumented, and I don't
understand your opposition to making this more visible and more easily
used.  Having users read the C code is quite an obstacle to some.

> >> +;;;###autoload
> >> +(defun regexp-disasm (regexp)
> > 
> > Why do we need to auto-load this?
> 
> Actually, a function that returns the bytecode in symbolic form turned out to be useful in its own right, and I found it handy for some programmatic uses like comparing the bytecodes of two regexps.

I don't think this answers the question.  Not every useful function is
auto-loaded, is it?  Why is it a problem to have to require this
package?

> >> +         (read-u16 (lambda (ofs) (+ (aref bc ofs)
> >> +                                    (ash (aref bc (1+ ofs)) 8))))
> > 
> > Why lambda-forms and not functions (or desfsubst)?
> 
> Because they need to close over variables in scope.

So you are "saving" one more argument?

> With lexical binding, elisp almost feels like a real programming language!

Maybe so, but this style makes the code harder to read and modify,
IMO.

> >> +               (pcase opcode
> >> +                 (0 (cons 'no-op 1))
> >> +                 (1 (cons 'succeed 1))
> > 
> > Is pcase really needed here?  It looks like a simple cond will do.
> 
> Well, pcase is a lot more readable here, don't you think?

No, I don't, not in this case.  You are just selecting from a list of
fixed values.

> >> +  (interactive "XRegexp (evaluated): ")
> > 
> > This prompt should do a better job describing what kind of input is
> > expected here.
> 
> I'm not sure what else to say in the prompt. I found it more useful to input the regexp as a lisp expression than a string (for cut-and-paste from source code, or for rx) but maybe that's just me.

I envision many people will think a string is expected, thus my
comment.

> +   Any changes here should be reflected in regexp-disasm.el as well.  */

I think the same comment should be near the definition of re_opcode_t.

Thanks.

next prev parent reply	other threads:[~2020-03-21 19:19 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-20 12:27 Regexp bytecode disassembler Mattias Engdegård
2020-03-20 12:58 ` Andreas Schwab
2020-03-20 14:34 ` Eli Zaretskii
2020-03-21 16:52   ` Mattias Engdegård
2020-03-21 19:19     ` Eli Zaretskii [this message]
2020-03-21 20:16       ` Štěpán Němec
2020-03-21 20:30         ` Eli Zaretskii
2020-03-21 20:40           ` Mattias Engdegård
2020-03-21 20:44           ` Štěpán Němec
2020-03-22 14:12             ` Eli Zaretskii
2020-03-22 14:43               ` Štěpán Němec
2020-03-22 16:55                 ` Eli Zaretskii
2020-03-22 17:16                   ` Štěpán Němec
2020-03-22 17:30                     ` Eli Zaretskii
2020-03-22 18:34                       ` Paul Eggert
2020-03-22 18:36                       ` Dmitry Gutov
2020-03-21 20:50           ` Dmitry Gutov
2020-03-21 23:58           ` Drew Adams
2020-03-22  0:02             ` Drew Adams
2020-03-21 20:37       ` Mattias Engdegård
2020-03-22  3:28         ` Eli Zaretskii
2020-03-22  9:23           ` Mattias Engdegård
2020-03-22 10:37             ` Eli Zaretskii
2020-03-22 15:24               ` Mattias Engdegård
2020-03-22 17:06                 ` Eli Zaretskii
2020-03-22 19:39                   ` Mattias Engdegård
2020-03-22 20:12                     ` Eli Zaretskii
2020-03-22 20:22                     ` Corwin Brust
2020-03-20 15:39 ` Pip Cet
2020-03-21 16:56   ` Mattias Engdegård

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=834kuhecsr.fsf@gnu.org \
    --to=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    --cc=mattiase@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.