From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Regexp bytecode disassembler
Date: Sat, 21 Mar 2020 21:19:16 +0200
Message-ID: <834kuhecsr.fsf@gnu.org>
References: <4201DF24-BCC4-4C08-9857-38207B7C10B4@acm.org> <83mu8bdriv.fsf@gnu.org> <68FB4EC3-3C67-4D07-8473-5FC671024515@acm.org>
In-Reply-To: <68FB4EC3-3C67-4D07-8473-5FC671024515@acm.org> (message from Mattias Engdegård on Sat, 21 Mar 2020 17:52:51 +0100)
To: Mattias Engdegård
Cc: emacs-devel@gnu.org

> From: Mattias Engdegård
> Date: Sat, 21 Mar 2020 17:52:51 +0100
> Cc: emacs-devel@gnu.org
>
> > First, please document this in NEWS and in the ELisp manual.  IMNSHO,
> > this feature will be much less useful without documentation.
>
> Sorry, I should have been clear on the point that this is primarily a debug and maintenance aid for the regexp-engine developer and not intended as a user-facing feature. Nobody is barred from using it, but they are expected to read the circuit schematics that come with Emacs (i.e., the source code).
>
> In particular, there is no user interface to the regexp bytecode at all; users can't write programs in it and have Emacs run them. It is also not stable in the slightest. Documenting the inner workings of the regexp engine would only put a burden on its maintainers.

I didn't mean the user manual, I meant the ELisp manual.  I don't agree
that this command should remain undocumented, and I don't understand
your opposition to making this more visible and more easily used.
Having users read the C code is quite an obstacle to some.

> >> +;;;###autoload
> >> +(defun regexp-disasm (regexp)
> >
> > Why do we need to auto-load this?
>
> Actually, a function that returns the bytecode in symbolic form turned out to be useful in its own right, and I found it handy for some programmatic uses like comparing the bytecodes of two regexps.

I don't think this answers the question.  Not every useful function is
auto-loaded, is it?
Why is it a problem to have to require this package?

> >> +        (read-u16 (lambda (ofs) (+ (aref bc ofs)
> >> +                                   (ash (aref bc (1+ ofs)) 8))))
> >
> > Why lambda-forms and not functions (or defsubst)?
>
> Because they need to close over variables in scope.

So you are "saving" one more argument?

> With lexical binding, elisp almost feels like a real programming language!

Maybe so, but this style makes the code harder to read and modify,
IMO.

> >> +          (pcase opcode
> >> +            (0 (cons 'no-op 1))
> >> +            (1 (cons 'succeed 1))
> >
> > Is pcase really needed here?  It looks like a simple cond will do.
>
> Well, pcase is a lot more readable here, don't you think?

No, I don't, not in this case.  You are just selecting from a list of
fixed values.

> >> +  (interactive "XRegexp (evaluated): ")
> >
> > This prompt should do a better job describing what kind of input is
> > expected here.
>
> I'm not sure what else to say in the prompt. I found it more useful to input the regexp as a lisp expression than a string (for cut-and-paste from source code, or for rx) but maybe that's just me.

I envision many people will think a string is expected, thus my
comment.

> + Any changes here should be reflected in regexp-disasm.el as well.  */

I think the same comment should be near the definition of re_opcode_t.

Thanks.
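
For concreteness, here is a minimal sketch of the style suggested above for the two quoted fragments: a defsubst that takes the bytecode vector as an explicit argument instead of closing over it, and a plain cond over the fixed opcode values.  This is only an illustration, not code from the patch; the `my-regexp-disasm--' names are hypothetical, and only the two opcodes quoted above are covered.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper (not in the patch): same arithmetic as the
;; quoted `read-u16' lambda, but BC is passed explicitly.
(defsubst my-regexp-disasm--read-u16 (bc ofs)
  "Return the unsigned 16-bit little-endian value at offset OFS in BC."
  (+ (aref bc ofs)
     (ash (aref bc (1+ ofs)) 8)))

;; Hypothetical helper (not in the patch): the two quoted `pcase'
;; branches written as a `cond' over fixed opcode values.
(defun my-regexp-disasm--decode-simple (opcode)
  "Return (NAME . LENGTH) for OPCODE, or nil if OPCODE is not handled here."
  (cond
   ((eq opcode 0) (cons 'no-op 1))
   ((eq opcode 1) (cons 'succeed 1))))
```

With these definitions, (my-regexp-disasm--read-u16 (unibyte-string #x34 #x12) 0) evaluates to #x1234, and (my-regexp-disasm--decode-simple 1) evaluates to (succeed . 1).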