From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Monnier
Newsgroups: gmane.emacs.devel
To: emacs-devel@gnu.org
Subject: Re: Emacs Lisp JIT Compiler
Date: Sun, 19 Aug 2018 16:23:11 -0400
References: <87va8ej4o1.fsf@tromey.com> <87mutpiyz6.fsf@tromey.com>
 <701cd05f423e0c46595a3010f45414d0.squirrel@dancol.org>
 <520f536b5a603831c9a57a5f6f0978a2.squirrel@dancol.org>
 <83va8binu8.fsf@gnu.org> <87bma3i26m.fsf@tromey.com>
 <83in4aigs7.fsf@gnu.org> <87pnyeql0i.fsf@tromey.com>

> Daniel> Likewise, it'd be fantastic to compile regular expressions to DFAs and
> Daniel> then generate machine code for the DFAs. You can't go faster than that.
> I've been meaning to experiment with this using Stefan's lex.el.
> It seems to me that the bytecode compiler could open-code some common
> things like (looking-at "some constant").

lex.el's matcher (i.e. an Elisp loop interpreting the DFA data-structure)
is already pretty fast (i.e. competitive with the C-based regex.c code)
in my experience.  So I think that a C implementation of lex.el's
matcher would already be so fast that generating machine code for it
would only make sense in extremely rare circumstances.

> One "simple" way to improve regexp matching right now would be to remove
> the self-modifying code and change the implementation to use token
> threading, like we did for the bytecode interpreter.

Indeed, our regex.c code has a lot of room for improvement.
Given its nasty worst-case behavior, I've been reluctant to invest any
more time into it.

> I think removing this self-modifying stuff is also useful if we ever
> want to introduce first-class regexp objects.

It's also needed to make it re-entrant.


        Stefan
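
For readers unfamiliar with the idea, here is a minimal sketch of what "a
loop interpreting the DFA data-structure" could look like in C.  Everything
in it is an assumption made for illustration: the regexp (ab*c), the table
layout, and the names dfa_next, dfa_accepting and dfa_looking_at are
hypothetical and are not taken from lex.el or regex.c.

```c
#include <stddef.h>

/* Hypothetical DFA for the regexp "ab*c".  State 0 is reserved as the
   dead state so that zero-initialized table entries mean "no match".  */
enum { S_START = 1, S_SEEN_A = 2, S_ACCEPT = 3, N_STATES = 4 };

/* Transition table: dfa_next[state][byte] is the next state.  */
static const unsigned char dfa_next[N_STATES][256] = {
  [S_START]  = { ['a'] = S_SEEN_A },
  [S_SEEN_A] = { ['b'] = S_SEEN_A, ['c'] = S_ACCEPT },
  /* S_ACCEPT has no outgoing transitions.  */
};

/* Nonzero for accepting states.  */
static const unsigned char dfa_accepting[N_STATES] = { [S_ACCEPT] = 1 };

/* Return the length of the longest prefix of S (LEN bytes) that the DFA
   accepts, or -1 if no prefix matches.  */
long
dfa_looking_at (const char *s, size_t len)
{
  unsigned state = S_START;
  long matched = dfa_accepting[state] ? 0 : -1;

  for (size_t i = 0; i < len; i++)
    {
      state = dfa_next[state][(unsigned char) s[i]];
      if (state == 0)           /* Dead state: no longer match possible.  */
        break;
      if (dfa_accepting[state])
        matched = (long) (i + 1);
    }
  return matched;
}
```

The inner loop does one table load and two tests per input byte, and it
keeps no state outside its local variables, so a matcher of this shape is
re-entrant by construction.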