From: Mattias Engdegård
Newsgroups: gmane.emacs.devel
Subject: New rx implementation with extension constructs
Date: Mon, 2 Sep 2019 23:19:47 +0200
To: emacs-devel

The rx regexp notation is nice to use but the implementation isn't
wonderful; there is a proposed replacement rewritten from the ground up.
It is cleaner, has fewer bugs, and is maybe twice as fast.

Most importantly, there is now a proper extension mechanism: for global
definitions,

  (rx-define snobol-identifier (seq alpha (0+ alnum)))

which are available anywhere, and local ones,

  (rx-let ((natnum (1+ digit))
           (integer (seq (opt "-") natnum)))
    ...body...)

where a set of definitions are only available in a lexical scope. This
zero-cost construct can be placed inside a function, or at top-level
enclosing multiple variable and function definitions, all sharing the
same named rx forms.

Both rx-define and rx-let admit two kinds of definitions:

  NAME RX-FORM
  NAME (ARGS...) RX-FORM

for plain rx symbols and for parametrised forms, respectively. For
example:

  (rx-let ((name (1+ letter))
           (comma-separated (x) (seq x (0+ "," x))))
    (rx (comma-separated name)))

works just as expected. &rest arguments are permitted, and expand to
implicit (seq ...) forms.

No provision was made for macros able to execute arbitrary Lisp code; I
just couldn't find a use for them, and decided to wait until someone
would tell me otherwise. Thus, all parametrised forms work by plain
substitution.

The code currently resides at https://gitlab.com/mattiase/ry; it will
naturally be renamed to `rx' once it's in the Emacs tree. It can be
integrated in a separate branch of the Emacs source repo if you wish, or
as patches if you prefer that for reviewing. The diffs don't make much
sense since it is a reimplementation with very little in common with the
old code.

The exact form of the extension mechanism isn't set in stone, and I'd
welcome any suggestions for improvement.
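
To make the two kinds of definitions concrete, here is a minimal sketch that combines a global rx-define with a local, parametrised rx-let using a &rest argument. It assumes an Emacs whose rx provides both constructs under the names used above (Emacs 27 or later); the names my-ident, csv-of and my-csv-regexp are made up for illustration and are not part of the proposal.

```elisp
;; Sketch only: assumes `rx-define' and `rx-let' are available
;; (Emacs 27 or later).  `my-ident', `csv-of' and `my-csv-regexp'
;; are hypothetical names chosen for this example.
(require 'rx)

;; Global definition: a plain rx symbol, usable in any later rx form.
(rx-define my-ident (seq alpha (0+ alnum)))

(defun my-csv-regexp ()
  "Return a regexp matching a whole string of comma-separated identifiers."
  ;; Local, parametrised definition.  The &rest parameter ITEM collects
  ;; all arguments passed to `csv-of'; with a single argument, as here,
  ;; ITEM stands for exactly that one rx form.
  (rx-let ((csv-of (&rest item) (seq item (0+ "," item))))
    (rx bos (csv-of my-ident) eos)))

;; Usage: non-nil when the whole string matches, nil otherwise.
(string-match-p (my-csv-regexp) "foo,bar2,baz")
(string-match-p (my-csv-regexp) "foo,,baz")
```

Since parametrised forms work by plain substitution, (csv-of my-ident) is rewritten to (seq my-ident (0+ "," my-ident)) before translation, so the local definition adds no run-time cost to the resulting regexp.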