From: Clément Pit-Claudel
Newsgroups: gmane.emacs.devel
Subject: Re: prior work on non-backtracking regex engine?
Date: Wed, 17 Apr 2024 16:23:41 +0200
Message-ID: <1a8ae194-92d7-407e-b9c0-200279c30bff@gmail.com>
In-Reply-To: <3a9IKoS2YLqJYosdfpFVdq8ashG0LPPJdB-ugdUgJEqM6-O3RWFeCu01FUPYBsp87xchkX-z1PRlNqJQm8ge_h3v0ziCWcME2fx-6PW-UP4=@hypnicjerk.ai>
To: emacs-devel@gnu.org

On 3/10/24 16:41, Danny McClanahan wrote:
> (3) Have there been prior investigations of non-backtracking regex
> engines in emacs, or trying to use an external regex engine in
> general? What was the outcome, and does it seem like a useful
> research direction?

Here is a relevant thread:
https://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00622.html

The gap is indeed a significant issue.

If you wanted linear-time support, my guess is your best bets today
would be RE2, rust-regex, or the linear engine in Chromium/V8. All of
them would require sizable changes, and none would replace the existing
engine because they don't support backreferences. The one in Chromium
is fairly small and simple.

Clément.