From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Make peg.el a built-in library?
Date: Fri, 01 Oct 2021 14:05:20 -0400
References: <875yvtbbn3.fsf@ericabrahamsen.net> <83wno8u3uz.fsf@gnu.org> <87v93s9q4n.fsf@ericabrahamsen.net> <875yvafjr9.fsf@ericabrahamsen.net> <878rzdreem.fsf@alphapapa.net>
To: Adam Porter
Cc: emacs-devel@gnu.org
In-Reply-To: <878rzdreem.fsf@alphapapa.net> (Adam Porter's message of "Thu, 30 Sep 2021 15:34:25 -0500")

> In org-ql, the PEX is redefined at load time and/or run time, being
> derived from search keywords that are defined by the package and
> possibly by the user.  So the PEX can't be defined in advance, at
> compile time.  So having to use `with-peg-rules' means having to use
> `eval'.

If the grammar changes radically at run time, based on external/user
data there's probably no better way than via `eval` or similar (`load`,
`byte-compile`, you name it).

But if the changes are sufficiently limited (e.g. have an (or "foo"
"bar" ....) with a variable set of strings that can match), then we can
do better.  E.g. we could have a PEX of the form (re FORM) where FORM
can be any ELisp expression that returns a regular expression.
Then `org-ql.el` could do

    (let ((predicate-re (regexp-opt predicate-names)))
      (peg-parse
       ((query (+ term (opt (+ (syntax-class whitespace) (any)))))
        [...]
        (predicate (re predicate-re))
        [...])))

-- Stefan


PS: BTW, regarding your comment:

    ;; Sort the keywords longest-first to work around what seems to be an
    ;; obscure bug in `peg': when one keyword is a substring of another,
    ;; and the shorter one is listed first, the shorter one fails to match.

The behavior you describe indeed seems like a bug, but maybe what you
see is slightly different (and not a bug): if you have a PEX like

    (and (or "foo" "foobar") "X")

the "foo" will match when faced with "foobarX" and the parser won't
backtrack to try and match the "foobar" when the "X" fails to match.
It's one of those differences between BNF and PEG grammars.
So indeed you do want to sort from longest to shortest to avoid
this problem.
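
To make the ordered-choice point in the PS concrete, here is a small
sketch that should be runnable in a scratch buffer.  It assumes peg.el
1.0 or later (with the `peg` macro and `peg-run`) is installed;
`my/peg-matches-p` is a hypothetical helper introduced only for this
illustration and is not part of peg.el or org-ql.

```elisp
;; Sketch only: assumes peg.el (GNU ELPA, 1.0 or later) is installed.
(require 'peg)

(defun my/peg-matches-p (text matcher)
  "Return non-nil if MATCHER matches at the start of TEXT.
MATCHER is a PEG matcher built with the `peg' macro.
This is a hypothetical helper for illustration, not part of peg.el."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (peg-run matcher)))

;; Shorter alternative first: "foo" wins the `or', "X" then fails
;; against "barX", and the `or' is not retried with "foobar".
;; The whole match fails (nil).
(my/peg-matches-p "foobarX" (peg (and (or "foo" "foobar") "X")))

;; Longest alternative first: "foobar" is tried first, then "X"
;; matches.  The whole match succeeds (non-nil).
(my/peg-matches-p "foobarX" (peg (and (or "foobar" "foo") "X")))

;; One way to put a keyword list in longest-first order before
;; building the grammar from it:
(sort (list "foo" "foobar" "fo")
      (lambda (a b) (> (length a) (length b))))
```

The first call fails for exactly the reason described above: once the
`or` has committed to "foo", a later failure of "X" does not reopen
that choice.  Sorting longest-first means the longer keyword gets its
chance before any of its prefixes.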