From: Adam Porter
Newsgroups: gmane.emacs.devel
Subject: Re: Make peg.el a built-in library?
Date: Sat, 2 Oct 2021 02:32:22 -0500
To: Stefan Monnier
Cc: emacs-devel

On Fri, Oct 1, 2021 at 1:05 PM Stefan Monnier wrote:
>
> > In org-ql, the PEX is redefined at load time and/or run time, being
> > derived from search keywords that are defined by the package and
> > possibly by the user. So the PEX can't be defined in advance, at
> > compile time.
> > So having to use `with-peg-rules' means having to use
> > `eval'.
>
> If the grammar changes radically at run time, based on external/user
> data there's probably no better way than via `eval` or similar (`load`,
> `byte-compile`, you name it).
>
> But if the changes are sufficiently limited (e.g. have an (or "foo"
> "bar" ....) with a variable set of strings that can match), then we can
> do better.

In org-ql's case, it's the latter: the grammar doesn't fundamentally
change, only the list of strings that can be matched in a certain
expression:

https://github.com/alphapapa/org-ql/blob/31aeb0a2505acf8044c07824888ddec7f3e529c1/org-ql.el#L869

> E.g. we could have a PEX of the form (re FORM) where FORM can be any
> ELisp expression that returns a regular expression. Then `org-ql.el`
> could do
>
>     (let ((predicate-re (regexp-opt predicate-names)))
>       (peg-parse
>        ((query (+ term
>                   (opt (+ (syntax-class whitespace) (any)))))
>         [...]
>         (predicate (re predicate-re))
>         [...])))

That would be helpful, yes.

> PS: BTW, regarding your comment:
>
>     ;; Sort the keywords longest-first to work around what seems to be an
>     ;; obscure bug in `peg': when one keyword is a substring of another,
>     ;; and the shorter one is listed first, the shorter one fails to match.
>
> The behavior you describe indeed seems like a bug, but maybe what you
> see is slightly different (and not a bug): if you have a PEX like
>     (and (or "foo" "foobar") "X")
> the "foo" will match when faced with "foobarX" and the parser won't
> backtrack to try and match the "foobar" when the "X" fails to match.

Hmm, thanks. I think an example of the problem is that a predicate in
org-ql might have a shorter alias, e.g. "heading" has the alias "h",
and predicates are followed by arguments, like "heading:foo", so IIRC,
without sorting them there, "heading:foo" would work, while "h:foo"
wouldn't.
(Or maybe a better example is predicates that optionally accept
keyword-style arguments, like "ts-active:from=2021-10-01", which has the
alias "ts-a", and could also be called without arguments, like "ts-a:".)

> It's one of those differences between BNF and PEG grammars.
> So indeed you do want to sort from longest to shortest to avoid
> this problem.

Thanks, I didn't realize that.
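
The ordered-choice behavior discussed above can be sketched in plain
Emacs Lisp. This is only an illustration under stated assumptions: it
does not use peg.el, it works on strings rather than buffer text, and
the `demo-` functions are hypothetical names that exist in neither
peg.el nor org-ql. It models a PEX of the form
(and (or ALTERNATIVES...) SUFFIX), where the first alternative that
matches is kept and later ones are never retried.

```elisp
;; Illustration only: the `demo-' helpers below are hypothetical and
;; are not part of peg.el or org-ql.
(require 'seq)

(defun demo-ordered-choice (alternatives input)
  "Return the first string in ALTERNATIVES that is a prefix of INPUT, or nil.
Alternatives are tried in the order given, as in a PEG ordered choice."
  (seq-find (lambda (alt) (string-prefix-p alt input)) alternatives))

(defun demo-choice-then (alternatives suffix input)
  "Model the PEX (and (or ALTERNATIVES...) SUFFIX) against INPUT.
Commit to the first matching alternative and never retry a later one.
Return the matched alternative, or nil if SUFFIX does not follow it."
  (let ((alt (demo-ordered-choice alternatives input)))
    (and alt
         (string-prefix-p suffix (substring input (length alt)))
         alt)))

(defun demo-longest-first (strings)
  "Return a copy of STRINGS sorted from longest to shortest."
  (sort (copy-sequence strings)
        (lambda (a b) (> (length a) (length b)))))

;; Shorter alternative first: "foo" matches, "X" then fails against
;; the remaining "barX", and "foobar" is never tried.
(demo-choice-then '("foo" "foobar") "X" "foobarX")
;; => nil

;; Longest first: "foobar" is tried first and "X" follows it.
(demo-choice-then (demo-longest-first '("foo" "foobar")) "X" "foobarX")
;; => "foobar"

;; The shorter string still matches when it is what the input contains.
(demo-choice-then (demo-longest-first '("ts-a" "ts-active")) ":" "ts-a:")
;; => "ts-a"
```

Sorting longest-first is safe here because a longer alternative that
does not match simply fails, and the choice falls through to the
shorter one. The reverse does not hold: once a shorter prefix has
matched, the longer alternative is never considered.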