From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Pattern matching on match-string groups #elisp #question
Date: Thu, 25 Feb 2021 23:31:14 -0500
References: <87v9agxkld.fsf@tcd.ie> <80CE2366-76F4-4548-B956-F16DFCE23E4C@acm.org>
In-Reply-To: <80CE2366-76F4-4548-B956-F16DFCE23E4C@acm.org> (Mattias Engdegård's message of "Thu, 25 Feb 2021 19:28:16 +0100")
To: Mattias Engdegård
Cc: "Basil L. Contovounesios", Ag Ibragimov, emacs-devel@gnu.org

>> I'd say it's a bug.  The patch below would fix it.  Mattias, WDYT?

> Thank you, it looks like multiple bugs. The lack of (pred stringp) was of
> course an oversight but unrelated to pcase-let, right?

Yes, it's unrelated.  I just noticed it along the way.

>>      (let* ((rx--pcase-vars nil)
>>             (regexp (rx--to-expr (rx--pcase-transform (cons 'seq regexps)))))
>> -      `(and (pred (string-match ,regexp))
>> +      `(and (pred stringp)
>> +            (app (lambda (s) (string-match ,regexp s)) (pred identity))

> It does seem to work, but why exactly do we need this monstrosity instead of
> (pred (string-match ,regexp))?

Good question.  I'm not sure how best to explain or document it, sadly.
One of the reasons is that predicates are presumed to be (mostly) pure
functions, so `pcase` feels free to call them fewer times or more times
as it pleases.  But that also largely applies to `app`, so that's not
a very good explanation.

Maybe a better explanation is that `pcase-let` optimizes the pattern
match code under the assumption that the pattern will match, so it skips
the tests that determine whether the pattern matches or not.

[ That doesn't mean it skips all the tests: if the pattern is
  (or `(a ,v) `(b ,_ ,v)) it *will* test to see if the first element is
  `a` in order to decide what to bind `v` to, but it won't bother to
  check if the first element is `b` since it presumes that the pattern
  does match and it knows that there's no further alternative.  ]

Note that this explanation is not very convincing either, because it's
not clear whether the test that it skipped is `(identity VAR)` or
`(identity (string-match ...))`, so it's unclear whether the
`string-match` is eliminated.

> Is it because pcase-let throws away all `pred` clauses somewhere?
> It makes sense to do so but I haven't found exactly where this takes
> place in pcase.el yet...

The magic is in `pcase--if`, i.e. at a lower level than `pred`.
It's linked to the special undocumented pcase pattern `pcase--dontcare`
(whose name is not well chosen, suggestions for better names are
welcome) which is a pattern that not only can never match but also
prevents pcase from attempting to match any further patterns (IOW it
forces pcase to just go with the last branch that it failed to match).

You might also want to look at `byte-optimize--pcase` for another
example where I use this pattern.

> Perhaps the assumption that non-binding clauses like `pred` (and what else,
> `guard`?) are all side-effect free and can be thrown away in pcase-let[*]
> should be documented?

Agreed.

> Not that I would have read it, of course...

At least I'd get to point you to the doc and shame you publicly.

> I'll push a fix as soon as I understand the machinery a bit better, but
> right now I'm wary of getting my fingers snapped off by the gears
> and knives.

Come on, that's why we start with ten of those damn fingers.
You surely can still afford to risk one or two.


        Stefan
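

[ For illustration only, here is a minimal sketch of the behavior
  described above; it is not part of the patch.  The `my-` function
  names are hypothetical, and the last function assumes an Emacs whose
  rx.el supports the `rx` pcase pattern with `(let VAR RX...)`.  The
  first two functions use the same `or` pattern: `pcase` checks every
  branch, whereas `pcase-let` only tests what it needs in order to
  choose between the alternatives.  ]

```elisp
;;; -*- lexical-binding: t -*-
(require 'pcase)
(require 'rx)

;; Hypothetical helper: `pcase' tests each branch in turn, so a value
;; that matches neither alternative falls through to the `_' branch.
(defun my-classify (x)
  (pcase x
    ((or `(a ,v) `(b ,_ ,v)) v)
    (_ 'no-match)))

;; Hypothetical helper: `pcase-let' presumes the pattern matches.  It
;; still looks for `a' to decide how to bind `v', but it is free to
;; skip the check for `b'.
(defun my-destructure (x)
  (pcase-let (((or `(a ,v) `(b ,_ ,v)) x))
    v))

;; Hypothetical helper: the `rx' pattern under discussion, used with
;; plain `pcase', which keeps the `string-match' test.
(defun my-split-name-num (s)
  (pcase s
    ((rx (let name (+ alpha)) "-" (let num (+ digit)))
     (list name num))))

(my-classify '(a 1))         ; => 1
(my-classify '(b 2 3))       ; => 3
(my-classify '(c 4))         ; => no-match
(my-destructure '(a 1))      ; => 1
(my-destructure '(b 2 3))    ; => 3
(my-split-name-num "foo-42") ; => ("foo" "42")
```

[ Calling `my-destructure` on a value that matches neither alternative,
  such as '(c 4), is outside what `pcase-let` promises, so no result is
  shown for it.  ]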