From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Wiegley
Newsgroups: gmane.emacs.devel
Subject: Re: The poor state of documentation of pcase like things.
Date: Thu, 17 Dec 2015 16:42:13 -0800
References: <20151216202605.GA3752@acm.fritz.box>
In-Reply-To: (Kaushal Modi's message of "Wed, 16 Dec 2015 15:53:38 -0500")
To: Kaushal Modi
Cc: Alan Mackenzie, Emacs developers
User-Agent: Gnus/5.130014 (Ma Gnus v0.14) Emacs/24.5 (darwin)
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature"

--=-=-=
Content-Type: text/plain

>>>>> Kaushal Modi writes:

> I would welcome a short tutorial on how (and why) to use pcase.

The following is a brief pcase tutorial.
I welcome any edits and comments. Also, I wonder if anyone would be willing
to hammer this into a form better suited to the Emacs Lisp manual. I'm not
familiar enough with the "language" of that document at the moment to
emulate it, though I could do some reading next week if no one else is
interested in word-smithing.

John

Pattern Matching with pcase

All data fits into some kind of pattern. The most explicit pattern is a
description of the data itself. Let's consider the following value as a
running example:

    '(1 2 (4 . 5) "Hello")

# Exact matches

Explicitly stated, this is a list of four elements, where the first two
elements are the integers 1 and 2, the third is a cons consisting of a car
of 4 and a cdr of 5, and the fourth is the string "Hello". This states an
explicit pattern that we can match against using an equality test:

    (equal value '(1 2 (4 . 5) "Hello"))

# Pattern matches

Where patterns become useful is when we want to generalize a bit. Let's say
we want to do a similar equality test, but we don't care what the final
string's contents are, only that it's a string. Even though it's simply
stated, this becomes quite awkward using an equality test:

    (require 'cl-lib)  ; for `cl-subseq'

    (and (equal (cl-subseq value 0 3) '(1 2 (4 . 5)))
         (stringp (nth 3 value)))

What we would prefer is a more direct language for encoding our description
of the *family of values we'd like to match against*. The way we said it in
English was: the first three elements exactly so, and the last element, any
string. This is how we'd phrase that using `pcase':

    (pcase value
      (`(1 2 (4 . 5) ,(pred stringp))
       (message "It matched!")))

Think of `pcase' as a form of `cond', where instead of evaluating each test
for non-nil, it compares a series of *patterns* against the value under
consideration (often called the "scrutinee" in the literature). There can be
many patterns, and the first one that matches wins, as with `cond'.
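To make the `cond' analogy concrete, here is a minimal sketch with several
clauses tried in order. The function name `my-describe-value' is made up for
this illustration and is not part of `pcase' or Emacs:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-describe-value (value)
  "Return a string saying which pattern VALUE matched."
  (pcase value
    ;; Every element is literal, so this behaves like an `equal' test.
    (`(1 2 (4 . 5) "Hello") "exactly the running example")
    ;; Same shape, but the fourth element may be any string.
    (`(1 2 (4 . 5) ,(pred stringp)) "first three fixed, then any string")
    ;; Catch-all, playing the role of a final `t' clause in `cond'.
    (_ "no match")))

(my-describe-value '(1 2 (4 . 5) "Hello"))    ; first clause wins
(my-describe-value '(1 2 (4 . 5) "Goodbye"))  ; falls through to the second
(my-describe-value '(1 2 3))                  ; only the catch-all matches
```

The first value would also satisfy the second pattern, but the first clause
is tried first and therefore wins.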
# Capturing matches

But `pcase' can go one step further: not only can we compare a candidate
value against a family of possible values described by their pattern, we can
also "capture" sub-values from that pattern for later use. Continuing from
the last example, let's say we want to print the string that matched, even
though we didn't care about the contents of the string for the sake of the
match:

    (pcase value
      (`(1 2 (4 . 5) ,(and (pred stringp) foo))
       (message "It matched, and the string was %s" foo)))

Whenever a naked symbol like `foo' occurs as a UPattern (see next section),
the part of the value being matched at that position is bound to a local
variable of the same name.

# QPatterns and UPatterns

To master `pcase', there are two types of patterns you must know: UPatterns
and QPatterns. UPatterns are the "logical" aspect of pattern matching, where
we describe the kind of data we'd like to match against, and other special
actions to take when it matches; and QPatterns are the "literal" aspect,
stating the exact form of a particular match.

QPatterns are by far the easiest to think about. To match against any atom,
string, or list of the same, the corresponding QPattern is that exact value.
So the QPattern "foo" matches the string "foo", 1 matches the atom 1, etc.

`pcase' matches against a list of UPatterns, so to use a QPattern, we must
backquote it:

    (pcase value
      (`1 (message "Matched a 1"))
      (`2 (message "Matched a 2"))
      (`"Hello" (message "Matched the string Hello")))

The only special QPattern is the anti-quoting pattern, `,foo`, which allows
you to use UPatterns within QPatterns! The analogy to macro expansion is
direct, so you can think of them similarly. For example:

    (pcase value
      (`(1 2 ,(or `3 `4))
       (message "Matched either the list (1 2 3) or (1 2 4)")))

# More on UPatterns

There are many special UPatterns, and their variety makes this the hardest
aspect to master. Let's consider them one by one.
## Underscore `_'

To match against anything whatsoever, no matter its type or value, use
underscore. Since `_' is a UPattern, it must be unquoted with a comma when
it appears inside a backquoted QPattern. Thus to match against a list
containing anything at all at its head, we'd use:

    (pcase value
      (`(,_ 1 2)
       (message "Matched a list of anything followed by 1 and 2")))

## Self-quoting

If an atom is self-quoting, we don't need to use backquotes to match against
it. This means that the QPattern `1 is identical to the UPattern 1:

    (pcase value
      (1 (message "Matched a 1"))
      (2 (message "Matched a 2"))
      ("Hello" (message "Matched the string Hello")))

## Symbol

When performing a match, if a symbol occurs within a UPattern, it binds
whatever was found at that position to a local symbol of the same name. Some
examples will help to make this clearer:

    (pcase value
      (`(1 2 ,foo 3)
       (message "Matched 1, 2, something now bound to foo, and 3"))
      (foo
       (message "Match anything at all, and bind it to foo!"))
      (`(,the-car . ,the-cdr)
       (message "Match any cons cell, binding the car and cdr locally")))

(The clauses are grouped here only for illustration: because the bare `foo'
clause matches anything, the clause after it can never be reached.)

The reason for doing this is two-fold: either to refer to a previous match
later in the pattern (where it is compared using `eq'), or to make use of a
matched value within the related code block:

    (pcase value
      (`(1 2 ,foo ,foo 3)
       (message "Matched (1 2 %s %s 3)" foo foo)))

## `(or UPAT ...)` and `(and UPAT ...)`

We can express boolean logic within a pattern match using the `or` and `and`
patterns:

    (pcase value
      (`(1 2 ,(or 3 4)
         ;; (pred (string< "aaa")) calls (string< "aaa" X), where X is the
         ;; element being matched, so X must sort after "aaa".
         ,(and (pred stringp)
               (pred (string< "aaa"))
               (pred (lambda (x) (> (length x) 10)))))
       (message (concat "Matched 1, 2, 3 or 4, and a long string "
                        "that is lexically greater than 'aaa'"))))

## `pred' predicates

Arbitrary predicates can be applied to matched elements, where the predicate
will be passed the object that matched. As in the previous example, lambdas
can be used to form arbitrarily complex predicates, with their own logic.

## guard expressions

At any point within a match, you may assert that something is true by
inserting a guard.
This might consult some other variable to confirm the validity of a pattern
at a given time, or it might reference a local symbol that was earlier bound
by the match itself, as described above:

    (pcase value
      (`(1 2 ,foo ,(guard (or (not (numberp foo)) (/= foo 10))))
       (message (concat "Matched 1, 2, anything, and then anything again, "
                        "but only if the first anything wasn't the number 10"))))

Note that in this example, the guard occurs at a match position, so even
though the guard doesn't refer to what is being matched, if it passes, then
whatever occurs at that position (the fourth element of the list) would be
an unnamed successful match. This is rather bad form, so we can be more
explicit about the logic here:

    (pcase value
      (`(1 2 ,(and foo (guard (or (not (numberp foo)) (/= foo 10)))) ,_)
       (message (concat "Matched 1, 2, anything, and then anything again, "
                        "but only if the first anything wasn't the number 10"))))

This means the same, but associates the guard with the value it tests, and
makes it clear that we don't care what the fourth element is, only that it
exists.

## Pattern let bindings

Within a pattern we can match sub-patterns, using a special form of let that
has a meaning specific to `pcase'. The form is (let UPAT EXP), and it
succeeds if the value of EXP matches UPAT:

    (pcase value
      (`(1 2 ,(and foo (let 3 foo)))
       (message "A weird way of matching (1 2 3)")))

This example is a bit contrived, but it allows us to build up complex guard
patterns that might match against values captured elsewhere in the
surrounding code:

    (pcase value1
      (`(1 2 ,foo)
       (pcase value2
         (`(1 2 ,(and (let (or 3 4) foo) bar))
          (message "A nested pcase depends on the results of the first")))))

Here the third value of `value2' -- which must be a list of exactly three
elements, starting with 1 and 2 -- is being bound to the local variable
`bar', but only if foo was a 3 or 4. There are many other ways this logic
could be expressed, but this gives you a taste of how flexibly you can
introduce arbitrary pattern matching of other values within any UPattern.
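As a further sketch of `guard' and the pattern form of `let' working
together, the following function consults a variable from the surrounding
code. The name `my-check-pair' and its LIMIT argument are invented for this
example:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-check-pair (value limit)
  "Classify VALUE, expected to be a two-element list, relative to LIMIT."
  (pcase value
    ;; The guard refers both to N, captured by this pattern, and to LIMIT,
    ;; an ordinary variable from outside the pattern.
    (`(,(and n (pred integerp) (guard (> n limit))) ,_)
     'first-exceeds-limit)
    ;; The pattern `let' matches (or 3 4) against LIMIT, not against the
    ;; element of VALUE found at this position.
    (`(,_ ,(and (pred stringp) (let (or 3 4) limit)))
     'string-with-limit-3-or-4)
    (_ 'other)))

(my-check-pair '(10 x) 5)    ; first clause: 10 is an integer above 5
(my-check-pair '(1 "a") 3)   ; second clause: a string, and LIMIT is 3
(my-check-pair '(1 "a") 5)   ; neither: LIMIT is not 3 or 4
```

Attaching each guard to the element it tests, as recommended above, keeps
the clause readable as the number of conditions grows.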
# `pcase-let' and `pcase-let*'

That's all there is to know about `pcase'! The other two utilities you might
like to use are `pcase-let` and `pcase-let*`, which do similar things to
their UPattern counterpart `let', but as regular Lisp forms:

    (pcase-let ((`(1 2 ,foo) value1)
                (`(3 4 ,bar) value2))
      (message "value1 is a list of (1 2 %s); value2 ends with %s"
               foo bar))

Note that `pcase-let' does not fail, and always executes the corresponding
forms unless there is a type error. That is, `value1' above is not required
to fit the form of the match exactly. Rather, every binding that can be
paired is bound to its corresponding element, but every binding that cannot
is bound to nil:

    (pcase-let ((`(1 2 ,foo) '(10)))
      (message "foo = %s" foo))
      => prints "foo = nil"

    (pcase-let ((`(1 2 ,foo) 10))
      (message "foo = %s" foo))
      => Lisp error, 10 is not a list

    (pcase-let ((`(1 2 ,foo) '(3 4 10)))
      (message "foo = %s" foo))
      => prints "foo = 10"

(Do not rely on the exact outcome of a mismatch: the `pcase-let' docstring
in later Emacs releases says a mismatch may signal an error or may go
undetected.)

Thus, `pcase-let' could be thought of as a more expressive form of
`cl-destructuring-bind'.

The `pcase-let*' variant, like `let*', allows you to reference bound local
symbols from prior matches:

    (pcase-let* ((`(1 2 ,foo) '(1 2 3))
                 (`(3 4 ,bar) (list 3 4 foo)))
      (message "foo = %s, bar = %s" foo bar))
      => foo = 3, bar = 3

However, if you name a symbol with the same name in a later UPattern, it is
not used as an `eq' test, but rather shadows that symbol:

    (pcase-let* ((`(1 2 ,foo) '(1 2 3))
                 (`(3 4 ,foo) '(3 4 5)))
      (message "1 2 %s" foo))

This prints "1 2 5": the second `foo' shadows the first rather than being
compared against it.
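As a closing sketch, here is `pcase-let*' used purely for destructuring,
where a later binding takes apart a value captured by an earlier one. The
function name and the (point (X . Y) LABEL) record layout are assumptions
made up for this example:

```elisp
;; Hypothetical helper and record layout, for illustration only.
(defun my-describe-point (record)
  "Format RECORD, assumed to look like (point (X . Y) LABEL)."
  (pcase-let* ((`(point ,coords ,label) record)
               ;; COORDS was bound by the previous pattern.
               (`(,x . ,y) coords))
    (format "%s at x=%d, y=%d" label x y)))

(my-describe-point '(point (3 . 4) "home"))
;; => "home at x=3, y=4"
```

Because `pcase-let*' assumes the value fits the pattern, this only makes
sense when RECORD is already known to have the expected shape; use `pcase'
itself when the shape needs to be checked.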
--=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQGcBAEBCgAGBQJWc1ZlAAoJEMFE2PTxn+YwDpkL/1QOHVFSFyaDZIgEv7hKb9yE HTS0rGAZnNHv1szij1QqzVMYMOLUc1FusIbkJiB9TDdCkqsICM7ERvS+12rG1qp+ yvmaqV6UWKjYhsdTFB8tUxEuULLgc39i4+np0j9KrXJrqCuDEecZ+KwlH85s0ZxE HQ4Kw3Cxcvq3GGvlZE7ma6qZhPwq4jUkc4eup66upZoKBFk3oMB9i5pR75kPYeYa 0/JGgTXBvVfV3kNg8ubN3Cm1rTeQsStIj2BwPqFCKjldVQ71wHlxr1W4xt3pSYjl wHC890pkTVr6h/ZyTkdIweGNuUn931ZY/rLdcGT3BoWS0s4D2qiyPOEcscMrlKcZ pD9BKXULe6iA+pd02W+sQPonVYX6zw+kaH/tyGyB2qkhHs7H48OTyyd0r7NxQou3 9C9sWIIFDC29XTkPkWN6n+77o8OIkp5YkyyfcWQYR7WnpWgXj5HbdyfNuTsx63Up xKk9CV66KNbAino8YDTC+OLECOYP0cbeXvhFpQMI/g== =RpcX -----END PGP SIGNATURE----- --=-=-=--