From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#36496: [PATCH] Describe the rx notation in the lisp manual
Date: Thu, 04 Jul 2019 19:28:01 +0300
Message-ID: <838stdbw8e.fsf@gnu.org>
References: <0C783D67-9502-408B-B845-5599BD596361@acm.org>
In-reply-to: <0C783D67-9502-408B-B845-5599BD596361@acm.org> (message from Mattias Engdegård on Thu, 4 Jul 2019 14:13:26 +0200)
Cc: 36496@debbugs.gnu.org
To: Mattias Engdegård
Xref: news.gmane.org gmane.emacs.bugs:162067

> From: Mattias Engdegård
> Date: Thu, 4 Jul 2019 14:13:26 +0200
>
> The rx notation is useful and complex enough to merit inclusion in the manual.
>
> Right now, it's mainly described in the `rx' doc string, which is fairly well-written but quite long and a bit unstructured. Describing it in the manual permits a different pace and style of exposition, the inclusion of examples and related information, structured into separate sections with cross-references.
>
> Proposed patch attached. It covers all rx features, functions, macros, including the pcase pattern, and a mention of the corresponding string regexp constructs.

This is a large section.  The ELisp reference is already a large book,
printed in two separate volumes.  So I think if we want to include
this section, it will have to be on a separate file that is
conditionally included @ifnottex.

Alternatively, we could make this a separate manual.

> The existing `rx' doc string can be left unchanged, or reduced to something more concise, perhaps without a description of the entire rx language but with a manual reference. Suggestions are welcome.

Yes, the doc string should be reduced to the summary of the
constructs.

> +@table @code
> +@item (let @var{ref} @var{rx-expr}@dots{})
> +Bind the name @var{ref} to a submatch that matches @var{rx-expr}@enddots{}.
   ^^^^^^^^^^^^^^^^^^^^^^^
"Bind the symbol @var{ref}", no?

> +@example
> +@group
> +(rx "/*"                          ; Initial /*
> +    (zero-or-more
> +     (or (not (any "*"))          ; Either non-*,
> +         (seq "*"                 ; or * followed by
> +              (not (any "/")))))  ; non-/
> +    (one-or-more "*")             ; At least one star,
> +    "/")                          ; and the final /
> +@end group
> +@end example
> +
> +or, using shorter synonyms and written more compactly,

This last line needs @noindent before it.
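
For reference, the quoted example can be tried out directly.  What
follows is only a minimal sketch: the constant name
my-c-comment-regexp is hypothetical and not part of the patch, and the
sample strings are made up for illustration.

```elisp
(require 'rx)

;; The rx form quoted from the patch: matches a C-style block comment.
;; `my-c-comment-regexp' is a hypothetical name used only here.
(defconst my-c-comment-regexp
  (rx "/*"                          ; Initial /*
      (zero-or-more
       (or (not (any "*"))          ; Either non-*,
           (seq "*"                 ; or * followed by
                (not (any "/")))))  ; non-/
      (one-or-more "*")             ; At least one star,
      "/"))                         ; and the final /

;; Non-nil when the text contains a complete comment.
(string-match-p my-c-comment-regexp "x = 1; /* note */")

;; nil: the comment is never closed.
(string-match-p my-c-comment-regexp "/* unterminated")
```

Evaluating the defconst also shows the string regexp that rx
generates, which is a convenient way to compare it with the
"corresponding string regexp" entries in the proposed text.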
> +@table @asis
> +@item @code{"some-string"}

Why @code{"..."} and not @samp{...}?  The latter will look better both
in print and in Info format.

> +Corresponding string regexp: @samp{AB@dots{}} (subexpressions in sequence).
                                ^^^^^^^^^^^^^^^^
I think this should use @samp{@var{a}@var{b}@dots{}} instead.  And
likewise for the other "corresponding string regexps".  The reason is
that neither A nor B stand for themselves, literally, they are
meta-variables.

> +Match the @var{rx}s once or not at all.@*

"Match @var{rx} or an empty string" sounds better to me.

> +Match the @var{rx}s zero or more times, non-greedily.@*

I would add here a cross-reference to where greedy matching is
described.

> +@item @code{(any @var{charset}@dots{})}

Please don't call this "charset", as that term is already taken by a
very different creature in Emacs.  I suggest "character set" instead.

> +Each @var{charset} is a character, a string representing the set of
> +its characters, a range or a character class. A range is either a
> +hyphen-separated string like @code{"A-Z"}, or a cons of characters
> +like @code{(?A . ?Z)}.

Again, a cross-reference to where "character class" is described would
be good here, as would a @cindex entry for "character class in rx".

> +@item @code{space}, @code{whitespace}, @code{white}
> +Match any character that has whitespace syntax.

Only ASCII or also non-ASCII?  This should be spelled out.

> +@xref{Syntax Class Table} for details. Please note that
                            ^
Comma missing there.

> +@kbd{M-x describe-categories @key{RET}}. @xref{Categories} for how
                                                             ^
Likewise.

Thanks.
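
To make the points about `any' and non-greedy matching concrete, here
is a small sketch.  It is illustrative only: the sample text "<a><b>"
and the particular character sets are assumptions chosen for the
example, not taken from the patch.

```elisp
(require 'rx)

;; Argument forms accepted by `any', following the quoted description:
(rx (any "A-Z"))          ; range as a hyphen-separated string
(rx (any (?A . ?Z)))      ; range as a cons of characters
(rx (any "_" digit))      ; a string of characters plus a character class

;; Greedy versus non-greedy repetition on the same text.
(let ((text "<a><b>"))
  (list
   ;; `zero-or-more' is greedy: it extends to the last ">".
   (and (string-match (rx "<" (zero-or-more nonl) ">") text)
        (match-string 0 text))
   ;; `*?' is non-greedy: it stops at the first ">".
   (and (string-match (rx "<" (*? nonl) ">") text)
        (match-string 0 text))))
;; => ("<a><b>" "<a>")
```

The difference between the two results is the behavior that a
cross-reference to the description of greedy matching would explain.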