unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: immerrr again <immerrr@gmail.com>
To: npostavs@users.sourceforge.net
Cc: 17862@debbugs.gnu.org, Andreas Schwab <schwab@suse.de>
Subject: bug#17862: 24.3; regexp-opt docstring is incorrect
Date: Fri, 29 Jul 2016 06:57:37 +0300	[thread overview]
Message-ID: <CAFKGYcMiL0D08KZ0O=-zyp9ze9-YkWE4OU_ELk95+mfiw4oicg@mail.gmail.com> (raw)
In-Reply-To: <87zip19rm4.fsf@users.sourceforge.net>

[-- Attachment #1: Type: text/plain, Size: 2644 bytes --]

Thank you for the review!

Please, see my answers inline and an updated patch attached.

On Fri, Jul 29, 2016 at 4:10 AM,  <npostavs@users.sourceforge.net> wrote:
> immerrr again <immerrr@gmail.com> writes:
>
>> +non-@code{nil}
>
> Should be just non-nil, I believe.
>

I was following the precedent set by other documentation in searching.texi.

>> +    the resulting regexp is surrounded by @samp{\(} and @samp{\)}.
>> +
>> +Otherwise the resulting regexp is surrounded by @samp{\(?:} and
>> +@samp{\)}.
>
> This is not 100% accurate, here is a counterexample:
>
>     (regexp-opt '("a" "b" "c")) ;=> "[abc]"
>
> Not sure how detailed we want to be about this is in the docs though.
>

Agreed, I think reference documentation must cover all cases.  I have updated
it.  The description still seems concise but might have become confusing and
might need an illustrative example like yours.

>>
>>  This simplified definition of @code{regexp-opt} produces a
>>  regular expression which is equivalent to the actual value
>> -(but not as efficient):
>> +(but typically more efficient):
>
> "not as efficient" refers to the output of the simplified definition.
>

Right.  The paragraph referring to "regexp-opt" got me confused, I have
renamed the simplified version "simplified-regexp-opt" for clarity.

>>
>>  @example
>>  (defun regexp-opt (strings &optional paren)
>> -  (let ((open-paren (if paren "\\(" ""))
>> -        (close-paren (if paren "\\)" "")))
>> +  (let ((open-paren (make-open-paren paren))
>> +        (close-paren (make-close-paren paren)))
>
> I'm not sure that adding these undefined make-{open,close}-paren
> functions makes things any clearer.
>

I agree that adding "impenetrable definitions" might be unclear, but I'd argue
that it's still an improvement because the equivalence statement is
plain false now:

- "words" and "symbol" keywords add special backslash characters
around the result

- if PAREN is nil, the result is still grouped (unless the strings can be
  represented as a character set, but that's not handled in the
simplified version)

I doubt we can put all the logic behind PAREN into the simplified version.  I
made an attempt to clarify the "impenetrable definitions" by their name, but if
that's still unacceptable we might want to drop the "is equivalent to" part
which is currently misleading.

>>  The returned regexp is typically more efficient than the equivalent regexp:
>>
>> - (let ((open (if PAREN \"\\\\(\" \"\")) (close (if PAREN \"\\\\)\" \"\")))
>> + (let ((open (make-open-paren PAREN)) (close (make-close-paren PAREN)))
>
> Same comment here.

Same response here :)

Cheers,
immerrr

[-- Attachment #2: 0001-Fix-regexp-opt-documentation-bug-17862.patch --]
[-- Type: text/x-patch, Size: 5045 bytes --]

From 1cf5026b56b6b1b58f8401c7642d06a2809f1e65 Mon Sep 17 00:00:00 2001
From: immerrr <immerrr@gmail.com>
Date: Sun, 7 Feb 2016 12:46:37 +0300
Subject: [PATCH] Fix regexp-opt documentation (bug #17862)

* lisp/emacs-lisp/regexp-opt.el (regexp-opt): update PAREN doc

* doc/lispref/searching.texi (Regexp Functions): update PAREN doc
---
 doc/lispref/searching.texi    | 43 ++++++++++++++++++++++++++++---------------
 lisp/emacs-lisp/regexp-opt.el | 41 +++++++++++++++++++++++++++++++----------
 2 files changed, 59 insertions(+), 25 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 1243d72..b0ad435 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -946,23 +946,36 @@ Regexp Functions
 more efficient, but is almost never worth the effort.}.
 @c E.g., see http://debbugs.gnu.org/2816
 
-If the optional argument @var{paren} is non-@code{nil}, then the
-returned regular expression is always enclosed by at least one
-parentheses-grouping construct.  If @var{paren} is @code{words}, then
-that construct is additionally surrounded by @samp{\<} and @samp{\>};
-alternatively, if @var{paren} is @code{symbols}, then that construct
-is additionally surrounded by @samp{\_<} and @samp{\_>}
-(@code{symbols} is often appropriate when matching
-programming-language keywords and the like).
-
-This simplified definition of @code{regexp-opt} produces a
-regular expression which is equivalent to the actual value
-(but not as efficient):
+The optional argument @var{paren} can be any of the following:
+
+a string
+    the resulting regexp is preceded by @var{paren} and followed by
+    @samp{\)}, e.g. use @samp{"\\(?1:"} to produce an explicitly
+    numbered group.
+
+@code{words}
+    the resulting regexp is surrounded by @samp{\<\(} and @samp{\)\>}.
+
+@code{symbols}
+    the resulting regexp is surrounded by @samp{\_<\(} and @samp{\)\_>}
+    (this is often appropriate when maching programming-language
+    keywords and the like).
+
+non-@code{nil}
+    the resulting regexp is surrounded by @samp{\(} and @samp{\)}.
+
+@code{nil}
+    if all @var{strings} are single-character, the resulting regexp is
+    not surrounded, otherwise it is surrounded by @samp{\(?:} and
+    @samp{\)}.
+
+The resulting regexp of @code{regexp-opt} is equivalent to but usually
+more efficient than that of a simplified version:
 
 @example
-(defun regexp-opt (strings &optional paren)
-  (let ((open-paren (if paren "\\(" ""))
-        (close-paren (if paren "\\)" "")))
+(defun simplified-regexp-opt (strings &optional paren)
+  (let ((open-paren (open-paren-logic-as-described-above paren))
+        (close-paren (close-paren-logic-as-described-above paren)))
     (concat open-paren
             (mapconcat 'regexp-quote strings "\\|")
             close-paren)))
diff --git a/lisp/emacs-lisp/regexp-opt.el b/lisp/emacs-lisp/regexp-opt.el
index b1e132a..080aa32 100644
--- a/lisp/emacs-lisp/regexp-opt.el
+++ b/lisp/emacs-lisp/regexp-opt.el
@@ -86,18 +86,39 @@
 ;;;###autoload
 (defun regexp-opt (strings &optional paren)
   "Return a regexp to match a string in the list STRINGS.
-Each string should be unique in STRINGS and should not contain any regexps,
-quoted or not.  If optional PAREN is non-nil, ensure that the returned regexp
-is enclosed by at least one regexp grouping construct.
-The returned regexp is typically more efficient than the equivalent regexp:
+Each string should be unique in STRINGS and should not contain
+any regexps, quoted or not.  Optional PAREN specifies how the
+returned regexp is surrounded by grouping constructs.
 
- (let ((open (if PAREN \"\\\\(\" \"\")) (close (if PAREN \"\\\\)\" \"\")))
-   (concat open (mapconcat \\='regexp-quote STRINGS \"\\\\|\") close))
+The optional argument PAREN can be any of the following:
 
-If PAREN is `words', then the resulting regexp is additionally surrounded
-by \\=\\< and \\>.
-If PAREN is `symbols', then the resulting regexp is additionally surrounded
-by \\=\\_< and \\_>."
+a string
+    the resulting regexp is preceded by PAREN and followed by
+    \\), e.g.  use \"\\\\(?1:\" to produce an explicitly numbered
+    group.
+
+`words'
+    the resulting regexp is surrounded by \\=\\<\\( and \\)\\>.
+
+`symbols'
+    the resulting regexp is surrounded by \\_<\\( and \\)\\_>.
+
+non-nil
+    the resulting regexp is surrounded by \\( and \\).
+
+nil
+    if all STRINGS are single-character, the resulting regexp is
+    not surrounded, otherwise it is surrounded by \\(?: and \\).
+
+The resulting regexp is equivalent to but usually more efficient
+than that of a simplified version:
+
+ (defun simplified-regexp-opt (strings &optional paren)
+   (let ((open-paren (open-paren-logic-as-described-above paren))
+         (close-paren (close-paren-logic-as-described-above paren)))
+     (concat open-paren
+             (mapconcat 'regexp-quote strings \"\\\\|\")
+             close-paren)))"
   (save-match-data
     ;; Recurse on the sorted list.
     (let* ((max-lisp-eval-depth 10000)
-- 
2.9.2


  reply	other threads:[~2016-07-29  3:57 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-27  7:20 bug#17862: 24.3; regexp-opt docstring is incorrect immerrr again
2014-06-30 13:37 ` Stefan Monnier
2014-07-01  7:15   ` immerrr again
2014-07-01  6:52 ` Glenn Morris
2014-07-01  7:16   ` Andreas Schwab
2014-07-01 15:41     ` Glenn Morris
2014-07-01 16:22       ` Stefan Monnier
2016-02-07 10:51         ` immerrr again
2016-07-29  1:10           ` npostavs
2016-07-29  3:57             ` immerrr again [this message]
2016-07-30 13:28               ` npostavs
2016-08-21 12:47                 ` Noam Postavsky
2016-08-25 13:21                   ` immerrr again
2016-08-26  1:08                     ` npostavs
2016-08-29  8:45                       ` immerrr again
2016-09-02  3:06                         ` npostavs
2016-09-02  7:03                           ` Eli Zaretskii
2016-09-02 12:30                             ` immerrr again
2016-09-02 12:31                               ` immerrr again
2016-09-02 13:11                               ` Eli Zaretskii
2016-09-03 20:11                                 ` Noam Postavsky
2016-09-04  2:36                                   ` Eli Zaretskii
2016-09-04  3:59                                     ` npostavs

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFKGYcMiL0D08KZ0O=-zyp9ze9-YkWE4OU_ELk95+mfiw4oicg@mail.gmail.com' \
    --to=immerrr@gmail.com \
    --cc=17862@debbugs.gnu.org \
    --cc=npostavs@users.sourceforge.net \
    --cc=schwab@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).