From: immerrr again <immerrr@gmail.com>
To: Noam Postavsky <npostavs@users.sourceforge.net>
Cc: 17862@debbugs.gnu.org, Andreas Schwab <schwab@suse.de>
Subject: bug#17862: 24.3; regexp-opt docstring is incorrect
Date: Thu, 25 Aug 2016 16:21:41 +0300
Message-ID: <CAERznn-RE=SBRjYqcvwLMD7ivmjbm6XdDJ=abYvZe9UzP+HW+Q@mail.gmail.com>
In-Reply-To: <CAM-tV-9gKGSpT0p_PL_=C6Aej5ETSSuNAs4KcMSr2EVTbXoSEg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1341 bytes --]
On Sun, Aug 21, 2016 at 3:47 PM, Noam Postavsky
<npostavs@users.sourceforge.net> wrote:
> On Sat, Jul 30, 2016 at 9:28 AM, <npostavs@users.sourceforge.net> wrote:
>>
>> <snip>
>>
>> Hah, sounds like a challenge :) How about
>>
>> (defun simplified-regexp-opt (strings &optional paren)
>>   (let ((parens (cond ((eq paren 'words) '("\\<\\(" . "\\)\\>"))
>>                       ((eq paren 'symbols) '("\\_<\\(" . "\\)\\_>"))
>>                       ((null paren) '("\\(?:" . "\\)"))
>>                       (t '("\\(" . "\\)")))))
>>     (concat (car parens)
>>             (mapconcat 'regexp-quote strings "\\|")
>>             (cdr parens))))
>>
>>> +@code{nil}
>>> + if all @var{strings} are single-character, the resulting regexp is
>>> + not surrounded, otherwise it is surrounded by @samp{\(?:} and
>>> + @samp{\)}.
>>
>> Zero character strings also:
>>
>> (regexp-opt '("a" "")) ;=> "a?"
>>
>> How about saying "the regexp may be surrounded with \(?: \) to ensure that
>> it constitutes a single expression (such that appending a postfix
>> operator like '+' will apply to the whole expression)."
>>
>
> ping?
Sorry about that, other things took priority.
I liked your ideas, attached is a patch that borrows quite heavily
from them. What do you think?
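
To illustrate the point about postfix operators quoted above, here is a
small sketch. The helper name `my-matches-whole-p' is hypothetical and
not part of Emacs; it only checks whether a regexp, used exactly as
given, matches an entire string.

```elisp
;; Sketch only: `my-matches-whole-p' is a hypothetical helper.
(defun my-matches-whole-p (regexp string)
  "Return non-nil if REGEXP matches STRING from its start to its end."
  (let ((case-fold-search nil))
    (save-match-data
      (and (eq (string-match regexp string) 0)
           (= (match-end 0) (length string))))))

;; Ungrouped alternation: the appended "+" binds only to the final "d",
;; so the regexp reads as "ab" OR "cd+" and stops after the first "ab".
(my-matches-whole-p (concat "ab\\|cd" "+") "abcdab")

;; `regexp-opt' with PAREN nil: the result is a single expression, so
;; the appended "+" repeats the whole alternation.
(my-matches-whole-p (concat (regexp-opt '("ab" "cd")) "+") "abcdab")
```

The first call should return nil and the second non-nil, which is the
behaviour the proposed wording for the nil case tries to describe.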
[-- Attachment #2: 0001-Fix-regexp-opt-documentation-bug-17862.patch --]
[-- Type: text/x-patch, Size: 5650 bytes --]
From 3801c2fb7c75c1d45ea7a126fdfb801ff13de5a4 Mon Sep 17 00:00:00 2001
From: immerrr <immerrr@gmail.com>
Date: Sun, 7 Feb 2016 12:46:37 +0300
Subject: [PATCH] Fix regexp-opt documentation (bug #17862)
* lisp/emacs-lisp/regexp-opt.el (regexp-opt): update PAREN doc
* doc/lispref/searching.texi (Regexp Functions): update PAREN doc
---
doc/lispref/searching.texi | 52 ++++++++++++++++++++++++++++---------------
lisp/emacs-lisp/regexp-opt.el | 46 +++++++++++++++++++++++++++++---------
2 files changed, 70 insertions(+), 28 deletions(-)
diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 1243d72..e42f630 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -946,26 +946,42 @@ Regexp Functions
more efficient, but is almost never worth the effort.}.
@c E.g., see http://debbugs.gnu.org/2816
-If the optional argument @var{paren} is non-@code{nil}, then the
-returned regular expression is always enclosed by at least one
-parentheses-grouping construct. If @var{paren} is @code{words}, then
-that construct is additionally surrounded by @samp{\<} and @samp{\>};
-alternatively, if @var{paren} is @code{symbols}, then that construct
-is additionally surrounded by @samp{\_<} and @samp{\_>}
-(@code{symbols} is often appropriate when matching
-programming-language keywords and the like).
-
-This simplified definition of @code{regexp-opt} produces a
-regular expression which is equivalent to the actual value
-(but not as efficient):
+The optional argument @var{paren} can be any of the following:
+
+a string
+ the resulting regexp is preceded by @var{paren} and followed by
+ @samp{\)}, e.g. use @samp{"\\(?1:"} to produce an explicitly
+ numbered group.
+
+@code{words}
+ the resulting regexp is surrounded by @samp{\<\(} and @samp{\)\>}.
+
+@code{symbols}
+ the resulting regexp is surrounded by @samp{\_<\(} and @samp{\)\_>}
+ (this is often appropriate when matching programming-language
+ keywords and the like).
+
+non-@code{nil}
+ the resulting regexp is surrounded by @samp{\(} and @samp{\)}.
+
+@code{nil}
+ the resulting regexp is surrounded by @samp{\(?:} and @samp{\)}, if
+ it is necessary to ensure that a postfix operator appended to it
+ will apply to the whole expression.
+
+The resulting regexp of @code{regexp-opt} is equivalent to but usually
+more efficient than that of a simplified version:
@example
-(defun regexp-opt (strings &optional paren)
- (let ((open-paren (if paren "\\(" ""))
- (close-paren (if paren "\\)" "")))
- (concat open-paren
- (mapconcat 'regexp-quote strings "\\|")
- close-paren)))
+(defun simplified-regexp-opt (strings &optional paren)
+  (let ((parens (cond ((stringp paren) (cons paren "\\)"))
+                      ((eq paren 'words) '("\\<\\(" . "\\)\\>"))
+                      ((eq paren 'symbols) '("\\_<\\(" . "\\)\\_>"))
+                      ((null paren) '("\\(?:" . "\\)"))
+                      (t '("\\(" . "\\)")))))
+    (concat (car parens)
+            (mapconcat 'regexp-quote strings "\\|")
+            (cdr parens))))
@end example
@end defun
diff --git a/lisp/emacs-lisp/regexp-opt.el b/lisp/emacs-lisp/regexp-opt.el
index b1e132a..95c6109 100644
--- a/lisp/emacs-lisp/regexp-opt.el
+++ b/lisp/emacs-lisp/regexp-opt.el
@@ -86,18 +86,44 @@
;;;###autoload
(defun regexp-opt (strings &optional paren)
"Return a regexp to match a string in the list STRINGS.
-Each string should be unique in STRINGS and should not contain any regexps,
-quoted or not. If optional PAREN is non-nil, ensure that the returned regexp
-is enclosed by at least one regexp grouping construct.
-The returned regexp is typically more efficient than the equivalent regexp:
+Each string should be unique in STRINGS and should not contain
+any regexps, quoted or not. Optional PAREN specifies how the
+returned regexp is surrounded by grouping constructs.
- (let ((open (if PAREN \"\\\\(\" \"\")) (close (if PAREN \"\\\\)\" \"\")))
- (concat open (mapconcat \\='regexp-quote STRINGS \"\\\\|\") close))
+The optional argument PAREN can be any of the following:
-If PAREN is `words', then the resulting regexp is additionally surrounded
-by \\=\\< and \\>.
-If PAREN is `symbols', then the resulting regexp is additionally surrounded
-by \\=\\_< and \\_>."
+a string
+ the resulting regexp is preceded by PAREN and followed by
+ \\), e.g. use \"\\\\(?1:\" to produce an explicitly numbered
+ group.
+
+`words'
+ the resulting regexp is surrounded by \\=\\<\\( and \\)\\>.
+
+`symbols'
+ the resulting regexp is surrounded by \\_<\\( and \\)\\_>.
+
+non-nil
+ the resulting regexp is surrounded by \\( and \\).
+
+nil
+ the resulting regexp is surrounded by \\(?: and \\), if it is
+ necessary to ensure that a postfix operator appended to it will
+ apply to the whole expression.
+
+The resulting regexp is equivalent to but usually more efficient
+than that of a simplified version:
+
+  (defun simplified-regexp-opt (strings &optional paren)
+    (let ((parens
+           (cond ((stringp paren) (cons paren \"\\\\)\"))
+                 ((eq paren 'words) '(\"\\\\\\=<\\\\(\" . \"\\\\)\\\\>\"))
+                 ((eq paren 'symbols) '(\"\\\\_<\\\\(\" . \"\\\\)\\\\_>\"))
+                 ((null paren) '(\"\\\\(?:\" . \"\\\\)\"))
+                 (t '(\"\\\\(\" . \"\\\\)\")))))
+      (concat (car parens)
+              (mapconcat 'regexp-quote strings \"\\\\|\")
+              (cdr parens))))"
(save-match-data
;; Recurse on the sorted list.
(let* ((max-lisp-eval-depth 10000)
--
2.9.2
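
One possible way to sanity-check the equivalence claimed in the patch is
sketched below. The names `my-simplified-regexp-opt' and
`my-regexps-agree-p' are hypothetical and exist only for this
illustration; the first mirrors the simplified definition from the patch
(binding and using `parens' consistently), the second compares it
against the real `regexp-opt' on a list of candidate strings. It assumes
a non-empty STRINGS list.

```elisp
;; Sketch only: both functions below are hypothetical, not part of Emacs.
(defun my-simplified-regexp-opt (strings &optional paren)
  "Return an unoptimized regexp matching any of STRINGS, grouped per PAREN."
  (let ((parens (cond ((stringp paren) (cons paren "\\)"))
                      ((eq paren 'words) '("\\<\\(" . "\\)\\>"))
                      ((eq paren 'symbols) '("\\_<\\(" . "\\)\\_>"))
                      ((null paren) '("\\(?:" . "\\)"))
                      (t '("\\(" . "\\)")))))
    (concat (car parens)
            (mapconcat #'regexp-quote strings "\\|")
            (cdr parens))))

(defun my-regexps-agree-p (strings paren candidates)
  "Return non-nil if both regexps accept exactly the same CANDIDATES.
Each regexp is wrapped in a shy group and anchored to the whole string,
so only full matches count."
  (let ((case-fold-search nil)
        (fast (concat "\\`\\(?:" (regexp-opt strings paren) "\\)\\'"))
        (slow (concat "\\`\\(?:"
                      (my-simplified-regexp-opt strings paren)
                      "\\)\\'"))
        (agree t))
    (dolist (s candidates agree)
      ;; Compare only whether each regexp matches, not where.
      (unless (eq (null (string-match-p fast s))
                  (null (string-match-p slow s)))
        (setq agree nil)))))

;; Example: PAREN nil with strings sharing a prefix.
(my-regexps-agree-p '("foo" "bar" "foobar") nil
                    '("foo" "bar" "foobar" "fo" "baz" ""))
```

This only compares which candidates are accepted in full; it says
nothing about efficiency or about how group numbers differ inside the
optimized regexp.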