* Regexp for matching (defun lines
@ 2024-07-31 19:23 Heime
2024-07-31 19:59 ` [External] : " Drew Adams
0 siblings, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 19:23 UTC (permalink / raw)
To: Heime via Users list for the GNU Emacs text editor
I am using the following regexp to match "(defun" lines:
("defun" "^\\s-*(defun\\s-+\\([[:alnum:]-]+\\)" 1)
I need regexp suggestions to improve upon it.
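A minimal sketch of how the regexp above behaves, assuming it is applied to one line at a time with `string-match'. The helper name `my-defun-name' is hypothetical and not part of the original message. It shows that the character class stops at the first character that is neither alphanumeric nor a hyphen.

```emacs-lisp
;; Hypothetical helper for illustration only.
(defun my-defun-name (line)
  "Return the function name if LINE is a `defun' line, else nil."
  (when (string-match "^\\s-*(defun\\s-+\\([[:alnum:]-]+\\)" line)
    ;; Group 1 is the name captured by the bracket expression.
    (match-string 1 line)))

(my-defun-name "(defun foo-bar (x) x)")  ; "foo-bar"
(my-defun-name "  (defun foo_bar ()")    ; "foo" (the underscore ends the match)
```

The second call shows the limitation being asked about: a name containing `_' is captured only up to the underscore.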
* RE: [External] : Regexp for matching (defun lines
2024-07-31 19:23 Regexp for matching (defun lines Heime
@ 2024-07-31 19:59 ` Drew Adams
2024-07-31 20:14 ` Heime
0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2024-07-31 19:59 UTC (permalink / raw)
To: Heime, 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
> I am using the following regexp to match "(defun" lines:
>
> ("defun" "^\\s-*(defun\\s-+\\([[:alnum:]-]+\\)" 1)
>
> I need regexp suggestions to improve upon it.
FWIW, in `imenu+.el' I use these regexps for function
and macro definitions:
imenup-lisp-fn-defn-regexp-1:
(concat
"^\\s-*("
(regexp-opt
'("defun" "cl-defun" "defun*" "defsubst" "cl-defsubst"
"define-inline" "define-advice" "defadvice" "define-skeleton"
"define-compilation-mode" "define-minor-mode"
"define-global-minor-mode" "define-globalized-minor-mode"
"define-derived-mode" "define-generic-mode" "defsetf"
"define-setf-expander" "define-method-combination"
"defgeneric" "cl-defgeneric" "defmethod" "cl-defmethod"
"ert-deftest" "icicle-define-command"
"icicle-define-file-command")
t)
"\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
imenup-lisp-fn-defn-regexp-2 (defs with a quoted name):
(concat "^\\s-*("
(regexp-opt '("defalias" "fset") t)
"\\s-+'\\s-*\\(\\(\\sw\\|\\s_\\)+\\)")
imenup-lisp-macro-defn-regexp:
"(\\s-*\\(defmacro\\|cl-defmacro\\|cl-define-compiler-macro\\|\
define-compiler-macro\\|define-modify-macro\\)\\s-+\\([^ \t()]+\\)"
You don't need all of those (e.g., Icicles defs and
old defadvice defs for older Emacs releases). But
you might want to include some other ways to define
functions and macros.
HTH.
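A minimal sketch of the `regexp-opt' approach described above, using a shortened keyword list chosen only for illustration. The names `my-fn-defn-regexp' and `my-fn-defn-name' are hypothetical. One detail worth noting: because `regexp-opt' is called with a non-nil second argument, the keyword itself becomes group 1, so the definition name is in group 2.

```emacs-lisp
;; Shortened, illustrative keyword list; extend it as needed.
(defvar my-fn-defn-regexp
  (concat "^\\s-*("
          (regexp-opt '("defun" "cl-defun" "defsubst" "define-minor-mode") t)
          "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
  "Sketch of a regexp matching several function-defining forms.")

(defun my-fn-defn-name (line)
  "Return the name defined on LINE, or nil if LINE is not a definition."
  ;; \\sw and \\s_ depend on the syntax table, so use the Emacs Lisp one.
  (with-syntax-table emacs-lisp-mode-syntax-table
    (when (string-match my-fn-defn-regexp line)
      (match-string 2 line))))

(my-fn-defn-name "(cl-defun foo-bar* (x))")       ; "foo-bar*"
(my-fn-defn-name "(define-minor-mode my-mode")    ; "my-mode"
(my-fn-defn-name "(defvar x 1)")                  ; nil
```

If such a regexp is used in an imenu entry of the form (TITLE REGEXP INDEX), the INDEX would accordingly be 2 rather than 1.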
* RE: [External] : Regexp for matching (defun lines
2024-07-31 19:59 ` [External] : " Drew Adams
@ 2024-07-31 20:14 ` Heime
2024-07-31 21:02 ` Drew Adams
0 siblings, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 20:14 UTC (permalink / raw)
To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
On Thursday, August 1st, 2024 at 7:59 AM, Drew Adams <drew.adams@oracle.com> wrote:
> > I am using the following regexp to match "(defun" lines:
> >
> > ("defun" "^\\s-*(defun\\s-+\\([[:alnum:]-]+\\)" 1)
> >
> > I need regexp suggestions to improve upon it.
>
>
> FWIW, in `imenu+.el' I use these regexps for function
> and macro definitions:
>
> imenup-lisp-fn-defn-regexp-1:
>
> (concat
> "^\\s-*("
> (regexp-opt
> '("defun" "cl-defun" "defun*" "defsubst" "cl-defsubst"
> "define-inline" "define-advice" "defadvice" "define-skeleton"
> "define-compilation-mode" "define-minor-mode"
> "define-global-minor-mode" "define-globalized-minor-mode"
> "define-derived-mode" "define-generic-mode" "defsetf"
> "define-setf-expander" "define-method-combination"
> "defgeneric" "cl-defgeneric" "defmethod" "cl-defmethod"
> "ert-deftest" "icicle-define-command"
> "icicle-define-file-command")
> t)
> "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
Why do you use the OR "\\|" with "\\s_"?
> imenup-lisp-fn-defn-regexp-2 (defs with a quoted name):
>
> (concat "^\\s-*("
> (regexp-opt '("defalias" "fset") t)
> "\\s-+'\\s-*\\(\\(\\sw\\|\\s_\\)+\\)")
>
>
> imenup-lisp-macro-defn-regexp:
>
> "(\\s-*\\(defmacro\\|cl-defmacro\\|cl-define-compiler-macro\\|\
> define-compiler-macro\\|define-modify-macro\\)\\s-+\\([^ \t()]+\\)"
>
>
> You don't need all of those (e.g., Icicles defs and
> old defadvice defs for older Emacs releases). But
> you might want to include some other ways to define
> functions and macros.
>
> HTH.
* RE: [External] : Regexp for matching (defun lines
2024-07-31 20:14 ` Heime
@ 2024-07-31 21:02 ` Drew Adams
2024-07-31 21:15 ` Heime
2024-07-31 21:29 ` Heime
0 siblings, 2 replies; 11+ messages in thread
From: Drew Adams @ 2024-07-31 21:02 UTC (permalink / raw)
To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
> > (concat
> > "^\\s-*("
> > (regexp-opt...)
> > t)
> > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
>
> I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
No special advantage. You can include any other
chars you want, so you can pick up, e.g.,
(defun foo!@$%^&*+={}/:42<>? ()
(message "Hello"))
Perfectly legitimate, and none of those chars
even require escaping.
A function name can include ANY chars, including
whitespace and chars that normally have special
meaning for Lisp, but some need to be escaped
in the defun. If you want to handle such cases,
go for it.
My point was really to point out that there are
many ways to define a function, other than just
`defun'.
> Why do you use the OR "\\|" with "\\s_" ?
Word-syntax chars plus symbol-syntax chars.
But use whatever you like.
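A minimal sketch of what the alternation buys, assuming the Emacs Lisp syntax table, where `-' has symbol syntax rather than word syntax. The helper name `my-leading-match' is hypothetical.

```emacs-lisp
;; Hypothetical helper for illustration only.
(defun my-leading-match (regexp string)
  "Return the text REGEXP matches at the start of STRING, or nil.
Matching uses the Emacs Lisp syntax table."
  (with-syntax-table emacs-lisp-mode-syntax-table
    ;; The shy group keeps any top-level \\| in REGEXP anchored by \\`.
    (when (string-match (concat "\\`\\(?:" regexp "\\)") string)
      (match-string 0 string))))

(my-leading-match "\\sw+" "foo-bar baz")                 ; "foo"
(my-leading-match "\\(?:\\sw\\|\\s_\\)+" "foo-bar baz")  ; "foo-bar"
```

With word syntax alone the match stops at the hyphen; adding symbol syntax lets it run to the end of the symbol.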
* RE: [External] : Regexp for matching (defun lines
2024-07-31 21:02 ` Drew Adams
@ 2024-07-31 21:15 ` Heime
2024-08-01 2:11 ` Drew Adams
2024-07-31 21:29 ` Heime
1 sibling, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 21:15 UTC (permalink / raw)
To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
On Thursday, August 1st, 2024 at 9:02 AM, Drew Adams <drew.adams@oracle.com> wrote:
> > > (concat
> > > "^\\s-*("
> > > (regexp-opt...)
> > > t)
> > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> >
> > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
>
>
> No special advantage. You can include any other
> chars you want, so you can pick up, e.g.,
>
> (defun foo!@$%^&*+={}/:42<>? ()
>
> (message "Hello"))
>
> Perfectly legitimate, and none of those chars
> even require escaping.
>
> A function name can include ANY chars, including
> whitespace and chars that normally have special
> meaning for Lisp, but some need to be escaped
> in the defun. If you want to handle such cases,
> go for it.
>
> My point was really to point out that there are
> many ways to define a function, other than just
> `defun'.
>
> > Why do you use the OR "\\|" with "\\s_" ?
>
>
> Word-syntax chars plus symbol-syntax chars.
> But use whatever you like.
I looked in the Elisp Reference Manual but could not find a description of \s_.
* RE: [External] : Regexp for matching (defun lines
2024-07-31 21:02 ` Drew Adams
2024-07-31 21:15 ` Heime
@ 2024-07-31 21:29 ` Heime
2024-08-01 2:08 ` Drew Adams
1 sibling, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 21:29 UTC (permalink / raw)
To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
On Thursday, August 1st, 2024 at 9:02 AM, Drew Adams <drew.adams@oracle.com> wrote:
> > > (concat
> > > "^\\s-*("
> > > (regexp-opt...)
> > > t)
> > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> >
> > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
>
>
> No special advantage. You can include any other
> chars you want, so you can pick up, e.g.,
>
> (defun foo!@$%^&*+={}/:42<>? ()
>
> (message "Hello"))
>
> Perfectly legitimate, and none of those chars
> even require escaping.
Meaning that \\sw is superior to [[:alnum:]-_], right ?
> A function name can include ANY chars, including
> whitespace and chars that normally have special
> meaning for Lisp, but some need to be escaped
> in the defun. If you want to handle such cases,
> go for it.
>
> My point was really to point out that there are
> many ways to define a function, other than just
> `defun'.
>
> > Why do you use the OR "\\|" with "\\s_" ?
>
>
> Word-syntax chars plus symbol-syntax chars.
> But use whatever you like.
* RE: [External] : Regexp for matching (defun lines
2024-07-31 21:29 ` Heime
@ 2024-08-01 2:08 ` Drew Adams
2024-08-01 2:24 ` Heime
0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2024-08-01 2:08 UTC (permalink / raw)
To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
> > > > (concat
> > > > "^\\s-*("
> > > > (regexp-opt...)
> > > > t)
> > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > >
> > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> >
> > No special advantage. You can include any other
> > chars you want, so you can pick up, e.g.,
> >
> > (defun foo!@$%^&*+={}/:42<>? ()
> > (message "Hello"))
> >
> > Perfectly legitimate, and none of those chars
> > even require escaping.
>
> Meaning that \\sw is superior to [[:alnum:]-_], right ?
No.
\\sw means word-char syntax.
[[:alnum:]-_] means alphanumeric- or symbol-char syntax.
* RE: [External] : Regexp for matching (defun lines
2024-07-31 21:15 ` Heime
@ 2024-08-01 2:11 ` Drew Adams
0 siblings, 0 replies; 11+ messages in thread
From: Drew Adams @ 2024-08-01 2:11 UTC (permalink / raw)
To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
> Looked into the Elisp Ref Manual and could not find a description of \s_
https://www.gnu.org/software/emacs/manual/html_node/elisp/Syntax-Descriptors.html
for \s syntax (syntax descriptors).
https://www.gnu.org/software/emacs/manual/html_node/elisp/Syntax-Class-Table.html
for _ representing symbol constituents.
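A minimal sketch of how to check which syntax class a character has, which is what the letter after `\s' in a regexp refers to. It assumes the Emacs Lisp syntax table; the helper name `my-elisp-syntax-class' is hypothetical.

```emacs-lisp
;; Hypothetical helper for illustration only.
(defun my-elisp-syntax-class (char)
  "Return the syntax class designator of CHAR in the Emacs Lisp syntax table."
  (with-syntax-table emacs-lisp-mode-syntax-table
    (char-syntax char)))

(my-elisp-syntax-class ?a)   ; ?w, word constituent, matched by \sw
(my-elisp-syntax-class ?-)   ; ?_, symbol constituent, matched by \s_
(my-elisp-syntax-class ?\s)  ; ?\s, whitespace, matched by \s-
```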
* [External] : Regexp for matching (defun lines
2024-08-01 2:08 ` Drew Adams
@ 2024-08-01 2:24 ` Heime
2024-08-01 3:34 ` Drew Adams
0 siblings, 1 reply; 11+ messages in thread
From: Heime @ 2024-08-01 2:24 UTC (permalink / raw)
To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
On Thursday, August 1st, 2024 at 2:08 PM, Drew Adams <drew.adams@oracle.com> wrote:
> > > > > (concat
> > > > > "^\\s-*("
> > > > > (regexp-opt...)
> > > > > t)
> > > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > > >
> > > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> > >
> > > No special advantage. You can include any other
> > > chars you want, so you can pick up, e.g.,
> > >
> > > (defun foo!@$%^&*+={}/:42<>? ()
> > > (message "Hello"))
> > >
> > > Perfectly legitimate, and none of those chars
> > > even require escaping.
> >
> > Meaning that \\sw is superior to [[:alnum:]-_], right ?
>
>
> No.
>
> \\sw means word-char syntax.
> [[:alnum:]-_] means alphanumeric- or symbol-char syntax.
\sw is equivalent to "[:word:]", which includes digits, and [:alnum:]
matches alphabetic and numeric characters. What is the difference?
* RE: [External] : Regexp for matching (defun lines
2024-08-01 2:24 ` Heime
@ 2024-08-01 3:34 ` Drew Adams
2024-08-01 4:15 ` Heime
0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2024-08-01 3:34 UTC (permalink / raw)
To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
> > > > > > (concat
> > > > > > "^\\s-*("
> > > > > > (regexp-opt...)
> > > > > > t)
> > > > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > > > >
> > > > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> > > >
> > > > No special advantage. You can include any other
> > > > chars you want, so you can pick up, e.g.,
> > > >
> > > > (defun foo!@$%^&*+={}/:42<>? ()
> > > > (message "Hello"))
> > > >
> > > > Perfectly legitimate, and none of those chars
> > > > even require escaping.
> > >
> > > Meaning that \\sw is superior to [[:alnum:]-_], right ?
> >
> >
> > No.
> >
> > \\sw means word-char syntax.
> > [[:alnum:]-_] means alphanumeric- or symbol-char syntax.
>
> \sw is equivalent to "[:word:]", that includes digits. And [:alnum:]
> is alphabetic and numeric. What is the difference ?
https://www.gnu.org/software/emacs/manual/html_node/elisp/Char-Classes.html
says:
‘[:alnum:]’
This matches any letter or digit. For multibyte characters, it
matches characters whose Unicode ‘general-category’ property (*note
Character Properties::) indicates they are alphabetic or decimal
number characters.
The same is not said for [:word:].
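A minimal sketch of one practical difference, offered as an illustration rather than as the full answer: `[:word:]' follows the current syntax table, while `[:alnum:]' does not. The names `my-class-matches-p' and `my-underscore-word-table' are hypothetical, and the table that gives `_' word syntax is an assumption made up for the demonstration.

```emacs-lisp
;; Hypothetical helper for illustration only.
(defun my-class-matches-p (class char table)
  "Return t if bracket class CLASS (e.g. \"word\") matches CHAR under TABLE."
  (with-syntax-table table
    (and (string-match-p (format "\\`[[:%s:]]\\'" class) (string char))
         t)))

;; A syntax table, derived from the standard one, in which `_' is a word char.
(defvar my-underscore-word-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?_ "w" table)
    table))

(my-class-matches-p "word" ?_ my-underscore-word-table)   ; t
(my-class-matches-p "alnum" ?_ my-underscore-word-table)  ; nil
(my-class-matches-p "alnum" ?a my-underscore-word-table)  ; t
```

So the two classes can disagree whenever a mode's syntax table gives word syntax to a character that is not a letter or digit.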
* RE: [External] : Regexp for matching (defun lines
2024-08-01 3:34 ` Drew Adams
@ 2024-08-01 4:15 ` Heime
0 siblings, 0 replies; 11+ messages in thread
From: Heime @ 2024-08-01 4:15 UTC (permalink / raw)
To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'
On Thursday, August 1st, 2024 at 3:34 PM, Drew Adams <drew.adams@oracle.com> wrote:
> > > > > > > (concat
> > > > > > > "^\\s-*("
> > > > > > > (regexp-opt...)
> > > > > > > t)
> > > > > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > > > > >
> > > > > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> > > > >
> > > > > No special advantage. You can include any other
> > > > > chars you want, so you can pick up, e.g.,
> > > > >
> > > > > (defun foo!@$%^&*+={}/:42<>? ()
> > > > > (message "Hello"))
> > > > >
> > > > > Perfectly legitimate, and none of those chars
> > > > > even require escaping.
> > > >
> > > > Meaning that \\sw is superior to [[:alnum:]-_], right ?
> > >
> > > No.
> > >
> > > \\sw means word-char syntax.
> > > [[:alnum:]-_] means alphanumeric- or symbol-char syntax.
> >
> > \sw is equivalent to "[:word:]", that includes digits. And [:alnum:]
> > is alphabetic and numeric. What is the difference ?
>
>
> https://www.gnu.org/software/emacs/manual/html_node/elisp/Char-Classes.html
>
> says:
>
> ‘[:alnum:]’
> This matches any letter or digit. For multibyte characters, it
> matches characters whose Unicode ‘general-category’ property (*note
> Character Properties::) indicates they are alphabetic or decimal
> number characters.
> The same is not said for [:word:].
I thought that alphabetic characters and word characters were the same set.
[:word:] also matches accented letters (e.g., in French, Spanish, Icelandic).
It is difficult to know what is actually defined these days.