* Regexp for matching (defun lines
@ 2024-07-31 19:23 Heime
  2024-07-31 19:59 ` [External] : " Drew Adams
  0 siblings, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 19:23 UTC (permalink / raw)
  To: Heime via Users list for the GNU Emacs text editor

I am using the following regexp to match "(defun" lines

("defun" "^\\s-*(defun\\s-+\\([[:alnum:]-]+\\)" 1)

I need regexp suggestions to improve upon it.




* RE: [External] : Regexp for matching (defun lines
  2024-07-31 19:23 Regexp for matching (defun lines Heime
@ 2024-07-31 19:59 ` Drew Adams
  2024-07-31 20:14   ` Heime
  0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2024-07-31 19:59 UTC (permalink / raw)
  To: Heime, 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

> I am using the following regexp to match "(defun" lines
> 
> ("defun" "^\\s-*(defun\\s-+\\([[:alnum:]-]+\\)" 1)
> 
> I need regexp suggestions to improve upon it.

FWIW, in `imenu+.el' I use these regexps for function
and macro definitions:

imenup-lisp-fn-defn-regexp-1:

(concat
 "^\\s-*("
 (regexp-opt
  '("defun" "cl-defun" "defun*" "defsubst" "cl-defsubst"
    "define-inline" "define-advice" "defadvice" "define-skeleton"
    "define-compilation-mode" "define-minor-mode"
    "define-global-minor-mode" "define-globalized-minor-mode"
    "define-derived-mode" "define-generic-mode" "defsetf"
    "define-setf-expander" "define-method-combination"
    "defgeneric" "cl-defgeneric" "defmethod" "cl-defmethod"
    "ert-deftest" "icicle-define-command"
    "icicle-define-file-command")
  t)
 "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")


imenup-lisp-fn-defn-regexp-2 (defs with a quoted name):

(concat "^\\s-*("
        (regexp-opt '("defalias" "fset") t)
        "\\s-+'\\s-*\\(\\(\\sw\\|\\s_\\)+\\)")


imenup-lisp-macro-defn-regexp:

"(\\s-*\\(defmacro\\|cl-defmacro\\|cl-define-compiler-macro\\|\
define-compiler-macro\\|define-modify-macro\\)\\s-+\\([^ \t()]+\\)"


You don't need all of those (e.g., Icicles defs and
old defadvice defs for older Emacs releases).  But
you might want to include some other ways to define
functions and macros.

HTH.
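
A minimal sketch of how a regexp of this shape might be wired into Imenu, assuming a shortened keyword list. The names `my-defn-regexp', `my-defn-name', and `my-setup-imenu' are hypothetical and are not part of `imenu+.el'. Because `regexp-opt' is called with a non-nil second argument, the keyword alternatives are wrapped in a group of their own, so the defined name ends up in group 2 rather than group 1.

```elisp
(require 'imenu)

;; Hypothetical, shortened variant of the regexp quoted above.
(defvar my-defn-regexp
  (concat "^\\s-*("
          (regexp-opt '("defun" "cl-defun" "defsubst" "define-minor-mode") t)
          "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
  "Regexp matching a few function-defining forms; group 2 is the name.")

(defun my-defn-name (line)
  "Return the name defined on LINE, or nil if LINE is not a definition."
  ;; \\sw and \\s_ consult the current syntax table, so use the Emacs Lisp one.
  (with-syntax-table emacs-lisp-mode-syntax-table
    (when (string-match my-defn-regexp line)
      (match-string 2 line))))

(defun my-setup-imenu ()
  "Use `my-defn-regexp' for Imenu in the current buffer."
  (setq-local imenu-generic-expression
              `(("Functions" ,my-defn-regexp 2))))

;; Possible usage (an assumption about your setup, not a requirement):
;; (add-hook 'emacs-lisp-mode-hook #'my-setup-imenu)
```

With these definitions, (my-defn-name "(defun foo-bar (x) x)") returns "foo-bar", and a line such as "(setq x 1)" returns nil.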

* RE: [External] : Regexp for matching (defun lines
  2024-07-31 19:59 ` [External] : " Drew Adams
@ 2024-07-31 20:14   ` Heime
  2024-07-31 21:02     ` Drew Adams
  0 siblings, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 20:14 UTC (permalink / raw)
  To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

On Thursday, August 1st, 2024 at 7:59 AM, Drew Adams <drew.adams@oracle.com> wrote:

> > I am using the following regexp to match "(defun" lines
> > 
> > ("defun" "^\\s-*(defun\\s-+\\([[:alnum:]-]+\\)" 1)
> > 
> > I need regexp suggestions to improve upon it.
> 
> 
> FWIW, in `imenu+.el' I use these regexps for function
> and macro definitions:
> 
> imenup-lisp-fn-defn-regexp-1:
> 
> (concat
> "^\\s-*("
> (regexp-opt
> '("defun" "cl-defun" "defun*" "defsubst" "cl-defsubst"
> "define-inline" "define-advice" "defadvice" "define-skeleton"
> "define-compilation-mode" "define-minor-mode"
> "define-global-minor-mode" "define-globalized-minor-mode"
> "define-derived-mode" "define-generic-mode" "defsetf"
> "define-setf-expander" "define-method-combination"
> "defgeneric" "cl-defgeneric" "defmethod" "cl-defmethod"
> "ert-deftest" "icicle-define-command"
> "icicle-define-file-command")
> t)
> "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")

I see that you use "\\sw".  What is the advantage versus "[[:alnum:]-_]"?
Why do you use the OR "\\|" with "\\s_"?

 
> imenup-lisp-fn-defn-regexp-2 (defs with a quoted name):
> 
> (concat "^\\s-*("
> (regexp-opt '("defalias" "fset") t)
> "\\s-+'\\s-*\\(\\(\\sw\\|\\s_\\)+\\)")
> 
> 
> imenup-lisp-macro-defn-regexp:
> 
> "(\\s-*\\(defmacro\\|cl-defmacro\\|cl-define-compiler-macro\\|\
> define-compiler-macro\\|define-modify-macro\\)\\s-+\\([^ \t()]+\\)"
> 
> 
> You don't need all of those (e.g., Icicles defs and
> old defadvice defs for older Emacs releases). But
> you might want to include some other ways to define
> functions and macros.
> 
> HTH.



* RE: [External] : Regexp for matching (defun lines
  2024-07-31 20:14   ` Heime
@ 2024-07-31 21:02     ` Drew Adams
  2024-07-31 21:15       ` Heime
  2024-07-31 21:29       ` Heime
  0 siblings, 2 replies; 11+ messages in thread
From: Drew Adams @ 2024-07-31 21:02 UTC (permalink / raw)
  To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

> > (concat
> >  "^\\s-*("
> >  (regexp-opt...)
> >  t)
> >  "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> 
> I see that you use "\\sw".  What is the advantage versus "[[:alnum:]-_]"?

No special advantage. You can include any other
chars you want, so you can pick up, e.g.,

(defun foo!@$%^&*+={}/:42<>? ()
  (message "Hello"))

Perfectly legitimate, and none of those chars
even require escaping.

A function name can include ANY chars, including
whitespace and chars that normally have special
meaning for Lisp, but some need to be escaped
in the defun.  If you want to handle such cases,
go for it.

My point was really to point out that there are
many ways to define a function, other than just
`defun'.

> Why do you use the OR "\\|" with "\\s_" ?

Word-syntax chars plus symbol-syntax chars.
But use whatever you like.
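
A small sketch of the practical difference between the two name patterns, assuming the Emacs Lisp syntax table is in effect. The helper `my-match-name' and the sample function name are made up for illustration.

```elisp
(defun my-match-name (name-regexp line)
  "Return the text NAME-REGEXP matches after \"(defun \" on LINE, or nil."
  ;; Syntax-class backslash constructs depend on the current syntax table.
  (with-syntax-table emacs-lisp-mode-syntax-table
    (when (string-match (concat "^\\s-*(defun\\s-+\\(" name-regexp "\\)") line)
      (match-string 1 line))))

;; The bracket expression stops at the first char that is neither
;; alphanumeric nor a hyphen:
(my-match-name "[[:alnum:]-]+" "(defun my-fn/helper! ()")
;; => "my-fn"

;; Word-syntax plus symbol-syntax chars cover the whole symbol:
(my-match-name "\\(?:\\sw\\|\\s_\\)+" "(defun my-fn/helper! ()")
;; => "my-fn/helper!"
```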

* RE: [External] : Regexp for matching (defun lines
  2024-07-31 21:02     ` Drew Adams
@ 2024-07-31 21:15       ` Heime
  2024-08-01  2:11         ` Drew Adams
  2024-07-31 21:29       ` Heime
  1 sibling, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 21:15 UTC (permalink / raw)
  To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

On Thursday, August 1st, 2024 at 9:02 AM, Drew Adams <drew.adams@oracle.com> wrote:

> > > (concat
> > > "^\\s-*("
> > > (regexp-opt...)
> > > t)
> > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > 
> > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> 
> 
> No special advantage. You can include any other
> chars you want, so you can pick up, e.g.,
> 
> (defun foo!@$%^&*+={}/:42<>? ()
>   (message "Hello"))
> 
> Perfectly legitimate, and none of those chars
> even require escaping.
> 
> A function name can include ANY chars, including
> whitespace and chars that normally have special
> meaning for Lisp, but some need to be escaped
> in the defun. If you want to handle such cases,
> go for it.
> 
> My point was really to point out that there are
> many ways to define a function, other than just
> `defun'.
> 
> > Why do you use the OR "\\|" with "\\s_" ?
> 
> 
> Word-syntax chars plus symbol-syntax chars.
> But use whatever you like.

Looked into the Elisp Ref Manual and could not find a description of \s_



* RE: [External] : Regexp for matching (defun lines
  2024-07-31 21:02     ` Drew Adams
  2024-07-31 21:15       ` Heime
@ 2024-07-31 21:29       ` Heime
  2024-08-01  2:08         ` Drew Adams
  1 sibling, 1 reply; 11+ messages in thread
From: Heime @ 2024-07-31 21:29 UTC (permalink / raw)
  To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

On Thursday, August 1st, 2024 at 9:02 AM, Drew Adams <drew.adams@oracle.com> wrote:

> > > (concat
> > > "^\\s-*("
> > > (regexp-opt...)
> > > t)
> > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > 
> > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> 
> 
> No special advantage. You can include any other
> chars you want, so you can pick up, e.g.,
> 
> (defun foo!@$%^&*+={}/:42<>? ()
>   (message "Hello"))
>
> Perfectly legitimate, and none of those chars
> even require escaping.

Meaning that \\sw is superior to [[:alnum:]-_], right ?
 
> A function name can include ANY chars, including
> whitespace and chars that normally have special
> meaning for Lisp, but some need to be escaped
> in the defun. If you want to handle such cases,
> go for it.
> 
> My point was really to point out that there are
> many ways to define a function, other than just
> `defun'.
> 
> > Why do you use the OR "\\|" with "\\s_" ?
> 
> 
> Word-syntax chars plus symbol-syntax chars.
> But use whatever you like.





* RE: [External] : Regexp for matching (defun lines
  2024-07-31 21:29       ` Heime
@ 2024-08-01  2:08         ` Drew Adams
  2024-08-01  2:24           ` Heime
  0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2024-08-01  2:08 UTC (permalink / raw)
  To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

> > > > (concat
> > > > "^\\s-*("
> > > > (regexp-opt...)
> > > > t)
> > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > >
> > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> >
> > No special advantage. You can include any other
> > chars you want, so you can pick up, e.g.,
> >
> > (defun foo!@$%^&*+={}/:42<>? ()
> >   (message "Hello"))
> >
> > Perfectly legitimate, and none of those chars
> > even require escaping.
> 
> Meaning that \\sw is superior to [[:alnum:]-_], right ?

No.

\\sw means word-char syntax.
[[:alnum:]-_] means alphanumeric- or symbol-char syntax.
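
One way to see the distinction, as a sketch: `\\sw' consults the current syntax table, whereas `[:alnum:]' depends only on the character itself. The function `my-word-vs-alnum' is hypothetical, and giving underscore word syntax is an assumption made purely for illustration.

```elisp
(defun my-word-vs-alnum ()
  "Match \"_\" against \\sw and [[:alnum:]] in a table where _ is a word char."
  (let ((table (make-syntax-table)))
    ;; Illustration only: give underscore word syntax in this table.
    (modify-syntax-entry ?_ "w" table)
    (with-syntax-table table
      (list (string-match-p "\\sw" "_")
            (string-match-p "[[:alnum:]]" "_")))))

(my-word-vs-alnum)
;; => (0 nil)
```

The first element is a match position because the syntax table says underscore is a word constituent; the second is nil because underscore is neither a letter nor a digit, whatever the syntax table says.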

* RE: [External] : Regexp for matching (defun lines
  2024-07-31 21:15       ` Heime
@ 2024-08-01  2:11         ` Drew Adams
  0 siblings, 0 replies; 11+ messages in thread
From: Drew Adams @ 2024-08-01  2:11 UTC (permalink / raw)
  To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

> Looked into the Elisp Ref Manual and could not find a description of \s_

https://www.gnu.org/software/emacs/manual/html_node/elisp/Syntax-Descriptors.html

for \s syntax (syntax descriptors).

https://www.gnu.org/software/emacs/manual/html_node/elisp/Syntax-Class-Table.html

for _ representing symbol constituents.
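
As a rough sketch, the syntax class that the `\s' constructs test for can be inspected with `char-syntax'. The helper `my-syntax-class' is a made-up name, and the results below assume the stock Emacs Lisp syntax table.

```elisp
(defun my-syntax-class (char)
  "Return CHAR's syntax class designator in Emacs Lisp buffers, as a string."
  (with-syntax-table emacs-lisp-mode-syntax-table
    (string (char-syntax char))))

(my-syntax-class ?a)   ;; => "w"  (word constituent, matched by \sw)
(my-syntax-class ?-)   ;; => "_"  (symbol constituent, matched by \s_)
(my-syntax-class ?!)   ;; => "_"
(my-syntax-class ?\()  ;; => "("  (open parenthesis class)
```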


* [External] : Regexp for matching (defun lines
  2024-08-01  2:08         ` Drew Adams
@ 2024-08-01  2:24           ` Heime
  2024-08-01  3:34             ` Drew Adams
  0 siblings, 1 reply; 11+ messages in thread
From: Heime @ 2024-08-01  2:24 UTC (permalink / raw)
  To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'






On Thursday, August 1st, 2024 at 2:08 PM, Drew Adams <drew.adams@oracle.com> wrote:

> > > > > (concat
> > > > > "^\\s-*("
> > > > > (regexp-opt...)
> > > > > t)
> > > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > > > 
> > > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> > > 
> > > No special advantage. You can include any other
> > > chars you want, so you can pick up, e.g.,
> > > 
> > > (defun foo!@$%^&*+={}/:42<>? ()
> > > (message "Hello"))
> > > 
> > > Perfectly legitimate, and none of those chars
> > > even require escaping.
> > 
> > Meaning that \\sw is superior to [[:alnum:]-_], right ?
> 
> 
> No.
> 
> \\sw means word-char syntax.
> [[:alnum:]-_] means alphanumeric- or symbol-char syntax.

\sw is equivalent to "[:word:]", that includes digits.  And [:alnum:] 
is alphabetic and numeric.  What is the difference ?






* RE: [External] : Regexp for matching (defun lines
  2024-08-01  2:24           ` Heime
@ 2024-08-01  3:34             ` Drew Adams
  2024-08-01  4:15               ` Heime
  0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2024-08-01  3:34 UTC (permalink / raw)
  To: Heime; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'

> > > > > > (concat
> > > > > > "^\\s-*("
> > > > > > (regexp-opt...)
> > > > > > t)
> > > > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > > > >
> > > > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> > > >
> > > > No special advantage. You can include any other
> > > > chars you want, so you can pick up, e.g.,
> > > >
> > > > (defun foo!@$%^&*+={}/:42<>? ()
> > > > (message "Hello"))
> > > >
> > > > Perfectly legitimate, and none of those chars
> > > > even require escaping.
> > >
> > > Meaning that \\sw is superior to [[:alnum:]-_], right ?
> >
> >
> > No.
> >
> > \\sw means word-char syntax.
> > [[:alnum:]-_] means alphanumeric- or symbol-char syntax.
> 
> \sw is equivalent to "[:word:]", that includes digits.  And [:alnum:]
> is alphabetic and numeric.  What is the difference ?

https://www.gnu.org/software/emacs/manual/html_node/elisp/Char-Classes.html

says:

‘[:alnum:]’
     This matches any letter or digit.  For multibyte characters, it
     matches characters whose Unicode ‘general-category’ property (*note
     Character Properties::) indicates they are alphabetic or decimal
     number characters.

The same is not said for [:word:].

* RE: [External] : Regexp for matching (defun lines
  2024-08-01  3:34             ` Drew Adams
@ 2024-08-01  4:15               ` Heime
  0 siblings, 0 replies; 11+ messages in thread
From: Heime @ 2024-08-01  4:15 UTC (permalink / raw)
  To: Drew Adams; +Cc: 'Help-Gnu-Emacs (help-gnu-emacs@gnu.org)'






On Thursday, August 1st, 2024 at 3:34 PM, Drew Adams <drew.adams@oracle.com> wrote:

> > > > > > > (concat
> > > > > > > "^\\s-*("
> > > > > > > (regexp-opt...)
> > > > > > > t)
> > > > > > > "\\s-+\\(\\(\\sw\\|\\s_\\)+\\)")
> > > > > > 
> > > > > > I see that you use "\\sw". What is the advantage versus "[[:alnum:]-_]"?
> > > > > 
> > > > > No special advantage. You can include any other
> > > > > chars you want, so you can pick up, e.g.,
> > > > > 
> > > > > (defun foo!@$%^&*+={}/:42<>? ()
> > > > > (message "Hello"))
> > > > > 
> > > > > Perfectly legitimate, and none of those chars
> > > > > even require escaping.
> > > > 
> > > > Meaning that \\sw is superior to [[:alnum:]-_], right ?
> > > 
> > > No.
> > > 
> > > \\sw means word-char syntax.
> > > [[:alnum:]-_] means alphanumeric- or symbol-char syntax.
> > 
> > \sw is equivalent to "[:word:]", that includes digits. And [:alnum:]
> > is alphabetic and numeric. What is the difference ?
> 
> 
> https://www.gnu.org/software/emacs/manual/html_node/elisp/Char-Classes.html
> 
> says:
> 
> ‘[:alnum:]’
> This matches any letter or digit. For multibyte characters, it
> matches characters whose Unicode ‘general-category’ property (*note
> Character Properties::) indicates they are alphabetic or decimal
> number characters.
> 
> The same is not said for [:word:].

I thought that alphabetic characters and word characters were the same set.
[:word:] also matches accented letters (e.g., in French, Spanish, Icelandic).
It is difficult to know what is actually defined these days.
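
For what it is worth, a quick check along these lines can show what each class matches for a given character. This is only a sketch: `my-class-matches' is a hypothetical helper, and it assumes the standard syntax table, since `[:word:]' follows whatever syntax table is current.

```elisp
(defun my-class-matches (char)
  "Return (ALNUM WORD): whether CHAR matches [[:alnum:]] and [[:word:]]."
  (with-syntax-table (standard-syntax-table)
    (let ((s (string char)))
      (list (and (string-match-p "[[:alnum:]]" s) t)
            (and (string-match-p "[[:word:]]" s) t)))))

(my-class-matches ?\u00e9)  ;; e with acute accent => (t t)
(my-class-matches ?7)       ;; => (t t)
(my-class-matches ?-)       ;; => (nil nil)
```

So for ordinary letters and digits the two classes tend to agree under the standard syntax table; they can diverge once a mode's syntax table assigns word syntax differently.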




