* bug#74861: 31.0.50; etags no longer processing shy groups as expected in --regex options
From: dvilleneuve.4142 @ 2024-12-13 16:06 UTC (permalink / raw)
To: 74861
Hi,
On a C file (foo.c) with the following line:
DEFINE_FF(TAG func)
the etags command used to tag "func" (version 26.1):
etags --regex='/DEFINE_FF *(\(?:TAG \)? *\([^)]+\)/\1/' \
--output ETAGS-test foo.c
produces
$ od -c ETAGS-test
0000000  \f  \n   f   o   o   .   c   ,   2   8  \n   D   E   F   I   N
0000020   E   _   F   F   (   T   A   G       f   u   n   c 177   f   u
0000040   n   c 001   1   ,   0  \n
so "func" between \x7f and \x01.
With versions 27.2 and later, we get:
$ od -c ETAGS-test
0000000  \f  \n   f   o   o   .   c   ,   0  \n
0000012
so no match, and replacing \1 by \2 in the --regex argument
(admittedly not well defined since there is a single registered group),
we get:
$ od -c ETAGS-test
0000000  \f  \n   f   o   o   .   c   ,   3   2  \n   D   E   F   I   N
0000020   E   _   F   F   (   T   A   G       f   u   n   c 177   T   A
0000040   G       f   u   n   c 001   1   ,   0  \n
so "TAG func" between \x7f and \x01.
In both cases the result is different from the one from 26.1,
which I would think is the expected one.
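For anyone who wants to compare the two results without reading octal dumps, here is a minimal sketch that rebuilds both tags files byte for byte from the od output above and prints the explicit tag name, i.e. the text between \x7f and \x01. The temporary file names and the tag_names helper are made up for this illustration; nothing here runs etags itself.

```shell
#!/bin/sh
# Rebuild the two tags files from the od dumps and print the explicit tag
# name of each entry (the text between DEL, \177, and SOH, \001).
# The file names and the tag_names helper are hypothetical.

tag_names() {
    # Split each line on DEL; the tag name is the part of field 2 before SOH.
    awk -F '\177' 'NF > 1 { split($2, parts, "\001"); print parts[1] }' "$1"
}

workdir=$(mktemp -d)

# etags 26.1 with \1: section size 28, tag name "func".
printf '\f\nfoo.c,28\nDEFINE_FF(TAG func\177func\001%s\n' '1,0' \
    > "$workdir/tags-26"

# etags 27.2 and later with \2: section size 32, tag name "TAG func".
printf '\f\nfoo.c,32\nDEFINE_FF(TAG func\177TAG func\001%s\n' '1,0' \
    > "$workdir/tags-27"

old_tag=$(tag_names "$workdir/tags-26")
new_tag=$(tag_names "$workdir/tags-27")
printf '26.1:  %s\n27.2+: %s\n' "$old_tag" "$new_tag"

rm -rf "$workdir"
```

The same awk one-liner can be pointed at a real ETAGS-test file to check which name a given --regex produced.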
Using
etags --regex='/DEFINE_FF *(\(TAG \)? *\([^)]+\)/\2/' \
--output ETAGS-test foo.c
(that is, without a shy group) works as expected, in 26.1 and later.
In GNU Emacs 31.0.50 (build 1, x86_64-pc-linux-gnu, X toolkit, Xaw
scroll bars) of 2024-12-12 built on tarmac
Repository revision: 020128e9dc31fb3b06c39614b7eb20ddb5b3725a
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12302007
System Description: Red Hat Enterprise Linux 9.5 (Plow)
Configured using:
'configure --prefix=/home/daniel/tmplocal --with-gif=ifavailable'
Configured features:
FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG LIBSELINUX LIBXML2
MODULES NOTIFY INOTIFY PDUMPER PNG SECCOMP SOUND THREADS TIFF
TOOLKIT_SCROLL_BARS X11 XDBE XFT XIM XPM LUCID ZLIB
Important settings:
value of $LC_COLLATE: C.UTF-8
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
global-subword-mode: t
subword-mode: t
display-time-mode: t
delete-selection-mode: t
server-mode: t
gcm-grep-minor-mode: t
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
horizontal-scroll-bar-mode: t
global-font-lock-mode: t
font-lock-mode: t
minibuffer-regexp-mode: t
line-number-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message yank-media puny dired
dired-loaddefs rfc822 mml mml-sec epa derived epg rfc6068 epg-config
gnus-util text-property-search mm-decode mm-bodies mm-encode mail-parse
rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045
ietf-drums mm-util mail-prsvr mail-utils apropos thingatpt misearch
multi-isearch mule-util info time-date jka-compr cap-words superword
subword time delsel server rng-nxml rng-valid rng-loc rng-uri rng-parse
nxml-parse rng-match rng-dt rng-util rng-pttrn nxml-ns nxml-mode
nxml-outln nxml-rap sgml-mode facemenu dom nxml-util nxml-enc xmltok
elisp/init rainbow-delimiters mmm/init mmm-mode mmm-univ mmm-class
mmm-region mmm-auto mmm-vars mmm-utils mmm-compat elisp/menu help-fns
radix-tree help-mode elisp/tabbar elisp/menu-theme elisp/gcm-c-mode
elisp/gcm-temp elisp/gcm-comm cc-mode cc-fonts cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-vars cc-defs sh-script rx smie treesit
executable imenu cus-edit pp cus-load wid-edit advice elisp/gcm-grep
easy-mmode etags fileloop generator xref project ring finder-inf package
browse-url url url-proxy url-privacy url-expand url-methods url-history
url-cookie generate-lisp-file url-domsuf url-util mailcap url-handlers
url-parse auth-source cl-seq eieio eieio-core cl-macs icons
password-cache json subr-x map byte-opt gv bytecomp byte-compile
url-vars cl-loaddefs cl-lib rmc iso-transl tooltip cconv eldoc paren
electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/x-win x-win term/common-win x-dnd touch-screen tool-bar dnd fontset
image regexp-opt fringe tabulated-list replace newcomment text-mode
lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch
easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads inotify dynamic-setting
system-font-setting font-render-setting x-toolkit x multi-tty
move-toolbar make-network-process emacs)
Memory information:
((conses 16 171213 17314) (symbols 48 16296 0) (strings 32 50080 4604) (string-bytes 1 1436061) (vectors 16 26391)
(vector-slots 8 263425 5024) (floats 8 61 10) (intervals 56 712 29) (buffers 984 11))
From: Eli Zaretskii @ 2024-12-14 8:33 UTC (permalink / raw)
To: dvilleneuve.4142, Paul Eggert; +Cc: 74861
> Date: Fri, 13 Dec 2024 11:06:30 -0500
> From: dvilleneuve.4142@gmail.com
>
> On a C file (foo.c) with the following line:
>
> DEFINE_FF(TAG func)
>
> the etags command used to tag "func" (version 26.1):
>
> etags --regex='/DEFINE_FF *(\(?:TAG \)? *\([^)]+\)/\1/' \
> --output ETAGS-test foo.c
>
> produces
>
> $ od -c ETAGS-test
> 0000000  \f  \n   f   o   o   .   c   ,   2   8  \n   D   E   F   I   N
> 0000020   E   _   F   F   (   T   A   G       f   u   n   c 177   f   u
> 0000040   n   c 001   1   ,   0  \n
>
> so "func" between \x7f and \x01.
>
> With versions 27.2 and later, we get:
> $ od -c ETAGS-test
> 0000000  \f  \n   f   o   o   .   c   ,   0  \n
> 0000012
>
> so no match, and replacing \1 by \2 in the --regex argument
> (admittedly not well defined since there is a single registered group),
> we get:
>
> $ od -c ETAGS-test
> 0000000  \f  \n   f   o   o   .   c   ,   3   2  \n   D   E   F   I   N
> 0000020   E   _   F   F   (   T   A   G       f   u   n   c 177   T   A
> 0000040   G       f   u   n   c 001   1   ,   0  \n
>
> so "TAG func" between \x7f and \x01.
>
> In both cases the result is different from the one from 26.1,
> which I would think is the expected one.
>
> Using
>
> etags --regex='/DEFINE_FF *(\(TAG \)? *\([^)]+\)/\2/' \
> --output ETAGS-test foo.c
>
> (that is, without a shy group) works as expected, in 26.1 and later.
I'm guessing this is because Emacs 27 switched to Gnulib's regex
implementation in etags and other lib-src programs, whereas previous
versions used Emacs's own regex code (which is still used for
Emacs's own regex search and replacement code).
CC'ing Paul Eggert, in the hope he can tell whether this is expected
or not, or how to fix it.
From: Paul Eggert @ 2024-12-14 18:39 UTC (permalink / raw)
To: Eli Zaretskii, dvilleneuve.4142; +Cc: 74861
On 12/14/24 01:33, Eli Zaretskii wrote:
> I'm guessing this is because Emacs 27 switched to Gnulib's regex
> implementation in etags and other lib-src programs, whereas previous
> versions used Emacs's own regex code (which is still used for
> Emacs's own regex search and replacement code).
>
> CC'ing Paul Eggert, in the hope he can tell whether this is expected
> or not, or how to fix it.
Yes, that's expected, as glibc/Gnulib regex doesn't support shy groups. A
workaround is to avoid shy groups, e.g.:
etags --regex='/DEFINE_FF *(\(TAG \)? *\([^)]+\)/\2/' \
--output ETAGS-test foo.c
This should be portable between both older and newer etags.
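As a rough illustration of why \1 came out empty and \2 held "TAG func", here is a sketch using sed, whose POSIX basic regex syntax also has no shy groups and treats a bare "?" as an ordinary character. This is only an analogy: sed's syntax differs from the Emacs-style syntax etags accepts (the intervals \{0,1\} and \{1,\} stand in for "?" and "+"), and it is an assumption that etags reads the pattern the same way, although the results reported above are consistent with that reading.

```shell
#!/bin/sh
# Analogy in sed's POSIX basic regex, where "?" is an ordinary character.
# \{0,1\} and \{1,\} replace the "?" and "+" operators of the etags pattern.
line='DEFINE_FF(TAG func)'

# Here "\(?:TAG \)" is a normal capturing group that wants a literal
# "?:TAG ". It matches nothing in this line, so group 1 stays empty and
# group 2 takes "TAG func".
shy=$(printf '%s\n' "$line" |
    sed -n 's/DEFINE_FF *(\(?:TAG \)\{0,1\} *\([^)]\{1,\}\).*/[\1] [\2]/p')

# The workaround: a plain group, so the wanted name lands in group 2.
plain=$(printf '%s\n' "$line" |
    sed -n 's/DEFINE_FF *(\(TAG \)\{0,1\} *\([^)]\{1,\}\).*/[\1] [\2]/p')

printf 'shy-group pattern:   %s\nplain-group pattern: %s\n' "$shy" "$plain"
```

With the shy-group spelling the optional group silently becomes capturing group 1, which shifts the wanted name to group 2 and lets it swallow the "TAG " prefix.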