* bug#73771: 30.0.91; etags generates broken TAGS file for multi-line regex match
From: Morgan Willcock @ 2024-10-12 14:39 UTC
To: 73771

It appears as though using the multi-line regex matching feature of
etags generates a TAGS file which accidentally contains additional
newline characters.

To test, create a source file for a test language which has some
variable definitions to match, where variables are defined using a
keyword "define" followed by a variable name that begins with "$":

  ## Create a test source file.
  > /tmp/source printf 'Top of file\n\n'
  ## Write a multi-line variable definition that ends at a newline.
  >>/tmp/source printf 'define\n$a\n'
  ## Write a multi-line variable definition that doesn't end at a newline.
  >>/tmp/source printf 'define\n$b;\n'

Now create a TAGS file based on multi-line matching the variable
definitions:

  etags --lang=none --regex='/define[ \t\n]+\(\$[a-z]+\)/\1/m' -o /tmp/TAGS /tmp/source

Note that the capture group does not include any newline characters, but
the contents of the TAGS file seem to have an additional newline
character inserted in the line which locates $a:

  cat /tmp/TAGS

  /tmp/source,24
  $a
  $a4,20
  $b;$b6,30

Now try to locate $a using the TAGS file, which fails with the message
"xref--not-found-error: No apropos found for: a":

  emacs -Q \
    --eval "(find-file \"/tmp/source\")" \
    --eval "(visit-tags-table \"/tmp/TAGS\")" \
    --eval "(xref-find-apropos \"a\")"

The problem only seems to occur where the multi-line regex match
finishes immediately before the line ending, i.e. there are no issues
locating $b:

  emacs -Q \
    --eval "(find-file \"/tmp/source\")" \
    --eval "(visit-tags-table \"/tmp/TAGS\")" \
    --eval "(xref-find-apropos \"b\")"

In GNU Emacs 30.0.91 (build 2, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.16.0, Xaw3d scroll bars) of 2024-09-12 built on inspiron
Windowing system distributor 'The X.Org Foundation', version 11.0.12101007
System Description: Debian GNU/Linux 12 (bookworm)

Configured using:
 'configure --with-native-compilation=aot --with-xml2
 --with-x-toolkit=lucid'

Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG
LIBSELINUX LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG
SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS WEBP X11 XAW3D
XDBE XIM XINPUT2 XPM LUCID ZLIB

Important settings:
  value of $LANG: en_GB.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: Shell-script

Minor modes in effect:
  server-mode: t
  global-corfu-mode: t
  corfu-mode: t
  jabber-activity-mode: t
  which-key-mode: t
  global-devil-mode: t
  devil-mode: t
  erc-ring-mode: t
  erc-netsplit-mode: t
  erc-menu-mode: t
  erc-list-mode: t
  erc-imenu-mode: t
  erc-pcomplete-mode: t
  erc-button-mode: t
  erc-fill-mode: t
  erc-stamp-mode: t
  erc-irccontrols-mode: t
  erc-move-to-prompt-mode: t
  erc-readonly-mode: t
  erc-scrolltobottom-mode: t
  erc-spelling-mode: t
  erc-track-mode: t
  erc-track-minor-mode: t
  erc-match-mode: t
  erc-autojoin-mode: t
  erc-networks-mode: t
  sh-electric-here-document-mode: t
  savehist-mode: t
  minibuffer-electric-default-mode: t
  minibuffer-depth-indicate-mode: t
  ido-everywhere: t
  recentf-mode: t
  global-display-fill-column-indicator-mode: t
  display-fill-column-indicator-mode: t
  global-hl-line-mode: t
  display-time-mode: t
  flyspell-mode: t
  editorconfig-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  minibuffer-regexp-mode: t

* bug#73771: 30.0.91; etags generates broken TAGS file for multi-line regex match
From: Eli Zaretskii @ 2024-10-12 16:22 UTC
To: Morgan Willcock, Francesco Potortì; +Cc: 73771

> From: Morgan Willcock <morgan@ice9.digital>
> Date: Sat, 12 Oct 2024 15:39:59 +0100
>
> It appears as though using the multi-line regex matching feature of
> etags generates a TAGS file which accidentally contains additional
> newline characters.
>
> To test, create a source file for a test language which has some
> variable definitions to match, where variables are defined using a
> keyword "define" followed by a variable name that begins with "$":
>
>   ## Create a test source file.
>   > /tmp/source printf 'Top of file\n\n'
>   ## Write a multi-line variable definition that ends at a newline.
>   >>/tmp/source printf 'define\n$a\n'
>   ## Write a multi-line variable definition that doesn't end at a newline.
>   >>/tmp/source printf 'define\n$b;\n'
>
> Now create a TAGS file based on multi-line matching the variable
> definitions:
>
>   etags --lang=none --regex='/define[ \t\n]+\(\$[a-z]+\)/\1/m' -o /tmp/TAGS /tmp/source
>
> Note that the capture group does not include any newline characters, but
> the contents of the TAGS file seem to have an additional newline
> character inserted in the line which locates $a:
>
>   cat /tmp/TAGS
>
>   /tmp/source,24
>   $a
>   $a4,20
>   $b;$b6,30

This is because etags always records one extra character with the
regexp match, which is harmless, unless that extra character is a
newline.  The patch below fixes it:

diff --git a/lib-src/etags.c b/lib-src/etags.c
index a822a82..848d8ea 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -7420,7 +7420,7 @@ regex_tag_multiline (void)

           /* Force explicit tag name, if a name is there. */
           pfnote (name, true, buffer + linecharno,
-                  charno - linecharno + 1, lineno, linecharno);
+                  charno - linecharno, lineno, linecharno);

           if (debug)
             fprintf (stderr, "%s on %s:%"PRIdMAX": %s\n",

Francesco, why does the code add one more character there?  It looks
to me like an off-by-one error, because "charno - linecharno + 1" is
interpreted by pfnote as the length of the portion of the line to
record for the regexp match.  This code has been there since you first
introduced multi-line regexps back in 2002.  Am I missing something
here?

(Removing the "+1" part will require updating the expected results in
the test suite, as they currently include that extra character, but
that is okay.)
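
Editorial sketch of the off-by-one described above, not taken from etags.c: `copy_pattern`, `pattern_has_newline` and `sample_source` are hypothetical names for this illustration. The assumption is only what the message states, namely that pfnote treats its fourth argument as the number of bytes of the current line to record. `sample_source` is the test file from the report held in one buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the part of pfnote that records LEN bytes
   of the current line as the tag's pattern.  Not the etags source.  */
static size_t
copy_pattern (char *dst, size_t dstsize, const char *line_start, size_t len)
{
  if (dstsize == 0)
    return 0;
  if (len >= dstsize)
    len = dstsize - 1;          /* truncate rather than overflow */
  memcpy (dst, line_start, len);
  dst[len] = '\0';
  return len;
}

/* Return 1 if PATTERN contains a newline, which would split the tag
   entry across two lines of the TAGS file.  */
static int
pattern_has_newline (const char *pattern)
{
  return strchr (pattern, '\n') != NULL;
}

/* The test file from the report: "$a" starts at offset 20 and its
   match ends at offset 22; "$b" starts at 30 and its match ends at 32.  */
static const char sample_source[] =
  "Top of file\n\ndefine\n$a\ndefine\n$b;\n";
```

With `linecharno = 20` and `charno = 22`, a length of `charno - linecharno + 1` copies `$a` plus the newline that follows it, while `charno - linecharno` copies just `$a`. For `$b` the extra byte is the `;`, which is why that entry stayed on one line.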

* bug#73771: 30.0.91; etags generates broken TAGS file for multi-line regex match
From: Francesco Potortì @ 2024-10-13 10:48 UTC
To: Eli Zaretskii; +Cc: 73771, Morgan Willcock

>This is because etags always records one extra character with the
>regexp match, which is harmless, unless that extra character is a
>newline.  The patch below fixes it:
>
>diff --git a/lib-src/etags.c b/lib-src/etags.c
>index a822a82..848d8ea 100644
>--- a/lib-src/etags.c
>+++ b/lib-src/etags.c
>@@ -7420,7 +7420,7 @@ regex_tag_multiline (void)
>
>           /* Force explicit tag name, if a name is there. */
>           pfnote (name, true, buffer + linecharno,
>-                  charno - linecharno + 1, lineno, linecharno);
>+                  charno - linecharno, lineno, linecharno);
>
>           if (debug)
>             fprintf (stderr, "%s on %s:%"PRIdMAX": %s\n",
>
>Francesco, why does the code add one more character there?

While I can't remember the reason, I am sure it was not done by chance.

Sure, I may have been wrong at that time, or maybe that reason is now
obsolete.  To check, one should run the etags regression test and check
the results.

Off the top of my head, it may have to do with implicitly named tags
that are an initial substring of a different tag, but this may be a
fake reconstruction of my mind.

* bug#73771: 30.0.91; etags generates broken TAGS file for multi-line regex match
From: Eli Zaretskii @ 2024-10-13 12:58 UTC
To: Francesco Potortì; +Cc: 73771, morgan

> From: Francesco Potortì <pot@gnu.org>
> Date: Sun, 13 Oct 2024 12:48:40 +0200
> Cc: 73771@debbugs.gnu.org,
> 	Morgan Willcock <morgan@ice9.digital>
>
> >This is because etags always records one extra character with the
> >regexp match, which is harmless, unless that extra character is a
> >newline.  The patch below fixes it:
> >
> >diff --git a/lib-src/etags.c b/lib-src/etags.c
> >index a822a82..848d8ea 100644
> >--- a/lib-src/etags.c
> >+++ b/lib-src/etags.c
> >@@ -7420,7 +7420,7 @@ regex_tag_multiline (void)
> >
> >           /* Force explicit tag name, if a name is there. */
> >           pfnote (name, true, buffer + linecharno,
> >-                  charno - linecharno + 1, lineno, linecharno);
> >+                  charno - linecharno, lineno, linecharno);
> >
> >           if (debug)
> >             fprintf (stderr, "%s on %s:%"PRIdMAX": %s\n",
> >
> >Francesco, why does the code add one more character there?
>
> While I can't remember the reason, I am sure it was not done by chance.
>
> Sure, I may have been wrong at that time, or maybe that reason is now
> obsolete.  To check, one should run the etags regression test and check
> the results.

I already did.  The only differences I see are in that 1 extra
character, which is now no longer written to TAGS in the tests which
use multi-line regexps.

Note that the change is only in this single call to pfnote, from the
code that is part of generating tags from multi-line regexp matches.

> Off the top of my head, it may have to do with implicitly named tags
> that are an initial substring of a different tag, but this may be a
> fake reconstruction of my mind.

The tags we produce from regexp matches are normally explicitly named,
AFAICT:

          /* Match occurred.  Construct a tag. */
          while (charno < rp->regs.end[0])
            if (buffer[charno++] == '\n')
              lineno++, linecharno = charno;
          name = rp->name;
          if (name[0] == '\0')
            name = NULL;
          else /* make a named tag */
            name = substitute (buffer, rp->name, &rp->regs);

          /* Force explicit tag name, if a name is there. */
          pfnote (name, true, buffer + linecharno,
                  charno - linecharno, lineno, linecharno);

But if you say that we should add that offset of 1 when 'name' is
NULL, I'm okay with doing that.
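
Editorial sketch of the line bookkeeping done by the loop quoted above, written as a standalone helper. `struct scan_pos` and `advance_to_match_end` are hypothetical names for this illustration; the quoted etags code keeps `charno`, `lineno` and `linecharno` as separate variables. The sketch assumes the counters start at offset 0 on line 1 and that the match end offset is exclusive.

```c
#include <stddef.h>

/* Hypothetical grouping of the three counters used by the quoted
   loop.  Not a type from etags.c.  */
struct scan_pos
{
  size_t charno;      /* offset of the next unscanned byte */
  size_t lineno;      /* 1-based number of the current line */
  size_t linecharno;  /* offset at which the current line starts */
};

/* Advance POS to MATCH_END (the exclusive end offset of a match),
   updating the line number and line start at every newline passed.  */
static void
advance_to_match_end (struct scan_pos *pos, const char *buffer,
                      size_t match_end)
{
  while (pos->charno < match_end)
    if (buffer[pos->charno++] == '\n')
      {
        pos->lineno++;
        pos->linecharno = pos->charno;
      }
}
```

Run over the test file from the report, with the `$a` match ending at offset 22 and the `$b` match ending at offset 32, this yields line 4 at offset 20 and line 6 at offset 30, which agrees with the `4,20` and `6,30` fields in the reported TAGS file. In both cases `charno - linecharno` is 2, exactly the length of the variable name, so the patched call records no trailing character.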

* bug#73771: 30.0.91; etags generates broken TAGS file for multi-line regex match
From: Eli Zaretskii @ 2024-10-19 8:20 UTC
To: pot; +Cc: 73771-done, morgan

> Cc: 73771@debbugs.gnu.org, morgan@ice9.digital
> Date: Sun, 13 Oct 2024 15:58:44 +0300
> From: Eli Zaretskii <eliz@gnu.org>
>
> > From: Francesco Potortì <pot@gnu.org>
> > Date: Sun, 13 Oct 2024 12:48:40 +0200
> > Cc: 73771@debbugs.gnu.org,
> > 	Morgan Willcock <morgan@ice9.digital>
> >
> > >This is because etags always records one extra character with the
> > >regexp match, which is harmless, unless that extra character is a
> > >newline.  The patch below fixes it:
> > >
> > >diff --git a/lib-src/etags.c b/lib-src/etags.c
> > >index a822a82..848d8ea 100644
> > >--- a/lib-src/etags.c
> > >+++ b/lib-src/etags.c
> > >@@ -7420,7 +7420,7 @@ regex_tag_multiline (void)
> > >
> > >           /* Force explicit tag name, if a name is there. */
> > >           pfnote (name, true, buffer + linecharno,
> > >-                  charno - linecharno + 1, lineno, linecharno);
> > >+                  charno - linecharno, lineno, linecharno);
> > >
> > >           if (debug)
> > >             fprintf (stderr, "%s on %s:%"PRIdMAX": %s\n",
> > >
> > >Francesco, why does the code add one more character there?
> >
> > While I can't remember the reason, I am sure it was not done by chance.
> >
> > Sure, I may have been wrong at that time, or maybe that reason is now
> > obsolete.  To check, one should run the etags regression test and check
> > the results.
>
> I already did.  The only differences I see are in that 1 extra
> character, which is now no longer written to TAGS in the tests which
> use multi-line regexps.
>
> Note that the change is only in this single call to pfnote, from the
> code that is part of generating tags from multi-line regexp matches.
>
> > Off the top of my head, it may have to do with implicitly named tags
> > that are an initial substring of a different tag, but this may be a
> > fake reconstruction of my mind.
>
> The tags we produce from regexp matches are normally explicitly named,
> AFAICT:
>
>           /* Match occurred.  Construct a tag. */
>           while (charno < rp->regs.end[0])
>             if (buffer[charno++] == '\n')
>               lineno++, linecharno = charno;
>           name = rp->name;
>           if (name[0] == '\0')
>             name = NULL;
>           else /* make a named tag */
>             name = substitute (buffer, rp->name, &rp->regs);
>
>           /* Force explicit tag name, if a name is there. */
>           pfnote (name, true, buffer + linecharno,
>                   charno - linecharno, lineno, linecharno);
>
> But if you say that we should add that offset of 1 when 'name' is
> NULL, I'm okay with doing that.

I've decided to install the change on the master branch.  Let's see
what, if anything, it will break.

I'm closing the bug with this message.