unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
@ 2022-02-03 15:09 David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-21  2:11 ` Dmitry Gutov
                   ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-03 15:09 UTC (permalink / raw)
  To: 53749


[-- Attachment #1.1: Type: text/plain, Size: 2058 bytes --]

I've recently been trying to use xref commands with a tags table in a
TeX repository, and many of the results are sub-optimal.  This is a
known issue -- within living memory there have been at least two
discussions related to it on help-gnu-emacs:

https://lists.gnu.org/archive/html/help-gnu-emacs/2018-06/msg00126.html
https://lists.gnu.org/archive/html/help-gnu-emacs/2021-07/msg00436.html

Neither discussion resulted in any code, at least not that I can find,
and the issues mentioned there remain.  For example,
xref-find-definitions on, say, '\mycommand' returns

No definitions found for: mycommand.

(The absence of the escape char in the search string makes the search
fail, as the tag name in the table will be '\mycommand'.)

Similarly, any xref command on 'my:citekey' will only search by default
for the half of the symbol under point, stopping at the colon.
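
To illustrate the colon problem, here is a minimal sketch (not code from
the patch).  It assumes the cause is that ':' has punctuation syntax in
the buffer's syntax table; the helper name 'my-symbol-at-pos' is
hypothetical and exists only for this demonstration.

```elisp
(require 'thingatpt)

(defun my-symbol-at-pos (text pos colon-syntax)
  "Return the `symbol' thing at POS in TEXT, as a string.
COLON-SYNTAX is the syntax descriptor given to `:', for example
\".\" (punctuation) or \"_\" (symbol constituent).  Hypothetical
helper, for illustration only."
  (with-temp-buffer
    ;; Start from the standard syntax table and change only `:'.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?: colon-syntax table)
      (set-syntax-table table)
      (insert text)
      (goto-char pos)
      (thing-at-point 'symbol t))))

;; Colon as punctuation: only the half under point is found.
(my-symbol-at-pos "my:citekey" 2 ".")   ; => "my"
;; Colon as a symbol constituent: the whole key is found.
(my-symbol-at-pos "my:citekey" 2 "_")   ; => "my:citekey"
```

Changing the syntax class of ':' globally would affect other commands,
which is presumably why the patch introduces a separate syntax table
used only for 'thing-at-point'.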

There are many other behaviors that are suboptimal, as well, so in the
end I wrote a new xref backend for TeX buffers (cloning large portions
of the default etags backend), and wondered whether it might be welcome
in GNU Emacs.

A few remarks:

1. The code should work as it stands both in the AUCTeX and the in-tree
modes.  The AUCTeX hooks I've included in the patch are provisional, as
I would want to discuss with the AUCTeX maintainers how they would
prefer to handle it, should the patch be accepted in some form.

2. Along the way I found some issues with how etags parses TeX files,
issues which affect the usefulness of the xref commands, so I've made
changes in etags.c as well.  When running the test suite for etags the
only diffs occurred in the TeX-related sections of the resulting tags
file, and location information in those sections was good.

3. The patch as it stands enables all the changes by default to give
what I judge to be the best out-of-the-box experience, but wiser heads
may well have other ideas.

4. If it looks like the patch will make it into Emacs in some form, I'm
going to need to assign copyright, so I'd appreciate help with getting
that started.

Thanks,

David.

[-- Attachment #1.2: Type: text/html, Size: 2444 bytes --]

[-- Attachment #2: 0001-Provide-an-xref-backend-for-TeX-buffers.patch --]
[-- Type: text/x-patch, Size: 23052 bytes --]

From 9f5b2547fef5597f9c41e1c46f99746095a0834c Mon Sep 17 00:00:00 2001
From: David Fussner <dfussner@googlemail.com>
Date: Wed, 2 Feb 2022 13:31:51 +0000
Subject: [PATCH] Provide an xref backend for TeX buffers

* lib-src/etags.c (TeX_commands): Improve parsing of commands in TeX
buffers.
(TEX_defenv): Expand list of commands to tag by default in TeX
buffers.
(TeX_help):
* doc/emacs/maintaining.texi (Tag Syntax): Document new tagged
commands.

* lisp/textmodes/tex-mode.el (tex--xref-backend): New function to name
xref backend.
(tex-common-initialization): Set up xref backend for in-tree TeX
modes.
(tex-set-auctex-xref-backend): New function to do the same for AUCTeX
modes.
(xref-backend-identifier-at-point)
(xref-backend-identifier-completion-table)
(xref-backend-identifier-completion-ignore-case)
(xref-backend-definitions, xref-backend-apropos)
(xref-backend-references): New TeX implementations of the generic xref
backend functions.
(tex-xref-apropos-regexp, tex-xref-references-in-directory): New
helper functions for backend.
(tex-thingatpt-modes-list): New var.
(tex-thingatpt-is-texsymbol): New defcustom.
(tex-set-thingatpt-symbol): New command to apply value of previous
buffer-locally.
(tex--symbol-or-texsymbol): New helper function for previous.
(tex-thingatpt--beginning-of-texsymbol)
(tex-thingatpt--end-of-texsymbol): New functions to define texsymbol
"thing" for 'thing-at-point'.
(tex-thingatpt-syntax-table, tex-escape-char): New vars to do the
same.
(tex--thing-at-point): New function to return texsymbol
'thing-at-point'.
(tex-thingatpt-include-escape, tex-xref-try-alternate-forms): New
defcustoms to refine behavior of the xref backend.
(tex--include-escape-p): New function to do the same.
---
 doc/emacs/maintaining.texi |   9 +-
 lib-src/etags.c            |  83 ++++++++--
 lisp/textmodes/tex-mode.el | 331 +++++++++++++++++++++++++++++++++++++
 3 files changed, 411 insertions(+), 12 deletions(-)

diff --git a/doc/emacs/maintaining.texi b/doc/emacs/maintaining.texi
index edcc6075f7..1de435246e 100644
--- a/doc/emacs/maintaining.texi
+++ b/doc/emacs/maintaining.texi
@@ -2565,8 +2565,13 @@ Tag Syntax
 @code{\section}, @code{\subsection}, @code{\subsubsection},
 @code{\eqno}, @code{\label}, @code{\ref}, @code{\cite},
 @code{\bibitem}, @code{\part}, @code{\appendix}, @code{\entry},
-@code{\index}, @code{\def}, @code{\newcommand}, @code{\renewcommand},
-@code{\newenvironment} and @code{\renewenvironment} are tags.
+@code{\index}, @code{\def}, @code{\edef}, @code{\gdef}, @code{\xdef},
+@code{\newcommand}, @code{\renewcommand}, @code{\newenvironment},
+@code{\renewenvironment}, @code{\DeclareRobustCommand},
+@code{\newrobustcmd}, @code{\renewrobustcmd}, @code{\let},
+@code{\csdef}, @code{\csedef}, @code{\csgdef}, @code{\csxdef},
+@code{\csletcs}, and @code{\cslet} are tags.  So too are the arguments
+of any starred variants of these commands, when such variants exist.
 
 Other commands can make tags as well, if you specify them in the
 environment variable @env{TEXTAGS} before invoking @command{etags}.  The
diff --git a/lib-src/etags.c b/lib-src/etags.c
index aa5bc8839d..e5269aa456 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -793,8 +793,12 @@ #define STDIN 0x1001		/* returned by getopt_long on --parse-stdin */
 "In LaTeX text, the argument of any of the commands '\\chapter',\n\
 '\\section', '\\subsection', '\\subsubsection', '\\eqno', '\\label',\n\
 '\\ref', '\\cite', '\\bibitem', '\\part', '\\appendix', '\\entry',\n\
-'\\index', '\\def', '\\newcommand', '\\renewcommand',\n\
-'\\newenvironment' or '\\renewenvironment' is a tag.\n\
+'\\index', '\\def', '\\edef', '\\gdef', '\\xdef', '\\newcommand',\n\
+'\\renewcommand', '\\newenvironment', '\\renewenvironment',\n\
+'\\DeclareRobustCommand', '\\newrobustcmd', '\\renewrobustcmd',\n\
+'\\let', '\\csdef', '\\csedef', '\\csgdef', '\\csxdef', '\\csletcs',\n\
+or '\\cslet' is a tag.  So is the argument of any of the starred\n\
+variants of these commands, when a starred variant exists.\n\
 \n\
 Other commands can be specified by setting the environment variable\n\
 'TEXTAGS' to a colon-separated list like, for example,\n\
@@ -5673,11 +5677,19 @@ Scheme_functions (FILE *inf)
 static linebuffer *TEX_toktab = NULL; /* Table with tag tokens */
 
 /* Default set of control sequences to put into TEX_toktab.
-   The value of environment var TEXTAGS is prepended to this.  */
+   The value of environment var TEXTAGS is prepended to this.
+   (2021) Add variants of '\def', some additional LaTeX commands,
+   and common variants from the 'etoolbox' package.  Also, add
+   starred variants of the commands if they exist.  Starred
+   variants need to appear before their unstarred versions. */
 static const char *TEX_defenv = "\
-:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
-:part:appendix:entry:index:def\
-:newcommand:renewcommand:newenvironment:renewenvironment";
+:chapter*:section*:subsection*:subsubsection*:part*:label:ref\
+:chapter:section:subsection:subsubsection:eqno:cite:bibitem\
+:part:appendix:entry:index:def:edef:gdef:xdef:newcommand*:newcommand\
+:renewcommand*:renewcommand:newenvironment*:newenvironment\
+:renewenvironment*:renewenvironment:DeclareRobustCommand*\
+:DeclareRobustCommand:renewrobustcmd*:renewrobustcmd:newrobustcmd*\
+:newrobustcmd:let:csdef:csedef:csgdef:csxdef:csletcs:cslet";
 
 static void TEX_decode_env (const char *, const char *);
 
@@ -5736,19 +5748,70 @@ TeX_commands (FILE *inf)
 	      {
 		char *p;
 		ptrdiff_t namelen, linelen;
-		bool opgrp = false;
+		bool opgrp = false, one_esc = false;
 
 		cp = skip_spaces (cp + key->len);
+		/* Skip the optional arguments to commands in the tags list so
+		   that these arguments don't end up as the name of the tag.
+		   The name will instead come from the argument in curly braces
+		   that follows the optional ones.  */
+		if (*cp == '[' || *cp == '(')
+		  {
+		    while (*cp != TEX_opgrp && *cp != '\0')
+		      cp++;
+		  }
 		if (*cp == TEX_opgrp)
 		  {
 		    opgrp = true;
 		    cp++;
 		  }
+		/* Jumping to a TeX command definition doesn't work in at
+		   least some of the editors that use ctags.  Changes in
+		   tex-mode.el in GNU Emacs address these issues for etags;
+		   uncomment the following five lines to get a quick & dirty
+		   improvement in programs using ctags as well, though some
+		   parts of the behavior will remain suboptimal.  The
+		   undocumented ctags option '--no-duplicates' may help.  */
+
+		/* if (CTAGS && *cp == TEX_esc) */
+		/*   { */
+		/*     cp++; */
+		/*     one_esc = true; */
+		/*   } */
+
+		/* Add optional argument brackets '(' and '[' so that these
+		   arguments don't appear in tag names.  Also add '=' as it's
+		   relational in the vast majority of cases.  */
 		for (p = cp;
-		     (!c_isspace (*p) && *p != '#' &&
-		      *p != TEX_opgrp && *p != TEX_clgrp);
+		     (!c_isspace (*p) && *p != '#' && *p != '=' &&
+		      *p != '[' && *p != '(' && *p != TEX_opgrp &&
+		      *p != TEX_clgrp);
 		     p++)
-		  continue;
+		  /* Allow only one escape char in a tag name, which
+		     (primarily) enables tagging a TeX command's different,
+		     possibly temporary, '\let' bindings.  */
+		  if (*p == TEX_esc)
+		    {
+		      if (!one_esc)
+			{
+			  one_esc = true;
+			  continue;
+			}
+		      else
+			break;
+		    }
+		  else
+		    continue;
+		/* Re-scan to catch (highly unusual) cases where a
+		   command name is of the form '\('.  */
+		if ((*p == '(' || *p == '[') && (p - cp) < 2)
+		  {
+		    for (p = cp;
+			 (!c_isspace (*p) && *p != '#' &&
+			  *p != TEX_opgrp && *p != TEX_clgrp);
+			 p++)
+		      continue;
+		  }
 		namelen = p - cp;
 		linelen = lb.len;
 		if (!opgrp || *p == TEX_clgrp)
diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
index ab94036d01..3a7178c055 100644
--- a/lisp/textmodes/tex-mode.el
+++ b/lisp/textmodes/tex-mode.el
@@ -1291,6 +1291,9 @@ tex-common-initialization
 	      (syntax-propertize-rules latex-syntax-propertize-rules))
   ;; TABs in verbatim environments don't do what you think.
   (setq-local indent-tabs-mode nil)
+  ;; Set up xref backend in TeX buffers.
+  (add-hook 'xref-backend-functions #'tex--xref-backend nil t)
+  (tex-set-thingatpt-symbol)
   ;; Other vars that should be buffer-local.
   (make-local-variable 'tex-command)
   (make-local-variable 'tex-start-of-header)
@@ -3659,6 +3662,334 @@ tex-chktex
       (process-send-region tex-chktex--process (point-min) (point-max))
       (process-send-eof tex-chktex--process))))
 
+\f
+;;; Xref backend
+
+;; Here we define an xref backend for TeX, adapting the default etags
+;; backend so that the main xref user commands (including
+;; `xref-find-definitions', `xref-find-apropos', and
+;; `xref-find-references' [on M-., C-M-., and M-?, respectively]) work
+;; in TeX buffers.  This mostly involves defining a new THING for
+;; `thing-at-point' (texsymbol), then substituting that THING for
+;; `symbol' in TeX buffers, at least by (configurable) default.  The
+;; TeX escape character will by default appear in the resulting string
+;; only when the xref command uses string search and not regexp
+;; search, though this too is configurable.  The new THING type also
+;; improves the accuracy of other commands that use `thing-at-point'
+;; in TeX buffers, like `project-find-regexp'.  TODO: Include commands
+;; that call `bounds-of-thing-at-point' (for example
+;; `isearch-forward-thing-at-point') in the mechanism.
+
+(defvar tex-thingatpt-modes-list
+  '(tex-mode doctex-mode latex-mode plain-tex-mode slitex-mode)
+  "Major modes where `thing-at-point' may use the `texsymbol' type.
+
+When a buffer's `major-mode' is in this list, and when
+`tex-thingatpt-is-texsymbol' is t (the default), any command in
+that buffer that calls `thing-at-point' with a `symbol' argument
+actually uses the `texsymbol' argument, instead.")
+
+(defcustom tex-thingatpt-is-texsymbol t
+  "When non-nil replace `symbol' by `texsymbol' for `thing-at-point'.
+
+This applies only to TeX buffers.  The `texsymbol' \"thing\"
+modifies the standard `symbol' for use in such buffers.
+
+When nil, restore the default behavior of `thing-at-point' in TeX
+buffers.
+
+Custom will automatically apply changes in all TeX buffers, but
+if you set the variable outside of Custom it won't take effect
+until you apply it with \\[tex-set-thingatpt-symbol].  Without a
+prefix argument (\\[universal-argument]) this applies only to the
+current buffer, but with one it applies to all TeX buffers in
+`buffer-list'.  (TeX buffers are those whose `major-mode' is a
+member of `tex-thingatpt-modes-list'.)"
+  :type 'boolean
+  :group 'tex-file
+  :initialize #'custom-initialize-default
+  :set (lambda (var val)
+         (set-default var val)
+         (tex-set-thingatpt-symbol t))
+  :version "29.1")
+
+(defcustom tex-thingatpt-include-escape '(xref-find-definitions
+                                          xref-find-definitions-other-window
+                                          xref-find-definitions-other-frame)
+  "If non-nil, include `tex-escape-char' in `thing-at-point'.
+
+This variable only takes effect when `tex-thingatpt-is-texsymbol'
+is t (the default), changing the argument passed to
+`thing-at-point' from `symbol' to `texsymbol'.  When that is the
+case, the values of this variable act as follows:
+
+When t, `thing-at-point' will always include a
+`tex-escape-char' (usually `\\'), should one be present, in the
+string it returns in TeX buffers.
+
+When nil, `thing-at-point' will never include the
+`tex-escape-char' in the string it returns in TeX buffers.
+
+Otherwise, it's a list of commands for which `thing-at-point'
+will always include the `tex-escape-char' in the string it
+returns.  The three xref commands listed by default may cease to
+function properly in TeX buffers if set to nil, but setting
+`tex-xref-try-alternate-forms' to t will rectify that."
+  :type '(choice (const :tag "Always include tex-escape-char" t)
+                 (const :tag "Never include tex-escape-char" nil)
+                 (set :tag "Include tex-escape-char for these commands"
+		      (repeat :inline t (symbol :tag "command"))))
+  :group 'tex-file
+  :version "29.1")
+
+(defcustom tex-xref-try-alternate-forms nil
+  "Non-nil means find definitions of alternate forms of commands.
+
+If `xref-find-definitions' returns nil for the current form of
+the TeX command name, try the alternative form, which will have
+the `tex-escape-char' (usually `\\') either stripped from or
+prepended to the current form, depending on whether or not the
+current form starts with that character.
+
+This may be particularly useful in documents that mix `\\def' and
+`\\csdef' when defining commands."
+  :type 'boolean
+  :group 'tex-file
+  :version "29.1")
+
+(defvar tex-escape-char ?\\
+  "The current TeX escape character.
+
+The `etags' program only recognizes `\\' (92) and `!' (33) as
+escape characters in TeX documents, and if it detects the latter
+it also uses `<>' as the TeX grouping construct rather than `{}'.
+Setting this variable to anything other than `\\' or `!' will not
+be useful without changes to `etags', at least for commands that
+search tags tables, such as \\[xref-find-definitions] and \
+\\[xref-find-apropos].")
+
+(defvar tex-thingatpt-syntax-table
+  (let* ((ost (if (boundp 'TeX-mode-syntax-table)
+                  TeX-mode-syntax-table
+                tex-mode-syntax-table))
+         (st (make-syntax-table ost)))
+    (modify-syntax-entry ?# "'" st)
+    (modify-syntax-entry ?= "'" st)
+    (modify-syntax-entry ?` "'" st)
+    (modify-syntax-entry ?\" "'" st)
+    (modify-syntax-entry ?' "'" st)
+    st)
+  "Syntax table for delimiting `thing-at-point' in TeX buffers.
+
+When `tex-thingatpt-is-texsymbol' is t, this syntax table helps
+to define what a `texsymbol' is.")
+
+(defun tex--xref-backend () 'tex)
+
+;; Setup AUCTeX modes.  (Should this be in AUCTeX itself?)
+
+(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
+(add-hook 'TeX-mode-hook #'tex-set-thingatpt-symbol)
+
+(defun tex-set-auctex-xref-backend ()
+  (add-hook 'xref-backend-functions #'tex--xref-backend nil t))
+
+(declare-function xref-item-location "xref")
+(declare-function xref--project-root "xref" (project))
+(declare-function xref--convert-hits "xref" (hits regexp))
+(declare-function apropos-parse-pattern "apropos" (pattern))
+(declare-function semantic-symref-perform-search "semantic/symref")
+(declare-function semantic-symref-instantiate "semantic/symref")
+(declare-function project-external-roots "project")
+(declare-function find-tag--completion-ignore-case "etags")
+(declare-function etags--xref-find-definitions "etags")
+(declare-function etags--xref-apropos-additional "etags" (regexp))
+(declare-function cl-delete-if "cl-seq")
+(defvar etags-xref-prefer-current-file)
+
+(cl-defmethod xref-backend-identifier-at-point ((_backend (eql 'tex)))
+  (require 'etags)
+  (thing-at-point 'symbol t))
+
+(cl-defmethod xref-backend-identifier-completion-table ((_backend
+                                                         (eql 'tex)))
+  (tags-lazy-completion-table))
+
+(cl-defmethod xref-backend-identifier-completion-ignore-case ((_backend
+                                                               (eql 'tex)))
+  (find-tag--completion-ignore-case))
+
+(cl-defmethod xref-backend-definitions ((_backend (eql 'tex)) symbol)
+  (let* ((file (and buffer-file-name (expand-file-name buffer-file-name)))
+         (alt-sym (if (char-equal tex-escape-char (aref symbol 0))
+                      (substring symbol 1)
+                    (concat (string tex-escape-char) symbol)))
+         (prelim-definitions (etags--xref-find-definitions symbol))
+         (definitions (if (or prelim-definitions
+                              (not tex-xref-try-alternate-forms))
+                          prelim-definitions
+                        (etags--xref-find-definitions alt-sym)))
+         same-file-definitions)
+    (when (and etags-xref-prefer-current-file file)
+      (setq definitions
+            (cl-delete-if
+             (lambda (definition)
+               (when (equal file
+                            (xref-location-group
+                             (xref-item-location definition)))
+                 (push definition same-file-definitions)
+                 t))
+             definitions))
+      (setq definitions (nconc (nreverse same-file-definitions)
+                               definitions)))
+    definitions))
+
+(cl-defmethod xref-backend-apropos ((_backend (eql 'tex)) pattern)
+  (let ((regexp (tex-xref-apropos-regexp pattern)))
+    (nconc
+     (or
+      (etags--xref-find-definitions regexp t)
+      (etags--xref-find-definitions pattern t))
+     (etags--xref-apropos-additional regexp))))
+
+(cl-defmethod xref-backend-references ((_backend (eql 'tex)) identifier)
+  (mapcan
+   (lambda (dir)
+     (message "Searching %s..." dir)
+     (redisplay)
+     (prog1
+         (tex-xref-references-in-directory identifier dir)
+       (message "Searching %s... done" dir)))
+   (let ((pr (project-current t)))
+     (cons
+      (xref--project-root pr)
+      (project-external-roots pr)))))
+
+(defun tex-xref-apropos-regexp (pattern)
+  "Return a regexp from PATTERN similar to `apropos'.
+
+Unlike the standard xref function, if `regexp-quote' returns a
+string different from the original PATTERN, the TeX function
+passes that modified string, rather than PATTERN itself, to
+`apropos-parse-pattern'."
+  (let ((re (regexp-quote pattern)))
+    (apropos-parse-pattern
+     (if (string-equal re pattern)
+         ;; Split into words
+         (or (split-string pattern "[ \t]+" t)
+             (user-error "No word list given"))
+       re))))
+
+(defun tex-xref-references-in-directory (symbol dir)
+  "Find all references to SYMBOL in directory DIR.
+Return a list of xref values.
+
+This function uses the Semantic Symbol Reference API.  In TeX
+buffers the value returned when passing SYMBOL to `regexp-quote'
+becomes the default search term.  If this symref instantiation
+finds no matches, a second tries again with the original SYMBOL
+as search term, instead.  Both searches set keyword `searchtype:'
+to \\='regexp instead of xref's \\='symbol.
+
+See `semantic-symref-tool-alist' for details on which tools are
+used, and when.  See also `xref-references-in-directory' and
+comments in its code, the latter copied into the TeX
+implementation for convenience."
+  (cl-assert (directory-name-p dir))
+  (require 'semantic/symref)
+  (defvar semantic-symref-tool)
+  (defvar ede-minor-mode)
+
+  ;; Some symref backends use `ede-project-root-directory' as the root
+  ;; directory for the search, rather than `default-directory'. Since
+  ;; the caller has specified `dir', we bind `ede-minor-mode' to nil
+  ;; to force the backend to use `default-directory'.
+  (let* ((ede-minor-mode nil)
+         (default-directory dir)
+         ;; FIXME: Remove CScope and Global from the recognized tools?
+         ;; The current implementations interpret the symbol search as
+         ;; "find all calls to the given function", but not function
+         ;; definition. And they return nothing when passed a variable
+         ;; name, even a global one.
+         (semantic-symref-tool 'detect)
+         (case-fold-search nil)
+         (texsymbol (regexp-quote symbol))
+         (inst (semantic-symref-instantiate :searchfor texsymbol
+                                            :searchtype 'regexp
+                                            :searchscope 'subdirs
+                                            :resulttype 'line-and-text))
+         (alt-inst (semantic-symref-instantiate :searchfor symbol
+                                                :searchtype 'regexp
+                                                :searchscope 'subdirs
+                                                :resulttype 'line-and-text)))
+    (or
+     (xref--convert-hits (semantic-symref-perform-search inst)
+                         (format "%s" texsymbol))
+     (xref--convert-hits (semantic-symref-perform-search alt-inst)
+                         (format "%s" symbol)))))
+
+(put 'texsymbol 'beginning-op 'tex-thingatpt--beginning-of-texsymbol)
+
+(put 'texsymbol 'end-op 'tex-thingatpt--end-of-texsymbol)
+
+(defun tex-set-thingatpt-symbol (&optional all)
+  "Set meaning of `thing-at-point' `symbol' in (ALL?) TeX buffers.
+
+When `tex-thingatpt-is-texsymbol' is t, set `thing-at-point' to
+use the `texsymbol' \"thing\" instead of `symbol', otherwise
+maintain or restore the default.  Without an optional ALL make
+changes only in current buffer, with ALL make changes in all TeX
+buffers in `buffer-list'."
+  (interactive "P")
+  (require 'thingatpt)
+  (if all
+      (dolist (buf (buffer-list))
+        (with-current-buffer buf
+          (tex--symbol-or-texsymbol)))
+    (tex--symbol-or-texsymbol)))
+
+(defun tex--symbol-or-texsymbol ()
+  (when (memq major-mode tex-thingatpt-modes-list)
+    (if tex-thingatpt-is-texsymbol
+        (setq-local thing-at-point-provider-alist
+                    (add-to-list 'thing-at-point-provider-alist
+                            '(symbol . tex--thing-at-point)))
+      (setq-local thing-at-point-provider-alist
+                  (delete '(symbol . tex--thing-at-point)
+                          thing-at-point-provider-alist)))))
+
+(defun tex--thing-at-point ()
+  "Pass `thing' type `texsymbol' to `bounds-of-thing-at-point'.
+
+When `tex-thingatpt-is-texsymbol' is t, calls in TeX buffers to
+`thing-at-point' with argument `symbol' will use this function."
+  (let* ((sytab (make-syntax-table tex-thingatpt-syntax-table))
+         (bounds (with-syntax-table sytab
+                   (unless (char-equal tex-escape-char ?\\)
+                     (modify-syntax-entry ?\\ "_")
+                     (modify-syntax-entry tex-escape-char "\\")
+                     (modify-syntax-entry ?< "(>")
+                     (modify-syntax-entry ?> ")<"))
+                   (bounds-of-thing-at-point 'texsymbol))))
+    (when bounds
+      (buffer-substring-no-properties (car bounds) (cdr bounds)))))
+
+(defun tex--include-escape-p (command)
+  (or (eq tex-thingatpt-include-escape t)
+      (memq command tex-thingatpt-include-escape)))
+
+(defun tex-thingatpt--beginning-of-texsymbol ()
+  "Move point to the beginning of the current TeX symbol."
+  (and (re-search-backward "\\([][()]\\|\\(\\sw\\|\\s_\\|\\s.\\)+\\)")
+       (skip-syntax-backward "w_.")
+       (when (tex--include-escape-p this-command)
+         (skip-syntax-backward "\\/"))))
+
+(defun tex-thingatpt--end-of-texsymbol ()
+  "Move point to the end of the current TeX symbol."
+  (and (re-search-forward "\\([][()]\\|\\(\\sw\\|\\s_\\|\\s.\\)+\\)")
+       (skip-syntax-forward "w_.")))
+
 (make-obsolete-variable 'tex-mode-load-hook
                         "use `with-eval-after-load' instead." "28.1")
 (run-hooks 'tex-mode-load-hook)
-- 
2.17.6



* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-03 15:09 bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-21  2:11 ` Dmitry Gutov
  2022-02-21  9:48   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-08 13:25   ` Lars Ingebrigtsen
  2022-02-21 12:35 ` Arash Esbati
  2022-02-25 20:16 ` Augusto Stoffel
  2 siblings, 2 replies; 66+ messages in thread
From: Dmitry Gutov @ 2022-02-21  2:11 UTC (permalink / raw)
  To: David Fussner, 53749

Hi!

Let us first discuss whether we could make do without an additional Xref 
backend. Just to make sure.

On 03.02.2022 17:09, David Fussner via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
> Similarly, any xref command on 'my:citekey' will only search by default
> for the half of the symbol under point, stopping at the colon.

etags's implementation of 'xref-backend-identifier-at-point' calls 
'find-tag--default', which consults 'find-tag-default-function' and
(get major-mode 'find-tag-default-function).

So if your main goal was to alter which string gets searched for (based 
on text around point), you can define a function which returns the 
necessary string (as you did in the patch) and then either set 
'find-tag-default-function' to that function, or put it on the 
'find-tag-default-function' property for the respective major mode 
functions.
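
A minimal sketch of what this suggestion might look like, under stated
assumptions: the function name 'my-tex-find-tag-default' and the set of
characters it accepts are hypothetical, and the choice of 'latex-mode'
and 'plain-tex-mode' is only an example.  It is not the code from the
patch.

```elisp
(require 'etags)

(defun my-tex-find-tag-default ()
  "Return the TeX symbol around point as a string, or nil if none.
Hypothetical helper: keeps a leading backslash and treats `:',
`@' and `_' as part of the name."
  (save-excursion
    ;; If point is on the escape char itself, step past it first.
    (when (eq (char-after) ?\\)
      (forward-char))
    (skip-chars-backward "[:alnum:]@:_")
    ;; Include one preceding backslash in the result, if present.
    (let ((beg (if (eq (char-before) ?\\) (1- (point)) (point))))
      (skip-chars-forward "[:alnum:]@:_")
      (unless (= beg (point))
        (buffer-substring-no-properties beg (point))))))

;; Option 1: set the variable buffer-locally from a mode hook.
(add-hook 'latex-mode-hook
          (lambda ()
            (setq-local find-tag-default-function
                        #'my-tex-find-tag-default)))

;; Option 2: put it on the major mode's symbol property instead.
(put 'plain-tex-mode 'find-tag-default-function
     #'my-tex-find-tag-default)
```

Either route only changes the default search string; it does not by
itself change how the search is carried out.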

> There are many other behaviors that are suboptimal, as well, so in the
> end I wrote a new xref backend for TeX buffers (cloning large portions
> of the default etags backend), and wondered whether it might be welcome
> in GNU Emacs.

Could you point out the other changes which were required?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-21  2:11 ` Dmitry Gutov
@ 2022-02-21  9:48   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-21 17:28     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-21 23:55     ` Dmitry Gutov
  2022-09-08 13:25   ` Lars Ingebrigtsen
  1 sibling, 2 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-21  9:48 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749

(Resending to include the mailing list -- sorry!)

Hi Dmitry,

Many thanks for looking into this.

>
> So if your main goal was to alter which string gets searched for (based
> on text around point), you can define a function which returns the
> necessary string (as you did in the patch) and then either set
> 'find-tag-default-function' to that function, or put it on the
> 'find-tag-default-function' property for the respective major mode
> functions.
>
> > There are many other behaviors that are suboptimal, as well, so in the
> > end I wrote a new xref backend for TeX buffers (cloning large portions
> > of the default etags backend), and wondered whether it might be welcome
> > in GNU Emacs.
>
> Could you point out the other changes which were required?

As you've noticed, I tried at first to get by without a new backend,
but I ran into a few issues that I couldn't solve that way, hence the
current patch.  A couple of examples:

1. TeX is very generous with the characters it includes in its
symbols, so what looks like a standard symbol to it can look like a
regexp either to grep or to emacs, so I needed to change things in
xref-find-apropos and in xref-find-references to take this into
account.  (See tex-xref-apropos-regexp and
tex-xref-references-in-directory.)  Sometimes using a search string
that had been put through regexp-quote was wrong, as when a user
provided their own regexp in the minibuffer, so in both those cases I
provided fallbacks to a different search in case the default search
came up empty.  I couldn't see how to do this without a new backend.

2.  A package like biblatex creates what amounts to a separate
namespace using the \newbibmacro mechanism, so pretty much every
biblatex style has both a \cite command and a cite bibmacro, and I
wanted to allow emacs to differentiate between them when using
xref-find-definitions.  Because users of the etoolbox package (like
biblatex) may well mix commands with and without the escape char "\",
I also provided a variable to allow users to find when a \command is
called using \csuse{command} instead.  Again, this required a fallback
search (see xref-backend-definitions) which I couldn't see how to
provide without a new backend.
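
The fallback idea in point 1 can be sketched in isolation as follows.
This is a simplified illustration that matches against a plain string,
not the patch's implementation (which goes through the etags and
Semantic symref machinery); the function name
'my-tex-search-with-fallback' is hypothetical.

```elisp
(defun my-tex-search-with-fallback (pattern text)
  "Search TEXT for PATTERN, literally first, then as a regexp.
Return the symbol `literal' if the `regexp-quote'd PATTERN
matches, `regexp' if only the raw PATTERN matches, else nil.
Hypothetical helper, for illustration only."
  (let ((case-fold-search nil))
    (cond
     ;; Default: treat PATTERN as a TeX symbol, so quote any
     ;; characters that are special in regexps.
     ((string-match-p (regexp-quote pattern) text) 'literal)
     ;; Fallback: the user may have typed a real regexp.
     ((string-match-p pattern text) 'regexp))))

;; `+' is part of the cite key here, so the literal search wins.
(my-tex-search-with-fallback "smith+jones" "\\cite{smith+jones}")
;; => literal

;; A user-supplied regexp only matches via the fallback.
(my-tex-search-with-fallback "sec.*ion" "\\section{Intro}")
;; => regexp
```

The same try-one-form-then-the-other shape applies to point 2, where
the alternate form adds or strips the escape character rather than
quoting the pattern.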

Does this make any sense?  I can give more specific examples if you
like -- try running xref-find-references on a TeX command with "@" in
it.  (If memory serves, that behaved badly here on an unpatched emacs,
but maybe I'm misremembering.)

David.

On Mon, 21 Feb 2022 at 02:11, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> Hi!
>
> Let us first discuss whether we could make do without an additional Xref
> backend. Just to make sure.
>
> On 03.02.2022 17:09, David Fussner via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
> > Similarly, any xref command on 'my:citekey' will only search by default
> > for the half of the symbol under point, stopping at the colon.
>
> etags's implementation of 'xref-backend-identifier-at-point' calls
> 'find-tag--default', which consults 'find-tag-default-function' and
> (get major-mode 'find-tag-default-function).
>
> So if your main goal was to alter which string gets searched for (based
> on text around point), you can define a function which returns the
> necessary string (as you did in the patch) and then either set
> 'find-tag-default-function' to that function, or put it on the
> 'find-tag-default-function' property for the respective major mode
> functions.
>
> > There are many other behaviors that are suboptimal, as well, so in the
> > end I wrote a new xref backend for TeX buffers (cloning large portions
> > of the default etags backend), and wondered whether it might be welcome
> > in GNU Emacs.
>
> Could you point out the other changes which were required?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-03 15:09 bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-21  2:11 ` Dmitry Gutov
@ 2022-02-21 12:35 ` Arash Esbati
  2022-02-21 14:03   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-25 20:16 ` Augusto Stoffel
  2 siblings, 1 reply; 66+ messages in thread
From: Arash Esbati @ 2022-02-21 12:35 UTC (permalink / raw)
  To: 53749; +Cc: dfussner

David Fussner via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs@gnu.org> writes:

> diff --git a/lib-src/etags.c b/lib-src/etags.c
> index aa5bc8839d..e5269aa456 100644
> --- a/lib-src/etags.c
> +++ b/lib-src/etags.c
> [...]
>  /* Default set of control sequences to put into TEX_toktab.
> -   The value of environment var TEXTAGS is prepended to this.  */
> +   The value of environment var TEXTAGS is prepended to this.
> +   (2021) Add variants of '\def', some additional LaTeX commands,
> +   and common variants from the 'etoolbox' package.  Also, add
> +   starred variants of the commands if they exist.  Starred
> +   variants need to appear before their unstarred versions. */
>  static const char *TEX_defenv = "\
> -:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
> -:part:appendix:entry:index:def\
> -:newcommand:renewcommand:newenvironment:renewenvironment";
> +:chapter*:section*:subsection*:subsubsection*:part*:label:ref\
> +:chapter:section:subsection:subsubsection:eqno:cite:bibitem\
> +:part:appendix:entry:index:def:edef:gdef:xdef:newcommand*:newcommand\
> +:renewcommand*:renewcommand:newenvironment*:newenvironment\
> +:renewenvironment*:renewenvironment:DeclareRobustCommand*\
> +:DeclareRobustCommand:renewrobustcmd*:renewrobustcmd:newrobustcmd*\
> +:newrobustcmd:let:csdef:csedef:csgdef:csxdef:csletcs:cslet";
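
The ordering constraint in that comment makes sense if the token list is
scanned front to back and the first entry that is a prefix of the input
wins.  That is an assumption about the lookup, not something checked
against etags.c.  A minimal sketch in C, with first_prefix_match as a
hypothetical helper and the escape char left out of the tokens, as in
TEX_defenv:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the actual etags.c code: return the first
   entry of TOKENS that is a prefix of TEXT, or NULL if none matches.
   The scan order is the list order, so an earlier entry shadows any
   later entry that it is itself a prefix of.  */
static const char *
first_prefix_match (const char *const *tokens, size_t ntokens,
                    const char *text)
{
  for (size_t i = 0; i < ntokens; i++)
    if (strncmp (tokens[i], text, strlen (tokens[i])) == 0)
      return tokens[i];
  return NULL;
}
```

Under that assumption, the input "newcommand*{\foo}{bar}" matches
"newcommand*" when the starred token is listed first, but stops at
"newcommand" when the unstarred token is listed first, leaving the star
to be misread as the start of the command name.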

Hi David,

thanks for looking into this.  While you're at it, can you also please
add support for the former xparse \newcommand variants which are now
(now is October 2020) part of LaTeX kernel, namely:

\NewDocumentCommand
\RenewDocumentCommand
\ProvideDocumentCommand
\DeclareDocumentCommand
\NewDocumentEnvironment
\RenewDocumentEnvironment
\ProvideDocumentEnvironment
\DeclareDocumentEnvironment
\NewExpandableDocumentCommand
\RenewExpandableDocumentCommand
\ProvideExpandableDocumentCommand
\DeclareExpandableDocumentCommand

TIA.  Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-21 12:35 ` Arash Esbati
@ 2022-02-21 14:03   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-21 14:03 UTC (permalink / raw)
  To: Arash Esbati; +Cc: 53749

Hi Arash,

Thank you for the list!  I had fully intended to add the new LaTeX 3
commands but managed somehow to forget.  If you see anything else I've
omitted please let me know.

David.

On Mon, 21 Feb 2022 at 12:36, Arash Esbati <arash@gnu.org> wrote:
>
> David Fussner via "Bug reports for GNU Emacs, the Swiss army knife of
> text editors" <bug-gnu-emacs@gnu.org> writes:
>
> > diff --git a/lib-src/etags.c b/lib-src/etags.c
> > index aa5bc8839d..e5269aa456 100644
> > --- a/lib-src/etags.c
> > +++ b/lib-src/etags.c
> > [...]
> >  /* Default set of control sequences to put into TEX_toktab.
> > -   The value of environment var TEXTAGS is prepended to this.  */
> > +   The value of environment var TEXTAGS is prepended to this.
> > +   (2021) Add variants of '\def', some additional LaTeX commands,
> > +   and common variants from the 'etoolbox' package.  Also, add
> > +   starred variants of the commands if they exist.  Starred
> > +   variants need to appear before their unstarred versions. */
> >  static const char *TEX_defenv = "\
> > -:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
> > -:part:appendix:entry:index:def\
> > -:newcommand:renewcommand:newenvironment:renewenvironment";
> > +:chapter*:section*:subsection*:subsubsection*:part*:label:ref\
> > +:chapter:section:subsection:subsubsection:eqno:cite:bibitem\
> > +:part:appendix:entry:index:def:edef:gdef:xdef:newcommand*:newcommand\
> > +:renewcommand*:renewcommand:newenvironment*:newenvironment\
> > +:renewenvironment*:renewenvironment:DeclareRobustCommand*\
> > +:DeclareRobustCommand:renewrobustcmd*:renewrobustcmd:newrobustcmd*\
> > +:newrobustcmd:let:csdef:csedef:csgdef:csxdef:csletcs:cslet";
>
> Hi David,
>
> thanks for looking into this.  While you're at it, can you also please
> add support for the former xparse \newcommand variants which are now
> (now is October 2020) part of LaTeX kernel, namely:
>
> \NewDocumentCommand
> \RenewDocumentCommand
> \ProvideDocumentCommand
> \DeclareDocumentCommand
> \NewDocumentEnvironment
> \RenewDocumentEnvironment
> \ProvideDocumentEnvironment
> \DeclareDocumentEnvironment
> \NewExpandableDocumentCommand
> \RenewExpandableDocumentCommand
> \ProvideExpandableDocumentCommand
> \DeclareExpandableDocumentCommand
>
> TIA.  Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-21  9:48   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-21 17:28     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-21 23:56       ` Dmitry Gutov
  2022-02-21 23:55     ` Dmitry Gutov
  1 sibling, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-21 17:28 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749

Hi Dmitry,

I found a bit of time to test, and the problem with "@" in command
names appears when a search string for xref-find-references ends with
"@". The results returned will miss out valid hits, depending on what
follows the "@" in the actual command name in the TeX file.

Hope this might help,

David.

On Mon, 21 Feb 2022 at 09:48, David Fussner <dfussner@googlemail.com> wrote:
>
> (Resending to include the mailing list -- sorry!)
>
> Hi Dmitry,
>
> Many thanks for looking into this.
>
> >
> > So if your main goal was to alter which string gets searched for (based
> > on text around point), you can define a function which returns the
> > necessary string (as you did in the patch) and then either set
> > 'find-tag-default-function' to that function, or put it on the
> > 'find-tag-default-function' property for the respective major mode
> > functions.
> >
> > > There are many other behaviors that are suboptimal, as well, so in the
> > > end I wrote a new xref backend for TeX buffers (cloning large portions
> > > of the default etags backend), and wondered whether it might be welcome
> > > in GNU Emacs.
> >
> > Could you point out the other changes which were required?
>
> As you've noticed, I tried at first to get by without a new backend,
> but I ran into a few issues that I couldn't solve that way, hence the
> current patch.  A couple of examples:
>
> 1. TeX is very generous with the characters it includes in its
> symbols, so what looks like a standard symbol to it can look like a
> regexp either to grep or to emacs, so I needed to change things in
> xref-find-apropos and in xref-find-references to take this into
> account.  (See tex-xref-apropos-regexp and
> tex-xref-references-in-directory.)  Sometimes using a search string
> that had been put through regexp-quote was wrong, as when a user
> provided their own regexp in the minibuffer, so in both those cases I
> provided fallbacks to a different search in case the default search
> came up empty.  I couldn't see how to do this without a new backend.
>
> 2.  A package like biblatex creates what amounts to a separate
> namespace using the \newbibmacro mechanism, so pretty much every
> biblatex style has both a \cite command and a cite bibmacro, and I
> wanted to allow emacs to differentiate between them when using
> xref-find-definitions.  Because users of the etoolbox package (like
> biblatex) may well mix commands with and without the escape char "\",
> I also provided a variable to allow users to find when a \command is
> called using \csuse{command} instead.  Again, this required a fallback
> search (see xref-backend-definitions) which I couldn't see how to
> provide without a new backend.
>
> Does this make any sense?  I can give more specific examples if you
> like -- try running xref-find-references on a TeX command with "@" in
> it.  (If memory serves, that behaved badly here on an unpatched emacs,
> but maybe I'm misremembering.)
>
> David.
>
> On Mon, 21 Feb 2022 at 02:11, Dmitry Gutov <dgutov@yandex.ru> wrote:
> >
> > Hi!
> >
> > Let us first discuss whether we could make do without an additional Xref
> > backend. Just to make sure.
> >
> > On 03.02.2022 17:09, David Fussner via Bug reports for GNU Emacs, the
> > Swiss army knife of text editors wrote:
> > > Similarly, any xref command on 'my:citekey' will only search by default
> > > for the half of the symbol under point, stopping at the colon.
> >
> > etags's implementation of 'xref-backend-identifier-at-point' calls
> > 'find-tag--default', which consults 'find-tag-default-function' and
> > (get major-mode 'find-tag-default-function).
> >
> > So if your main goal was to alter which string gets searched for (based
> > on text around point), you can define a function which returns the
> > necessary string (as you did in the patch) and then either set
> > 'find-tag-default-function' to that function, or put it on the
> > 'find-tag-default-function' property for the respective major mode
> > functions.
> >
> > > There are many other behaviors that are suboptimal, as well, so in the
> > > end I wrote a new xref backend for TeX buffers (cloning large portions
> > > of the default etags backend), and wondered whether it might be welcome
> > > in GNU Emacs.
> >
> > Could you point out the other changes which were required?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-21  9:48   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-21 17:28     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-21 23:55     ` Dmitry Gutov
  1 sibling, 0 replies; 66+ messages in thread
From: Dmitry Gutov @ 2022-02-21 23:55 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749

On 21.02.2022 11:48, David Fussner wrote:
> Sometimes using a search string
> that had been put through regexp-quote was wrong, as when a user
> provided their own regexp in the minibuffer, so in both those cases I
> provided fallbacks to a different search in case the default search
> came up empty.  I couldn't see how to do this without a new backend.

One way to deal with that is to treat all user inputs as regexps there. 
Perhaps some will have to be more verbose than ideal, but as long as the 
user is familiar with the regexp syntax, the behavior will be both 
powerful and predictable.

> 2.  A package like biblatex creates what amounts to a separate
> namespace using the \newbibmacro mechanism, so pretty much every
> biblatex style has both a \cite command and a cite bibmacro, and I
> wanted to allow emacs to differentiate between them when using
> xref-find-definitions.  Because users of the etoolbox package (like
> biblatex) may well mix commands with and without the escape char "\",
> I also provided a variable to allow users to find when a \command is
> called using \csuse{command} instead.  Again, this required a fallback
> search (see xref-backend-definitions) which I couldn't see how to
> provide without a new backend.

Could those be disambiguated when the tags are scanned, instead? Then 
the user will tailor their input to find the one or the other.

Or if we want fuzzier matching, perhaps creating mode-specific 
values of etags-xref-find-definitions-tag-order could help.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-21 17:28     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-21 23:56       ` Dmitry Gutov
  2022-02-22 15:19         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry Gutov @ 2022-02-21 23:56 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749

On 21.02.2022 19:28, David Fussner wrote:
> Hi Dmitry,
> 
> I found a bit of time to test, and the problem with "@" in command
> names appears when a search string for xref-find-references ends with
> "@". The results returned will miss out valid hits, depending on what
> follows the "@" in the actual command name in the TeX file.

Sorry, I have very little familiarity with TeX.

Do you have a step-by-step scenario? Perhaps using one of the .texi 
manuals already existing in the repo?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-21 23:56       ` Dmitry Gutov
@ 2022-02-22 15:19         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-23  2:21           ` Dmitry Gutov
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-22 15:19 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749

Hi Dmitry,

> Do you have a step-by-step scenario? Perhaps using one of the .texi
> manuals already existing in the repo?

I can't find a good example in the emacs repo, but I'll try to talk through
what happens with a code snippet from biblatex.sty, which I hope will
explain some of the issues we're discussing, even if it is a little
artificial.

\DeclareBiblatexOption{global,type}[string]{uniquename}[true]{%
  \ifcsdef{blx@opt@uniquename@#1}
    {\letcs\blx@uniquename{blx@opt@uniquename@#1}}
    {\blx@err@invopt{uniquename=#1}{}}}
\def\blx@opt@uniquename@false{false}
\def\blx@opt@uniquename@init{init}
\def\blx@opt@uniquename@true{full}
\def\blx@opt@uniquename@full{full}
\def\blx@opt@uniquename@allinit{allinit}
\def\blx@opt@uniquename@allfull{allfull}
\def\blx@opt@uniquename@mininit{mininit}
\def\blx@opt@uniquename@minfull{minfull}

If you do M-? on \ifcsdef{blx@opt@uniquename@#1} using the default backend,
the default search string is blx@opt@uniquename@, and you'll get two hits,
that line and the following one.  Stepping through
xref-references-in-directory shows that the semantic-symref search (using
grep) only finds those two using the :searchtype 'symbol, and they're
returned.  If you change 'symbol to 'regexp, grep finds all the matches in
that code snippet, but then xref--convert-hits uses (format "\\_<%s\\_>"),
which again loses all but the first two hits when it scans the list
provided by grep.  Either grep or emacs here will miss out on valid hits
unless you change both the semantic-symref instantiation and the format
specification.
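
To make the boundary problem concrete, here is a small model in C of
what a "\\_<%s\\_>" filter checks.  It is only a sketch, not the Emacs
code: it assumes that letters, digits and "@" are the symbol
constituents, and the function names are made up for the illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: letters, digits and '@' count as symbol
   constituents, as they would under a syntax table that gives '@'
   symbol syntax.  */
static bool
symbol_char_p (char c)
{
  return isalnum ((unsigned char) c) || c == '@';
}

/* Return true if NEEDLE occurs in LINE as a whole symbol, i.e. with
   no symbol constituent directly before or after it.  */
static bool
whole_symbol_hit_p (const char *line, const char *needle)
{
  size_t len = strlen (needle);
  if (len == 0)
    return false;
  for (const char *p = strstr (line, needle); p != NULL;
       p = strstr (p + 1, needle))
    {
      bool start_ok = (p == line) || !symbol_char_p (p[-1]);
      bool end_ok = (p[len] == '\0') || !symbol_char_p (p[len]);
      if (start_ok && end_ok)
        return true;
    }
  return false;
}

/* Count the lines in LINES that contain NEEDLE, either as a plain
   substring or, when WHOLE_SYMBOL is true, only as a whole symbol.  */
static size_t
count_hits (const char *const *lines, size_t nlines,
            const char *needle, bool whole_symbol)
{
  size_t hits = 0;
  for (size_t i = 0; i < nlines; i++)
    if (whole_symbol
        ? whole_symbol_hit_p (lines[i], needle)
        : strstr (lines[i], needle) != NULL)
      hits++;
  return hits;
}
```

Run over the twelve lines of the snippet with the needle
blx@opt@uniquename@, this model counts two lines with whole_symbol set
(the ones where "#" follows the final "@") and ten lines without it.
The eight \def lines drop out because "false", "init" and so on extend
the symbol past the end of the search string.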

> One way to deal with that is to treat all user inputs as regexps there.
> Perhaps some will have to be more verbose than ideal, but as long as
> the user is familiar with the regexp syntax, the behavior will be both
> powerful and predictable.

If I understand you right, I think that's what I'm trying to do, but
allowing for users who perhaps aren't too familiar with emacs regexps and
who might typically just accept the default search string offered by xref.

> Could those be disambiguated when the tags are scanned, instead? Then
> the user will tailor their input to find the one or the other.

If I understand you correctly, that's also what I try to do -- each tagged
command in the tags file is searched by the name of the tag, which in these
cases will either start with the escape char or not.  Looking at the
biblatex snippet, if you come across \csuse{blx@opt@uniquename@false}
somewhere in a file, and you want to see what the definition is, you can't
know a priori how it was defined, with \def or with \csdef.  This snippet
above mixes both styles, and I hoped that a user would be allowed to choose
whether to search for both styles without necessarily having to try both
forms of the string in separate searches.  In fact, as the code stands, it
only does the second search if the first one fails, so it still more or
less keeps the two command-naming styles separate.
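
In outline, the two-step search works like the following C sketch.  The
real code is Emacs Lisp in the patch; find_exact and find_with_fallback
are hypothetical names, and the tag table is reduced to a plain array
of names for the illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return the index of the tag in TAGS that equals NAME exactly,
   or -1 if there is none.  */
static int
find_exact (const char *const *tags, size_t ntags, const char *name)
{
  for (size_t i = 0; i < ntags; i++)
    if (strcmp (tags[i], name) == 0)
      return (int) i;
  return -1;
}

/* Look NAME up as given; only if that fails, retry with the leading
   escape char toggled ("\\foo" <-> "foo").  Names that do not fit in
   the local buffer are truncated and therefore simply fail to match.  */
static int
find_with_fallback (const char *const *tags, size_t ntags, const char *name)
{
  int hit = find_exact (tags, ntags, name);
  if (hit >= 0)
    return hit;

  char alt[256];
  if (name[0] == '\\')
    snprintf (alt, sizeof alt, "%s", name + 1);
  else
    snprintf (alt, sizeof alt, "\\%s", name);
  return find_exact (tags, ntags, alt);
}
```

Because the first lookup wins whenever it succeeds, a table holding
both "cite" and "\cite" still resolves each form to its own entry; the
toggled form is consulted only for names that exist in one style alone.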

The simplest fix is to remove the escape char from all tag names, which I
suggest to users of ctags in some commented-out code in etags.c. This does
lose the ability to differentiate \def'ed commands and \csdef'd ones,
especially as in some circumstances they can have the same name.  I'm not
sure how great a loss that is, on the other hand.  Is that what you had in
mind?

> Or if we want fuzzier matching, perhaps creating mode-specific
> values of etags-xref-find-definitions-tag-order could help.

Yeah, you're right, I'm pretty sure I could use a buffer-local value of
that variable to get xref-find-definitions to do the fuzzy matching I'm
after.  Does the discussion above at all help to convince you that there
are other issues that might still require a new backend?

David.


* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-22 15:19         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-23  2:21           ` Dmitry Gutov
  2022-02-23 10:45             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry Gutov @ 2022-02-23  2:21 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749

Hi David,

On 22.02.2022 17:19, David Fussner wrote:

>  > Do you have a step-by-step scenario? Perhaps using one of the .texi
>  > manuals already existing in the repo?
> 
> I can't find a good example in the emacs repo, but I'll try to talk 
> through what happens with a code snippet from biblatex.sty, which I hope 
> will explain some of the issues we're discussing, even if it is a little 
> artificial.

Thank you.

> \DeclareBiblatexOption{global,type}[string]{uniquename}[true]{%
>    \ifcsdef{blx@opt@uniquename@#1}
>      {\letcs\blx@uniquename{blx@opt@uniquename@#1}}
>      {\blx@err@invopt{uniquename=#1}{}}}
> \def\blx@opt@uniquename@false{false}
> \def\blx@opt@uniquename@init{init}
> \def\blx@opt@uniquename@true{full}
> \def\blx@opt@uniquename@full{full}
> \def\blx@opt@uniquename@allinit{allinit}
> \def\blx@opt@uniquename@allfull{allfull}
> \def\blx@opt@uniquename@mininit{mininit}
> \def\blx@opt@uniquename@minfull{minfull}
> 
> If you do M-? on \ifcsdef{blx@opt@uniquename@#1} using the default 
> backend, the default search string is blx@opt@uniquename@, and you'll 
> get two hits, that line and the following one.  Stepping through 
> xref-references-in-directory shows that the semantic-symref search 
> (using grep) only finds those two using the :searchtype 'symbol, and 
> they're returned.  If you change 'symbol to 'regexp, grep finds all the 
> matches in that code snippet, but then xref--convert-hits uses (format 
> "\\_<%s\\_>"), which again loses all but the first two hits when it 
> scans the list provided by grep.  Either grep or emacs here will miss 
> out on valid hits unless you change both the semantic-symref 
> instantiation and the format specification.

That might call for a different implementation of 'references' indeed.

But could you make 'blx@opt@uniquename' the default search string in 
that example? Does that make sense?

And if not, all in all, I wouldn't worry too much about 
xref-find-references, since TeX is more of a text format (IMHO) than a 
program with well-defined identifiers. Perhaps using project-find-regexp 
most of the time will save you a lot of the trouble?

>  > One way to deal with that is to treat all user inputs as regexps
>  > there. Perhaps some will have to be more verbose than ideal, but as
>  > long as the user is familiar with the regexp syntax, the behavior
>  > will be both powerful and predictable.
> 
> If I understand you right, I think that's what I'm trying to do, but 
> allowing for users who perhaps aren't too familiar with emacs regexps 
> and who might typically just accept the default search string offered by 
> xref.

I'm not sure how I feel about the extra "fuzziness" in the behavior 
which comes with this approach.

>  > Could those be disambiguated when the tags are scanned, instead?
>  > Then the user will tailor their input to find the one or the other.
> 
> If I understand you correctly, that's also what I try to do -- each 
> tagged command in the tags file is searched by the name of the tag, 
> which in these cases will either start with the escape char or not.  
> Looking at the biblatex snippet, if you come across 
> \csuse{blx@opt@uniquename@false} somewhere in a file, and you want to 
> see what the definition is, you can't know a priori how it was defined, 
> with \def or with \csdef.  This snippet above mixes both styles, and I 
> hoped that a user would be allowed to choose whether to search for both 
> styles without necessarily having to try both forms of the string in 
> separate searches.  In fact, as the code stands, it only does the second 
> search if the first one fails, so it still more or less keeps the two 
> command-naming styles separate.

The parser could create both qualified (with \def or \csdef) and 
unqualified entries for the same definition. Maybe make it optional 
(with -Q argument to etags). Then the user could search using any of 
these formats.

>  > Or if we want fuzzier matching, perhaps creating mode-specific
>  > values of etags-xref-find-definitions-tag-order could help.
> 
> Yeah, you're right, I'm pretty sure I could use a buffer-local value of 
> that variable to get xref-find-definitions to do the fuzzy matching I'm 
> after. Does the discussion above at all help to convince you that there 
> are other issues that might still require a new backend?

The suggestion about a buffer-local value of that var was made in the 
context of trying to make it work with the current etags backend. At 
least, in the first patch. If only because I don't really like to see 
duplicated code.

If we find another place where we really want to diverge, we could also 
try adding some behavior-altering variable first.

After that, we might as well add a new backend (I'm not really against 
it, just prefer to exhaust other options first), but hopefully someone 
else (more familiar with tex-mode) could take over this discussion at 
that point, and the subsequent responsibility for the added code. That 
person could be yourself too, under the right conditions.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-23  2:21           ` Dmitry Gutov
@ 2022-02-23 10:45             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-24  2:23               ` Dmitry Gutov
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-23 10:45 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749

Hi Dmitry,

Thanks again for looking at all this, and for your patience.

On Wed, 23 Feb 2022 at 02:21, Dmitry Gutov <dgutov@yandex.ru> wrote:

>
> That might call for a different implementation of 'references' indeed.
>
> But could you make 'blx@opt@uniquename' the default search string in
> that example? Does that make sense?
>

I guess it might be possible to come up with a regexp to suppress the
@ in some positions in the string, but the bad news is that if you M-?
with that search string you get no results at all with the default
backend. Grep finds the same two as before, but the default format
specification eliminates even those.  So you're left looking at a
string in your buffer and xref is telling you it isn't there.

> And if not, all in all, I wouldn't worry too much about
> xref-find-references, since TeX is more of a text format (IMHO) than a
> program with well-defined identifiers. Perhaps using project-find-regexp
> most of the time will save you a lot of the trouble?
>

You're quite right that C-x p g works well in this instance, and I
tried to improve how thing-at-point finds search strings in TeX
buffers for this command.  I guess TeX is a little bit of a bad fit
both for text modes and for prog modes, but I confess I'm still uneasy
at the thought of M-? returning such misleading results.  What would
you think about putting project-find-regexp on M-? in TeX buffers?
That is, assuming I don't find reasonably common TeX constructs that
defeat it?

> > If I understand you right, I think that's what I'm trying to do, but
> > allowing for users who perhaps aren't too familiar with emacs regexps
> > and who might typically just accept the default search string offered by
> > xref.
>
> I'm not sure how I feel about the extra "fuzziness" in the behavior
> which comes with this approach.

I see your point here.

>
> The parser could create both qualified (with \def or \csdef) and
> unqualified entries for the same definition. Maybe make it optional
> (with -Q argument to etags). Then the user could search using any of
> these formats.
>

I guess we could make etags do some of the work, perhaps adding also a
distinction between tagged commands that require this duplication
(\def & \csdef) and those that don't (\chapter).  Aside from making
tags files a lot bigger, and possibly adding another option to a
program already overloaded with them -- neither of which is a
showstopper -- I suspect it could work pretty well for
xref-find-definitions.

>
> The suggestion about a buffer-local value of that var was made in the
> context of trying to make it work with the current etags backend. At
> least, in the first patch. If only because I don't really like to see
> duplicated code.
>
> If we find another place where we really want to diverge, we could also
> try adding some behavior-altering variable first.
>
> After that, we might as well add a new backend (I'm not really against
> it, just prefer to exhaust other options first), but hopefully someone
> else (more familiar with tex-mode) could take over this discussion at
> that point, and the subsequent responsibility for the added code. That
> person could be yourself too, under right conditions.

I certainly concur about duplicated code, and I really did try hard to
get by without a new backend, but I won't pretend that I exhausted all
or even nearly all of the possibilities. If I'm understanding you
correctly, you'd prefer a few, small changes to the backend code in
etags.el (and xref.el), should that be necessary, to a whole new
backend which limits changes to tex-mode.el.  If this understanding is
reasonably accurate, I can have another look at earlier iterations of
the code to see what I missed, and perhaps come up with something that
works right without so much duplication. It may well take me some
time, so apologies in advance for being slow.

David.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-23 10:45             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-24  2:23               ` Dmitry Gutov
  2022-02-24 13:15                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry Gutov @ 2022-02-24  2:23 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749

Hi David,

On 23.02.2022 12:45, David Fussner via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:

> I guess it might be possible to come up with a regexp to suppress the
> @ in some positions in the string, but the bad news is that if you M-?
> with that search string you get no results at all with the default
> backend. Grep finds the same two as before, but the default format
> specification eliminates even those.  So you're left looking at a
> string in your buffer and xref is telling you it isn't there.

That's odd. I've tried searching for 'blx@opt@uniquename' inside \...@, 
and 'grep -w' successfully finds it. Post-processing fails, apparently, 
but that depends on the contents of the syntax table. So one solution 
might be to update tex-mode's syntax table.

>> And if not, all in all, I wouldn't worry too much about
>> xref-find-references, since TeX is more of a text format (IMHO) than a
>> program with well-defined identifiers. Perhaps using project-find-regexp
>> most of the time will save you a lot of the trouble?
>>
> 
> You're quite right that C-x p g works well in this instance, and I
> tried to improve how thing-at-point finds search strings in TeX
> buffers for this command.  I guess TeX is a little bit of a bad fit
> both for text modes and for prog modes, but I confess I'm still uneasy
> at the thought of M-? returning such misleading results.  What would
> you think about putting project-find-regexp on M-? in TeX buffers?
> That is, assuming I don't find reasonably common TeX constructs that
> defeat it?

On the face of it, the suggestion seems odd (those commands' features 
and user expectations are different), but it wouldn't be out of the 
question to circle back to it later.

>> The parser could create both qualified (with \def or \csdef) and
>> unqualified entries for the same definition. Maybe make it optional
>> (with -Q argument to etags). Then the user could search using any of
>> these formats.
>>
> 
> I guess we could make etags do some of the work, perhaps adding also a
> distinction between tagged commands that require this duplication
> (\def & \csdef) and those that don't (\chapter).  Aside from making
> tags files a lot bigger, and possibly adding another option to a
> program already overloaded with them -- neither of which is a
> showstopper -- I suspect it could work pretty well for
> xref-find-definitions.

IIUC tag files for LaTeX aren't going to be particularly big anyway 
(book projects are almost always smaller than even a mid-sized software 
project), so the size might never be a problem.

But then again, I could be very wrong about that.

>> The suggestion about a buffer-local value of that var was made in the
>> context of trying to make it work with the current etags backend. At
>> least, in the first patch. If only because I don't really like to see
>> duplicated code.
>>
>> If we find another place where we really want to diverge, we could also
>> try adding some behavior-altering variable first.
>>
>> After that, we might as well add a new backend (I'm not really against
>> it, just prefer to exhaust other options first), but hopefully someone
>> else (more familiar with tex-mode) could take over this discussion at
>> that point, and the subsequent responsibility for the added code. That
>> person could be yourself too, under right conditions.
> 
> I certainly concur about duplicated code, and I really did try hard to
> get by without a new backend, but I won't pretend that I exhausted all
> or even nearly all of the possibilities. If I'm understanding you
> correctly, you'd prefer a few, small changes to the backend code in
> etags.el (and xref.el), should that be necessary, to a whole new
> backend which limits changes to tex-mode.el.  If this understanding is
> reasonably accurate, I can have another look at earlier iterations of
> the code to see what I missed, and perhaps come up with something that
> works right without so much duplication. It may well take me some
> time, so apologies in advance for being slow.

Yes, please.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-24  2:23               ` Dmitry Gutov
@ 2022-02-24 13:15                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-24 13:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749

Thanks Dmitry.  I'll post back here when I've got something.

David.

On Thu, 24 Feb 2022 at 02:23, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> Hi David,
>
> On 23.02.2022 12:45, David Fussner via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
>
> > I guess it might be possible to come up with a regexp to suppress the
> > @ in some positions in the string, but the bad news is that if you M-?
> > with that search string you get no results at all with the default
> > backend. Grep finds the same two as before, but the default format
> > specification eliminates even those.  So you're left looking at a
> > string in your buffer and xref is telling you it isn't there.
>
> That's odd. I've tried searching for 'blx@opt@uniquename' inside \...@,
> and 'grep -w' successfully finds it. Post-processing fails, apparently,
> but that depends on the contents of the syntax table. So one solution
> might be to update tex-mode's syntax table.
>
> >> And if not, all in all, I wouldn't worry too much about
> >> xref-find-references, since TeX is more of a text format (IMHO) than a
> >> program with well-defined identifiers. Perhaps using project-find-regexp
> >> most of the time will save you a lot of the trouble?
> >>
> >
> > You're quite right that C-x p g works well in this instance, and I
> > tried to improve how thing-at-point finds search strings in TeX
> > buffers for this command.  I guess TeX is a little bit of a bad fit
> > both for text modes and for prog modes, but I confess I'm still uneasy
> > at the thought of M-? returning such misleading results.  What would
> > you think about putting project-find-regexp on M-? in TeX buffers?
> > That is, assuming I don't find reasonably common TeX constructs that
> > defeat it?
>
> On the face of it, the suggestion seems odd (those commands' features
> and user expectations are different), but it wouldn't be out of the
> question to circle back to it later.
>
> >> The parser could create both qualified (with \def or \csdef) and
> >> unqualified entries for the same definition. Maybe make it optional
> >> (with -Q argument to etags). Then the user could search using any of
> >> these formats.
> >>
> >
> > I guess we could make etags do some of the work, perhaps adding also a
> > distinction between tagged commands that require this duplication
> > (\def & \csdef) and those that don't (\chapter).  Aside from making
> > tags files a lot bigger, and possibly adding another option to a
> > program already overloaded with them -- neither of which is a
> > showstopper -- I suspect it could work pretty well for
> > xref-find-definitions.
>
> IIUC tag files for LaTeX aren't going to be particularly big anyway
> (book projects are almost always smaller than even a mid-sized software
> project), so the size might never be a problem.
>
> But then again, I could be very wrong about that.
>
> >> The suggestion about a buffer-local value of that var was made in the
> >> context of trying to make it work with the current etags backend. At
> >> least, in the first patch. If only because I don't really like to see
> >> duplicated code.
> >>
> >> If we find another place where we really want to diverge, we could also
> >> try adding some behavior-altering variable first.
> >>
> >> After that, we might as well add a new backend (I'm not really against
> >> it, just prefer to exhaust other options first), but hopefully someone
> >> else (more familiar with tex-mode) could take over this discussion at
> >> that point, and the subsequent responsibility for the added code. That
> >> person could be yourself too, under the right conditions.
> >
> > I certainly concur about duplicated code, and I really did try hard to
> > get by without a new backend, but I won't pretend that I exhausted all
> > or even nearly all of the possibilities. If I'm understanding you
> > correctly, you'd prefer a few, small changes to the backend code in
> > etags.el (and xref.el), should that be necessary, to a whole new
> > backend which limits changes to tex-mode.el.  If this understanding is
> > reasonably accurate, I can have another look at earlier iterations of
> > the code to see what I missed, and perhaps come up with something that
> > works right without so much duplication. It may well take me some
> > time, so apologies in advance for being slow.
>
> Yes, please.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-03 15:09 bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-21  2:11 ` Dmitry Gutov
  2022-02-21 12:35 ` Arash Esbati
@ 2022-02-25 20:16 ` Augusto Stoffel
  2022-02-26  9:29   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 66+ messages in thread
From: Augusto Stoffel @ 2022-02-25 20:16 UTC (permalink / raw)
  To: 53749; +Cc: dfussner

Hi David,

I took a superficial look at this thread, and this seems very nice.

I was wondering why you want to be able to find the definition of macros
with @ in their name.  Those are "private" macros that the user
shouldn't have occasion to use.  Is it for a TeX programmer mode?

Let me also mention a library I wrote for analyzing TeX code (accessible
to Emacs via LSP):

    https://github.com/astoff/digestif

It's written in Lua (can run on the LuaTeX interpreter) and uses PEGs
for flexible parsing.  If you want to be very ambitious about what you
are able to parse, I think regexps are not sufficient.

Digestif can handle \cite{messed up reference} just fine, for example.

On Thu,  3 Feb 2022 at 15:09, David Fussner via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org> wrote:

> I've recently been trying to use xref commands with a tags table in a
> TeX repository, and many of the results are sub-optimal.  This is a
> known issue -- within living memory there have been at least two
> discussions related to it on help-gnu-emacs:
>
> https://lists.gnu.org/archive/html/help-gnu-emacs/2018-06/msg00126.html
> https://lists.gnu.org/archive/html/help-gnu-emacs/2021-07/msg00436.html
>
> Neither discussion resulted in any code, at least not that I can find,
> and the issues mentioned there remain.  For example,
> xref-find-definitions on, say, '\mycommand' returns
>
> No definitions found for: mycommand.
>
> (The absence of the escape char in the search string makes the search
> fail, as the tag name in the table will be '\mycommand'.)
>
> Similarly, any xref command on 'my:citekey' will only search by default
> for the half of the symbol under point, stopping at the colon.
>
> There are many other behaviors that are suboptimal, as well, so in the
> end I wrote a new xref backend for TeX buffers (cloning large portions
> of the default etags backend), and wondered whether it might be welcome
> in GNU Emacs.
>
> A few remarks:
>
> 1. The code should work as it stands both in the AUCTeX and the in-tree
> modes.  The AUCTeX hooks I've included in the patch are provisional, as
> I would want to discuss with them how they would want to handle it,
> should the patch be accepted in some form.
>
> 2. Along the way I found some issues with how etags parses TeX files,
> issues which affect the usefulness of the xref commands, so I've made
> changes in etags.c as well.  When running the test suite for etags the
> only diffs occurred in the TeX-related sections of the resulting tags
> file, and location information in those sections was good.
>
> 3. The patch as it stands enables all the changes by default to give
> what I judge to be the best out-of-the-box experience, but wiser heads
> may well have other ideas.
>
> 4. If it looks like the patch will make it into Emacs in some form, I'm
> going to need to assign copyright, so I'd appreciate help with getting
> that started.
>
> Thanks,
>
> David.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-25 20:16 ` Augusto Stoffel
@ 2022-02-26  9:29   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-26 10:56     ` Augusto Stoffel
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-26  9:29 UTC (permalink / raw)
  To: Augusto Stoffel; +Cc: 53749

Hi Augusto,

On Fri, 25 Feb 2022 at 20:16, Augusto Stoffel <arstoffel@gmail.com> wrote:
>
> Hi David,

>
> I took a superficial look at this thread, and this seems very nice.

Thanks!

>
> I was wondering why you want to be able to find the definition of macros
> with @ in their name.  Those are "private" macros that the user
> shouldn't have occasion to use.  Is it for a TeX programmer mode?

I confess that TeX developers are indeed one of the main targets for
the feature as I envisioned it.  For creating and following \labels,
\refs, and \cites (of all sorts) I find RefTeX very handy, as well as
for jumping around \chapters and \sections and the like.  What I miss
when developing are the code-navigation features of something like
xref, which are (from the user point of view) both simple and
powerful.  My modest goal was to make Emacs' extensive infrastructure
work a little better out of the box for TeX documents, especially for
styles and other collections of macros.

>
> Let me also mention a library I wrote for analyzing TeX code (accessible
> to Emacs via LSP):
>
>     https://github.com/astoff/digestif
>
> It's written in Lua (can run on the LuaTeX interpreter) and uses PEGs
> for flexible parsing.  If you want to be very ambitious about what you
> are able to parse, I think regexps are not sufficient.
>
> Digestif can handle \cite{messed up reference} just fine, for example.
>

This looks very nice indeed, and if I'm reading it right provides a
replacement both for RefTeX and for the code-navigation features I'm
trying to implement.  I figure I'll continue trying to get improved
out-of-the-box features into core, and if I manage to satisfy Dmitry
we'll then have a choice, but in any case I'm going to have a longer
look at digestif when I get some time.

Thanks for the hint!

David.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-26  9:29   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-26 10:56     ` Augusto Stoffel
  2022-02-27 18:42       ` Arash Esbati
  0 siblings, 1 reply; 66+ messages in thread
From: Augusto Stoffel @ 2022-02-26 10:56 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749

On Sat, 26 Feb 2022 at 09:29, David Fussner <dfussner@googlemail.com> wrote:

> Hi Augusto,
>
> On Fri, 25 Feb 2022 at 20:16, Augusto Stoffel <arstoffel@gmail.com> wrote:
>>
>> Hi David,
>
>>
>> I took a superficial look at this thread, and this seems very nice.
>
> Thanks!
>
>>
>> I was wondering why you want to be able to find the definition of macros
>> with @ in their name.  Those are "private" macros that the user
>> shouldn't have occasion to use.  Is it for a TeX programmer mode?
>
> I confess that TeX developers are indeed one of the main targets for
> the feature as I envisioned it.  For creating and following \labels,
> \refs, and \cites (of all sorts) I find RefTeX very handy, as well as
> for jumping around \chapters and \sections and the like.  What I miss
> when developing are the code-navigation features of something like
> xref, which are (from the user point of view) both simple and
> powerful.  My modest goal was to make Emacs' extensive infrastructure
> work a little better out of the box for TeX documents, especially for
> styles and other collections of macros.

Sorry for entering a tangent, but here's one more thing I dislike about
RefTeX you might want to consider.  If you type \label{something}, as
opposed to using the RefTeX command to add a label (or if you edit the
label by hand) then RefTeX will not reparse the document and will get out of
sync.  Or at least that was the case when I still used RefTeX.  So it
might be worth considering some cache invalidation scheme there.
(Digestif has caching for multifile documents, but parsing a single file
is fast enough that this is not a problem I need to worry about :-).)

>>
>> Let me also mention a library I wrote for analyzing TeX code (accessible
>> to Emacs via LSP):
>>
>>     https://github.com/astoff/digestif
>>
>> It's written in Lua (can run on the LuaTeX interpreter) and uses PEGs
>> for flexible parsing.  If you want to be very ambitious about what you
>> are able to parse, I think regexps are not sufficient.
>>
>> Digestif can handle \cite{messed up reference} just fine, for example.
>>
>
> This looks very nice indeed, and if I'm reading it right provides a
> replacement both for RefTeX and for the code-navigation features I'm
> trying to implement.

That's right.  Also command completion (including snippets, if that's
your thing) and Eldoc.

>  I figure I'll continue trying to get improved
> out-of-the-box features into core, and if I manage to satisfy Dmitry
> we'll then have a choice, but in any case I'm going to have a longer
> look at digestif when I get some time.

Let me mention one last thing, since you seem interested in a TeX
programming mode.

Digestif will not work great out of the box for programming because it
correctly considers @ to have catcode "other" (so it can't be part of
the name of a command).  But this is trivial to change and, in fact,
Digestif already has a "latex-prog" mode that simulates the correct
catcodes.  It would be easy to include a "latex-expl3" mode as well.

The problem is that there's no way for Emacs to communicate that one of
these programming modes is to be used.  This could be fixed in two ways:

A. by creating latex-prog and latex-expl3 derived modes in Emacs, or

B. adding heuristics to Digestif to decide if a given file is "document"
   or "code".

Do you have any thoughts about A?  Would there be any other benefits in
Emacs to justify the latex-prog and latex-expl3 major modes?  It seems
that (at least in AUCTeX) @ is always considered a letter, which may be
innocuous but is kinda wrong.
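
To make option A concrete, a minimal sketch of what such derived modes
could look like on top of the in-tree latex-mode.  The mode names are
hypothetical and the bodies are left empty; the only point is that an
external tool could tell "document" from "code" by the major mode:

```elisp
(require 'tex-mode)

;; Hypothetical mode names, used for illustration only.
(define-derived-mode my/latex-prog-mode latex-mode "LaTeX-prog"
  "Major mode for LaTeX package code (illustrative sketch).")

(define-derived-mode my/latex-expl3-mode my/latex-prog-mode "LaTeX-expl3"
  "Major mode for expl3 code (illustrative sketch).")
```

Because both modes derive from latex-mode, `derived-mode-p' would still
report them as LaTeX buffers, so existing latex-mode hooks keep running.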

>
> Thanks for the hint!
>
> David.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-26 10:56     ` Augusto Stoffel
@ 2022-02-27 18:42       ` Arash Esbati
  2022-02-28  9:09         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Arash Esbati @ 2022-02-27 18:42 UTC (permalink / raw)
  To: Augusto Stoffel; +Cc: 53749, David Fussner

Augusto Stoffel <arstoffel@gmail.com> writes:

> If you type \label{something}, as opposed to using the RefTeX command
> to add a label (or if you edit the label by hand) then RefTeX will not
> reparse the document and will get out of sync.

If you know that the labels known to RefTeX are out of sync, you can issue
`C-c )' with a prefix argument:

,----[ C-h f reftex-reference RET ]
| reftex-reference is an interactive native compiled Lisp function in
| ‘reftex-ref.el’.
| 
| (reftex-reference &optional TYPE NO-INSERT CUT)
| 
| Make a LaTeX reference.  Look only for labels of a certain TYPE.
| With prefix arg, force to rescan buffer for labels.  This should only be
| necessary if you have recently entered labels yourself without using
| reftex-label.  Rescanning of the buffer can also be requested from the
| label selection menu.
| The function returns the selected label or nil.
| If NO-INSERT is non-nil, do not insert \ref command, just return label.
| When called with 2 C-u prefix args, disable magic word recognition.
| 
|   Probably introduced at or before Emacs version 20.1.
| 
`----

Or in the labels *RefTeX select* buffer, you have these choices:

 r / C-u r  Reparse document / Reparse entire document.

I usually hit r when I don't find the label I'm looking for.
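
For anyone who wants the rescanning variant as a command of its own, a
minimal sketch.  The name `my/reftex-reference-rescan' is made up for
illustration; only `reftex-reference' comes from RefTeX:

```elisp
(require 'reftex)

;; Hypothetical convenience command, not part of RefTeX.
(defun my/reftex-reference-rescan ()
  "Call `reftex-reference' as if it had been given one C-u prefix argument."
  (interactive)
  ;; Binding `current-prefix-arg' is the usual way to simulate a prefix
  ;; argument when invoking a command from Lisp.
  (let ((current-prefix-arg '(4)))
    (call-interactively #'reftex-reference)))
```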

> Or at least that was the case when I still used RefTeX.  So it might
> be worth considering some cache invalidation scheme there.

The question is if it's worth the effort where a remedy is already in
place.

Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-27 18:42       ` Arash Esbati
@ 2022-02-28  9:09         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-28 11:54           ` Arash Esbati
  2022-02-28 13:05           ` Augusto Stoffel
  0 siblings, 2 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-28  9:09 UTC (permalink / raw)
  To: Arash Esbati; +Cc: 53749, Augusto Stoffel

Hi Augusto,

For what it's worth, I've always just done what Arash suggests when
RefTeX gets out of sync, and haven't had any issues with it that I can
remember.  (To be fair, my use cases haven't exactly been exotic.)

> The problem is that there's no way for Emacs to communicate that one of
> these programming modes is to be used.  This could be fixed in two ways:
>
> A. by creating latex-prog and latex-expl3 derived modes in Emacs, or
>
> B. adding heuristics to Digestif to decide if a given file is "document"
>    or "code".
>
> Do you have any thoughts about A?  Would there be any other benefits in
> Emacs to justify the latex-prog and latex-expl3 major modes?  It seems
> that (at least in AUCTeX) @ is always considered a letter, which may be
> innocuous but is kinda wrong.

The only thought I have is that it sounds like a new major mode would
be overkill for what you need here.  I would think that a variable or
defcustom might do the trick, or at most maybe a minor mode?  When
navigating code I really want to be able to follow the commands to
their source no matter whether the command is internal or for users,
though I can see how in a code-completion setting you might want to be
able to separate the two more cleanly.  Obviously, I'm not the person
you need to convince about all of this -- that would be Arash and the
Emacs maintainers themselves.

Best,

David.

On Sun, 27 Feb 2022 at 18:43, Arash Esbati <arash@gnu.org> wrote:
>
> Augusto Stoffel <arstoffel@gmail.com> writes:
>
> > If you type \label{something}, as opposed to using the RefTeX command
> > to add a label (or if you edit the label by hand) then RefTeX will not
> > reparse the document and will get out of sync.
>
> If you know that the labels known to RefTeX are out of sync, you can issue
> `C-c )' with a prefix argument:
>
> ,----[ C-h f reftex-reference RET ]
> | reftex-reference is an interactive native compiled Lisp function in
> | ‘reftex-ref.el’.
> |
> | (reftex-reference &optional TYPE NO-INSERT CUT)
> |
> | Make a LaTeX reference.  Look only for labels of a certain TYPE.
> | With prefix arg, force to rescan buffer for labels.  This should only be
> | necessary if you have recently entered labels yourself without using
> | reftex-label.  Rescanning of the buffer can also be requested from the
> | label selection menu.
> | The function returns the selected label or nil.
> | If NO-INSERT is non-nil, do not insert \ref command, just return label.
> | When called with 2 C-u prefix args, disable magic word recognition.
> |
> |   Probably introduced at or before Emacs version 20.1.
> |
> `----
>
> Or in the labels *RefTeX select* buffer, you have these choices:
>
>  r / C-u r  Reparse document / Reparse entire document.
>
> I usually hit r when I don't find the label I'm looking for.
>
> > Or at least that was the case when I still used RefTeX.  So it might
> > be worth considering some cache invalidation scheme there.
>
> The question is if it's worth the effort where a remedy is already in
> place.
>
> Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-28  9:09         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-28 11:54           ` Arash Esbati
  2022-02-28 13:11             ` Augusto Stoffel
  2022-02-28 13:05           ` Augusto Stoffel
  1 sibling, 1 reply; 66+ messages in thread
From: Arash Esbati @ 2022-02-28 11:54 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, Augusto Stoffel

David Fussner <dfussner@googlemail.com> writes:

>> The problem is that there's no way for Emacs to communicate that one of
>> these programming modes is to be used.  This could be fixed in two ways:
>>
>> A. by creating latex-prog and latex-expl3 derived modes in Emacs, or
>>
>> B. adding heuristics to Digestif to decide if a given file is "document"
>>    or "code".
>>
>> Do you have any thoughts about A?  Would there be any other benefits in
>> Emacs to justify the latex-prog and latex-expl3 major modes?  It seems
>> that (at least in AUCTeX) @ is always considered a letter, which may be
>> innocuous but is kinda wrong.
>
> The only thought I have is that it sounds like a new major mode would
> be overkill for what you need here.  I would think that a variable or
> defcustom might do the trick, or at most maybe a minor mode?

Sorry if I'm missing something here, I wasn't tracking this thread.  But
does doctex-mode (or docTeX in AUCTeX) fit the bill here?

Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-28  9:09         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-28 11:54           ` Arash Esbati
@ 2022-02-28 13:05           ` Augusto Stoffel
  1 sibling, 0 replies; 66+ messages in thread
From: Augusto Stoffel @ 2022-02-28 13:05 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, Arash Esbati

On Mon, 28 Feb 2022 at 09:09, David Fussner <dfussner@googlemail.com> wrote:

> For what it's worth, I've always just done what Arash suggests when
> RefTeX gets out of sync, and haven't had any issues with it that I can
> remember.  (To be fair, my use cases haven't exactly been exotic.)

Sure, I'm aware you have to do the manual resync when using RefTeX.  I
just think it's a totally unnecessary hurdle in this day and age.  I
used to advise the RefTeX commands so they would reparse the document
every time, and this worked just fine.  (Granted, I never worked on
anything over 100 pages or so, but it should also be possible to reparse
individual files of a multifile project so that the file sizes never
become an issue.)
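
A rough sketch of that kind of advice, for illustration only.  It is not
necessarily the code referred to above: the helper name
`my/reftex-rescan-first' is hypothetical, and it assumes that reparsing
just the current file with `reftex-parse-one' is enough:

```elisp
(require 'reftex)

;; Hypothetical helper, not part of RefTeX.
(defun my/reftex-rescan-first (&rest _args)
  "Reparse the current file so RefTeX notices hand-typed labels."
  (when (bound-and-true-p reftex-mode)
    (reftex-parse-one)))

;; Run the rescan before every `reftex-reference' call.
(advice-add 'reftex-reference :before #'my/reftex-rescan-first)
```

The advice can be removed again with
(advice-remove 'reftex-reference #'my/reftex-rescan-first).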

>> The problem is that there's no way for Emacs to communicate that one of
>> these programming modes is to be used.  This could be fixed in two ways:
>>
>> A. by creating latex-prog and latex-expl3 derived modes in Emacs, or
>>
>> B. adding heuristics to Digestif to decide if a given file is "document"
>>    or "code".
>>
>> Do you have any thoughts about A?  Would there be any other benefits in
>> Emacs to justify the latex-prog and latex-expl3 major modes?  It seems
>> that (at least in AUCTeX) @ is always considered a letter, which may be
>> innocuous but is kinda wrong.
>
> The only thought I have is that it sounds like a new major mode would
> be overkill for what you need here.  I would think that a variable or
> defcustom might do the trick, or at most maybe a minor mode?  When
> navigating code I really want to be able to follow the commands to
> their source no matter whether the command is internal or for users,
> though I can see how in a code-completion setting you might want to be
> able to separate the two more cleanly.  Obviously, I'm not the person
> you need to convince about all of this -- that would be Arash and the
> Emacs maintainers themselves.

Okay, thanks for your insight.

>
> Best,
>
> David.
>
> On Sun, 27 Feb 2022 at 18:43, Arash Esbati <arash@gnu.org> wrote:
>>
>> Augusto Stoffel <arstoffel@gmail.com> writes:
>>
>> > If you type \label{something}, as opposed to using the RefTeX command
>> > to add a label (or if you edit the label by hand) then RefTeX will not
>> > reparse the document and will get out of sync.
>>
>> If you know that the labels known to RefTeX are out of sync, you can issue
>> `C-c )' with a prefix argument:
>>
>> ,----[ C-h f reftex-reference RET ]
>> | reftex-reference is an interactive native compiled Lisp function in
>> | ‘reftex-ref.el’.
>> |
>> | (reftex-reference &optional TYPE NO-INSERT CUT)
>> |
>> | Make a LaTeX reference.  Look only for labels of a certain TYPE.
>> | With prefix arg, force to rescan buffer for labels.  This should only be
>> | necessary if you have recently entered labels yourself without using
>> | reftex-label.  Rescanning of the buffer can also be requested from the
>> | label selection menu.
>> | The function returns the selected label or nil.
>> | If NO-INSERT is non-nil, do not insert \ref command, just return label.
>> | When called with 2 C-u prefix args, disable magic word recognition.
>> |
>> |   Probably introduced at or before Emacs version 20.1.
>> |
>> `----
>>
>> Or in the labels *RefTeX select* buffer, you have these choices:
>>
>>  r / C-u r  Reparse document / Reparse entire document.
>>
>> I usually hit r when I don't find the label I'm looking for.
>>
>> > Or at least that was the case when I still used RefTeX.  So it might
>> > be worth considering some cache invalidation scheme there.
>>
>> The question is if it's worth the effort where a remedy is already in
>> place.
>>
>> Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-28 11:54           ` Arash Esbati
@ 2022-02-28 13:11             ` Augusto Stoffel
  2022-02-28 19:04               ` Arash Esbati
  0 siblings, 1 reply; 66+ messages in thread
From: Augusto Stoffel @ 2022-02-28 13:11 UTC (permalink / raw)
  To: Arash Esbati; +Cc: 53749, David Fussner

On Mon, 28 Feb 2022 at 12:54, Arash Esbati <arash@gnu.org> wrote:

> Sorry if I'm missing something here, I wasn't tracking this thread.  But
> does doctex-mode (or docTeX in AUCTeX) fit the bill here?

Ah, I forgot about that one.  I mean basically that, but for files like
plain.tex or tikz.code.tex; or also when writing a .sty file directly
for personal purposes only.

But since tex-mode and derived ones always pretend @ is a letter, I
guess there's no real need for a dedicated TeX programming mode.

Now, how about expl3 code, where _ and : are letters too?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-28 13:11             ` Augusto Stoffel
@ 2022-02-28 19:04               ` Arash Esbati
  2022-03-01  8:46                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Arash Esbati @ 2022-02-28 19:04 UTC (permalink / raw)
  To: Augusto Stoffel; +Cc: 53749, David Fussner

Augusto Stoffel <arstoffel@gmail.com> writes:

> Now, how about expl3 code, where _ and : are letters too?

AUCTeX has a style file expl3.el[1] which changes the syntax for "_" and
":".  Can't tell about the builtin tex/latex-mode.

Best, Arash

Footnotes:
[1]  http://git.savannah.gnu.org/cgit/auctex.git/tree/style/expl3.el






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-28 19:04               ` Arash Esbati
@ 2022-03-01  8:46                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-03-01  8:46 UTC (permalink / raw)
  To: Arash Esbati; +Cc: 53749, Augusto Stoffel

Unless I'm missing something, the in-tree code hasn't (yet) made any
syntax changes for expl3.
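
In the meantime, something like the following could approximate it in
the in-tree mode.  This is an untested sketch under assumptions: the
function name is made up, it is meant to be run by hand (M-x) in
buffers holding expl3 code, and whether "_" and ":" should be symbol or
word constituents is a judgment call:

```elisp
;; Hypothetical command, not part of tex-mode.el.
(defun my/latex-expl3-syntax ()
  "Treat `_' and `:' as symbol constituents in the current buffer only."
  (interactive)
  ;; Work on a child of the current syntax table so that other
  ;; buffers sharing the mode's table are left untouched.
  (let ((table (make-syntax-table (syntax-table))))
    (modify-syntax-entry ?_ "_" table)
    (modify-syntax-entry ?: "_" table)
    (set-syntax-table table)))
```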

Best,

David.

On Mon, 28 Feb 2022 at 19:04, Arash Esbati <arash@gnu.org> wrote:
>
> Augusto Stoffel <arstoffel@gmail.com> writes:
>
> > Now, how about expl3 code, where _ and : are letters too?
>
> AUCTeX has a style file expl3.el[1] which changes the syntax for "_" and
> ":".  Can't tell about the builtin tex/latex-mode.
>
> Best, Arash
>
> Footnotes:
> [1]  http://git.savannah.gnu.org/cgit/auctex.git/tree/style/expl3.el






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-02-21  2:11 ` Dmitry Gutov
  2022-02-21  9:48   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-09-08 13:25   ` Lars Ingebrigtsen
  2022-09-08 13:34     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 66+ messages in thread
From: Lars Ingebrigtsen @ 2022-09-08 13:25 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749, David Fussner

Dmitry Gutov <dgutov@yandex.ru> writes:

> Let us first discuss whether we could make do without an additional
> Xref backend. Just to make sure.

(I'm going through old bug reports that unfortunately weren't resolved
at the time.)

I've only skimmed this bug report, so I might well have missed
something.  Was there a conclusion here as to what should be done?  It
looks like useful functionality to me (but it's been years since I've
written tex-y stuff).

In any case, if this is to be applied, we'd need to have a copyright
assignment to the FSF on file.  David, would you be willing to sign
that?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-09-08 13:25   ` Lars Ingebrigtsen
@ 2022-09-08 13:34     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-09-08 13:39       ` Lars Ingebrigtsen
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-08 13:34 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 53749, Dmitry Gutov

Hi Lars,

The conclusion at the time was that the patch needed reworking before
Dmitry was happy with it, and I've not yet found enough time to do so,
though I'm still fully intending to make the necessary changes. Please
leave the bug open so I can restart the conversation when I have a
better patch. (Oh, and I'm more than happy to sign the copyright
assignment whenever Dmitry judges the patch to be ready.)

Thanks for the reminder.

David.

On Thu, 8 Sept 2022 at 14:25, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> Dmitry Gutov <dgutov@yandex.ru> writes:
>
> > Let us first discuss whether we could make do without an additional
> > Xref backend. Just to make sure.
>
> (I'm going through old bug reports that unfortunately weren't resolved
> at the time.)
>
> I've only skimmed this bug report, so I might well have missed
> something.  Was there a conclusion here as to what should be done?  It
> looks like useful functionality to me (but it's been years since I've
> written tex-y stuff).
>
> In any case, if this is to be applied, we'd need to have a copyright
> assignment to the FSF on file.  David, would you be willing to sign
> that?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-09-08 13:34     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-09-08 13:39       ` Lars Ingebrigtsen
  2022-09-08 15:50         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Lars Ingebrigtsen @ 2022-09-08 13:39 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, Dmitry Gutov

David Fussner <dfussner@googlemail.com> writes:

> The conclusion at the time was that the patch needed reworking before
> Dmitry was happy with it, and I've not yet found enough time to do so,
> though I'm still fully intending to make the necessary changes. Please
> leave the bug open so I can restart the conversation when I have a
> better patch.

Of course.

> (Oh, and I'm more than happy to sign the copyright
> assignment whenever Dmitry judges the patch to be ready.)

Here's the form to get started:


Please email the following information to assign@gnu.org, and we
will send you the assignment form for your past and future changes.

Please use your full legal name (in ASCII characters) as the subject
line of the message.
----------------------------------------------------------------------
REQUEST: SEND FORM FOR PAST AND FUTURE CHANGES

[What is the name of the program or package you're contributing to?]
Emacs

[Did you copy any files or text written by someone else in these changes?
Even if that material is free software, we need to know about it.]

[Do you have an employer who might have a basis to claim to own
your changes?  Do you attend a school which might make such a claim?]

[For the copyright registration, what country are you a citizen of?]

[What year were you born?]

[Please write your email address here.]

[Please write your postal address here.]

[Which files have you changed so far, and which new files have you written
so far?]






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-09-08 13:39       ` Lars Ingebrigtsen
@ 2022-09-08 15:50         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-03  9:08           ` Stefan Kangas
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-09-08 15:50 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 53749, Dmitry Gutov

Thanks Lars, will do.

On Thu, 8 Sept 2022 at 14:39, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> David Fussner <dfussner@googlemail.com> writes:
>
> > The conclusion at the time was that the patch needed reworking before
> > Dmitry was happy with it, and I've not yet found enough time to do so,
> > though I'm still fully intending to make the necessary changes. Please
> > leave the bug open so I can restart the conversation when I have a
> > better patch.
>
> Of course.
>
> > (Oh, and I'm more than happy to sign the copyright
> > assignment whenever Dmitry judges the patch to be ready.)
>
> Here's the form to get started:
>
>
> Please email the following information to assign@gnu.org, and we
> will send you the assignment form for your past and future changes.
>
> Please use your full legal name (in ASCII characters) as the subject
> line of the message.
> ----------------------------------------------------------------------
> REQUEST: SEND FORM FOR PAST AND FUTURE CHANGES
>
> [What is the name of the program or package you're contributing to?]
> Emacs
>
> [Did you copy any files or text written by someone else in these changes?
> Even if that material is free software, we need to know about it.]
>
> [Do you have an employer who might have a basis to claim to own
> your changes?  Do you attend a school which might make such a claim?]
>
> [For the copyright registration, what country are you a citizen of?]
>
> [What year were you born?]
>
> [Please write your email address here.]
>
> [Please write your postal address here.]
>
> [Which files have you changed so far, and which new files have you written
> so far?]






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2022-09-08 15:50         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-03  9:08           ` Stefan Kangas
  2023-09-03 10:03             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Kangas @ 2023-09-03  9:08 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, Lars Ingebrigtsen, Dmitry Gutov

David Fussner <dfussner@googlemail.com> writes:

> Thanks Lars, will do.
>
> On Thu, 8 Sept 2022 at 14:39, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>>
>> David Fussner <dfussner@googlemail.com> writes:
>>
>> > The conclusion at the time was that the patch needed reworking before
>> > Dmitry was happy with it, and I've not yet found enough time to do so,
>> > though I'm still fully intending to make the necessary changes. Please
>> > leave the bug open so I can restart the conversation when I have a
>> > better patch.
>>
>> Of course.

That was a year ago.  Have you made any progress here?






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-03  9:08           ` Stefan Kangas
@ 2023-09-03 10:03             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-03 10:46               ` Stefan Kangas
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-03 10:03 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: 53749, Lars Ingebrigtsen, Dmitry Gutov

[-- Attachment #1: Type: text/plain, Size: 891 bytes --]

Hi Stefan

Thanks for the nudge. I do in fact have a patch that I'm just about finding
time to test, so I'll try to get it to the list within a week or two.

Thanks, and best,

David.

On Sun, 3 Sept 2023, 10:08 Stefan Kangas, <stefankangas@gmail.com> wrote:

> David Fussner <dfussner@googlemail.com> writes:
>
> > Thanks Lars, will do.
> >
> > On Thu, 8 Sept 2022 at 14:39, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> >>
> >> David Fussner <dfussner@googlemail.com> writes:
> >>
> >> > The conclusion at the time was that the patch needed reworking before
> >> > Dmitry was happy with it, and I've not yet found enough time to do so,
> >> > though I'm still fully intending to make the necessary changes. Please
> >> > leave the bug open so I can restart the conversation when I have a
> >> > better patch.
> >>
> >> Of course.
>
> That was a year ago.  Have you made any progress here?
>

[-- Attachment #2: Type: text/html, Size: 1693 bytes --]


* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-03 10:03             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-03 10:46               ` Stefan Kangas
  2023-09-13 11:10                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Kangas @ 2023-09-03 10:46 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, Lars Ingebrigtsen, Dmitry Gutov

David Fussner <dfussner@googlemail.com> writes:

> Thanks for the nudge. I do in fact have a patch that I'm just about finding
> time to test, so I'll try to get it to the list within a week or two.

Sounds good, and thank you.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-03 10:46               ` Stefan Kangas
@ 2023-09-13 11:10                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-13 13:42                   ` Stefan Kangas
  2023-09-13 15:23                   ` Dmitry Gutov
  0 siblings, 2 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-13 11:10 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: 53749, Lars Ingebrigtsen, Dmitry Gutov

[-- Attachment #1: Type: text/plain, Size: 1592 bytes --]

Hi Dmitry,

I've belatedly found some time to get the xref commands working better
in TeX buffers, this time using the default etags backend, as you
requested last year.  The basic strategy remains the same -- create a
new thing-at-point argument "texsymbol" which replaces "symbol" in a
definable set of major modes, then pass the resulting search term to
xref.  Changes in etags.c ensure that the various TeX modes and the
tags tables are cooperating with each other, and I added a new option
to etags (--tex-alt-forms) to handle some of the complexities of the
TeX escape character (as you suggested).  I also manipulate some
variables buffer-locally to make things like project-find-regexp and
isearch-forward-thing-at-point work better in such buffers.
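
As a rough illustration of the --tex-alt-forms idea described above, here
is a minimal sketch; it is not the code in the attached patch.  The helper
names sketch_tex_tag_name and sketch_with_escape are hypothetical, and the
sketch assumes '{' and '}' are the grouping characters.  It skips a leading
optional argument, strips one leading escape character to get the bare tag
name, and then rebuilds the escaped form, so that a tags table could carry
both names for the same definition.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of etags.c.  CP points just past a
   definition command such as "\newcommand".  Copy the bare tag name,
   without the escape character ESC, into OUT and return its length.
   Assumes '{' opens a group and '}' closes it.  */
static size_t
sketch_tex_tag_name (const char *cp, char esc, char *out, size_t out_size)
{
  size_t len = 0;

  assert (out_size > 0);

  while (isspace ((unsigned char) *cp))
    cp++;

  /* Skip a leading optional argument such as "[Short]" so that it
     does not become the tag name.  */
  if (*cp == '[' || *cp == '(')
    while (*cp != '{' && *cp != '\0')
      cp++;

  if (*cp == '{')
    cp++;

  /* Strip one leading escape character; it is added back later for
     the alternative form of the name.  */
  if (*cp == esc)
    cp++;

  /* The name ends at whitespace, at a second escape character, or at
     one of the delimiters below.  */
  while (*cp != '\0'
         && !isspace ((unsigned char) *cp)
         && strchr ("#=[({}", *cp) == NULL
         && *cp != esc
         && len + 1 < out_size)
    out[len++] = *cp++;

  out[len] = '\0';
  return len;
}

/* Hypothetical helper: build the escaped form of BARE in OUT.
   Return its length, or 0 if OUT is too small.  */
static size_t
sketch_with_escape (const char *bare, char esc, char *out, size_t out_size)
{
  int n = snprintf (out, out_size, "%c%s", esc, bare);

  return (n < 0 || (size_t) n >= out_size) ? 0 : (size_t) n;
}
```

Given the text {\mycommand}[1]{#1} and '\' as the escape character, the
first helper yields mycommand and the second rebuilds \mycommand.  The
real etags.c covers more cases than this sketch, for example the '!'
escape character with '<>' grouping.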

I attach a patch against current master. There is another patch which
contains changes to the test suite in test/manual/etags, but I'll
leave that one in case the changes I've made to etags.c need further
work.

I've sent patches to AUCTeX trying to fix a couple of issues there
with xref-find-references. There's more work to be done on related
issues in tex-mode.el, too, but this patch is a start.

Thanks,

David.

P.S. I'm also starting the copyright assignment process, in case these
changes prove acceptable.

On Sun, 3 Sept 2023 at 11:46, Stefan Kangas <stefankangas@gmail.com> wrote:
>
> David Fussner <dfussner@googlemail.com> writes:
>
> > Thanks for the nudge. I do in fact have a patch that I'm just about finding
> > time to test, so I'll try to get it to the list within a week or two.
>
> Sounds good, and thank you.

[-- Attachment #2: 0001-Fix-behavior-of-xref-commands-in-TeX-buffers.patch --]
[-- Type: text/x-patch, Size: 23126 bytes --]

From d5a77bd1dc45e0638df3e4c763a168912c93b5b5 Mon Sep 17 00:00:00 2001
From: David Fussner <dfussner@googlemail.com>
Date: Wed, 13 Sep 2023 11:59:54 +0100
Subject: [PATCH] Fix behavior of xref commands in TeX buffers

* lib-src/etags.c (longopts): Add new option --tex-alt-forms.
(TeX_commands): Improve parsing of commands in TeX buffers.
(TEX_defenv): Expand list of commands to tag by default in TeX
buffers.
(TeX_help):
* doc/emacs/maintaining.texi (Tag Syntax): Document new tagged
commands and new user option.
(Identifier Search): Add note about auto-mode-alist and
xref-find-references.

* lisp/textmodes/tex-mode.el (tex-common-initialization): Set up xref
modifications for in-tree TeX modes.
(tex-thingatpt-modes-list): New var.
(tex-thingatpt-is-texsymbol): New defcustom.
(tex-set-thingatpt-symbol): New command to apply value of previous
buffer-locally.
(tex--symbol-or-texsymbol): New helper function for previous.
(tex--thing-at-point): New function to return texsymbol
'thing-at-point'.
(tex-thingatpt--beginning-of-texsymbol)
(tex-thingatpt--end-of-texsymbol): New functions to define texsymbol
"thing" for 'thing-at-point'.
(tex-thingatpt-syntax-table, tex-escape-char): New vars to do the
same.
(tex-thingatpt-include-escape): New defcustom to refine behavior of
previous.
(tex--include-escape-p): New function to do the same.
(tex-thingatpt-syntax-table): New function to access and modify the
syntax table of the same name.
---
 doc/emacs/maintaining.texi |  33 +++++-
 lib-src/etags.c            | 122 ++++++++++++++++++---
 lisp/textmodes/tex-mode.el | 216 +++++++++++++++++++++++++++++++++++++
 3 files changed, 357 insertions(+), 14 deletions(-)

diff --git a/doc/emacs/maintaining.texi b/doc/emacs/maintaining.texi
index a95335f3df2..44b8b304026 100644
--- a/doc/emacs/maintaining.texi
+++ b/doc/emacs/maintaining.texi
@@ -2457,6 +2457,13 @@ Identifier Search
 referenced.  The XREF mode commands are available in this buffer, see
 @ref{Xref Commands}.
 
+When invoked in a buffer whose major mode uses the @code{etags}
+backend, @kbd{M-?} searches files and buffers whose major mode matches
+that of the original buffer.  It guesses that mode from file
+extensions, so if @kbd{M-?} seems to be skipping relevant buffers or
+files, try customizing the variable @code{auto-mode-alist} to include
+the missing extensions (@pxref{Choosing Modes}).
+
 @vindex xref-auto-jump-to-first-xref
   If the value of the variable @code{xref-auto-jump-to-first-xref} is
 @code{t}, @code{xref-find-references} automatically jumps to the first
@@ -2672,8 +2679,23 @@ Tag Syntax
 @code{\section}, @code{\subsection}, @code{\subsubsection},
 @code{\eqno}, @code{\label}, @code{\ref}, @code{\cite},
 @code{\bibitem}, @code{\part}, @code{\appendix}, @code{\entry},
-@code{\index}, @code{\def}, @code{\newcommand}, @code{\renewcommand},
-@code{\newenvironment} and @code{\renewenvironment} are tags.
+@code{\index}, @code{\def}, @code{\edef}, @code{\gdef}, @code{\xdef},
+@code{\newcommand}, @code{\renewcommand}, @code{\newenvironment},
+@code{\renewenvironment}, @code{\DeclareRobustCommand},
+@code{\newrobustcmd}, @code{\renewrobustcmd}, @code{\providecommand},
+@code{\providerobustcmd}, @code{\NewDocumentCommand},
+@code{\RenewDocumentCommand}, @code{\ProvideDocumentCommand},
+@code{\DeclareDocumentCommand}, @code{\NewExpandableDocumentCommand},
+@code{\RenewExpandableDocumentCommand},
+@code{\ProvideExpandableDocumentCommand},
+@code{\DeclareExpandableDocumentCommand},
+@code{\NewDocumentEnvironment}, @code{\RenewDocumentEnvironment},
+@code{\ProvideDocumentEnvironment},
+@code{\DeclareDocumentEnvironment}, @code{\csdef}, @code{\csedef},
+@code{\csgdef}, @code{\csxdef}, @code{\csletcs}, @code{\cslet},
+@code{\letcs}, and @code{\let} are tags.  So too are the arguments of
+any starred variants of these commands, when such variants currently
+exist.
 
 Other commands can make tags as well, if you specify them in the
 environment variable @env{TEXTAGS} before invoking @command{etags}.  The
@@ -2689,6 +2711,13 @@ Tag Syntax
 specifies (using Bourne shell syntax) that the commands
 @samp{\mycommand} and @samp{\myothercommand} also define tags.
 
+The @samp{--tex-alt-forms} option causes each tag to have two names,
+one with and one without the @TeX{} escape character, usually
+@samp{\}.  This may be helpful when mixing traditional @TeX{} or
+@LaTeX{} constructs (@samp{\def}) with newer constructs from the
+@samp{etoolbox} package (@samp{\csdef}).  Use of this option will
+double the size of any @TeX{}-related sections in your tags file.
+
 @item
 In Lisp code, any function defined with @code{defun}, any variable
 defined with @code{defvar} or @code{defconst}, and in general the
diff --git a/lib-src/etags.c b/lib-src/etags.c
index 147ecbd7c1b..3a6682fe451 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -475,6 +475,7 @@ #define xrnew(op, n, m) ((op) = xnrealloc (op, n, (m) * sizeof *(op)))
 static bool ignoreindent;	/* -I: ignore indentation in C */
 static int packages_only;	/* --packages-only: in Ada, only tag packages*/
 static int class_qualify;	/* -Q: produce class-qualified tags in C++/Java */
+static int tex_alt_forms;       /* --tex-alt-forms: tag names w/ and w/o escape */
 static int debug;		/* --debug */
 
 /* STDIN is defined in LynxOS system headers */
@@ -509,6 +510,7 @@ #define STDIN 0x1001		/* returned by getopt_long on --parse-stdin */
   { "no-regex",           no_argument,       NULL,               'R'   },
   { "ignore-case-regex",  required_argument, NULL,               'c'   },
   { "parse-stdin",        required_argument, NULL,               STDIN },
+  { "tex-alt-forms",      no_argument,       &tex_alt_forms,     1     },
   { "version",            no_argument,       NULL,               'V'   },
 
 #if CTAGS /* Ctags options */
@@ -792,12 +794,28 @@ #define STDIN 0x1001		/* returned by getopt_long on --parse-stdin */
 "In LaTeX text, the argument of any of the commands '\\chapter',\n\
 '\\section', '\\subsection', '\\subsubsection', '\\eqno', '\\label',\n\
 '\\ref', '\\cite', '\\bibitem', '\\part', '\\appendix', '\\entry',\n\
-'\\index', '\\def', '\\newcommand', '\\renewcommand',\n\
-'\\newenvironment' or '\\renewenvironment' is a tag.\n\
+'\\index', '\\def', '\\edef', '\\gdef', '\\xdef', '\\newcommand',\n\
+'\\renewcommand', '\\newenvironment', '\\renewenvironment',\n\
+'\\DeclareRobustCommand', '\\newrobustcmd', '\\renewrobustcmd',\n\
+'\\providecommand', '\\providerobustcmd', '\\NewDocumentCommand',\n\
+'\\RenewDocumentCommand', '\\ProvideDocumentCommand',\n\
+'\\DeclareDocumentCommand', '\\NewExpandableDocumentCommand',\n\
+'\\RenewExpandableDocumentCommand', '\\ProvideExpandableDocumentCommand',\n\
+'\\DeclareExpandableDocumentCommand', '\\NewDocumentEnvironment',\n\
+'\\RenewDocumentEnvironment', '\\ProvideDocumentEnvironment',\n\
+'\\DeclareDocumentEnvironment', '\\csdef', '\\csedef', '\\csgdef',\n\
+'\\csxdef', '\\csletcs', '\\cslet', '\\letcs', or '\\let' is a tag.\n\
+So is the argument of any of the starred variants of these commands,\n\
+when a starred variant currently exists.\n\
 \n\
 Other commands can be specified by setting the environment variable\n\
 'TEXTAGS' to a colon-separated list like, for example,\n\
-     TEXTAGS=\"mycommand:myothercommand\".";
+     TEXTAGS=\"mycommand:myothercommand\".\n\
+\n\
+The '--tex-alt-forms' option causes each tag to have two names, one\n\
+with and one without the TeX escape char, usually '\\'.  This may be\n\
+helpful when mixing traditional TeX or LaTeX constructs ('\\def')\n\
+with newer constructs from the 'etoolbox' package ('\\csdef').";
 
 
 static const char *Texinfo_suffixes [] =
@@ -5735,12 +5753,27 @@ Scheme_functions (FILE *inf)
 
 static linebuffer *TEX_toktab = NULL; /* Table with tag tokens */
 
-/* Default set of control sequences to put into TEX_toktab.
-   The value of environment var TEXTAGS is prepended to this.  */
+/* Default set of control sequences to put into TEX_toktab.  The value of
+   environment var TEXTAGS is prepended to this.  (2023) Add variants of
+   '\def', some additional LaTeX (and former xparse) commands, and common
+   variants from the 'etoolbox' package.  Also, add starred variants of the
+   commands if they exist. */
 static const char *TEX_defenv = "\
-:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
-:part:appendix:entry:index:def\
-:newcommand:renewcommand:newenvironment:renewenvironment";
+:chapter*:section*:subsection*:subsubsection*:part*:label:ref\
+:chapter:section:subsection:subsubsection:eqno:cite:bibitem:part\
+:appendix:entry:index:def:edef:gdef:xdef:newcommand*:newcommand\
+:renewcommand*:renewcommand:newenvironment*:newenvironment\
+:renewenvironment*:renewenvironment:DeclareRobustCommand*\
+:DeclareRobustCommand:renewrobustcmd*:renewrobustcmd\
+:newrobustcmd*:newrobustcmd:providecommand*:providecommand\
+:providerobustcmd*:providerobustcmd:NewDocumentCommand\
+:RenewDocumentCommand:ProvideDocumentCommand\
+:DeclareDocumentCommand:NewExpandableDocumentCommand\
+:RenewExpandableDocumentCommand:ProvideExpandableDocumentCommand\
+:DeclareExpandableDocumentCommand:NewDocumentEnvironment\
+:RenewDocumentEnvironment:ProvideDocumentEnvironment\
+:DeclareDocumentEnvironment:csdef:csedef:csgdef:csxdef:csletcs\
+:cslet:letcs:let";
 
 static void TEX_decode_env (const char *, const char *);
 
@@ -5752,6 +5785,7 @@ TeX_commands (FILE *inf)
 {
   char *cp;
   linebuffer *key;
+  char newname[UCHAR_MAX];
 
   char TEX_esc = '\0';
   char TEX_opgrp UNINIT, TEX_clgrp UNINIT;
@@ -5799,19 +5833,73 @@ TeX_commands (FILE *inf)
 	      {
 		char *p;
 		ptrdiff_t namelen, linelen;
-		bool opgrp = false;
+		bool opgrp = false, one_esc = false;
 
 		cp = skip_spaces (cp + key->len);
+		/* Skip the optional arguments to commands in the tags list so
+		   that these arguments don't end up as the name of the tag.
+		   The name will instead come from the argument in curly braces
+		   that follows the optional ones.  */
+		if (*cp == '[' || *cp == '(')
+		  {
+		    while (*cp != TEX_opgrp && *cp != '\0')
+		      cp++;
+		  }
 		if (*cp == TEX_opgrp)
 		  {
 		    opgrp = true;
 		    cp++;
 		  }
+		/* Jumping to a TeX command definition doesn't work in at least
+		   some of the editors that use ctags.  Using the
+		   '--tex-alt-forms' option to strip TEX_esc should provide
+		   minor improvements, though overall the behavior is still
+		   suboptimal.  (With --tex-alt-forms we print each tag twice,
+		   once with and once without TEX_esc in the tag name.  See
+		   below.)  The undocumented ctags option '--no-duplicates' may
+		   also help.  Changes in tex-mode.el in GNU Emacs address the
+		   majority of these issues for etags, though the
+		   '--tex-alt-forms' option can also be useful there. */
+
+		if (tex_alt_forms && *cp == TEX_esc)
+		  {
+		    cp++;
+		    one_esc = true;
+		  }
+
+		/* Add optional argument brackets '(' and '[' to the loop test
+		   so that these arguments don't appear in tag names.  Also add
+		   '=' as it's relational in the vast majority of cases.  */
 		for (p = cp;
-		     (!c_isspace (*p) && *p != '#' &&
-		      *p != TEX_opgrp && *p != TEX_clgrp);
+		     (!c_isspace (*p) && *p != '#' && *p != '=' &&
+		      *p != '[' && *p != '(' && *p != TEX_opgrp &&
+		      *p != TEX_clgrp);
 		     p++)
-		  continue;
+		  /* Allow only one escape char in a tag name, which
+		     (primarily) enables tagging a TeX command's different,
+		     possibly temporary, '\let' bindings.  */
+		  if (*p == TEX_esc)
+		    {
+		      if (!one_esc)
+			{
+			  one_esc = true;
+			  continue;
+			}
+		      else
+			break;
+		    }
+		  else
+		    continue;
+		/* Re-run the scan to catch (highly unusual) cases where a
+		   command name is of the form '\('.  */
+		if ((*p == '(' || *p == '[') && (p - cp) < 2)
+		  {
+		    for (p = cp;
+			 (!c_isspace (*p) && *p != '#' &&
+			  *p != TEX_opgrp && *p != TEX_clgrp);
+			 p++)
+		      continue;
+		  }
 		namelen = p - cp;
 		linelen = lb.len;
 		if (!opgrp || *p == TEX_clgrp)
@@ -5820,6 +5908,16 @@ TeX_commands (FILE *inf)
 		      p++;
 		    linelen = p - lb.buffer + 1;
 		  }
+		/* With --tex-alt-forms we strip any TEX_esc from the name (see
+		   above), print the tag with TEX_esc prepended to the bare tag
+		   name, then print the same tag again with the bare tag
+		   name. */
+		if (tex_alt_forms)
+		  {
+		    snprintf (newname, UCHAR_MAX, "%c%s", TEX_esc, cp);
+		    make_tag (newname, namelen + 1, true,
+			      lb.buffer, linelen, lineno, linecharno);
+		  }
 		make_tag (cp, namelen, true,
 			  lb.buffer, linelen, lineno, linecharno);
 		goto tex_next_line; /* We only tag a line once */
diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
index a26e7b9c83a..3de4a093e09 100644
--- a/lisp/textmodes/tex-mode.el
+++ b/lisp/textmodes/tex-mode.el
@@ -1277,6 +1277,8 @@ tex-common-initialization
 	      (syntax-propertize-rules latex-syntax-propertize-rules))
   ;; TABs in verbatim environments don't do what you think.
   (setq-local indent-tabs-mode nil)
+  ;; Set up `thing-at-point' tweaks for xref in TeX buffers.
+  (tex-set-thingatpt-symbol)
   ;; Other vars that should be buffer-local.
   (make-local-variable 'tex-command)
   (make-local-variable 'tex-start-of-header)
@@ -3724,6 +3726,220 @@ tex-chktex
                    (kill-buffer (process-buffer process)))))))
       (process-send-region tex-chktex--process (point-min) (point-max))
       (process-send-eof tex-chktex--process))))
+\f
+;;; Xref / Etags tweaks
+
+;; Rather than define a new xref backend for TeX, we tweak the default
+;; etags backend so that the main xref user commands (including
+;; `xref-find-definitions', `xref-find-apropos', and
+;; `xref-find-references' [on M-., C-M-., and M-?, respectively]) work
+;; in TeX buffers.  This mostly involves defining a new THING for
+;; `thing-at-point' (texsymbol), then substituting that THING for
+;; `symbol' in TeX buffers, at least by (configurable) default.  The
+;; TeX escape character will by default appear in the resulting string
+;; only when the xref command uses string search and not regexp
+;; search, though this too is configurable.  The new THING type also
+;; improves the accuracy of other commands that use `thing-at-point'
+;; in TeX buffers, like `isearch-forward-thing-at-point' (on M-s M-.)
+;; and `project-find-regexp' (on C-x p g).  Indeed,
+;; `project-find-regexp' sometimes works better in TeX buffers than
+;; `xref-find-references'.
+
+(defvar tex-thingatpt-modes-list
+  '(tex-mode doctex-mode latex-mode plain-tex-mode slitex-mode ams-tex-mode)
+  "Major modes where `thing-at-point' may use the `texsymbol' type.
+
+When a buffer's `major-mode' is in this list, and when
+`tex-thingatpt-is-texsymbol' is t (the default), any command in
+that buffer that calls `thing-at-point' with a `symbol' argument
+actually uses the `texsymbol' argument, instead.")
+
+(defcustom tex-thingatpt-is-texsymbol t
+  "When non-nil replace `symbol' by `texsymbol' for `thing-at-point'.
+
+This applies only to TeX buffers.  The `texsymbol' \"thing\"
+modifies the standard `symbol' for use in such buffers.
+
+When nil, restore the default behavior of `thing-at-point' in TeX
+buffers.
+
+Custom will automatically apply changes in all TeX buffers, but
+if you set the variable outside of Custom it won't take effect
+until you apply it with \\[tex-set-thingatpt-symbol].  Without a
+prefix argument (\\[universal-argument]) this applies only to the
+current buffer, but with one it applies to all TeX buffers in
+`buffer-list'.  (TeX buffers are those whose `major-mode' is a
+member of `tex-thingatpt-modes-list'.)"
+  :type 'boolean
+  :group 'tex-file
+  :group 'TeX-misc
+  :initialize #'custom-initialize-default
+  :set (lambda (var val)
+         (set-default var val)
+         (tex-set-thingatpt-symbol t))
+  :version "30.1")
+
+(defcustom tex-thingatpt-include-escape '(xref-find-definitions
+                                          xref-find-definitions-other-window
+                                          xref-find-definitions-other-frame)
+  "If non-nil, include `tex-escape-char' in `thing-at-point'.
+
+This variable only takes effect when `tex-thingatpt-is-texsymbol'
+is t (the default), changing the argument passed to
+`thing-at-point' from `symbol' to `texsymbol'.  When that is the
+case, the values of this variable act as follows:
+
+When t, `thing-at-point' will always include a
+`tex-escape-char' (usually `\\'), should one be present, in the
+string it returns in TeX buffers.
+
+When nil, `thing-at-point' will never include the
+`tex-escape-char' in the string it returns in TeX buffers.
+
+Otherwise, it's a list of commands for which `thing-at-point'
+will always include the `tex-escape-char' in the string it
+returns.  If this variable is nil, the three xref commands listed
+by default may cease to function properly in TeX buffers, but
+using the `--tex-alt-forms' option when creating your tags table
+with `etags' will rectify that."
+  :type '(choice (const :tag "Always include tex-escape-char" t)
+                 (const :tag "Never include tex-escape-char" nil)
+                 (set :tag "Include tex-escape-char for these commands"
+		      (repeat :inline t (symbol :tag "command"))))
+  :group 'tex-file
+  :group 'TeX-misc
+  :version "30.1")
+
+(defvar tex-escape-char ?\\
+  "The current, possibly buffer-local, TeX escape character.
+
+The `etags' program only recognizes `\\' (92) and `!' (33) as
+escape characters in TeX documents, and if it detects the latter
+it also uses `<>' as the TeX grouping construct rather than `{}'.
+Setting this variable to anything other than `\\' or `!' is
+possible but will not be useful without changes to `etags', at
+least for commands that search tags tables, such as
+`xref-find-definitions' (\\[xref-find-definitions]) and \
+`xref-find-apropos' (\\[xref-find-apropos]).")
+
+(defvar tex-thingatpt-syntax-table
+  (let* ((ost (if (boundp 'TeX-mode-syntax-table)
+                  TeX-mode-syntax-table
+                tex-mode-syntax-table))
+         (st (make-syntax-table ost)))
+    (modify-syntax-entry ?# "'" st)
+    (modify-syntax-entry ?= "'" st)
+    (modify-syntax-entry ?` "'" st)
+    (modify-syntax-entry ?\" "'" st)
+    (modify-syntax-entry ?' "'" st)
+    st)
+  "Syntax table for delimiting `thing-at-point' in TeX buffers.
+
+When `tex-thingatpt-is-texsymbol' is t, this syntax table helps
+to define what a `texsymbol' is.  To access it use the
+`tex-thingatpt-syntax-table' function.")
+
+(defun tex-thingatpt-syntax-table ()
+  "Return a syntax table for `thing-at-point' in TeX buffers.
+
+It modifies the pre-defined syntax table depending both on the
+setting of the `tex-escape-char' variable, which may be buffer
+local, and on whether we're using AUCTeX or the in-tree tex-mode."
+  (let ((nst (make-syntax-table tex-thingatpt-syntax-table))
+        (escsy (if (boundp 'TeX-mode-syntax-table)
+                   ?\\
+                 ?/)))
+    (cond ((char-equal tex-escape-char ?\\))
+          ((char-equal tex-escape-char ?!)
+           (modify-syntax-entry ?\\ "_" nst)
+           (modify-syntax-entry tex-escape-char (char-to-string escsy) nst)
+           (modify-syntax-entry ?< "(>" nst)
+           (modify-syntax-entry ?> ")<" nst))
+          (t
+           (modify-syntax-entry ?\\ "_" nst)
+           (modify-syntax-entry tex-escape-char (char-to-string escsy) nst)))
+    nst))
+
+;; Set up AUCTeX modes.  (Should this be in AUCTeX itself?)
+(add-hook 'TeX-mode-hook #'tex-set-thingatpt-symbol)
+
+;; `xref-find-references' needs this when called from a latex-mode
+;; buffer in order to search files or buffers with a .tex suffix
+;; (including the buffer from which it has been called).  We append it
+;; to `auto-mode-alist' so as not to interfere with the usual
+;; mode-setting apparatus.
+(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)
+
+(dolist (texmode tex-thingatpt-modes-list)
+  (put texmode 'find-tag-default-function 'tex--thing-at-point))
+
+(put 'texsymbol 'beginning-op 'tex-thingatpt--beginning-of-texsymbol)
+
+(put 'texsymbol 'end-op 'tex-thingatpt--end-of-texsymbol)
+
+(declare-function cl-substitute "cl-seq" (cl-new cl-old cl-seq &rest cl-keys))
+
+(defun tex-set-thingatpt-symbol (&optional all)
+  "Set meaning of `thing-at-point' `symbol' in (ALL?) TeX buffers.
+
+When `tex-thingatpt-is-texsymbol' is t, set `thing-at-point' to
+use the `texsymbol' \"thing\" instead of `symbol', otherwise
+maintain or restore the default.  Without an optional ALL make
+changes only in current buffer, with ALL make changes in all TeX
+buffers in `buffer-list'."
+  (interactive "P")
+  (require 'thingatpt)
+  (if all
+      (dolist (buf (buffer-list))
+        (with-current-buffer buf
+          (tex--symbol-or-texsymbol)))
+    (tex--symbol-or-texsymbol)))
+
+(defun tex--symbol-or-texsymbol ()
+  (when (memq major-mode tex-thingatpt-modes-list)
+    (if tex-thingatpt-is-texsymbol
+        (setq-local thing-at-point-provider-alist
+                    (add-to-list 'thing-at-point-provider-alist
+                                 '(symbol . tex--thing-at-point))
+                    isearch-forward-thing-at-point
+                    (cl-substitute 'texsymbol 'symbol
+                                   isearch-forward-thing-at-point))
+      (setq-local thing-at-point-provider-alist
+                  (delete '(symbol . tex--thing-at-point)
+                          thing-at-point-provider-alist)
+                  isearch-forward-thing-at-point
+                  (cl-substitute 'symbol 'texsymbol
+                                 isearch-forward-thing-at-point)))))
+
+(defun tex--thing-at-point ()
+  "Pass `thing' type `texsymbol' to `bounds-of-thing-at-point'.
+
+When `tex-thingatpt-is-texsymbol' is t, calls in TeX buffers to
+`thing-at-point' with argument `symbol' will instead use the
+argument `texsymbol'.  Otherwise it will call `find-tag-default'."
+  (if tex-thingatpt-is-texsymbol
+      (let ((bounds (bounds-of-thing-at-point 'texsymbol)))
+        (when bounds
+          (buffer-substring-no-properties (car bounds) (cdr bounds))))
+    (find-tag-default)))
+
+(defun tex--include-escape-p (command)
+  (or (eq tex-thingatpt-include-escape t)
+      (memq command tex-thingatpt-include-escape)))
+
+(defun tex-thingatpt--beginning-of-texsymbol ()
+  "Move point to the beginning of the current TeX symbol."
+  (with-syntax-table (tex-thingatpt-syntax-table)
+    (and (re-search-backward "\\([][()]\\|\\(\\sw\\|\\s_\\|\\s.\\)+\\)")
+         (skip-syntax-backward "w_.")
+         (when (tex--include-escape-p this-command)
+           (skip-syntax-backward "\\/")))))
+
+(defun tex-thingatpt--end-of-texsymbol ()
+  "Move point to the end of the current TeX symbol."
+  (with-syntax-table (tex-thingatpt-syntax-table)
+    (and (re-search-forward "\\([][()]\\|\\(\\sw\\|\\s_\\|\\s.\\)+\\)")
+         (skip-syntax-forward "w_."))))
 
 (make-obsolete-variable 'tex-mode-load-hook
                         "use `with-eval-after-load' instead." "28.1")
-- 
2.35.8



* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 11:10                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-13 13:42                   ` Stefan Kangas
  2023-09-13 15:23                   ` Dmitry Gutov
  1 sibling, 0 replies; 66+ messages in thread
From: Stefan Kangas @ 2023-09-13 13:42 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, Lars Ingebrigtsen, Dmitry Gutov

David Fussner <dfussner@googlemail.com> writes:

> P.S. I'm also starting the copyright assignment process, in case these
> changes prove acceptable.

That's great, thanks.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 11:10                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-13 13:42                   ` Stefan Kangas
@ 2023-09-13 15:23                   ` Dmitry Gutov
  2023-09-13 17:01                     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-13 19:16                     ` Eli Zaretskii
  1 sibling, 2 replies; 66+ messages in thread
From: Dmitry Gutov @ 2023-09-13 15:23 UTC (permalink / raw)
  To: David Fussner, Stefan Kangas; +Cc: 53749, Lars Ingebrigtsen

Hi David!

Thanks for the new patch.

I'm skipping over the etags parser changes (others might comment, I'm 
just assuming they are good).

And "thing at point" code is, I think, at your discretion (if the result 
is useful, then that seems good). I would probably not call the function 
the same way given that we don't install this "thing" globally, just 
using it from several of the major modes in a particular way. Anyway, that 
is a minor affair.

I'd like to suggest two simplifications for the xref-related stuff, if 
those work for you.

On 13/09/2023 14:10, David Fussner via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:

> <...> I also manipulate some
> variables buffer-locally to make things like project-find-regexp and
> isearch-forward-thing-at-point work better in such buffers.

These won't be affected either way, right? Because project-find-regexp 
defaults its input to (thing-at-point 'symbol t), and isearch... 
probably also uses "symbol" if you ask it to.

So... why not just make tex-thingatpt-include-escape a boolean? What 
commands need to be distinguished that way? I think 'find-tag' (it's 
obsolete but still used sometimes) would need to obey this var as well.

And the second thing: you're putting the symbol on major modes.

+(dolist (texmode tex-thingatpt-modes-list)
+  (put texmode 'find-tag-default-function 'tex--thing-at-point))

Why not set the variable find-tag-default-function instead? That seems 
easier and more appropriate to do inside a major mode function.
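
A minimal sketch of that suggestion, under stated assumptions: the
helper names `my-tex-find-tag-default' and `my-tex-setup-find-tag' are
hypothetical, and `tex-mode-hook' only stands in for wherever the
mode's own setup code would live. None of this comes from the patch.

```elisp
;; Sketch only: the `my-tex-...' names are hypothetical, not from the patch.
(require 'thingatpt)
(require 'etags)                        ; defines `find-tag-default-function'

(defun my-tex-find-tag-default ()
  "Return the symbol at point, with a leading backslash if it has one."
  (let ((bounds (bounds-of-thing-at-point 'symbol)))
    (when bounds
      (let ((beg (car bounds)))
        ;; Extend the region by one character to cover the TeX escape.
        (when (eq (char-before beg) ?\\)
          (setq beg (1- beg)))
        (buffer-substring-no-properties beg (cdr bounds))))))

(defun my-tex-setup-find-tag ()
  "Use `my-tex-find-tag-default' in the current buffer only."
  (setq-local find-tag-default-function #'my-tex-find-tag-default))

;; Assumption: a hook is used here; a major mode function could instead
;; call `my-tex-setup-find-tag' directly.
(add-hook 'tex-mode-hook #'my-tex-setup-find-tag)
```

Because the variable is set buffer-locally, nothing has to be put on
the major mode symbols, and buffers in other modes are unaffected.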

Thanks.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 15:23                   ` Dmitry Gutov
@ 2023-09-13 17:01                     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-13 23:59                       ` Dmitry Gutov
  2023-09-13 19:16                     ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-13 17:01 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749, Lars Ingebrigtsen, Stefan Kangas

Hi Dmitry,

Thanks for the feedback!

> These won't be affected either way, right? Because project-find-regexp
> defaults its input to (thing-at-point 'symbol t), and isearch...
> probably also uses "symbol" if you ask it to.
>
> So... why not just make tex-thingatpt-include-escape a boolean? What
> commands need to be distinguished that way? I think 'find-tag' (it's
> obsolete but still used sometimes) would need to obey this var as well.

xref-find-apropos and xref-find-references don't work well (or at all)
with the escape char included in the search string, so I was keeping
that char away from them. (The buffer-local variables I manipulate for
project-find-regexp and isearch-forward-thing-at-point have to do with
ensuring they use the texsymbol thing in the first place -- see
tex--symbol-or-texsymbol.) Does that make sense?

I'll look at find-tag, too; thanks for pointing that out.

> Why not set the variable find-tag-default-function instead? That seems
> easier and more appropriate to do inside a major mode function.

I settled on putting the symbol on the modes because I thought it was
simpler than setting the variable buffer-locally in all the in-tree
and AUCTeX modes, but I'll revisit this and see whether I can come up
with something better.

Thanks again.

On Wed, 13 Sept 2023 at 16:23, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> Hi David!
>
> Thanks for the new patch.
>
> I'm skipping over the etags parser changes (others might comment, I'm
> just assuming they are good).
>
> And "thing at point" code is, I think, at your discretion (if the result
> is useful, then that seems good). I would probably not call the function
> the same way given that we don't install this "thing" globally, just
> using it from several of the major modes in a particular way. Anyway, that
> is a minor affair.
>
> I'd like to suggest two simplifications for the xref-related stuff, if
> those work for you.
>
> On 13/09/2023 14:10, David Fussner via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
>
> > <...> I also manipulate some
> > variables buffer-locally to make things like project-find-regexp and
> > isearch-forward-thing-at-point work better in such buffers.
>
> These won't be affected either way, right? Because project-find-regexp
> defaults its input to (thing-at-point 'symbol t), and isearch...
> probably also uses "symbol" if you ask it to.
>
> So... why not just make tex-thingatpt-include-escape a boolean? What
> commands need to be distinguished that way? I think 'find-tag' (it's
> obsolete but still used sometimes) would need to obey this var as well.
>
> And the second thing: you're putting the symbol on major modes.
>
> +(dolist (texmode tex-thingatpt-modes-list)
> +  (put texmode 'find-tag-default-function 'tex--thing-at-point))
>
> Why not set the variable find-tag-default-function instead? That seems
> easier and more appropriate to do inside a major mode function.
>
> Thanks.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 15:23                   ` Dmitry Gutov
  2023-09-13 17:01                     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-13 19:16                     ` Eli Zaretskii
  2023-09-13 20:25                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2023-09-13 19:16 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749, larsi, stefankangas, dfussner

> Cc: 53749@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>
> Date: Wed, 13 Sep 2023 18:23:13 +0300
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> I'm skipping over the etags parser changes (others might comment, I'm 
> just assuming they are good).

They look OK to me at first glance, but we need to make sure the etags
tests still succeed after this change, and the new option should be
documented in the man page.  Bonus points for adding to the etags test
suite a test where this option is activated.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 19:16                     ` Eli Zaretskii
@ 2023-09-13 20:25                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-14  5:14                         ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-13 20:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 53749, Lars Ingebrigtsen, Stefan Kangas, Dmitry Gutov

[-- Attachment #1: Type: text/plain, Size: 864 bytes --]

Thanks Eli.

I'll have a look at the man page, and also at an additional test for the
suite. I did run the test suite, and all the diffs were where they should
be; I can send a patch that I have if you'd like, but if I'm going to add
tests maybe you'd prefer to wait?

On Wed, 13 Sept 2023, 20:16 Eli Zaretskii, <eliz@gnu.org> wrote:

> > Cc: 53749@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>
> > Date: Wed, 13 Sep 2023 18:23:13 +0300
> > From: Dmitry Gutov <dgutov@yandex.ru>
> >
> > I'm skipping over the etags parser changes (others might comment, I'm
> > just assuming they are good).
>
> They look OK to me at first glance, but we need to make sure the etags
> tests still succeed after this change, and the new option should be
> documented in the man page.  Bonus points for adding to the etags test
> suite a test where this option is activated.
>

[-- Attachment #2: Type: text/html, Size: 1470 bytes --]


* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 17:01                     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-13 23:59                       ` Dmitry Gutov
  2023-09-14  6:10                         ` Eli Zaretskii
  2023-09-14 16:11                         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 66+ messages in thread
From: Dmitry Gutov @ 2023-09-13 23:59 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, Lars Ingebrigtsen, Stefan Kangas

On 13/09/2023 20:01, David Fussner via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:

>> These won't be affected either way, right? Because project-find-regexp
>> defaults its input to (thing-at-point 'symbol t), and isearch...
>> probably also uses "symbol" if you ask it to.
>>
>> So... why not just make tex-thingatpt-include-escape a boolean? What
>> commands need to be distinguished that way? I think 'find-tag' (it's
>> obsolete but still used sometimes) would need to obey this var as well.
> 
> xref-find-apropos and xref-find-references don't work well (or at all)
> with the escape char included in the search string, so I was keeping
> that char away from them. (The buffer-local variables I manipulate for
> project-find-regexp and isearch-forward-thing-at-point have to do with
> ensuring they use the texsymbol thing in the first place -- see
> tex--symbol-or-texsymbol.) Does that make sense?

Hmm, I suppose I skipped over that part of the patch too quickly.

Here's a potential problem with replacing the notion of "symbol": some 
other existing code (also working with TeX/LaTeX) might disagree, as it 
might have some existing notion of what a "symbol" in those modes is (as 
defined by the syntax table).

In general, we change the notion of a symbol by either changing the 
mode's syntax table, or by augmenting its effect using 
syntax-propertize-function (which, for example, could propertize the 
backslashes inside the buffer as "symbol constituent"). The latter might 
actually be a change that would affect how 'M-x xref-find-references' 
works (it will likely start to consider those \tags as symbol 
occurrences together with the backslash). But like other changes of what 
is considered to be a "symbol" in a major mode, it could conflict with 
existing code.
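
A minimal sketch of the syntax-propertize variant, assuming a
hypothetical function name `my-tex-syntax-propertize'. A real mode
would have to merge this with whatever propertize rules it already
installs, and the backslash only counts as part of the symbol where
`parse-sexp-lookup-properties' is non-nil.

```elisp
;; Sketch only: `my-tex-syntax-propertize' is a hypothetical name.
(defun my-tex-syntax-propertize (start end)
  "Mark backslashes that begin a control word as symbol constituents.
Only text between START and END is examined."
  (goto-char start)
  (funcall
   (syntax-propertize-rules
    ;; Group 1 is the backslash; it must be followed by a letter or @.
    ("\\(\\\\\\)[[:alpha:]@]" (1 "_")))
   start end))

;; Assumption: installed buffer-locally from the mode's setup code, e.g.
;;   (setq-local syntax-propertize-function #'my-tex-syntax-propertize)
```

Code that already relies on the backslash having escape syntax in
these buffers would see the changed property, which is the conflict
described above.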

Anyway, I'm not saying you have to change the approach, but that's 
something to be aware of.

And to look at it from another direction: if the default implementation 
of xref-find-references (and etags uses the very generic one) doesn't 
work for you, perhaps it would be worth it to define a TeX-specific Xref 
backend. That would perhaps take 20-30 lines of code total, most of them 
delegating to the etags backend, or the default impl. But while 
delegating, you can modify the passed argument - e.g. if it included a 
backslash, you could forward it to the default impl for "find 
references" without a backslash. Or - alternatively - call 
(project-find-regexp "...") with a more complex regexp of your choice. 
The first alternative could look like this:

   (cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
     (xref-backend-references 'etags (string-remove-prefix "\\" identifier)))

> I'll look at find-tag, too; thanks for pointing that out.

Doing the above choice on the level of Xref backend's methods 
would/should automatically make it work for all commands appropriately.
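
A fuller sketch of such a forwarding backend, under stated
assumptions: the backend symbol `tex-etags' follows the example above,
while `my-tex-etags-backend' and `my-tex--strip-escape' are
hypothetical names. Which methods should keep the backslash is a
guess based on this thread, not something the patch settles.

```elisp
;; Sketch only: the `my-tex-...' names are hypothetical.
(require 'cl-lib)
(require 'subr-x)                       ; `string-remove-prefix'
(require 'xref)
(require 'etags)

(defun my-tex--strip-escape (identifier)
  "Return IDENTIFIER without a leading TeX escape character."
  (string-remove-prefix "\\" identifier))

(defun my-tex-etags-backend ()
  "Return the symbol naming the sketched TeX backend."
  'tex-etags)

;; Definitions: hand the identifier to etags unchanged, backslash and all.
(cl-defmethod xref-backend-definitions ((_backend (eql 'tex-etags)) identifier)
  (xref-backend-definitions 'etags identifier))

;; References and apropos: drop the backslash before delegating.
(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
  (xref-backend-references 'etags (my-tex--strip-escape identifier)))

(cl-defmethod xref-backend-apropos ((_backend (eql 'tex-etags)) pattern)
  (xref-backend-apropos 'etags (my-tex--strip-escape pattern)))

;; Completion: reuse whatever the etags backend offers.
(cl-defmethod xref-backend-identifier-completion-table
  ((_backend (eql 'tex-etags)))
  (xref-backend-identifier-completion-table 'etags))
```

The backend would become active in a buffer through something like
(add-hook 'xref-backend-functions #'my-tex-etags-backend nil t).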

>> Why not set the variable find-tag-default-function instead? That seems
>> easier and more appropriate to do inside a major mode function.
> 
> I settled on putting the symbol on the modes because I thought it was
> simpler than setting the variable buffer-locally in all the in-tree
> and AUCTeX modes, but I'll revisit this and see whether I can come up
> with something better.

Do AUCTeX modes inherit from tex-mode? Or all call 
tex-common-initialization? Then you could set that variable locally 
inside that function once.

All in all, it might not be wise to modify the behavior of third-party 
packages from inside Emacs this way (they might have other expectations, 
or there's going to appear a new major mode that needs the same 
treatment anyway).

Setting a variable to be used through mode inheritance or delegation is 
fine, but if that doesn't help, I would probably stop at defining a 
helper function or two and documenting how it should be used. And then 
maybe work with AUCTeX people to get the remaining necessary changes in 
from their side (or just leaving that up to the user, depending on how 
functional the default config ends up being).






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 20:25                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-14  5:14                         ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2023-09-14  5:14 UTC (permalink / raw)
  To: David Fussner; +Cc: 53749, larsi, stefankangas, dgutov

> From: David Fussner <dfussner@googlemail.com>
> Date: Wed, 13 Sep 2023 21:25:28 +0100
> Cc: Dmitry Gutov <dgutov@yandex.ru>, Stefan Kangas <stefankangas@gmail.com>, 53749@debbugs.gnu.org, 
> 	Lars Ingebrigtsen <larsi@gnus.org>
> 
> I'll have a look at the man page, and also at an additional test for the suite. I did run the test suite, and
> all the diffs were where they should be; I can send a patch that I have if you'd like, but if I'm going to
> add tests maybe you'd prefer to wait?

Sure, we will wait.  There's no rush.  Let's have a complete patch
that covers all the aspects of this, and install it in one go.
Meanwhile you will also have time to work on the other review
comments.

Thanks.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 23:59                       ` Dmitry Gutov
@ 2023-09-14  6:10                         ` Eli Zaretskii
  2023-09-15 18:45                           ` Tassilo Horn
  2023-09-14 16:11                         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2023-09-14  6:10 UTC (permalink / raw)
  To: Dmitry Gutov, Stefan Monnier, Tassilo Horn
  Cc: 53749, larsi, stefankangas, dfussner

> Cc: 53749@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>,
>  Stefan Kangas <stefankangas@gmail.com>
> Date: Thu, 14 Sep 2023 02:59:33 +0300
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 13/09/2023 20:01, David Fussner via Bug reports for GNU Emacs, the 
> Swiss army knife of text editors wrote:
> 
> >> These won't be affected either way, right? Because project-find-regexp
> >> defaults its input to (thing-at-point 'symbol t), and isearch...
> >> probably also uses "symbol" if you ask it to.
> >>
> >> So... why not just make tex-thingatpt-include-escape a boolean? What
> >> commands need to be distinguished that way? I think 'find-tag' (it's
> >> obsolete but still used sometimes) would need to obey this var as well.
> > 
> > xref-find-apropos and xref-find-references don't work well (or at all)
> > with the escape char included in the search string, so I was keeping
> > that char away from them. (The buffer-local variables I manipulate for
> > project-find-regexp and isearch-forward-thing-at-point have to do with
> > ensuring they use the texsymbol thing in the first place -- see
> > tex--symbol-or-texsymbol.) Does that make sense?
> 
> Hmm, I suppose I skipped over that part of the patch too quickly.
> 
> Here's a potential problem with replacing the notion of "symbol": some 
> other existing code (also working with TeX/LaTeX) might disagree, as it 
> might have some existing notion of what a "symbol" in those modes is (as 
> defined by the syntax table).
> 
> In general, we change the notion of a symbol by either changing the 
> mode's syntax table, or by augmenting its effect using 
> syntax-propertize-function (which, for example, could propertize the 
> backslashes inside the buffer as "symbol constituent"). The latter might 
> actually be a change that would affect how 'M-x xref-find-references' 
> works (it will likely start to consider those \tags as symbol 
> occurrences together with the backslash). But like other changes of what 
> is considered to be a "symbol" in a major mode, it could conflict with 
> existing code.
> 
> Anyway, I'm not saying you have to change the approach, but that's 
> something to be aware of.
> 
> And to look at it from another direction: if the default implementation 
> of xref-find-references (and etags uses the very generic one) doesn't 
> work for you, perhaps it would be worth it to define a TeX-specific Xref 
> backend. That would perhaps take 20-30 lines of code total, most of them 
> delegating to the etags backend, or the default impl. But while 
> delegating, you can modify the passed argument - e.g. if it included a 
> backslash, you could forward it to the default impl for "find 
> references" without a backslash. Or - alternatively - call 
> (project-find-regexp "...") with a more complex regexp of your choice. 
> The first alternative could look like this:
> 
>    (cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
>      (xref-backend-references 'etags (string-remove-prefix "\\" identifier)))
> 
> > I'll look at find-tag, too; thanks for pointing that out.
> 
> Doing the above choice on the level of Xref backend's methods 
> would/should automatically make it work for all commands appropriately.
> 
> >> Why not set the variable find-tag-default-function instead? That seems
> >> easier and more appropriate to do inside a major mode function.
> > 
> > I settled on putting the symbol on the modes because I thought it was
> > simpler than setting the variable buffer-locally in all the in-tree
> > and AUCTeX modes, but I'll revisit this and see whether I can come up
> > with something better.
> 
> Do AUCTeX modes inherit from tex-mode? Or all call 
> tex-common-initialization? Then you could set that variable locally 
> inside that function once.
> 
> All in all, it might not be wise to modify the behavior of third-party 
> packages from inside Emacs this way (they might have other expectations, 
> or there's going to appear a new major mode that needs the same 
> treatment anyway).
> 
> Setting a variable to be used through mode inheritance or delegation is 
> fine, but if that doesn't help, I would probably stop at defining a 
> helper function or two and documenting how it should be used. And then 
> maybe work with AUCTeX people to get the remaining necessary changes in 
> from their side (or just leaving that up to the user, depending on how 
> functional the default config ends up being).

I think we should add Stefan and Tassilo (CCed) to this discussion, as
they might have valuable comments about this.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-13 23:59                       ` Dmitry Gutov
  2023-09-14  6:10                         ` Eli Zaretskii
@ 2023-09-14 16:11                         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-09-14 23:55                           ` Dmitry Gutov
  1 sibling, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-14 16:11 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii; +Cc: 53749, Lars Ingebrigtsen, Stefan Kangas

Hi Dmitry,

Once again, many thanks for the feedback. I'm still not certain I
agree about the risks involved in creating a new "thing" type, as it
really only appears in a small number of commands and then only in TeX
buffers, and generally I tried to design the code to keep out of the
way of anything outside of such buffers, but needless to say you see
further and more clearly than I can. I've been reviewing your comments
and my code, and have a few ideas and questions about how to go
forward. Though I haven't coded it yet, it's possible that the
simplest (and least intrusive) approach to follow would do something
like this:

1. Get rid of the new texsymbol "thing" and just use a buffer-local
value of find-tag-default-function and a rather more thoroughly
modified syntax table to control what "symbol" means, but _only_ in
the context of commands that use find-tag-default-function. I think
I'd lose the ability to change the behavior of
isearch-forward-thing-at-point and project-find-regexp, as I can't see
how to temporarily modify the syntax table there, though perhaps I'm
missing something.

2. Simply eliminate the TeX escape character entirely, both from tag
names in a TAGS file and from any thing-at-point in a TeX buffer. I
think this would eliminate the need to distinguish among the various
xref commands in terms of whether they can or can't handle the escape
character. It would also eliminate the need for the new user option in
etags.c, as there would no longer be any code to cope with the escape
character when finding a (thing-at-point 'symbol). This is slightly
less powerful than the default I proposed, but there are probably many
use cases where it won't matter at all (though it would for my own,
possibly eccentric, use case).

Does this sound to you like a plausible way forward?

I've tried to reach out to the AUCTeX developers to see what they
might want to do about setting the value of local variables there, and
anything they come up with should be doable.

Thanks again.

On Thu, 14 Sept 2023 at 00:59, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> On 13/09/2023 20:01, David Fussner via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
>
> >> These won't be affected either way, right? Because project-find-regexp
> >> defaults its input to (thing-at-point 'symbol t), and isearch...
> >> probably also uses "symbol" if you ask it to.
> >>
> >> So... why not just make tex-thingatpt-include-escape a boolean? What
> >> commands need to be distinguished that way? I think 'find-tag' (it's
> >> obsolete but still used sometimes) would need to obey this var as well.
> >
> > xref-find-apropos and xref-find-references don't work well (or at all)
> > with the escape char included in the search string, so I was keeping
> > that char away from them. (The buffer-local variables I manipulate for
> > project-find-regexp and isearch-forward-thing-at-point have to do with
> > ensuring they use the texsymbol thing in the first place -- see
> > tex--symbol-or-texsymbol.) Does that make sense?
>
> Hmm, I suppose I skipped over that part of the patch too quickly.
>
> Here's a potential problem with replacing the notion of "symbol": some
> other existing code (also working with TeX/LaTeX) might disagree, as it
> might have some existing notion of what a "symbol" in those modes is (as
> defined by the syntax table).
>
> In general, we change the notion of a symbol by either changing the
> mode's syntax table, or by augmenting its effect using
> syntax-propertize-function (which, for example, could propertize the
> backslashes inside the buffer as "symbol constituent"). The latter might
> actually be a change that would affect how 'M-x xref-find-references'
> works (it will likely start to consider those \tags as symbol
> occurrences together with the backslash). But like other changes of what
> is considered to be a "symbol" in a major mode, it could conflict with
> existing code.
>
> Anyway, I'm not saying you have to change the approach, but that's
> something to be aware of.
>
> And to look at it from another direction: if the default implementation
> of xref-find-references (and etags uses the very generic one) doesn't
> work for you, perhaps it would be worth it to define a TeX-specific Xref
> backend. That would perhaps take 20-30 lines of code total, most of them
> delegating to the etags backend, or the default impl. But while
> delegating, you can modify the passed argument - e.g. if it included a
> backslash, you could forward it to the default impl for "find
> references" without a backslash. Or - alternatively - call
> (project-find-regexp "...") with a more complex regexp of your choice.
> The first alternative could look like this:
>
>    (cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
>      (xref-backend-references 'etags (string-remove-prefix "\\" identifier)))
>
> > I'll look at find-tag, too; thanks for pointing that out.
>
> Doing the above choice on the level of Xref backend's methods
> would/should automatically make it work for all commands appropriately.
>
> >> Why not set the variable find-tag-default-function instead? That seems
> >> easier and more appropriate to do inside a major mode function.
> >
> > I settled on putting the symbol on the modes because I thought it was
> > simpler than setting the variable buffer-locally in all the in-tree
> > and AUCTeX modes, but I'll revisit this and see whether I can come up
> > with something better.
>
> Do AUCTeX modes inherit from tex-mode? Or all call
> tex-common-initialization? Then you could set that variable locally
> inside that function once.
>
> All in all, it might not be wise to modify the behavior of third-party
> packages from inside Emacs this way (they might have other expectations,
> or there's going to appear a new major mode that needs the same
> treatment anyway).
>
> Setting a variable to be used through mode inheritance or delegation is
> fine, but if that doesn't help, I would probably stop at defining a
> helper function or two and documenting how it should be used. And then
> maybe work with AUCTeX people to get the remaining necessary changes in
> from their side (or just leaving that up to the user, depending on how
> functional the default config ends up being).






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-14 16:11                         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-09-14 23:55                           ` Dmitry Gutov
  2023-09-15  6:47                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry Gutov @ 2023-09-14 23:55 UTC (permalink / raw)
  To: David Fussner, Eli Zaretskii; +Cc: 53749, Lars Ingebrigtsen, Stefan Kangas

Hi David,

On 14/09/2023 19:11, David Fussner via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:

> Once again, many thanks for the feedback. I'm still not certain I
> agree about the risks involved in creating a new "thing" type, as it
> really only appears in a small number of commands and then only in TeX
> buffers, and generally I tried to design the code to keep out of the
> way of anything outside of such buffers, but needless to say you see
> further and more clearly than I can. I've been reviewing your comments
> and my code, and have a few ideas and questions about how to go
> forward. Though I haven't coded it yet, it's possible that the
> simplest (and least intrusive) approach to follow would do something
> like this:

I agree that the risks are probably low, and my review stems from the 
general approach that doing global modifications to the environment can 
lead to problems. It might or might not happen in your case. If anything 
happens, though, the same modifications tend to make it harder to 
investigate, e.g. to find where a particular bit of behavior comes from. 
So, generally, the more local the implementation of a feature can be, 
the better.

But I'm no maintainer of tex-mode, and whatever choices are made here 
won't have effect outside of TeX, so if somebody wants to disagree with 
me, they're more than welcome to.

> 1. Get rid of the new texsymbol "thing" and just use a buffer-local
> value of find-tag-default-function and a rather more thoroughly
> modified syntax table to control what "symbol" means, but _only_ in
> the context of commands that use find-tag-default-function. I think
> I'd lose the ability to change the behavior of
> isearch-forward-thing-at-point and project-find-regexp, as I can't see
> how to temporarily modify the syntax table there, though perhaps I'm
> missing something.

I'm suggesting this approach together with defining a "new" backend for 
TeX. Quotes because while it's going to have its own name, it's mostly 
going to perform forwarding to an existing backend (etags).

This should make it practical for you to treat identifiers in 
xref-backend-definitions differently from those in 
xref-backend-references and xref-backend-apropos.

If you define find-tag-default-function, you don't have to change the 
syntax table too: it might be easier to search around with a regexp.

But for the new backend, you can also define the method 
xref-backend-identifier-at-point, where you would invoke the necessary 
bounds-of-thing logic. Then you won't need a change in 
find-tag-default-function.

Either way, though, the major modes will need to set up 
xref-backend-functions instead (with add-hook). This could also be done 
in a minor mode, which you'd enable in any TeX-related major modes that 
you use.
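
A minimal sketch of such a forwarding backend, assuming Emacs 28 or
later (for the quoted `eql' specializer).  The names my-tex,
my-tex-xref-backend, my-tex-identifier-at-point and my-tex-xref-setup,
and the set of characters treated as part of an identifier, are
hypothetical and not taken from the patch:

```elisp
(require 'cl-lib)
(require 'xref)
(require 'etags)

(defun my-tex-identifier-at-point ()
  "Return the TeX identifier around point, or nil if there is none.
Hypothetical helper: an optional leading backslash followed by
letters, digits, colons, at-signs and underscores."
  (save-excursion
    (skip-chars-backward "a-zA-Z0-9:@_")
    ;; Include the escape character if it immediately precedes the name.
    (when (eq (char-before) ?\\)
      (backward-char))
    (let ((start (point)))
      (when (eq (char-after) ?\\)
        (forward-char))
      (skip-chars-forward "a-zA-Z0-9:@_")
      (unless (= start (point))
        (buffer-substring-no-properties start (point))))))

(defun my-tex-xref-backend ()
  "Hypothetical entry for `xref-backend-functions'."
  'my-tex)

(cl-defmethod xref-backend-identifier-at-point ((_backend (eql 'my-tex)))
  (my-tex-identifier-at-point))

;; Definitions and completion are simply forwarded to the etags backend.
(cl-defmethod xref-backend-definitions ((_backend (eql 'my-tex)) identifier)
  (xref-backend-definitions 'etags identifier))

(cl-defmethod xref-backend-identifier-completion-table
  ((_backend (eql 'my-tex)))
  (xref-backend-identifier-completion-table 'etags))

(defun my-tex-xref-setup ()
  "Activate the backend buffer-locally; meant for a TeX mode hook."
  (add-hook 'xref-backend-functions #'my-tex-xref-backend nil t))
```

A real backend would presumably also want its own methods for
xref-backend-references and xref-backend-apropos, which is where the
identifier could be treated differently (for example, with or without
the backslash).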

> 2. Simply eliminate the TeX escape character entirely, both from tag
> names in a TAGS file and from any thing-at-point in a TeX buffer. I
> think this would eliminate the need to distinguish among the various
> xref commands in terms of whether they can or can't handle the escape
> character. It would also eliminate the need for the new user option in
> etags.c, as there would no longer be any code to cope with the escape
> character when finding a (thing-at-point 'symbol). This is slightly
> less powerful than the default I proposed, but there are probably many
> use cases where it won't matter at all (though it would for my own,
> possibly eccentric, use case).

I wanted to ask whether including the backslash is important enough (it 
should not matter too much for disambiguation), but I figured it must 
be, otherwise you wouldn't go to all this effort.

If not, it would simplify things a lot, though.






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-14 23:55                           ` Dmitry Gutov
@ 2023-09-15  6:47                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-15  6:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 53749, Eli Zaretskii, Lars Ingebrigtsen, Stefan Kangas


Thanks Dmitry,

I'll make another stab at a "new" backend, as suggested. I'll have a look
at the escape char thing, too, and see how I feel about dropping it. It
shouldn't take 18 months this time!

Best,

David.



* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-14  6:10                         ` Eli Zaretskii
@ 2023-09-15 18:45                           ` Tassilo Horn
  2023-09-16  5:53                             ` Ikumi Keita
  0 siblings, 1 reply; 66+ messages in thread
From: Tassilo Horn @ 2023-09-15 18:45 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: 53749, Ikumi Keita, dfussner, stefankangas, Dmitry Gutov, larsi,
	Stefan Monnier

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli & all, thanks for inviting me to the discussion.  I'm adding
Keita, too, because he's currently by far the most active AUCTeX
developer.

>> Do AUCTeX modes inherit from tex-mode?

Not currently but in Keita's feature/fix-mode-names-overlap branch which
will probably become AUCTeX 14, I guess.

>> Or all call tex-common-initialization? Then you could set that
>> variable locally inside that function once.

Again, not right now but probably in the future.

>> All in all, it might not be wise to modify the behavior of
>> third-party packages from inside Emacs this way (they might have
>> other expectations, or there's going to appear a new major mode that
>> needs the same treatment anyway).
>> 
>> Setting a variable to be used through mode inheritance or delegation
>> is fine, but if that doesn't help, I would probably stop at defining
>> a helper function or two and documenting how it should be used. And
>> then maybe work with AUCTeX people to get the remaining necessary
>> changes in from their side (or just leaving that up to the user,
>> depending on how functional the default config ends up being).

That sounds reasonable.

Bye,
Tassilo






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-15 18:45                           ` Tassilo Horn
@ 2023-09-16  5:53                             ` Ikumi Keita
  2023-09-17  8:49                               ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: Ikumi Keita @ 2023-09-16  5:53 UTC (permalink / raw)
  To: Tassilo Horn
  Cc: 53749, dfussner, Stefan Monnier, Dmitry Gutov, larsi,
	Eli Zaretskii, stefankangas

Hi all,

>>>>> Tassilo Horn <tsdh@gnu.org> writes:
>>> Do AUCTeX modes inherit from tex-mode?

> Not currently but in Keita's feature/fix-mode-names-overlap branch

Currently, no. In the feature/fix-mode-names-overlap branch, the major mode
inheritance relations are:

text-mode      --+-- TeX-mode
                 +-- Texinfo-mode

TeX-mode       --+-- plain-TeX-mode
                 +-- LaTeX-mode
                 +-- ConTeXt-mode

plain-TeX-mode --+-- AmSTeX-mode
                 +-- japanese-plain-TeX-mode

LaTeX-mode     --+-- docTeX-mode
                 +-- japanese-LaTeX-mode

(There are ConTeXt-en-mode and ConTeXt-nl-mode as well, but my current
personal plan is to delete them.)

I don't think it's a good idea to inherit from tex-mode; it isn't
difficult to change the "top" mode from text-mode to tex-mode, but in
that case LaTeX-mode can't have both built-in latex-mode and TeX-mode as
its parent mode.
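
To illustrate the constraint with a small sketch: define-derived-mode
takes a single PARENT argument, so a mode declared this way sits on one
inheritance chain.  The mode names below are hypothetical stand-ins,
not the AUCTeX or built-in modes, and the sketch does not cover any
other mechanism for declaring additional parents:

```elisp
(require 'cl-lib)

;; Hypothetical modes, named only for this illustration.
(define-derived-mode my-tex-base-mode text-mode "MyTeX"
  "Stand-in for a common TeX parent mode.")

(define-derived-mode my-latex-mode my-tex-base-mode "MyLaTeX"
  "Stand-in for a LaTeX mode; it names exactly one parent.")

(defun my-latex-mode-ancestry ()
  "Return which of three candidate ancestors `my-latex-mode' derives from."
  (with-temp-buffer
    (my-latex-mode)
    (cl-remove-if-not #'derived-mode-p
                      '(my-tex-base-mode text-mode prog-mode))))
```

Calling (my-latex-mode-ancestry) reports the whole chain up through
text-mode, but nothing outside that chain.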

(Maybe an exception is Texinfo-mode. It would make sense to have
built-in texinfo-mode as parent of Texinfo-mode. If there is a good
reason to do so, I won't object strongly.)

> which will probably become AUCTeX 14, I guess.

I hope so. :-)

>>> Or all call tex-common-initialization? Then you could set that
>>> variable locally inside that function once.

> Again, not right now but probably in the future.

Currently, they don't call tex-common-initialization, but we can do so
in TeX-mode. (But I haven't considered its pros and cons deeply yet.)

Best regards,
Ikumi Keita
#StandWithUkraine #StopWarInUkraine






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-16  5:53                             ` Ikumi Keita
@ 2023-09-17  8:49                               ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-22 13:06                                 ` Arash Esbati
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-09-17  8:49 UTC (permalink / raw)
  To: Ikumi Keita
  Cc: 53749, Tassilo Horn, Stefan Monnier, Dmitry Gutov, larsi,
	Eli Zaretskii, stefankangas

Hi Tassilo and Keita,

Thanks for the clarifications. If you look at the current patch to
tex-mode.el, there's one function call added to TeX-mode-hook, mainly
for my own testing purposes, but no matter what the final patch looks
like it should only similarly require a single function call in an
AUCTeX hook to activate the new xref code there, along with one in
tex-common-initialization for the in-tree modes. If and when all
parties are satisfied by the patch I'll certainly be in touch with you
to find out how you'd prefer to handle activating it (or not) in
AUCTeX. The current state of affairs is a convenience for me and for
anyone else who cares to test the code.

Thanks again,

David.







* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2023-09-17  8:49                               ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-22 13:06                                 ` Arash Esbati
  2024-04-22 14:56                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 66+ messages in thread
From: Arash Esbati @ 2024-04-22 13:06 UTC (permalink / raw)
  To: David Fussner
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

David Fussner <dfussner@googlemail.com> writes:

> Thanks for the clarifications. If you look at the current patch to
> tex-mode.el, there's one function call added to TeX-mode-hook, mainly
> for my own testing purposes, but no matter what the final patch looks
> like it should only similarly require a single function call in an
> AUCTeX hook to activate the new xref code there, along with one in
> tex-common-initialization for the in-tree modes. If and when all
> parties are satisfied by the patch I'll certainly be in touch with you
> to find out how you'd prefer to handle activating it (or not) in
> AUCTeX. The current state of affairs is a convenience for me and for
> anyone else who cares to test the code.

Hi David,

I just wanted to come back to this report and ask whether there is any
progress.  It would be nice to get Xref working within TeX buffers.

TIA.  Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 13:06                                 ` Arash Esbati
@ 2024-04-22 14:56                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-22 16:15                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-23 12:04                                     ` Arash Esbati
  2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 2 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-22 14:56 UTC (permalink / raw)
  To: Arash Esbati
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

Hi Arash,

Thanks for the nudge. I am in fact in the final stages of preparing a
new patch to get xref working in TeX buffers. As usual, the main
complexities are in xref-find-references, and while I have you here I
wonder whether I could ask your thoughts about addressing one part of
this complexity.

The semantic/symref backend used by xref-find-references greps in
files matching the major-mode of the buffer where the user calls the
command. It looks in semantic-symref-filepattern-alist for
file-extensions matching the major-mode, and if that fails it looks in
auto-mode-alist. When both fail to produce any file extensions it
tells the user to customize semantic-symref-filepattern-alist. Also,
if it finds things in s-s-f-a, it doesn't go on to auto-mode-alist, so
s-s-f-a has to be complete in order to be useful. In effect, we need
s-s-f-a to hold all the extensions for all the modes that can appear
as values of major-mode, and I notice that AUCTeX has started to
populate that alist, though incompletely. I'm also aware that many
packages add their own extensions to files which are basically TeX or
LaTeX files, and I wonder whether we can really keep up with the whole
of CTAN in terms of providing complete lists of extensions for
s-s-f-a.

As an example of where we are, if you open a plain-tex-mode (or
plain-TeX-mode) file and M-? with point on some standard word you'll
currently get the message to customize s-s-f-a, because
auto-mode-alist has only tex-mode and s-s-f-a doesn't cover them,
either.

I ask you Arash, therefore, as an AUCTeX and emacs developer, and I
ask any other developers also, whether you'd prefer me just to put
together as complete a list as possible for addition to s-s-f-a --
with patches for AUCTeX for all the new modes there -- or, and this is
what I'm finishing up now, whether you'd consider it overkill to have
code that constructs (or modifies) entries in s-s-f-a by searching in
auto-mode-alist and in the buffer-list for all the file extensions
emacs knows about that relate to the current major-mode. Changes in
s-s-f-a wouldn't be persistent across sessions, but they would allow a
user to open a file with any file extension, run latex-mode, and M-?
would work in that buffer, and search that buffer from another buffer
with a related major-mode, all without needing any user intervention.
It would also allow customizations in auto-mode-alist to appear in
s-s-f-a automatically, which seems convenient to me.
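
As a rough sketch of the buffer-list half of that idea (not the code
from the forthcoming patch): the hypothetical helper below collects
glob patterns from the files visited by buffers whose major mode
derives from any of MODES.  Its name and the "*.EXT" pattern format
are assumptions made for illustration:

```elisp
(require 'cl-lib)

(defun my-tex-buffer-file-patterns (modes)
  "Return glob patterns for files visited in buffers derived from MODES.
Hypothetical helper, not part of the patch."
  (let (patterns)
    (dolist (buf (buffer-list))
      (with-current-buffer buf
        (when (and buffer-file-name
                   (cl-some #'derived-mode-p modes))
          (let ((ext (file-name-extension buffer-file-name)))
            ;; Skip files without an extension; avoid duplicate patterns.
            (when ext
              (cl-pushnew (concat "*." ext) patterns :test #'equal))))))
    (nreverse patterns)))
```

The result could then be merged into the s-s-f-a entry for the current
major-mode before the search runs.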

If your answer is "show me the code", I'll do that shortly, but I
wondered whether anyone had any preliminary thoughts on the matter.

Best, and sorry for the long question,

David.







* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 14:56                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-22 16:15                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-22 16:37                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-23 12:04                                     ` Arash Esbati
  1 sibling, 1 reply; 66+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-22 16:15 UTC (permalink / raw)
  To: David Fussner
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Arash Esbati, stefankangas,
	Tassilo Horn, Eli Zaretskii

> auto-mode-alist. When both fail to produce any file extensions it
> tells the user to customize semantic-symref-filepattern-alist.

Yes, this is not ideal.

I think ideally we'd build a regexp from `auto-mode-alist` and
`major-mode-remap-alist/defaults`, tho it may require additional info.

E.g. we may need to complement that with additional "related modes"
(e.g. html modes may want to mention `php-mode` as "related").
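
A first step in that direction could look roughly like the sketch
below, which only gathers the `auto-mode-alist` regexps that name one
of a list of modes.  The function name is hypothetical, and the sketch
deliberately ignores `major-mode-remap-alist` and any "related modes"
information:

```elisp
(defun my-auto-mode-regexps (modes)
  "Return the `auto-mode-alist' regexps whose entries name one of MODES.
Hypothetical helper for illustration only."
  (let (regexps)
    (dolist (entry auto-mode-alist)
      (let ((fn (cdr entry)))
        ;; Entries look like (REGEXP . FUNCTION) or (REGEXP FUNCTION NON-NIL).
        (when (consp fn)
          (setq fn (car fn)))
        (when (memq fn modes)
          (push (car entry) regexps))))
    (nreverse regexps)))
```

The collected regexps could then be combined with `regexp-opt`-style
alternation or passed to the search one by one.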


        Stefan







* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 16:15                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-22 16:37                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-22 17:16                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-22 16:37 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Arash Esbati, stefankangas,
	Tassilo Horn, Eli Zaretskii

Thank you, Stefan -- I didn't know about
major-mode-remap-alist/defaults. Do you think TeX and friends are
handled by emacs distinctively enough to warrant keeping some
specialist extension-handling code in tex-mode.el, or do you think
some changes should be more generally available, in grep.el, say? (I'm
wondering whether it might be useful, for example, for
semantic-symref-derive-find-filepatterns to add extensions from
auto-mode-alist even when some extensions are found in
semantic-symref-filepattern-alist.)

David.







* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 16:37                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-22 17:16                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-22 17:25                                           ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-24  0:09                                           ` Dmitry Gutov
  0 siblings, 2 replies; 66+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-22 17:16 UTC (permalink / raw)
  To: David Fussner
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Arash Esbati, stefankangas,
	Tassilo Horn, Eli Zaretskii

> (I'm wondering whether it might be useful, for example, for
> semantic-symref-derive-find-filepatterns to add extensions from
> auto-mode-alist even when some extensions are found in
> semantic-symref-filepattern-alist.)

Assuming we can get good enough results from `auto-mode-alist` and
friends, I think we'd want to mark `semantic-symref-filepattern-alist`
as obsolete.
But before that, we need to check the assumption.

In the short term, for AUCTeX the only workable option seems to be to
add entries to `semantic-symref-filepattern-alist`.
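
For illustration, such entries might be added along these lines.  The
mode names are the in-tree ones (the AUCTeX modes would need their own
entries), and the extension lists are assumptions, not a vetted set:

```elisp
(require 'semantic/symref/grep)

;; Each entry is (MAJOR-MODE PATTERN...), with shell-style file patterns.
;; The patterns below are illustrative guesses only.
(dolist (entry '((plain-tex-mode "*.tex")
                 (latex-mode "*.tex" "*.ltx" "*.sty" "*.cls" "*.dtx")))
  (add-to-list 'semantic-symref-filepattern-alist entry))
```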


        Stefan







* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 17:16                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-22 17:25                                           ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-24  0:09                                           ` Dmitry Gutov
  1 sibling, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-22 17:25 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Arash Esbati, Stefan Kangas,
	Tassilo Horn, Eli Zaretskii


Thank you. I hope one or two others might join in, but I'll have some code
to look over in a few days, in any case.

David.



* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 14:56                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-22 16:15                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-23 12:04                                     ` Arash Esbati
  2024-04-23 13:21                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 66+ messages in thread
From: Arash Esbati @ 2024-04-23 12:04 UTC (permalink / raw)
  To: David Fussner
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

David Fussner <dfussner@googlemail.com> writes:

> Thanks for the nudge. I am in fact in the final stages of preparing a
> new patch to get xref working in TeX buffers.

Thanks for the update.

> The semantic/symref backend used by xref-find-references greps in
> files matching the major-mode of the buffer where the user calls the
> command. It looks in semantic-symref-filepattern-alist for
> file-extensions matching the major-mode, and if that fails it looks in
> auto-mode-alist. When both fail to produce any file extensions it
> tells the user to customize semantic-symref-filepattern-alist. Also,
> if it finds things in s-s-f-a, it doesn't go on to auto-mode-alist, so
> s-s-f-a has to be complete in order to be useful. In effect, we need
> s-s-f-a to hold all the extensions for all the modes that can appear
> as values of major-mode, and I notice that AUCTeX has started to
> populate that alist, though incompletely.

I'm not familiar with the way xref works, but reading the above, xref
doesn't care about modes set per file variables, is this correct?

> I'm also aware that many packages add their own extensions to files
> which are basically TeX or LaTeX files, and I wonder whether we can
> really keep up with the whole of CTAN in terms of providing complete
> lists of extensions for s-s-f-a.

I think this is almost impossible.  Besides the effort, take for example
the .cnf extension which is used by other programs as well, so
associating it with LaTeX-mode wouldn't make sense, IMO.  Finally, I
think many packages are written in .dtx format and the ones with many
files with different extensions (.def, .enc, .fd, ...) usually extract
them from the .dtx via an .ins file, so the edited source is inside the
.dtx, and we don't need to care about these extensions.

> As an example of where we are, if you open a plain-tex-mode (or
> plain-TeX-mode) file and M-? with point on some standard word you'll
> currently get the message to customize s-s-f-a, because
> auto-mode-alist has only tex-mode and s-s-f-a doesn't cover them,
> either.

This is possibly the next mess since .tex can be plain-TeX, ConTeXt,
LaTeX ...  So in general, I second what Stefan M. wrote in his other
message, but respecting/using file local variables could help here (if
it doesn't work ATM, see above), e.g.:

--8<---------------cut here---------------start------------->8---
\beginsection 1. Introduction.
This is the start of the introduction.
\bye

%%% Local Variables:
%%% mode: plain-TeX
%%% TeX-master: t
%%% End:
--8<---------------cut here---------------end--------------->8---

HTH.  Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-23 12:04                                     ` Arash Esbati
@ 2024-04-23 13:21                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-23 13:21 UTC (permalink / raw)
  To: Arash Esbati
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

Thanks for the reply, Arash.

> I'm not familiar with the way xref works, but reading the above, xref
> doesn't care about modes set per file variables, is this correct?

As far as I know, the default xref-find-references deals strictly in
file extensions.

> I think this is almost impossible.  Besides the effort, take for example
> the .cnf extension which is used by other programs as well, so
> associating it with LaTeX-mode wouldn't make sense, IMO.

Agreed -- this may be an argument against my current approach. I hope,
however, that the way xref-find-references searches by directory or by
project should limit spurious searching when a more common extension
appears on a TeX file.

> This is possibly the next mess since .tex can be plain-TeX, ConTeXt,
> LaTeX ...

I guess currently I'm thinking that this is sort of a feature, as
searching for symbols in files/buffers from many closely-related modes
may produce useful matches. The code I'm finishing up tends to search
more files rather than fewer, but it should be possible to prune this
if it's deemed too messy.

> So in general, I second what Stefan M. wrote in his other
> message, but respecting/using file local variables could help here.

Currently, the code takes into account file-local variables only by
including in the search list extensions of TeX-related buffers, which
may well only have become TeX-related due to such variables.

I'll post a patch as soon as I solve an outstanding issue or two, and
we'll see where we are.

Thank you indeed for your help,

David.

On Tue, 23 Apr 2024 at 13:04, Arash Esbati <arash@gnu.org> wrote:
>
> David Fussner <dfussner@googlemail.com> writes:
>
> > Thanks for the nudge. I am in fact in the final stages of preparing a
> > new patch to get xref working in TeX buffers.
>
> Thanks for the update.
>
> > The semantic/symref backend used by xref-find-references greps in
> > files matching the major-mode of the buffer where the user calls the
> > command. It looks in semantic-symref-filepattern-alist for
> > file-extensions matching the major-mode, and if that fails it looks in
> > auto-mode-alist. When both fail to produce any file extensions it
> > tells the user to customize semantic-symref-filepattern-alist. Also,
> > if it finds things in s-s-f-a, it doesn't go on to auto-mode-alist, so
> > s-s-f-a has to be complete in order to be useful. In effect, we need
> > s-s-f-a to hold all the extensions for all the modes that can appear
> > as values of major-mode, and I notice that AUCTeX has started to
> > populate that alist, though incompletely.
>
> I'm not familiar with the way xref works, but reading the above, xref
> doesn't care about modes set per file variables, is this correct?
>
> > I'm also aware that many packages add their own extensions to files
> > which are basically TeX or LaTeX files, and I wonder whether we can
> > really keep up with the whole of CTAN in terms of providing complete
> > lists of extensions for s-s-f-a.
>
> I think this is almost impossible.  Besides the effort, take for example
> the .cnf extension which is used by other programs as well, so
> associating it with LaTeX-mode wouldn't make sense, IMO.  Finally, I
> think many packages are written in .dtx format and the ones with many
> files with different extensions (.def, .enc, .fd, ...) usually extract
> them from the .dtx via an .ins file, so the edited source is inside the
> .dtx, and we don't need to care about these extensions.
>
> > As an example of where we are, if you open a plain-tex-mode (or
> > plain-TeX-mode) file and M-? with point on some standard word you'll
> > currently get the message to customize s-s-f-a, because
> > auto-mode-alist has only tex-mode and s-s-f-a doesn't cover them,
> > either.
>
> This is possibly the next mess since .tex can be plain-TeX, ConTeXt,
> LaTeX ...  So in general, I second what Stefan M. wrote in his other
> message, but respecting/using file local variables could help here (if
> it doesn't work ATM, see above), e.g.:
>
> --8<---------------cut here---------------start------------->8---
> \beginsection 1. Introduction.
> This is the start of the introduction.
> \bye
>
> %%% Local Variables:
> %%% mode: plain-TeX
> %%% TeX-master: t
> %%% End:
> --8<---------------cut here---------------end--------------->8---
>
> HTH.  Best, Arash

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 17:16                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-04-22 17:25                                           ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-24  0:09                                           ` Dmitry Gutov
  2024-04-24  9:02                                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 66+ messages in thread
From: Dmitry Gutov @ 2024-04-24  0:09 UTC (permalink / raw)
  To: Stefan Monnier, David Fussner
  Cc: 53749, Ikumi Keita, Arash Esbati, stefankangas, Tassilo Horn,
	Eli Zaretskii

On 22/04/2024 20:16, Stefan Monnier via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
>> (I'm wondering whether it might be useful, for example, for
>> semantic-symref-derive-find-filepatterns to add extensions from
>> auto-mode-alist even when some extensions are found in
>> semantic-symref-filepattern-alist.)
> Assuming we can get good enough results from `auto-mode-alist` and
> friends, I think we'd want to mark `semantic-symref-filepattern-alist`
> as obsolete.
> But before that, we need to check the assumption.

Last I checked, semantic-symref-filepattern-alist had explicit entries 
only for languages whose auto-mode-alist entries were deemed too complex 
to parse out the matching extensions from the corresponding regexps.

Or had other difficulties like the c-or-c++-mode dispatcher.

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-24  0:09                                           ` Dmitry Gutov
@ 2024-04-24  9:02                                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-24  9:02 UTC (permalink / raw)
  To: Dmitry Gutov
  Cc: 53749, Ikumi Keita, Arash Esbati, stefankangas, Tassilo Horn,
	Eli Zaretskii, Stefan Monnier

Thanks, Dmitry.

> Last I checked, semantic-symref-filepattern-alist had explicit entries
> only for languages whose auto-mode-alist entries were deemed too complex
> to parse out the matching extensions from the corresponding regexps.
>
> Or had other difficulties like the c-or-c++-mode dispatcher.

That makes sense, and clarifies a few things for me. I guess TeX has
the "plain-tex or latex or context or ams-tex" dispatcher and also
in-tree vs. AUCTeX mode names, both of which at least for the moment
make semantic-symref-filepattern-alist seem a better fit.

Best,

David.

On Wed, 24 Apr 2024 at 01:09, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> On 22/04/2024 20:16, Stefan Monnier via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
> >> (I'm wondering whether it might be useful, for example, for
> >> semantic-symref-derive-find-filepatterns to add extensions from
> >> auto-mode-alist even when some extensions are found in
> >> semantic-symref-filepattern-alist.)
> > Assuming we can get good enough results from `auto-mode-alist` and
> > friends, I think we'd want to mark `semantic-symref-filepattern-alist`
> > as obsolete.
> > But before that, we need to check the assumption.
>
> Last I checked, semantic-symref-filepattern-alist had explicit entries
> only for languages whose auto-mode-alist entries were deemed too complex
> to parse out the matching extensions from the corresponding regexps.
>
> Or had other difficulties like the c-or-c++-mode dispatcher.

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-22 13:06                                 ` Arash Esbati
  2024-04-22 14:56                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-02  0:43                                     ` Dmitry Gutov
                                                       ` (2 more replies)
  1 sibling, 3 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-29 14:15 UTC (permalink / raw)
  To: Arash Esbati
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

[-- Attachment #1: Type: text/plain, Size: 2615 bytes --]

Hi Dmitry and Arash,

Here's my third attempt at a working xref backend for TeX. I'll try
quickly to summarize what's in it:

1. I've modified etags so that it creates findable tags for as many
different sorts of TeX construct as possible, including those written
in the new expl3 syntax. I've now removed the escape character from
the tag names, as this simplifies code all around.

2. 4 of the 6 xref backend functions just call the etags backend.

3. xref-backend-identifier-at-point is modified to provide new regexps
for delineating TeX symbols, and there's also code to cope with expl3
constructs slightly differently in M-? than in the other two main xref
commands.

4. xref-backend-references is a wrapper for the standard backend, the
wrapper doing two things: first, it tries to accumulate as many file
extensions for the current major-mode as Emacs knows about, and second
it creates a bespoke syntax-propertize-function for strings that
aren't entirely composed of symbol or word characters. It applies this
function to file-visiting buffers and lets xref apply it in the
*xref-temp buffer, though I had to add a one-liner in xref.el to fix
what I believe is a minor bug there preventing syntax-propertize from
doing its work when the temp buffer holds text from a new file. (I can
provide a recipe for this if you want.)

5. Slightly unrelatedly, I've added new syntax-propertize-rules to
latex-mode so that expl3 constructs with the underscore aren't
fontified as subscripts, which makes such code unreadable. I'm happy
to split this off as another patch.
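
To make item 1 a little more concrete, here is a minimal sketch of the
tag-name extraction for a line such as
\cs_new:Npn \__hook_tl_gput:Nn { \ERROR }.  It is not the code in the
patch: the function name sketch_tex_tag_name and its buffer-based
interface are made up for illustration, it assumes the escape
character is a backslash and the grouping characters are braces, and
it ignores optional arguments.  The input is the text that follows the
tag token (here, what comes after "cs_new").

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not taken from etags.c.  CP points just past a
   tag token such as "cs_new" or "newcommand".  Copy the tag name into
   OUT (capacity OUTSZ) and return its length, or 0 if there is none.  */
size_t
sketch_tex_tag_name (const char *cp, char *out, size_t outsz)
{
  int is_expl3 = 0;
  size_t n = 0;

  if (outsz == 0)
    return 0;

  while (isspace ((unsigned char) *cp))
    cp++;

  /* Skip an expl3 argument specifier such as ":Npn", or the star of a
     starred variant, so that neither becomes the tag name.  */
  if (*cp == ':')
    {
      while (*cp != '\0' && !isspace ((unsigned char) *cp) && *cp != '{')
        cp++;
      is_expl3 = 1;
    }
  else if (*cp == '*')
    cp++;

  while (isspace ((unsigned char) *cp))
    cp++;

  /* Step inside an opening group, then drop one escape character.  */
  if (*cp == '{')
    {
      cp++;
      while (isspace ((unsigned char) *cp))
        cp++;
    }
  if (*cp == '\\')
    cp++;

  /* Copy the name.  In expl3 code, stop at the colon that starts the
     name's own argument specifier; elsewhere a colon is kept.  */
  while (*cp != '\0' && !isspace ((unsigned char) *cp)
         && *cp != '{' && *cp != '}' && *cp != '#' && *cp != '\\'
         && !(is_expl3 && *cp == ':')
         && n + 1 < outsz)
    out[n++] = *cp++;

  out[n] = '\0';
  return n;
}
```

Called on ":Npn \__hook_tl_gput:Nn { \ERROR }" this yields
__hook_tl_gput, and on "{\mycommand}[1]{...}" it yields mycommand, so
the tag names contain neither the escape character nor the argument
specifier.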

All comments gratefully received, and thanks,

David.

On Mon, 22 Apr 2024 at 14:06, Arash Esbati <arash@gnu.org> wrote:
>
> David Fussner <dfussner@googlemail.com> writes:
>
> > Thanks for the clarifications. If you look at the current patch to
> > tex-mode.el, there's one function call added to TeX-mode-hook, mainly
> > for my own testing purposes, but no matter what the final patch looks
> > like it should only similarly require a single function call in an
> > AUCTeX hook to activate the new xref code there, along with one in
> > tex-common-initialization for the in-tree modes. If and when all
> > parties are satisfied by the patch I'll certainly be in touch with you
> > to find out how you'd prefer to handle activating it (or not) in
> > AUCTeX. The current state of affairs is a convenience for me and for
> > anyone else who cares to test the code.
>
> Hi David,
>
> I just wanted to come back to this report and ask if there is any
> progress?  It would be nice to get Xref working within TeX buffers.
>
> TIA.  Best, Arash

[-- Attachment #2: 0001-Provide-a-modified-xref-backend-for-TeX-buffers.patch --]
[-- Type: text/x-patch, Size: 30939 bytes --]

From 64a4f7c7b89b4475a3841b54288c25bcc4ebde3d Mon Sep 17 00:00:00 2001
From: David Fussner <dfussner@googlemail.com>
Date: Mon, 29 Apr 2024 15:05:03 +0100
Subject: [PATCH] Provide a modified xref backend for TeX buffers

* lib-src/etags.c (TeX_commands): Improve parsing of commands in TeX
buffers.
(TEX_defenv): Expand list of commands to tag by default in TeX
buffers.
(TeX_help):
* doc/emacs/maintaining.texi (Tag Syntax): Document new tagged
commands.
(Identifier Search): Add note about semantic-symref-filepattern-alist,
auto-mode-alist, and xref-find-references.

* lisp/progmodes/xref.el (xref--collect-matches): Ensure
syntax-propertize actually runs in the *xref-temp buffer for each
new file searched.
* lisp/textmodes/tex-mode.el (tex-font-lock-suscript): Disable
subscript face in expl3 constructs.
(latex-syntax-propertize-rules): Add two new rules to give symbol
syntax to the standard components of expl3 constructs.
(tex-common-initialization): Set up xref backend for in-tree TeX
modes.
(tex--thing-at-point, tex-thingatpt--beginning-of-symbol)
(tex-thingatpt--end-of-symbol, tex--bounds-of-symbol-at-point):
New functions to return 'thing-at-point' for xref backend.
(tex-esc-and-group-chars): New var to do the same.
(xref-backend-identifier-at-point): New TeX backend method to provide
symbols for processing by xref.
(xref-backend-identifier-completion-table)
(xref-backend-identifier-completion-ignore-case)
(xref-backend-definitions, xref-backend-apropos): Placeholders to
call the standard 'etags' xref backend methods.
(xref-backend-references): Wrapper to call the default xref backend
method, finding as many relevant files as possible and using a bespoke
syntax-propertize-function.
(tex--collect-file-extensions, tex-xref-syntax-function): Helper
function and macro for previous.
(tex-find-references-syntax-table, tex--buffers-list)
(tex--last-ref-syntax-flag, tex--old-syntax-function): New vars for
same.
---
 doc/emacs/maintaining.texi |  34 +++-
 lib-src/etags.c            | 183 ++++++++++++++++++--
 lisp/progmodes/xref.el     |   1 +
 lisp/textmodes/tex-mode.el | 336 ++++++++++++++++++++++++++++++++++++-
 4 files changed, 537 insertions(+), 17 deletions(-)

diff --git a/doc/emacs/maintaining.texi b/doc/emacs/maintaining.texi
index 579098c81b1..2fbb964a7a0 100644
--- a/doc/emacs/maintaining.texi
+++ b/doc/emacs/maintaining.texi
@@ -2529,6 +2529,15 @@ Identifier Search
 referenced.  The XREF mode commands are available in this buffer, see
 @ref{Xref Commands}.
 
+When invoked in a buffer whose major mode uses the @code{etags} backend,
+@kbd{M-?} searches files and buffers whose major mode matches that of
+the original buffer.  It guesses that mode from file extensions, so if
+@kbd{M-?} seems to be skipping relevant buffers or files, try
+customizing either the variable @code{semantic-symref-filepattern-alist}
+(if your buffer's major mode already has an entry in it), or
+@code{auto-mode-alist} (if not), thereby informing @code{xref} of the
+missing extensions (@pxref{Choosing Modes}).
+
 @vindex xref-auto-jump-to-first-xref
   If the value of the variable @code{xref-auto-jump-to-first-xref} is
 @code{t}, @code{xref-find-references} automatically jumps to the first
@@ -2749,8 +2758,29 @@ Tag Syntax
 @code{\section}, @code{\subsection}, @code{\subsubsection},
 @code{\eqno}, @code{\label}, @code{\ref}, @code{\cite},
 @code{\bibitem}, @code{\part}, @code{\appendix}, @code{\entry},
-@code{\index}, @code{\def}, @code{\newcommand}, @code{\renewcommand},
-@code{\newenvironment} and @code{\renewenvironment} are tags.
+@code{\index}, @code{\def}, @code{\edef}, @code{\gdef}, @code{\xdef},
+@code{\newcommand}, @code{\renewcommand}, @code{\newenvironment},
+@code{\renewenvironment}, @code{\DeclareRobustCommand},
+@code{\newrobustcmd}, @code{\renewrobustcmd}, @code{\providecommand},
+@code{\providerobustcmd}, @code{\NewDocumentCommand},
+@code{\RenewDocumentCommand}, @code{\ProvideDocumentCommand},
+@code{\DeclareDocumentCommand}, @code{\NewExpandableDocumentCommand},
+@code{\RenewExpandableDocumentCommand},
+@code{\ProvideExpandableDocumentCommand},
+@code{\DeclareExpandableDocumentCommand},
+@code{\NewDocumentEnvironment}, @code{\RenewDocumentEnvironment},
+@code{\ProvideDocumentEnvironment},
+@code{\DeclareDocumentEnvironment}, @code{\csdef}, @code{\csedef},
+@code{\csgdef}, @code{\csxdef}, @code{\csletcs}, @code{\cslet},
+@code{\letcs}, @code{\let}, @code{\cs_new_protected_nopar},
+@code{\cs_new_protected}, @code{\cs_new_nopar}, @code{\cs_new_eq},
+@code{\cs_new}, @code{\cs_set_protected_nopar},
+@code{\cs_set_protected}, @code{\cs_set_nopar}, @code{\cs_set_eq},
+@code{\cs_set}, @code{\cs_gset_protected_nopar},
+@code{\cs_gset_protected}, @code{\cs_gset_nopar}, @code{\cs_gset_eq},
+@code{\cs_gset}, @code{\cs_generate_from_arg_count}, and
+@code{\cs_generate_variant} are tags.  So too are the arguments of any
+starred variants of these commands.
 
 Other commands can make tags as well, if you specify them in the
 environment variable @env{TEXTAGS} before invoking @command{etags}.  The
diff --git a/lib-src/etags.c b/lib-src/etags.c
index 032cfa8010b..8b79e92abf1 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -792,8 +792,24 @@ #define STDIN 0x1001		/* returned by getopt_long on --parse-stdin */
 "In LaTeX text, the argument of any of the commands '\\chapter',\n\
 '\\section', '\\subsection', '\\subsubsection', '\\eqno', '\\label',\n\
 '\\ref', '\\cite', '\\bibitem', '\\part', '\\appendix', '\\entry',\n\
-'\\index', '\\def', '\\newcommand', '\\renewcommand',\n\
-'\\newenvironment' or '\\renewenvironment' is a tag.\n\
+'\\index', '\\def', '\\edef', '\\gdef', '\\xdef', '\\newcommand',\n\
+'\\renewcommand', '\\newenvironment', '\\renewenvironment',\n\
+'\\DeclareRobustCommand', '\\newrobustcmd', '\\renewrobustcmd',\n\
+'\\providecommand', '\\providerobustcmd', '\\NewDocumentCommand',\n\
+'\\RenewDocumentCommand', '\\ProvideDocumentCommand',\n\
+'\\DeclareDocumentCommand', '\\NewExpandableDocumentCommand',\n\
+'\\RenewExpandableDocumentCommand', '\\ProvideExpandableDocumentCommand',\n\
+'\\DeclareExpandableDocumentCommand', '\\NewDocumentEnvironment',\n\
+'\\RenewDocumentEnvironment', '\\ProvideDocumentEnvironment',\n\
+'\\DeclareDocumentEnvironment', '\\csdef', '\\csedef', '\\csgdef',\n\
+'\\csxdef', '\\csletcs', '\\cslet', '\\letcs', '\\let',\n\
+'\\cs_new_protected_nopar', '\\cs_new_protected', '\\cs_new_nopar',\n\
+'\\cs_new_eq', '\\cs_new', '\\cs_set_protected_nopar',\n\
+'\\cs_set_protected', '\\cs_set_nopar', '\\cs_set_eq', '\\cs_set',\n\
+'\\cs_gset_protected_nopar', '\\cs_gset_protected', '\\cs_gset_nopar',\n\
+'\\cs_gset_eq', '\\cs_gset', '\\cs_generate_from_arg_count', or\n\
+'\\cs_generate_variant' is a tag.  So is the argument of any starred\n\
+variant of these commands.\n\
 \n\
 Other commands can be specified by setting the environment variable\n\
 'TEXTAGS' to a colon-separated list like, for example,\n\
@@ -5736,11 +5752,25 @@ Scheme_functions (FILE *inf)
 static linebuffer *TEX_toktab = NULL; /* Table with tag tokens */
 
 /* Default set of control sequences to put into TEX_toktab.
-   The value of environment var TEXTAGS is prepended to this.  */
+   The value of environment var TEXTAGS is prepended to this.
+   (2024) Add variants of '\def', some additional LaTeX (and
+   former xparse) commands, common variants from the
+   'etoolbox' package, and the main expl3 commands. */
 static const char *TEX_defenv = "\
-:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
-:part:appendix:entry:index:def\
-:newcommand:renewcommand:newenvironment:renewenvironment";
+:label:ref:chapter:section:subsection:subsubsection:eqno:cite:bibitem\
+:part:appendix:entry:index:def:edef:gdef:xdef:newcommand:renewcommand\
+:newenvironment:renewenvironment:DeclareRobustCommand:renewrobustcmd\
+:newrobustcmd:providecommand:providerobustcmd:NewDocumentCommand\
+:RenewDocumentCommand:ProvideDocumentCommand:DeclareDocumentCommand\
+:NewExpandableDocumentCommand:RenewExpandableDocumentCommand\
+:ProvideExpandableDocumentCommand:DeclareExpandableDocumentCommand\
+:NewDocumentEnvironment:RenewDocumentEnvironment\
+:ProvideDocumentEnvironment:DeclareDocumentEnvironment:csdef\
+:csedef:csgdef:csxdef:csletcs:cslet:letcs:let:cs_new_protected_nopar\
+:cs_new_protected:cs_new_nopar:cs_new_eq:cs_new:cs_set_protected_nopar\
+:cs_set_protected:cs_set_nopar:cs_set_eq:cs_set:cs_gset_protected_nopar\
+:cs_gset_protected:cs_gset_nopar:cs_gset_eq:cs_gset\
+:cs_generate_from_arg_count:cs_generate_variant";
 
 static void TEX_decode_env (const char *, const char *);
 
@@ -5799,19 +5829,137 @@ TeX_commands (FILE *inf)
 	      {
 		char *p;
 		ptrdiff_t namelen, linelen;
-		bool opgrp = false;
+		bool opgrp = false, one_esc = false, is_explthree = false;
 
 		cp = skip_spaces (cp + key->len);
+
+		/* 1. The canonical expl3 syntax looks something like this:
+		   \cs_new:Npn \__hook_tl_gput:Nn { \ERROR }.  First, if we
+		   want to tag any such commands, we include only the part
+		   before the colon (cs_new) in TEX_defenv or TEXTAGS.  Second,
+		   etags skips the argument specifier (including the colon)
+		   after the tag token, so that it doesn't become the tag name.
+		   Third, we set the boolean 'is_explthree' to true so that we
+		   can remove the argument specifier from the actual tag name
+		   (__hook_tl_gput).  This all allows us to include expl3
+		   constructs in TEX_defenv or in the environment variable
+		   TEXTAGS without requiring a change of separator, and it also
+		   allows us to find the definition of variant commands (with
+		   different argument specifiers) defined using, for example,
+		   \cs_generate_variant:Nn.  Please note that the expl3 spec
+		   requires etags to pay more attention to whitespace in the
+		   code.
+
+		   2. We also automatically remove the asterisk from starred
+		   variants of all commands, without the need to include the
+		   starred commands explicitly in TEX_defenv or TEXTAGS. */
+		if (*cp == ':')
+		  {
+		    while (!c_isspace (*cp) && *cp != TEX_opgrp)
+		      cp++;
+		    cp = skip_spaces (cp);
+		    is_explthree = true;
+		  }
+		else if (*cp == '*')
+		  cp++;
+
+		/* Skip the optional arguments to commands in the tags list so
+		   that these arguments don't end up as the name of the tag.
+		   The name will instead come from the argument in curly braces
+		   that follows the optional ones. */
+		while (*cp != '\0' && *cp != '%')
+		  {
+		    if (*cp == '[')
+		      {
+			while (*cp != ']' && *cp != '\0' && *cp != '%')
+			  cp++;
+		      }
+		    else if (*cp == '(')
+		      {
+			while (*cp != ')' && *cp != '\0' && *cp != '%')
+			  cp++;
+		      }
+		    else if (*cp == ']' || *cp == ')')
+		      cp++;
+		    else
+		      break;
+		  }
 		if (*cp == TEX_opgrp)
 		  {
 		    opgrp = true;
 		    cp++;
+		    cp = skip_spaces (cp); /* For expl3 code. */
 		  }
+
+		/* Removing the TeX escape character from tag names simplifies
+		   things for editors finding tagged commands in TeX buffers.
+		   This applies to Emacs but also to the tag-finding behavior
+		   of at least some of the editors that use ctags, though in
+		   the latter case this will remain suboptimal.  The
+		   undocumented ctags option '--no-duplicates' may help. */
+		if (*cp == TEX_esc)
+		  {
+		    cp++;
+		    one_esc = true;
+		  }
+
+		/* Testing !c_isspace && !c_ispunct is simpler, but halts
+		   processing at too many places.  The list as it stands tries
+		   both to ensure that tag names will derive from macro names
+		   rather than from optional parameters to those macros, and
+		   also to return findable names while still allowing for
+		   unorthodox constructs. */
 		for (p = cp;
-		     (!c_isspace (*p) && *p != '#' &&
-		      *p != TEX_opgrp && *p != TEX_clgrp);
+		     (!c_isspace (*p) && *p != '#' && *p != '=' &&
+		      *p != '[' && *p != '(' && *p != TEX_opgrp &&
+		      *p != TEX_clgrp && *p != '"' && *p != '\'' &&
+		      *p != '%' && *p != ',' && *p != '|' && *p != '$');
 		     p++)
-		  continue;
+		  /* In expl3 code we remove the argument specification from
+		     the tag name.  More generally we allow only one (deleted)
+		     escape char in a tag name, which (primarily) enables
+		     tagging a TeX command's different, possibly temporary,
+		     '\let' bindings. */
+		  if (is_explthree && *p == ':')
+		    break;
+		  else if (*p == TEX_esc)
+		    { /* Second part of test is for, e.g., \cslet. */
+		      if (!one_esc && !opgrp)
+			{
+			  one_esc = true;
+			  continue;
+			}
+		      else
+			break;
+		    }
+		  else
+		    continue;
+		/* For TeX files, tags without a name are basically cruft, and
+		   in some situations they can produce spurious and confusing
+		   matches.  Try to catch as many cases as possible where a
+		   command name is of the form '\(', but avoid, as far as
+		   possible, the spurious matches. */
+		if (p == cp)
+		  {
+		    switch (*p)
+		      { /* Include =? */
+		      case '(': case '[': case '"': case '\'':
+		      case '\\': case '!': case '=': case ',':
+		      case '|': case '$':
+			p++;
+			break;
+		      case '{': case '}': case '<': case '>':
+			if (!opgrp)
+			  {
+			      p++;
+			      if (*p == '\0' || *p == '%')
+				goto tex_next_line;
+			  }
+			break;
+		      default:
+			break;
+		      }
+		  }
 		namelen = p - cp;
 		linelen = lb.len;
 		if (!opgrp || *p == TEX_clgrp)
@@ -5820,9 +5968,18 @@ TeX_commands (FILE *inf)
 		      p++;
 		    linelen = p - lb.buffer + 1;
 		  }
-		make_tag (cp, namelen, true,
-			  lb.buffer, linelen, lineno, linecharno);
-		goto tex_next_line; /* We only tag a line once */
+		if (namelen)
+		  make_tag (cp, namelen, true,
+			    lb.buffer, linelen, lineno, linecharno);
+		/* Lines with more than one \def or \let are surprisingly
+		   common in TeX files, especially in the system files that
+		   form the basis of the various TeX formats.  This tags them
+		   all. */
+		/* goto tex_next_line; /\* We only tag a line once *\/ */
+		while (*cp != '\0' && *cp != '%' && *cp != TEX_esc)
+		  cp++;
+		if (*cp != TEX_esc)
+		  goto tex_next_line;
 	      }
 	}
     tex_next_line:
diff --git a/lisp/progmodes/xref.el b/lisp/progmodes/xref.el
index 755c3db04fd..1d2d4904b06 100644
--- a/lisp/progmodes/xref.el
+++ b/lisp/progmodes/xref.el
@@ -2129,6 +2129,7 @@ xref--collect-matches
           (erase-buffer))
         (insert text)
         (goto-char (point-min))
+        (setq syntax-propertize--done 0)
         (xref--collect-matches-1 regexp file line
                                  (point)
                                  (point-max)
diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
index 97c950267c6..d990a2dbfa9 100644
--- a/lisp/textmodes/tex-mode.el
+++ b/lisp/textmodes/tex-mode.el
@@ -647,7 +647,8 @@ tex-font-lock-suscript
 		  (setq pos (1- pos) odd (not odd)))
 		odd))
     (if (eq (char-after pos) ?_)
-	`(face subscript display (raise ,(car tex-font-script-display)))
+        (unless (equal (get-text-property pos 'syntax-table) '(3))
+	  `(face subscript display (raise ,(car tex-font-script-display))))
       `(face superscript display (raise ,(cadr tex-font-script-display))))))
 
 (defun tex-font-lock-match-suscript (limit)
@@ -695,7 +696,25 @@ tex-verbatim-environments
      ("\\\\\\(?:end\\|begin\\) *\\({[^\n{}]*}\\)"
       (1 (ignore
           (tex-env-mark (match-beginning 0)
-                        (match-beginning 1) (match-end 1))))))))
+                        (match-beginning 1) (match-end 1)))))
+     ;; The next two rules change the syntax of `:' and `_' in expl3
+     ;; constructs, so that `tex-font-lock-suscript' can fontify them
+     ;; more accurately.
+     ((concat "\\(\\(?:[\\\\[:space:]{]_\\|"
+              "[\\\\{[:space:]][^][_[:space:][:cntrl:][:digit:]\\\\{}()/=]+\\)"
+              "\\(?:_+\\(?:[^][[:space:][:cntrl:][:digit:]:\\\\{}()/#_=]+\\|"
+              "#+[1-9]\\)\\)+\\)\\([:_]?\\)")
+      (1 (ignore
+          (let* ((expr (buffer-substring-no-properties (match-beginning 1)
+                                                       (match-end 1)))
+                 (list (seq-positions expr ?_)))
+            (dolist (pos list)
+              (put-text-property (+ pos (match-beginning 1))
+                                 (1+ (+ pos (match-beginning 1)))
+                                 'syntax-table (string-to-syntax "_"))))))
+      (2 "_"))
+     ("\\\\[[:alpha:]]+\\(:\\)[[:alpha:][:space:]\n]"
+      (1 "_")))))
 
 (defun tex-env-mark (cmd start end)
   (when (= cmd (line-beginning-position))
@@ -1291,6 +1310,8 @@ tex-common-initialization
 	      (syntax-propertize-rules latex-syntax-propertize-rules))
   ;; TABs in verbatim environments don't do what you think.
   (setq-local indent-tabs-mode nil)
+  ;; Set up xref backend in TeX buffers.
+  (add-hook 'xref-backend-functions #'tex--xref-backend nil t)
   ;; Other vars that should be buffer-local.
   (make-local-variable 'tex-command)
   (make-local-variable 'tex-start-of-header)
@@ -3742,6 +3763,317 @@ tex-chktex
       (process-send-region tex-chktex--process (point-min) (point-max))
       (process-send-eof tex-chktex--process))))
 
+\f
+;;; Xref backend
+
+;; Here we lightly adapt the default etags backend for xref so that
+;; the main xref user commands (including `xref-find-definitions',
+;; `xref-find-apropos', and `xref-find-references' [on M-., C-M-., and
+;; M-?, respectively]) work in TeX buffers.  The only methods we
+;; actually modify are `xref-backend-identifier-at-point' and
+;; `xref-backend-references'.  Many of the complications here, and in
+;; `etags' itself, are due to the necessity of parsing both the old
+;; TeX syntax and the new expl3 syntax, which will continue to appear
+;; together in documents for the foreseeable future.  Synchronizing
+;; Emacs and `etags' this way aims to improve the user experience "out
+;; of the box."
+
+(defvar tex-esc-and-group-chars '(?\\ ?{ ?})
+  "The current TeX escape and grouping characters.
+
+The `etags' program only recognizes `\\' (92) and `!' (33) as
+escape characters in TeX documents, and if it detects the latter
+it also uses `<>' as the TeX grouping construct rather than `{}'.
+The TeX `xref-backend-identifier-at-point' method uses these
+three characters to delimit the `thing-at-point' in TeX buffers,
+so this variable should contain at least these three, though you
+can optionally add other characters if the default set of TeX
+symbol delimiters is inadequate for your documents.  (The
+functions `tex-thingatpt--beginning-of-symbol'
+`tex-thingatpt--end-of-symbol' construct the regexp.)  Setting
+the escape and grouping chars to anything other than `\\{}' or
+`!<>' will not be useful without changes to `etags', at least for
+commands that search tags tables, such as
+\\[xref-find-definitions] and \\[xref-find-apropos].")
+
+;; Populate `semantic-symref-filepattern-alist' for the in-tree modes;
+;; AUCTeX is doing the same for its modes.
+(defvar semantic-symref-filepattern-alist)
+(with-eval-after-load 'semantic/symref/grep
+  (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
+                     "*.bbl" "*.drv" "*.hva")
+        semantic-symref-filepattern-alist)
+  (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
+        semantic-symref-filepattern-alist)
+  (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))
+
+(defun tex--xref-backend () 'tex-etags)
+
+;; Setup AUCTeX modes (for testing purposes only).
+
+(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
+
+(defun tex-set-auctex-xref-backend ()
+  (add-hook 'xref-backend-functions #'tex--xref-backend nil t))
+
+;; `xref-find-references' currently may need this when called from a
+;; latex-mode buffer in order to search files or buffers with a .tex
+;; suffix (including the buffer from which it has been called).  We
+;; append it to `auto-mode-alist' so as not to interfere with the usual
+;; mode-setting apparatus.  Changes here and in AUCTeX should soon
+;; render it unnecessary.
+(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)
+
+(cl-defmethod xref-backend-identifier-at-point ((_backend (eql 'tex-etags)))
+  (require 'etags)
+  (tex--thing-at-point))
+
+;; The detection of `_' and `:' is a primitive method for determining
+;; whether point is on an expl3 construct.  It may fail in some
+;; instances.
+(defun tex--thing-at-point ()
+  "Demarcate `thing-at-point' for TeX `xref' backend."
+  (let ((bounds (tex--bounds-of-symbol-at-point)))
+    (when bounds
+      (let ((texsym (buffer-substring-no-properties (car bounds) (cdr bounds))))
+        (if (and (not (string-match-p "reference" (symbol-name this-command)))
+                 (seq-contains-p texsym ?_)
+                 (seq-contains-p texsym ?:))
+            (seq-take texsym (seq-position texsym ?:))
+          texsym)))))
+
+(defun tex-thingatpt--beginning-of-symbol ()
+  (and
+   (re-search-backward (concat "[]["
+                               (mapconcat #'regexp-quote
+                                          (mapcar #'char-to-string
+                                                  tex-esc-and-group-chars))
+                               "\"*`'#=&()%,|$[:cntrl:][:blank:]]"))
+   (forward-char)))
+
+(defun tex-thingatpt--end-of-symbol ()
+  (and
+   (re-search-forward (concat "[]["
+                              (mapconcat #'regexp-quote
+                                          (mapcar #'char-to-string
+                                                  tex-esc-and-group-chars))
+                              "\"*`'#=&()%,|$[:cntrl:][:blank:]]"))
+   (backward-char)))
+
+(defun tex--bounds-of-symbol-at-point ()
+  "Simplify `bounds-of-thing-at-point' for TeX `xref' backend."
+  (let ((orig (point)))
+    (ignore-errors
+      (save-excursion
+	(tex-thingatpt--end-of-symbol)
+	(tex-thingatpt--beginning-of-symbol)
+	(let ((beg (point)))
+	  (if (<= beg orig)
+	      (let ((real-end
+		     (progn
+		       (tex-thingatpt--end-of-symbol)
+		       (point))))
+		(cond ((and (<= orig real-end) (< beg real-end))
+		       (cons beg real-end))
+                      ((and (= orig real-end) (= beg real-end))
+		       (cons beg (1+ beg)))))))))));; For 1-char TeX commands.
+
+(cl-defmethod xref-backend-identifier-completion-table ((_backend
+                                                         (eql 'tex-etags)))
+  (xref-backend-identifier-completion-table 'etags))
+
+(cl-defmethod xref-backend-identifier-completion-ignore-case ((_backend
+                                                               (eql
+                                                                'tex-etags)))
+  (xref-backend-identifier-completion-ignore-case 'etags))
+
+(cl-defmethod xref-backend-definitions ((_backend (eql 'tex-etags)) symbol)
+  (xref-backend-definitions 'etags symbol))
+
+(cl-defmethod xref-backend-apropos ((_backend (eql 'tex-etags)) pattern)
+  (xref-backend-apropos 'etags pattern))
+
+;; The `xref-backend-references' method requires more code than the
+;; others for at least two main reasons: TeX authors have typically been
+;; free in their invention of new file types with new suffixes, and they
+;; have also tended sometimes to include non-symbol characters in
+;; command names.  When combined with the default Semantic Symbol
+;; Reference API, these two characteristics of TeX code mean that a
+;; command like `xref-find-references' would often fail to find any hits
+;; for a symbol at point, including the one under point in the current
+;; buffer, or it would find only some instances and skip others.
+
+(defun tex-find-references-syntax-table ()
+  (let ((st (if (boundp 'TeX-mode-syntax-table)
+                 (make-syntax-table TeX-mode-syntax-table)
+               (make-syntax-table tex-mode-syntax-table))))
+    st))
+
+(defmacro tex-xref-syntax-function (str beg end)
+  (let* (grpb tempstr
+              (shrtstr (if end
+                           (progn
+                             (setq tempstr (seq-take str (1- (length str))))
+                             (if beg
+                                 (setq tempstr (seq-drop tempstr 1))
+                               tempstr))
+                         (seq-drop str 1)))
+              (grpa (if (and beg end)
+                        (prog1
+                            (list 1 "_")
+                          (setq grpb (list 2 "_")))
+                      (list 1 "_")))
+              (re (concat beg (regexp-quote shrtstr) end))
+              (temp-rule (if grpb
+                             (list re grpa grpb)
+                           (list re grpa))))
+    `(syntax-propertize-rules ,temp-rule)))
+
+(defun tex--collect-file-extensions ()
+  (let* ((mlist (when (rassq major-mode auto-mode-alist)
+		  (seq-filter
+		   (lambda (elt)
+		     (eq (cdr elt) major-mode))
+		   auto-mode-alist)))
+	 (lcsym (intern-soft (downcase (symbol-name major-mode))))
+	 (lclist (and lcsym
+		      (not (eq lcsym major-mode))
+		      (rassq lcsym auto-mode-alist)
+		      (seq-filter
+		       (lambda (elt)
+			 (eq (cdr elt) lcsym))
+		       auto-mode-alist)))
+	 (shortsym (when (stringp mode-name)
+		     (intern-soft (concat (string-trim-right mode-name "/.*")
+					  "-mode"))))
+	 (lcshortsym (when (stringp mode-name)
+		       (intern-soft (downcase
+				     (concat
+				      (string-trim-right mode-name "/.*")
+				      "-mode")))))
+	 (shlist (and shortsym
+		      (not (eq shortsym major-mode))
+		      (not (eq shortsym lcsym))
+		      (rassq shortsym auto-mode-alist)
+		      (seq-filter
+		       (lambda (elt)
+			 (eq (cdr elt) shortsym))
+		       auto-mode-alist)))
+	 (lcshlist (and lcshortsym
+			(not (eq lcshortsym major-mode))
+			(not (eq lcshortsym lcsym))
+			(rassq lcshortsym auto-mode-alist)
+			(seq-filter
+			 (lambda (elt)
+			   (eq (cdr elt) lcshortsym))
+			 auto-mode-alist)))
+	 (exts (when (or mlist lclist shlist lcshlist)
+		 (seq-union (seq-map #'car lclist)
+			    (seq-union (seq-map #'car mlist)
+				       (seq-union (seq-map #'car lcshlist)
+						  (seq-map #'car shlist))))))
+	 (ed-exts (when exts
+		    (seq-map
+		     (lambda (elt)
+		       (concat "*" (string-trim  elt "\\\\" "\\\\'")))
+		     exts))))
+    ed-exts))
+
+(defvar tex--buffers-list nil)
+(defvar-local tex--last-ref-syntax-flag nil)
+(defvar-local tex--old-syntax-function nil)
+
+(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
+  "Find references of IDENTIFIER in TeX buffers and files."
+  (require 'semantic/symref/grep)
+  (let (bufs texbufs
+             (mode major-mode))
+    (dolist (buf (buffer-list))
+      (if (eq (buffer-local-value 'major-mode buf) mode)
+          (push buf bufs)
+        (when (string-match-p ".*\\.[tT]e[xX]" (buffer-name buf))
+          (push buf texbufs))))
+    (unless (seq-set-equal-p tex--buffers-list bufs)
+      (let* ((amalist (tex--collect-file-extensions))
+	     (extlist (alist-get mode semantic-symref-filepattern-alist))
+	     (extlist-new (seq-uniq
+                           (seq-union amalist extlist #'string-match-p))))
+	(setq tex--buffers-list bufs)
+	(dolist (buf bufs)
+	  (when-let ((fbuf (buffer-file-name buf))
+		     (ext (file-name-extension fbuf))
+		     (finext (concat "*." ext))
+		     ((not (seq-find (lambda (elt) (string-match-p elt finext))
+				     extlist-new)))
+		     ((push finext extlist-new)))))
+	(unless (seq-set-equal-p extlist-new extlist)
+	  (setf (alist-get mode semantic-symref-filepattern-alist)
+                extlist-new))))
+    (let* (setsyntax
+           (punct (with-syntax-table (tex-find-references-syntax-table)
+                    (seq-positions identifier (list ?w ?_)
+			           (lambda (elt sycode)
+			             (not (memq (char-syntax elt) sycode))))))
+           (end (and punct
+                     (memq (1- (length identifier)) punct)
+                     (> (length identifier) 1)
+                     (concat "\\("
+                             (regexp-quote
+                              (string (elt identifier
+                                           (1- (length identifier)))))
+                             "\\)")))
+           (beg (and punct
+                     (memq 0 punct)
+                     (concat "\\("
+                             (regexp-quote (string (elt identifier 0)))
+                             "\\)")))
+           (text-mode-hook
+            (if (or end beg)
+                (progn
+                  (setq setsyntax (lambda ()
+		                    (setq-local syntax-propertize-function
+                                                (eval
+                                                 `(tex-xref-syntax-function
+                                                   ,identifier ,beg ,end)))
+                                    (setq-local TeX-style-hook-applied-p t)))
+                  (cons setsyntax text-mode-hook))
+              text-mode-hook)))
+      (unless (memq 'doctex-mode (derived-mode-all-parents mode))
+        (setq bufs (append texbufs bufs)))
+      (dolist (buf bufs)
+        (with-current-buffer buf
+          (if (or end beg)
+              (progn
+                (unless (local-variable-p 'tex--old-syntax-function)
+                  (setq tex--old-syntax-function syntax-propertize-function))
+                (setq-local syntax-propertize-function
+                            (eval
+                             `(tex-xref-syntax-function
+                               ,identifier ,beg ,end)))
+                (setq syntax-propertize--done 0)
+                (setq tex--last-ref-syntax-flag t))
+            ;; If we've computed a bespoke `syntax-propertize-function'
+            ;; then this returns the buffer to the status quo ante
+            ;; bellum on the next invocation of M-? that searches it.
+            (when tex--last-ref-syntax-flag
+              (setq-local syntax-propertize-function
+                          (eval
+                           `(tex-xref-syntax-function
+                             ,identifier nil nil)))
+              (setq syntax-propertize--done 0)))))
+      (unwind-protect
+          (xref-backend-references nil identifier)
+        (dolist (buf bufs)
+          (with-current-buffer buf
+            (when buffer-file-truename
+              (if (or end beg)
+                  (setq-local syntax-propertize-function
+                              tex--old-syntax-function)
+                (when tex--last-ref-syntax-flag
+                  (setq-local syntax-propertize-function
+                              tex--old-syntax-function)
+                  (setq tex--last-ref-syntax-flag nil))))))))))
+
 (make-obsolete-variable 'tex-mode-load-hook
                         "use `with-eval-after-load' instead." "28.1")
 (run-hooks 'tex-mode-load-hook)
-- 
2.35.8


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-05-02  0:43                                     ` Dmitry Gutov
  2024-05-02 13:32                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-02  6:47                                     ` Arash Esbati
  2024-05-03 14:10                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 66+ messages in thread
From: Dmitry Gutov @ 2024-05-02  0:43 UTC (permalink / raw)
  To: David Fussner, Arash Esbati
  Cc: 53749, Ikumi Keita, stefankangas, Tassilo Horn, Eli Zaretskii,
	Stefan Monnier

On 29/04/2024 17:15, David Fussner via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
> though I had to add a one-liner in xref.el to fix
> what I believe is a minor bug there preventing syntax-propertize from
> doing its work when the temp buffer holds text from a new file. (I can
> provide a recipe for this if you want.)

Yes, could you please expand on it separately?

The rest of the patch description just makes sense to me, and I'd be 
happy to leave (or not) the detailed review to whoever reviews TeX 
contributions around here, but this is something I'll need to pay 
special attention to.





^ permalink raw reply	[flat|nested] 66+ messages in thread

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-02  0:43                                     ` Dmitry Gutov
@ 2024-05-02  6:47                                     ` Arash Esbati
  2024-05-02 13:34                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-03 14:10                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 66+ messages in thread
From: Arash Esbati @ 2024-05-02  6:47 UTC (permalink / raw)
  To: David Fussner
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

David Fussner <dfussner@googlemail.com> writes:

> Here's my third attempt at a working xref backend for TeX. I'll try
> quickly to summarize what's in it:
>
> 1. I've modified etags so that it creates findable tags for as many
> different sorts of TeX construct as possible, including those written
> in the new expl3 syntax. I've now removed the escape character from
> the tag names, as this simplifies code all around.

Hi David,

Thanks.  I trust your code works, so I have 2 minor comments.

> 5. Slightly unrelatedly, I've added new syntax-propertize-rules to
> latex-mode so that expl3 constructs with the underscore aren't
> fontified as subscripts, which makes such code unreadable. I'm happy
> to split this off as another patch.

I think this makes sense.  AFAIK, Stefan M. looks after tex-mode.el, so
he can then review it.

> @@ -5736,11 +5752,25 @@ Scheme_functions (FILE *inf)
>  static linebuffer *TEX_toktab = NULL; /* Table with tag tokens */
>  
>  /* Default set of control sequences to put into TEX_toktab.
> -   The value of environment var TEXTAGS is prepended to this.  */
> +   The value of environment var TEXTAGS is prepended to this.
> +   (2024) Add variants of '\def', some additional LaTeX (and
> +   former xparse) commands, common variants from the
> +   'etoolbox' package, and the main expl3 commands. */
>  static const char *TEX_defenv = "\
> -:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
> -:part:appendix:entry:index:def\
> -:newcommand:renewcommand:newenvironment:renewenvironment";
> +:label:ref:chapter:section:subsection:subsubsection:eqno:cite:bibitem\

I suggest adding 'Ref' and 'footref' as well, which are part of the LaTeX
kernel.

> +(defvar tex-esc-and-group-chars '(?\\ ?{ ?})

(defvar tex-esc-and-group-chars '(?\\ ?\{ ?\})

> +  "The current TeX escape and grouping characters.

Best, Arash





^ permalink raw reply	[flat|nested] 66+ messages in thread

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-05-02  0:43                                     ` Dmitry Gutov
@ 2024-05-02 13:32                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-03 13:42                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-02 13:32 UTC (permalink / raw)
  To: Dmitry Gutov
  Cc: 53749, Ikumi Keita, Arash Esbati, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

[-- Attachment #1: Type: text/plain, Size: 2558 bytes --]

Hi Dmitry,

Thanks for looking over the patch. Here's the recipe for the purported
bug in xref.el:

1. Please apply my patch to tex-mode.el (and xref.el).

2. I've attached xref-bug.zip, which contains a directory with 4
identical LaTeX files and one LaTeX file with a single additional
character. Please extract it.

3. emacs -Q

4. C-x C-f xref-bug/mwea.ltx, and please don't visit the other 4
files.

5. Put point on \__hook_debug:n in line 6.

6. M-?, RET, ... RET, RET.

The xref buffer should offer 5 hits, one from each file in the
directory.

7. Comment out the line I added to xref--collect-matches,
byte-compile and load the file.

8. With point in the same place, M-?, RET, ... RET, RET.

The xref buffer should offer 3 hits. The first is from the
file-visiting buffer (where I also set syntax-propertize--done to 0,
because in my testing there could be some issues here, too). The
second hit is from the first file opened in *xref-temp*. Here,
syntax-propertize runs to line-end, and all is well. The next two
files are missed, because syntax-propertize--done is set to line-end
and they have exactly the same line length as file two, and therefore
syntax-propertize thinks that's good enough and doesn't actually
change anything. The fifth file has an additional character in line 6,
so syntax-propertize decides it needs to work on this line because
line-end > syntax-propertize--done.
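
A minimal sketch of this mechanism outside xref, assuming (as syntax.el
appears to do) that `syntax-propertize' returns early whenever
`syntax-propertize--done' is not below the requested position.  The
`my-demo-' names are hypothetical and only count calls; the temp buffer
stands in for the buffer xref reuses between files.

```elisp
;; Hypothetical demo: count how often the propertize function really runs.
(defvar my-demo-calls 0
  "Number of times `my-demo-propertize' has been called.")

(defun my-demo-propertize (_start _end)
  "Stand-in `syntax-propertize-function' that only counts its calls."
  (setq my-demo-calls (1+ my-demo-calls)))

(defun my-demo-stale-done ()
  "Return the call counts after the first and the second \"file\"."
  (setq my-demo-calls 0)
  (with-temp-buffer
    (setq-local syntax-propertize-function #'my-demo-propertize)
    (insert "\\__hook_debug:n {a}")
    (syntax-propertize (point-max))     ; first file: the function runs
    (let ((after-first my-demo-calls)
          ;; With hooks inhibited, nothing lowers `syntax-propertize--done'.
          (inhibit-modification-hooks t))
      (erase-buffer)
      (insert "\\__hook_debug:n {b}")   ; second file, same length
      (syntax-propertize (point-max))   ; position is not past "done"
      (list after-first my-demo-calls))))
```

If the two counts come back equal, the second text was never
propertized, which matches the missed hits in step 8.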

You can put point on, say, \documentclass, and you'll get all 5 hits,
because this string doesn't begin or end with a non-word, non-symbol
character, and syntax-propertize doesn't need to run. You can make the
search string "\documentclass" and you'll get 2 hits, as line 1 has
the same length in all 5 files. (It's worth trying "\usepackage" as
the search string, too.)

That's my diagnosis anyway. Does it make sense?

Thanks,

David.

On Thu, 2 May 2024 at 01:43, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> On 29/04/2024 17:15, David Fussner via Bug reports for GNU Emacs, the
> Swiss army knife of text editors wrote:
> > though I had to add a one-liner in xref.el to fix
> > what I believe is a minor bug there preventing syntax-propertize from
> > doing its work when the temp buffer holds text from a new file. (I can
> > provide a recipe for this if you want.)
>
> Yes, could you please expand on it separately?
>
> The rest of the patch description just makes sense to me, and I'd be
> happy to leave (or not) the detailed review to whoever reviews TeX
> contributions around here, but this is something I'll need to pay
> special attention to.

[-- Attachment #2: xref-bug.zip --]
[-- Type: application/zip, Size: 1588 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-05-02  6:47                                     ` Arash Esbati
@ 2024-05-02 13:34                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-02 13:34 UTC (permalink / raw)
  To: Arash Esbati
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Stefan Monnier, Tassilo Horn,
	Eli Zaretskii, stefankangas

Thanks for the review, Arash, and I'll make those changes.

Best, David.

On Thu, 2 May 2024 at 07:47, Arash Esbati <arash@gnu.org> wrote:
>
> David Fussner <dfussner@googlemail.com> writes:
>
> > Here's my third attempt at a working xref backend for TeX. I'll try
> > quickly to summarize what's in it:
> >
> > 1. I've modified etags so that it creates findable tags for as many
> > different sorts of TeX construct as possible, including those written
> > in the new expl3 syntax. I've now removed the escape character from
> > the tag names, as this simplifies code all around.
>
> Hi David,
>
> Thanks.  I trust your code works, so I have 2 minor comments.
>
> > 5. Slightly unrelatedly, I've added new syntax-propertize-rules to
> > latex-mode so that expl3 constructs with the underscore aren't
> > fontified as subscripts, which makes such code unreadable. I'm happy
> > to split this off as another patch.
>
> I think this makes sense.  AFAIK, Stefan M. looks after tex-mode.el, so
> he can then review it.
>
> > @@ -5736,11 +5752,25 @@ Scheme_functions (FILE *inf)
> >  static linebuffer *TEX_toktab = NULL; /* Table with tag tokens */
> >
> >  /* Default set of control sequences to put into TEX_toktab.
> > -   The value of environment var TEXTAGS is prepended to this.  */
> > +   The value of environment var TEXTAGS is prepended to this.
> > +   (2024) Add variants of '\def', some additional LaTeX (and
> > +   former xparse) commands, common variants from the
> > +   'etoolbox' package, and the main expl3 commands. */
> >  static const char *TEX_defenv = "\
> > -:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
> > -:part:appendix:entry:index:def\
> > -:newcommand:renewcommand:newenvironment:renewenvironment";
> > +:label:ref:chapter:section:subsection:subsubsection:eqno:cite:bibitem\
>
> I suggest adding 'Ref' and 'footref' as well, which are part of the LaTeX
> kernel.
>
> > +(defvar tex-esc-and-group-chars '(?\\ ?{ ?})
>
> (defvar tex-esc-and-group-chars '(?\\ ?\{ ?\})
>
> > +  "The current TeX escape and grouping characters.
>
> Best, Arash





^ permalink raw reply	[flat|nested] 66+ messages in thread

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-05-02 13:32                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-05-03 13:42                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 66+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-03 13:42 UTC (permalink / raw)
  To: David Fussner
  Cc: 53749, Ikumi Keita, Dmitry Gutov, Arash Esbati, stefankangas,
	Tassilo Horn, Eli Zaretskii

> Thanks for looking over the patch. Here's the recipe for the purported
> bug in xref.el:

The problem stems from xref.el's constant abuse of
`inhibit-modification-hooks`.  Binding this var to t should be done only
in exceptional circumstances and should ideally be accompanied by a
comment explaining why it's necessary.


        Stefan






^ permalink raw reply	[flat|nested] 66+ messages in thread

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-02  0:43                                     ` Dmitry Gutov
  2024-05-02  6:47                                     ` Arash Esbati
@ 2024-05-03 14:10                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-04  8:26                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-04 14:32                                       ` Arash Esbati
  2 siblings, 2 replies; 66+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-03 14:10 UTC (permalink / raw)
  To: David Fussner
  Cc: 53749, Ikumi Keita, Tassilo Horn, Arash Esbati, stefankangas,
	Dmitry Gutov, Eli Zaretskii

Hi,

Apparently I'm the `tex-mode.el` guy, so I tried to take a look.

> diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
> index 97c950267c6..d990a2dbfa9 100644
> --- a/lisp/textmodes/tex-mode.el
> +++ b/lisp/textmodes/tex-mode.el
> @@ -695,7 +696,25 @@ tex-verbatim-environments
>       ("\\\\\\(?:end\\|begin\\) *\\({[^\n{}]*}\\)"
>        (1 (ignore
>            (tex-env-mark (match-beginning 0)
> -                        (match-beginning 1) (match-end 1))))))))
> +                        (match-beginning 1) (match-end 1)))))
> +     ;; The next two rules change the syntax of `:' and `_' in expl3
> +     ;; constructs, so that `tex-font-lock-suscript' can fontify them
> +     ;; more accurately.
> +     ((concat "\\(\\(?:[\\\\[:space:]{]_\\|"
> +              "[\\\\{[:space:]][^][_[:space:][:cntrl:][:digit:]\\\\{}()/=]+\\)"
> +              "\\(?:_+\\(?:[^][[:space:][:cntrl:][:digit:]:\\\\{}()/#_=]+\\|"
> +              "#+[1-9]\\)\\)+\\)\\([:_]?\\)")

Can you add in the comment some URL pointing to some relevant expl3
documentation which "explains" why the above regexp makes sense?
Also I don't clearly see how the above regexp distinguishes expl3 code
from "normal" LaTeX code, so the comment should say something about it.

Side note: I'd avoid [:space:] whose exact meaning is rarely quite what
we need.
Side note: backslash doesn't need to be backslashed in [...].
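
To illustrate that second side note with a small check (the function
name is hypothetical): inside a bracket expression a backslash is
literal, so the Lisp string needs a single doubled backslash rather
than two.

```elisp
;; In Lisp source "\\" is one backslash, so "[\\{]" is the regexp [\{],
;; a class containing backslash and `{'.  "[\\\\{]" is the regexp [\\{],
;; the same class with the backslash listed twice.
(defun my-demo-bracket-classes-agree-p ()
  "Return t if both spellings match a backslash and an open brace."
  (and (string-match-p "[\\{]" "\\")
       (string-match-p "[\\\\{]" "\\")
       (string-match-p "[\\{]" "{")
       (string-match-p "[\\\\{]" "{")
       t))
```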

> +      (1 (ignore
> +          (let* ((expr (buffer-substring-no-properties (match-beginning 1)
> +                                                       (match-end 1)))
> +                 (list (seq-positions expr ?_)))
> +            (dolist (pos list)
> +              (put-text-property (+ pos (match-beginning 1))
> +                                 (1+ (+ pos (match-beginning 1)))
> +                                 'syntax-table (string-to-syntax "_"))))))
> +      (2 "_"))
> +     ("\\\\[[:alpha:]]+\\(:\\)[[:alpha:][:space:]\n]"
> +      (1 "_")))))

Currently we "skip" inappropriate underscores via
`tex-font-lock-match-suscript` and/or by adding a particular `face` text
property rather than via `syntax-table/propertize`.

For algorithmic reasons, it's better to minimize the work done in
`syntax-propertize-function` as much as possible (font-lock is more lazy
than `syntax-propertize`), so I recommend you try moving the above
to font-lock rules.

> +(defvar tex-esc-and-group-chars '(?\\ ?{ ?})
> +  "The current TeX escape and grouping characters.

I recommend you backslash escape the { and } above (although it's not
indispensable, `emacs-lisp-mode` will parse the code better).
More importantly, the docstring doesn't explain what this list
means/does.  E.g. does the order matter?  Can it be longer than 3 elements?

From the current docstring I can't guess what would be the consequence
of adding/removing elements to/from this list.
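
For reference, a small sketch of how the patch's
`tex-thingatpt--beginning-of-symbol' appears to consume the list (the
helper name is hypothetical): each element is turned into a string,
quoted, and spliced into a bracket expression.  If that reading is
right, order should not matter for ordinary characters, and extra
elements simply become extra delimiters.

```elisp
(defun my-demo-delimiter-class (chars)
  "Return the delimiter bracket expression the patch builds from CHARS.
CHARS is a list of characters such as `tex-esc-and-group-chars'."
  (concat "[]["
          (mapconcat #'regexp-quote (mapcar #'char-to-string chars) "")
          "\"*`'#=&()%,|$[:cntrl:][:blank:]]"))

;; The escaped and unescaped spellings denote the same characters.
(eq ?{ ?\{)   ; => t
```

A docstring along the lines of "characters that end a TeX symbol, in
addition to the fixed set of delimiters" would already answer the
questions above.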

> +;; Populate `semantic-symref-filepattern-alist' for the in-tree modes;
> +;; AUCTeX is doing the same for its modes.
> +(defvar semantic-symref-filepattern-alist)
> +(with-eval-after-load 'semantic/symref/grep
> +  (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
> +                     "*.bbl" "*.drv" "*.hva")
> +        semantic-symref-filepattern-alist)
> +  (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
> +        semantic-symref-filepattern-alist)
> +  (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))

We know `semantic-symref-filepattern-alist` will exist when
`semantic/symref/grep` is loaded, but not before, so I'd put the
`defvar` inside the `with-eval-after-load`.
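
Something along these lines, reusing the patterns from the patch (a
sketch of the arrangement only, not a tested replacement):

```elisp
(with-eval-after-load 'semantic/symref/grep
  ;; Declared here only to silence the byte-compiler; by the time this
  ;; body runs the library has already defined the variable.
  (defvar semantic-symref-filepattern-alist)
  (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
                     "*.bbl" "*.drv" "*.hva")
        semantic-symref-filepattern-alist)
  (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
        semantic-symref-filepattern-alist)
  (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))
```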

> +;; Setup AUCTeX modes (for testing purposes only).
> +
> +(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
> +
> +(defun tex-set-auctex-xref-backend ()
> +  (add-hook 'xref-backend-functions #'tex--xref-backend nil t))

I assume this will be sent to AUCTeX and is not meant to be in
`tex-mode.el`, right?

> +;; `xref-find-references' currently may need this when called from a
> +;; latex-mode buffer in order to search files or buffers with a .tex
> +;; suffix (including the buffer from which it has been called).  We
> +;; append it to `auto-mode-alist' so as not to interfere with the usual
> +;; mode-setting apparatus.  Changes here and in AUCTeX should soon
> +;; render it unnecessary.
> +(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)

Maybe I have not followed the whole discussion closely enough, but at
least to me the above "soon" is very unclear.
I'll assume that this code will be removed before we install the patch.
If not, please explain in the comment why this specific hack is needed
and how it works.

> +(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
> +  "Find references of IDENTIFIER in TeX buffers and files."
> +  (require 'semantic/symref/grep)
> +  (let (bufs texbufs
> +             (mode major-mode))
> +    (dolist (buf (buffer-list))
> +      (if (eq (buffer-local-value 'major-mode buf) mode)
> +          (push buf bufs)
> +        (when (string-match-p ".*\\.[tT]e[xX]" (buffer-name buf))
> +          (push buf texbufs))))
> +    (unless (seq-set-equal-p tex--buffers-list bufs)
> +      (let* ((amalist (tex--collect-file-extensions))
> +	     (extlist (alist-get mode semantic-symref-filepattern-alist))
> +	     (extlist-new (seq-uniq
> +                           (seq-union amalist extlist #'string-match-p))))

After sinking the `defvar` above, you'll need to add a new `defvar` for
`semantic-symref-filepattern-alist` just after the `require`.

> +                (setq-local syntax-propertize-function
> +                            (eval
> +                             `(tex-xref-syntax-function
> +                               ,identifier ,beg ,end)))

Why do we need to change `syntax-propertize-function` and why do we need
`eval`?

> +                (setq syntax-propertize--done 0)

This is not sufficient.  You want to `syntax-ppss-flush-cache`.
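
A minimal sketch of that substitution, with a hypothetical helper name.
As far as I can tell, `syntax-ppss-flush-cache' lowers
`syntax-propertize--done' and discards cached `syntax-ppss' state in
one call, so the buffer is re-propertized and re-parsed from the given
position.

```elisp
(defun my-tex-reset-syntax-state (buffer)
  "Force BUFFER to be re-propertized and re-parsed from its start.
Hypothetical helper, not part of the patch."
  (with-current-buffer buffer
    (save-restriction
      (widen)
      ;; Lowers `syntax-propertize--done' to at most `point-min' and
      ;; drops the `syntax-ppss' cache from there on.
      (syntax-ppss-flush-cache (point-min)))))
```

It would be called at the places where the patch currently sets
`syntax-propertize--done' to 0.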


        Stefan






^ permalink raw reply	[flat|nested] 66+ messages in thread

* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-05-03 14:10                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-05-04  8:26                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-04 14:32                                       ` Arash Esbati
  1 sibling, 0 replies; 66+ messages in thread
From: David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-04  8:26 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: 53749, Ikumi Keita, Tassilo Horn, Arash Esbati, stefankangas,
	Dmitry Gutov, Eli Zaretskii

Thank you very much, Stefan, for taking the time to review the patch.
In short, it plainly needs some work, but I'm rather short of time
this weekend so will respond properly and I hope more coherently
Monday or Tuesday.

Best, David.

On Fri, 3 May 2024 at 15:11, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
> Hi,
>
> Apparently I'm the `tex-mode.el` guy, so I tried to take a look.
>
> > diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
> > index 97c950267c6..d990a2dbfa9 100644
> > --- a/lisp/textmodes/tex-mode.el
> > +++ b/lisp/textmodes/tex-mode.el
> > @@ -695,7 +696,25 @@ tex-verbatim-environments
> >       ("\\\\\\(?:end\\|begin\\) *\\({[^\n{}]*}\\)"
> >        (1 (ignore
> >            (tex-env-mark (match-beginning 0)
> > -                        (match-beginning 1) (match-end 1))))))))
> > +                        (match-beginning 1) (match-end 1)))))
> > +     ;; The next two rules change the syntax of `:' and `_' in expl3
> > +     ;; constructs, so that `tex-font-lock-suscript' can fontify them
> > +     ;; more accurately.
> > +     ((concat "\\(\\(?:[\\\\[:space:]{]_\\|"
> > +              "[\\\\{[:space:]][^][_[:space:][:cntrl:][:digit:]\\\\{}()/=]+\\)"
> > +              "\\(?:_+\\(?:[^][[:space:][:cntrl:][:digit:]:\\\\{}()/#_=]+\\|"
> > +              "#+[1-9]\\)\\)+\\)\\([:_]?\\)")
>
> Can you add in the comment some URL pointing to some relevant expl3
> documentation which "explains" why the above regexp makes sense?
> Also I don't clearly see how the above regexp distinguishes expl3 code
> from "normal" LaTeX code, so the comment should say something about it.
>
> Side note: I'd avoid [:space:] whose exact meaning is rarely quite what
> we need.
> Side note: backslash doesn't need to be backslashed in [...].
>
> > +      (1 (ignore
> > +          (let* ((expr (buffer-substring-no-properties (match-beginning 1)
> > +                                                       (match-end 1)))
> > +                 (list (seq-positions expr ?_)))
> > +            (dolist (pos list)
> > +              (put-text-property (+ pos (match-beginning 1))
> > +                                 (1+ (+ pos (match-beginning 1)))
> > +                                 'syntax-table (string-to-syntax "_"))))))
> > +      (2 "_"))
> > +     ("\\\\[[:alpha:]]+\\(:\\)[[:alpha:][:space:]\n]"
> > +      (1 "_")))))
>
> Currently we "skip" inappropriate underscores via
> `tex-font-lock-match-suscript` and/or by adding a particular `face` text
> property rather than via `syntax-table/propertize`.
>
> For algorithmic reasons, it's better to minimize the work done in
> `syntax-propertize-function` as much as possible (font-lock is more lazy
> than `syntax-propertize`), so I recommend you try moving the above
> to font-lock rules.
>
> > +(defvar tex-esc-and-group-chars '(?\\ ?{ ?})
> > +  "The current TeX escape and grouping characters.
>
> I recommend you backslash escape the { and } above (although it's not
> indispensable, `emacs-lisp-mode` will parse the code better).
> More importantly, the docstring doesn't explain what this list
> means/does.  E.g. does the order matter?  Can it be longer than 3 elements?
>
> From the current docstring I can't guess what would be the consequence
> of adding/removing elements to/from this list.
>
> > +;; Populate `semantic-symref-filepattern-alist' for the in-tree modes;
> > +;; AUCTeX is doing the same for its modes.
> > +(defvar semantic-symref-filepattern-alist)
> > +(with-eval-after-load 'semantic/symref/grep
> > +  (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
> > +                     "*.bbl" "*.drv" "*.hva")
> > +        semantic-symref-filepattern-alist)
> > +  (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
> > +        semantic-symref-filepattern-alist)
> > +  (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))
>
> We know `semantic-symref-filepattern-alist` will exist when
> `semantic/symref/grep` is loaded, but not before, so I'd put the
> `defvar` inside the `with-eval-after-load`.
>
> > +;; Setup AUCTeX modes (for testing purposes only).
> > +
> > +(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
> > +
> > +(defun tex-set-auctex-xref-backend ()
> > +  (add-hook 'xref-backend-functions #'tex--xref-backend nil t))
>
> I assume this will be sent to AUCTeX and is not meant to be in
> `tex-mode.el`, right?
>
> > +;; `xref-find-references' currently may need this when called from a
> > +;; latex-mode buffer in order to search files or buffers with a .tex
> > +;; suffix (including the buffer from which it has been called).  We
> > +;; append it to `auto-mode-alist' so as not to interfere with the usual
> > +;; mode-setting apparatus.  Changes here and in AUCTeX should soon
> > +;; render it unnecessary.
> > +(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)
>
> Maybe I have not followed the whole discussion closely enough, but at
> least to me the above "soon" is very unclear.
> I'll assume that this code will be removed before we install the patch.
> If not, please explain in the comment why this specific hack is needed
> and how it works.
>
> > +(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
> > +  "Find references of IDENTIFIER in TeX buffers and files."
> > +  (require 'semantic/symref/grep)
> > +  (let (bufs texbufs
> > +             (mode major-mode))
> > +    (dolist (buf (buffer-list))
> > +      (if (eq (buffer-local-value 'major-mode buf) mode)
> > +          (push buf bufs)
> > +        (when (string-match-p ".*\\.[tT]e[xX]" (buffer-name buf))
> > +          (push buf texbufs))))
> > +    (unless (seq-set-equal-p tex--buffers-list bufs)
> > +      (let* ((amalist (tex--collect-file-extensions))
> > +          (extlist (alist-get mode semantic-symref-filepattern-alist))
> > +          (extlist-new (seq-uniq
> > +                           (seq-union amalist extlist #'string-match-p))))
>
> After sinking the `defvar` above, you'll need to add a new `defvar` for
> `semantic-symref-filepattern-alist` just after the `require`.
>
> > +                (setq-local syntax-propertize-function
> > +                            (eval
> > +                             `(tex-xref-syntax-function
> > +                               ,identifier ,beg ,end)))
>
> Why do we need to change `syntax-propertize-function` and why do we need
> `eval`?
>
> > +                (setq syntax-propertize--done 0)
>
> This is not sufficient.  You want to `syntax-ppss-flush-cache`.
>
>
>         Stefan
>






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-05-03 14:10                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-04  8:26                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-05-04 14:32                                       ` Arash Esbati
  2024-05-04 14:54                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 66+ messages in thread
From: Arash Esbati @ 2024-05-04 14:32 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: 53749, Ikumi Keita, Dmitry Gutov, David Fussner, stefankangas,
	Tassilo Horn, Eli Zaretskii

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
>> index 97c950267c6..d990a2dbfa9 100644
>> --- a/lisp/textmodes/tex-mode.el
>> +++ b/lisp/textmodes/tex-mode.el
>> @@ -695,7 +696,25 @@ tex-verbatim-environments
>>       ("\\\\\\(?:end\\|begin\\) *\\({[^\n{}]*}\\)"
>>        (1 (ignore
>>            (tex-env-mark (match-beginning 0)
>> -                        (match-beginning 1) (match-end 1))))))))
>> +                        (match-beginning 1) (match-end 1)))))
>> +     ;; The next two rules change the syntax of `:' and `_' in expl3
>> +     ;; constructs, so that `tex-font-lock-suscript' can fontify them
>> +     ;; more accurately.
>> +     ((concat "\\(\\(?:[\\\\[:space:]{]_\\|"
>> +              "[\\\\{[:space:]][^][_[:space:][:cntrl:][:digit:]\\\\{}()/=]+\\)"
>> +              "\\(?:_+\\(?:[^][[:space:][:cntrl:][:digit:]:\\\\{}()/#_=]+\\|"
>> +              "#+[1-9]\\)\\)+\\)\\([:_]?\\)")
>
> Can you add in the comment some URL pointing to some relevant expl3
> documentation which "explains" why the above regexp makes sense?
> Also I don't clearly see how the above regexp distinguishes expl3 code
> from "normal" LaTeX code, so the comment should say something about
> it.

FWIW, I'm not sure if there is a URL for that, but in interface3.pdf,
chap.1, you'll find:

    1.1 Naming functions and variables

    LaTeX3 does not use @ as a "letter" for defining internal macros.
    Instead, the symbols _ and : are used in internal macro names to provide
    structure. The name of each function is divided into logical units using
    _, while : separates the name of the function from the argument
    specifier ("arg-spec"). This describes the arguments expected by the
    function. In most cases, each argument is represented by a single
    letter. The complete list of arg-spec letters for a function is referred
    to as the signature of the function.

So expect things like this:

    \tl_set:Nn \l_mya_tl { A }
    \tl_set:Nn \l_myb_tl { B }
    \tl_set:Nf \l_mya_tl { \l_mya_tl \l_myb_tl }
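
The naming scheme quoted above can be sketched in Emacs Lisp.  This is
only an illustration, not code from the patch: `my-expl3-split-name' is
a hypothetical helper that splits a control sequence name into its
`_'-separated units and its arg-spec.

```elisp
;; Hypothetical helper, not part of tex-mode.el or the patch.
(defun my-expl3-split-name (name)
  "Split expl3 control sequence NAME into (UNITS . ARG-SPEC).
UNITS is the list of `_'-separated parts before the colon.
ARG-SPEC is the string after the colon, or nil if NAME has no colon."
  (let* ((bare (if (string-prefix-p "\\" name) (substring name 1) name))
         (colon (string-match ":" bare))
         (base (if colon (substring bare 0 colon) bare)))
    ;; Empty units (as in a leading `__') are dropped.
    (cons (split-string base "_" t)
          (and colon (substring bare (1+ colon))))))
```

With this, "\\tl_set:Nn" splits into the units "tl" and "set" plus the
arg-spec "Nn", while a variable such as "\\l_mya_tl" has no arg-spec.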

>> +;; Setup AUCTeX modes (for testing purposes only).
>> +
>> +(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
>> +
>> +(defun tex-set-auctex-xref-backend ()
>> +  (add-hook 'xref-backend-functions #'tex--xref-backend nil t))
>
> I assume this will be sent to AUCTeX and is not meant to be in
> `tex-mode.el`, right?

That would have been a question from my side, but I saw that "testing
purposes only" and skipped it for this round.

Best, Arash






* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-05-04 14:32                                       ` Arash Esbati
@ 2024-05-04 14:54                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-04 21:15                                           ` Arash Esbati
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-04 14:54 UTC (permalink / raw)
  To: Arash Esbati
  Cc: 53749, Ikumi Keita, Dmitry Gutov, David Fussner, stefankangas,
	Tassilo Horn, Eli Zaretskii

> FWIW, I'm not sure if there is a URL for that, but in interface3.pdf,
> chap.1, you'll find:
[...]
> So expect things like this:
>
>     \tl_set:Nn \l_mya_tl { A }
>     \tl_set:Nn \l_myb_tl { B }
>     \tl_set:Nf \l_mya_tl { \l_mya_tl \l_myb_tl }

But that is *also* valid LaTeX, with a different meaning (i.e. where
`_` has its subscript meaning).  So we need some other info in order to
know which of the two we're dealing with.

Maybe that info is simply "assume LaTeX3 if the _ is followed by several
letters" or some such heuristic, but the comment should say so.
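
A minimal sketch of such a heuristic, assuming "several letters" means
at least two; the function name and the threshold are hypothetical and
are not taken from the patch:

```elisp
;; Hypothetical predicate illustrating the heuristic; not from the patch.
(defun my-tex-expl3-underscore-p (string pos)
  "Return non-nil if the `_' at POS in STRING looks like expl3.
Assumed heuristic: the `_' is followed by at least two letters."
  (and (eq (aref string pos) ?_)
       ;; `string-match' searches from POS onward; require the match
       ;; to start exactly at POS.
       (eq (string-match "_[[:alpha:]]\\{2,\\}" string pos) pos)))
```

Note that this would misclassify math such as $x_ab$, where the
subscript is followed by another letter, so the comment in the code
would have to state that limitation.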


        Stefan







* bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
  2024-05-04 14:54                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-05-04 21:15                                           ` Arash Esbati
  0 siblings, 0 replies; 66+ messages in thread
From: Arash Esbati @ 2024-05-04 21:15 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: 53749, Ikumi Keita, Dmitry Gutov, David Fussner, stefankangas,
	Tassilo Horn, Eli Zaretskii

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> But that is *also* valid LaTeX, with a different meaning (i.e. where
> `_` has its subscript meaning).  So we need some other info in order to
> know which of the two we're dealing with.

That's true.  AFAIK, one has to deal with:

  • \_ in ordinary text like foo\_bar
  • _ in math mode like $a_b$
  • expl3 macros like \tl_set:Nn
  • expl3 macros like \__kernel_kern:n

> Maybe that info is simply "assume LaTeX3 if the _ is followed by several
> letters" or some such heuristic, but the comment should say so.

Last time I looked at this, my conclusion was: Deal with \_ and _ in
usual .tex files and expect expl3 macros in .dtx file only.
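
As a rough sketch of that rule (the function name is hypothetical, and
deciding by file extension alone is an assumption):

```elisp
;; Hypothetical helper: decide by file name whether to expect expl3 code.
(defun my-tex-expect-expl3-p (file-name)
  "Return t if FILE-NAME ends in .dtx, ignoring case; nil otherwise."
  (let ((case-fold-search t))
    (and (stringp file-name)
         (string-match-p "\\.dtx\\'" file-name)
         t)))
```

A syntax or font-lock rule could then be gated on something like
(my-tex-expect-expl3-p buffer-file-name).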

Best, Arash






end of thread, other threads:[~2024-05-04 21:15 UTC | newest]

Thread overview: 66+ messages
2022-02-03 15:09 bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21  2:11 ` Dmitry Gutov
2022-02-21  9:48   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21 17:28     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21 23:56       ` Dmitry Gutov
2022-02-22 15:19         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-23  2:21           ` Dmitry Gutov
2022-02-23 10:45             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-24  2:23               ` Dmitry Gutov
2022-02-24 13:15                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21 23:55     ` Dmitry Gutov
2022-09-08 13:25   ` Lars Ingebrigtsen
2022-09-08 13:34     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-08 13:39       ` Lars Ingebrigtsen
2022-09-08 15:50         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-03  9:08           ` Stefan Kangas
2023-09-03 10:03             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-03 10:46               ` Stefan Kangas
2023-09-13 11:10                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-13 13:42                   ` Stefan Kangas
2023-09-13 15:23                   ` Dmitry Gutov
2023-09-13 17:01                     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-13 23:59                       ` Dmitry Gutov
2023-09-14  6:10                         ` Eli Zaretskii
2023-09-15 18:45                           ` Tassilo Horn
2023-09-16  5:53                             ` Ikumi Keita
2023-09-17  8:49                               ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 13:06                                 ` Arash Esbati
2024-04-22 14:56                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 16:15                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 16:37                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 17:16                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 17:25                                           ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-24  0:09                                           ` Dmitry Gutov
2024-04-24  9:02                                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-23 12:04                                     ` Arash Esbati
2024-04-23 13:21                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-02  0:43                                     ` Dmitry Gutov
2024-05-02 13:32                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-03 13:42                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-02  6:47                                     ` Arash Esbati
2024-05-02 13:34                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-03 14:10                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-04  8:26                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-04 14:32                                       ` Arash Esbati
2024-05-04 14:54                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-04 21:15                                           ` Arash Esbati
2023-09-14 16:11                         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-14 23:55                           ` Dmitry Gutov
2023-09-15  6:47                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-13 19:16                     ` Eli Zaretskii
2023-09-13 20:25                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-14  5:14                         ` Eli Zaretskii
2022-02-21 12:35 ` Arash Esbati
2022-02-21 14:03   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-25 20:16 ` Augusto Stoffel
2022-02-26  9:29   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-26 10:56     ` Augusto Stoffel
2022-02-27 18:42       ` Arash Esbati
2022-02-28  9:09         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-28 11:54           ` Arash Esbati
2022-02-28 13:11             ` Augusto Stoffel
2022-02-28 19:04               ` Arash Esbati
2022-03-01  8:46                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-28 13:05           ` Augusto Stoffel
