all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: David Fussner via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Arash Esbati <arash@gnu.org>
Cc: 53749@debbugs.gnu.org, Ikumi Keita <ikumi@ikumi.que.jp>,
	Dmitry Gutov <dgutov@yandex.ru>,
	Stefan Monnier <monnier@iro.umontreal.ca>,
	Tassilo Horn <tsdh@gnu.org>, Eli Zaretskii <eliz@gnu.org>,
	stefankangas@gmail.com
Subject: bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
Date: Thu, 16 May 2024 13:56:56 +0100	[thread overview]
Message-ID: <CADF+RthKH96e0Vrkfw3kGikTX-edOx2Phdj-vihMfzg5qt=tUg@mail.gmail.com> (raw)
In-Reply-To: <m2v83eur8o.fsf@macmutant.fritz.box>

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

Thanks, Arash. Agreed, on all counts. Revised patch attached.

Best, David.

On Thu, 16 May 2024 at 08:54, Arash Esbati <arash@gnu.org> wrote:
>
> David Fussner <dfussner@googlemail.com> writes:
>
> > +(defun tex-expl-buffer-parse ()
> > +  "Identify buffers where expl3 syntax is always active."
> > +  (save-excursion
> > +    (goto-char (point-min))
> > +    (when (tex-search-noncomment
> > +        (re-search-forward
> > +         "\\(?:\\\\\\(?:ExplFile\\|ProvidesExpl\\|__xparse_file\\)\\)"
>
> Is the outer grouping necessary?  Why not just:
>
> "\\\\\\(?:ExplFile\\|ProvidesExpl\\|__xparse_file\\)"
>
> > +         nil t))
> > +      (setq tex-expl-buffer-p t))))
> > +
> > +(defun tex-expl-region-set (_beg _end)
> > +  "Create a list of regions where expl3 syntax is active.
> > +This function updates the list whenever `syntax-propertize' runs, and
> > +stores it in the buffer-local variable `tex-expl-region-list'.  The
> > +list will always be nil when the buffer visits an expl3 file, e.g., an
> > +expl3 class or package, where expl3 syntax is always active."
> > +  (unless syntax-ppss--updated-cache;; Stop forward search running twice.
> > +    (setq tex-expl-region-list nil)
> > +    ;; Leaving this test here allows users to set `tex-expl-buffer-p'
> > +    ;; independently of the mode's automatic detection of an expl3 file.
> > +    (unless tex-expl-buffer-p
> > +      (goto-char (point-min))
> > +      (while (tex-search-noncomment
> > +              (re-search-forward "\\ExplSyntaxOn" nil t))
>
> This looks wrong, I think you want `search-forward'.
>
> > +        (let ((new-beg (point))
> > +              (new-end (or (tex-search-noncomment
> > +                            (re-search-forward "\\ExplSyntaxOff" nil t))
>
> Same here.
>
> > +                           (point-max))))
> > +          (push (cons new-beg new-end) tex-expl-region-list))))))
>
> Best, Arash

[-- Attachment #2: 0003-Provide-a-modified-xref-backend-for-TeX-buffers.patch --]
[-- Type: text/x-patch, Size: 33122 bytes --]

From 2839cbe15f91a1292d26e9208d21ce47270fd72e Mon Sep 17 00:00:00 2001
From: David Fussner <dfussner@googlemail.com>
Date: Thu, 16 May 2024 13:51:12 +0100
Subject: [PATCH] Provide a modified xref backend for TeX buffers

* lib-src/etags.c (TeX_commands): Improve parsing of commands in TeX
buffers.
(TEX_defenv): Expand list of commands to tag by default in TeX
buffers.
(TeX_help):
* doc/emacs/maintaining.texi (Tag Syntax): Document new tagged
commands.
(Identifier Search): Add note about semantic-symref-filepattern-alist,
auto-mode-alist, and xref-find-references.

* lisp/textmodes/tex-mode.el (tex-font-lock-suscript): Test for
underscore in expl3 files and regions, disable subscript face there.
(tex-common-initialization): Set up xref backend for in-tree TeX
modes. Detect expl3 files, and in others set up a list of expl3
regions.
(tex-expl-buffer-parse): New function called in previous.
(tex-expl-buffer-p): New var to hold the result of previous.
(tex-expl-region-set): New function added to
'syntax-propertize-extend-region-functions' hook.
(tex-expl-region-list): New var to hold the result of previous.
(tex--thing-at-point, tex-thingatpt--beginning-of-symbol)
(tex-thingatpt--end-of-symbol, tex--bounds-of-symbol-at-point):
New functions to return 'thing-at-point' for xref backend.
(tex-thingatpt-exclude-chars): New var to do the same.
(xref-backend-identifier-at-point): New TeX backend method to provide
symbols for processing by xref.
(xref-backend-identifier-completion-table)
(xref-backend-identifier-completion-ignore-case)
(xref-backend-definitions, xref-backend-apropos): Placeholders to
call the standard 'etags' xref backend methods.
(xref-backend-references): Wrapper to call the default xref backend
method, finding as many relevant files as possible and using a bespoke
syntax-propertize-function when required.
(tex--collect-file-extensions, tex-xref-syntax-function): Helper
functions for previous.
(tex-find-references-syntax-table, tex--buffers-list)
(tex--xref-syntax-fun, tex--old-syntax-function): New vars for same.
---
 doc/emacs/maintaining.texi |  39 +++-
 lib-src/etags.c            | 189 +++++++++++++++++--
 lisp/textmodes/tex-mode.el | 373 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 580 insertions(+), 21 deletions(-)

diff --git a/doc/emacs/maintaining.texi b/doc/emacs/maintaining.texi
index 579098c81b1..a064103aa25 100644
--- a/doc/emacs/maintaining.texi
+++ b/doc/emacs/maintaining.texi
@@ -2529,6 +2529,15 @@ Identifier Search
 referenced.  The XREF mode commands are available in this buffer, see
 @ref{Xref Commands}.
 
+When invoked in a buffer whose major mode uses the @code{etags} backend,
+@kbd{M-?} searches files and buffers whose major mode matches that of
+the original buffer.  It guesses that mode from file extensions, so if
+@kbd{M-?} seems to be skipping relevant buffers or files, try
+customizing either the variable @code{semantic-symref-filepattern-alist}
+(if your buffer's major mode already has an entry in it), or
+@code{auto-mode-alist} (if not), thereby informing @code{xref} of the
+missing extensions (@pxref{Choosing Modes}).
+
 @vindex xref-auto-jump-to-first-xref
   If the value of the variable @code{xref-auto-jump-to-first-xref} is
 @code{t}, @code{xref-find-references} automatically jumps to the first
@@ -2747,10 +2756,32 @@ Tag Syntax
 @item
 In @LaTeX{} documents, the arguments for @code{\chapter},
 @code{\section}, @code{\subsection}, @code{\subsubsection},
-@code{\eqno}, @code{\label}, @code{\ref}, @code{\cite},
-@code{\bibitem}, @code{\part}, @code{\appendix}, @code{\entry},
-@code{\index}, @code{\def}, @code{\newcommand}, @code{\renewcommand},
-@code{\newenvironment} and @code{\renewenvironment} are tags.
+@code{\eqno}, @code{\label}, @code{\ref}, @code{\Ref}, @code{\footref},
+@code{\cite}, @code{\bibitem}, @code{\part}, @code{\appendix},
+@code{\entry}, @code{\index}, @code{\def}, @code{\edef}, @code{\gdef},
+@code{\xdef}, @code{\newcommand}, @code{\renewcommand},
+@code{\newenvironment}, @code{\renewenvironment},
+@code{\DeclareRobustCommand}, @code{\newrobustcmd},
+@code{\renewrobustcmd}, @code{\providecommand},
+@code{\providerobustcmd}, @code{\NewDocumentCommand},
+@code{\RenewDocumentCommand}, @code{\ProvideDocumentCommand},
+@code{\DeclareDocumentCommand}, @code{\NewExpandableDocumentCommand},
+@code{\RenewExpandableDocumentCommand},
+@code{\ProvideExpandableDocumentCommand},
+@code{\DeclareExpandableDocumentCommand},
+@code{\NewDocumentEnvironment}, @code{\RenewDocumentEnvironment},
+@code{\ProvideDocumentEnvironment}, @code{\DeclareDocumentEnvironment},
+@code{\csdef}, @code{\csedef}, @code{\csgdef}, @code{\csxdef},
+@code{\csletcs}, @code{\cslet}, @code{\letcs}, @code{\let},
+@code{\cs_new_protected_nopar}, @code{\cs_new_protected},
+@code{\cs_new_nopar}, @code{\cs_new_eq}, @code{\cs_new},
+@code{\cs_set_protected_nopar}, @code{\cs_set_protected},
+@code{\cs_set_nopar}, @code{\cs_set_eq}, @code{\cs_set},
+@code{\cs_gset_protected_nopar}, @code{\cs_gset_protected},
+@code{\cs_gset_nopar}, @code{\cs_gset_eq}, @code{\cs_gset},
+@code{\cs_generate_from_arg_count}, and @code{\cs_generate_variant} are
+tags.  So too are the arguments of any starred variants of these
+commands.
 
 Other commands can make tags as well, if you specify them in the
 environment variable @env{TEXTAGS} before invoking @command{etags}.  The
diff --git a/lib-src/etags.c b/lib-src/etags.c
index 03bc55de03d..11fddc187c2 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -793,11 +793,27 @@ #define STDIN 0x1001		/* returned by getopt_long on --parse-stdin */
 static const char *TeX_suffixes [] =
   { "bib", "clo", "cls", "ltx", "sty", "TeX", "tex", NULL };
 static const char TeX_help [] =
-"In LaTeX text, the argument of any of the commands '\\chapter',\n\
-'\\section', '\\subsection', '\\subsubsection', '\\eqno', '\\label',\n\
-'\\ref', '\\cite', '\\bibitem', '\\part', '\\appendix', '\\entry',\n\
-'\\index', '\\def', '\\newcommand', '\\renewcommand',\n\
-'\\newenvironment' or '\\renewenvironment' is a tag.\n\
+"In LaTeX text, the argument of the commands '\\chapter', '\\section',\n\
+'\\subsection', '\\subsubsection', '\\eqno', '\\label', '\\ref',\n\
+'\\Ref', '\\footref', '\\cite', '\\bibitem', '\\part', '\\appendix',\n\
+'\\entry', '\\index', '\\def', '\\edef', '\\gdef', '\\xdef',\n\
+'\\newcommand', '\\renewcommand', '\\newrobustcmd', '\\renewrobustcmd',\n\
+'\\newenvironment', '\\renewenvironment', '\\DeclareRobustCommand',\n\
+'\\providecommand', '\\providerobustcmd', '\\NewDocumentCommand',\n\
+'\\RenewDocumentCommand', '\\ProvideDocumentCommand',\n\
+'\\DeclareDocumentCommand', '\\NewExpandableDocumentCommand',\n\
+'\\RenewExpandableDocumentCommand', '\\ProvideExpandableDocumentCommand',\n\
+'\\DeclareExpandableDocumentCommand', '\\NewDocumentEnvironment',\n\
+'\\RenewDocumentEnvironment', '\\ProvideDocumentEnvironment',\n\
+'\\DeclareDocumentEnvironment','\\csdef', '\\csedef', '\\csgdef',\n\
+'\\csxdef', '\\csletcs', '\\cslet', '\\letcs', '\\let',\n\
+'\\cs_new_protected_nopar', '\\cs_new_protected', '\\cs_new_nopar',\n\
+'\\cs_new_eq', '\\cs_new', '\\cs_set_protected_nopar',\n\
+'\\cs_set_protected', '\\cs_set_nopar', '\\cs_set_eq', '\\cs_set',\n\
+'\\cs_gset_protected_nopar', '\\cs_gset_protected', '\\cs_gset_nopar',\n\
+'\\cs_gset_eq', '\\cs_gset', '\\cs_generate_from_arg_count', or\n\
+'\\cs_generate_variant' is a tag.  So is the argument of any starred\n\
+variant of these commands.\n\
 \n\
 Other commands can be specified by setting the environment variable\n\
 'TEXTAGS' to a colon-separated list like, for example,\n\
@@ -5740,11 +5756,25 @@ Scheme_functions (FILE *inf)
 static linebuffer *TEX_toktab = NULL; /* Table with tag tokens */
 
 /* Default set of control sequences to put into TEX_toktab.
-   The value of environment var TEXTAGS is prepended to this.  */
+   The value of environment var TEXTAGS is prepended to this.
+   (2024) Add variants of '\def', some additional LaTeX (and
+   former xparse) commands, common variants from the
+   'etoolbox' package, and the main expl3 commands. */
 static const char *TEX_defenv = "\
-:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
-:part:appendix:entry:index:def\
-:newcommand:renewcommand:newenvironment:renewenvironment";
+:label:ref:Ref:footref:chapter:section:subsection:subsubsection:eqno:cite\
+:bibitem:part:appendix:entry:index:def:edef:gdef:xdef:newcommand:renewcommand\
+:newenvironment:renewenvironment:DeclareRobustCommand:renewrobustcmd\
+:newrobustcmd:providecommand:providerobustcmd:NewDocumentCommand\
+:RenewDocumentCommand:ProvideDocumentCommand:DeclareDocumentCommand\
+:NewExpandableDocumentCommand:RenewExpandableDocumentCommand\
+:ProvideExpandableDocumentCommand:DeclareExpandableDocumentCommand\
+:NewDocumentEnvironment:RenewDocumentEnvironment\
+:ProvideDocumentEnvironment:DeclareDocumentEnvironment:csdef\
+:csedef:csgdef:csxdef:csletcs:cslet:letcs:let:cs_new_protected_nopar\
+:cs_new_protected:cs_new_nopar:cs_new_eq:cs_new:cs_set_protected_nopar\
+:cs_set_protected:cs_set_nopar:cs_set_eq:cs_set:cs_gset_protected_nopar\
+:cs_gset_protected:cs_gset_nopar:cs_gset_eq:cs_gset\
+:cs_generate_from_arg_count:cs_generate_variant";
 
 static void TEX_decode_env (const char *, const char *);
 
@@ -5803,19 +5833,137 @@ TeX_commands (FILE *inf)
 	      {
 		char *p;
 		ptrdiff_t namelen, linelen;
-		bool opgrp = false;
+		bool opgrp = false, one_esc = false, is_explthree = false;
 
 		cp = skip_spaces (cp + key->len);
+
+		/* 1. The canonical expl3 syntax looks something like this:
+		   \cs_new:Npn \__hook_tl_gput:Nn { \ERROR }.  First, if we
+		   want to tag any such commands, we include only the part
+		   before the colon (cs_new) in TEX_defenv or TEXTAGS.  Second,
+		   etags skips the argument specifier (including the colon)
+		   after the tag token, so that it doesn't become the tag name.
+		   Third, we set the boolean 'is_explthree' to true so that we
+		   can remove the argument specifier from the actual tag name
+		   (__hook_tl_gput).  This all allows us to include expl3
+		   constructs in TEX_defenv or in the environment variable
+		   TEXTAGS without requiring a change of separator, and it also
+		   allows us to find the definition of variant commands (with
+		   different argument specifiers) defined using, for example,
+		   \cs_generate_variant:Nn.  Please note that the expl3 spec
+		   requires etags to pay more attention to whitespace in the
+		   code.
+
+		   2. We also automatically remove the asterisk from starred
+		   variants of all commands, without the need to include the
+		   starred commands explicitly in TEX_defenv or TEXTAGS. */
+		if (*cp == ':')
+		  {
+		    while (!c_isspace (*cp) && *cp != TEX_opgrp)
+		      cp++;
+		    cp = skip_spaces (cp);
+		    is_explthree = true;
+		  }
+		else if (*cp == '*')
+		  cp++;
+
+		/* Skip the optional arguments to commands in the tags list so
+		   that these arguments don't end up as the name of the tag.
+		   The name will instead come from the argument in curly braces
+		   that follows the optional ones. */
+		while (*cp != '\0' && *cp != '%')
+		  {
+		    if (*cp == '[')
+		      {
+			while (*cp != ']' && *cp != '\0' && *cp != '%')
+			  cp++;
+		      }
+		    else if (*cp == '(')
+		      {
+			while (*cp != ')' && *cp != '\0' && *cp != '%')
+			  cp++;
+		      }
+		    else if (*cp == ']' || *cp == ')')
+		      cp++;
+		    else
+		      break;
+		  }
 		if (*cp == TEX_opgrp)
 		  {
 		    opgrp = true;
 		    cp++;
+		    cp = skip_spaces (cp); /* For expl3 code. */
 		  }
+
+		/* Removing the TeX escape character from tag names simplifies
+		   things for editors finding tagged commands in TeX buffers.
+		   This applies to Emacs but also to the tag-finding behavior
+		   of at least some of the editors that use ctags, though in
+		   the latter case this will remain suboptimal.  The
+		   undocumented ctags option '--no-duplicates' may help. */
+		if (*cp == TEX_esc)
+		  {
+		    cp++;
+		    one_esc = true;
+		  }
+
+		/* Testing !c_isspace && !c_ispunct is simpler, but halts
+		   processing at too many places.  The list as it stands tries
+		   both to ensure that tag names will derive from macro names
+		   rather than from optional parameters to those macros, and
+		   also to return findable names while still allowing for
+		   unorthodox constructs. */
 		for (p = cp;
-		     (!c_isspace (*p) && *p != '#' &&
-		      *p != TEX_opgrp && *p != TEX_clgrp);
+		     (!c_isspace (*p) && *p != '#' && *p != '=' &&
+		      *p != '[' && *p != '(' && *p != TEX_opgrp &&
+		      *p != TEX_clgrp && *p != '"' && *p != '\'' &&
+		      *p != '%' && *p != ',' && *p != '|' && *p != '$');
 		     p++)
-		  continue;
+		  /* In expl3 code we remove the argument specification from
+		     the tag name.  More generally we allow only one (deleted)
+		     escape char in a tag name, which (primarily) enables
+		     tagging a TeX command's different, possibly temporary,
+		     '\let' bindings. */
+		  if (is_explthree && *p == ':')
+		    break;
+		  else if (*p == TEX_esc)
+		    { /* Second part of test is for, e.g., \cslet. */
+		      if (!one_esc && !opgrp)
+			{
+			  one_esc = true;
+			  continue;
+			}
+		      else
+			break;
+		    }
+		  else
+		    continue;
+		/* For TeX files, tags without a name are basically cruft, and
+		   in some situations they can produce spurious and confusing
+		   matches.  Try to catch as many cases as possible where a
+		   command name is of the form '\(', but avoid, as far as
+		   possible, the spurious matches. */
+		if (p == cp)
+		  {
+		    switch (*p)
+		      { /* Include =? */
+		      case '(': case '[': case '"': case '\'':
+		      case '\\': case '!': case '=': case ',':
+		      case '|': case '$':
+			p++;
+			break;
+		      case '{': case '}': case '<': case '>':
+			if (!opgrp)
+			  {
+			      p++;
+			      if (*p == '\0' || *p == '%')
+				goto tex_next_line;
+			  }
+			break;
+		      default:
+			break;
+		      }
+		  }
 		namelen = p - cp;
 		linelen = lb.len;
 		if (!opgrp || *p == TEX_clgrp)
@@ -5824,9 +5972,18 @@ TeX_commands (FILE *inf)
 		      p++;
 		    linelen = p - lb.buffer + 1;
 		  }
-		make_tag (cp, namelen, true,
-			  lb.buffer, linelen, lineno, linecharno);
-		goto tex_next_line; /* We only tag a line once */
+		if (namelen)
+		  make_tag (cp, namelen, true,
+			    lb.buffer, linelen, lineno, linecharno);
+		/* Lines with more than one \def or \let are surprisingly
+		   common in TeX files, especially in the system files that
+		   form the basis of the various TeX formats.  This tags them
+		   all. */
+		/* goto tex_next_line; /\* We only tag a line once *\/ */
+		while (*cp != '\0' && *cp != '%' && *cp != TEX_esc)
+		  cp++;
+		if (*cp != TEX_esc)
+		  goto tex_next_line;
 	      }
 	}
     tex_next_line:
diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
index 97c950267c6..4a595aa1ea5 100644
--- a/lisp/textmodes/tex-mode.el
+++ b/lisp/textmodes/tex-mode.el
@@ -636,6 +636,14 @@ tex-font-lock-keywords-2
 	      3 '(tex-font-lock-append-prop 'bold) 'append)))))
    "Gaudy expressions to highlight in TeX modes.")
 
+(defvar-local tex-expl-region-list nil
+  "List of region boundaries where expl3 syntax is active.
+It will be nil in buffers where expl3 syntax is always active, e.g.,
+expl3 classes or packages.")
+
+(defvar-local tex-expl-buffer-p nil
+  "Non-nil in buffers where expl3 syntax is always active.")
+
 (defun tex-font-lock-suscript (pos)
   (unless (or (memq (get-text-property pos 'face)
 		    '(font-lock-constant-face font-lock-builtin-face
@@ -645,7 +653,17 @@ tex-font-lock-suscript
 		    (pos pos))
 		(while (eq (char-before pos) ?\\)
 		  (setq pos (1- pos) odd (not odd)))
-		odd))
+		odd)
+              ;; Check if POS is in an expl3 syntax region or an expl3 buffer
+              (when (eq (char-after pos) ?_)
+                (or tex-expl-buffer-p
+                    (and
+                     tex-expl-region-list
+                     (catch 'result
+	               (dolist (range tex-expl-region-list)
+	                 (and (> pos (car range))
+	                      (< pos (cdr range))
+                              (throw 'result t))))))))
     (if (eq (char-after pos) ?_)
 	`(face subscript display (raise ,(car tex-font-script-display)))
       `(face superscript display (raise ,(cadr tex-font-script-display))))))
@@ -1289,8 +1307,16 @@ tex-common-initialization
                 #'tex--prettify-symbols-compose-p)
   (setq-local syntax-propertize-function
 	      (syntax-propertize-rules latex-syntax-propertize-rules))
+  ;; Don't add extra processing to `syntax-propertize' in files where
+  ;; expl3 syntax is always active.
+  :after-hook (progn (tex-expl-buffer-parse)
+                     (unless tex-expl-buffer-p
+                       (add-hook 'syntax-propertize-extend-region-functions
+                                 #'tex-expl-region-set nil t)))
   ;; TABs in verbatim environments don't do what you think.
   (setq-local indent-tabs-mode nil)
+  ;; Set up xref backend in TeX buffers.
+  (add-hook 'xref-backend-functions #'tex--xref-backend nil t)
   ;; Other vars that should be buffer-local.
   (make-local-variable 'tex-command)
   (make-local-variable 'tex-start-of-header)
@@ -1936,6 +1962,36 @@ tex-count-words
 		(forward-sexp 1))))))
       (message "%s words" count))))
 
+(defun tex-expl-buffer-parse ()
+  "Identify buffers where expl3 syntax is always active."
+  (save-excursion
+    (goto-char (point-min))
+    (when (tex-search-noncomment
+	   (re-search-forward
+	    "\\\\\\(?:ExplFile\\|ProvidesExpl\\|__xparse_file\\)"
+	    nil t))
+      (setq tex-expl-buffer-p t))))
+
+(defun tex-expl-region-set (_beg _end)
+  "Create a list of regions where expl3 syntax is active.
+This function updates the list whenever `syntax-propertize' runs, and
+stores it in the buffer-local variable `tex-expl-region-list'.  The
+list will always be nil when the buffer visits an expl3 file, e.g., an
+expl3 class or package, where expl3 syntax is always active."
+  (unless syntax-ppss--updated-cache;; Stop forward search running twice.
+    (setq tex-expl-region-list nil)
+    ;; Leaving this test here allows users to set `tex-expl-buffer-p'
+    ;; independently of the mode's automatic detection of an expl3 file.
+    (unless tex-expl-buffer-p
+      (goto-char (point-min))
+      (let ((case-fold-search nil))
+        (while (tex-search-noncomment
+                (search-forward "\\ExplSyntaxOn" nil t))
+          (let ((new-beg (point))
+                (new-end (or (tex-search-noncomment
+                              (search-forward "\\ExplSyntaxOff" nil t))
+                             (point-max))))
+            (push (cons new-beg new-end) tex-expl-region-list)))))))
 
 \f
 ;;; Invoking TeX in an inferior shell.
@@ -3742,6 +3798,321 @@ tex-chktex
       (process-send-region tex-chktex--process (point-min) (point-max))
       (process-send-eof tex-chktex--process))))
 
+\f
+;;; Xref backend
+
+;; Here we lightly adapt the default etags backend for xref so that
+;; the main xref user commands (including `xref-find-definitions',
+;; `xref-find-apropos', and `xref-find-references' [on M-., C-M-., and
+;; M-?, respectively]) work in TeX buffers.  The only methods we
+;; actually modify are `xref-backend-identifier-at-point' and
+;; `xref-backend-references'.  Many of the complications here, and in
+;; `etags' itself, are due to the necessity of parsing both the old
+;; TeX syntax and the new expl3 syntax, which will continue to appear
+;; together in documents for the foreseeable future.  Synchronizing
+;; Emacs and `etags' this way aims to improve the user experience "out
+;; of the box."
+
+(defvar tex-thingatpt-exclude-chars '(?\\ ?\{ ?\})
+  "Exclude these chars by default from TeX thing-at-point.
+
+The TeX `xref-backend-identifier-at-point' method uses the characters
+listed in this variable to decide on the default search string to
+present to the user who calls an `xref' command.  These characters
+become part of a regexp which always excludes them from that default
+string.  For the `xref' commands to function properly in TeX buffers, at
+least the TeX escape and the two TeX grouping characters should be
+listed here.  Should your TeX documents contain other characters which
+you want to exclude by default, then you can add them to the list,
+though you may wish to consult the functions
+`tex-thingatpt--beginning-of-symbol' and `tex-thingatpt--end-of-symbol'
+to see what the regexp already contains.  If your documents contain
+non-standard escape and grouping characters, then you can replace the
+three listed here with your own, thereby allowing the three standard
+characters to appear by default in search strings.  Please be aware,
+however, that the `etags' program only recognizes `\\' (92) and `!' (33)
+as escape characters in TeX documents, and if it detects the latter it
+also uses `<>' as the TeX grouping construct rather than `{}'.  Setting
+the escape and grouping chars to anything other than `\\=\\{}' or `!<>'
+will not be useful without changes to `etags', at least for commands
+that search tags tables, such as \\[xref-find-definitions] and \
+\\[xref-find-apropos].
+
+Should you wish to change the defaults, please also be aware that,
+without further modifications to tex-mode.el, the usual text-parsing
+routines for `font-lock' and the like won't work correctly, as the
+default escape and grouping characters are currently hard coded in many
+places.")
+
+;; Populate `semantic-symref-filepattern-alist' for the in-tree modes;
+;; AUCTeX is doing the same for its modes.
+(with-eval-after-load 'semantic/symref/grep
+  (defvar semantic-symref-filepattern-alist)
+  (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
+                     "*.bbl" "*.drv" "*.hva")
+        semantic-symref-filepattern-alist)
+  (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
+        semantic-symref-filepattern-alist)
+  (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))
+
+(defun tex--xref-backend () 'tex-etags)
+
+;; Setup AUCTeX modes (for testing purposes only).
+
+(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
+
+(defun tex-set-auctex-xref-backend ()
+  (add-hook 'xref-backend-functions #'tex--xref-backend nil t))
+
+;; `xref-find-references' currently may need this when called from a
+;; latex-mode buffer in order to search files or buffers with a .tex
+;; suffix (including the buffer from which it has been called).  We
+;; append it to `auto-mode-alist' so as not to interfere with the usual
+;; mode-setting apparatus.  Changes here and in AUCTeX should soon
+;; render it unnecessary.
+(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)
+
+(cl-defmethod xref-backend-identifier-at-point ((_backend (eql 'tex-etags)))
+  (require 'etags)
+  (tex--thing-at-point))
+
+;; The detection of `_' and `:' is a primitive method for determining
+;; whether point is on an expl3 construct.  It may fail in some
+;; instances.
+(defun tex--thing-at-point ()
+  "Demarcate `thing-at-point' for the TeX `xref' backend."
+  (let ((bounds (tex--bounds-of-symbol-at-point)))
+    (when bounds
+      (let ((texsym (buffer-substring-no-properties (car bounds) (cdr bounds))))
+        (if (and (not (string-match-p "reference" (symbol-name this-command)))
+                 (seq-contains-p texsym ?_)
+                 (seq-contains-p texsym ?:))
+            (seq-take texsym (seq-position texsym ?:))
+          texsym)))))
+
+(defun tex-thingatpt--beginning-of-symbol ()
+  (and
+   (re-search-backward (concat "[]["
+                               (mapconcat #'regexp-quote
+                                          (mapcar #'char-to-string
+                                                  tex-thingatpt-exclude-chars))
+                               "\"*`'#=&()%,|$[:cntrl:][:blank:]]"))
+   (forward-char)))
+
+(defun tex-thingatpt--end-of-symbol ()
+  (and
+   (re-search-forward (concat "[]["
+                              (mapconcat #'regexp-quote
+                                          (mapcar #'char-to-string
+                                                  tex-thingatpt-exclude-chars))
+                              "\"*`'#=&()%,|$[:cntrl:][:blank:]]"))
+   (backward-char)))
+
+(defun tex--bounds-of-symbol-at-point ()
+  "Simplify `bounds-of-thing-at-point' for TeX `xref' backend."
+  (let ((orig (point)))
+    (ignore-errors
+      (save-excursion
+	(tex-thingatpt--end-of-symbol)
+	(tex-thingatpt--beginning-of-symbol)
+	(let ((beg (point)))
+	  (if (<= beg orig)
+	      (let ((real-end
+		     (progn
+		       (tex-thingatpt--end-of-symbol)
+		       (point))))
+		(cond ((and (<= orig real-end) (< beg real-end))
+		       (cons beg real-end))
+                      ((and (= orig real-end) (= beg real-end))
+		       (cons beg (1+ beg)))))))))));; For 1-char TeX commands.
+
+(cl-defmethod xref-backend-identifier-completion-table ((_backend
+                                                         (eql 'tex-etags)))
+  (xref-backend-identifier-completion-table 'etags))
+
+(cl-defmethod xref-backend-identifier-completion-ignore-case ((_backend
+                                                               (eql
+                                                                'tex-etags)))
+  (xref-backend-identifier-completion-ignore-case 'etags))
+
+(cl-defmethod xref-backend-definitions ((_backend (eql 'tex-etags)) symbol)
+  (xref-backend-definitions 'etags symbol))
+
+(cl-defmethod xref-backend-apropos ((_backend (eql 'tex-etags)) pattern)
+  (xref-backend-apropos 'etags pattern))
+
+;; The `xref-backend-references' method requires more code than the
+;; others for at least two main reasons: TeX authors have typically been
+;; free in their invention of new file types with new suffixes, and they
+;; have also tended sometimes to include non-symbol characters in
+;; command names.  When combined with the default Semantic Symbol
+;; Reference API, these two characteristics of TeX code mean that a
+;; command like `xref-find-references' would often fail to find any hits
+;; for a symbol at point, including the one under point in the current
+;; buffer, or it would find only some instances and skip others.
+
+(defun tex-find-references-syntax-table ()
+  (let ((st (if (boundp 'TeX-mode-syntax-table)
+                 (make-syntax-table TeX-mode-syntax-table)
+               (make-syntax-table tex-mode-syntax-table))))
+    st))
+
+(defvar tex--xref-syntax-fun nil)
+
+(defun tex-xref-syntax-function (str beg end)
+  "Provide a bespoke `syntax-propertize-function' for \\[xref-find-references]."
+  (let* (grpb tempstr
+              (shrtstr (if end
+                           (progn
+                             (setq tempstr (seq-take str (1- (length str))))
+                             (if beg
+                                 (setq tempstr (seq-drop tempstr 1))
+                               tempstr))
+                         (seq-drop str 1)))
+              (grpa (if (and beg end)
+                        (prog1
+                            (list 1 "_")
+                          (setq grpb (list 2 "_")))
+                      (list 1 "_")))
+              (re (concat beg (regexp-quote shrtstr) end))
+              (temp-rule (if grpb
+                             (list re grpa grpb)
+                           (list re grpa))))
+    ;; Simple benchmarks suggested that the speed-up from compiling this
+    ;; function was nearly nil, so `eval' and its non-byte-compiled
+    ;; function remain.
+    (setq tex--xref-syntax-fun (eval
+                                `(syntax-propertize-rules ,temp-rule)))))
+
+(defun tex--collect-file-extensions ()
+  "Gather TeX file extensions from `auto-mode-alist'."
+  (let* ((mlist (when (rassq major-mode auto-mode-alist)
+		  (seq-filter
+		   (lambda (elt)
+		     (eq (cdr elt) major-mode))
+		   auto-mode-alist)))
+	 (lcsym (intern-soft (downcase (symbol-name major-mode))))
+	 (lclist (and lcsym
+		      (not (eq lcsym major-mode))
+		      (rassq lcsym auto-mode-alist)
+		      (seq-filter
+		       (lambda (elt)
+			 (eq (cdr elt) lcsym))
+		       auto-mode-alist)))
+	 (shortsym (when (stringp mode-name)
+		     (intern-soft (concat (string-trim-right mode-name "/.*")
+					  "-mode"))))
+	 (lcshortsym (when (stringp mode-name)
+		       (intern-soft (downcase
+				     (concat
+				      (string-trim-right mode-name "/.*")
+				      "-mode")))))
+	 (shlist (and shortsym
+		      (not (eq shortsym major-mode))
+		      (not (eq shortsym lcsym))
+		      (rassq shortsym auto-mode-alist)
+		      (seq-filter
+		       (lambda (elt)
+			 (eq (cdr elt) shortsym))
+		       auto-mode-alist)))
+	 (lcshlist (and lcshortsym
+			(not (eq lcshortsym major-mode))
+			(not (eq lcshortsym lcsym))
+			(rassq lcshortsym auto-mode-alist)
+			(seq-filter
+			 (lambda (elt)
+			   (eq (cdr elt) lcshortsym))
+			 auto-mode-alist)))
+	 (exts (when (or mlist lclist shlist lcshlist)
+		 (seq-union (seq-map #'car lclist)
+			    (seq-union (seq-map #'car mlist)
+				       (seq-union (seq-map #'car lcshlist)
+						  (seq-map #'car shlist))))))
+	 (ed-exts (when exts
+		    (seq-map
+		     (lambda (elt)
+		       (concat "*" (string-trim  elt "\\\\" "\\\\'")))
+		     exts))))
+    ed-exts))
+
+(defvar tex--buffers-list nil)
+(defvar-local tex--old-syntax-function nil)
+
+(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
+  "Find references of IDENTIFIER in TeX buffers and files."
+  (require 'semantic/symref/grep)
+  (defvar semantic-symref-filepattern-alist)
+  (let (bufs texbufs
+             (mode major-mode))
+    (dolist (buf (buffer-list))
+      (if (eq (buffer-local-value 'major-mode buf) mode)
+          (push buf bufs)
+        (when (string-match-p ".*\\.[tT]e[xX]" (buffer-name buf))
+          (push buf texbufs))))
+    (unless (seq-set-equal-p tex--buffers-list bufs)
+      (let* ((amalist (tex--collect-file-extensions))
+	     (extlist (alist-get mode semantic-symref-filepattern-alist))
+	     (extlist-new (seq-uniq
+                           (seq-union amalist extlist #'string-match-p))))
+	(setq tex--buffers-list bufs)
+	(dolist (buf bufs)
+	  (when-let ((fbuf (buffer-file-name buf))
+		     (ext (file-name-extension fbuf))
+		     (finext (concat "*." ext))
+		     ((not (seq-find (lambda (elt) (string-match-p elt finext))
+				     extlist-new)))
+		     ((push finext extlist-new)))))
+	(unless (seq-set-equal-p extlist-new extlist)
+	  (setf (alist-get mode semantic-symref-filepattern-alist)
+                extlist-new))))
+    (let* (setsyntax
+           (punct (with-syntax-table (tex-find-references-syntax-table)
+                    (seq-positions identifier (list ?w ?_)
+			           (lambda (elt sycode)
+			             (not (memq (char-syntax elt) sycode))))))
+           (end (and punct
+                     (memq (1- (length identifier)) punct)
+                     (> (length identifier) 1)
+                     (concat "\\("
+                             (regexp-quote
+                              (string (elt identifier
+                                           (1- (length identifier)))))
+                             "\\)")))
+           (beg (and punct
+                     (memq 0 punct)
+                     (concat "\\("
+                             (regexp-quote (string (elt identifier 0)))
+                             "\\)")))
+           (text-mode-hook
+            (if (or end beg)
+                (progn
+                  (tex-xref-syntax-function identifier beg end)
+                  (setq setsyntax (lambda ()
+		                    (setq-local syntax-propertize-function
+                                                tex--xref-syntax-fun)
+                                    (setq-local TeX-style-hook-applied-p t)))
+                  (cons setsyntax text-mode-hook))
+              text-mode-hook)))
+      (unless (memq 'doctex-mode (derived-mode-all-parents mode))
+        (setq bufs (append texbufs bufs)))
+      (when (or end beg)
+        (dolist (buf bufs)
+          (with-current-buffer buf
+            (unless (local-variable-p 'tex--old-syntax-function)
+              (setq tex--old-syntax-function syntax-propertize-function))
+            (setq-local syntax-propertize-function
+                        tex--xref-syntax-fun)
+            (syntax-ppss-flush-cache (point-min)))))
+      (unwind-protect
+          (xref-backend-references nil identifier)
+        (when (or end beg)
+          (dolist (buf bufs)
+            (with-current-buffer buf
+              (when buffer-file-truename
+                (setq-local syntax-propertize-function
+                            tex--old-syntax-function)
+                (syntax-ppss-flush-cache (point-min))))))))))
+
 (make-obsolete-variable 'tex-mode-load-hook
                         "use `with-eval-after-load' instead." "28.1")
 (run-hooks 'tex-mode-load-hook)
-- 
2.39.4


  reply	other threads:[~2024-05-16 12:56 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-03 15:09 bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21  2:11 ` Dmitry Gutov
2022-02-21  9:48   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21 17:28     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21 23:56       ` Dmitry Gutov
2022-02-22 15:19         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-23  2:21           ` Dmitry Gutov
2022-02-23 10:45             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-24  2:23               ` Dmitry Gutov
2022-02-24 13:15                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-21 23:55     ` Dmitry Gutov
2022-09-08 13:25   ` Lars Ingebrigtsen
2022-09-08 13:34     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-09-08 13:39       ` Lars Ingebrigtsen
2022-09-08 15:50         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-03  9:08           ` Stefan Kangas
2023-09-03 10:03             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-03 10:46               ` Stefan Kangas
2023-09-13 11:10                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-13 13:42                   ` Stefan Kangas
2023-09-13 15:23                   ` Dmitry Gutov
2023-09-13 17:01                     ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-13 23:59                       ` Dmitry Gutov
2023-09-14  6:10                         ` Eli Zaretskii
2023-09-15 18:45                           ` Tassilo Horn
2023-09-16  5:53                             ` Ikumi Keita
2023-09-17  8:49                               ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 13:06                                 ` Arash Esbati
2024-04-22 14:56                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 16:15                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 16:37                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 17:16                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-22 17:25                                           ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-24  0:09                                           ` Dmitry Gutov
2024-04-24  9:02                                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-23 12:04                                     ` Arash Esbati
2024-04-23 13:21                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-29 14:15                                   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-02  0:43                                     ` Dmitry Gutov
2024-05-02 13:32                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-03 13:42                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-07  2:27                                           ` Dmitry Gutov
2024-05-09  3:00                                             ` Dmitry Gutov
2024-05-09  6:38                                               ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-09 10:49                                               ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-13 20:54                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-14 21:24                                                 ` Dmitry Gutov
2024-05-16 18:18                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-20  0:21                                                     ` Dmitry Gutov
2024-05-20  2:38                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-25  7:57                                                         ` Eli Zaretskii
2024-05-25 23:01                                                         ` Dmitry Gutov
2024-05-07  2:06                                         ` Dmitry Gutov
2024-05-02  6:47                                     ` Arash Esbati
2024-05-02 13:34                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-03 14:10                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-04  8:26                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-04 14:32                                       ` Arash Esbati
2024-05-04 14:54                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-04 21:15                                           ` Arash Esbati
2024-05-07 13:15                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-15 15:47                                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-16  7:53                                         ` Arash Esbati
2024-05-16 12:56                                           ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors [this message]
2023-09-14 16:11                         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-14 23:55                           ` Dmitry Gutov
2023-09-15  6:47                             ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-13 19:16                     ` Eli Zaretskii
2023-09-13 20:25                       ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-09-14  5:14                         ` Eli Zaretskii
2022-02-21 12:35 ` Arash Esbati
2022-02-21 14:03   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-25 20:16 ` Augusto Stoffel
2022-02-26  9:29   ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-26 10:56     ` Augusto Stoffel
2022-02-27 18:42       ` Arash Esbati
2022-02-28  9:09         ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-28 11:54           ` Arash Esbati
2022-02-28 13:11             ` Augusto Stoffel
2022-02-28 19:04               ` Arash Esbati
2022-03-01  8:46                 ` David Fussner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-28 13:05           ` Augusto Stoffel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADF+RthKH96e0Vrkfw3kGikTX-edOx2Phdj-vihMfzg5qt=tUg@mail.gmail.com' \
    --to=bug-gnu-emacs@gnu.org \
    --cc=53749@debbugs.gnu.org \
    --cc=arash@gnu.org \
    --cc=dfussner@googlemail.com \
    --cc=dgutov@yandex.ru \
    --cc=eliz@gnu.org \
    --cc=ikumi@ikumi.que.jp \
    --cc=monnier@iro.umontreal.ca \
    --cc=stefankangas@gmail.com \
    --cc=tsdh@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.