From: David Fussner via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Arash Esbati <arash@gnu.org>
Cc: 53749@debbugs.gnu.org, Ikumi Keita <ikumi@ikumi.que.jp>,
Dmitry Gutov <dgutov@yandex.ru>,
Stefan Monnier <monnier@iro.umontreal.ca>,
Tassilo Horn <tsdh@gnu.org>, Eli Zaretskii <eliz@gnu.org>,
stefankangas@gmail.com
Subject: bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
Date: Thu, 16 May 2024 13:56:56 +0100
Message-ID: <CADF+RthKH96e0Vrkfw3kGikTX-edOx2Phdj-vihMfzg5qt=tUg@mail.gmail.com>
In-Reply-To: <m2v83eur8o.fsf@macmutant.fritz.box>
[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]
Thanks, Arash. Agreed, on all counts. Revised patch attached.
Best, David.
On Thu, 16 May 2024 at 08:54, Arash Esbati <arash@gnu.org> wrote:
>
> David Fussner <dfussner@googlemail.com> writes:
>
> > +(defun tex-expl-buffer-parse ()
> > + "Identify buffers where expl3 syntax is always active."
> > + (save-excursion
> > + (goto-char (point-min))
> > + (when (tex-search-noncomment
> > + (re-search-forward
> > + "\\(?:\\\\\\(?:ExplFile\\|ProvidesExpl\\|__xparse_file\\)\\)"
>
> Is the outer grouping necessary? Why not just:
>
> "\\\\\\(?:ExplFile\\|ProvidesExpl\\|__xparse_file\\)"
>
> > + nil t))
> > + (setq tex-expl-buffer-p t))))
> > +
> > +(defun tex-expl-region-set (_beg _end)
> > + "Create a list of regions where expl3 syntax is active.
> > +This function updates the list whenever `syntax-propertize' runs, and
> > +stores it in the buffer-local variable `tex-expl-region-list'. The
> > +list will always be nil when the buffer visits an expl3 file, e.g., an
> > +expl3 class or package, where expl3 syntax is always active."
> > + (unless syntax-ppss--updated-cache;; Stop forward search running twice.
> > + (setq tex-expl-region-list nil)
> > + ;; Leaving this test here allows users to set `tex-expl-buffer-p'
> > + ;; independently of the mode's automatic detection of an expl3 file.
> > + (unless tex-expl-buffer-p
> > + (goto-char (point-min))
> > + (while (tex-search-noncomment
> > + (re-search-forward "\\ExplSyntaxOn" nil t))
>
> This looks wrong, I think you want `search-forward'.
>
> > + (let ((new-beg (point))
> > + (new-end (or (tex-search-noncomment
> > + (re-search-forward "\\ExplSyntaxOff" nil t))
>
> Same here.
>
> > + (point-max))))
> > + (push (cons new-beg new-end) tex-expl-region-list))))))
>
> Best, Arash
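The region bookkeeping discussed above relies only on literal matching, which is why `search-forward' is the right tool. As a rough illustration outside Emacs, here is a minimal sketch in C. It is not code from the patch: the names `expl_region' and `collect_expl_regions' are hypothetical, offsets are 0-based bytes rather than buffer positions, regions are stored in document order rather than pushed onto a list, and matches inside TeX comments are not skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One expl3 region as 0-based byte offsets into the text.  BEG is the
   offset just after "\ExplSyntaxOn"; END is the offset just after the
   matching "\ExplSyntaxOff", or the text length if none follows.  */
struct expl_region { size_t beg, end; };

/* Hypothetical helper: store up to MAX regions in OUT and return how
   many were found.  Both delimiters are matched literally, so the
   backslash needs no regexp quoting.  */
static size_t
collect_expl_regions (const char *text, struct expl_region *out, size_t max)
{
  static const char on[] = "\\ExplSyntaxOn";
  static const char off[] = "\\ExplSyntaxOff";
  const char *p = text;
  size_t n = 0;

  assert (text != NULL && out != NULL);

  while (n < max && (p = strstr (p, on)) != NULL)
    {
      const char *q;

      p += sizeof on - 1;               /* Move past "\ExplSyntaxOn".  */
      out[n].beg = (size_t) (p - text);
      q = strstr (p, off);
      if (q == NULL)
        {
          /* Unterminated region: it runs to the end of the text.  */
          out[n++].end = strlen (text);
          break;
        }
      q += sizeof off - 1;              /* Move past "\ExplSyntaxOff".  */
      out[n++].end = (size_t) (q - text);
      p = q;                            /* Resume after the region.  */
    }
  return n;
}
```

Calling it on a string with one closed and one unterminated region yields two entries, the second ending at the end of the text.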
[-- Attachment #2: 0003-Provide-a-modified-xref-backend-for-TeX-buffers.patch --]
[-- Type: text/x-patch, Size: 33122 bytes --]
From 2839cbe15f91a1292d26e9208d21ce47270fd72e Mon Sep 17 00:00:00 2001
From: David Fussner <dfussner@googlemail.com>
Date: Thu, 16 May 2024 13:51:12 +0100
Subject: [PATCH] Provide a modified xref backend for TeX buffers
* lib-src/etags.c (TeX_commands): Improve parsing of commands in TeX
buffers.
(TEX_defenv): Expand list of commands to tag by default in TeX
buffers.
(TeX_help):
* doc/emacs/maintaining.texi (Tag Syntax): Document new tagged
commands.
(Identifier Search): Add note about semantic-symref-filepattern-alist,
auto-mode-alist, and xref-find-references.
* lisp/textmodes/tex-mode.el (tex-font-lock-suscript): Test for
underscore in expl3 files and regions, disable subscript face there.
(tex-common-initialization): Set up xref backend for in-tree TeX
modes. Detect expl3 files, and in others set up a list of expl3
regions.
(tex-expl-buffer-parse): New function called in previous.
(tex-expl-buffer-p): New var to hold the result of previous.
(tex-expl-region-set): New function added to
'syntax-propertize-extend-region-functions' hook.
(tex-expl-region-list): New var to hold the result of previous.
(tex--thing-at-point, tex-thingatpt--beginning-of-symbol)
(tex-thingatpt--end-of-symbol, tex--bounds-of-symbol-at-point):
New functions to return 'thing-at-point' for xref backend.
(tex-thingatpt-exclude-chars): New var to do the same.
(xref-backend-identifier-at-point): New TeX backend method to provide
symbols for processing by xref.
(xref-backend-identifier-completion-table)
(xref-backend-identifier-completion-ignore-case)
(xref-backend-definitions, xref-backend-apropos): Placeholders to
call the standard 'etags' xref backend methods.
(xref-backend-references): Wrapper to call the default xref backend
method, finding as many relevant files as possible and using a bespoke
syntax-propertize-function when required.
(tex--collect-file-extensions, tex-xref-syntax-function): Helper
functions for previous.
(tex-find-references-syntax-table, tex--buffers-list)
(tex--xref-syntax-fun, tex--old-syntax-function): New vars for same.
---
doc/emacs/maintaining.texi | 39 +++-
lib-src/etags.c | 189 +++++++++++++++++--
lisp/textmodes/tex-mode.el | 373 ++++++++++++++++++++++++++++++++++++-
3 files changed, 580 insertions(+), 21 deletions(-)
diff --git a/doc/emacs/maintaining.texi b/doc/emacs/maintaining.texi
index 579098c81b1..a064103aa25 100644
--- a/doc/emacs/maintaining.texi
+++ b/doc/emacs/maintaining.texi
@@ -2529,6 +2529,15 @@ Identifier Search
referenced. The XREF mode commands are available in this buffer, see
@ref{Xref Commands}.
+When invoked in a buffer whose major mode uses the @code{etags} backend,
+@kbd{M-?} searches files and buffers whose major mode matches that of
+the original buffer. It guesses that mode from file extensions, so if
+@kbd{M-?} seems to be skipping relevant buffers or files, try
+customizing either the variable @code{semantic-symref-filepattern-alist}
+(if your buffer's major mode already has an entry in it), or
+@code{auto-mode-alist} (if not), thereby informing @code{xref} of the
+missing extensions (@pxref{Choosing Modes}).
+
@vindex xref-auto-jump-to-first-xref
If the value of the variable @code{xref-auto-jump-to-first-xref} is
@code{t}, @code{xref-find-references} automatically jumps to the first
@@ -2747,10 +2756,32 @@ Tag Syntax
@item
In @LaTeX{} documents, the arguments for @code{\chapter},
@code{\section}, @code{\subsection}, @code{\subsubsection},
-@code{\eqno}, @code{\label}, @code{\ref}, @code{\cite},
-@code{\bibitem}, @code{\part}, @code{\appendix}, @code{\entry},
-@code{\index}, @code{\def}, @code{\newcommand}, @code{\renewcommand},
-@code{\newenvironment} and @code{\renewenvironment} are tags.
+@code{\eqno}, @code{\label}, @code{\ref}, @code{\Ref}, @code{\footref},
+@code{\cite}, @code{\bibitem}, @code{\part}, @code{\appendix},
+@code{\entry}, @code{\index}, @code{\def}, @code{\edef}, @code{\gdef},
+@code{\xdef}, @code{\newcommand}, @code{\renewcommand},
+@code{\newenvironment}, @code{\renewenvironment},
+@code{\DeclareRobustCommand}, @code{\newrobustcmd},
+@code{\renewrobustcmd}, @code{\providecommand},
+@code{\providerobustcmd}, @code{\NewDocumentCommand},
+@code{\RenewDocumentCommand}, @code{\ProvideDocumentCommand},
+@code{\DeclareDocumentCommand}, @code{\NewExpandableDocumentCommand},
+@code{\RenewExpandableDocumentCommand},
+@code{\ProvideExpandableDocumentCommand},
+@code{\DeclareExpandableDocumentCommand},
+@code{\NewDocumentEnvironment}, @code{\RenewDocumentEnvironment},
+@code{\ProvideDocumentEnvironment}, @code{\DeclareDocumentEnvironment},
+@code{\csdef}, @code{\csedef}, @code{\csgdef}, @code{\csxdef},
+@code{\csletcs}, @code{\cslet}, @code{\letcs}, @code{\let},
+@code{\cs_new_protected_nopar}, @code{\cs_new_protected},
+@code{\cs_new_nopar}, @code{\cs_new_eq}, @code{\cs_new},
+@code{\cs_set_protected_nopar}, @code{\cs_set_protected},
+@code{\cs_set_nopar}, @code{\cs_set_eq}, @code{\cs_set},
+@code{\cs_gset_protected_nopar}, @code{\cs_gset_protected},
+@code{\cs_gset_nopar}, @code{\cs_gset_eq}, @code{\cs_gset},
+@code{\cs_generate_from_arg_count}, and @code{\cs_generate_variant} are
+tags. So too are the arguments of any starred variants of these
+commands.
Other commands can make tags as well, if you specify them in the
environment variable @env{TEXTAGS} before invoking @command{etags}. The
diff --git a/lib-src/etags.c b/lib-src/etags.c
index 03bc55de03d..11fddc187c2 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -793,11 +793,27 @@ #define STDIN 0x1001 /* returned by getopt_long on --parse-stdin */
static const char *TeX_suffixes [] =
{ "bib", "clo", "cls", "ltx", "sty", "TeX", "tex", NULL };
static const char TeX_help [] =
-"In LaTeX text, the argument of any of the commands '\\chapter',\n\
-'\\section', '\\subsection', '\\subsubsection', '\\eqno', '\\label',\n\
-'\\ref', '\\cite', '\\bibitem', '\\part', '\\appendix', '\\entry',\n\
-'\\index', '\\def', '\\newcommand', '\\renewcommand',\n\
-'\\newenvironment' or '\\renewenvironment' is a tag.\n\
+"In LaTeX text, the argument of the commands '\\chapter', '\\section',\n\
+'\\subsection', '\\subsubsection', '\\eqno', '\\label', '\\ref',\n\
+'\\Ref', '\\footref', '\\cite', '\\bibitem', '\\part', '\\appendix',\n\
+'\\entry', '\\index', '\\def', '\\edef', '\\gdef', '\\xdef',\n\
+'\\newcommand', '\\renewcommand', '\\newrobustcmd', '\\renewrobustcmd',\n\
+'\\newenvironment', '\\renewenvironment', '\\DeclareRobustCommand',\n\
+'\\providecommand', '\\providerobustcmd', '\\NewDocumentCommand',\n\
+'\\RenewDocumentCommand', '\\ProvideDocumentCommand',\n\
+'\\DeclareDocumentCommand', '\\NewExpandableDocumentCommand',\n\
+'\\RenewExpandableDocumentCommand', '\\ProvideExpandableDocumentCommand',\n\
+'\\DeclareExpandableDocumentCommand', '\\NewDocumentEnvironment',\n\
+'\\RenewDocumentEnvironment', '\\ProvideDocumentEnvironment',\n\
+'\\DeclareDocumentEnvironment', '\\csdef', '\\csedef', '\\csgdef',\n\
+'\\csxdef', '\\csletcs', '\\cslet', '\\letcs', '\\let',\n\
+'\\cs_new_protected_nopar', '\\cs_new_protected', '\\cs_new_nopar',\n\
+'\\cs_new_eq', '\\cs_new', '\\cs_set_protected_nopar',\n\
+'\\cs_set_protected', '\\cs_set_nopar', '\\cs_set_eq', '\\cs_set',\n\
+'\\cs_gset_protected_nopar', '\\cs_gset_protected', '\\cs_gset_nopar',\n\
+'\\cs_gset_eq', '\\cs_gset', '\\cs_generate_from_arg_count', or\n\
+'\\cs_generate_variant' is a tag. So is the argument of any starred\n\
+variant of these commands.\n\
\n\
Other commands can be specified by setting the environment variable\n\
'TEXTAGS' to a colon-separated list like, for example,\n\
@@ -5740,11 +5756,25 @@ Scheme_functions (FILE *inf)
static linebuffer *TEX_toktab = NULL; /* Table with tag tokens */
/* Default set of control sequences to put into TEX_toktab.
- The value of environment var TEXTAGS is prepended to this. */
+ The value of environment var TEXTAGS is prepended to this.
+ (2024) Add variants of '\def', some additional LaTeX (and
+ former xparse) commands, common variants from the
+ 'etoolbox' package, and the main expl3 commands. */
static const char *TEX_defenv = "\
-:chapter:section:subsection:subsubsection:eqno:label:ref:cite:bibitem\
-:part:appendix:entry:index:def\
-:newcommand:renewcommand:newenvironment:renewenvironment";
+:label:ref:Ref:footref:chapter:section:subsection:subsubsection:eqno:cite\
+:bibitem:part:appendix:entry:index:def:edef:gdef:xdef:newcommand:renewcommand\
+:newenvironment:renewenvironment:DeclareRobustCommand:renewrobustcmd\
+:newrobustcmd:providecommand:providerobustcmd:NewDocumentCommand\
+:RenewDocumentCommand:ProvideDocumentCommand:DeclareDocumentCommand\
+:NewExpandableDocumentCommand:RenewExpandableDocumentCommand\
+:ProvideExpandableDocumentCommand:DeclareExpandableDocumentCommand\
+:NewDocumentEnvironment:RenewDocumentEnvironment\
+:ProvideDocumentEnvironment:DeclareDocumentEnvironment:csdef\
+:csedef:csgdef:csxdef:csletcs:cslet:letcs:let:cs_new_protected_nopar\
+:cs_new_protected:cs_new_nopar:cs_new_eq:cs_new:cs_set_protected_nopar\
+:cs_set_protected:cs_set_nopar:cs_set_eq:cs_set:cs_gset_protected_nopar\
+:cs_gset_protected:cs_gset_nopar:cs_gset_eq:cs_gset\
+:cs_generate_from_arg_count:cs_generate_variant";
static void TEX_decode_env (const char *, const char *);
@@ -5803,19 +5833,137 @@ TeX_commands (FILE *inf)
{
char *p;
ptrdiff_t namelen, linelen;
- bool opgrp = false;
+ bool opgrp = false, one_esc = false, is_explthree = false;
cp = skip_spaces (cp + key->len);
+
+ /* 1. The canonical expl3 syntax looks something like this:
+ \cs_new:Npn \__hook_tl_gput:Nn { \ERROR }. First, if we
+ want to tag any such commands, we include only the part
+ before the colon (cs_new) in TEX_defenv or TEXTAGS. Second,
+ etags skips the argument specifier (including the colon)
+ after the tag token, so that it doesn't become the tag name.
+ Third, we set the boolean 'is_explthree' to true so that we
+ can remove the argument specifier from the actual tag name
+ (__hook_tl_gput). This all allows us to include expl3
+ constructs in TEX_defenv or in the environment variable
+ TEXTAGS without requiring a change of separator, and it also
+ allows us to find the definition of variant commands (with
+ different argument specifiers) defined using, for example,
+ \cs_generate_variant:Nn. Please note that the expl3 spec
+ requires etags to pay more attention to whitespace in the
+ code.
+
+ 2. We also automatically remove the asterisk from starred
+ variants of all commands, without the need to include the
+ starred commands explicitly in TEX_defenv or TEXTAGS. */
+ if (*cp == ':')
+ {
+ while (*cp != '\0' && !c_isspace (*cp) && *cp != TEX_opgrp)
+ cp++;
+ cp = skip_spaces (cp);
+ is_explthree = true;
+ }
+ else if (*cp == '*')
+ cp++;
+
+ /* Skip the optional arguments to commands in the tags list so
+ that these arguments don't end up as the name of the tag.
+ The name will instead come from the argument in curly braces
+ that follows the optional ones. */
+ while (*cp != '\0' && *cp != '%')
+ {
+ if (*cp == '[')
+ {
+ while (*cp != ']' && *cp != '\0' && *cp != '%')
+ cp++;
+ }
+ else if (*cp == '(')
+ {
+ while (*cp != ')' && *cp != '\0' && *cp != '%')
+ cp++;
+ }
+ else if (*cp == ']' || *cp == ')')
+ cp++;
+ else
+ break;
+ }
if (*cp == TEX_opgrp)
{
opgrp = true;
cp++;
+ cp = skip_spaces (cp); /* For expl3 code. */
}
+
+ /* Removing the TeX escape character from tag names simplifies
+ things for editors finding tagged commands in TeX buffers.
+ This applies to Emacs but also to the tag-finding behavior
+ of at least some of the editors that use ctags, though in
+ the latter case this will remain suboptimal. The
+ undocumented ctags option '--no-duplicates' may help. */
+ if (*cp == TEX_esc)
+ {
+ cp++;
+ one_esc = true;
+ }
+
+ /* Testing !c_isspace && !c_ispunct is simpler, but halts
+ processing at too many places. The list as it stands tries
+ both to ensure that tag names will derive from macro names
+ rather than from optional parameters to those macros, and
+ also to return findable names while still allowing for
+ unorthodox constructs. */
for (p = cp;
- (!c_isspace (*p) && *p != '#' &&
- *p != TEX_opgrp && *p != TEX_clgrp);
+ (!c_isspace (*p) && *p != '#' && *p != '=' &&
+ *p != '[' && *p != '(' && *p != TEX_opgrp &&
+ *p != TEX_clgrp && *p != '"' && *p != '\'' &&
+ *p != '%' && *p != ',' && *p != '|' && *p != '$');
p++)
- continue;
+ /* In expl3 code we remove the argument specification from
+ the tag name. More generally we allow only one (deleted)
+ escape char in a tag name, which (primarily) enables
+ tagging a TeX command's different, possibly temporary,
+ '\let' bindings. */
+ if (is_explthree && *p == ':')
+ break;
+ else if (*p == TEX_esc)
+ { /* Second part of test is for, e.g., \cslet. */
+ if (!one_esc && !opgrp)
+ {
+ one_esc = true;
+ continue;
+ }
+ else
+ break;
+ }
+ else
+ continue;
+ /* For TeX files, tags without a name are basically cruft, and
+ in some situations they can produce spurious and confusing
+ matches. Try to catch as many cases as possible where a
+ command name is of the form '\(', but avoid, as far as
+ possible, the spurious matches. */
+ if (p == cp)
+ {
+ switch (*p)
+ { /* Include =? */
+ case '(': case '[': case '"': case '\'':
+ case '\\': case '!': case '=': case ',':
+ case '|': case '$':
+ p++;
+ break;
+ case '{': case '}': case '<': case '>':
+ if (!opgrp)
+ {
+ p++;
+ if (*p == '\0' || *p == '%')
+ goto tex_next_line;
+ }
+ break;
+ default:
+ break;
+ }
+ }
namelen = p - cp;
linelen = lb.len;
if (!opgrp || *p == TEX_clgrp)
@@ -5824,9 +5972,18 @@ TeX_commands (FILE *inf)
p++;
linelen = p - lb.buffer + 1;
}
- make_tag (cp, namelen, true,
- lb.buffer, linelen, lineno, linecharno);
- goto tex_next_line; /* We only tag a line once */
+ if (namelen)
+ make_tag (cp, namelen, true,
+ lb.buffer, linelen, lineno, linecharno);
+ /* Lines with more than one \def or \let are surprisingly
+ common in TeX files, especially in the system files that
+ form the basis of the various TeX formats. This tags them
+ all. */
+ /* goto tex_next_line; /\* We only tag a line once *\/ */
+ while (*cp != '\0' && *cp != '%' && *cp != TEX_esc)
+ cp++;
+ if (*cp != TEX_esc)
+ goto tex_next_line;
}
}
tex_next_line:
diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
index 97c950267c6..4a595aa1ea5 100644
--- a/lisp/textmodes/tex-mode.el
+++ b/lisp/textmodes/tex-mode.el
@@ -636,6 +636,14 @@ tex-font-lock-keywords-2
3 '(tex-font-lock-append-prop 'bold) 'append)))))
"Gaudy expressions to highlight in TeX modes.")
+(defvar-local tex-expl-region-list nil
+ "List of region boundaries where expl3 syntax is active.
+It will be nil in buffers where expl3 syntax is always active, e.g.,
+expl3 classes or packages.")
+
+(defvar-local tex-expl-buffer-p nil
+ "Non-nil in buffers where expl3 syntax is always active.")
+
(defun tex-font-lock-suscript (pos)
(unless (or (memq (get-text-property pos 'face)
'(font-lock-constant-face font-lock-builtin-face
@@ -645,7 +653,17 @@ tex-font-lock-suscript
(pos pos))
(while (eq (char-before pos) ?\\)
(setq pos (1- pos) odd (not odd)))
- odd))
+ odd)
+ ;; Check if POS is in an expl3 syntax region or an expl3 buffer
+ (when (eq (char-after pos) ?_)
+ (or tex-expl-buffer-p
+ (and
+ tex-expl-region-list
+ (catch 'result
+ (dolist (range tex-expl-region-list)
+ (and (> pos (car range))
+ (< pos (cdr range))
+ (throw 'result t))))))))
(if (eq (char-after pos) ?_)
`(face subscript display (raise ,(car tex-font-script-display)))
`(face superscript display (raise ,(cadr tex-font-script-display))))))
@@ -1289,8 +1307,16 @@ tex-common-initialization
#'tex--prettify-symbols-compose-p)
(setq-local syntax-propertize-function
(syntax-propertize-rules latex-syntax-propertize-rules))
+ ;; Don't add extra processing to `syntax-propertize' in files where
+ ;; expl3 syntax is always active.
+ :after-hook (progn (tex-expl-buffer-parse)
+ (unless tex-expl-buffer-p
+ (add-hook 'syntax-propertize-extend-region-functions
+ #'tex-expl-region-set nil t)))
;; TABs in verbatim environments don't do what you think.
(setq-local indent-tabs-mode nil)
+ ;; Set up xref backend in TeX buffers.
+ (add-hook 'xref-backend-functions #'tex--xref-backend nil t)
;; Other vars that should be buffer-local.
(make-local-variable 'tex-command)
(make-local-variable 'tex-start-of-header)
@@ -1936,6 +1962,36 @@ tex-count-words
(forward-sexp 1))))))
(message "%s words" count))))
+(defun tex-expl-buffer-parse ()
+ "Identify buffers where expl3 syntax is always active."
+ (save-excursion
+ (goto-char (point-min))
+ (when (tex-search-noncomment
+ (re-search-forward
+ "\\\\\\(?:ExplFile\\|ProvidesExpl\\|__xparse_file\\)"
+ nil t))
+ (setq tex-expl-buffer-p t))))
+
+(defun tex-expl-region-set (_beg _end)
+ "Create a list of regions where expl3 syntax is active.
+This function updates the list whenever `syntax-propertize' runs, and
+stores it in the buffer-local variable `tex-expl-region-list'. The
+list will always be nil when the buffer visits an expl3 file, e.g., an
+expl3 class or package, where expl3 syntax is always active."
+ (unless syntax-ppss--updated-cache ;; Stop forward search running twice.
+ (setq tex-expl-region-list nil)
+ ;; Leaving this test here allows users to set `tex-expl-buffer-p'
+ ;; independently of the mode's automatic detection of an expl3 file.
+ (unless tex-expl-buffer-p
+ (goto-char (point-min))
+ (let ((case-fold-search nil))
+ (while (tex-search-noncomment
+ (search-forward "\\ExplSyntaxOn" nil t))
+ (let ((new-beg (point))
+ (new-end (or (tex-search-noncomment
+ (search-forward "\\ExplSyntaxOff" nil t))
+ (point-max))))
+ (push (cons new-beg new-end) tex-expl-region-list)))))))
\f
;;; Invoking TeX in an inferior shell.
@@ -3742,6 +3798,321 @@ tex-chktex
(process-send-region tex-chktex--process (point-min) (point-max))
(process-send-eof tex-chktex--process))))
+\f
+;;; Xref backend
+
+;; Here we lightly adapt the default etags backend for xref so that
+;; the main xref user commands (including `xref-find-definitions',
+;; `xref-find-apropos', and `xref-find-references' [on M-., C-M-., and
+;; M-?, respectively]) work in TeX buffers. The only methods we
+;; actually modify are `xref-backend-identifier-at-point' and
+;; `xref-backend-references'. Many of the complications here, and in
+;; `etags' itself, are due to the necessity of parsing both the old
+;; TeX syntax and the new expl3 syntax, which will continue to appear
+;; together in documents for the foreseeable future. Synchronizing
+;; Emacs and `etags' this way aims to improve the user experience "out
+;; of the box."
+
+(defvar tex-thingatpt-exclude-chars '(?\\ ?\{ ?\})
+ "Exclude these chars by default from TeX thing-at-point.
+
+The TeX `xref-backend-identifier-at-point' method uses the characters
+listed in this variable to decide on the default search string to
+present to the user who calls an `xref' command. These characters
+become part of a regexp which always excludes them from that default
+string. For the `xref' commands to function properly in TeX buffers, at
+least the TeX escape and the two TeX grouping characters should be
+listed here. Should your TeX documents contain other characters which
+you want to exclude by default, then you can add them to the list,
+though you may wish to consult the functions
+`tex-thingatpt--beginning-of-symbol' and `tex-thingatpt--end-of-symbol'
+to see what the regexp already contains. If your documents contain
+non-standard escape and grouping characters, then you can replace the
+three listed here with your own, thereby allowing the three standard
+characters to appear by default in search strings. Please be aware,
+however, that the `etags' program only recognizes `\\' (92) and `!' (33)
+as escape characters in TeX documents, and if it detects the latter it
+also uses `<>' as the TeX grouping construct rather than `{}'. Setting
+the escape and grouping chars to anything other than `\\=\\{}' or `!<>'
+will not be useful without changes to `etags', at least for commands
+that search tags tables, such as \\[xref-find-definitions] and \
+\\[xref-find-apropos].
+
+Should you wish to change the defaults, please also be aware that,
+without further modifications to tex-mode.el, the usual text-parsing
+routines for `font-lock' and the like won't work correctly, as the
+default escape and grouping characters are currently hard coded in many
+places.")
+
+;; Populate `semantic-symref-filepattern-alist' for the in-tree modes;
+;; AUCTeX is doing the same for its modes.
+(with-eval-after-load 'semantic/symref/grep
+ (defvar semantic-symref-filepattern-alist)
+ (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
+ "*.bbl" "*.drv" "*.hva")
+ semantic-symref-filepattern-alist)
+ (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
+ semantic-symref-filepattern-alist)
+ (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))
+
+(defun tex--xref-backend () 'tex-etags)
+
+;; Set up AUCTeX modes (for testing purposes only).
+
+(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
+
+(defun tex-set-auctex-xref-backend ()
+ (add-hook 'xref-backend-functions #'tex--xref-backend nil t))
+
+;; `xref-find-references' currently may need this when called from a
+;; latex-mode buffer in order to search files or buffers with a .tex
+;; suffix (including the buffer from which it has been called). We
+;; append it to `auto-mode-alist' so as not to interfere with the usual
+;; mode-setting apparatus. Changes here and in AUCTeX should soon
+;; render it unnecessary.
+(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)
+
+(cl-defmethod xref-backend-identifier-at-point ((_backend (eql 'tex-etags)))
+ (require 'etags)
+ (tex--thing-at-point))
+
+;; The detection of `_' and `:' is a primitive method for determining
+;; whether point is on an expl3 construct. It may fail in some
+;; instances.
+(defun tex--thing-at-point ()
+ "Demarcate `thing-at-point' for the TeX `xref' backend."
+ (let ((bounds (tex--bounds-of-symbol-at-point)))
+ (when bounds
+ (let ((texsym (buffer-substring-no-properties (car bounds) (cdr bounds))))
+ (if (and (not (string-match-p "reference" (symbol-name this-command)))
+ (seq-contains-p texsym ?_)
+ (seq-contains-p texsym ?:))
+ (seq-take texsym (seq-position texsym ?:))
+ texsym)))))
+
+(defun tex-thingatpt--beginning-of-symbol ()
+ (and
+ (re-search-backward (concat "[]["
+ (mapconcat #'regexp-quote
+ (mapcar #'char-to-string
+ tex-thingatpt-exclude-chars))
+ "\"*`'#=&()%,|$[:cntrl:][:blank:]]"))
+ (forward-char)))
+
+(defun tex-thingatpt--end-of-symbol ()
+ (and
+ (re-search-forward (concat "[]["
+ (mapconcat #'regexp-quote
+ (mapcar #'char-to-string
+ tex-thingatpt-exclude-chars))
+ "\"*`'#=&()%,|$[:cntrl:][:blank:]]"))
+ (backward-char)))
+
+(defun tex--bounds-of-symbol-at-point ()
+ "Simplify `bounds-of-thing-at-point' for TeX `xref' backend."
+ (let ((orig (point)))
+ (ignore-errors
+ (save-excursion
+ (tex-thingatpt--end-of-symbol)
+ (tex-thingatpt--beginning-of-symbol)
+ (let ((beg (point)))
+ (if (<= beg orig)
+ (let ((real-end
+ (progn
+ (tex-thingatpt--end-of-symbol)
+ (point))))
+ (cond ((and (<= orig real-end) (< beg real-end))
+ (cons beg real-end))
+ ((and (= orig real-end) (= beg real-end))
+ (cons beg (1+ beg))))))))))) ;; For 1-char TeX commands.
+
+(cl-defmethod xref-backend-identifier-completion-table ((_backend
+ (eql 'tex-etags)))
+ (xref-backend-identifier-completion-table 'etags))
+
+(cl-defmethod xref-backend-identifier-completion-ignore-case ((_backend
+ (eql
+ 'tex-etags)))
+ (xref-backend-identifier-completion-ignore-case 'etags))
+
+(cl-defmethod xref-backend-definitions ((_backend (eql 'tex-etags)) symbol)
+ (xref-backend-definitions 'etags symbol))
+
+(cl-defmethod xref-backend-apropos ((_backend (eql 'tex-etags)) pattern)
+ (xref-backend-apropos 'etags pattern))
+
+;; The `xref-backend-references' method requires more code than the
+;; others for at least two main reasons: TeX authors have typically been
+;; free in their invention of new file types with new suffixes, and they
+;; have also tended sometimes to include non-symbol characters in
+;; command names. When combined with the default Semantic Symbol
+;; Reference API, these two characteristics of TeX code mean that a
+;; command like `xref-find-references' would often fail to find any hits
+;; for a symbol at point, including the one under point in the current
+;; buffer, or it would find only some instances and skip others.
+
+(defun tex-find-references-syntax-table ()
+ (let ((st (if (boundp 'TeX-mode-syntax-table)
+ (make-syntax-table TeX-mode-syntax-table)
+ (make-syntax-table tex-mode-syntax-table))))
+ st))
+
+(defvar tex--xref-syntax-fun nil)
+
+(defun tex-xref-syntax-function (str beg end)
+ "Provide a bespoke `syntax-propertize-function' for \\[xref-find-references]."
+ (let* (grpb tempstr
+ (shrtstr (if end
+ (progn
+ (setq tempstr (seq-take str (1- (length str))))
+ (if beg
+ (setq tempstr (seq-drop tempstr 1))
+ tempstr))
+ (seq-drop str 1)))
+ (grpa (if (and beg end)
+ (prog1
+ (list 1 "_")
+ (setq grpb (list 2 "_")))
+ (list 1 "_")))
+ (re (concat beg (regexp-quote shrtstr) end))
+ (temp-rule (if grpb
+ (list re grpa grpb)
+ (list re grpa))))
+ ;; Simple benchmarks suggested that the speed-up from compiling this
+ ;; function was nearly nil, so `eval' and its non-byte-compiled
+ ;; function remain.
+ (setq tex--xref-syntax-fun (eval
+ `(syntax-propertize-rules ,temp-rule)))))
+
+(defun tex--collect-file-extensions ()
+ "Gather TeX file extensions from `auto-mode-alist'."
+ (let* ((mlist (when (rassq major-mode auto-mode-alist)
+ (seq-filter
+ (lambda (elt)
+ (eq (cdr elt) major-mode))
+ auto-mode-alist)))
+ (lcsym (intern-soft (downcase (symbol-name major-mode))))
+ (lclist (and lcsym
+ (not (eq lcsym major-mode))
+ (rassq lcsym auto-mode-alist)
+ (seq-filter
+ (lambda (elt)
+ (eq (cdr elt) lcsym))
+ auto-mode-alist)))
+ (shortsym (when (stringp mode-name)
+ (intern-soft (concat (string-trim-right mode-name "/.*")
+ "-mode"))))
+ (lcshortsym (when (stringp mode-name)
+ (intern-soft (downcase
+ (concat
+ (string-trim-right mode-name "/.*")
+ "-mode")))))
+ (shlist (and shortsym
+ (not (eq shortsym major-mode))
+ (not (eq shortsym lcsym))
+ (rassq shortsym auto-mode-alist)
+ (seq-filter
+ (lambda (elt)
+ (eq (cdr elt) shortsym))
+ auto-mode-alist)))
+ (lcshlist (and lcshortsym
+ (not (eq lcshortsym major-mode))
+ (not (eq lcshortsym lcsym))
+ (rassq lcshortsym auto-mode-alist)
+ (seq-filter
+ (lambda (elt)
+ (eq (cdr elt) lcshortsym))
+ auto-mode-alist)))
+ (exts (when (or mlist lclist shlist lcshlist)
+ (seq-union (seq-map #'car lclist)
+ (seq-union (seq-map #'car mlist)
+ (seq-union (seq-map #'car lcshlist)
+ (seq-map #'car shlist))))))
+ (ed-exts (when exts
+ (seq-map
+ (lambda (elt)
+ (concat "*" (string-trim elt "\\\\" "\\\\'")))
+ exts))))
+ ed-exts))
+
+(defvar tex--buffers-list nil)
+(defvar-local tex--old-syntax-function nil)
+
+(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
+ "Find references of IDENTIFIER in TeX buffers and files."
+ (require 'semantic/symref/grep)
+ (defvar semantic-symref-filepattern-alist)
+ (let (bufs texbufs
+ (mode major-mode))
+ (dolist (buf (buffer-list))
+ (if (eq (buffer-local-value 'major-mode buf) mode)
+ (push buf bufs)
+ (when (string-match-p ".*\\.[tT]e[xX]" (buffer-name buf))
+ (push buf texbufs))))
+ (unless (seq-set-equal-p tex--buffers-list bufs)
+ (let* ((amalist (tex--collect-file-extensions))
+ (extlist (alist-get mode semantic-symref-filepattern-alist))
+ (extlist-new (seq-uniq
+ (seq-union amalist extlist #'string-match-p))))
+ (setq tex--buffers-list bufs)
+ (dolist (buf bufs)
+ (when-let ((fbuf (buffer-file-name buf))
+ (ext (file-name-extension fbuf))
+ (finext (concat "*." ext))
+ ((not (seq-find (lambda (elt) (string-match-p elt finext))
+ extlist-new)))
+ ((push finext extlist-new)))))
+ (unless (seq-set-equal-p extlist-new extlist)
+ (setf (alist-get mode semantic-symref-filepattern-alist)
+ extlist-new))))
+ (let* (setsyntax
+ (punct (with-syntax-table (tex-find-references-syntax-table)
+ (seq-positions identifier (list ?w ?_)
+ (lambda (elt sycode)
+ (not (memq (char-syntax elt) sycode))))))
+ (end (and punct
+ (memq (1- (length identifier)) punct)
+ (> (length identifier) 1)
+ (concat "\\("
+ (regexp-quote
+ (string (elt identifier
+ (1- (length identifier)))))
+ "\\)")))
+ (beg (and punct
+ (memq 0 punct)
+ (concat "\\("
+ (regexp-quote (string (elt identifier 0)))
+ "\\)")))
+ (text-mode-hook
+ (if (or end beg)
+ (progn
+ (tex-xref-syntax-function identifier beg end)
+ (setq setsyntax (lambda ()
+ (setq-local syntax-propertize-function
+ tex--xref-syntax-fun)
+ (setq-local TeX-style-hook-applied-p t)))
+ (cons setsyntax text-mode-hook))
+ text-mode-hook)))
+ (unless (memq 'doctex-mode (derived-mode-all-parents mode))
+ (setq bufs (append texbufs bufs)))
+ (when (or end beg)
+ (dolist (buf bufs)
+ (with-current-buffer buf
+ (unless (local-variable-p 'tex--old-syntax-function)
+ (setq tex--old-syntax-function syntax-propertize-function))
+ (setq-local syntax-propertize-function
+ tex--xref-syntax-fun)
+ (syntax-ppss-flush-cache (point-min)))))
+ (unwind-protect
+ (xref-backend-references nil identifier)
+ (when (or end beg)
+ (dolist (buf bufs)
+ (with-current-buffer buf
+ (when buffer-file-truename
+ (setq-local syntax-propertize-function
+ tex--old-syntax-function)
+ (syntax-ppss-flush-cache (point-min))))))))))
+
(make-obsolete-variable 'tex-mode-load-hook
"use `with-eval-after-load' instead." "28.1")
(run-hooks 'tex-mode-load-hook)
--
2.39.4