From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#53749: 29.0.50; [PATCH] Xref backend for TeX buffers
Date: Fri, 03 May 2024 10:10:58 -0400
To: David Fussner
Cc: 53749@debbugs.gnu.org, Ikumi Keita, Tassilo Horn, Arash Esbati,
    stefankangas@gmail.com, Dmitry Gutov, Eli Zaretskii
In-Reply-To: (David Fussner's message of "Mon, 29 Apr 2024 15:15:41 +0100")

Hi,

Apparently I'm the `tex-mode.el` guy, so I tried to take a look.

> diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
> index 97c950267c6..d990a2dbfa9 100644
> --- a/lisp/textmodes/tex-mode.el
> +++ b/lisp/textmodes/tex-mode.el
> @@ -695,7 +696,25 @@ tex-verbatim-environments
>       ("\\\\\\(?:end\\|begin\\) *\\({[^\n{}]*}\\)"
>        (1 (ignore
>            (tex-env-mark (match-beginning 0)
> -                        (match-beginning 1) (match-end 1))))))))
> +                        (match-beginning 1) (match-end 1)))))
> +     ;; The next two rules change the syntax of `:' and `_' in expl3
> +     ;; constructs, so that `tex-font-lock-suscript' can fontify them
> +     ;; more accurately.
> +     ((concat "\\(\\(?:[\\\\[:space:]{]_\\|"
> +              "[\\\\{[:space:]][^][_[:space:][:cntrl:][:digit:]\\\\{}()/=]+\\)"
> +              "\\(?:_+\\(?:[^][[:space:][:cntrl:][:digit:]:\\\\{}()/#_=]+\\|"
> +              "#+[1-9]\\)\\)+\\)\\([:_]?\\)")

Can you add in the comment some URL pointing to some relevant expl3
documentation which "explains" why the above regexp makes sense?
Also I don't clearly see how the above regexp distinguishes expl3 code
from "normal" LaTeX code, so the comment should say something about it.

Side note: I'd avoid [:space:] whose exact meaning is rarely quite what
we need.
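
To make the [:space:] remark concrete: in Emacs regexps that class
matches characters with whitespace syntax in the current syntax table,
so its meaning can change from one mode to the next. A minimal sketch,
where the syntax table `st` is made up purely for the illustration
(it is not part of the patch):

```elisp
;; Illustration only: ST is a throwaway syntax table in which newline
;; is a comment ender, as it is in `emacs-lisp-mode'.
(let ((st (make-syntax-table)))
  (modify-syntax-entry ?\n ">" st)
  (with-temp-buffer
    (list
     ;; Standard syntax table: newline counts as whitespace here.
     (string-match-p "[[:space:]]" "\n")
     ;; Same regexp under ST: newline should no longer match.
     (with-syntax-table st
       (string-match-p "[[:space:]]" "\n"))
     ;; An explicit character set does not consult the syntax table.
     (with-syntax-table st
       (string-match-p "[ \t\n]" "\n")))))
```

Spelling out the intended characters, as in the last form, keeps the
regexp independent of whatever syntax table happens to be in effect.
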
Side note: backslash doesn't need to be backslashed in [...].

> +      (1 (ignore
> +          (let* ((expr (buffer-substring-no-properties (match-beginning 1)
> +                                                       (match-end 1)))
> +                 (list (seq-positions expr ?_)))
> +            (dolist (pos list)
> +              (put-text-property (+ pos (match-beginning 1))
> +                                 (1+ (+ pos (match-beginning 1)))
> +                                 'syntax-table (string-to-syntax "_"))))))
> +      (2 "_"))
> +     ("\\\\[[:alpha:]]+\\(:\\)[[:alpha:][:space:]\n]"
> +      (1 "_")))))

Currently we "skip" inappropriate underscores via
`tex-font-lock-match-suscript` and/or by adding a particular `face`
text property rather than via `syntax-table/propertize`.
For algorithmic reasons, it's better to minimize the work done in
`syntax-propertize-function` as much as possible (font-lock is lazier
than `syntax-propertize`), so I recommend you try moving the above to
font-lock rules.

> +(defvar tex-esc-and-group-chars '(?\\ ?{ ?})
> +  "The current TeX escape and grouping characters.

I recommend you backslash-escape the { and } above (although it's not
indispensable, `emacs-lisp-mode` will parse the code better).

More importantly, the docstring doesn't explain what this list
means/does. E.g. does the order matter? Can it be longer than
3 elements? From the current docstring I can't guess what would be the
consequence of adding/removing elements to/from this list.

> +;; Populate `semantic-symref-filepattern-alist' for the in-tree modes;
> +;; AUCTeX is doing the same for its modes.
> +(defvar semantic-symref-filepattern-alist)
> +(with-eval-after-load 'semantic/symref/grep
> +  (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
> +                     "*.bbl" "*.drv" "*.hva")
> +        semantic-symref-filepattern-alist)
> +  (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
> +        semantic-symref-filepattern-alist)
> +  (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))

We know `semantic-symref-filepattern-alist` will exist when
`semantic/symref/grep` is loaded, but not before, so I'd put the
`defvar` inside the `with-eval-after-load`.

> +;; Setup AUCTeX modes (for testing purposes only).
> +
> +(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
> +
> +(defun tex-set-auctex-xref-backend ()
> +  (add-hook 'xref-backend-functions #'tex--xref-backend nil t))

I assume this will be sent to AUCTeX and is not meant to be in
`tex-mode.el`, right?

> +;; `xref-find-references' currently may need this when called from a
> +;; latex-mode buffer in order to search files or buffers with a .tex
> +;; suffix (including the buffer from which it has been called).  We
> +;; append it to `auto-mode-alist' so as not to interfere with the usual
> +;; mode-setting apparatus.  Changes here and in AUCTeX should soon
> +;; render it unnecessary.
> +(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)

Maybe I have not followed the whole discussion closely enough, but at
least to me the above "soon" is very unclear. I'll assume that this
code will be removed before we install the patch. If not, please
explain in the comment why this specific hack is needed and how
it works.

> +(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
> +  "Find references of IDENTIFIER in TeX buffers and files."
> +  (require 'semantic/symref/grep)
> +  (let (bufs texbufs
> +        (mode major-mode))
> +    (dolist (buf (buffer-list))
> +      (if (eq (buffer-local-value 'major-mode buf) mode)
> +          (push buf bufs)
> +        (when (string-match-p ".*\\.[tT]e[xX]" (buffer-name buf))
> +          (push buf texbufs))))
> +    (unless (seq-set-equal-p tex--buffers-list bufs)
> +      (let* ((amalist (tex--collect-file-extensions))
> +             (extlist (alist-get mode semantic-symref-filepattern-alist))
> +             (extlist-new (seq-uniq
> +                           (seq-union amalist extlist #'string-match-p))))

After sinking the `defvar` above, you'll need to add a new `defvar` for
`semantic-symref-filepattern-alist` just after the `require`.

> +          (setq-local syntax-propertize-function
> +                      (eval
> +                       `(tex-xref-syntax-function
> +                         ,identifier ,beg ,end)))

Why do we need to change `syntax-propertize-function` and why do we
need `eval`?

> +          (setq syntax-propertize--done 0)

This is not sufficient. You want to `syntax-ppss-flush-cache`.


        Stefan
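
PS: on that last point, a minimal sketch of what the reset could look
like. The helper name `my-tex--reset-syntax-state` is hypothetical and
not part of the patch; it only assumes that the whole buffer needs to
be re-propertized after `syntax-propertize-function` has been swapped.

```elisp
;; Hypothetical helper, not part of the patch: after installing a new
;; `syntax-propertize-function', discard what was computed with the
;; old one instead of only resetting `syntax-propertize--done'.
(defun my-tex--reset-syntax-state ()
  "Flush cached syntax information for the accessible buffer."
  ;; Drops the `syntax-ppss' cache from this position onwards, so
  ;; later calls to `syntax-ppss' reparse from scratch.  As far as I
  ;; know, recent Emacs versions also move `syntax-propertize--done'
  ;; back to that position as part of the same call.
  (syntax-ppss-flush-cache (point-min)))
```

It would be called right after the `setq-local`, in place of the bare
`(setq syntax-propertize--done 0)`.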