From: Dmitry Gutov <dgutov@yandex.ru>
To: emacs-devel <emacs-devel@gnu.org>
Cc: "Philip K." <philipk@posteo.net>, Tom Tromey <tom@tromey.com>,
John Yates <john@yates-sheets.org>
Subject: Automatic (e)tags generation and incremental updates
Date: Mon, 14 Dec 2020 05:36:32 +0200
Message-ID: <779a6328-9ca5-202a-25a2-b270c66fe6dd@yandex.ru>
[-- Attachment #1: Type: text/plain, Size: 3149 bytes --]
Hi all!
I went back to an old thread from 2018 and updated the patch. With
'project-files' now faster than before and a few other tweaks, the
current state feels surprisingly usable for small to medium projects,
up to tags file sizes where code completion starts showing unpleasant
latency.
The code lives in the branch scratch/etags-regen and currently it's just
triggered by etags when you try using the etags xref backend without
having a tags table visited. I'm also attaching a patch against master
to this email.
This time I took the approach of implementing the meat of incremental
updates inside Emacs, instead of relying on an external tool. And it
works better when Emacs knows which exact things it needs to update, and
which it doesn't need to. E.g., it doesn't need to re-visit the whole
45MB tags file if it just needs a file re-indexed; it directs etags's
output to the buffer and then appends it to the file. Completion table
updates could be made faster this way too, although I think we'd need
some new data structure for them.
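To illustrate the "update only what changed" idea, here is a minimal
sketch of dropping one file's section from TAGS-format text, so that a
freshly generated section can be appended afterwards. The name
`my-etags-drop-section' is hypothetical and not part of the patch; the
sketch assumes the usual etags layout, where each section starts with a
form feed line followed by a "FILE,SIZE" header line.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper, not part of the attached patch.
(defun my-etags-drop-section (file)
  "Delete FILE's section from the TAGS-format text in the current buffer.
Return non-nil if a section was found and removed."
  (save-excursion
    (goto-char (point-min))
    ;; A section header looks like "\f\nFILE,SIZE\n".
    (when (re-search-forward
           (concat "^\f\n" (regexp-quote file) ",[0-9]*\n") nil t)
      (let ((start (match-beginning 0)))
        ;; The section runs up to the next form feed line,
        ;; or to the end of the buffer if it is the last one.
        (if (re-search-forward "^\f\n" nil t)
            (goto-char (match-beginning 0))
          (goto-char (point-max)))
        (delete-region start (point))
        t))))
```

The file name has to be spelled exactly as it appears in the tags file
(relative or absolute), since the match is literal.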
The main question remains how to update information for files that have
been deleted, or edited from outside Emacs (including by 'git
checkout'). Two main approaches that I'm thinking of:
- On a timer, re-create the list of project files together with their
mtimes (for instance, by piping through 'stat -c %Y') and compare with
the previous saved list. Given a big enough project, it will create
intermittent stalls in the UI, though, which could be unpleasant. But it
can be the first approach to be implemented anyway.
- filenotify. I have already been warned here that it's unreliable,
prone to overflowing due to excessive notifications or file watching
limits. The current API doesn't allow watching a directory recursively
either (which would be required to know about new files). There is a
project called Watchman, however (https://github.com/facebook/watchman)
which I have read good things about, and it must use some sort of file
notification API under the covers. Perhaps if someone here is familiar
with its architecture, they could advise how to build a better
abstraction on top of inotify in Emacs as well.
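The timer-based approach could be sketched roughly as follows. This is
only an illustration under a few assumptions: the function names are
hypothetical, and it reads mtimes in-process with `file-attributes'
instead of piping the file list through 'stat', which may well be slower
on big projects.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helpers, not part of the attached patch.
(defun my-tags-snapshot (files)
  "Return a hash table mapping each existing file in FILES to its mtime."
  (let ((table (make-hash-table :test #'equal)))
    (dolist (file files table)
      (let ((attrs (file-attributes file)))
        (when attrs
          (puthash file (file-attribute-modification-time attrs) table))))))

(defun my-tags-snapshot-diff (old new)
  "Compare the snapshots OLD and NEW.
Return a cons (CHANGED . DELETED) of file name lists.  CHANGED also
includes files that are present only in NEW."
  (let (changed deleted)
    (maphash (lambda (file mtime)
               (unless (equal mtime (gethash file old))
                 (push file changed)))
             new)
    (maphash (lambda (file _mtime)
               (unless (gethash file new)
                 (push file deleted)))
             old)
    (cons changed deleted)))
```

A timer (for example one set up with `run-with-timer') could take a new
snapshot of `project-files', diff it against the saved one, and then
re-index the changed files and drop the deleted ones.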
Of course, we can give the users a manual knob as well (it could come in
the form of enabling/disabling an associated minor mode), but first we
should try to make it work automatically, at least for projects of up to
a certain size.
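As a rough idea of what such a knob might look like, here is a sketch of
a global minor mode that only installs and removes a save hook. Both
`my-tags-regen-mode' and `my-tags-regen--after-save' are hypothetical
names; in the attached patch the hooks are installed by
`etags--maybe-use-project-tags' instead.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical placeholder: a real version would re-index the saved file.
(defun my-tags-regen--after-save ()
  "Placeholder for the per-file tags update."
  nil)

(define-minor-mode my-tags-regen-mode
  "Toggle automatic tags regeneration (illustrative sketch only)."
  :global t
  :lighter nil
  (if my-tags-regen-mode
      (add-hook 'after-save-hook #'my-tags-regen--after-save)
    (remove-hook 'after-save-hook #'my-tags-regen--after-save)))
```

Calling (my-tags-regen-mode 1) enables it and (my-tags-regen-mode -1)
disables it again.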
Another question I'd like to ask is where the maintainers want to see
this code: inside etags.el, in a new file near it (etags-regen.el,
perhaps?), or just in GNU ELPA? It can be a minor mode, or a value in
some user option like the one proposed in bug#43086.
Normally I would go the ELPA route straight away, but this kind of
feature (automatic code indexing) is what Emacs sorely needs OOtB, IMHO.
It loses here even to Sublime Text 3 released 7 years ago, which is not
a very "smart" editor. And "democratizing" etags this way should result
in better adoption, bug reports, feature requests, etc.
Please give it a try and comment.
(Cc'd some folks who went near previous discussions.)
[-- Attachment #2: etags-regen.diff --]
[-- Type: text/x-patch, Size: 5920 bytes --]
diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
index 104d889b8b..00723608da 100644
--- a/lisp/progmodes/etags.el
+++ b/lisp/progmodes/etags.el
@@ -2069,7 +2069,9 @@ etags-xref-find-definitions-tag-order
 file name, add `tag-partial-file-name-match-p' to the list value.")
 ;;;###autoload
-(defun etags--xref-backend () 'etags)
+(defun etags--xref-backend ()
+  (etags--maybe-use-project-tags)
+  'etags)
 (cl-defmethod xref-backend-identifier-at-point ((_backend (eql etags)))
   (find-tag--default))
@@ -2144,6 +2146,132 @@ xref-location-line
(nth 1 tag-info)))
\f
+;;; Simple tags generation, with automatic invalidation
+
+(defvar etags--project-tags-file nil)
+(defvar etags--project-tags-root nil)
+(defvar etags--project-new-file nil)
+
+(defvar etags--command (executable-find "etags")
+  ;; How do we get the correct etags here?
+  ;; E.g. "~/vc/emacs-master/lib-src/etags"
+  ;;
+  ;; ctags's etags requires '-L -' for stdin input.
+  ;; It also looks broken here (indexes only some of the input files).
+  ;;
+  ;; If our etags supported '-L', we could use any version of etags.
+  )
+
+(defun etags--maybe-use-project-tags ()
+  (let (proj)
+    (when (and etags--project-tags-root
+               (not (file-in-directory-p default-directory
+                                         etags--project-tags-root)))
+      (etags--project-tags-cleanup))
+    (when (and (not (or tags-file-name
+                        tags-table-list))
+               (setq proj (project-current)))
+      (message "Generating new tags table...")
+      (let ((start (time-to-seconds)))
+        (etags--project-tags-generate proj)
+        (message "...done (%.2f s)" (- (time-to-seconds) start)))
+      ;; Invalidate the scanned tags after any change is written to disk.
+      (add-hook 'after-save-hook #'etags--project-update-file)
+      (add-hook 'before-save-hook #'etags--project-mark-as-new)
+      (visit-tags-table etags--project-tags-file))))
+
+(defun etags--project-tags-generate (proj)
+  (let* ((root (project-root proj))
+         (default-directory root)
+         (files (project-files proj))
+         ;; FIXME: List all extensions, or wait for etags fix.
+         ;; http://lists.gnu.org/archive/html/emacs-devel/2018-01/msg00323.html
+         (extensions '("rb" "js" "py" "pl" "el" "c" "cpp" "cc" "h" "hh" "hpp"
+                       "java" "go" "cl" "lisp" "prolog" "php" "erl" "hrl"
+                       "F" "f" "f90" "for" "cs" "a" "asm" "ads" "adb" "ada"))
+         (file-regexp (format "\\.%s\\'" (regexp-opt extensions t))))
+    (setq etags--project-tags-file (make-temp-file "emacs-project-tags-")
+          etags--project-tags-root root)
+    (with-temp-buffer
+      (mapc (lambda (f)
+              (when (string-match-p file-regexp f)
+                (insert f "\n")))
+            files)
+      (shell-command-on-region
+       (point-min) (point-max)
+       (format "%s - -o %s" etags--command etags--project-tags-file)
+       nil nil "*etags-project-tags-errors*" t))))
+
+(defun etags--project-update-file ()
+  ;; TODO: Maybe only do this when Emacs is idle for a bit.
+  (let ((file-name buffer-file-name)
+        (tags-file-buf (get-file-buffer etags--project-tags-file))
+        pr should-scan)
+    (save-excursion
+      (when tags-file-buf
+        (cond
+         ((and etags--project-new-file
+               (kill-local-variable 'etags--project-new-file)
+               (setq pr (project-current))
+               (equal (project-root pr) etags--project-tags-root)
+               (member file-name (project-files pr)))
+          (set-buffer tags-file-buf)
+          (setq should-scan t))
+         ((progn (set-buffer tags-file-buf)
+                 (goto-char (point-min))
+                 (re-search-forward (format "^%s," (regexp-quote file-name)) nil t))
+          (let ((start (line-beginning-position)))
+            (re-search-forward "\f\n" nil 'move)
+            (let ((inhibit-read-only t)
+                  (save-silently t))
+              (delete-region (- start 2)
+                             (if (eobp)
+                                 (point)
+                               (- (point) 2)))
+              (write-region (point-min) (point-max) buffer-file-name nil 'silent)
+              (set-visited-file-modtime)))
+          (setq should-scan t))))
+      (when should-scan
+        (goto-char (point-max))
+        (let ((inhibit-read-only t)
+              (current-end (point)))
+          (call-process
+           etags--command
+           nil
+           '(t "*etags-project-tags-errors*")
+           nil
+           file-name
+           "--append"
+           "-o"
+           "-")
+          ;; XXX: When the project is big (tags file in 10s of megabytes),
+          ;; this is much faster than revert-buffer. Or even using
+          ;; write-region without APPEND.
+          ;; We could also keep TAGS strictly as a buffer, with no
+          ;; backing on disk.
+          (write-region current-end (point-max) etags--project-tags-file t))
+        (set-visited-file-modtime)
+        (set-buffer-modified-p nil)
+        ;; FIXME: Is there a better way to do this?
+        ;; Completion table is the only remaining place where the
+        ;; update is not incremental.
+        (setq-default tags-completion-table nil)
+        ))))
+
+(defun etags--project-mark-as-new ()
+  (unless buffer-file-number
+    (setq-local etags--project-new-file t)))
+
+(defun etags--project-tags-cleanup ()
+  (when etags--project-tags-file
+    (delete-file etags--project-tags-file)
+    (setq tags-file-name nil
+          tags-table-list nil
+          etags--project-tags-file nil
+          etags--project-tags-root nil))
+  (remove-hook 'after-save-hook #'etags--project-update-file)
+  (remove-hook 'before-save-hook #'etags--project-mark-as-new))
+
(provide 'etags)
;;; etags.el ends here