* treesitter local parser: huge slowdown and memory usage in a long file
@ 2024-02-11 21:53 Vincenzo Pupillo
2024-02-12 4:16 ` Yuan Fu
From: Vincenzo Pupillo @ 2024-02-11 21:53 UTC
To: emacs-devel
Hi,
as a benchmark for my php-ts-mode (in two variants: one using
tree-sitter-phpdoc for PHP comment blocks, the other using regular expressions
for comment blocks), I use tcpdf.php from the tcpdf library. This PHP file has
24730 lines and generates 669 parser ranges, 665 of which are for phpdoc.
As you can see from the attached profiles (recorded while I tried to edit the
comment on line 2350), the problem is in treesit--pre-redisplay.
I tried playing around with the code a bit, but to no avail (for example, I
limited treesit-update-ranges to the region between window-start and
window-end).
The comments in the code say:
;; Force repase on _all_ the parsers might not be necessary, but
;; this is probably the most robust way.
Any ideas?
My php-ts-mode (it's a work in progress) is available here:
https://github.com/vpxyz/php-ts-mode
Thanks
V.
P.S. Without phpdoc, Emacs is as fast as it is with short PHP files.
P.P.S. nvim with treesitter is as slow as my major mode with this file.
GNU Emacs 30.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.41, cairo
version 1.18.0) of 2024-02-11
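
To put the attached CPU profile in numbers, here is a minimal sketch that sums the samples recorded under treesit--pre-redisplay. The counts are copied from cpu_profiler_report; the stacks are abbreviated to their leading frames, and the names CPU_SAMPLES and samples_under are only illustrative, not part of Emacs or of the profiler.

```python
# Sample counts copied from the attached cpu_profiler_report.
# Each key is a call stack written leaf-first, as in the profiler dump,
# abbreviated to its first few frames. This is a subset of the report,
# not the whole profile.
CPU_SAMPLES = {
    ("treesit--pre-redisplay",): 2744,
    ("treesit-query-range", "treesit--update-ranges-local",
     "treesit-update-ranges", "treesit--pre-redisplay"): 1759,
    ("treesit-query-range", "treesit-update-ranges",
     "treesit--pre-redisplay"): 1636,
    ("treesit-update-ranges", "treesit--pre-redisplay"): 23,
    ("treesit--update-ranges-local", "treesit-update-ranges",
     "treesit--pre-redisplay"): 14,
    # A few stacks outside treesit--pre-redisplay, for comparison.
    ("treesit-buffer-root-node", "treesit-node-at", "treesit--indent-1"): 102,
    ("treesit-parser-root-node", "treesit-font-lock-fontify-region"): 52,
    ("Automatic GC",): 165,
}


def samples_under(frames, samples):
    """Sum the counts of every stack that contains all of the given frames."""
    return sum(count for stack, count in samples.items()
               if all(frame in stack for frame in frames))


# All samples taken while treesit--pre-redisplay was on the stack.
pre_redisplay_total = samples_under(["treesit--pre-redisplay"], CPU_SAMPLES)

# The part of those samples spent in treesit-query-range.
query_range_in_pre_redisplay = samples_under(
    ["treesit-query-range", "treesit--pre-redisplay"], CPU_SAMPLES)

print("treesit--pre-redisplay:", pre_redisplay_total)
print("  of which treesit-query-range:", query_range_in_pre_redisplay)
```

With these figures, 6176 samples fall under treesit--pre-redisplay, 3395 of them inside treesit-query-range, against 102 and 52 for the indentation and font-lock stacks listed for comparison.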
[-- Attachment #2: cpu_profiler_report --]
[-- Type: text/plain; charset="x-UTF_8J"; name="cpu_profiler_report", Size: 12025 bytes --]
[profiler-profile "28.1" cpu #s(hash-table test equal data ([redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil] 124 [nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil] 945 [line-move-visual line-move next-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 6 [line-move-visual line-move previous-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 4 [treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil] 2744 [jit-lock--antiblink-post-command nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil] 1 [syntax-class show-paren--categorize-paren show-paren--locate-near-paren show-paren--default show-paren-function apply timer-event-handler nil nil nil nil nil nil nil nil nil] 1 [delete-dups xselect-convert-to-targets pgtk-own-selection-internal "#<compiled -0x1728ef4c95247357>" apply gui-backend-set-selection gui-set-selection gui-select-text kill-new kill-region kill-line funcall-interactively command-execute nil nil nil] 3 [treesit-query-range treesit--update-ranges-local treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil] 1759 [treesit-query-range treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil] 1636 ["#<compiled 0x1c852541b14fae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil] 1 [cl-delete cl-remove cl-remove-if-not treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852541b14fae74>" 
jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil] 3 [treesit-node-parent let* php-ts-mode--language-at-point treesit-language-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil] 1 [treesit-buffer-root-node treesit-node-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil] 102 [treesit-query-range treesit-update-ranges treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852725f4eeae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil] 2 [treesit--update-ranges-local treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil] 14 [cl-remove cl-remove-if-not treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852724ac226e74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil] 3 [treesit-parser-root-node treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c85273d2ea0ee74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil] 52 [treesit--font-lock-fontify-region-1 treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c85273eefe1ae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ 
function\) nil nil nil nil nil nil] 13 [self-insert-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil nil] 3 [parse-partial-sexp syntax-ppss jit-lock--antiblink-post-command nil nil nil nil nil nil nil nil nil nil nil nil nil] 2 [treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil] 23 [treesit--update-ranges-local treesit-update-ranges treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c85273724beee74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil] 2 [comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil] 1 [jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil] 3 [treesit-query-range treesit--update-ranges-local treesit-update-ranges treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c8527451184aa74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil] 1 [treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852743d740ea74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil] 2 [make-closure syntax-ppss show-paren--default show-paren-function apply timer-event-handler nil nil nil nil nil nil nil nil nil nil] 1 ["#<compiled 0x16385ee6d140fc75>" newline comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil 
nil nil nil nil nil nil nil nil nil] 1 ["#<compiled 0x2736abab5e0d1>" font-lock-unfontify-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c853ead11f7aa74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil] 1 ["#<compiled 0x1daadc6ebc11a146>" cl-remove cl-remove-if-not treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c853ead11f7aa74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil] 1 [internal-echo-keystrokes-prefix nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil] 1 [facep treesit--font-lock-fontify-region-1 treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c8538e0d90bb674>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil] 1 [line-move next-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil] 1 [frame-parameter if eval redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil] 1 [mode-line-frame-control eval redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil nil] 1 [line-move previous-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil] 2 [redisplay_internal\ \(C\ function\) sit-for icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil] 24 [sit-for icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil] 30 ["#<compiled 0x17c4b22424e34a24>" all-completions 
complete-with-action "#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_56>" completion-pcm--all-completions completion-substring--all-completions completion-flex-all-completions "#<compiled -0x1bd7cce58ccf55c9>" "#<compiled -0x4953631e8c8c907>" mapc seq-do seq-some completion--nth-completion completion-all-completions completion-all-sorted-completions icomplete--sorted-completions] 7 [interactive-form commandp "#<compiled -0x8f410b7c4bcca45>" "#<compiled 0x17c4b22424e34a24>" all-completions complete-with-action "#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_56>" completion-pcm--all-completions completion-substring--all-completions completion-flex-all-completions "#<compiled -0x1bd7cce58ccf55c9>" "#<compiled -0x4953631e8c8c907>" mapc seq-do seq-some completion--nth-completion] 1 [all-completions complete-with-action "#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_56>" completion-pcm--all-completions completion-substring--all-completions completion-flex-all-completions "#<compiled -0x1bd7cce58ccf55c9>" "#<compiled -0x4953631e8c8c907>" mapc seq-do seq-some completion--nth-completion completion-all-completions completion-all-sorted-completions icomplete--sorted-completions icomplete-completions] 68 [completion-pcm--all-completions completion-substring--all-completions completion-flex-all-completions "#<compiled -0x1bd7cce58ccf55c9>" "#<compiled -0x4953631e8c8c907>" mapc seq-do seq-some completion--nth-completion completion-all-completions completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default] 2 [delete-dups completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil] 1 ["#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_45>" 
minibuffer--sort-by-length-alpha completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil] 2 ["#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_61>" read-extended-command--affixation icomplete--augment icomplete--render-vertical icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil] 1 [window--min-size-1 window--min-size-1 window-min-size window-sizable window--resize-root-window-vertically redisplay_internal\ \(C\ function\) completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil] 1 [redisplay_internal\ \(C\ function\) completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil nil nil] 22 [completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil nil nil nil] 67 [completion--flex-score "#<compiled -0x1dd7ba1b6c096a85>" mapcar "#<compiled -0x820789829508d37>" completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil] 1 [try-completion complete-with-action "#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_56>" completion--complete-and-exit minibuffer-force-complete-and-exit icomplete-force-complete-and-exit icomplete-fido-ret funcall-interactively command-execute completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil] 10 [execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil 
nil nil nil] 1 [Automatic\ GC nil] 165)) (26057 13278 384506 497000) nil]
[-- Attachment #3: mem_profiler_report --]
[-- Type: text/plain; charset="x-UTF_8J"; name="mem_profiler_report", Size: 21749 bytes --]
[profiler-profile "28.1" memory #s(hash-table test equal data ([profiler-start funcall-interactively command-execute execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil] 631 [timer--time-setter timer-set-time run-at-time execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil] 24 [timer--time-less-p timer--activate timer-activate run-at-time execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 24 [timer--time-setter timer-set-idle-time run-with-idle-timer jit-lock--antiblink-post-command nil nil nil nil nil nil nil nil nil nil nil nil] 24 [nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil] 32744 [menu-bar-update-buffers-1 menu-bar-update-buffers redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil nil] 2016 [redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil] 316718 [string-match kill-this-buffer-enabled-p redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil nil] 1024 [timer--time-setter timer-set-time run-at-time run-with-timer blink-cursor--start-timer blink-cursor-start apply timer-event-handler nil nil nil nil nil nil nil nil] 216 [timer--time-less-p timer--activate timer-activate run-at-time run-with-timer blink-cursor--start-timer blink-cursor-start apply timer-event-handler nil nil nil nil nil nil nil] 192 [timer-relative-time timer-inc-time timer-event-handler nil nil nil nil nil nil nil nil nil nil nil nil nil] 672 [timer--time-setter timer-inc-time timer-event-handler nil nil nil nil nil nil nil nil nil nil nil nil nil] 288 [time-less-p timer-event-handler nil nil nil nil nil nil nil nil nil nil nil nil nil nil] 288 [timer--time-less-p timer--activate timer-activate timer-event-handler nil nil nil nil nil nil nil nil nil nil nil nil] 264 [line-move-visual line-move next-line 
funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 4748 [default-font-height default-line-height line-move-partial line-move next-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 8184 [delete-and-extract-region "#<compiled 0x1bdcb8d5ab3d6985>" apply "#<compiled -0x1e3e289fe5154769>" buffer-substring--filter filter-buffer-substring kill-region kill-line funcall-interactively command-execute nil nil nil nil nil nil] 24840 [menu-bar-update-yank-menu kill-new kill-region kill-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil] 1152 [pgtk-own-selection-internal "#<compiled -0x1728ef4c95247357>" apply gui-backend-set-selection gui-set-selection gui-select-text kill-new kill-region kill-line funcall-interactively command-execute nil nil nil nil nil] 368 [treesit--update-ranges-local treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil] 2464048 [treesit-query-range treesit--update-ranges-local treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil] 6513544 [treesit-query-range treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil] 23008 [treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil] 571762456 [jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil nil nil nil nil] 8288 [treesit--update-ranges-local treesit-update-ranges treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region 
font-lock-fontify-region "#<compiled 0x1c852541b14fae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil] 5200 [treesit-local-parsers-on treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852541b14fae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil] 2320 [treesit-parser-root-node treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852541b14fae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil] 1679304 [treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852541b14fae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil] 41440 [treesit--font-lock-fontify-region-1 treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c852541b14fae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil nil nil] 51984 [treesit--update-ranges-local treesit-update-ranges treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 160 [treesit-local-parsers-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 80 [treesit-local-parsers-at treesit-node-at let* php-ts-mode--language-at-point treesit-language-at treesit--indent-1 treesit-indent indent--funcall-widened 
indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil] 160 [treesit-node-at let* php-ts-mode--language-at-point treesit-language-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil] 3072 [treesit-node-parent let* php-ts-mode--language-at-point treesit-language-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil] 3584 [format let* php-ts-mode--language-at-point treesit-language-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil] 338 [looking-at looking-at-p and cond save-excursion let* php-ts-mode--language-at-point treesit-language-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil] 1024 [treesit-local-parsers-at treesit-node-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil] 80 [treesit-buffer-root-node treesit-node-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil] 23857603 [treesit-node-at treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 3584 [treesit-parent-while treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 256 [treesit-local-parsers-at treesit-node-at let* php-ts-mode--language-at-point treesit-language-at treesit-node-on treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil] 80 
[treesit-node-at let* php-ts-mode--language-at-point treesit-language-at treesit-node-on treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil] 1536 [treesit-node-parent let* php-ts-mode--language-at-point treesit-language-at treesit-node-on treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil] 1792 [format let* php-ts-mode--language-at-point treesit-language-at treesit-node-on treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil] 169 [treesit-local-parsers-on treesit-node-on treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil] 80 [make-closure "#<compiled 0xfaa45bf0cbe519b>" treesit--simple-indent-eval treesit--simple-indent-eval treesit-simple-indent treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil] 4144 [string-match "#<compiled -0x12329b36eb2983c1>" "#<compiled -0x185b5bdede7ccb0a>" treesit--simple-indent-eval treesit-simple-indent treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil] 1024 [treesit-simple-indent treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 1336 [looking-at "#<compiled 0x113256ad37d3f5de>" treesit--simple-indent-eval treesit-simple-indent treesit--indent-1 treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil] 2112 [string-match "#<compiled 0x113256ad37d3f5de>" treesit--simple-indent-eval treesit-simple-indent treesit--indent-1 treesit-indent 
indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil] 1024 [indent-line-to treesit-indent indent--funcall-widened indent-for-tab-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil] 25168 [self-insert-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil nil] 732592 [make-closure syntax-ppss jit-lock--antiblink-post-command nil nil nil nil nil nil nil nil nil nil nil nil nil] 4144 [generate-new-buffer comment-normalize-vars comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil] 21 [comment-normalize-vars comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 2176 [timer--time-setter timer-set-time run-at-time undo-auto--boundary-ensure-timer undo-auto--undoable-change newline comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil] 24 [newline comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 50656 [looking-at "#<compiled 0x16385ee5b7c4fc75>" newline comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil] 1152 [comment-string-strip comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 2304 [comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil] 101648 [comment-enter-backward comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 1024 [comment-normalize-vars comment-indent comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil 
nil nil nil nil nil nil nil nil] 1024 [comment-indent comment-indent-new-line default-indent-new-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 101824 [make-closure syntax-ppss show-paren--default show-paren-function apply timer-event-handler nil nil nil nil nil nil nil nil nil nil] 8288 [string-prefix-p treesit-query-range treesit--update-ranges-local treesit-update-ranges treesit--pre-redisplay run-hook-with-args redisplay--pre-redisplay-functions redisplay_internal\ \(C\ function\) nil nil nil nil nil nil nil nil] 7187 [delete-backward-char funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil nil] 25144 [treesit-query-range treesit--update-ranges-local treesit-update-ranges treesit-font-lock-fontify-region font-lock-fontify-syntactically-region font-lock-default-fontify-region font-lock-fontify-region "#<compiled 0x1c85277a0b8dae74>" jit-lock--run-functions jit-lock-fontify-now jit-lock-function redisplay_internal\ \(C\ function\) nil nil nil nil] 4144 [line-move-visual line-move previous-line funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 4748 [completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil nil nil nil] 37647 [timer--time-setter timer-set-time run-at-time undo-auto--boundary-ensure-timer undo-auto--undoable-change completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil] 24 [minibuffer--regexp-setup completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil nil nil] 1024 [string-match pgtk-device-class device-class minibuffer-setup-on-screen-keyboard completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil] 1024 [sit-for icomplete-exhibit icomplete-post-command-hook 
completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil] 816 [menu-bar-update-buffers-1 menu-bar-update-buffers redisplay_internal\ \(C\ function\) sit-for icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil] 1008 [redisplay_internal\ \(C\ function\) sit-for icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil] 168822 [mode-line-default-help-echo redisplay_internal\ \(C\ function\) sit-for icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil] 8184 [completion-pcm--pattern->regex completion-pcm--all-completions completion-substring--all-completions completion-flex-all-completions "#<compiled -0x1bd7cce58ccf55c9>" "#<compiled -0x4953631e8c8c907>" mapc seq-do seq-some completion--nth-completion completion-all-completions completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook] 1152 [all-completions complete-with-action "#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_56>" completion-pcm--all-completions completion-substring--all-completions completion-flex-all-completions "#<compiled -0x1bd7cce58ccf55c9>" "#<compiled -0x4953631e8c8c907>" mapc seq-do seq-some completion--nth-completion completion-all-completions completion-all-sorted-completions icomplete--sorted-completions icomplete-completions] 4352 [version-to-list "#<compiled 0x17c4b22424e34a24>" all-completions complete-with-action "#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_56>" completion-pcm--all-completions completion-substring--all-completions completion-flex-all-completions "#<compiled 
-0x1bd7cce58ccf55c9>" "#<compiled -0x4953631e8c8c907>" mapc seq-do seq-some completion--nth-completion completion-all-completions completion-all-sorted-completions] 3328 [delete-dups completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil] 113424 [minibuffer--sort-by-length-alpha completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil] 53296 [minibuffer--sort-by-position completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil] 64 [minibuffer--sort-by-key minibuffer--sort-by-position completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil] 85344 ["#<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_61>" read-extended-command--affixation icomplete--augment icomplete--render-vertical icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil] 162492 [move-overlay icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil] 24 [format icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil 
nil nil nil nil nil nil] 6378 [redisplay_internal\ \(C\ function\) completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil nil nil] 399513 [funcall-interactively command-execute completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil nil] 80 [self-insert-command funcall-interactively command-execute completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil] 288 [completion--flex-score "#<compiled -0x1dd7ba1b6c096a85>" mapcar "#<compiled -0x820789829508d37>" completion-all-sorted-completions icomplete--sorted-completions icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil] 2304 [icomplete--render-vertical icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil] 117856 [completion--hilit-from-re "#<compiled -0x2358f1b2a8a21e7>" completion-lazy-hilit icomplete--render-vertical icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil] 8128 [frame-height max-mini-window-lines icomplete--render-vertical icomplete-completions icomplete-exhibit icomplete-post-command-hook completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil] 8184 [timer--time-setter timer-set-time run-at-time run-with-timer blink-cursor--start-timer blink-cursor-start apply timer-event-handler completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil] 24 [timer--time-less-p timer--activate timer-activate 
run-at-time run-with-timer blink-cursor--start-timer blink-cursor-start apply timer-event-handler completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil] 24 [timer-relative-time timer-inc-time timer-event-handler completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil] 56 [timer--time-setter timer-inc-time timer-event-handler completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil] 24 [time-less-p timer-event-handler completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil nil nil] 24 [timer--time-less-p timer--activate timer-activate timer-event-handler completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil nil nil nil] 24 [completion--replace minibuffer-force-complete minibuffer-force-complete-and-exit icomplete-force-complete-and-exit icomplete-fido-ret funcall-interactively command-execute completing-read-default read-extended-command-1 read-extended-command byte-code command-execute nil nil nil nil] 72 [execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil nil] 59696 [command-execute execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil nil] 49783 [funcall-interactively command-execute execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil nil] 16 [profiler-stop funcall-interactively command-execute execute-extended-command funcall-interactively command-execute nil nil nil nil nil nil nil nil nil nil] 1563900 [Automatic\ GC nil] 17704)) (26057 13278 386045 775000) nil]
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-11 21:53 treesitter local parser: huge slowdown and memory usage in a long file Vincenzo Pupillo
@ 2024-02-12 4:16 ` Yuan Fu
2024-02-12 14:09 ` Eli Zaretskii
2024-02-13 0:50 ` Dmitry Gutov
0 siblings, 2 replies; 31+ messages in thread
From: Yuan Fu @ 2024-02-12 4:16 UTC (permalink / raw)
To: Vincenzo Pupillo; +Cc: Ergus via Emacs development discussions., Eli Zaretskii
[-- Attachment #1: Type: text/plain, Size: 1228 bytes --]
> On Feb 11, 2024, at 1:53 PM, Vincenzo Pupillo <v.pupillo@gmail.com> wrote:
>
> Hi,
> as a benchmark for my php-ts-mode (in two variants: one using tree-sitter-phpdoc
> for PHP comment blocks, and another using regular expressions for comment blocks)
> I use tcpdf.php (from the tcpdf library). This PHP file has 24730 lines and
> generates 669 parser ranges, 665 of which are for phpdoc.
>
> As you can see from the attached profile (taken while I tried to edit the
> comment on line 2350), the problem is in treesit--pre-redisplay.
> I tried playing around with the code a bit, but to no avail (for example, I
> limited treesit-update-ranges to window-start and window-end).
> The comments say:
> ;; Force repase on _all_ the parsers might not be necessary, but
> ;; this is probably the most robust way.
>
> Any ideas?
> My php-ts-mode (it's a work in progress) is available here:
> https://github.com/vpxyz/php-ts-mode
>
> Thanks
> V.
>
> P.S. Without phpdoc, Emacs is as fast as it is with short PHP files.
> P.P.S. nvim with treesitter is as slow as my major mode with this file.
>
> GNU Emacs 30.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.41, cairo
> version 1.18.0) of 2024-02-11
[-- Attachment #2: cpu_profiler_report.txt --]
[-- Type: text/plain, Size: 12025 bytes --]
[-- Attachment #3: mem_profiler_report.txt --]
[-- Type: text/plain, Size: 21749 bytes --]
[-- Attachment #4: Type: text/plain, Size: 556 bytes --]
Thanks, the culprit is the call to treesit-update-ranges in treesit--pre-redisplay, where we don’t pass it any specific range, so it updates the range for the whole buffer. Eli, is there any way to get a rough estimate of the range that redisplay is refreshing? Do you think something like this would work?
    (treesit-update-ranges
     (max (point-min) (- (window-start) 1000))                              ; BEG
     (min (point-max) (+ (or (window-end) (+ (window-start) 4000)) 1000)))  ; END
I guess the window-start would be outdated in pre-redisplay-function...
Yuan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-12 4:16 ` Yuan Fu
@ 2024-02-12 14:09 ` Eli Zaretskii
2024-02-13 8:15 ` Yuan Fu
2024-02-13 0:50 ` Dmitry Gutov
1 sibling, 1 reply; 31+ messages in thread
From: Eli Zaretskii @ 2024-02-12 14:09 UTC (permalink / raw)
To: Yuan Fu; +Cc: v.pupillo, emacs-devel
> From: Yuan Fu <casouri@gmail.com>
> Date: Sun, 11 Feb 2024 20:16:11 -0800
> Cc: "Ergus via Emacs development discussions." <emacs-devel@gnu.org>,
> Eli Zaretskii <eliz@gnu.org>
>
> Thanks, the culprit is the call to treesit-update-ranges in treesit--pre-redisplay, where we don’t pass it any specific range, so it updates the range for the whole buffer. Eli, is there any way to get a rough estimate of the range that redisplay is refreshing? Do you think something like this would work?
>
> (treesit-update-ranges
> (max (point-min) (- (window-start) 1000)) ; BEG
> (min (point-max) (+ (or (window-end) (+ (window-start) 4000)) 1000))) ; END
>
> I guess the window-start would be outdated in pre-redisplay-function...
The problem is that window-start is not guaranteed to be up-to-date
when pre-redisplay-function is called: the window-start is updated by
redisplay, and pre-redisplay-function is called before the update.
Moreover, pre-redisplay-function could be called either once or twice
in a redisplay cycle, and window-start is up-to-date only for the
second call.
The window-end point is basically never up-to-date during redisplay,
only at its very end.
So my suggestion would be to define the range from position of point,
using the window dimensions; see get_narrowed_width for ideas. This
could lose if the buffer has a lot of invisible text, so I suggest to
check for invisible properties, and if they are present in the buffer,
punt and use the whole accessible portion of the buffer (I don't
expect PHP buffers, or any buffers in programming-language modes, to
have invisible text).
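
A minimal sketch of the suggestion above, assuming that one screenful holds at most (lines × columns) characters and that taking that many characters on each side of point is a safe over-estimate. The helper name and its arguments are hypothetical; nothing like it is claimed to exist in treesit.el.

```elisp
;; Hypothetical helper, not part of treesit.el.
(defun my-range-around-point (pt lines cols beg-limit end-limit)
  "Return (BEG . END) covering about one window of text on each side of PT.
LINES and COLS are the window body dimensions in characters.
BEG-LIMIT and END-LIMIT clamp the result (normally `point-min'
and `point-max')."
  (let ((span (* lines cols)))
    (cons (max beg-limit (- pt span))
          (min end-limit (+ pt span)))))

;; Possible use from a pre-redisplay hook (an assumption, untested):
;;   (let ((r (my-range-around-point
;;             (window-point) (window-body-height) (window-body-width)
;;             (point-min) (point-max))))
;;     (treesit-update-ranges (car r) (cdr r)))
```

The estimate deliberately ignores invisible text, which is exactly the case where it would come up short.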
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-12 14:09 ` Eli Zaretskii
@ 2024-02-13 8:15 ` Yuan Fu
2024-02-13 9:39 ` Vincenzo Pupillo
2024-02-13 12:59 ` Eli Zaretskii
0 siblings, 2 replies; 31+ messages in thread
From: Yuan Fu @ 2024-02-13 8:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Vincenzo Pupillo, emacs-devel
> On Feb 12, 2024, at 6:09 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Sun, 11 Feb 2024 20:16:11 -0800
>> Cc: "Ergus via Emacs development discussions." <emacs-devel@gnu.org>,
>> Eli Zaretskii <eliz@gnu.org>
>>
>> Thanks, the culprit is the call to treesit-update-ranges in treesit--pre-redisplay, where we don’t pass it any specific range, so it updates the range for the whole buffer. Eli, is there any way to get a rough estimate of the range that redisplay is refreshing? Do you think something like this would work?
>>
>> (treesit-update-ranges
>> (max (point-min) (- (window-start) 1000)) ; BEG
>> (min (point-max) (+ (or (window-end) (+ (window-start) 4000)) 1000))) ; END
>>
>> I guess the window-start would be outdated in pre-redisplay-function...
>
> The problem is that window-start is not guaranteed to be up-to-date
> when pre-redisplay-function is called: the window-start is updated by
> redisplay, and pre-redisplay-function is called before the update.
> Moreover, pre-redisplay-function could be called either once or twice
> in a redisplay cycle, and window-start is up-to-date only for the
> second call.
>
> The window-end point is basically never up-to-date during redisplay,
> only at its very end.
>
> So my suggestion would be to define the range from position of point,
> using the window dimensions; see get_narrowed_width for ideas. This
> could lose if the buffer has a lot of invisible text, so I suggest to
> check for invisible properties, and if they are present in the buffer,
> punt and use the whole accessible portion of the buffer (I don't
> expect PHP buffers, or any buffers in programming-language modes, to
> have invisible text).
Ah, clever :-) Programming language buffers could have invisible text when the user uses hideshow, or has folded some section of code using outline-minor-mode :-(
But as I said in the reply to Dmitry, we might need a better design for updating parser ranges than the current one. For now I’ll just fix V’s problem by updating the range around point and ignoring invisible text.
Yuan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-13 8:15 ` Yuan Fu
@ 2024-02-13 9:39 ` Vincenzo Pupillo
2024-02-13 12:59 ` Eli Zaretskii
1 sibling, 0 replies; 31+ messages in thread
From: Vincenzo Pupillo @ 2024-02-13 9:39 UTC (permalink / raw)
To: Eli Zaretskii, Yuan Fu; +Cc: emacs-devel
I don't know if this is a stupid idea or not, but I'll try to explain it.
My other php-ts-mode (the one without a tree-sitter parser for PHP comment blocks) works like this:
there is a "treesit-font-lock-rules" rule that captures a comment node, and
this rule calls a function that tries to figure out whether the node is a PHP comment block. If it is, the function uses
regular expressions for the font-locking; otherwise it uses treesit-fontify-with-override for the entire comment.
Treesit "knows" the intervals in the file where the embedded parser should be injected.
Can this information be used for local embedded parsers?
V.
On Tuesday, 13 February 2024 at 09:15:49 CET, Yuan Fu wrote:
>
> > On Feb 12, 2024, at 6:09 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> >
> >> From: Yuan Fu <casouri@gmail.com>
> >> Date: Sun, 11 Feb 2024 20:16:11 -0800
> >> Cc: "Ergus via Emacs development discussions." <emacs-devel@gnu.org>,
> >> Eli Zaretskii <eliz@gnu.org>
> >>
> >> Thanks, the culprit is the call to treesit-update-ranges in treesit--pre-redisplay, where we don’t pass it any specific range, so it updates the range for the whole buffer. Eli, is there any way to get a rough estimate of the range that redisplay is refreshing? Do you think something like this would work?
> >>
> >> (treesit-update-ranges
> >> (max (point-min) (- (window-start) 1000)) ; BEG
> >> (min (point-max) (+ (or (window-end) (+ (window-start) 4000)) 1000))) ; END
> >>
> >> I guess the window-start would be outdated in pre-redisplay-function...
> >
> > The problem is that window-start is not guaranteed to be up-to-date
> > when pre-redisplay-function is called: the window-start is updated by
> > redisplay, and pre-redisplay-function is called before the update.
> > Moreover, pre-redisplay-function could be called either once or twice
> > in a redisplay cycle, and window-start is up-to-date only for the
> > second call.
> >
> > The window-end point is basically never up-to-date during redisplay,
> > only at its very end.
> >
> > So my suggestion would be to define the range from position of point,
> > using the window dimensions; see get_narrowed_width for ideas. This
> > could lose if the buffer has a lot of invisible text, so I suggest to
> > check for invisible properties, and if they are present in the buffer,
> > punt and use the whole accessible portion of the buffer (I don't
> > expect PHP buffers, or any buffers in programming-language modes, to
> > have invisible text).
>
> Ah, clever :-) Programming language buffers could have invisible text when the user uses hideshow, or has folded some section of code using outline-minor-mode :-(
>
> But as I said in the reply to Dmitry, we might need a better design for updating parser ranges than the current one. For now I’ll just fix V’s problem by updating the range around point and ignoring invisible text.
>
> Yuan
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-13 8:15 ` Yuan Fu
2024-02-13 9:39 ` Vincenzo Pupillo
@ 2024-02-13 12:59 ` Eli Zaretskii
1 sibling, 0 replies; 31+ messages in thread
From: Eli Zaretskii @ 2024-02-13 12:59 UTC (permalink / raw)
To: Yuan Fu; +Cc: v.pupillo, emacs-devel
> From: Yuan Fu <casouri@gmail.com>
> Date: Tue, 13 Feb 2024 00:15:49 -0800
> Cc: Vincenzo Pupillo <v.pupillo@gmail.com>,
> emacs-devel@gnu.org
>
> > So my suggestion would be to define the range from position of point,
> > using the window dimensions; see get_narrowed_width for ideas. This
> > could lose if the buffer has a lot of invisible text, so I suggest to
> > check for invisible properties, and if they are present in the buffer,
> > punt and use the whole accessible portion of the buffer (I don't
> > expect PHP buffers, or any buffers in programming-language modes, to
> > have invisible text).
>
> Ah, clever :-) Programming language buffers could have invisible text when the user uses hideshow, or has folded some section of code using outline-minor-mode :-(
If having invisible text in such buffers is frequent enough, you could
look for the first position before the region beginning that is
visible, and the first position after the region end that is visible,
and use that as the range.
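
A rough sketch of that widening step, assuming invisibility comes from the `invisible' text or overlay property and is judged by `invisible-p'. The two helper names are made up for illustration.

```elisp
;; Hypothetical helpers; the names are made up for illustration.
(defun my-visible-beg (pos)
  "Return POS, moved backward over any invisible text just before it."
  (while (and (> pos (point-min)) (invisible-p (1- pos)))
    (setq pos (previous-single-char-property-change pos 'invisible)))
  pos)

(defun my-visible-end (pos)
  "Return POS, moved forward over any invisible text starting at it."
  (while (and (< pos (point-max)) (invisible-p pos))
    (setq pos (next-single-char-property-change pos 'invisible)))
  pos)

;; Possible use, with BEG and END being the estimated region (untested):
;;   (treesit-update-ranges (my-visible-beg beg) (my-visible-end end))
```

This only moves the two boundaries out of folded text; invisible text strictly inside the region is not compensated for.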
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-12 4:16 ` Yuan Fu
2024-02-12 14:09 ` Eli Zaretskii
@ 2024-02-13 0:50 ` Dmitry Gutov
2024-02-13 8:08 ` Yuan Fu
1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-02-13 0:50 UTC (permalink / raw)
To: Yuan Fu, Vincenzo Pupillo
Cc: Ergus via Emacs development discussions., Eli Zaretskii
On 12/02/2024 06:16, Yuan Fu wrote:
> Thanks, the culprit is the call to treesit-update-ranges in
> treesit--pre-redisplay, where we don’t pass it any specific range, so it
> updates the range for the whole buffer. Eli, is there any way to get a
> rough estimate of the range that redisplay is refreshing? Do you think
> something like this would work?
If we don't update the ranges outside of some interval surrounding the
window, what does that mean for correctness?
Perhaps the mode has a syntax-propertize-function which behaves
differently (as it should) depending on the language at point. Or
different ranges have different syntax tables, something like that.
If the ranges, after some edit (perhaps a programmatic one, performed
far from the visible area), are not kept up to date somewhere around the
beginning of the buffer, do we not risk confusing the syntax-ppss
parser, for example?
Come to think of it, take treesit-indent: it only updates the ranges for
the current line. But the line's indentation usually depends on the
previous buffer positions, doesn't it?
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-13 0:50 ` Dmitry Gutov
@ 2024-02-13 8:08 ` Yuan Fu
2024-02-18 3:37 ` Dmitry Gutov
0 siblings, 1 reply; 31+ messages in thread
From: Yuan Fu @ 2024-02-13 8:08 UTC (permalink / raw)
To: Dmitry Gutov
Cc: Vincenzo Pupillo, Ergus via Emacs development discussions.,
Eli Zaretskii
> On Feb 12, 2024, at 4:50 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 12/02/2024 06:16, Yuan Fu wrote:
>> Thanks, the culprit is the call to treesit-update-ranges in
>> treesit--pre-redisplay, where we don’t pass it any specific range, so it
>> updates the range for the whole buffer. Eli, is there any way to get a
>> rough estimate of the range that redisplay is refreshing? Do you think
>> something like this would work?
>
> If we don't update the ranges outside of some interval surrounding the window, what does that mean for correctness?
If the place of update and the embedded code currently in view belong to the same node in the host language, then when we update ranges for the current window-visible range, the whole node’s range is updated. So at least for this node, the range is correct.
If the place of update and the embedded code currently in view belong to different nodes in the host language, then when we update ranges for the current window-visible range, only the visible node’s range is updated.
>
> Perhaps the mode has a syntax-propertize-function which behaves differently (as it should) depending on the language at point. Or different ranges have different syntax tables, something like that.
>
> If the ranges, after some edit (perhaps a programmatic one, performed far from the visible area), are not kept up to date somewhere around the beginning of the buffer, do we not risk confusing the syntax-ppss parser, for example?
That can happen, yes.
>
> Come to think of it, take treesit-indent: it only updates the ranges for the current line. But the line's indentation usually depends on the previous buffer positions, doesn't it?
The range passed to treesit-update-ranges acts as an intersecting range: we capture the nodes that intersect the range and use them to update ranges. If the line to be indented is in an embedded language block, the whole block will be captured and its range will be given to the embedded language parser.
We haven’t had any problems so far, mainly because most embedded code blocks are local, and it’s rare for an edit far from the visible portion to affect ranges in a way the user expects to see reflected in the currently visible range.
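
To illustrate the range semantics described above (every node that overlaps the requested range is captured whole), here is a toy model on plain (START . END) pairs. It is a sketch only: the helper is hypothetical and does not call the real tree-sitter query machinery.

```elisp
(require 'seq)

;; Hypothetical helper modelling the selection on plain cons cells.
(defun my-intersecting-ranges (ranges beg end)
  "Return the elements of RANGES, a list of (START . END), overlapping BEG..END."
  (seq-filter (lambda (r) (and (< (car r) end) (> (cdr r) beg)))
              ranges))
```

For example, asking for a single line at 100..110 inside an embedded block spanning 20..300 yields the whole block, which is why indenting one line still hands the embedded parser the full block.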
I don’t have any great idea for a better way to update ranges right now. Let me think about that. In the meantime, I’ll push a temporary fix so V’s original problem can be solved.
Yuan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-13 8:08 ` Yuan Fu
@ 2024-02-18 3:37 ` Dmitry Gutov
2024-02-19 5:53 ` Yuan Fu
0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-02-18 3:37 UTC (permalink / raw)
To: Yuan Fu
Cc: Vincenzo Pupillo, Ergus via Emacs development discussions.,
Eli Zaretskii
On 13/02/2024 10:08, Yuan Fu wrote:
>> On 12/02/2024 06:16, Yuan Fu wrote:
>>> Thanks, the culprit is the call to treesit-update-ranges in
>>> treesit--pre-redisplay, where we don’t pass it any specific range, so it
>>> updates the range for the whole buffer. Eli, is there any way to get a
>>> rough estimate of the range that redisplay is refreshing? Do you think
>>> something like this would work?
>>
>> If we don't update the ranges outside of some interval surrounding the window, what does that mean for correctness?
>
> If the place of update and the embedded code currently in view belong to the same node in the host language, then when we update ranges for the current window-visible range, the whole node’s range is updated. So at least for this node, the range is correct.
>
> If the place of update and the embedded code currently in view belong to different nodes in the host language, then when we update ranges for the current window-visible range, only the visible node’s range is updated.
Okay. What about positions after the visible part of the buffer? Can
their ranges be outdated? It's probably okay when the ranges are only
used for font-lock and syntax-ppss, but I wonder about possible other
applications (reindenting the whole buffer, for example).
>>
>> Perhaps the mode has a syntax-propertize-function which behaves differently (as it should) depending on the language at point. Or different ranges have different syntax tables, something like that.
>>
>> If the ranges, after some edit (perhaps a programmatic one, performed far from the visible area), are not kept up to date somewhere around the beginning of the buffer, do we not risk confusing the syntax-ppss parser, for example?
>
> That can happen, yes.
>
>>
>> Come to think of it, take treesit-indent: it only updates the ranges for the current line. But the line's indentation usually depends on the previous buffer positions, doesn't it?
>
> The range passed to treesit-update-ranges acts as an intersecting range: we capture the nodes that intersect the range and use them to update ranges. If the line to be indented is in an embedded language block, the whole block will be captured and its range will be given to the embedded language parser.
>
>
> We haven’t had any problems so far, mainly because most embedded code blocks are local, and it’s rare for an edit far from the visible portion to affect ranges in a way the user expects to see reflected in the currently visible range.
>
> I don’t have any great idea for a better way to update ranges right now. Let me think about that. In the meantime, I’ll push a temporary fix so V’s original problem can be solved.
I was thinking (since considering the same problem in mmm-mode,
actually) that it would make sense to either plug into
syntax-propertize-function, or have a parallel data structure similarly
tracking the outdated buffer regions, which would only update the part
of the buffer which had been modified since last time.
Dealing with the "remainder" of the buffer might be trickier, but maybe
some heuristic which would help detect the "no changes" case could be
implemented.
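
A minimal sketch of such tracking, modelled loosely on the way syntax-propertize only remembers how far the buffer has been processed. It assumes, conservatively, that everything after the earliest change is stale. All names are hypothetical, and this is not how treesit.el or mmm-mode currently work.

```elisp
;; Hypothetical sketch; all names are made up for illustration.
(defvar-local my-ranges-valid-up-to 1
  "Buffer position up to which embedded ranges are assumed up to date.")

(defun my-flush-ranges (beg _end _old-len)
  "Record that ranges from BEG onward may be stale.
Meant to be added to `after-change-functions'."
  (setq my-ranges-valid-up-to (min my-ranges-valid-up-to beg)))

(defun my-ensure-ranges (pos update-fn)
  "Make ranges valid up to POS by calling UPDATE-FN on the stale part.
UPDATE-FN is called with two arguments, BEG and END."
  (when (> pos my-ranges-valid-up-to)
    (funcall update-fn my-ranges-valid-up-to pos)
    (setq my-ranges-valid-up-to pos)))

;; Possible setup in a major mode (untested assumption):
;;   (add-hook 'after-change-functions #'my-flush-ranges nil t)
;;   (my-ensure-ranges (window-end) #'treesit-update-ranges)
```

Tracking a single position sidesteps the problem of cached positions shifting after edits, at the cost of redoing work after the change even when nothing there was affected; detecting that "no changes" case is the heuristic left open above.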
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-18 3:37 ` Dmitry Gutov
@ 2024-02-19 5:53 ` Yuan Fu
2024-03-21 6:39 ` Yuan Fu
0 siblings, 1 reply; 31+ messages in thread
From: Yuan Fu @ 2024-02-19 5:53 UTC (permalink / raw)
To: Dmitry Gutov
Cc: Vincenzo Pupillo, Ergus via Emacs development discussions.,
Eli Zaretskii
> On Feb 17, 2024, at 7:37 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 13/02/2024 10:08, Yuan Fu wrote:
>
>>> On 12/02/2024 06:16, Yuan Fu wrote:
>>>> Thanks, the culprit is the call to treesit-update-ranges in
>>>> treesit--pre-redisplay, where we don’t pass it any specific range, so it
>>>> updates the range for the whole buffer. Eli, is there any way to get a
>>>> rough estimate of the range that redisplay is refreshing? Do you think
>>>> something like this would work?
>>>
>>> If we don't update the ranges outside of some interval surrounding the window, what does that mean for correctness?
>> If the place of update and the embedded code currently in view belong to the same node in the host language, then when we update ranges for the current window-visible range, the whole node’s range is updated. So at least for this node, the range is correct.
>> If the place of update and the embedded code currently in view belong to different nodes in the host language, then when we update ranges for the current window-visible range, only the visible node’s range is updated.
>
> Okay. What about positions after the visible part of the buffer? Can their ranges be outdated? It's probably okay when the ranges are only used for font-lock and syntax-ppss, but I wonder about possible other applications (reindenting the whole buffer, for example).
It’s the same as positions before the visible part. For reindenting the whole buffer, treesit-indent-region will update the range for the whole buffer at the very beginning.
>
>>>
>>> Perhaps the mode has a syntax-propertize-function which behaves differently (as it should) depending on the language at point. Or different ranges have different syntax tables, something like that.
>>>
>>> If the ranges, after some edit (perhaps a programmatic one, performed far from the visible area), are not kept up to date somewhere around the beginning of the buffer, do we not risk confusing the syntax-ppss parser, for example?
>> That can happen, yes.
>>>
>>> Come to think of it, take treesit-indent: it only updates the ranges for the current line. But the line's indentation usually depends on the previous buffer positions, doesn't it?
>> The range passed to treesit-update-ranges acts as an intersecting range: we capture the nodes that intersect the range and use them to update ranges. If the line to be indented is in an embedded language block, the whole block will be captured and its range will be given to the embedded language parser.
>> We haven’t had any problems so far, mainly because most embedded code blocks are local, and it’s rare for an edit far from the visible portion to affect ranges in a way the user expects to see reflected in the currently visible range.
>> I don’t have any great idea for a better way to update ranges right now. Let me think about that. In the meantime, I’ll push a temporary fix so V’s original problem can be solved.
>
> I was thinking (since considering the same problem in mmm-mode, actually) that it would make sense to either plug into syntax-propertize-function, or have a parallel data structure similarly tracking the outdated buffer regions, which would only update the part of the buffer which had been modified since last time.
>
> Dealing with the "remainder" of the buffer might be trickier, but maybe some heuristic which would help detect the "no changes" case could be implemented.
Yeah, something similar to syntax-ppss or jit-lock. Or maybe it can be avoided, since the current on-demand range update has been working fine, until we added treesit--pre-redisplay for syntax-ppss.
Yuan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-02-19 5:53 ` Yuan Fu
@ 2024-03-21 6:39 ` Yuan Fu
0 siblings, 0 replies; 31+ messages in thread
From: Yuan Fu @ 2024-03-21 6:39 UTC (permalink / raw)
To: Dmitry Gutov
Cc: Vincenzo Pupillo, Ergus via Emacs development discussions.,
Eli Zaretskii
> On Feb 18, 2024, at 9:53 PM, Yuan Fu <casouri@gmail.com> wrote:
>
>
>
>> On Feb 17, 2024, at 7:37 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>>
>> On 13/02/2024 10:08, Yuan Fu wrote:
>>
>>>> On 12/02/2024 06:16, Yuan Fu wrote:
>>>>> Thanks, the culprit is the call to treesit-update-ranges in
>>>>> treesit--pre-redisplay, where we don’t pass it any specific range, so it
>>>>> updates the range for the whole buffer. Eli, is there any way to get a
>>>>> rough estimate of the range that redisplay is refreshing? Do you think
>>>>> something like this would work?
>>>>
>>>> If we don't update the ranges outside of some interval surrounding the window, what does that mean for correctness?
>>> If the place of update and the embedded code currently in view belong to the same node in the host language, then when we update ranges for the current window-visible range, the whole node’s range is updated. So at least for this node, the range is correct.
>>> If the place of update and the embedded code currently in view belong to different nodes in the host language, then when we update ranges for the current window-visible range, only the visible node’s range is updated.
>>
>> Okay. What about positions after the visible part of the buffer? Can their ranges be outdated? It's probably okay when the ranges are only used for font-lock and syntax-ppss, but I wonder about possible other applications (reindenting the whole buffer, for example).
>
> It’s the same as positions before the visible part. For reindenting the whole buffer, treesit-indent-region will update the range for the whole buffer at the very beginning.
>
>>
>>>>
>>>> Perhaps the mode has a syntax-propertize-function which behaves differently (as it should) depending on the language at point. Or different ranges have different syntax tables, something like that.
>>>>
>>>> If the ranges, after some edit (perhaps a programmatic one, performed far from the visible area), are not kept up to date somewhere around the beginning of the buffer, do we not risk confusing the syntax-ppss parser, for example?
>>> That can happen, yes.
>>>>
>>>> Come to think of it, take treesit-indent: it only updates the ranges for the current line. But the line's indentation usually depends on the previous buffer positions, doesn't it?
>>> The range passed to treesit-update-ranges acts as an intersecting range: we capture the nodes that intersect the range and use them to update ranges. If the line to be indented is in an embedded language block, the whole block will be captured and its range will be given to the embedded language parser.
>>> We haven’t had any problems so far, mainly because most embedded code blocks are local, and it’s rare for an edit far from the visible portion to affect ranges in a way the user expects to see reflected in the currently visible range.
>>> I don’t have any great idea for a better way to update ranges right now. Let me think about that. In the meantime, I’ll push a temporary fix so V’s original problem can be solved.
>>
>> I was thinking (since considering the same problem in mmm-mode, actually) that it would make sense to either plug into syntax-propertize-function, or have a parallel data structure similarly tracking the outdated buffer regions, which would only update the part of the buffer which had been modified since last time.
>>
>> Dealing with the "remainder" of the buffer might be trickier, but maybe some heuristic which would help detect the "no changes" case could be implemented.
>
> Yeah, something similar to syntax-ppss or jit-lock. Or maybe it can be avoided, since the current on-demand range update has been working fine, until we added treesit--pre-redisplay for syntax-ppss.
This is actually a bit involved, because there could be multiple layers of parsers: the host language sets ranges for a local parser, and the local parser can in turn set ranges for a further nested parser. E.g., we might have a markdown parser for parsing doc-comments, and inside the markdown there could be code blocks which require another level of nested parser.
This use-case is a bit advanced, but we definitely need to support it in our design. And my brain is twisted by all the dependencies and ranges. If you guys have any ideas, they’ll be most welcome :-)
Yuan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: treesitter local parser: huge slowdown and memory usage in a long file
@ 2024-04-20 2:18 Yuan Fu
2024-04-20 19:14 ` Vincenzo Pupillo
2024-05-06 2:04 ` Dmitry Gutov
0 siblings, 2 replies; 31+ messages in thread
From: Yuan Fu @ 2024-04-20 2:18 UTC (permalink / raw)
To: 付禹安, Dmitry Gutov
Cc: Ergus via Emacs development discussions.
>
>
> > On Feb 18, 2024, at 9:53 PM, Yuan Fu <casouri@gmail.com> wrote:
> >
> >
> >
> >> On Feb 17, 2024, at 7:37 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> >>
> >> On 13/02/2024 10:08, Yuan Fu wrote:
> >>
> >>>> On 12/02/2024 06:16, Yuan Fu wrote:
> >>>>> Thanks, the culprit is the call to treesit-update-ranges in
> >>>>> treesit--pre-redisplay, where we don’t pass it any specific range, so it
> >>>>> updates the range for the whole buffer. Eli, is there any way to get a
> >>>>> rough estimate of the range that redisplay is refreshing? Do you think
> >>>>> something like this would work?
> >>>>
> >>>> If we don't update the ranges outside of some interval surrounding the
> >>>> window, what does that mean for correctness?
> >>> If the place of update and the embedded code currently in view belong to
> >>> the same node in the host language, then when we update ranges for the
> >>> current window-visible range, the whole node’s range is updated. So at
> >>> least for this node, the range is correct.
> >>> If the place of update and the embedded code currently in view belong to
> >>> different nodes in the host language, then when we update ranges for the
> >>> current window-visible range, only the visible node’s range is updated.
> >>
> >> Okay. What about positions after the visible part of the buffer? Can their
> >> ranges be outdated? It's probably okay when the ranges are only used for
> >> font-lock and syntax-ppss, but I wonder about possible other applications
> >> (reindenting the whole buffer, for example).
> >
> > It’s the same as positions before the visible part. For reindenting the whole
> > buffer, treesit-indent-region will update the range for the whole buffer at
> > the very beginning.
> >
> >>
> >>>>
> >>>> Perhaps the mode has a syntax-propertize-function which behaves
> >>>> differently (as it should) depending on the language at point. Or
> >>>> different ranges have different syntax tables, something like that.
> >>>>
> >>>> If the ranges, after some edit (perhaps a programmatic one, performed far
> >>>> from the visible area), are kept not update somewhere around the beginning
> >>>> of the buffer, do we not risk confusing the syntax-ppss parser, for
> >>>> example?
> >>> That can happen, yes.
> >>>>
> >>>> Come to think of it, take treesit-indent: it only updates the ranges for
> >>>> the current line. But the line's indentation usually depends on the
> >>>> previous buffer positions, doesn't it?
> >>> The range passed to treesit-update-ranges act as an intercepting range—we
> >>> capture nodes that intercepts with the range and use them to update ranges.
> >>> If the line to be indented is in an embedded language block, the whole
> >>> block will be captured and it’s range will be given to the embedded
> >>> language parser.
> >>> We haven’t have any problem so far mainly because most embedded code blocks
> >>> are local, and it’s rare for some edit to take place far from the visible
> >>> portion which affects ranges and user expects that edit to affect the
> >>> current visible range.
> >>> I don’t have any great idea for a better way to update ranges right now.
> >>> Let me think about that. In the meantime, I’ll push a temporary fix so V’s
> >>> original problem can be solved.
> >>
> >> I was thinking (since considering the same problem in mmm-mode, actually)
> >> that it would make sense to either plug into syntax-propertize-function, or
> >> have a parallel data structure similarly tracking the outdated buffer
> >> regions, which would only update the part of the buffer which had been
> >> modified since last time.
> >>
> >> Dealing with the "remainder" of the buffer might be trickier, but maybe some
> >> heuristic which would help detect the "no changes" case could be implemented.
> >
> > Yeah, something similar to syntax-ppss or jit-lock. Or maybe it can be
> > avoided, since the current on-demand range update has been working fine,
> > until we added treesit--pre-redisplay for syntax-ppss.
>
> This is actually a bit involved, because there could be multiple layer’s of
> parsers: the host language sets range for a local parser, and the local parser
> can set ranges for a nested-nested parser. Eg, we might have a markdown parser
> for parsing doc-comments, and inside the markdown there could be code blocks
> which require another level of nested parser.
>
> This use-case is a bit advanced but we definitely need to support it in our
> design. And my brain is twisted by all the dependency and range. If you guys
> has some ideas they’ll be most welcome :-)
>
I believe I’ve found a good way to solve this problem. I pushed the changes to master.
Basically I added a function treesit-parser-changed-ranges that directly returns the changed ranges from the last reparse. This means we don’t need to use notifiers to get those changed ranges anymore. Then in treesit--pre-redisplay, we reparse the primary parser and get the changed ranges from it.
Once we have the changed ranges, we update the other, non-primary parsers’ ranges, but only within the changed ranges. Originally we were updating those parsers’ ranges on the whole buffer, which led to the slowdown. Then we had to use a workaround to solve this. Now the workaround isn’t needed anymore.
I also removed some notifier functions and moved their work into treesit--pre-redisplay.
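To make the idea concrete, here is a rough toy model in Python (not the actual treesit.el code). The names `overlaps`, `update_embedded_ranges` and the `query` callback are made up for illustration, and the sketch assumes that old ranges outside the changed ranges are still valid as they are.

```python
from typing import Callable, List, Tuple

Range = Tuple[int, int]  # half-open (start, end) buffer positions


def overlaps(a: Range, b: Range) -> bool:
    """Return True when half-open ranges a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def update_embedded_ranges(
    old: List[Range],
    changed: List[Range],
    query: Callable[[int, int], List[Range]],
) -> List[Range]:
    """Refresh embedded-parser ranges, re-querying only inside `changed`.

    `query(beg, end)` is a hypothetical stand-in for asking the host
    parser which embedded blocks intersect [beg, end).
    """
    # Keep every old range that no changed range touches.
    kept = [r for r in old if not any(overlaps(r, c) for c in changed)]
    # Re-query the host tree only within each changed range.
    fresh = [r for c in changed for r in query(c[0], c[1])]
    return sorted(set(kept + fresh))
```

With 669 embedded ranges and one edited comment, this shape of update issues one small query per changed range instead of one query over the whole buffer.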
Yuan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-04-20 2:18 Yuan Fu
@ 2024-04-20 19:14 ` Vincenzo Pupillo
2024-04-23 5:09 ` Yuan Fu
2024-05-06 2:04 ` Dmitry Gutov
1 sibling, 1 reply; 31+ messages in thread
From: Vincenzo Pupillo @ 2024-04-20 19:14 UTC (permalink / raw)
To: 付禹安, Dmitry Gutov, emacs-devel
Cc: Ergus via Emacs development discussions., Yuan Fu
Great job!
I tried your new patch with my usual benchmark, tcpdf.php, and my php-ts-mode
and it works very well!
Thank you very much
V.
On Saturday, 20 April 2024 04:18:53 CEST, Yuan Fu wrote:
> > > On Feb 18, 2024, at 9:53 PM, Yuan Fu <casouri@gmail.com> wrote:
> > >> On Feb 17, 2024, at 7:37 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> > >>
> > >> On 13/02/2024 10:08, Yuan Fu wrote:
> > >>>> On 12/02/2024 06:16, Yuan Fu wrote:
> > >>>>> Thanks, the culprit is the call to treesit-update-ranges in
> > >>>>> treesit--pre-redisplay, where we don’t pass it any specific range,
> > >>>>> so it
> > >>>>> updates the range for the whole buffer. Eli, is there any way to get
> > >>>>> a
> > >>>>> rough estimate the range that redisplay is refreshing? Do you think
> > >>>>> something like this would work?
> > >>>>
> > >>>> If we don't update the ranges outside of some interval surrounding
> > >>>> the
> > >>>> window, what does that mean for correctness?
> > >>>
> > >>> If the place of update and the embedded code currently in view belong
> > >>> to
> > >>> the same node in the host language, then when we update ranges for the
> > >>> current window-visible range, the whole node’s range is updated. So at
> > >>> least for this node, the range is correct.
> > >>> If the place of update and the embedded code currently in view belong
> > >>> to
> > >>> different nodes in the host language, then when we update ranges for
> > >>> the
> > >>> current window-visible range, only the visible node’s range is
> > >>> updated.
> > >>
> > >> Okay. What about positions after the visible part of the buffer? Can
> > >> their
> > >> ranges be outdated? It's probably okay when the ranges are only used
> > >> for
> > >> font-lock and syntax-ppss, but I wonder about possible other
> > >> applications
> > >> (reindenting the whole buffer, for example).
> > >
> > > It’s the same as positions before the visible part. For reindenting the
> > > whole buffer, treesit-indent-region will update the range for the whole
> > > buffer at the very beginning.
> > >
> > >>>> Perhaps the mode has a syntax-propertize-function which behaves
> > >>>> differently (as it should) depending on the language at point. Or
> > >>>> different ranges have different syntax tables, something like that.
> > >>>>
> > >>>> If the ranges, after some edit (perhaps a programmatic one, performed
> > >>>> far
> > >>>> from the visible area), are kept not update somewhere around the
> > >>>> beginning
> > >>>> of the buffer, do we not risk confusing the syntax-ppss parser, for
> > >>>> example?
> > >>>
> > >>> That can happen, yes.
> > >>>
> > >>>> Come to think of it, take treesit-indent: it only updates the ranges
> > >>>> for
> > >>>> the current line. But the line's indentation usually depends on the
> > >>>> previous buffer positions, doesn't it?
> > >>>
> > >>> The range passed to treesit-update-ranges act as an intercepting
> > >>> range—we
> > >>> capture nodes that intercepts with the range and use them to update
> > >>> ranges.
> > >>> If the line to be indented is in an embedded language block, the whole
> > >>> block will be captured and it’s range will be given to the embedded
> > >>> language parser.
> > >>> We haven’t have any problem so far mainly because most embedded code
> > >>> blocks
> > >>> are local, and it’s rare for some edit to take place far from the
> > >>> visible
> > >>> portion which affects ranges and user expects that edit to affect the
> > >>> current visible range.
> > >>> I don’t have any great idea for a better way to update ranges right
> > >>> now.
> > >>> Let me think about that. In the meantime, I’ll push a temporary fix so
> > >>> V’s
> > >>> original problem can be solved.
> > >>
> > >> I was thinking (since considering the same problem in mmm-mode,
> > >> actually)
> > >> that it would make sense to either plug into
> > >> syntax-propertize-function, or
> > >> have a parallel data structure similarly tracking the outdated buffer
> > >> regions, which would only update the part of the buffer which had been
> > >> modified since last time.
> > >>
> > >> Dealing with the "remainder" of the buffer might be trickier, but maybe
> > >> some heuristic which would help detect the "no changes" case could be
> > > >> implemented.
> > >
> > > Yeah, something similar to syntax-ppss or jit-lock. Or maybe it can be
> > > avoided, since the current on-demand range update has been working fine,
> > > until we added treesit--pre-redisplay for syntax-ppss.
> >
> > This is actually a bit involved, because there could be multiple layer’s
> > of
> > parsers: the host language sets range for a local parser, and the local
> > parser can set ranges for a nested-nested parser. Eg, we might have a
> > markdown parser for parsing doc-comments, and inside the markdown there
> > could be code blocks which require another level of nested parser.
> >
> > This use-case is a bit advanced but we definitely need to support it in
> > our
> > design. And my brain is twisted by all the dependency and range. If you
> > guys has some ideas they’ll be most welcome :-)
>
> I believe I’ve found a good way to solve this problem. I pushed the changes
> to master.
>
> Basically I added a function treesit-parser-changed-ranges that can directly
> return the change ranges from last reparse. This means we don’t need to use
> notifiers to get those change ranges anymore. Then in
> treesit-pre-redisplay, we reparse the primary parser and get the changed
> ranges from it.
>
> Once we have the changed ranges, we update other non-primary parser’s
> ranges, but only within the changed ranges. Originally we were updating
> those parser’s ranges on the whole buffer, which led to the slowdown. Then
> we had to use some workaround to solve this. Now the workaround isn’t
> needed anymore.
>
> I also remove some notifier functions and moved their work into
> treesit-pre-redisplay.
>
> Yuan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-04-20 19:14 ` Vincenzo Pupillo
@ 2024-04-23 5:09 ` Yuan Fu
0 siblings, 0 replies; 31+ messages in thread
From: Yuan Fu @ 2024-04-23 5:09 UTC (permalink / raw)
To: Vincenzo Pupillo; +Cc: Dmitry Gutov, emacs-devel
> On Apr 20, 2024, at 12:14 PM, Vincenzo Pupillo <v.pupillo@gmail.com> wrote:
>
> Great job!
> I tried your new patch with my usual benchmark, tcpdf.php, and my php-ts-mode
> and it works very well!
> Thank you very much
> V.
Glad it worked! It’s been haunting me for months and now I can finally sleep soundly at night :-)
Yuan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-04-20 2:18 Yuan Fu
2024-04-20 19:14 ` Vincenzo Pupillo
@ 2024-05-06 2:04 ` Dmitry Gutov
2024-05-09 0:16 ` Yuan Fu
1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-05-06 2:04 UTC (permalink / raw)
To: Yuan Fu; +Cc: Ergus via Emacs development discussions.
Hi Yuan,
Sorry if I'm being too pedantic here.
On 20/04/2024 05:18, Yuan Fu wrote:
> I believe I’ve found a good way to solve this problem. I pushed the changes to master.
>
> Basically I added a function treesit-parser-changed-ranges that can directly return the change ranges from last reparse. This means we don’t need to use notifiers to get those change ranges anymore. Then in treesit-pre-redisplay, we reparse the primary parser and get the changed ranges from it.
>
> Once we have the changed ranges, we update other non-primary parser’s ranges, but only within the changed ranges. Originally we were updating those parser’s ranges on the whole buffer, which led to the slowdown. Then we had to use some workaround to solve this. Now the workaround isn’t needed anymore.
The essence of the change (querying fewer ranges) looks good.
I'm a bit uneasy about the new function and how it's supposed to be
used. treesit-parser-changed-ranges returns the ranges changed during
the last reparse. That seems to imply that all of its callers must have
the up-to-date information about the state of the buffer before that
reparse, and thus basically follow the parser's updates through some
mechanism. The implementation also saves some information during every
reparse, whether somebody is going to call treesit-parser-changed-ranges
or not.
To take our new code as an example, the only client of
treesit-parser-changed-ranges now is treesit--pre-redisplay, which is
called from syntax-propertize-extend-region-functions and
pre-redisplay-functions.
Is it possible that multiple changes and reparses would occur
between some firings of the above hooks? For example, some new feature
might go over the buffer's text with an automated multi-step
transformation, calling the parser (but not syntax-ppss) on each step.
In such a scenario it seems treesit--pre-redisplay might miss
intermediate range updates. Would that be okay?
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-06 2:04 ` Dmitry Gutov
@ 2024-05-09 0:16 ` Yuan Fu
2024-05-12 23:44 ` Dmitry Gutov
2024-05-26 4:23 ` Stefan Monnier via Emacs development discussions.
0 siblings, 2 replies; 31+ messages in thread
From: Yuan Fu @ 2024-05-09 0:16 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Ergus via Emacs development discussions.
> On May 5, 2024, at 7:04 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> Hi Yuan,
>
> Sorry if I'm being too pedantic here.
>
> On 20/04/2024 05:18, Yuan Fu wrote:
>
>> I believe I’ve found a good way to solve this problem. I pushed the changes to master.
>> Basically I added a function treesit-parser-changed-ranges that can directly return the change ranges from last reparse. This means we don’t need to use notifiers to get those change ranges anymore. Then in treesit-pre-redisplay, we reparse the primary parser and get the changed ranges from it.
>> Once we have the changed ranges, we update other non-primary parser’s ranges, but only within the changed ranges. Originally we were updating those parser’s ranges on the whole buffer, which led to the slowdown. Then we had to use some workaround to solve this. Now the workaround isn’t needed anymore.
>
> The essence of the change (querying fewer ranges) looks good.
>
> I'm a bit uneasy about the new function and how it's supposed to be used. treesit-parser-changed-ranges returns the ranges changes during the last reparse. That seems to imply that all of its callers must have the up-to-date information about the state of the buffer before that reparse, and thus basically follow the parser's updates through some mechanism. The implementation also saves some information during every reparse, whether somebody is going to call treesit-parser-changed-ranges or not.
>
> To take our new code as an example, the only client of treesit-parser-changed-ranges now is treesit--pre-redisplay, which is called from syntax-propertize-extend-region-functions and pre-redisplay-functions.
>
> Is it possible that there would occur multiple changes and reparses between some firings of the above hooks? For example, some new feature might go over the buffer's text with an automated multi-step transformation, calling the parser (but not syntax-ppss) on each step.
> In such a scenario it seems treesit--pre-redisplay might miss intermediate range updates. Would that be okay?
I think you’re right. The chance of it actually going wrong is slim, but anything that can go wrong eventually will.
The remaining question is how to fix it. I’m thinking of keeping a history of updated ranges, each marked with the parser timestamp. The parser timestamp is already there; it’s incremented every time the parser reparses. treesit-parser-changed-ranges would return the timestamp along with the updated ranges. Then in the next iteration, the consumer can pass the last timestamp to treesit-parser-changed-ranges, which tells it to return all the ranges changed since that timestamp.
The only problem is deciding how long a history of updated ranges to keep for each parser. The 100% correct approach is to maintain a separate history for each consumer, and never throw away old ranges until the consumer consumes them. But then you risk wasting memory if some consumer never consumes the ranges. To handle that we can add a hard limit. But then this hard limit might be too low for some edge case… We can make this hard limit configurable, and if we ever encounter a case where it isn’t enough and there’s no way around it (unlikely), we can instruct users or Lisp programs to increase it.
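Roughly what I have in mind, as a toy Python model rather than real Emacs code. `ChangeHistory`, `record` and `since` are hypothetical names, and returning `None` to mean "the history you need is gone, treat the whole buffer as changed" is only one possible convention.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Range = Tuple[int, int]  # half-open (start, end) buffer positions


class ChangeHistory:
    """Bounded per-parser log of changed ranges, keyed by reparse timestamp."""

    def __init__(self, limit: int) -> None:
        self.timestamp = 0  # incremented on every reparse
        # deque(maxlen=limit) silently drops the oldest entry when full.
        self._entries: Deque[Tuple[int, List[Range]]] = deque(maxlen=limit)

    def record(self, ranges: List[Range]) -> None:
        """Log the ranges changed by one reparse."""
        self.timestamp += 1
        self._entries.append((self.timestamp, list(ranges)))

    def since(self, last_seen: int) -> Tuple[Optional[List[Range]], int]:
        """Return (ranges changed after `last_seen`, current timestamp).

        The ranges are None when entries the consumer still needs have
        already been dropped because of the hard limit.
        """
        if last_seen >= self.timestamp:
            return ([], self.timestamp)
        oldest = self._entries[0][0] if self._entries else self.timestamp + 1
        if last_seen + 1 < oldest:
            return (None, self.timestamp)
        ranges = [r for ts, rs in self._entries if ts > last_seen for r in rs]
        return (ranges, self.timestamp)
```

Each consumer only has to remember the last timestamp it saw. Note that this sketch does not adjust older ranges for edits made after they were recorded.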
Yuan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-09 0:16 ` Yuan Fu
@ 2024-05-12 23:44 ` Dmitry Gutov
2024-05-22 5:51 ` Yuan Fu
2024-05-26 4:23 ` Stefan Monnier via Emacs development discussions.
1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-05-12 23:44 UTC (permalink / raw)
To: Yuan Fu; +Cc: Ergus via Emacs development discussions.
On 09/05/2024 03:16, Yuan Fu wrote:
>> Is it possible that there would occur multiple changes and reparses between some firings of the above hooks? For example, some new feature might go over the buffer's text with an automated multi-step transformation, calling the parser (but not syntax-ppss) on each step.
>> In such a scenario it seems treesit--pre-redisplay might miss intermediate range updates. Would that be okay?
>
> I think you’re right. The chance of it actually go wrong will be slim, but anything that’s possible to go wrong will eventually go wrong.
Thanks for confirming the concern.
> The remaining question is how. I’m thinking of keeping a history of updated ranges, each marked with the parser timestamp. The parser timestamp is already there, it’s incremented every time the parser reparses. And treesit-parser-changed-ranges will return the timestamp along with the updated ranges. Then in the next iteration, the consumer can pass the last timestamp to treesit-parser-changed-ranges, which tells it to return all the changed ranges since that timestamp.
>
> The only problem is to decide how long a history of updated ranges do we keep for each parser. The 100% correct approach is to maintain a separate history for each consumer, and never throw away old ranges until the consumer consumes them. But then you risk wasting memory if some consumer never consumes the ranges. To handle that we can add a hard limit. But then this hard limit might be too low for some edge case… We can make this hard limit configurable, and if we ever encountered a case where this hard limit is not enough and there’s no way around it (unlikely), we can instruct users or lisp program to increase it.
That could work. Although it's hard for me to imagine how far back the
history would have to be stored, and whether that would have any practical
consequences for Emacs's memory use. Maybe not.
The approach I was thinking of goes in a different direction: we take a step
back, remove (or at least stop using) the new function, and go back to
the idea of subscribing to parsers' after-change notifiers. The
improvement in commit f62c1b4cd00 seems to stem from relying foremost on
the changed ranges in the primary parser. Okay - we re-add the listener for
the primary parser only.
This listener would be specific to a particular consumer. In our case,
we'd have a listener which would populate - and then update - the
variable used by treesit--pre-redisplay. That variable would store the
"up to date" list of updated ranges. The listener, on every call, would
"merge" its current value with the new list of ranges (*).
treesit--pre-redisplay would use the data in that variable instead
of calling treesit-parser-changed-ranges, and set the value to nil to
"reset" it for the next update.
(*) So real "merging" would only need to be performed when the listener
fires 2+ times between two adjacent treesit--pre-redisplay calls.
Otherwise the current value is nil, so the new list is simply
assigned to the variable. Anyway, the merging logic seems to be the
trickiest part in this scheme (managing and interpreting offsets), but
it should be very similar in both approaches.
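A minimal sketch of that scheme, as a toy Python model and not the treesit.el API. `ToyParser`, `PendingRanges` and their methods are hypothetical, and the "merge" here is plain concatenation with no offset adjustment, which is exactly the part that would need real care.

```python
from typing import Callable, List, Optional, Tuple

Range = Tuple[int, int]  # half-open (start, end) buffer positions


class ToyParser:
    """Stand-in for a primary parser that notifies listeners on reparse."""

    def __init__(self) -> None:
        self._notifiers: List[Callable[[List[Range], "ToyParser"], None]] = []

    def add_notifier(self, fn: Callable[[List[Range], "ToyParser"], None]) -> None:
        self._notifiers.append(fn)

    def reparse(self, changed: List[Range]) -> None:
        # Pretend a reparse happened and `changed` is what it reported.
        for fn in self._notifiers:
            fn(changed, self)


class PendingRanges:
    """Per-consumer variable that the listener fills and the consumer drains."""

    def __init__(self) -> None:
        self.pending: Optional[List[Range]] = None

    def on_change(self, ranges: List[Range], parser: ToyParser) -> None:
        if self.pending is None:
            # First notification since the last reset: plain assignment.
            self.pending = list(ranges)
        else:
            # Placeholder "merge": concatenate, offsets are NOT adjusted.
            self.pending = self.pending + list(ranges)

    def take(self) -> List[Range]:
        """Return everything accumulated so far and reset to nil/None."""
        ranges, self.pending = self.pending or [], None
        return ranges
```

The point of the shape is that two reparses between two `take` calls are both seen by the consumer, without the parser having to keep any history.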
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-12 23:44 ` Dmitry Gutov
@ 2024-05-22 5:51 ` Yuan Fu
2024-05-22 23:42 ` Dmitry Gutov
0 siblings, 1 reply; 31+ messages in thread
From: Yuan Fu @ 2024-05-22 5:51 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Ergus via Emacs development discussions.
> On May 12, 2024, at 4:44 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 09/05/2024 03:16, Yuan Fu wrote:
>
>>> Is it possible that there would occur multiple changes and reparses between some firings of the above hooks? For example, some new feature might go over the buffer's text with an automated multi-step transformation, calling the parser (but not syntax-ppss) on each step.
>>> In such a scenario it seems treesit--pre-redisplay might miss intermediate range updates. Would that be okay?
>> I think you’re right. The chance of it actually go wrong will be slim, but anything that’s possible to go wrong will eventually go wrong.
>
> Thanks for confirming the concern.
>
>> The remaining question is how. I’m thinking of keeping a history of updated ranges, each marked with the parser timestamp. The parser timestamp is already there, it’s incremented every time the parser reparses. And treesit-parser-changed-ranges will return the timestamp along with the updated ranges. Then in the next iteration, the consumer can pass the last timestamp to treesit-parser-changed-ranges, which tells it to return all the changed ranges since that timestamp.
>> The only problem is to decide how long a history of updated ranges do we keep for each parser. The 100% correct approach is to maintain a separate history for each consumer, and never throw away old ranges until the consumer consumes them. But then you risk wasting memory if some consumer never consumes the ranges. To handle that we can add a hard limit. But then this hard limit might be too low for some edge case… We can make this hard limit configurable, and if we ever encountered a case where this hard limit is not enough and there’s no way around it (unlikely), we can instruct users or lisp program to increase it.
>
> That could work. Although it's hard for me to imagine how far back the history would have to be stored, and would that have any practical consequences for Emacs's memory use. Maybe not.
>
> The approach I was thinking of is in different direction: we take a step back, remove (or stop using at least) the new function, and go back to the idea of subscribing to parsers' after-change notifiers. The improvement in commit f62c1b4cd00 seems to stem from relying foremost on changes ranges in the primary parser. Okay - we re-add the listener for the primary parser only.
>
> This listener would be specific for a particular consumer. In our case, we'd have a listener which would populate - and then update - the variable used by treesit--pre-redisplay. That variable would store the "up to date" list of updated ranges. The listener, on every call, would "merge" its current value one with the new list of ranges (*). treesit--pre-redisplay would use the data in that data structure instead of calling treesit-parser-changed-ranges, and set the value to nil to "reset" it for the next update.
>
> (*) So real "merging" would only need to be performed when listener fires 2+ times between the two adjacent treesit--pre-redisplay calls. Otherwise the current value is nil, so the the new list is simply assigned to the variable. Anyway, the merging logic seems to be the trickiest part in this scheme (managing and interpreting offsets), but it should be very similar in both approaches.
I agree. The usefulness of treesit-parser-changed-ranges isn’t really justified at this point (well, except that it makes the caller’s code much easier to follow). Let me implement what you described and let’s see how it goes. I think we don’t even need to merge the ranges (which would be prone to bugs if I were to write it ;-)); we can just push the new ranges to a list and later process them one by one.
Yuan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-22 5:51 ` Yuan Fu
@ 2024-05-22 23:42 ` Dmitry Gutov
2024-05-27 22:03 ` Yuan Fu
0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-05-22 23:42 UTC (permalink / raw)
To: Yuan Fu; +Cc: Ergus via Emacs development discussions.
On 22/05/2024 08:51, Yuan Fu wrote:
>> This listener would be specific for a particular consumer. In our case, we'd have a listener which would populate - and then update - the variable used by treesit--pre-redisplay. That variable would store the "up to date" list of updated ranges. The listener, on every call, would "merge" its current value one with the new list of ranges (*). treesit--pre-redisplay would use the data in that data structure instead of calling treesit-parser-changed-ranges, and set the value to nil to "reset" it for the next update.
>>
>> (*) So real "merging" would only need to be performed when listener fires 2+ times between the two adjacent treesit--pre-redisplay calls. Otherwise the current value is nil, so the the new list is simply assigned to the variable. Anyway, the merging logic seems to be the trickiest part in this scheme (managing and interpreting offsets), but it should be very similar in both approaches.
> I agree. The usefulness of treesit-parser-changed-ranges aren’t really justified at this point (well, except that it makes the caller’s code much easier to follow).
That it does.
> Let me implement what you described and let’s see how it goes.
Thank you, looking forward to it!
> I think we don’t even need to merge the ranges (which will be prone to bugs if I were to write it 😉, we can just push the new ranges to a list and later process them one by one.
I think this might amount to the same thing (merging when generating, or
merging when processing). It seems there will also be a small issue of
"kinds" of ranges?..
For example, suppose we have two consecutive operations which each insert
new characters in range 200..300. The result should be a range that
spans 200..400, right?
But if one operation just changes text in that range (keeping its length
intact, e.g. capitalizing the whole region), and another does the same
(back to lower case), then the combined range would remain 200..300.
Computing that might be difficult without having access to the kinds of
changes being done (does tree-sitter report those?). OTOH, most of
the time the most important part is the position of the beginning of the
changes (e.g. for syntax-ppss), and we could treat the rest of the
buffer as invalidated...
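The "invalidate from the first change onwards" fallback could look roughly like this. It is a toy Python sketch with a hypothetical `InvalidFrom` class, not existing Emacs code. It only needs the start position of each edit: an edit at or after the recorded position cannot move it, and an edit before it simply becomes the new minimum, so no offset bookkeeping is required.

```python
from typing import Optional


class InvalidFrom:
    """Track only the earliest buffer position touched since the last reset."""

    def __init__(self) -> None:
        self.start: Optional[int] = None

    def note_edit(self, edit_start: int) -> None:
        # Keep the minimum; later edits never shift an earlier position.
        if self.start is None or edit_start < self.start:
            self.start = edit_start

    def take(self) -> Optional[int]:
        """Return the earliest changed position (or None) and reset."""
        start, self.start = self.start, None
        return start
```

Everything from the returned position to the end of the buffer would then be treated as out of date, which is coarser than exact ranges but much harder to get wrong.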
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-22 23:42 ` Dmitry Gutov
@ 2024-05-27 22:03 ` Yuan Fu
2024-05-27 22:24 ` Dmitry Gutov
0 siblings, 1 reply; 31+ messages in thread
From: Yuan Fu @ 2024-05-27 22:03 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Ergus via Emacs development discussions.
> On May 22, 2024, at 4:42 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 22/05/2024 08:51, Yuan Fu wrote:
>>> This listener would be specific for a particular consumer. In our case, we'd have a listener which would populate - and then update - the variable used by treesit--pre-redisplay. That variable would store the "up to date" list of updated ranges. The listener, on every call, would "merge" its current value one with the new list of ranges (*). treesit--pre-redisplay would use the data in that data structure instead of calling treesit-parser-changed-ranges, and set the value to nil to "reset" it for the next update.
>>>
>>> (*) So real "merging" would only need to be performed when listener fires 2+ times between the two adjacent treesit--pre-redisplay calls. Otherwise the current value is nil, so the the new list is simply assigned to the variable. Anyway, the merging logic seems to be the trickiest part in this scheme (managing and interpreting offsets), but it should be very similar in both approaches.
>> I agree. The usefulness of treesit-parser-changed-ranges aren’t really justified at this point (well, except that it makes the caller’s code much easier to follow).
>
> That it does.
>
>> Let me implement what you described and let’s see how it goes.
>
> Thank you, looking forward to it!
>
>> I think we don’t even need to merge the ranges (which will be prone to bugs if I were to write it 😉, we can just push the new ranges to a list and later process them one by one.
>
> I think this might amount to the same thing (merging when generating, or merging when processing). It seems there will also be a small issue of "kinds" of ranges?..
>
> Like for example suppose we have two consecutive operations which insert new characters in range 200..300. The result should be a range that spans 200..400, right?
>
> But if one operation just changes text in that range (keeping its length intact, e.g. capitalizing the whole region), and another does the same (back to lower case), then the combined range would remain 200..300.
>
> Computing that might be difficult without having access to the kinds of changes are being done (does tree-sitter report those?). OTOH, most of the time the most important part is the position of the beginning of the changes (e.g. for syntax-ppss), and we could treat the rest of the buffer as invalidated…
Oh you’re absolutely right, the range will be shifted by later edits in the buffer. It’ll be hella hairy to keep track of all that. Say the previous changed range is (100 . 200), and the user inserts 50 chars at position 150: we need to account for that and update the range to (100 . 250) before merging the new updated ranges with this one.
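To show the bookkeeping involved, here is a toy Python sketch of the arithmetic (not Emacs code). `shift_range` and `record_edit` are hypothetical names, and an edit is described as "replace old_len chars at start with new_len chars", which is an assumption about what information the consumer would have.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open (start, end) buffer positions


def shift_range(r: Range, start: int, old_len: int, new_len: int) -> Range:
    """Map range r across an edit replacing old_len chars at start with new_len chars."""
    delta = new_len - old_len
    old_end = start + old_len

    def move(pos: int, inside: int) -> int:
        if pos <= start:
            return pos          # before the edit: unaffected
        if pos >= old_end:
            return pos + delta  # after the edit: shifted by the size change
        return inside           # inside the replaced text: clamp to the edit

    return (move(r[0], start), move(r[1], start + new_len))


def record_edit(pending: List[Range], start: int, old_len: int, new_len: int) -> List[Range]:
    """Shift the pending ranges across the edit, add the edit's own range, merge overlaps."""
    moved = [shift_range(r, start, old_len, new_len) for r in pending]
    moved.append((start, start + new_len))
    moved.sort()
    merged: List[Range] = []
    for s, e in moved:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged
```

For the case above, `shift_range((100, 200), 150, 0, 50)` gives `(100, 250)`. Two 100-char inserts at 200 end up as one range 200..400, while two same-length replacements of 200..300 stay at 200..300.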
So it seems the best way is really to move treesit--pre-redisplay entirely into the primary parser’s notifier, WDYT?
Yuan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-27 22:03 ` Yuan Fu
@ 2024-05-27 22:24 ` Dmitry Gutov
2024-06-04 4:53 ` Yuan Fu
0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-05-27 22:24 UTC (permalink / raw)
To: Yuan Fu; +Cc: Ergus via Emacs development discussions.
On 28/05/2024 01:03, Yuan Fu wrote:
>> But if one operation just changes text in that range (keeping its length intact, e.g. capitalizing the whole region), and another does the same (back to lower case), then the combined range would remain 200..300.
>>
>> Computing that might be difficult without having access to the kinds of changes are being done (does tree-sitter report those?). OTOH, most of the time the most important part is the position of the beginning of the changes (e.g. for syntax-ppss), and we could treat the rest of the buffer as invalidated…
>
> Oh you’re absolutely right, the range will be shifted by later edits in the buffer. It’ll be hella hairy to keep track of all that—say the previous changed range is (100 . 200), and user inserted 50 chars in position 150, we need to account for that and update the range to (100 . 250) before merging the new updated ranges with this one.
>
> So it seems the best way is really to move treesit--pre-redisplay entirely into the primary parser’s notifier, WDYT?
Yep, that sounds easier. And the performance should be about the same,
even if it'd have a bit extra overhead in those theoretical complex cases.
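
[Editor's sketch, not part of the original message.] To make the idea concrete: a parser notifier is a function that tree-sitter support in Emacs calls after a reparse, with the list of changed ranges and the parser as arguments. The snippet below is a rough sketch of hooking range updates onto the primary parser that way. It is not the commit that was later pushed; the names `my-demo-update-local-ranges' and `my-demo-install-notifier' are hypothetical. It relies only on `treesit-parser-add-notifier', `treesit-parser-buffer' and `treesit-update-ranges'.

```elisp
(require 'treesit)

(defun my-demo-update-local-ranges (ranges parser)
  "Hypothetical notifier for the primary PARSER.
RANGES is a list of (BEG . END) regions that changed in the last
reparse.  Update the ranges of embedded parsers only there, instead
of doing the work on every redisplay."
  (with-current-buffer (treesit-parser-buffer parser)
    (dolist (range ranges)
      (treesit-update-ranges (car range) (cdr range)))))

(defun my-demo-install-notifier (primary-parser)
  "Attach `my-demo-update-local-ranges' to PRIMARY-PARSER."
  (treesit-parser-add-notifier primary-parser
                               #'my-demo-update-local-ranges))
```

Because the notifier receives the ranges at reparse time, there is no stored history of ranges that later edits could shift.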
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-27 22:24 ` Dmitry Gutov
@ 2024-06-04 4:53 ` Yuan Fu
2024-06-04 8:19 ` Vincenzo Pupillo
2024-06-06 23:59 ` Dmitry Gutov
0 siblings, 2 replies; 31+ messages in thread
From: Yuan Fu @ 2024-06-04 4:53 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Ergus via Emacs development discussions., Stefan Monnier
> On May 27, 2024, at 3:24 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 28/05/2024 01:03, Yuan Fu wrote:
>
>>> But if one operation just changes text in that range (keeping its length intact, e.g. capitalizing the whole region), and another does the same (back to lower case), then the combined range would remain 200..300.
>>>
>>> Computing that might be difficult without having access to the kinds of changes that are being done (does tree-sitter report those?). OTOH, most of the time the most important part is the position of the beginning of the changes (e.g. for syntax-ppss), and we could treat the rest of the buffer as invalidated…
>> Oh you’re absolutely right, the range will be shifted by later edits in the buffer. It’ll be hella hairy to keep track of all that—say the previous changed range is (100 . 200), and user inserted 50 chars in position 150, we need to account for that and update the range to (100 . 250) before merging the new updated ranges with this one.
>> So it seems the best way is really to move treesit--pre-redisplay entirely into the primary parser’s notifier, WDYT?
>
> Yep, that sounds easier. And the performance should be about the same, even if it'd have a bit extra overhead in those theoretical complex cases.
>
Ok, I pushed a commit to master that does just that. I tried with C’s block comment, and php-ts-mode. Everything seems to work fine.
I also added treesit-primary-parser. This is supposed to be another configuration variable that a major mode should set. I’ve encountered various cases where knowing the primary parser (parser that parses the entire buffer rather than just a subset of it) would be very helpful. Treesit-primary-parser can be auto-guessed if major mode doesn’t set it, so it shouldn’t break anything. I’d love to know yours and Stefan’s thoughts on it.
Yuan
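
[Editor's sketch, not part of the original message.] For readers wondering what setting the variable looks like in a major mode, here is a minimal sketch. The mode name `my-lang-ts-mode' and the grammar symbol `my-lang' are hypothetical placeholders; only `treesit-primary-parser', `treesit-ready-p', `treesit-parser-create' and `treesit-major-mode-setup' are real names, and the variable assumes an Emacs that includes the change described above.

```elisp
(require 'treesit)

(define-derived-mode my-lang-ts-mode prog-mode "MyLang"
  "Hypothetical tree-sitter major mode, for illustration only."
  (when (treesit-ready-p 'my-lang)
    ;; The parser that covers the whole buffer.  Embedded parsers
    ;; (e.g. for doc comments) are created separately through range
    ;; settings and are not the primary parser.
    (setq-local treesit-primary-parser
                (treesit-parser-create 'my-lang))
    (treesit-major-mode-setup)))
```

If a mode leaves the variable unset, the message above says it is guessed automatically, so setting it explicitly mainly removes ambiguity in buffers with several parsers.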
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-06-04 4:53 ` Yuan Fu
@ 2024-06-04 8:19 ` Vincenzo Pupillo
2024-06-04 15:01 ` Eli Zaretskii
2024-06-06 23:59 ` Dmitry Gutov
1 sibling, 1 reply; 31+ messages in thread
From: Vincenzo Pupillo @ 2024-06-04 8:19 UTC (permalink / raw)
To: Dmitry Gutov, emacs-devel
Cc: Ergus via Emacs development discussions., Stefan Monnier, Yuan Fu
Thank you Yuan!
Yesterday I tried your patch with my php-ts-mode (with and without treesit-primary-parser) and it works fine.
Yesterday I pushed a commit that uses the new variable.
When I work at home, php-ts-mode is the major-mode I use,
and is almost done (I just need to add in support for flymake).
Next week I hope to submit it for inclusion in emacs.
It also seems to me that it might be useful for trying out various
combinations of parsers.
Thanks
V.
On Tuesday, 4 June 2024 06:53:47 CEST, Yuan Fu wrote:
>
> > On May 27, 2024, at 3:24 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> >
> > On 28/05/2024 01:03, Yuan Fu wrote:
> >
> >>> But if one operation just changes text in that range (keeping its length intact, e.g. capitalizing the whole region), and another does the same (back to lower case), then the combined range would remain 200..300.
> >>>
> >>> Computing that might be difficult without having access to the kinds of changes that are being done (does tree-sitter report those?). OTOH, most of the time the most important part is the position of the beginning of the changes (e.g. for syntax-ppss), and we could treat the rest of the buffer as invalidated…
> >> Oh you’re absolutely right, the range will be shifted by later edits in the buffer. It’ll be hella hairy to keep track of all that—say the previous changed range is (100 . 200), and user inserted 50 chars in position 150, we need to account for that and update the range to (100 . 250) before merging the new updated ranges with this one.
> >> So it seems the best way is really to move treesit--pre-redisplay entirely into the primary parser’s notifier, WDYT?
> >
> > Yep, that sounds easier. And the performance should be about the same, even if it'd have a bit extra overhead in those theoretical complex cases.
> >
>
> Ok, I pushed a commit to master that does just that. I tried with C’s block comment, and php-ts-mode. Everything seems to work fine.
>
> I also added treesit-primary-parser. This is supposed to be another configuration variable that a major mode should set. I’ve encountered various cases where knowing the primary parser (parser that parses the entire buffer rather than just a subset of it) would be very helpful. Treesit-primary-parser can be auto-guessed if major mode doesn’t set it, so it shouldn’t break anything. I’d love to know yours and Stefan’s thoughts on it.
>
> Yuan
>
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-06-04 8:19 ` Vincenzo Pupillo
@ 2024-06-04 15:01 ` Eli Zaretskii
2024-06-04 20:44 ` Vincenzo Pupillo
0 siblings, 1 reply; 31+ messages in thread
From: Eli Zaretskii @ 2024-06-04 15:01 UTC (permalink / raw)
To: Vincenzo Pupillo; +Cc: dmitry, emacs-devel, emacs-devel, monnier, casouri
> From: Vincenzo Pupillo <vincenzo.pupillo@lpsd.it>
> Cc: "Ergus via Emacs development discussions." <emacs-devel@gnu.org>,
> Stefan Monnier <monnier@iro.umontreal.ca>, Yuan Fu <casouri@gmail.com>
> Date: Tue, 04 Jun 2024 10:19:30 +0200
>
> When I work at home, php-ts-mode is the major-mode I use,
> and is almost done (I just need to add in support for flymake).
> Next week I hope to submit it for inclusion in emacs.
If you want it included in Emacs 30, next week might be too late,
because I'm about to cut the emacs-30 release branch. Could you do
this sooner, perhaps?
Thanks.
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-06-04 15:01 ` Eli Zaretskii
@ 2024-06-04 20:44 ` Vincenzo Pupillo
0 siblings, 0 replies; 31+ messages in thread
From: Vincenzo Pupillo @ 2024-06-04 20:44 UTC (permalink / raw)
To: emacs-devel
Cc: dmitry, emacs-devel, emacs-devel, monnier, casouri, Eli Zaretskii
Ok I'll try.
Thank you.
V.
On Tuesday, 4 June 2024 17:01:18 CEST, Eli Zaretskii wrote:
> > From: Vincenzo Pupillo <vincenzo.pupillo@lpsd.it>
> > Cc: "Ergus via Emacs development discussions." <emacs-devel@gnu.org>,
> >
> > Stefan Monnier <monnier@iro.umontreal.ca>, Yuan Fu <casouri@gmail.com>
> >
> > Date: Tue, 04 Jun 2024 10:19:30 +0200
> >
> > When I work at home, php-ts-mode is the major-mode I use,
> > and is almost done (I just need to add in support for flymake).
> > Next week I hope to submit it for inclusion in emacs.
>
> If you want it included in Emacs 30, next week might be too late,
> because I'm about to cut the emacs-30 release branch. Could you do
> this sooner, perhaps?
>
> Thanks.
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-06-04 4:53 ` Yuan Fu
2024-06-04 8:19 ` Vincenzo Pupillo
@ 2024-06-06 23:59 ` Dmitry Gutov
2024-06-07 0:13 ` Dmitry Gutov
1 sibling, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-06-06 23:59 UTC (permalink / raw)
To: Yuan Fu; +Cc: Ergus via Emacs development discussions., Stefan Monnier
On 04/06/2024 07:53, Yuan Fu wrote:
>> Yep, that sounds easier. And the performance should be about the same, even if it'd have a bit extra overhead in those theoretical complex cases.
>>
> Ok, I pushed a commit to master that does just that. I tried with C’s block comment, and php-ts-mode. Everything seems to work fine.
>
> I also added treesit-primary-parser. This is supposed to be another configuration variable that a major mode should set. I’ve encountered various cases where knowing the primary parser (parser that parses the entire buffer rather than just a subset of it) would be very helpful. Treesit-primary-parser can be auto-guessed if major mode doesn’t set it, so it shouldn’t break anything. I’d love to know yours and Stefan’s thoughts on it.
Thanks Yuan, it's looking good.
The new variable probably deserves a mention in the manual, although if
the guesser fallback is reliable enough (which it seems to be, given
that we didn't hear reports about that specifically) maybe that's
not urgent.
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-06-06 23:59 ` Dmitry Gutov
@ 2024-06-07 0:13 ` Dmitry Gutov
2024-06-11 16:36 ` Yuan Fu
0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Gutov @ 2024-06-07 0:13 UTC (permalink / raw)
To: Yuan Fu; +Cc: Ergus via Emacs development discussions., Stefan Monnier
On 07/06/2024 02:59, Dmitry Gutov wrote:
> The new variable probably deserves a mention in the manual, although if
> the guesser fallback is reliable enough (which it seems to be, given
> that we didn't hear reports about that specifically) maybe that's
> not urgent.
Sorry, you have already documented it, of course. Just in a separate commit.
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-06-07 0:13 ` Dmitry Gutov
@ 2024-06-11 16:36 ` Yuan Fu
0 siblings, 0 replies; 31+ messages in thread
From: Yuan Fu @ 2024-06-11 16:36 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Ergus via Emacs development discussions., Stefan Monnier
> On Jun 6, 2024, at 5:13 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 07/06/2024 02:59, Dmitry Gutov wrote:
>> The new variable probably deserves a mention in the manual, although if the guesser fallback is reliable enough (which it seems to be, given that we didn't hear reports about that specifically) maybe that's not urgent.
>
> Sorry, you have already documented it, of course. Just in a separate commit.
Indeed 😁
Yuan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-09 0:16 ` Yuan Fu
2024-05-12 23:44 ` Dmitry Gutov
@ 2024-05-26 4:23 ` Stefan Monnier via Emacs development discussions.
2024-05-27 22:05 ` Yuan Fu
1 sibling, 1 reply; 31+ messages in thread
From: Stefan Monnier via Emacs development discussions. @ 2024-05-26 4:23 UTC (permalink / raw)
To: emacs-devel
> The only problem is to decide how long a history of updated ranges do we
> keep for each parser. The 100% correct approach is to maintain a separate
> history for each consumer, and never throw away old ranges until the
> consumer consumes them. But then you risk wasting memory if some consumer
> never consumes the ranges. To handle that we can add a hard limit. But then
> this hard limit might be too low for some edge case… We can make this hard
> limit configurable, and if we ever encountered a case where this hard limit
> is not enough and there’s no way around it (unlikely), we can instruct users
> or lisp program to increase it.
Side note: the above is fairly close to describing `track-changes.el`.
[ where I don't have any hard limit, currently. ]
Stefan
* Re: treesitter local parser: huge slowdown and memory usage in a long file
2024-05-26 4:23 ` Stefan Monnier via Emacs development discussions.
@ 2024-05-27 22:05 ` Yuan Fu
2024-05-27 22:34 ` Stefan Monnier
0 siblings, 1 reply; 31+ messages in thread
From: Yuan Fu @ 2024-05-27 22:05 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> On May 25, 2024, at 9:23 PM, Stefan Monnier via Emacs development discussions. <emacs-devel@gnu.org> wrote:
>
>> The only problem is to decide how long a history of updated ranges do we
>> keep for each parser. The 100% correct approach is to maintain a separate
>> history for each consumer, and never throw away old ranges until the
>> consumer consumes them. But then you risk wasting memory if some consumer
>> never consumes the ranges. To handle that we can add a hard limit. But then
>> this hard limit might be too low for some edge case… We can make this hard
>> limit configurable, and if we ever encountered a case where this hard limit
>> is not enough and there’s no way around it (unlikely), we can instruct users
>> or lisp program to increase it.
>
> Side note: the above is fairly close to describing `track-changes.el`.
> [ where I don't have any hard limit, currently. ]
>
I see. I suppose you had to shift “modified region” around according to later edits, right?
Yuan
Thread overview: 31+ messages
2024-02-11 21:53 treesitter local parser: huge slowdown and memory usage in a long file Vincenzo Pupillo
2024-02-12 4:16 ` Yuan Fu
2024-02-12 14:09 ` Eli Zaretskii
2024-02-13 8:15 ` Yuan Fu
2024-02-13 9:39 ` Vincenzo Pupillo
2024-02-13 12:59 ` Eli Zaretskii
2024-02-13 0:50 ` Dmitry Gutov
2024-02-13 8:08 ` Yuan Fu
2024-02-18 3:37 ` Dmitry Gutov
2024-02-19 5:53 ` Yuan Fu
2024-03-21 6:39 ` Yuan Fu
-- strict thread matches above, loose matches on Subject: below --
2024-04-20 2:18 Yuan Fu
2024-04-20 19:14 ` Vincenzo Pupillo
2024-04-23 5:09 ` Yuan Fu
2024-05-06 2:04 ` Dmitry Gutov
2024-05-09 0:16 ` Yuan Fu
2024-05-12 23:44 ` Dmitry Gutov
2024-05-22 5:51 ` Yuan Fu
2024-05-22 23:42 ` Dmitry Gutov
2024-05-27 22:03 ` Yuan Fu
2024-05-27 22:24 ` Dmitry Gutov
2024-06-04 4:53 ` Yuan Fu
2024-06-04 8:19 ` Vincenzo Pupillo
2024-06-04 15:01 ` Eli Zaretskii
2024-06-04 20:44 ` Vincenzo Pupillo
2024-06-06 23:59 ` Dmitry Gutov
2024-06-07 0:13 ` Dmitry Gutov
2024-06-11 16:36 ` Yuan Fu
2024-05-26 4:23 ` Stefan Monnier via Emacs development discussions.
2024-05-27 22:05 ` Yuan Fu
2024-05-27 22:34 ` Stefan Monnier
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git