* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Ihor Radchenko @ 2022-10-16 1:26 UTC
To: 58558
Hi,
I am consistently experiencing a significant slowdown of regexp search
in large buffers in Emacs 29 (both master and the feature/noverlay
branch), but not in Emacs 28:
ELP data (columns: function name, call count, elapsed time in seconds,
average time per call in seconds):
;; Emacs 29
;; re-search-forward 181593 10.090536098 5.556...e-05
;; re-search-forward 180625 8.7113028330 4.822...e-05
;; re-search-forward 177357 9.7315074570 5.486...e-05
;; Emacs 28
;; re-search-forward 171661 2.7219785009 1.585...e-05
(roughly 3.2x to 3.7x slowdown in total elapsed time)
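For anyone trying to reproduce this, timings like the ELP table above can be gathered roughly as follows. This is only a sketch: `my/elp-count-matches` is a hypothetical helper name, and the regexp and buffer are whatever you want to test. ELP only sees calls made from Lisp, not searches started from C code.

```elisp
(require 'elp)

(defun my/elp-count-matches (regexp)
  "Count matches of REGEXP after point while ELP times `re-search-forward'.
Hypothetical helper, for illustration only."
  (let ((count 0))
    (elp-instrument-function 're-search-forward)
    (elp-reset-all)
    (unwind-protect
        (save-excursion
          (while (and (not (eobp)) (re-search-forward regexp nil t))
            (setq count (1+ count))
            ;; Step past empty matches so the loop always advances.
            (when (and (= (match-beginning 0) (match-end 0)) (not (eobp)))
              (forward-char 1)))
          ;; Show call count, elapsed time, and average time per call.
          ;; This must run before the instrumentation is removed.
          (elp-results))
      ;; Remove the instrumentation again, even on error.
      (elp-restore-function 're-search-forward))
    count))
```

Calling `(my/elp-count-matches "^\\*+ ")` with point at the start of a large Org buffer should then show one `re-search-forward` row in the ELP results buffer.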
It happens consistently in Emacs 29, but not in all buffers. Sometimes
it only shows up some time after Emacs startup. The slowdown is not
there in Emacs 28.
The issue started a long time ago (over a year), but all my attempts to
bisect the problem either failed or landed on inconsistent bad commits.
The above slowdown should have nothing to do with ELP overhead.
I tested agenda generation times (the agenda uses a huge number of regexp
searches) by manually wrapping re-search-forward calls in a time
accumulator, with the following results:
Emacs 29. Note the Re-search time:
;; Mapped over elements in #<buffer notes.org>. 33/5592 predicate matches. Total time: 8.788400 sec. Pre-process time: 0.000000 sec. Predicate time: 0.604878 sec. Re-search time: 8.023365 sec.
;; Calling parameters: :granularity headline+inlinetask :restrict-elements (headline inlinetask) :next-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :fail-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :from-pos 321 :to-pos #<marker at 21071050 in notes.org> :limit-count nil :after-element nil
Emacs 28. Note the Re-search time:
;; Mapped over elements in #<buffer notes.org>. 33/5592 predicate matches. Total time: 1.396713 sec. Pre-process time: 0.000000 sec. Predicate time: 0.544486 sec. Re-search time: 0.708682 sec.
;; Calling parameters: :granularity headline+inlinetask :restrict-elements (headline inlinetask) :next-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :fail-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :from-pos 321 :to-pos #<marker at 21071050 in notes.org> :limit-count nil :after-element nil
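The manual time accumulator used for the "Re-search time" figures above can be sketched along these lines. This is a minimal illustration, not the exact code that produced the numbers: the names `my/re-search-time` and `my/with-search-timing` are hypothetical, and it measures wall-clock time only.

```elisp
(defvar my/re-search-time 0.0
  "Accumulated wall-clock seconds spent in timed searches.
Hypothetical variable, for illustration only.")

(defmacro my/with-search-timing (&rest body)
  "Run BODY, adding its elapsed wall-clock time to `my/re-search-time'.
Return the value of the last form in BODY."
  (declare (indent 0) (debug t))
  (let ((start (make-symbol "start")))
    `(let ((,start (current-time)))
       (unwind-protect
           (progn ,@body)
         ;; Accumulate even if BODY exits non-locally.
         (setq my/re-search-time
               (+ my/re-search-time
                  (float-time (time-since ,start))))))))

;; Example: time every search in a small throwaway buffer.
(with-temp-buffer
  (insert "* TODO a\n* DONE b\n")
  (goto-char (point-min))
  (setq my/re-search-time 0.0)
  (while (my/with-search-timing
           (re-search-forward "^\\*+ " nil t)))
  (message "Re-search time: %f sec" my/re-search-time))
```

Because the macro only calls `current-time` twice around each search, its own overhead is small and should be the same in Emacs 28 and 29, so it is unlikely to explain the difference between the two runs.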
Any idea what might be going on or how to debug this further?
In GNU Emacs 29.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
3.24.34, cairo version 1.16.0) of 2022-10-15 built on yantar92-laptop
Repository revision: b86505387480fed81629cbc81cef6b70098bd607
Repository branch: feature/noverlay
Windowing system distributor 'The X.Org Foundation', version 11.0.12101004
System Description: Gentoo Linux
Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND
SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM
GTK3 ZLIB
Important settings:
value of $LC_COLLATE: C
value of $LANG: en_US.utf8
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
windmove-mode: t
TeX-PDF-mode: t
pyvenv-mode: t
git-email-notmuch-mode: t
git-email-piem-mode: t
piem-notmuch-mode: t
org-edna-mode: t
eros-mode: t
pdf-occur-global-minor-mode: t
which-key-mode: t
diredfl-global-mode: t
dired-async-mode: t
winner-mode: t
eval-sexp-fu-flash-mode: t
global-flycheck-mode: t
flycheck-mode: t
el-patch-use-package-mode: t
global-git-commit-mode: t
magit-auto-revert-mode: t
recentf-mode: t
hl-todo-mode: t
pretty-symbols-mode: t
company-mode: t
persistent-scratch-autosave-mode: t
savehist-mode: t
helm-adaptive-mode: t
helm-mode: t
helm-minibuffer-history-mode: t
helm-ff-icon-mode: t
shell-dirtrack-mode: t
helm--remap-mouse-mode: t
async-bytecomp-package-mode: t
boon-mode: t
boon-local-mode: t
global-hl-line-mode: t
global-page-break-lines-mode: t
page-break-lines-mode: t
shackle-mode: t
override-global-mode: t
straight-use-package-mode: t
straight-package-neutering-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
global-prettify-symbols-mode: t
prettify-symbols-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
window-divider-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
abbrev-mode: t
Load-path shadows:
/home/yantar92/.emacs.d/straight/build/transient/transient hides /home/yantar92/Git/emacs/lisp/transient
/home/yantar92/.emacs.d/straight/build/xref/xref hides /home/yantar92/Git/emacs/lisp/progmodes/xref
/home/yantar92/.emacs.d/straight/build/project/project hides /home/yantar92/Git/emacs/lisp/progmodes/project
/home/yantar92/.emacs.d/straight/build/org/ox-publish hides /home/yantar92/Git/emacs/lisp/org/ox-publish
/home/yantar92/.emacs.d/straight/build/org/ox-org hides /home/yantar92/Git/emacs/lisp/org/ox-org
/home/yantar92/.emacs.d/straight/build/org/ox-odt hides /home/yantar92/Git/emacs/lisp/org/ox-odt
/home/yantar92/.emacs.d/straight/build/org/org hides /home/yantar92/Git/emacs/lisp/org/org
/home/yantar92/.emacs.d/straight/build/org/ox-md hides /home/yantar92/Git/emacs/lisp/org/ox-md
/home/yantar92/.emacs.d/straight/build/org/ox-man hides /home/yantar92/Git/emacs/lisp/org/ox-man
/home/yantar92/.emacs.d/straight/build/org/ox-latex hides /home/yantar92/Git/emacs/lisp/org/ox-latex
/home/yantar92/.emacs.d/straight/build/org/ox-koma-letter hides /home/yantar92/Git/emacs/lisp/org/ox-koma-letter
/home/yantar92/.emacs.d/straight/build/org/ox-icalendar hides /home/yantar92/Git/emacs/lisp/org/ox-icalendar
/home/yantar92/.emacs.d/straight/build/org/ox-html hides /home/yantar92/Git/emacs/lisp/org/ox-html
/home/yantar92/.emacs.d/straight/build/org/ox-ascii hides /home/yantar92/Git/emacs/lisp/org/ox-ascii
/home/yantar92/.emacs.d/straight/build/org/ox-beamer hides /home/yantar92/Git/emacs/lisp/org/ox-beamer
/home/yantar92/.emacs.d/straight/build/org/org-timer hides /home/yantar92/Git/emacs/lisp/org/org-timer
/home/yantar92/.emacs.d/straight/build/org/org-tempo hides /home/yantar92/Git/emacs/lisp/org/org-tempo
/home/yantar92/.emacs.d/straight/build/org/org-table hides /home/yantar92/Git/emacs/lisp/org/org-table
/home/yantar92/.emacs.d/straight/build/org/org-src hides /home/yantar92/Git/emacs/lisp/org/org-src
/home/yantar92/.emacs.d/straight/build/org/org-protocol hides /home/yantar92/Git/emacs/lisp/org/org-protocol
/home/yantar92/.emacs.d/straight/build/org/org-plot hides /home/yantar92/Git/emacs/lisp/org/org-plot
/home/yantar92/.emacs.d/straight/build/org/org-refile hides /home/yantar92/Git/emacs/lisp/org/org-refile
/home/yantar92/.emacs.d/straight/build/org/org-mouse hides /home/yantar92/Git/emacs/lisp/org/org-mouse
/home/yantar92/.emacs.d/straight/build/org/org-num hides /home/yantar92/Git/emacs/lisp/org/org-num
/home/yantar92/.emacs.d/straight/build/org/org-mobile hides /home/yantar92/Git/emacs/lisp/org/org-mobile
/home/yantar92/.emacs.d/straight/build/org/org-lint hides /home/yantar92/Git/emacs/lisp/org/org-lint
/home/yantar92/.emacs.d/straight/build/org/org-pcomplete hides /home/yantar92/Git/emacs/lisp/org/org-pcomplete
/home/yantar92/.emacs.d/straight/build/org/org-inlinetask hides /home/yantar92/Git/emacs/lisp/org/org-inlinetask
/home/yantar92/.emacs.d/straight/build/org/org-list hides /home/yantar92/Git/emacs/lisp/org/org-list
/home/yantar92/.emacs.d/straight/build/org/org-indent hides /home/yantar92/Git/emacs/lisp/org/org-indent
/home/yantar92/.emacs.d/straight/build/org/org-macs hides /home/yantar92/Git/emacs/lisp/org/org-macs
/home/yantar92/.emacs.d/straight/build/org/org-id hides /home/yantar92/Git/emacs/lisp/org/org-id
/home/yantar92/.emacs.d/straight/build/org/org-loaddefs hides /home/yantar92/Git/emacs/lisp/org/org-loaddefs
/home/yantar92/.emacs.d/straight/build/org/org-habit hides /home/yantar92/Git/emacs/lisp/org/org-habit
/home/yantar92/.emacs.d/straight/build/org/org-goto hides /home/yantar92/Git/emacs/lisp/org/org-goto
/home/yantar92/.emacs.d/straight/build/org/org-keys hides /home/yantar92/Git/emacs/lisp/org/org-keys
/home/yantar92/.emacs.d/straight/build/org/org-feed hides /home/yantar92/Git/emacs/lisp/org/org-feed
/home/yantar92/.emacs.d/straight/build/org/org-datetree hides /home/yantar92/Git/emacs/lisp/org/org-datetree
/home/yantar92/.emacs.d/straight/build/org/org-ctags hides /home/yantar92/Git/emacs/lisp/org/org-ctags
/home/yantar92/.emacs.d/straight/build/org/org-agenda hides /home/yantar92/Git/emacs/lisp/org/org-agenda
/home/yantar92/.emacs.d/straight/build/org/org-footnote hides /home/yantar92/Git/emacs/lisp/org/org-footnote
/home/yantar92/.emacs.d/straight/build/org/org-faces hides /home/yantar92/Git/emacs/lisp/org/org-faces
/home/yantar92/.emacs.d/straight/build/org/org-entities hides /home/yantar92/Git/emacs/lisp/org/org-entities
/home/yantar92/.emacs.d/straight/build/org/org-duration hides /home/yantar92/Git/emacs/lisp/org/org-duration
/home/yantar92/.emacs.d/straight/build/org/org-colview hides /home/yantar92/Git/emacs/lisp/org/org-colview
/home/yantar92/.emacs.d/straight/build/org/org-compat hides /home/yantar92/Git/emacs/lisp/org/org-compat
/home/yantar92/.emacs.d/straight/build/org/org-clock hides /home/yantar92/Git/emacs/lisp/org/org-clock
/home/yantar92/.emacs.d/straight/build/org/org-crypt hides /home/yantar92/Git/emacs/lisp/org/org-crypt
/home/yantar92/.emacs.d/straight/build/org/org-attach-git hides /home/yantar92/Git/emacs/lisp/org/org-attach-git
/home/yantar92/.emacs.d/straight/build/org/org-attach hides /home/yantar92/Git/emacs/lisp/org/org-attach
/home/yantar92/.emacs.d/straight/build/org/org-capture hides /home/yantar92/Git/emacs/lisp/org/org-capture
/home/yantar92/.emacs.d/straight/build/org/org-archive hides /home/yantar92/Git/emacs/lisp/org/org-archive
/home/yantar92/.emacs.d/straight/build/org/ol-gnus hides /home/yantar92/Git/emacs/lisp/org/ol-gnus
/home/yantar92/.emacs.d/straight/build/org/ol-w3m hides /home/yantar92/Git/emacs/lisp/org/ol-w3m
/home/yantar92/.emacs.d/straight/build/org/ol-mhe hides /home/yantar92/Git/emacs/lisp/org/ol-mhe
/home/yantar92/.emacs.d/straight/build/org/ol-rmail hides /home/yantar92/Git/emacs/lisp/org/ol-rmail
/home/yantar92/.emacs.d/straight/build/org/ol-eww hides /home/yantar92/Git/emacs/lisp/org/ol-eww
/home/yantar92/.emacs.d/straight/build/org/ol-irc hides /home/yantar92/Git/emacs/lisp/org/ol-irc
/home/yantar92/.emacs.d/straight/build/org/ol-man hides /home/yantar92/Git/emacs/lisp/org/ol-man
/home/yantar92/.emacs.d/straight/build/org/ol-info hides /home/yantar92/Git/emacs/lisp/org/ol-info
/home/yantar92/.emacs.d/straight/build/org/ob-fortran hides /home/yantar92/Git/emacs/lisp/org/ob-fortran
/home/yantar92/.emacs.d/straight/build/org/ol-eshell hides /home/yantar92/Git/emacs/lisp/org/ol-eshell
/home/yantar92/.emacs.d/straight/build/org/ol-doi hides /home/yantar92/Git/emacs/lisp/org/ol-doi
/home/yantar92/.emacs.d/straight/build/org/ol-docview hides /home/yantar92/Git/emacs/lisp/org/ol-docview
/home/yantar92/.emacs.d/straight/build/org/ol-bibtex hides /home/yantar92/Git/emacs/lisp/org/ol-bibtex
/home/yantar92/.emacs.d/straight/build/org/ol-bbdb hides /home/yantar92/Git/emacs/lisp/org/ol-bbdb
/home/yantar92/.emacs.d/straight/build/org/oc-natbib hides /home/yantar92/Git/emacs/lisp/org/oc-natbib
/home/yantar92/.emacs.d/straight/build/org/oc-csl hides /home/yantar92/Git/emacs/lisp/org/oc-csl
/home/yantar92/.emacs.d/straight/build/org/oc-basic hides /home/yantar92/Git/emacs/lisp/org/oc-basic
/home/yantar92/.emacs.d/straight/build/org/oc-biblatex hides /home/yantar92/Git/emacs/lisp/org/oc-biblatex
/home/yantar92/.emacs.d/straight/build/org/ob hides /home/yantar92/Git/emacs/lisp/org/ob
/home/yantar92/.emacs.d/straight/build/org/ob-tangle hides /home/yantar92/Git/emacs/lisp/org/ob-tangle
/home/yantar92/.emacs.d/straight/build/org/ob-sql hides /home/yantar92/Git/emacs/lisp/org/ob-sql
/home/yantar92/.emacs.d/straight/build/org/ob-sqlite hides /home/yantar92/Git/emacs/lisp/org/ob-sqlite
/home/yantar92/.emacs.d/straight/build/org/ob-table hides /home/yantar92/Git/emacs/lisp/org/ob-table
/home/yantar92/.emacs.d/straight/build/org/ob-shell hides /home/yantar92/Git/emacs/lisp/org/ob-shell
/home/yantar92/.emacs.d/straight/build/org/ob-sed hides /home/yantar92/Git/emacs/lisp/org/ob-sed
/home/yantar92/.emacs.d/straight/build/org/ob-screen hides /home/yantar92/Git/emacs/lisp/org/ob-screen
/home/yantar92/.emacs.d/straight/build/org/ob-scheme hides /home/yantar92/Git/emacs/lisp/org/ob-scheme
/home/yantar92/.emacs.d/straight/build/org/ob-C hides /home/yantar92/Git/emacs/lisp/org/ob-C
/home/yantar92/.emacs.d/straight/build/org/ob-sass hides /home/yantar92/Git/emacs/lisp/org/ob-sass
/home/yantar92/.emacs.d/straight/build/org/ob-ruby hides /home/yantar92/Git/emacs/lisp/org/ob-ruby
/home/yantar92/.emacs.d/straight/build/org/ob-python hides /home/yantar92/Git/emacs/lisp/org/ob-python
/home/yantar92/.emacs.d/straight/build/org/ob-processing hides /home/yantar92/Git/emacs/lisp/org/ob-processing
/home/yantar92/.emacs.d/straight/build/org/ob-plantuml hides /home/yantar92/Git/emacs/lisp/org/ob-plantuml
/home/yantar92/.emacs.d/straight/build/org/ob-ref hides /home/yantar92/Git/emacs/lisp/org/ob-ref
/home/yantar92/.emacs.d/straight/build/org/ob-perl hides /home/yantar92/Git/emacs/lisp/org/ob-perl
/home/yantar92/.emacs.d/straight/build/org/ob-octave hides /home/yantar92/Git/emacs/lisp/org/ob-octave
/home/yantar92/.emacs.d/straight/build/org/ob-org hides /home/yantar92/Git/emacs/lisp/org/ob-org
/home/yantar92/.emacs.d/straight/build/org/ob-ocaml hides /home/yantar92/Git/emacs/lisp/org/ob-ocaml
/home/yantar92/.emacs.d/straight/build/org/ob-maxima hides /home/yantar92/Git/emacs/lisp/org/ob-maxima
/home/yantar92/.emacs.d/straight/build/org/ob-matlab hides /home/yantar92/Git/emacs/lisp/org/ob-matlab
/home/yantar92/.emacs.d/straight/build/org/ob-makefile hides /home/yantar92/Git/emacs/lisp/org/ob-makefile
/home/yantar92/.emacs.d/straight/build/org/ob-lua hides /home/yantar92/Git/emacs/lisp/org/ob-lua
/home/yantar92/.emacs.d/straight/build/org/ob-lisp hides /home/yantar92/Git/emacs/lisp/org/ob-lisp
/home/yantar92/.emacs.d/straight/build/org/ob-lilypond hides /home/yantar92/Git/emacs/lisp/org/ob-lilypond
/home/yantar92/.emacs.d/straight/build/org/ob-lob hides /home/yantar92/Git/emacs/lisp/org/ob-lob
/home/yantar92/.emacs.d/straight/build/org/ob-latex hides /home/yantar92/Git/emacs/lisp/org/ob-latex
/home/yantar92/.emacs.d/straight/build/org/ob-julia hides /home/yantar92/Git/emacs/lisp/org/ob-julia
/home/yantar92/.emacs.d/straight/build/org/ob-java hides /home/yantar92/Git/emacs/lisp/org/ob-java
/home/yantar92/.emacs.d/straight/build/org/ob-js hides /home/yantar92/Git/emacs/lisp/org/ob-js
/home/yantar92/.emacs.d/straight/build/org/ob-haskell hides /home/yantar92/Git/emacs/lisp/org/ob-haskell
/home/yantar92/.emacs.d/straight/build/org/ob-gnuplot hides /home/yantar92/Git/emacs/lisp/org/ob-gnuplot
/home/yantar92/.emacs.d/straight/build/org/ob-groovy hides /home/yantar92/Git/emacs/lisp/org/ob-groovy
/home/yantar92/.emacs.d/straight/build/org/ob-forth hides /home/yantar92/Git/emacs/lisp/org/ob-forth
/home/yantar92/.emacs.d/straight/build/org/ob-exp hides /home/yantar92/Git/emacs/lisp/org/ob-exp
/home/yantar92/.emacs.d/straight/build/org/ob-eval hides /home/yantar92/Git/emacs/lisp/org/ob-eval
/home/yantar92/.emacs.d/straight/build/org/ob-eshell hides /home/yantar92/Git/emacs/lisp/org/ob-eshell
/home/yantar92/.emacs.d/straight/build/org/ob-dot hides /home/yantar92/Git/emacs/lisp/org/ob-dot
/home/yantar92/.emacs.d/straight/build/org/ob-ditaa hides /home/yantar92/Git/emacs/lisp/org/ob-ditaa
/home/yantar92/.emacs.d/straight/build/org/ob-css hides /home/yantar92/Git/emacs/lisp/org/ob-css
/home/yantar92/.emacs.d/straight/build/org/ob-core hides /home/yantar92/Git/emacs/lisp/org/ob-core
/home/yantar92/.emacs.d/straight/build/org/ob-emacs-lisp hides /home/yantar92/Git/emacs/lisp/org/ob-emacs-lisp
/home/yantar92/.emacs.d/straight/build/org/ob-calc hides /home/yantar92/Git/emacs/lisp/org/ob-calc
/home/yantar92/.emacs.d/straight/build/org/ob-clojure hides /home/yantar92/Git/emacs/lisp/org/ob-clojure
/home/yantar92/.emacs.d/straight/build/org/ob-R hides /home/yantar92/Git/emacs/lisp/org/ob-R
/home/yantar92/.emacs.d/straight/build/org/ob-comint hides /home/yantar92/Git/emacs/lisp/org/ob-comint
/home/yantar92/.emacs.d/straight/build/org/ob-awk hides /home/yantar92/Git/emacs/lisp/org/ob-awk
/home/yantar92/.emacs.d/straight/build/org/org-element hides /home/yantar92/Git/emacs/lisp/org/org-element
/home/yantar92/.emacs.d/straight/build/org/ox hides /home/yantar92/Git/emacs/lisp/org/ox
/home/yantar92/.emacs.d/straight/build/org/ox-texinfo hides /home/yantar92/Git/emacs/lisp/org/ox-texinfo
/home/yantar92/.emacs.d/straight/build/org/ol hides /home/yantar92/Git/emacs/lisp/org/ol
/home/yantar92/.emacs.d/straight/build/org/oc hides /home/yantar92/Git/emacs/lisp/org/oc
/home/yantar92/.emacs.d/straight/build/org/org-macro hides /home/yantar92/Git/emacs/lisp/org/org-macro
/home/yantar92/.emacs.d/straight/build/org/org-version hides /home/yantar92/Git/emacs/lisp/org/org-version
/home/yantar92/.emacs.d/straight/build/map/map hides /home/yantar92/Git/emacs/lisp/emacs-lisp/map
/home/yantar92/.emacs.d/straight/build/let-alist/let-alist hides /home/yantar92/Git/emacs/lisp/emacs-lisp/let-alist
Features:
(shadow emacsbug org-datetree elfeed-link windmove make-mode
gnuplot-context gnuplot org-test ert-x ert finder autoinsert vc-hg
vc-bzr vc-src vc-sccs vc-svn vc-cvs vc-rcs log-view helm-imenu latexenc
oc-bibtex textsec uni-scripts idna-mapping ucs-normalize uni-confusable
textsec-check helm-ring footnote descr-text dired-open
all-the-icons-dired dired-filter dired-hide-dotfiles misearch
multi-isearch cal-move org-learn network-stream url-cache preview
font-latex w3m-form w3m-symbol tabify latex latex-flymake tex-ispell
tex-style tex pdf-sync pdf-outline pdf-links pdf-history w3m doc-view
w3m-hist w3m-fb bookmark-w3m w3m-ems w3m-favicon w3m-image tab-line
w3m-proc w3m-util boon-moves er-basic-expansions expand-region-core
expand-region-custom tex-mode compare-w mm-archive helm-command
helm-elisp helm-eval helm-x-files helm-for-files helm-bookmark
helm-external helm-net boon-main boon-hl boon-arguments multiple-cursors
mc-separate-operations rectangular-region-mode mc-mark-pop mc-edit-lines
mc-hide-unmatched-lines-mode mc-mark-more mc-cycle-cursors
multiple-cursors-core boon-regs boon-utils cl-print tramp-archive
tramp-gvfs cal-iso org-duration ffap org-table-sticky-header oc-basic
highlight-indentation flymake-proc flymake elpy elpy-rpc pyvenv eshell
esh-cmd esh-ext esh-opt esh-proc esh-io esh-arg esh-module esh-groups
esh-util elpy-shell elpy-profile elpy-django elpy-refactor grep
git-email-magit magit-patch git-email-notmuch git-email-piem git-email
git-email-autoloads project-autoloads xref-autoloads piem-notmuch piem
piem-maildir mail-extr piem-autoloads org-crypt helm-notmuch
helm-notmuch-autoloads ol-notmuch ol-notmuch-autoloads org-eldoc
org-table-sticky-header-autoloads posframe posframe-autoloads ob-async
ob-async-autoloads ob-latex ob-dot ob-calc calc-store calc-trail
ob-gnuplot ob-ditaa ob-C cc-mode cc-fonts cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-langs cc-vars cc-defs cc-bytecomp
ob-python ob-perl ob-org ob-shell ob-mathematica
ob-mathematica-autoloads org-tempo tempo org-archive ox-md ox-beamer
engrave-faces engrave-faces-autoloads ox-extra orgdiff orgdiff-autoloads
doct ya-org-capture ya-org-capture-autoloads doct-autoloads
org-capture-pop-frame org-capture-pop-frame-autoloads org-protocol
org-analyzer-autoloads pomidor-autoloads alert-autoloads log4e-autoloads
gntp-autoloads helm-org-ql helm-org org-clock org-autosort
org-autosort-autoloads helm-org-contacts helm-org-contacts-autoloads
org-contacts gnus-art mm-uu mml2015 gnus-sum gnus-group mm-url gnus-undo
gnus-start gnus-dbus gnus-cloud nnimap nnmail mail-source utf7 nnoo
gnus-spec gnus-int gnus-range gnus-win gnus org-contacts-autoloads
helm-org-ql-autoloads helm-org-autoloads org-ql-search org-ql-view ov
org-super-agenda org-ql peg ts org-ql-autoloads peg-autoloads
ov-autoloads org-super-agenda-autoloads ts-autoloads org-quick-peek
org-quick-peek-autoloads calfw-org calfw-org-autoloads calfw holidays
holiday-loaddefs calfw-autoloads org-attach cdlatex reftex
reftex-loaddefs reftex-vars texmathp cdlatex-autoloads org-capture-ref
org-ref-url-utils org-ref org-ref-core org-ref-glossary org-ref-bibtex
avy doi-utils org-ref-utils org-ref-export citeproc citeproc-itemgetters
citeproc-biblatex citeproc-bibtex ol-bibtex citeproc-cite
citeproc-subbibs citeproc-sort citeproc-name citeproc-formatters
citeproc-number rst citeproc-proc citeproc-disamb citeproc-itemdata
citeproc-generic-elements citeproc-macro citeproc-choose citeproc-date
citeproc-context citeproc-prange citeproc-style citeproc-locale
citeproc-term citeproc-rt citeproc-lib citeproc-s queue ox-pandoc ox-org
ox-odt rng-loc rng-uri rng-parse rng-match rng-dt rng-util rng-pttrn
nxml-parse nxml-ns nxml-enc xmltok nxml-util ox-latex ox-icalendar
ox-html table ox-ascii ox-publish ox org-ref-misc-links
org-ref-label-link org-ref-ref-links org-ref-citation-links
org-ref-bibliography-links bibtex-completion biblio biblio-download
biblio-dissemin biblio-ieee biblio-hal biblio-dblp biblio-crossref
biblio-arxiv timezone biblio-doi biblio-core ido parsebib bibtex
org-ref-autoloads ox-pandoc-autoloads citeproc-autoloads
string-inflection-autoloads queue-autoloads bibtex-completion-autoloads
biblio-autoloads biblio-core-autoloads parsebib-autoloads
htmlize-autoloads scimax-inkscape scimax-inkscape-autoloads org-pdftools
pdf-annot facemenu org-noter org-pdftools-autoloads org-noter-autoloads
org-capture org-checklist org-habit org-edna org-edna-autoloads
org-inlinetask org-drill persist org-agenda org-drill-autoloads
persist-autoloads ol-info ol-w3m ol-doi org-link-doi speed-type
speed-type-autoloads ement ement-notify ement-room ement-lib ement-api
ement-structs ement-macros warnings dns ement-autoloads
svg-lib-autoloads taxy-magit-section-autoloads taxy-autoloads
map-autoloads plz plz-autoloads 0x0 0x0-autoloads notmuch-calendar-x
notmuch-calendar-x-autoloads notmuch notmuch-tree notmuch-jump
notmuch-hello notmuch-show notmuch-print notmuch-crypto notmuch-mua
notmuch-message notmuch-draft notmuch-maildir-fcc notmuch-address
notmuch-company notmuch-parser notmuch-wash coolj notmuch-query
goto-addr icalendar diary-lib diary-loaddefs notmuch-tag notmuch-lib
notmuch-version notmuch-compat w3m-autoloads elfeed-score
elfeed-score-maint elfeed-score-scoring elfeed-score-serde
elfeed-score-rule-stats elfeed-org org-element org-persist
elfeed-org-autoloads quick-peek quick-peek-autoloads elfeed-show
elfeed-search hideshow display-fill-column-indicator eros
rainbow-delimiters highlight-numbers parent-mode easy-escape
license-snippets yasnippet-snippets-autoloads yasnippet-snippets
yasnippet elfeed-csv elfeed elfeed-curl elfeed-log elfeed-db elfeed-lib
avl-tree url-queue xml-query elfeed-score-rules elfeed-score-log
elfeed-score-autoloads elfeed-autoloads ytel-show-autoloads ytel
ytel-autoloads qrencode-el-autoloads tb-keycast tb-keycast-autoloads
gif-screencast gif-screencast-autoloads yaml-mode yaml-mode-autoloads
mingus libmpdee cl mingus-autoloads libmpdee-autoloads calctex calc-sel
calctex-autoloads shell-pop-autoloads eterm-256color-autoloads
xterm-color-autoloads vterm term ehelp vterm-module term/xterm xterm
vterm-autoloads diffpdf diffpdf-autoloads pdf-view-restore
pdf-view-restore-autoloads pdf-occur ibuf-ext ibuffer ibuffer-loaddefs
tablist tablist-filter semantic/wisent/comp semantic/wisent
semantic/wisent/wisent semantic/util-modes semantic/util semantic
semantic/tag semantic/lex semantic/fw mode-local cedet pdf-isearch
pdf-misc pdf-tools pdf-roll pdf-view jka-compr pdf-cache pdf-info tq
pdf-util pdf-macs pdf-tools-autoloads tablist-autoloads image-roll
image-roll-autoloads wolfram-mode wolfram-mode-autoloads
ledger-mode-autoloads auctex-autoloads tex-site ebuild-mode skeleton
sh-script smie executable ebuild-mode-autoloads lua-mode
lua-mode-autoloads gnuplot-autoloads eros-autoloads nameless
nameless-autoloads paredit paredit-autoloads company-jedi
company-jedi-autoloads jedi jedi-core python-environment epc ctable
concurrent auto-complete jedi-autoloads auto-complete-autoloads
jedi-core-autoloads python-environment-autoloads epc-autoloads
ctable-autoloads concurrent-autoloads elpy-autoloads pyvenv-autoloads
highlight-indentation-autoloads python helm-info which-key
which-key-autoloads helm-descbinds helm-descbinds-autoloads elisp-demos
elisp-demos-autoloads helpful info-look help-fns elisp-refs
helpful-autoloads elisp-refs-autoloads tldr tldr-autoloads
lsp-ui-autoloads lsp-mode-autoloads spinner-autoloads macrostep
macrostep-autoloads highlight-refontification
highlight-refontification-autoloads font-lock-profiler
font-lock-profiler-autoloads font-lock-studio font-lock-studio-autoloads
memory-usage memory-usage-autoloads bug-hunter bug-hunter-autoloads
lorem-ipsum lorem-ipsum-autoloads license-snippets-autoloads
yasnippet-autoloads move-text move-text-autoloads aggressive-indent
aggressive-indent-autoloads visual-regexp-autoloads magit-bookmark
bookmark mule-util helm-bm helm-bm-autoloads bm bm-autoloads helm-dash
dash-docs helm-dash-autoloads dash-docs-autoloads disk-usage
disk-usage-autoloads dired-git-info-autoloads
dired-hide-dotfiles-autoloads dired-filter-autoloads diredfl
diredfl-autoloads all-the-icons-dired-autoloads dired-async
dired-open-autoloads dired-avfs dired-avfs-autoloads
dired-narrow-autoloads dired-hacks-utils dired-hacks-utils-autoloads
dired+ image-file image-converter dired-x dired-aux dired+-autoloads
winner windower emacs-windower-autoloads goggles pulse skip-buffers-mode
avy-autoloads eval-sexp-fu eval-sexp-fu-autoloads goggles-autoloads
easy-escape-autoloads highlight-numbers-autoloads parent-mode-autoloads
rainbow-delimiters-autoloads highlight-parentheses
highlight-parentheses-autoloads flycheck-tip error-tip notifications
dbus popup flycheck-tip-autoloads flycheck flycheck-autoloads
pkg-info-autoloads epl-autoloads wordnut wordnut-history wordnut-u
wordnut-autoloads smog smog-autoloads writegood-mode
writegood-mode-autoloads langtool-ignore-fonts
langtool-ignore-fonts-autoloads langtool compile langtool-autoloads
el-patch-autoloads el-patch el-patch-stub flyspell ispell hi-lock ediff
ediff-merg ediff-mult ediff-wind ediff-diff ediff-help ediff-init
ediff-util browse-at-remote vc-git vc-dir ewoc vc vc-dispatcher f
f-shortdoc shortdoc browse-at-remote-autoloads f-autoloads code-review
code-review-actions code-review-comment code-review-section
code-review-bitbucket code-review-faces shr pixel-fill kinsoku url-file
svg xml dom emojify apropos tar-mode arc-mode archive-mode ht
code-review-gitlab code-review-utils code-review-parse-hunk
code-review-github code-review-db uuidgen calc-misc calc-ext calc
calc-loaddefs rect calc-macs a code-review-interfaces deferred
forge-list forge-commands forge-semi forge-bitbucket buck forge-gogs
gogs forge-gitea gtea forge-gitlab glab forge-github ghub-graphql treepy
gsexp ghub forge-notify forge-revnote forge-pullreq forge-issue
forge-topic yaml bug-reference forge-post markdown-mode thingatpt
forge-repo forge forge-core forge-db closql emacsql-sqlite emacsql
emacsql-compiler url-http url-auth url-gw nsm magit-submodule
magit-obsolete magit-blame magit-stash magit-reflog magit-bisect
magit-push magit-pull magit-fetch magit-clone magit-remote magit-commit
magit-sequence magit-notes magit-worktree magit-tag magit-merge
magit-branch magit-reset magit-files magit-refs magit-status magit
package let-alist browse-url url-handlers magit-repos magit-apply
magit-wip magit-log which-func imenu edebug debug backtrace magit-diff
smerge-mode diff diff-mode git-commit log-edit message sendmail
yank-media rfc822 mml mailabbrev nnheader range mail-utils gmm-utils
mailheader pcvs-util add-log magit-core magit-autorevert magit-margin
magit-transient magit-process with-editor magit-mode transient magit-git
magit-base magit-section crm compat-27 compat-26 code-review-autoloads
emojify-autoloads ht-autoloads deferred-autoloads uuidgen-autoloads
a-autoloads forge-autoloads yaml-autoloads markdown-mode-autoloads
ghub-autoloads treepy-autoloads let-alist-autoloads
emacsql-sqlite-autoloads emacsql-autoloads closql-autoloads
magit-autoloads magit-section-autoloads git-commit-autoloads
with-editor-autoloads transient-autoloads autorevert recentf tree-widget
disp-table hl-todo pretty-symbols company-oddmuse company-keywords
company-etags etags fileloop generator xref project company-gtags
company-dabbrev-code company-dabbrev company-files company-clang
company-capf company-cmake company-semantic company-template
company-bbdb company persistent-scratch persistent-scratch-autoloads
savehist backup-walker-autoloads company-autoloads helm-adaptive
helm-mode helm-misc helm-files image-dired image-dired-tags
image-dired-external image-dired-util xdg image-mode dired desktop
frameset dired-loaddefs exif filenotify tramp tramp-cache time-stamp
tramp-loaddefs trampver tramp-integration cus-edit pp cus-load wid-edit
tramp-compat shell parse-time iso8601 ls-lisp helm-buffers helm-occur
helm-tags helm-locate helm-grep helm-regexp helm-utils helm-help
helm-types helm helm-global-bindings helm-easymenu helm-core
async-bytecomp helm-source helm-multi-match helm-lib helm-autoloads
popup-autoloads helm-core-autoloads face-remap pyim pyim-cloudim url
url-proxy url-privacy url-expand url-methods url-history url-cookie
url-domsuf mm-view mml-smime mml-sec epa epg rfc6068 epg-config
gnus-util text-property-search smime gnutls puny dig mm-decode mm-bodies
mm-encode mail-parse rfc2231 rfc2047 rfc2045 mm-util ietf-drums
mail-prsvr mailcap pyim-probe pyim-preview pyim-page pyim-indicator
pyim-dregcache pyim-dhashcache sort pyim-dict async pyim-autoselector
pyim-process pyim-punctuation pyim-outcome pyim-candidates pyim-cstring
pyim-cregexp xr pyim-codes pyim-imobjs pyim-pinyin pyim-entered
pyim-dcache url-util url-parse auth-source eieio eieio-core
password-cache json map url-vars pyim-pymap pyim-scheme pyim-common
pyim-autoloads xr-autoloads async-autoloads reverse-im quail
reverse-im-autoloads hydra lv boon-qwerty color olivetti straight-x boon
boon-keys boon-core advice boon-loaddefs boon-autoloads
multiple-cursors-autoloads expand-region-autoloads meta-functions org-id
org-refile dash meta-functions-autoloads dash-autoloads hl-line memoize
memoize-autoloads info-colors info-colors-autoloads hl-todo-autoloads
latex-pretty-symbols latex-pretty-symbols-autoloads
pretty-symbols-autoloads page-break-lines page-break-lines-autoloads
edmacro kmacro adaptive-wrap adaptive-wrap-autoloads olivetti-autoloads
shackle trace shackle-autoloads use-package-diminish all-the-icons
all-the-icons-faces data-material data-weathericons data-octicons
data-fileicons data-faicons data-alltheicons all-the-icons-autoloads org
ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-footnote
org-src ob-comint org-pcomplete pcomplete comint files-x derived osc
ansi-color ring org-list org-entities time-date noutline outline icons
ob-emacs-lisp ob-core ob-eval org-cycle org-font-lock org-font-lock-core
org-element-match org-faces org-table ol org-fold org-fold-core org-keys
oc org-loaddefs find-func cal-menu calendar cal-loaddefs org-version
org-compat org-font-lock-obsolete org-macs format-spec rx
modus-operandi-theme modus-themes modus-themes-autoloads s s-autoloads
asoc asoc.el-autoloads no-littering compat no-littering-autoloads
compat-autoloads hydra-autoloads lv-autoloads finder-inf
use-package-bind-key org-contrib-autoloads bind-key diminish
diminish-autoloads use-package-core use-package-autoloads
bind-key-autoloads straight-autoloads cl-extra help-mode straight info
autoload loaddefs-gen generate-lisp-file radix-tree lisp-mnt easy-mmode
cl-seq pcase subr-x byte-opt cl-macs gv cl-loaddefs cl-lib bytecomp
byte-compile cconv server rmc iso-transl tooltip eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
faces cus-face macroexp files window text-properties overlay sha1 md5
base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting cairo
move-toolbar gtk x-toolkit xinput2 x multi-tty make-network-process
emacs)
Memory information:
((conses 16 8304927 6682874)
(symbols 48 111731 347)
(strings 32 1998260 614327)
(string-bytes 1 74322513)
(vectors 16 646829)
(vector-slots 8 11831847 5926232)
(floats 8 156374 74631)
(intervals 56 356161 102742)
(buffers 984 132))
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 1:26 bug#58558: 29.0.50; re-search-forward is slow in some buffers Ihor Radchenko
@ 2022-10-16 9:19 ` Lars Ingebrigtsen
2022-10-16 9:34 ` Ihor Radchenko
2023-04-10 8:48 ` Mattias Engdegård
1 sibling, 1 reply; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-16 9:19 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558
Ihor Radchenko <yantar92@posteo.net> writes:
> It happens consistently in Emacs 29, but not in all buffers. Sometimes,
> it only happens after some time after Emacs startup. The slowdown is not
> there in Emacs 28.
Is there anything special about buffers where you see these slowdowns?
For instance, a large number of text properties or overlays?
(length (object-intervals (current-buffer)))
will tell you how many text properties there are (sort of), and
(length (overlays-in (point-min) (point-max)))
should tell you the same for overlays.
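The two checks above can be combined into one call. This is only a sketch: the helper name `my/buffer-decoration-counts' is hypothetical and not from this thread, and it assumes Emacs 28 or later, where `object-intervals' is available.

```elisp
;; Hypothetical helper (name is an assumption, not part of Emacs).
(defun my/buffer-decoration-counts (&optional buffer)
  "Return a plist with the interval and overlay counts for BUFFER.
BUFFER defaults to the current buffer.  Narrowing is ignored."
  (with-current-buffer (or buffer (current-buffer))
    (save-restriction
      (widen)
      (list
       ;; Rough measure of how many text-property runs there are.
       :intervals (length (object-intervals (current-buffer)))
       ;; Number of overlays anywhere in the buffer.
       :overlays (length (overlays-in (point-min) (point-max)))))))
```

Evaluating (my/buffer-decoration-counts) in the slow buffer returns both numbers at once, which makes it easier to compare a fresh session against a degraded one.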
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 9:19 ` Lars Ingebrigtsen
@ 2022-10-16 9:34 ` Ihor Radchenko
2022-10-16 9:37 ` Lars Ingebrigtsen
2023-02-19 12:17 ` Dmitry Gutov
0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-10-16 9:34 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 58558
Lars Ingebrigtsen <larsi@gnus.org> writes:
>> It happens consistently in Emacs 29, but not in all buffers. Sometimes,
>> it only happens after some time after Emacs startup. The slowdown is not
>> there in Emacs 28.
>
> Is there anything special about buffers where you see these slowdowns?
This is a large complex Org buffer.
> For instance, a large number of text properties or overlays?
>
> (length (object-intervals (current-buffer)))
=> 101075 (took over 10sec to complete the command)
> will tell you how many text properties there are (sort of), and
>
> (length (overlays-in (point-min) (point-max)))
>
> should tell you the same for overlays.
=> 1
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 9:34 ` Ihor Radchenko
@ 2022-10-16 9:37 ` Lars Ingebrigtsen
2022-10-16 10:02 ` Ihor Radchenko
2023-02-19 12:17 ` Dmitry Gutov
1 sibling, 1 reply; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-16 9:37 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558
Ihor Radchenko <yantar92@posteo.net> writes:
>> Is there anything special about buffers where you see these slowdowns?
>
> This is a large complex Org buffer.
>
>> For instance, a large number of text properties or overlays?
>>
>> (length (object-intervals (current-buffer)))
>
> => 101075 (took over 10sec to complete the command)
If you switch the buffer to `clean-mode' (which should remove all text
props), does the slowdown disappear? In that case, it seems likely that
the slowdown is connected to text properties, somehow.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 9:37 ` Lars Ingebrigtsen
@ 2022-10-16 10:02 ` Ihor Radchenko
2022-10-16 10:04 ` Lars Ingebrigtsen
2022-10-16 10:36 ` Eli Zaretskii
0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-10-16 10:02 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 58558
Lars Ingebrigtsen <larsi@gnus.org> writes:
> If you switch the buffer to `clean-mode' (which should remove all text
> props), does the slowdown disappear? In that case, it seems likely that
> the slowdown is connected to text properties, somehow.
The slowdown improves slightly, but the timing is nowhere close to Emacs 28:
;; Emacs 29
;; Elapsed time: 16.953404s
;; Emacs 29 + clean-mode
;; Elapsed time: 13.290568s
;; Emacs 28
;; Elapsed time: 0.869748s
I did
(setq yant/re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)")
(benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 10:02 ` Ihor Radchenko
@ 2022-10-16 10:04 ` Lars Ingebrigtsen
2022-10-16 10:53 ` Ihor Radchenko
2022-10-16 10:36 ` Eli Zaretskii
1 sibling, 1 reply; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-16 10:04 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558
Ihor Radchenko <yantar92@posteo.net> writes:
>> If you switch the buffer to `clean-mode' (which should remove all text
>> props), does the slowdown disappear? In that case, it seems likely that
>> the slowdown is connected to text properties, somehow.
>
> The slowdown becomes slightly better, but nowhere close to Emacs 28:
>
> ;; Emacs 29
> ;; Elapsed time: 16.953404s
> ;; Emacs 29 + clean-mode
> ;; Elapsed time: 13.290568s
> ;; Emacs 28
> ;; Elapsed time: 0.869748s
Hm... Another test -- could you try `find-file-literally' on the Org
file and repeat the search?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 10:02 ` Ihor Radchenko
2022-10-16 10:04 ` Lars Ingebrigtsen
@ 2022-10-16 10:36 ` Eli Zaretskii
1 sibling, 0 replies; 81+ messages in thread
From: Eli Zaretskii @ 2022-10-16 10:36 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi
> Cc: 58558@debbugs.gnu.org
> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Sun, 16 Oct 2022 10:02:25 +0000
>
> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
> > If you switch the buffer to `clean-mode' (which should remove all text
> > props), does the slowdown disappear? In that case, it seems likely that
> > the slowdown is connected to text properties, somehow.
>
> The slowdown becomes slightly better, but nowhere close to Emacs 28:
>
> ;; Emacs 29
> ;; Elapsed time: 16.953404s
> ;; Emacs 29 + clean-mode
> ;; Elapsed time: 13.290568s
> ;; Emacs 28
> ;; Elapsed time: 0.869748s
>
> I did
>
> (setq yant/re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)")
> (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
AFAICT, the changes in regex-emacs.c between these two versions are
very minor, almost non-existent. So it sounds like the reason is
somewhere else, not in regexp search per se. But to be absolutely
sure, could you please try building Emacs 29 with regex-emacs.c from
Emacs 28, and see if the slowdown disappears or not?
Thanks.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 10:04 ` Lars Ingebrigtsen
@ 2022-10-16 10:53 ` Ihor Radchenko
2022-10-16 11:01 ` Lars Ingebrigtsen
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-10-16 10:53 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 58558
Lars Ingebrigtsen <larsi@gnus.org> writes:
>> The slowdown becomes slightly better, but nowhere close to Emacs 28:
>>
>> ;; Emacs 29
>> ;; Elapsed time: 16.953404s
>> ;; Emacs 29 + clean-mode
>> ;; Elapsed time: 13.290568s
>> ;; Emacs 28
>> ;; Elapsed time: 0.869748s
>
> Hm... Another test -- could you try `find-file-literally' on the Org
> file and repeat the search?
I just switched between Emacs 28 and Emacs 29, and I note that right
after loading Emacs and the Org file, Emacs 29 takes about as long as
Emacs 28.
I do know that things will get back to slow after a while. This problem
has been around on my machine for a long time.
I will report once I use Emacs long enough to observe the slowdown.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 10:53 ` Ihor Radchenko
@ 2022-10-16 11:01 ` Lars Ingebrigtsen
2022-10-16 11:21 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-16 11:01 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558
> I just switched between Emacs 28 and Emacs 29 and I do note that
> right after loading Emacs and the Org file, Emacs 29 takes similar time
> with Emacs 28.
Huh, very odd. Almost as if something is... fragmenting in the buffer?
We do have many caches and stuff -- perhaps something is... degrading?
I guess some C-level perf measurements would be handy here, but that's
not something I know much about. Anybody?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 11:01 ` Lars Ingebrigtsen
@ 2022-10-16 11:21 ` Eli Zaretskii
2022-10-16 14:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-10-16 11:21 UTC (permalink / raw)
To: Lars Ingebrigtsen, Stefan Monnier; +Cc: 58558, yantar92
> Cc: 58558@debbugs.gnu.org
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Sun, 16 Oct 2022 13:01:48 +0200
>
> > I just switched between Emacs 28 and Emacs 29 and I do note that
> > right after loading Emacs and the Org file, Emacs 29 takes similar time
> > with Emacs 28.
>
> Huh, very odd. Almost as something is... fragmenting in the buffer?
> We do have many caches and stuff -- perhaps something is... degrading?
>
> I guess some C-level perf measurements would be handy here, but that's
> not something I know much about. Anybody?
AFAIU, we use elaborate caching for regular expressions, so maybe that
is related. Stefan, any ideas?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 11:21 ` Eli Zaretskii
@ 2022-10-16 14:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-17 0:56 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-10-16 14:23 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, Lars Ingebrigtsen, yantar92
>> Huh, very odd. Almost as something is... fragmenting in the buffer?
>> We do have many caches and stuff -- perhaps something is... degrading?
>>
>> I guess some C-level perf measurements would be handy here, but that's
>> not something I know much about. Anybody?
>
> AFAIU, we use elaborate caching for regular expressions, so maybe that
> is related. Stefan, any ideas?
The regexp cache hasn't changed between 28 and 29, so that seems
unlikely to be the source of the problem. But that cache is fairly
simple-minded, so it's possible that for some reason it thrashes in
Emacs-29 but not in Emacs-28 (but see below).
IIUC a summary of what we know so far:
- the "yant/re" benchmark is ~20x slower in Emacs-29 than in Emacs-28.
- removing all text properties reduces the factor down to about ~15x.
- that difference is absent after a fresh start: it only appears over time.
Since this benchmark always matches the same regexp, I can't imagine how
the regexp cache could thrash, so it definitely seems to come from
something else.
I'd be curious to know the result of the following tests:
- Run the same benchmark twice in a row: does the second run take the
same time as the first, or is the second run significantly faster?
[ if it's faster it might be due to something like the on-the-fly
`syntax-propertize`ation.
BTW, what does the profiler-start/report say?
Is the time 100% spent in `re-search-forward`? ]
- Try to reduce the number of "features" used in the regexp to see how
it affects the slow down. Maybe try a "binary search" where you try
to reduce the regexp to something much simpler and see if some regexps
exhibit the slowdown while others don't?
Stefan
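The "binary search" over regexps suggested above could be driven by a small timing helper. This is a sketch only: the names `my/time-regexp' and `my/time-regexps' are hypothetical, and the candidate regexps passed in are up to whoever runs the test.

```elisp
(require 'benchmark)

;; Hypothetical helper names; not part of Emacs or of this thread.
(defun my/time-regexp (regexp)
  "Return the seconds needed to search the whole current buffer for REGEXP."
  (save-excursion
    (goto-char (point-min))
    (car (benchmark-run 1
           ;; Step over empty matches so the loop always terminates.
           (while (and (re-search-forward regexp nil t)
                       (or (/= (match-beginning 0) (match-end 0))
                           (and (not (eobp))
                                (progn (forward-char 1) t)))))))))

(defun my/time-regexps (regexps)
  "Return an alist of (REGEXP . SECONDS) for each element of REGEXPS."
  (mapcar (lambda (re) (cons re (my/time-regexp re))) regexps))
```

For example, (my/time-regexps (list yant/re "^\\*+ ")) would time the full regexp from the benchmark next to a much simpler one, so that pieces of the regexp can be dropped one at a time.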
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 14:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-10-17 0:56 ` Ihor Radchenko
2022-10-18 11:50 ` Lars Ingebrigtsen
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-10-17 0:56 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, Lars Ingebrigtsen
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> IIUC a summary of what we know so far:
> - the "yant/re" benchmark is ~20x slower in Emacs-29 than in Emacs-28.
> - removing all text properties reduces the factor down to about ~15x.
> - that difference is absent after a fresh start: it only appears over time.
>
> Since this benchmark always matches the same regexp, I can't imagine how
> the regexp cache could thrash, so it definitely seems to come from
> something else.
>
> I'd curious to know the result of the following tests:
>
> - Run the same benchmark twice in a row: does the second run take the
> same time as the first, or is the second run significantly faster?
> [ if it's faster it might be due to something like the on-the-fly
> `syntax-propertize`ation.
After 11 hours of Emacs uptime and some edits in the buffer (actually,
just a few hours; mostly idle), running the benchmark-progn
repetitively:
;; Elapsed time: 8.339753s
;; Elapsed time: 9.243140s
;; Elapsed time: 9.868761s
;; Elapsed time: 10.330362s
;; Elapsed time: 11.279218s
;; Elapsed time: 13.581893s
;; Elapsed time: 13.675609s
;; Elapsed time: 14.553157s
;; Elapsed time: 14.651782s
;; Elapsed time: 17.253983s
The elapsed time gradually increases. It is definitely a clue, but a
very odd one.
> BTW, what does the profiler-start/report say?
> Is the time 100% spent in `re-search-forward`? ]
;; w CPU profiler
;; Elapsed time: 19.628828s
;; profiler:
;; 19954 99% - command-execute
;; 19926 99% - funcall-interactively
;; 19627 98% - eval-expression
;; 19627 98% - let
;; 19627 98% - progn
;; 19627 98% while
;; ------------ no more data inside while ---------
Nothing useful. It's as if the while loop itself is doing something bad,
but how so in (benchmark-progn (while (re-search-forward yant/re nil t))) ??
I also tried find-file-literally and the timing gets back to fresh Emacs
(even faster):
;; find-file-literally
;; Elapsed time: 0.592935s
Then, I re-opened the file normally.
;; re-open the file
;; Elapsed time: 7.348727s
Note how the time is back to 7-8 seconds, but not the same as in a fresh Emacs.
> - Try to reduce the number of "features" used in the regexp to see how
> it affects the slow down. Maybe try a "binary search" where you try
> to reduce the regexp to something much simpler and see if some regexps
> exhibit the slowdown while others don't?
Hmm. I tried a very simple regexp "^\\*+ " 10 times in a row:
;; Elapsed time: 0.267681s
;; Elapsed time: 0.381607s
;; Elapsed time: 0.342378s
;; Elapsed time: 0.350618s
;; Elapsed time: 0.376871s
;; Elapsed time: 0.446346s
;; Elapsed time: 0.472543s
;; Elapsed time: 0.529925s
;; Elapsed time: 0.604101s
;; Elapsed time: 0.665601s
It is generally faster, but still relatively slow and gets worse over
time.
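The repeated runs above can be collected in one go instead of re-evaluating the form by hand. A minimal sketch, assuming a hypothetical helper name `my/repeat-search-benchmark' and a regexp that never matches the empty string (otherwise the inner loop would not terminate):

```elisp
;; Hypothetical helper; the name is an assumption, not from the thread.
(defun my/repeat-search-benchmark (regexp runs)
  "Search the whole current buffer for REGEXP, RUNS times in a row.
Return the list of elapsed times in seconds, first run first.
REGEXP is assumed never to match the empty string."
  (let (times)
    (dotimes (_ runs)
      (let ((start (float-time)))
        (save-excursion
          (goto-char (point-min))
          (while (re-search-forward regexp nil t)))
        (push (- (float-time) start) times)))
    (nreverse times)))
```

Calling (my/repeat-search-benchmark "^\\*+ " 10) returns ten timings as a list, so a steady increase from run to run is visible at a glance.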
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-17 0:56 ` Ihor Radchenko
@ 2022-10-18 11:50 ` Lars Ingebrigtsen
2022-10-18 14:58 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-18 11:50 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, Stefan Monnier
Ihor Radchenko <yantar92@posteo.net> writes:
> After 11 hours of Emacs uptime and some edits in the buffer (actually,
> just a few hours; mostly idle), running the benchmark-progn
> repetitively:
>
> ;; Elapsed time: 8.339753s
> ;; Elapsed time: 9.243140s
> ;; Elapsed time: 9.868761s
> ;; Elapsed time: 10.330362s
> ;; Elapsed time: 11.279218s
> ;; Elapsed time: 13.581893s
> ;; Elapsed time: 13.675609s
> ;; Elapsed time: 14.553157s
> ;; Elapsed time: 14.651782s
> ;; Elapsed time: 17.253983s
>
> The elapsed time gradually increases. It is definitely a clue, but very
> odd one.
The slowdowns are so dramatic that they should show up on a profiler --
which might give us a clue which parts of Emacs are slowing down. I
briefly tried to use "perf" under Linux to connect to a running Emacs
and get some data out of it, but... er... I've never used it before,
and...
Does anybody have a recipe for how to do runtime function tracing for a
running process?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-18 11:50 ` Lars Ingebrigtsen
@ 2022-10-18 14:58 ` Eli Zaretskii
2022-10-18 18:19 ` Lars Ingebrigtsen
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-10-18 14:58 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 58558, yantar92, monnier
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 58558@debbugs.gnu.org, Eli
> Zaretskii <eliz@gnu.org>
> Date: Tue, 18 Oct 2022 13:50:02 +0200
>
> The slowdowns are so dramatic that they should show up on a profiler --
> which might give us a clue which parts of Emacs is slowing down.
Right.
> I briefly tried to use "perf" under Linux to connect to a running
> Emacs and get some data out of it, but... er... I've never used it
> before, and...
>
> Does anybody have a recipe for how to do runtime function tracing for a
> running process?
The way I run perf is to start Emacs under perf to begin with.
What did you try? It was quite simple, AFAIR, last time I tried.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-18 14:58 ` Eli Zaretskii
@ 2022-10-18 18:19 ` Lars Ingebrigtsen
2022-10-18 18:38 ` Eli Zaretskii
2022-12-13 10:28 ` Ihor Radchenko
0 siblings, 2 replies; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-18 18:19 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, yantar92, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> I briefly tried to use "perf" under Linux to connect to a running
>> Emacs and get some data out of it, but... er... I've never used it
>> before, and...
>>
>> Does anybody have a recipe for how to do runtime function tracing for a
>> running process?
>
> The way I run perf is to start Emacs under perf to begin with.
>
> What did you try? It was quite simple, AFAIR, last time I tried.
I thought it might be easier to see the differences in results if one
first attached perf to a fresh (fast) Emacs and got the trace, and then
waited until Emacs got slow, and repeated the same thing under another
trace.
perf is able to do this by:
perf record -p <PID> -g
and
perf report
then shows me stuff, but I don't even know what to look for when
interpreting that. Or whether perf is, indeed, the right tool for this
task.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-18 18:19 ` Lars Ingebrigtsen
@ 2022-10-18 18:38 ` Eli Zaretskii
2022-12-13 10:28 ` Ihor Radchenko
1 sibling, 0 replies; 81+ messages in thread
From: Eli Zaretskii @ 2022-10-18 18:38 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 58558, yantar92, monnier
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: yantar92@posteo.net, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Tue, 18 Oct 2022 20:19:24 +0200
>
> > What did you try? It was quite simple, AFAIR, last time I tried.
>
> I thought it might be easier to see the differences in results if one
> first attached perf to a fresh (fast) Emacs and got the trace, and the
> waited until Emacs got slow, and repeated the same thing under another
> trace.
>
> perf is able to do this by:
>
> perf record -p <PID> -g
I never tried that, always started Emacs under perf to begin with.
> and
>
> perf report
>
> then shows me stuff, but I don't even know what to look for when
> interpreting that.
I thought you wanted to compare two or more profiles taken at
different times? Then looking at percentages of the same functions
could tell something. Since the complaint is about regexp search, I
guess re_compile_pattern and re_match_2_internal and their subroutines
would be the immediate suspects. Or maybe re-search-forward, which is
a couple of levels higher.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-18 18:19 ` Lars Ingebrigtsen
2022-10-18 18:38 ` Eli Zaretskii
@ 2022-12-13 10:28 ` Ihor Radchenko
2022-12-13 13:11 ` Eli Zaretskii
2022-12-13 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 10:28 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 58558, Eli Zaretskii, monnier
Lars Ingebrigtsen <larsi@gnus.org> writes:
> I thought it might be easier to see the differences in results if one
> first attached perf to a fresh (fast) Emacs and got the trace, and the
> waited until Emacs got slow, and repeated the same thing under another
> trace.
>
> perf is able to do this by:
>
> perf record -p <PID> -g
>
> and
>
> perf report
>
> then shows me stuff, but I don't even know what to look for when
> interpreting that. Or whether perf is, indeed, the right too for this
> task.
Ok. I got around to trying perf, and it turned out to be very easy to
get started.
perf record -p <PID> + perf report already appears to give some clue:
88.27% emacs emacs-30-vcs [.] buf_bytepos_to_charpos
3.75% emacs emacs-30-vcs [.] re_match_2_internal
1.35% emacs emacs-30-vcs [.] scan_sexps_forward
1.03% emacs emacs-30-vcs [.] re_search_2
0.65% emacs emacs-30-vcs [.] find_interval
0.56% emacs emacs-30-vcs [.] sub_char_table_ref
0.55% emacs emacs-30-vcs [.] lookup_char_property
The fraction of buf_bytepos_to_charpos increases over repeated benchmark
runs.
In contrast, using find-file-literally produces
34.44% emacs emacs-30-vcs [.] re_match_2_internal
25.55% emacs emacs-30-vcs [.] scan_sexps_forward
11.09% emacs emacs-30-vcs [.] re_search_2
...
0.59% emacs emacs-30-vcs [.] buf_bytepos_to_charpos
with buf_bytepos_to_charpos taking diminishing cpu sample fraction.
Any ideas what I can do further?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 10:28 ` Ihor Radchenko
@ 2022-12-13 13:11 ` Eli Zaretskii
2022-12-13 13:32 ` Ihor Radchenko
2022-12-13 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 13:11 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca,
> 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 10:28:57 +0000
>
> Ok. I got around to try perf, and it turned out to be very easy to get
> started.
>
> perf record -p <PID> + perf report already appear to give some clue:
>
> 88.27% emacs emacs-30-vcs [.] buf_bytepos_to_charpos
> 3.75% emacs emacs-30-vcs [.] re_match_2_internal
> 1.35% emacs emacs-30-vcs [.] scan_sexps_forward
> 1.03% emacs emacs-30-vcs [.] re_search_2
> 0.65% emacs emacs-30-vcs [.] find_interval
> 0.56% emacs emacs-30-vcs [.] sub_char_table_ref
> 0.55% emacs emacs-30-vcs [.] lookup_char_property
>
> The fraction of buf_bytepos_to_charpos increases over repeated benchmark
> runs.
So buf_bytepos_to_charpos is the main suspect now, I guess. This
could happen because either (a) buf_bytepos_to_charpos is called more
times as session uptime progresses, or (b) because each call to
buf_bytepos_to_charpos becomes more and more expensive. So I think
the first question is: how many times is buf_bytepos_to_charpos called
for each search, or, equivalently, does the CPU time per call used up
by buf_bytepos_to_charpos stay stable or go up? I think perf can
answer these questions if you ask nicely.
If the number of calls is the same, but each call becomes more and
more expensive, then the next step is to ask perf to produce a
detailed profile for each line of buf_bytepos_to_charpos, and see
which parts of it become more expensive. I could think about a couple
of possible reasons for that, but I'd rather not speculate about
profiles, as that is known to produce wrong guesses.
Is the buffer in question being edited as time advances? Or is buffer
text and everything else in the buffer left unchanged?
> In contrast, using find-file-literally produces
>
> 34.44% emacs emacs-30-vcs [.] re_match_2_internal
> 25.55% emacs emacs-30-vcs [.] scan_sexps_forward
> 11.09% emacs emacs-30-vcs [.] re_search_2
> ...
> 0.59% emacs emacs-30-vcs [.] buf_bytepos_to_charpos
>
> with buf_bytepos_to_charpos taking diminishing cpu sample fraction.
That find-file-literally yields a buffer with a much faster
buf_bytepos_to_charpos is not surprising: when each character is a
single byte, the conversion is trivial, and buf_bytepos_to_charpos
returns immediately. The puzzling part is not that
buf_bytepos_to_charpos is much more expensive in a buffer with
non-ASCII text, the puzzle is why it becomes more and more expensive
with time.
Thanks.
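Whether a given buffer falls into the trivial single-byte case described above can be checked from Lisp by comparing its size in bytes with its size in characters. A small sketch, assuming a hypothetical helper name `my/buffer-multibyte-overhead'; it uses only `position-bytes' and `point-max':

```elisp
;; Hypothetical helper; the name is an assumption, not from the thread.
(defun my/buffer-multibyte-overhead (&optional buffer)
  "Return how many bytes BUFFER uses beyond one byte per character.
Zero means every character occupies a single byte, so converting
between byte and character positions needs no scanning."
  (with-current-buffer (or buffer (current-buffer))
    (save-restriction
      (widen)
      ;; Both values are 1-based end positions, so their difference
      ;; is (bytes in buffer) - (characters in buffer).
      (- (position-bytes (point-max)) (point-max)))))
```

A non-zero result in the slow Org buffer and zero in the same file opened with find-file-literally would be consistent with the profiles shown earlier.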
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 10:28 ` Ihor Radchenko
2022-12-13 13:11 ` Eli Zaretskii
@ 2022-12-13 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 0 replies; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 13:27 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Lars Ingebrigtsen, Eli Zaretskii
> The fraction of buf_bytepos_to_charpos increases over repeated benchmark
> runs.
[...]
> Any ideas what I can do further?
As usual, the problem is either that we call this function too often or
that it takes too much time every time we call it, so:
- Try and figure out who is the most frequent caller of
`buf_bytepos_to_charpos` during your benchmark. Most calls to this
function can usually be eliminated by changing the code to keep track
of both bytes and chars at the same time. Actually, most of the time
we already have the char info somewhere nearby, so it might be
a simple change.
`gprof` can often give that info.
- Try and figure out why `buf_bytepos_to_charpos` is so slow.
Last time we tweaked that code, AFAIK, is commit
b300052fb4ef1261519b0fd57f5eb186c2d10295.
My debugging approach for those cases is the following:
DEFVAR_LISP a new variable in which you put a vector of N integers
(initialized to 0), and then at various "interesting" points in the
`buf_bytepos_to_charpos`, increment one of the vector elements.
This way you can see from ELisp how many times each "interesting"
point was executed.
IOW, I do the profiling counters by hand.
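The hand-made counters described above could look roughly like the
following standalone C sketch. The assumptions are loud ones: in Emacs the
counters would live in a Lisp vector exposed via DEFVAR_LISP, whereas here a
plain global array stands in for it, and `sketch_find` is a hypothetical
function being instrumented, not anything from marker.c.

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_NPOINTS = 4 };

/* One counter per "interesting" point.  Stand-in for the Lisp vector
   that DEFVAR_LISP would expose in Emacs.  */
unsigned long sketch_counters[SKETCH_NPOINTS];

/* Record that execution reached instrumentation point POINT.  */
static inline void
sketch_count (int point)
{
  assert (point >= 0 && point < SKETCH_NPOINTS);
  sketch_counters[point]++;
}

/* Hypothetical instrumented function: linear search for WANTED in
   ITEMS.  Return its index, or -1 if it is absent.  */
int
sketch_find (const int *items, size_t n, int wanted)
{
  sketch_count (0);             /* function entered */
  for (size_t i = 0; i < n; i++)
    {
      sketch_count (1);         /* one loop iteration */
      if (items[i] == wanted)
        {
          sketch_count (2);     /* early exit: found */
          return (int) i;
        }
    }
  sketch_count (3);             /* fell off the end: not found */
  return -1;
}
```

Dividing counter 1 by counter 0 gives the average number of loop
iterations per call, which is the kind of number that separates "called too
often" from "each call is too slow".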
Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 13:11 ` Eli Zaretskii
@ 2022-12-13 13:32 ` Ihor Radchenko
2022-12-13 14:28 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 13:32 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> The fraction of buf_bytepos_to_charpos increases over repeated benchmark
>> runs.
>
> So buf_bytepos_to_charpos is the main suspect now, I guess. This
> could happen because either (a) buf_bytepos_to_charpos is called more
> times as session uptime progresses,
Just to clarify. The perf records I did are roughly for the duration of
benchmark-run calls. Nothing more.
> or (b) because each call to
> buf_bytepos_to_charpos becomes more and more expensive. So I think
> the first question is: how many times is buf_bytepos_to_charpos called
> for each search, or, equivalently, is the CPU time per call used up by
> buf_bytepos_to_charpos stays stable or goes up? I think perf can
> answer these questions if you ask nicely.
I will look into how to do it. Maybe perf probe.
I guess, it will be useful to compile Emacs with debug symbols at this
point.
> Is the buffer in question being edited as time advances? Or is buffer
> text and everything else in the buffer left unchanged?
Not edited between benchmarks. Remember that I did a sequence of
benchmark-run calls and the time gradually increases.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 13:32 ` Ihor Radchenko
@ 2022-12-13 14:28 ` Eli Zaretskii
2022-12-13 15:56 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 14:28 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 13:32:13 +0000
>
> > or (b) because each call to
> > buf_bytepos_to_charpos becomes more and more expensive. So I think
> > the first question is: how many times is buf_bytepos_to_charpos called
> > for each search, or, equivalently, is the CPU time per call used up by
> > buf_bytepos_to_charpos stays stable or goes up? I think perf can
> > answer these questions if you ask nicely.
>
> I will look how to do it. Maybe perf probe.
> I guess, it will be useful to compile Emacs with debug symbols at this
> point.
AFAIR, you can ask perf to profile a single function, and you can ask
it to annotate the profile with the source code.
> > Is the buffer in question being edited as time advances? Or is buffer
> > text and everything else in the buffer left unchanged?
>
> Not edited between benchmarks. Remember that I did sequence of
> benchmark-run calls and the time gradually increases.
OK, so it looks more and more like each call becomes more expensive
for some reason. But let's see the numbers before jumping to
conclusions.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 14:28 ` Eli Zaretskii
@ 2022-12-13 15:56 ` Ihor Radchenko
2022-12-13 16:08 ` Eli Zaretskii
2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 15:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> I will look how to do it. Maybe perf probe.
>> I guess, it will be useful to compile Emacs with debug symbols at this
>> point.
>
> AFAIR, you can ask perf to profile a single function, and you can ask
> it to annotate the profile with the source code.
I now compiled Emacs with debug symbols, waited enough to see observable
increase in the benchmark-run timing, and recorded the perf data.
buf_bytepos_to_charpos is still on the top
78.06% emacs emacs [.] buf_bytepos_to_charpos
3.00% emacs emacs [.] re_match_2_internal
1.05% emacs emacs [.] find_interval
1.04% emacs emacs [.] CHAR_TABLE_REF_ASCII
0.85% emacs emacs [.] make_lisp_symbol
0.80% emacs emacs [.] re_search_2
0.76% emacs emacs [.] builtin_lisp_symbol
0.62% emacs emacs [.] PSEUDOVECTORP
The specific place in the code is:
perf annotate -s buf_bytepos_to_charpos
: 352 for (tail = BUF_MARKERS (b); tail; tail = tail->next)
0.00 : 237e53: mov -0xe8(%rbp),%rax
0.00 : 237e5a: mov 0x2e8(%rax),%rax
0.01 : 237e61: mov 0x80(%rax),%rax
0.00 : 237e68: mov %rax,-0xc0(%rbp)
0.00 : 237e6f: jmp 237fc6 <buf_bytepos_to_charpos+0x7ba>
: 353 {
: 354 CONSIDER (tail->bytepos, tail->charpos);
0.02 : 237e74: mov -0xc0(%rbp),%rax
47.07 : 237e7b: mov 0x28(%rax),%rax
7.27 : 237e7f: mov %rax,-0x38(%rbp)
0.02 : 237e83: movl $0x0,-0xc4(%rbp)
9.05 : 237e8d: mov -0x38(%rbp),%rax
0.01 : 237e91: cmp -0xf0(%rbp),%rax
3.73 : 237e98: jne 237eb2 <buf_bytepos_to_charpos+0x6a6>
0.00 : 237e9a: mov -0xc0(%rbp),%rax
0.00 : 237ea1: mov 0x20(%rax),%rax
0.00 : 237ea5: mov %rax,-0x28(%rbp)
0.00 : 237ea9: mov -0x28(%rbp),%rax
0.00 : 237ead: jmp 2381cd <buf_bytepos_to_charpos+0x9c1>
2.14 : 237eb2: mov -0x38(%rbp),%rax
1.87 : 237eb6: cmp -0xf0(%rbp),%rax
0.85 : 237ebd: jle 237ef5 <buf_bytepos_to_charpos+0x6e9>
2.32 : 237ebf: mov -0x38(%rbp),%rax
0.04 : 237ec3: cmp -0xb0(%rbp),%rax
2.56 : 237eca: jge 237f29 <buf_bytepos_to_charpos+0x71d>
My guess: number of markers is growing somehow?
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 15:56 ` Ihor Radchenko
@ 2022-12-13 16:08 ` Eli Zaretskii
2022-12-13 17:43 ` Ihor Radchenko
2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 16:08 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 15:56:33 +0000
>
> My guess: number of markers is growing somehow?
That was my guess, yeah.
So now the question becomes: who creates all those additional markers
if all you do is run the benchmark?
If no other idea to find this out comes up, maybe run this with a
breakpoint in make-marker, look at the backtrace to see the callers.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 15:56 ` Ihor Radchenko
2022-12-13 16:08 ` Eli Zaretskii
@ 2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-14 12:00 ` Ihor Radchenko
2022-12-14 12:23 ` Ihor Radchenko
1 sibling, 2 replies; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 17:38 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi
>>> I will look how to do it. Maybe perf probe.
>>> I guess, it will be useful to compile Emacs with debug symbols at this
>>> point.
>>
>> AFAIR, you can ask perf to profile a single function, and you can ask
>> it to annotate the profile with the source code.
>
> I now compiled Emacs with debug symbols, waited enough to see observable
> increase in the benchmark-run timing, and recorded the perf data.
>
> buf_bytepos_to_charpos is still on the top
>
> 78.06% emacs emacs [.] buf_bytepos_to_charpos
> 3.00% emacs emacs [.] re_match_2_internal
> 1.05% emacs emacs [.] find_interval
> 1.04% emacs emacs [.] CHAR_TABLE_REF_ASCII
> 0.85% emacs emacs [.] make_lisp_symbol
> 0.80% emacs emacs [.] re_search_2
> 0.76% emacs emacs [.] builtin_lisp_symbol
> 0.62% emacs emacs [.] PSEUDOVECTORP
AFAIK the main place where we call `buf_bytepos_to_charpos` from
`re_match_2_internal` is via the `SYNTAX_TABLE_BYTE_TO_CHAR` macro, used
for regexp elements that depend on syntax tables (i.e. \<, \>, \_<, ...).
But I'd expect those to be executed "frequently&closely" enough that the
`cached_(byte|char)pos` data should almost always be nearby, making the
call to `buf_bytepos_to_charpos` fairly cheap (more specifically
the `for (tail = BUF_MARKERS (b);...` loop should not iterate many
times, regardless of how many markers there are).
> My guess: number of markers is growing somehow?
`buf_bytepos_to_charpos` itself creates markers (using them as a cache
of previous conversions), so that might be why.
But we only look at the first N markers where N*50 is the distance to
the closest marker found so far. So growth is not sufficient (it's
clearly a part of the reason, tho).
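The "first N markers where N*50 is the distance to the closest marker"
rule can be modeled with a small standalone C sketch. This is a simplified
model of the description above, not the code in marker.c: the struct, the
function name and the exact form of the cutoff test are assumptions made
for illustration, and the real function goes on to scan the text between
the best marker and the target.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a buffer marker used as a cache entry.  */
struct sketch_marker
{
  long bytepos;
  long charpos;
  struct sketch_marker *next;
};

/* Return how many markers are examined while looking for a cached
   position near BYTEPOS.  The walk stops at the Nth marker once the
   closest marker seen so far is less than N * 50 bytes away.  */
long
sketch_markers_examined (const struct sketch_marker *head, long bytepos)
{
  assert (bytepos >= 0);

  long best = LONG_MAX;         /* distance to the closest marker so far */
  long examined = 0;

  for (const struct sketch_marker *m = head; m; m = m->next)
    {
      examined++;
      long d = labs (m->bytepos - bytepos);
      if (d < best)
        best = d;
      /* The allowance grows by 50 bytes per marker examined, so a
         distant best match makes the walk continue for longer.  */
      if (best < examined * 50)
        break;
    }
  return examined;
}
```

Under this model the cost of one call is bounded by the smaller of the
marker count and (distance to the closest marker) / 50, so both many
markers and no nearby marker are needed for the walk to get long.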
Regarding growth: could you call `garbage-collect` between the calls to
`re-search-forward` to see if that avoids the accumulation?
[ I presume here that those markers are created/added by
`buf_bytepos_to_charpos` itself, so they should be recovered by the GC
because they're not referenced from anywhere else. ]
I'd be interested to know how many iterations of the `for (tail =
BUF_MARKERS (b);...` loop are executed on average during your
`re-search-forward` (and how that average changes between runs of
`re-search-forward`).
Stefan
PS: Of course, another approach would be to replace this code with
something else. Using markers as a cache of bytepos/charpos conversions
has been a source of a few performance issues over the years.
Another approach could be to use a "vector with gap" managed alongside
the actual buffer text. It could be indexed by "charpos divided by
1024", so conversion from charpos to bytepos could be a simple vector
lookup followed by scanning at most 1kB, and conversion in the other
direction would use a binary search in that same vector (or we could use
2 "vectors with gap", one per direction of conversion).
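A rough standalone C sketch of the lookup side of that idea follows. It
assumes valid UTF-8 and 0-based offsets, uses a plain array rather than a
"vector with gap", and does not handle edits at all (which is the hard part
in a real buffer). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { SKETCH_STRIDE = 1024 };  /* characters per checkpoint */

struct sketch_index
{
  const unsigned char *text;    /* assumed to be valid UTF-8 */
  size_t nbytes, nchars;
  size_t *byte_at;              /* byte_at[k]: byte offset of char k * STRIDE */
  size_t nslots;
};

/* True if byte B starts a character (is not a 10xxxxxx continuation).  */
int
sketch_char_start_p (unsigned char b)
{
  return (b & 0xC0) != 0x80;
}

/* Build the checkpoint table.  Return 0 on success, -1 if out of memory.  */
int
sketch_index_build (struct sketch_index *ix, const unsigned char *text,
                    size_t nbytes)
{
  size_t nchars = 0;
  for (size_t i = 0; i < nbytes; i++)
    nchars += sketch_char_start_p (text[i]);

  ix->text = text;
  ix->nbytes = nbytes;
  ix->nchars = nchars;
  ix->nslots = nchars / SKETCH_STRIDE + 1;
  ix->byte_at = malloc (ix->nslots * sizeof *ix->byte_at);
  if (!ix->byte_at)
    return -1;

  size_t c = 0;
  for (size_t i = 0; i < nbytes; i++)
    if (sketch_char_start_p (text[i]))
      {
        if (c % SKETCH_STRIDE == 0)
          ix->byte_at[c / SKETCH_STRIDE] = i;
        c++;
      }
  /* A checkpoint that falls exactly at the end of the text.  */
  if (nchars % SKETCH_STRIDE == 0)
    ix->byte_at[nchars / SKETCH_STRIDE] = nbytes;
  return 0;
}

void
sketch_index_free (struct sketch_index *ix)
{
  free (ix->byte_at);
  ix->byte_at = NULL;
}

/* charpos -> bytepos: one table lookup, then step over fewer than
   SKETCH_STRIDE characters.  */
size_t
sketch_char_to_byte (const struct sketch_index *ix, size_t charpos)
{
  assert (charpos <= ix->nchars);
  size_t b = ix->byte_at[charpos / SKETCH_STRIDE];
  for (size_t left = charpos % SKETCH_STRIDE; left > 0; left--)
    {
      b++;                      /* past the lead byte */
      while (b < ix->nbytes && !sketch_char_start_p (ix->text[b]))
        b++;                    /* past its continuation bytes */
    }
  return b;
}

/* bytepos -> charpos: binary search for the last checkpoint at or
   before BYTEPOS (assumed to be a character boundary), then count
   character starts up to BYTEPOS.  */
size_t
sketch_byte_to_char (const struct sketch_index *ix, size_t bytepos)
{
  assert (bytepos <= ix->nbytes);
  size_t lo = 0, hi = ix->nslots;   /* byte_at[lo] <= bytepos holds */
  while (hi - lo > 1)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (ix->byte_at[mid] <= bytepos)
        lo = mid;
      else
        hi = mid;
    }
  size_t chars = lo * SKETCH_STRIDE;
  for (size_t i = ix->byte_at[lo]; i < bytepos; i++)
    chars += sketch_char_start_p (ix->text[i]);
  return chars;
}
```

With a table like this neither direction depends on how many markers the
buffer has, at the cost of keeping the table up to date on every insertion
and deletion.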
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 16:08 ` Eli Zaretskii
@ 2022-12-13 17:43 ` Ihor Radchenko
2022-12-13 17:52 ` Eli Zaretskii
2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 17:43 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> My guess: number of markers is growing somehow?
>
> That was my guess, yeah.
>
> So now the question becomes: who creates all those additional markers
> if all you do is run the benchmark?
>
> If no other idea to find this out comes up, maybe run this with a
> breakpoint in make-marker, look at the backtrace to see the callers.
I tried gdb now with break Fmake_marker.
The benchmark itself does not trigger the breakpoint.
However, a huge number (hundreds) of breakpoint hits is generated upon
finishing the benchmark execution.
bt:
#0 Fmake_marker () at alloc.c:3736
#1 0x00005555557bb750 in Fmatch_data (integers=0x0, reuse=0x0, reseat=0x0) at search.c:2903
#2 0x000055555580eb6d in funcall_subr (subr=0x555555e0dc20 <Smatch_data>, numargs=0, args=0x7ffff0c02070) at eval.c:3038
#3 0x00005555558634c1 in exec_byte_code (fun=0x555557370195, args_template=0, nargs=0, args=0x0) at bytecode.c:809
#4 0x000055555580ee6b in fetch_and_exec_byte_code (fun=0x555557370195, args_template=0, nargs=0, args=0x0) at eval.c:3081
#5 0x000055555580f5a8 in funcall_lambda (fun=0x555557370195, nargs=1, arg_vector=0x7ffff0c02038) at eval.c:3242
#6 0x000055555580e688 in funcall_general (fun=0x555557370195, numargs=1, args=0x7ffff0c02038) at eval.c:2945
#7 0x00005555558634e1 in exec_byte_code (fun=0x55555734c7cd, args_template=0, nargs=0, args=0x0) at bytecode.c:811
#8 0x000055555580ee6b in fetch_and_exec_byte_code (fun=0x55555734c7cd, args_template=0, nargs=0, args=0x0) at eval.c:3081
#9 0x000055555580f5a8 in funcall_lambda (fun=0x55555734c7cd, nargs=1, arg_vector=0x7fffffff6ce0) at eval.c:3242
#10 0x000055555580f00c in apply_lambda (fun=0x55555734c7cd, args=0x555557f7a2c3, count=...) at eval.c:3103
#11 0x000055555580d591 in eval_sub (form=0x555557f7a2b3) at eval.c:2545
#12 0x00005555558084f0 in Fsetq (args=0x555557f7a2a3) at eval.c:483
#13 0x000055555580cfa8 in eval_sub (form=0x555557f7a293) at eval.c:2449
#14 0x00005555558083bc in Fprogn (body=0x555557f7a363) at eval.c:436
#15 0x0000555555809b4e in Flet (args=0x555557f7a283) at eval.c:1026
#16 0x000055555580cfa8 in eval_sub (form=0x555557f7a223) at eval.c:2449
#17 0x000055555580d151 in eval_sub (form=0x555557f712b3) at eval.c:2465
#18 0x000055555580efa6 in apply_lambda (fun=0x555557f8049d, args=0x555557f712a3, count=...) at eval.c:3098
#19 0x000055555580d591 in eval_sub (form=0x555557f71883) at eval.c:2545
#20 0x000055555580cac8 in Feval (form=0x555557f71883, lexical=0x0) at eval.c:2361
#21 0x000055555580eb37 in funcall_subr (subr=0x555555e11ea0 <Seval>, numargs=1, args=0x7fffffff7788) at eval.c:3036
#22 0x000055555580e63c in funcall_general (fun=0x555555e11ea5 <Seval+5>, numargs=1, args=0x7fffffff7788) at eval.c:2941
#23 0x000055555580e909 in Ffuncall (nargs=2, args=0x7fffffff7780) at eval.c:2995
#24 0x000055555580ab30 in internal_condition_case_n
(bfun=0x55555580e7eb <Ffuncall>, nargs=2, args=0x7fffffff7780, handlers=0x30, hfun=0x5555555ccfe7 <safe_eval_handler>) at eval.c:1558
#25 0x00005555555cd24c in safe__call (inhibit_quit=true, nargs=2, func=0x6900, ap=0x7fffffff7840) at xdisp.c:3024
#26 0x00005555555cd450 in safe__call1 (inhibit_quit=true, fn=0x6900) at xdisp.c:3060
#27 0x00005555555cd4e0 in safe__eval (inhibit_quit=true, sexpr=0x555557f71883) at xdisp.c:3074
#28 0x000055555561367e in display_mode_element (it=0x7fffffff7d10, depth=2, field_width=0, precision=0, elt=0x555557f71873, props=0x0, risky=false)
at xdisp.c:27228
#29 0x0000555555613a28 in display_mode_element (it=0x7fffffff7d10, depth=1, field_width=0, precision=0, elt=0x555557f79cd3, props=0x0, risky=false)
at xdisp.c:27314
#30 0x0000555555612210 in display_mode_line (w=0x55555628c8c0, face_id=MODE_LINE_INACTIVE_FACE_ID, format=0x555557f79cd3) at xdisp.c:26740
#31 0x0000555555611efe in display_mode_lines (w=0x55555628c8c0) at xdisp.c:26653
#32 0x00005555555fcf67 in redisplay_window (window=0x55555628c8c5, just_this_one_p=false) at xdisp.c:20345
#33 0x00005555555f2e3f in redisplay_window_0 (window=0x55555628c8c5) at xdisp.c:17434
#34 0x000055555580a994 in internal_condition_case_1
(bfun=0x5555555f2dfd <redisplay_window_0>, arg=0x55555628c8c5, handlers=0x7ffff1adb5a3, hfun=0x5555555f2d16 <redisplay_window_error>) at eval.c:1498
#35 0x00005555555f2cec in redisplay_windows (window=0x55555628c8c5) at xdisp.c:17404
#36 0x00005555555f1a9f in redisplay_internal () at xdisp.c:16854
#37 0x00005555555efb5e in redisplay () at xdisp.c:16043
#38 0x000055555574711a in read_char (commandflag=1, map=0x55556333fb33, prev_event=0x0, used_mouse_menu=0x7fffffffd2a9, end_time=0x0) at keyboard.c:2627
#39 0x0000555555758856 in read_key_sequence
(keybuf=0x7fffffffd4e0, prompt=0x0, dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false)
at keyboard.c:10074
#40 0x00005555557438b0 in command_loop_1 () at keyboard.c:1376
#41 0x000055555580a8ed in internal_condition_case (bfun=0x5555557434a1 <command_loop_1>, handlers=0x90, hfun=0x555555742a7a <cmd_error>) at eval.c:1474
#42 0x0000555555743151 in command_loop_2 (handlers=0x90) at keyboard.c:1125
#43 0x0000555555809f61 in internal_catch (tag=0xff90, func=0x555555743127 <command_loop_2>, arg=0x90) at eval.c:1197
#44 0x00005555557430e3 in command_loop () at keyboard.c:1103
#45 0x000055555574261c in recursive_edit_1 () at keyboard.c:712
#46 0x00005555557427c8 in Frecursive_edit () at keyboard.c:795
#47 0x000055555573e88a in main (argc=1, argv=0x7fffffffd9a8) at emacs.c:2529
If I read the backtrace correctly, something in my custom mode-line is
triggering Fmatch_data that creates markers.
But that code has not changed for years, according to git log.
One suspicious thing is that my code gets called that frequently
(100s of times) by redisplay. Not sure if it is normal.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 17:43 ` Ihor Radchenko
@ 2022-12-13 17:52 ` Eli Zaretskii
2022-12-13 18:03 ` Ihor Radchenko
2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 17:52 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 17:43:36 +0000
>
> > If no other idea to find this out comes up, maybe run this with a
> > breakpoint in make-marker, look at the backtrace to see the callers.
>
> I tried gdb now with break Fmake_marker.
>
> The benchmark itself does not trigger the breakpoint.
> However, a huge number (hundreds) of breakpoint hits is generated upon
> finishing the benchmark execution.
>
> bt:
>
> #0 Fmake_marker () at alloc.c:3736
> #1 0x00005555557bb750 in Fmatch_data (integers=0x0, reuse=0x0, reseat=0x0) at search.c:2903
Ha-ha, shooting ourselves in the foot!
Great sleuthing job. Now we need to think what to do with this.
Hmm...
> If I read the backtrace correctly, something in my custom mode-line is
> triggering Fmatch_data that creates markers.
Yes, you have some :eval form in the mode line, it seems?
Calling xbacktrace will show a Lisp backtrace, which could be
educational here.
> But that code has not changes for years from git log.
>
> One suspicious thing is that my code gets called that much frequently
> (100s of times) by redisplay. Not sure if it is normal.
You cannot predict when redisplay decides to redraw the mode line.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 17:52 ` Eli Zaretskii
@ 2022-12-13 18:03 ` Ihor Radchenko
2022-12-13 20:02 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 18:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> If I read the backtrace correctly, something in my custom mode-line is
>> triggering Fmatch_data that creates markers.
>
> Yes, you have sone :eval form in the mode line, it seems?
Yes. For example, I call
(defun yant/vc-git-current-branch ()
"Get current GIT branch."
(and vc-mode
(cadr (s-match "Git.\\([^ ]+\\)" vc-mode))))
with s-match wrapping its code into save-match-data.
> Calling xbacktrace will show a Lisp backtrace, which could be
> educational here.
(gdb) xbacktrace
Undefined command: "xbacktrace". Try "help".
I am not sure what you mean by xbacktrace.
Also, as Stefan pointed out, the number of markers may or may not be a problem
here. However, I had a similar issue even with Emacs 28 when we tested
creating a huge number of markers in buffer + re-search-forward. I ended
up seeing similar perf logs that time.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 17:43 ` Ihor Radchenko
2022-12-13 17:52 ` Eli Zaretskii
@ 2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-13 18:40 ` Ihor Radchenko
1 sibling, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 18:15 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi
> The benchmark itself does not trigger the breakpoint.
Does that mean that `Fmatch_data` is not called during a single
`re-search-forward` (not a surprise: you'd need to put a breakpoint on
`build_marker` to see the markers built by `buf_bytepos_to_charpos`)
but is called between `re-search-forward`, or that it's not called at
all during the whole benchmark where you perform several
`re-search-forward` which grow progressively slower?
If it's the latter, then those calls can't explain the slowdown, right?
> If I read the backtrace correctly, something in my custom mode-line is
> triggering Fmatch_data that creates markers.
The most common calls to `match-data` are from `save-match-data`.
And most uses of `save-match-data` are ill-advised (as the docstring
explains `save-match-data' should be used to save *your* match data
rather than your caller's match data), so you might like to double check
whether that call to `match-data` can be eliminated altogether.
Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-13 18:40 ` Ihor Radchenko
2022-12-13 19:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 18:40 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> The benchmark itself does not trigger the breakpoint.
>
> Does that mean that `Fmatch_data` is not called during a single
> `re-search-forward` (not a surprise: you'd need to put a breakpoint on
> `build_marker` to see the markers built by `buf_bytepos_to_charpos`)
> but is called between `re-search-forward`, or that it's not called at
> all during the whole benchmark where you perform several
> `re-search-forward` which grow progressively slower?
I do the benchmark via
M-:
(benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
<RET>
The breakpoint triggers after the minibuffer outputs the elapsed time.
During redisplay, AFAIU.
> If it's the latter, then those calls can't explain the slowdown, right?
The slowdown manifests by increasing elapsed time upon subsequent
benchmark calls like the above. So, redisplay may or may not be a part
of it.
I tried to run
(progn (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))) (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))))
4 times:
Elapsed time: 16.399824s
Elapsed time: 17.009694s
nil
Elapsed time: 18.187187s
Elapsed time: 18.597610s
nil
Elapsed time: 18.851388s
Elapsed time: 19.593968s
nil
Elapsed time: 20.194616s
Elapsed time: 20.414686s
nil
Though message may still trigger the redisplay. Not sure if this small
test really reveals anything useful.
Now, with (garbage-collect):
(progn (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))) (garbage-collect) (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))))
Elapsed time: 20.576637s
<GC>
Elapsed time: 15.734101s
Elapsed time: 16.101646s
<GC>
Elapsed time: 16.179796s
Elapsed time: 16.545040s
<GC>
Elapsed time: 16.365847s
Elapsed time: 16.842143s
<GC>
Elapsed time: 16.726615s
So, GC does help somewhat.
Then, if I kill and re-open the Org buffer:
Elapsed time: 72.847256s ;; <- Org just did a bunch of re-search for initial folding and setup
<GC>
Elapsed time: 4.864642s
re-open again, but GC before running the benchmark:
<GC>
Elapsed time: 4.884221s
<GC>
Elapsed time: 4.368755s
>> If I read the backtrace correctly, something in my custom mode-line is
>> triggering Fmatch_data that creates markers.
>
> The most common calls to `match-data` are from `save-match-data`.
> And most uses of `save-match-data` are ill-advised (as the docstring
> explains `save-match-data' should be used to save *your* match data
> rather than your caller's match data), so you might like to double check
> whether that call to `match-data` can be eliminated altogether.
This is coming from s.el. In any case, this implementation detail did
not change as I switched from Emacs 28 to Emacs 29. It is Emacs doing
something less efficiently here.
What I can try to do is replacing s-* functions in my mode-line with
built-ins. Will it help debugging this issue?
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 18:40 ` Ihor Radchenko
@ 2022-12-13 19:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-13 20:21 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 19:55 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi
>> The most common calls to `match-data` are from `save-match-data`.
>> And most uses of `save-match-data` are ill-advised (as the docstring
>> explains `save-match-data' should be used to save *your* match data
>> rather than your caller's match data), so you might like to double check
>> whether that call to `match-data` can be eliminated altogether.
>
> This is coming from s.el. In any case, this implementation detail did
> not change as I switched from Emacs 28 to Emacs 29. It is Emacs doing
> something less efficiently here.
>
> What I can try to do is replacing s-* functions in my mode-line with
> built-ins. Will it help debugging this issue?
I suspect these marker allocations for the mode-line are unrelated to the
actual problem.
Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 18:03 ` Ihor Radchenko
@ 2022-12-13 20:02 ` Eli Zaretskii
2022-12-14 11:40 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 20:02 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 18:03:49 +0000
>
> > Calling xbacktrace will show a Lisp backtrace, which could be
> > educational here.
>
> (gdb) xbacktrace
> Undefined command: "xbacktrace". Try "help".
>
> I am not sure what you mean by xbacktrace.
It's a command we define in src/.gdbinit. Try this:
(gdb) source /path/to/emacs/src/.gdbinit
(gdb) xbacktrace
But do that after catching Fmake_marker call from Fmatch_data, like
you did before.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 19:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-13 20:21 ` Eli Zaretskii
2022-12-14 11:42 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 20:21 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, yantar92, larsi
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 14:55:20 -0500
>
> I suspect these marker allocations for the mode-line are unrelated to the
> actual problem.
If this is true, then re-running the benchmarks after removing those
:eval's from the mode-line-format will still show slowdown with each
benchmark run.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 20:02 ` Eli Zaretskii
@ 2022-12-14 11:40 ` Ihor Radchenko
2022-12-14 13:06 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 11:40 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> I am not sure what you mean by xbacktrace.
>
> It's a command we define in src/.gdbinit. Try this:
>
> (gdb) source /path/to/emacs/src/.gdbinit
> (gdb) xbacktrace
>
> But do that after catching Fmake_marker call from Fmatch_data, like
> you did before.
Ok.
Now, I disabled my custom mode-line and tried to get the backtrace for
Fmake_marker and also build_marker (as suggested by Stefan).
Disabling custom mode-line did not cause any apparent improvement in
performance.
Result:
Breakpoint is still _not_ triggered during benchmark-run call
(benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
build_marker is not triggered, except during redisplay and completion.
Fmake_marker is triggered a dozen times when preparing the M-: prompt and
later a couple of hundred times _after_ executing the benchmark:
Called a couple of hundred times
Lisp Backtrace:
"match-data" (0xf0c02130)
0x59846038 PVEC_COMPILED
"auto-revert-buffers--buffer-list-filter" (0xf0c020b8)
"apply" (0xf0c020b0)
"auto-revert-buffers" (0xf0c02058)
"apply" (0xf0c02050)
"timer-event-handler" (0xffffcd48)
not related.
I will now look into counting the number of for loop cycles, as Stefan
suggested.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 20:21 ` Eli Zaretskii
@ 2022-12-14 11:42 ` Ihor Radchenko
0 siblings, 0 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 11:42 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, Stefan Monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> I suspect these marker allocations for the mode-line are unrelated to the
>> actual problem.
>
> If this is true, then re-running the benchmarks after removing those
> :eval's from the mode-line-format will still show slowdown with each
> benchmark run.
I still see the slowdown after falling back to default mode-line-format.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-12-14 12:00 ` Ihor Radchenko
2022-12-14 12:23 ` Ihor Radchenko
1 sibling, 0 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 12:00 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> My guess: number of markers is growing somehow?
>
> `buf_bytepos_to_charpos` itself creates markers (using them as a cache
> of previous conversions), so that might be why.
>
> But we only look at the first N markers where N*50 is the distance to
> the closest marker found so far. So growth is not sufficient (it's
> clearly a part of the reason, tho).
What about the following degenerate case:
- Most of the buffer markers are located near point-min;
- We are searching for position near point-max;
- point-max is on the order of 21,677,448 (this is the actual file I use for testing)
The number of for loop iterations is then min(21,677,448/50 = ~430k, BUF_MARKERS.size())
Of course, my above argument should not matter in theory, when recent
search matches are cached by build_marker, but my breakpoint on
build_marker was _never_ triggered for some reason.
How can build_marker not be triggered?
From my reading of the code, it happens when the following condition is
false.
bool record = bytepos - best_below_byte > 5000;
I note that this condition will not trigger if all the markers are
above.
On the other hand, this particular condition has been there for the last
25 years or so. Just brainstorming...
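The degenerate case above can be modelled with a small standalone sketch. This is not the code from marker.c: the function name, the array-based marker list, and the use of the buffer ends as the only fixed bounds are assumptions made for illustration. Only the step of 50 comes from the description quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Step by which the "close enough" distance grows per visited marker.
   The value 50 follows the description in the thread; everything else
   in this file is a hypothetical stand-in, not the real marker.c.  */
#define DISTANCE_STEP 50

/* Count how many markers a simplified walk visits before it decides
   that the best known position is close enough to TARGET.  */
static size_t
count_scan_iterations (const long *marker_bytepos, size_t n_markers,
                       long buf_begin, long buf_end, long target)
{
  assert (buf_begin <= target && target <= buf_end);

  long best_below = buf_begin;   /* closest known position <= target */
  long best_above = buf_end;     /* closest known position >= target */
  long distance = DISTANCE_STEP;
  size_t iterations = 0;

  for (size_t i = 0; i < n_markers; i++)
    {
      long pos = marker_bytepos[i];
      iterations++;

      if (pos <= target && pos > best_below)
        best_below = pos;
      if (pos >= target && pos < best_above)
        best_above = pos;

      /* Stop once a known position is within the allowed distance;
         otherwise allow a slightly larger distance and keep walking.  */
      if (best_above - target < distance || target - best_below < distance)
        break;
      distance += DISTANCE_STEP;
    }
  return iterations;
}
```

In this model, 1000 markers clustered near the start of a 21,677,448-byte buffer are all visited when the target sits in the middle of the buffer, while a single marker close to the target ends the walk after one iteration. A target within a few hundred bytes of the buffer end also stops early here, because the buffer end itself serves as a bound in this sketch.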
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-14 12:00 ` Ihor Radchenko
@ 2022-12-14 12:23 ` Ihor Radchenko
2022-12-14 13:10 ` Eli Zaretskii
1 sibling, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 12:23 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> I'd be interested to know how many iterations of the `for (tail =
> BUF_MARKERS (b);...` loop are executed on average during your
> `re-search-forward` (and how that average changes between runs of
> `re-search-forward`).
I did not get around to measuring separate re-search-forward calls, but
the total number of hits to CONSIDER (tail->bytepos, tail->charpos); during
benchmark-run is:
18 breakpoint keep y 0x000055555578be74 in buf_bytepos_to_charpos at marker.c:353
breakpoint already hit 4,245,365 times
Combined with the fact that calling `garbage-collect' between benchmarks
makes the benchmark time nearly constant, this result may or may not
mean something.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 11:40 ` Ihor Radchenko
@ 2022-12-14 13:06 ` Eli Zaretskii
2022-12-14 13:23 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-14 13:06 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Wed, 14 Dec 2022 11:40:37 +0000
>
> build_marker is not triggered, except during redisplay and completion.
> Fmake_marker is triggered a dozen times when preparing the M-: prompt and
> later a couple of hundred times _after_ executing the benchmark:
>
> Called a couple of hundred times
> Lisp Backtrace:
> "match-data" (0xf0c02130)
> 0x59846038 PVEC_COMPILED
> "auto-revert-buffers--buffer-list-filter" (0xf0c020b8)
> "apply" (0xf0c020b0)
> "auto-revert-buffers" (0xf0c02058)
> "apply" (0xf0c02050)
> "timer-event-handler" (0xffffcd48)
>
> not related.
I think I'm confused now: what do you mean by "executing the
benchmark"? I thought the problem was that each "execution of the
benchmark" was slower than the one before it, in which case markers
added between benchmarks _are_ relevant. But you say they aren't?
What did I miss?
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 12:23 ` Ihor Radchenko
@ 2022-12-14 13:10 ` Eli Zaretskii
2022-12-14 13:26 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-14 13:10 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 14 Dec 2022 12:23:50 +0000
>
> 18 breakpoint keep y 0x000055555578be74 in buf_bytepos_to_charpos at marker.c:353
> breakpoint already hit 4,245,365 times
>
> Combined with the fact that calling `garbage-collect' between benchmarks
> makes the benchmark time nearly constant, this result may or may not
> mean something.
Is the "almost constant" time still significantly slower than in
previous versions? Or is it similar?
Anyway, the fact that the time doesn't get worse when you GC between
benchmarks most probably means that we produce a lot of garbage markers
(i.e., temporary markers that very quickly become unreferenced), and
they get in the way of buf_bytepos_to_charpos.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 13:06 ` Eli Zaretskii
@ 2022-12-14 13:23 ` Ihor Radchenko
2022-12-14 13:32 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 13:23 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
> I think I'm confused now: what do you mean by "executing the
> benchmark"? I thought the problem was that each "execution of the
> benchmark" was slower than the one before it, in which case markers
> added between benchmarks _are_ relevant. But you say they aren't?
> What did I miss?
Increasing time of running benchmarks is just a symptom.
The real issue I am experiencing is that re-search-forward becomes
slower as I keep using Emacs. `garbage-collect' helps, but not in the
long term.
Basically, running
M-: (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
- right after starting Emacs is taking 3-4 seconds.
- after several hours -- 10-20 seconds
- in Emacs 28, <1 sec.
Markers may or may not be a problem. If they are, it is not necessarily
related to markers created when I run the benchmarks. May also be some
markers created during the Emacs session.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 13:10 ` Eli Zaretskii
@ 2022-12-14 13:26 ` Ihor Radchenko
2022-12-14 13:57 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 13:26 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> Combined with the fact that calling `garbage-collect' between benchmarks
>> makes the benchmark time nearly constant, this result may or may not
>> mean something.
>
> Is the "almost constant" time still significantly slower than in
> previous versions? Or is it similar?
It is orders of magnitude slower: sub-second in Emacs 28; seconds in
Emacs 29 fresh session; tens of seconds after several hours of Emacs
usage.
> Anyway, the fact that the time doesn't get worse when you GC between
> benchmarks most probably means that we produce a lot of garbage markers
> (i.e., temporary markers that very quickly become unreferenced), and
> they get in the way of buf_bytepos_to_charpos.
Most likely, but it is only part of the problem. If these temporary
markers were the only problem, I would not see gradual performance
degradation as I continue the Emacs session (`garbage-collect` is called
anyway during normal usage).
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 13:23 ` Ihor Radchenko
@ 2022-12-14 13:32 ` Eli Zaretskii
2022-12-14 13:39 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-14 13:32 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Wed, 14 Dec 2022 13:23:02 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > I think I'm confused now: what do you mean by "executing the
> > benchmark"? I thought the problem was that each "execution of the
> > benchmark" was slower than the one before it, in which case markers
> > added between benchmarks _are_ relevant. But you say they aren't?
> > What did I miss?
>
> Increasing time of running benchmarks is just a symptom.
> The real issue I am experiencing is that re-search-forward becomes
> slower as I keep using Emacs. `garbage-collect' helps, but not in the
> long term.
>
> Basically, running
>
> M-: (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
>
> - right after starting Emacs is taking 3-4 seconds.
> - after several hours -- 10-20 seconds
> - in Emacs 28, <1 sec.
>
> Markers may or may not be a problem.
What else could slow down buf_bytepos_to_charpos so much? All it does
is examine markers.
> If they are, it is not necessarily related to markers created when I
> run the benchmarks. May also be some markers created during the
> Emacs session.
Which means massive creation of markers could be the reason,
regardless of what causes such massive creation. Right? But if so,
why did you say that markers created by some timer(s) were not
relevant?
Btw, did you try to compare the number of buffer markers in Emacs 28
and Emacs 29/30, under this scenario, when the search becomes slow
enough?
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 13:32 ` Eli Zaretskii
@ 2022-12-14 13:39 ` Ihor Radchenko
2022-12-14 14:12 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 13:39 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> Markers may or may not be a problem.
>
> What else could slow down buf_bytepos_to_charpos so much? All it does
> is examine markers.
Well. I believe so. But I feel confused. So, I do not exclude other
reasons.
Note that I have little experience with gdb.
>> If they are, it is not necessarily related to markers created when I
>> run the benchmarks. May also be some markers created during the
>> Emacs session.
>
> Which means massive creation of markers could be the reason,
> regardless of what causes such massive creation. Right? But if so,
> why did you say that markers created by some timer(s) were not
> relevant?
Because those came from auto-revert-mode and are unlikely to affect
the single Org buffer I have problems with.
> Btw, did you try to compare the number of buffer markers in Emacs 28
> and Emacs 29/30, under this scenario, when the search becomes slow
> enough?
How can I find the number of buffer markers?
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 13:26 ` Ihor Radchenko
@ 2022-12-14 13:57 ` Eli Zaretskii
2022-12-14 14:01 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-14 13:57 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 14 Dec 2022 13:26:15 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Anyway, the fact that the time doesn't get worse when you GC between
> > benchmarks most probably means that we produce a lot of garbage markers
> > (i.e., temporary markers that very quickly become unreferenced), and
> > they get in the way of buf_bytepos_to_charpos.
>
> Most likely, but it is only part of the problem. If these temporary
> markers were the only problem, I would not see gradual performance
> degradation as I continue the Emacs session (`garbage-collect` is called
> anyway during normal usage).
We've only seen perf profiles for the benchmark, and they point
squarely at buf_bytepos_to_charpos, which AFAIU means markers. To
identify other potential causes, we need to see profiles for other
patterns of usage. For example, profile collected when the benchmark
is run at the beginning of the session compared with profile from
benchmark after several hours. I thought you already posted such a
comparison, and it, too, pointed at buf_bytepos_to_charpos? Which
would probably mean that the number of markers is increasing, albeit
more slowly, even though GC collects some of them.
Did you try to see how the number of markers in the buffer evolves
with the up-time?
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 13:57 ` Eli Zaretskii
@ 2022-12-14 14:01 ` Ihor Radchenko
2023-04-06 11:49 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 14:01 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
> ... For example, profile collected when the benchmark
> is run at the beginning of the session compared with profile from
> benchmark after several hours. I thought you already posted such a
> comparison, and it, too, pointed at buf_bytepos_to_charpos?
Yes. Not exactly. I compared freshly opened buffer vs. after several
hours.
> Did you try to see how the number of markers in the buffer evolves
> with the up-time?
Is there any way to get the number of buffer markers from Elisp?
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 13:39 ` Ihor Radchenko
@ 2022-12-14 14:12 ` Eli Zaretskii
0 siblings, 0 replies; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-14 14:12 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Wed, 14 Dec 2022 13:39:43 +0000
>
> How can I find the number of buffer markers?
Compile Emacs with -DMARKER_DEBUG, and then you can call count_markers
from GDB:
(gdb) print count_markers(current_buffer)
But you need to make sure current_buffer is the buffer you are
interested in. One trick is to do this:
(gdb) break Fredraw_display
and then type "M-x redraw-display" with the buffer in the selected
window. Then call count_markers as above, and it should return the
number of markers in the current buffer.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 9:34 ` Ihor Radchenko
2022-10-16 9:37 ` Lars Ingebrigtsen
@ 2023-02-19 12:17 ` Dmitry Gutov
2023-02-20 10:24 ` Ihor Radchenko
1 sibling, 1 reply; 81+ messages in thread
From: Dmitry Gutov @ 2023-02-19 12:17 UTC (permalink / raw)
To: Ihor Radchenko, Lars Ingebrigtsen; +Cc: 58558
On 16/10/2022 12:34, Ihor Radchenko wrote:
> Lars Ingebrigtsen<larsi@gnus.org> writes:
>
>>> It happens consistently in Emacs 29, but not in all buffers. Sometimes,
>>> it only happens after some time after Emacs startup. The slowdown is not
>>> there in Emacs 28.
>> Is there anything special about buffers where you see these slowdowns?
> This is a large complex Org buffer.
>
It seems like it might be helpful to upload the document somewhere, so
that people can also reproduce it on their own.
Because I tried this with an Org doc lying around, and couldn't see the
problem.
You can probably replace all the characters with X or x to anonymize any
sensitive information, if that's a concern.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-02-19 12:17 ` Dmitry Gutov
@ 2023-02-20 10:24 ` Ihor Radchenko
2023-02-20 14:54 ` Dmitry Gutov
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-02-20 10:24 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 58558, Lars Ingebrigtsen
Dmitry Gutov <dgutov@yandex.ru> writes:
> It seems like it might be helpful to upload the document somewhere, so
> that people can also reproduce it on their own.
Unfortunately not. I can only reproduce using my config.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-02-20 10:24 ` Ihor Radchenko
@ 2023-02-20 14:54 ` Dmitry Gutov
0 siblings, 0 replies; 81+ messages in thread
From: Dmitry Gutov @ 2023-02-20 14:54 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Lars Ingebrigtsen
On 20/02/2023 12:24, Ihor Radchenko wrote:
> Dmitry Gutov<dgutov@yandex.ru> writes:
>
>> It seems like it might be helpful to upload the document somewhere, so
>> that people can also reproduce it on their own.
> Unfortunately not. I can only reproduce using my config.
Bisecting it could also be an option. But this can be a pain, I realize.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-12-14 14:01 ` Ihor Radchenko
@ 2023-04-06 11:49 ` Ihor Radchenko
2023-04-06 12:05 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-06 11:49 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
[-- Attachment #1: Type: text/plain, Size: 1053 bytes --]
Ihor Radchenko <yantar92@posteo.net> writes:
>> Did you try to see how the number of markers in the buffer evolves
>> with the up-time?
>
> Is there any way to get the number of buffer markers from Elisp?
I finally got back to this and implemented a small subr to count
the number of buffer markers:
DEFUN ("buffer-markers", Fbuffer_markers, Sbuffer_markers, 0, 0, 0,
       doc: /* Return the number of markers in current buffer.*/)
  (void)
{
  struct Lisp_Marker *tail;
  int count = 0;
  for (tail = BUF_MARKERS (current_buffer); tail; tail = tail->next)
    count++;
  return make_fixnum (count);
}
Then, I tracked how the number of markers evolves in my problematic
buffer when building agenda, both on master and on Emacs 28 (where the
agenda builds 10x faster).
As you can see on the attached graph, the number of markers is ~1000, and
it is not significantly different for the two Emacs versions.
So, the number of markers itself does not look like the real culprit.
I have no better ideas for now except slowly bisecting Emacs (again).
[-- Attachment #2: marker-count.png --]
[-- Type: image/png, Size: 27350 bytes --]
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-06 11:49 ` Ihor Radchenko
@ 2023-04-06 12:05 ` Eli Zaretskii
2023-04-09 19:54 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-06 12:05 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Thu, 06 Apr 2023 11:49:52 +0000
>
> As you can see on the attached graph, the number of markers is ~1000, and
> it is not significantly different for the two Emacs versions.
>
> So, the number of markers itself does not look like the real culprit.
That's one potential reason down, thanks.
> I have no better ideas for now except slowly bisecting Emacs (again).
I think we should first go back to using perf. I don't think you
compared profiles for Emacs which just started with one that was
running long enough to show the slowdown. Comparing such profiles
should at least give us a hint where to look.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-06 12:05 ` Eli Zaretskii
@ 2023-04-09 19:54 ` Ihor Radchenko
2023-04-10 4:14 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-09 19:54 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
> I think we should first go back to using perf. I don't think you
> compared profiles for Emacs which just started with one that was
> running long enough to show the slowdown. Comparing such profiles
> should at least give us a hint where to look.
I now tried perf record -g.
I was able to narrow down the call tree of the problematic
buf_bytepos_to_charpos calls:
43.82%--Fre_search_forward
  --43.81%--search_command
    --43.78%--search_buffer
      --43.78%--search_buffer_re
        --43.33%--re_search_2
          --36.39%--re_match_2_internal
            --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR
              --21.57%--BYTE_TO_CHAR
                --21.49%--buf_bytepos_to_charpos
Not sure if it is telling much.
I also looked into git history and I can only identify significant
changes in re_match_2_internal after Emacs 28 release.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-09 19:54 ` Ihor Radchenko
@ 2023-04-10 4:14 ` Eli Zaretskii
2023-04-10 12:24 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-10 4:14 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Sun, 09 Apr 2023 19:54:49 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > I think we should first go back to using perf. I don't think you
> > compared profiles for Emacs which just started with one that was
> > running long enough to show the slowdown. Comparing such profiles
> > should at least give us a hint where to look.
>
> I now tried perf record -g.
> I was able to narrow down the call tree of the problematic
> buf_bytepos_to_charpos calls:
>
> 43.82%--Fre_search_forward
> --43.81%--search_command
> --43.78%--search_buffer
> --43.78%--search_buffer_re
> --43.33%--re_search_2
> --36.39%--re_match_2_internal
> --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR
> --21.57%--BYTE_TO_CHAR
> --21.49%--buf_bytepos_to_charpos
>
> Not sure if it is telling much.
How does this compare with a "fast" session doing the same?
And why are you once again focusing on buf_bytepos_to_charpos, when
you previously (presumably) established that it cannot be the problem,
since the number of markers doesn't change significantly?
> I also looked into git history and I can only identify significant
> changes in re_match_2_internal after Emacs 28 release.
It sounds like most of the time is not in re_match_2_internal itself.
But I think comparison with a "fast" session could help with ideas.
Thanks.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2022-10-16 1:26 bug#58558: 29.0.50; re-search-forward is slow in some buffers Ihor Radchenko
2022-10-16 9:19 ` Lars Ingebrigtsen
@ 2023-04-10 8:48 ` Mattias Engdegård
2023-04-10 9:57 ` Ihor Radchenko
1 sibling, 1 reply; 81+ messages in thread
From: Mattias Engdegård @ 2023-04-10 8:48 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, Stefan Monnier
[-- Attachment #1: Type: text/plain, Size: 220 bytes --]
Ihor, would you consider the possibility of regexp cache thrashing? It does occur from time to time; that cache is quite small. Try this instrumentation patch. (We should probably have something like it permanently.)
[-- Attachment #2: 0001-Add-regexp-cache-hit-miss-counters.patch --]
[-- Type: application/octet-stream, Size: 1785 bytes --]
From 978ce66e9bd50da11997aeadcc3508549863a116 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Sat, 7 Nov 2020 17:00:53 +0100
Subject: [PATCH 1/2] Add regexp cache hit/miss counters
---
src/search.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/search.c b/src/search.c
index 4eb634a3c0..358b82da2e 100644
--- a/src/search.c
+++ b/src/search.c
@@ -222,7 +222,10 @@ compile_pattern (Lisp_Object pattern, struct re_registers *regp,
|| EQ (cp->syntax_table, BVAR (current_buffer, syntax_table)))
&& !NILP (Fequal (cp->f_whitespace_regexp, Vsearch_spaces_regexp))
&& cp->buf.charset_unibyte == charset_unibyte)
- break;
+ {
+ regexp_cache_hit++;
+ break;
+ }
/* If we're at the end of the cache, compile into the last
(least recently used) non-busy cell in the cache. */
@@ -234,6 +237,7 @@ compile_pattern (Lisp_Object pattern, struct re_registers *regp,
cp = *cpp;
compile_it:
eassert (!cp->busy);
+ regexp_cache_miss++;
compile_pattern_1 (cp, pattern, translate, posix);
break;
}
@@ -3390,6 +3394,13 @@ syms_of_search (void)
is to bind it with `let' around a small expression. */);
Vinhibit_changing_match_data = Qnil;
+ DEFVAR_INT("regexp-cache-hit", regexp_cache_hit,
+ doc: /* Regexp cache hit count. Internal use only. */);
+ regexp_cache_hit = 0;
+ DEFVAR_INT("regexp-cache-miss", regexp_cache_miss,
+ doc: /* Regexp cache miss count. Internal use only. */);
+ regexp_cache_miss = 0;
+
defsubr (&Slooking_at);
defsubr (&Sposix_looking_at);
defsubr (&Sstring_match);
--
2.21.1 (Apple Git-122.3)
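What cache thrashing would look like in such counters can be shown with a toy model. The sketch below is not the cache from search.c: the size of 4, the string-keyed slots, and the move-to-front bookkeeping are assumptions for illustration. The only property taken from the patch context is that a miss reuses the least recently used cell.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy cache size, chosen small to make thrashing visible.  */
#define TOY_CACHE_SIZE 4

struct toy_cache
{
  const char *slot[TOY_CACHE_SIZE];  /* slot[0] is the most recently used */
  int used;                          /* number of occupied slots */
  long hits;
  long misses;
};

/* Look PATTERN up, update the hit/miss counters, and move the entry
   to the front.  On a miss with a full cache, the last (least
   recently used) slot is overwritten.  */
static void
toy_cache_lookup (struct toy_cache *c, const char *pattern)
{
  assert (c != NULL && pattern != NULL);

  int i;
  for (i = 0; i < c->used; i++)
    if (strcmp (c->slot[i], pattern) == 0)
      break;

  if (i < c->used)
    c->hits++;
  else
    {
      c->misses++;
      if (c->used < TOY_CACHE_SIZE)
        c->used++;
      i = c->used - 1;   /* fresh slot, or the LRU slot to recycle */
    }

  /* Shift slots [0, i) down by one and put PATTERN in front.  */
  memmove (&c->slot[1], &c->slot[0], (size_t) i * sizeof c->slot[0]);
  c->slot[0] = pattern;
}

/* Cycle through PATTERNS in order, ROUNDS times.  */
static void
toy_cache_run (struct toy_cache *c, const char *patterns[],
               int n_patterns, int rounds)
{
  for (int r = 0; r < rounds; r++)
    for (int i = 0; i < n_patterns; i++)
      toy_cache_lookup (c, patterns[i]);
}
```

Cycling through exactly as many patterns as the cache holds misses only on the first round. Cycling through one more pattern than fits makes every lookup miss, which is the thrashing signature the counters are meant to expose.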
^ permalink raw reply related [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 8:48 ` Mattias Engdegård
@ 2023-04-10 9:57 ` Ihor Radchenko
2023-04-10 10:05 ` Mattias Engdegård
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-10 9:57 UTC (permalink / raw)
To: Mattias Engdegård
Cc: 58558, Eli Zaretskii, Ihor Radchenko, Stefan Monnier
Mattias Engdegård <mattias.engdegard@gmail.com> writes:
> Ihor, would you consider the possibility of regexp cache thrashing? It does occur from time to time; that cache is quite small. Try this instrumentation patch. (We should probably have something like it permanently.)
Generating agenda with Emacs master + your patch:
:regexp-cache-hit: 6225399 :regexp-cache-miss: 109490
Emacs 28.3 + your patch:
:regexp-cache-hit: 4968571 :regexp-cache-miss: 79637
Also, I tried to play around with increasing REGEXP_CACHE_SIZE in the
past. It did not make a noticeable difference in my setup.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 9:57 ` Ihor Radchenko
@ 2023-04-10 10:05 ` Mattias Engdegård
0 siblings, 0 replies; 81+ messages in thread
From: Mattias Engdegård @ 2023-04-10 10:05 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, Ihor Radchenko, Stefan Monnier
10 apr. 2023 kl. 11.57 skrev Ihor Radchenko <yantar92@posteo.net>:
> Generating agenda with Emacs master + your patch:
>
> :regexp-cache-hit: 6225399 :regexp-cache-miss: 109490
>
> Emacs 28.3 + your patch:
> :regexp-cache-hit: 4968571 :regexp-cache-miss: 79637
Those miss rates are similar (1.7 % and 1.6 %, respectively) although rather higher than we'd like. Probably no serious regexp cache thrashing going on then, but it was good to be able to exclude it, thank you for humouring me!
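The miss rates can be recomputed from the counters quoted above. This is a minimal sketch, assuming the rate is defined as misses divided by total lookups; the message does not say which definition was used, and the function name is made up for illustration.

```c
#include <assert.h>

/* Fraction of cache lookups that missed: misses / (hits + misses).
   Assumed definition; the thread does not spell it out.  */
static double
miss_rate (long hits, long misses)
{
  assert (hits >= 0 && misses >= 0 && hits + misses > 0);
  return (double) misses / (double) (hits + misses);
}
```

Calling miss_rate (6225399, 109490) gives the figure for master, and miss_rate (4968571, 79637) the one for Emacs 28.3. Multiply by 100 for a percentage.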
> Also, I tried to play around with increasing REGEXP_CACHE_SIZE in the
> past. It does not make noticeable difference in my setup.
Right, that's consistent with the data collected above.
^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 4:14 ` Eli Zaretskii
@ 2023-04-10 12:24 ` Ihor Radchenko
2023-04-10 13:40 ` Eli Zaretskii
2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-10 12:24 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
[-- Attachment #1: Type: text/plain, Size: 1638 bytes --]
Eli Zaretskii <eliz@gnu.org> writes:
>> 43.82%--Fre_search_forward
>> --43.81%--search_command
>> --43.78%--search_buffer
>> --43.78%--search_buffer_re
>> --43.33%--re_search_2
>> --36.39%--re_match_2_internal
>> --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR
>> --21.57%--BYTE_TO_CHAR
>> --21.49%--buf_bytepos_to_charpos
>>
>> Not sure if it is telling much.
>
> How does this compare with a "fast" session doing the same?
"fast" (emacs 28) session does not have this call tree contributing
significantly.
> And why are you once again focusing on buf_bytepos_to_charpos, when
> you previously (presumably) established that it cannot be the problem,
> since the number of markers doesn't change significantly?
We only established that the number of markers cannot be the problem.
However, buf_bytepos_to_charpos still dominates CPU samples (see the
attached) in Emacs master, but not in Emacs 28.
Unless there is some other place in buf_bytepos_to_charpos that may be
slow, the only possible explanation is that it simply gets called more
times.
Then, we are interested in the callers of buf_bytepos_to_charpos. That's
exactly what I provided in the previous message.
>> I also looked into git history and I can only identify significant
>> changes in re_match_2_internal after Emacs 28 release.
>
> It sounds like most of the time is not in re_match_2_internal itself.
> But I think comparison with a "fast" session could help with ideas.
re_match_2_internal calls SYNTAX_TABLE_BYTE_TO_CHAR in a loop. So, if
something strange is happening with the loop, we may be calling
buf_bytepos_to_charpos more.
[-- Attachment #2: emacs-28-report.png --]
[-- Type: image/png, Size: 69416 bytes --]
[-- Attachment #3: emacs-master-report.png --]
[-- Type: image/png, Size: 72939 bytes --]
[-- Attachment #4: Type: text/plain, Size: 224 bytes --]
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 12:24 ` Ihor Radchenko
@ 2023-04-10 13:40 ` Eli Zaretskii
2023-04-10 14:55 ` Ihor Radchenko
2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-10 13:40 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Mon, 10 Apr 2023 12:24:23 +0000
>
> >> 43.82%--Fre_search_forward
> >> --43.81%--search_command
> >> --43.78%--search_buffer
> >> --43.78%--search_buffer_re
> >> --43.33%--re_search_2
> >> --36.39%--re_match_2_internal
> >> --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR
> >> --21.57%--BYTE_TO_CHAR
> >> --21.49%--buf_bytepos_to_charpos
> >>
> >> Not sure if it is telling much.
> >
> > How does this compare with a "fast" session doing the same?
>
> "fast" (emacs 28) session does not have this call tree contributing
> significantly.
Hmm... I thought when you just start a new Emacs session of Emacs 30 it
also is fast, and becomes progressively slower with time? Or am I
confused?
> re_match_2_internal calls SYNTAX_TABLE_BYTE_TO_CHAR in a loop. So, if
> something strange is happening with the loop, we may be calling
> buf_bytepos_to_charpos more.
I believe perf is capable of showing the number of calls as well? Can
you compare the number of calls between the two versions?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 12:24 ` Ihor Radchenko
2023-04-10 13:40 ` Eli Zaretskii
@ 2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-11 11:29 ` Ihor Radchenko
1 sibling, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-10 14:27 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi
>>> 43.82%--Fre_search_forward
>>> --43.81%--search_command
>>> --43.78%--search_buffer
>>> --43.78%--search_buffer_re
>>> --43.33%--re_search_2
>>> --36.39%--re_match_2_internal
>>> --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR
>>> --21.57%--BYTE_TO_CHAR
>>> --21.49%--buf_bytepos_to_charpos
>>>
>>> Not sure if it is telling much.
>> How does this compare with a "fast" session doing the same?
> "fast" (emacs 28) session does not have this call tree contributing
> significantly.
And I thought, we already established around Dec 13 that most of the
time is spent in `buf_bytepos_to_charpos` (in other profiles).
> Unless there is some other place in buf_bytepos_to_charpos that may be
> slow, the only possible explanation is that it simply gets called more
> times.
That would be quite surprising.
BTW, when debugging such performance problems, I often resort to
a few `DEFVAR_INT` defining ad-hoc counter variables, then sprinkle
corresponding increments of those variables from various places
(typically function entry point, loops, ...).
That gives me a kind of "poor man's profiler", but with the advantage
that I can look at their value conveniently from within the affected
Emacs session.
Stefan
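The counter idea can be sketched outside Emacs as follows. This is only a minimal illustration, assuming plain C globals in place of `DEFVAR_INT` variables; the names `convert`, `scan`, `count_convert_calls` and `count_scan_iterations` are hypothetical and do not appear in the Emacs sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ad-hoc counters.  In Emacs these would be DEFVAR_INT
   variables so that they can be read from Lisp; plain globals stand
   in for them here.  */
static long count_convert_calls;    /* entries into convert () */
static long count_scan_iterations;  /* iterations of the loop in scan () */

/* Stand-in for an expensive helper such as a bytepos->charpos lookup.
   Assumes one byte per character.  */
static ptrdiff_t
convert (ptrdiff_t bytepos)
{
  count_convert_calls++;            /* counted at the function entry point */
  return bytepos;
}

/* Stand-in for a matcher loop that only sometimes needs the helper:
   it converts the position of every space and returns the last one.  */
static ptrdiff_t
scan (const char *text, size_t len)
{
  ptrdiff_t last = 0;
  for (size_t i = 0; i < len; i++)
    {
      count_scan_iterations++;      /* counted once per loop iteration */
      if (text[i] == ' ')
        last = convert ((ptrdiff_t) i);
    }
  return last;
}
```

Comparing the two counters after a run shows whether the helper is slow per call or simply called often, which is the question a sampling profile alone does not answer.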
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 13:40 ` Eli Zaretskii
@ 2023-04-10 14:55 ` Ihor Radchenko
2023-04-10 16:04 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-10 14:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> > How does this compare with a "fast" session doing the same?
>>
>> "fast" (emacs 28) session does not have this call tree contributing
>> significantly.
>
> Hmm... I thought when you just start a new Emacs session of Emacs 30 it
> also is fast, and becomes progressively slower with time? Or am I
> confused?
My original bug report is about agenda generation being slow because of
re-search-forward slowdown. Later, I tried to simplify the recipe and
found that direct calls to re-search-forward become slower over time
(but still with my setup).
Originally, agenda generation is slower on master compared to Emacs 28
even right after startup.
In my last message and perf data, I have been looking into agenda generation.
> I believe perf is capable of showing the number of calls as well? Can
> you compare the number of calls between the two versions?
I can only see
https://www.brendangregg.com/blog/2014-07-03/perf-counting.html, but it
appears to be only for built-in events. Do you know how to count calls
to specific function using perf? I am not familiar at all with perf.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 14:55 ` Ihor Radchenko
@ 2023-04-10 16:04 ` Eli Zaretskii
0 siblings, 0 replies; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-10 16:04 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Mon, 10 Apr 2023 14:55:09 +0000
>
> > I believe perf is capable of showing the number of calls as well? Can
> > you compare the number of calls between the two versions?
>
> I can only see
> https://www.brendangregg.com/blog/2014-07-03/perf-counting.html, but it
> appears to be only for built-in events. Do you know how to count calls
> to specific function using perf? I am not familiar at all with perf.
I thought that was part of the profile?
But if not, then maybe Stefan's "poor-man's counters" will be an
easier device for answering that particular question: just increment
it before every call to SYNTAX_TABLE_BYTE_TO_CHAR that you find inside
re_match_2_internal, and then compare the counts with Emacs 28.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-11 11:29 ` Ihor Radchenko
2023-04-11 11:51 ` Eli Zaretskii
0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-11 11:29 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi
[-- Attachment #1: Type: text/plain, Size: 399 bytes --]
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> BTW, when debugging such performance problems, I often resort to
> a few `DEFVAR_INT` defining ad-hoc counter variables, then sprinkle
> corresponding increments of those variables from various places
> (typically function entry point, loops, ...).
Well. I just tried, but my Emacs-C foo is not good enough.
The attached patch fails to compile.
[-- Attachment #2: 0001-add-debug-vars.patch --]
[-- Type: text/x-patch, Size: 3962 bytes --]
From ac15ad3262ddf0a0bf459dc603cb79f7f9c737f7 Mon Sep 17 00:00:00 2001
Message-Id: <ac15ad3262ddf0a0bf459dc603cb79f7f9c737f7.1681212491.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Tue, 11 Apr 2023 13:27:56 +0200
Subject: [PATCH] add debug vars
---
src/regex-emacs.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/src/regex-emacs.c b/src/regex-emacs.c
index 2571812cb39..6bcc64d5c0a 100644
--- a/src/regex-emacs.c
+++ b/src/regex-emacs.c
@@ -3889,6 +3889,25 @@ unwind_re_match (void *ptr)
b->text->inhibit_shrinking = 0;
}
+DEFVAR_INT("re-match-2-internal-bytepos-calls-1", re_match_2_internal_bytepos_calls_1,
+ doc: /* Call count 1. Internal use only. */);
+DEFVAR_INT("re-match-2-internal-bytepos-calls-2", re_match_2_internal_bytepos_calls_2,
+ doc: /* Call count 1. Internal use only. */);
+DEFVAR_INT("re-match-2-internal-bytepos-calls-3", re_match_2_internal_bytepos_calls_3,
+ doc: /* Call count 1. Internal use only. */);
+DEFVAR_INT("re-match-2-internal-bytepos-calls-4", re_match_2_internal_bytepos_calls_4,
+ doc: /* Call count 1. Internal use only. */);
+DEFVAR_INT("re-match-2-internal-bytepos-calls-5", re_match_2_internal_bytepos_calls_5,
+ doc: /* Call count 1. Internal use only. */);
+DEFVAR_INT("re-match-2-internal-bytepos-calls-6", re_match_2_internal_bytepos_calls_6,
+ doc: /* Call count 1. Internal use only. */);
+re_match_2_internal_bytepos_calls_1 = 0;
+re_match_2_internal_bytepos_calls_2 = 0;
+re_match_2_internal_bytepos_calls_3 = 0;
+re_match_2_internal_bytepos_calls_4 = 0;
+re_match_2_internal_bytepos_calls_5 = 0;
+re_match_2_internal_bytepos_calls_6 = 0;
+
/* This is a separate function so that we can force an alloca cleanup
afterwards. */
static ptrdiff_t
@@ -4808,6 +4827,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int dummy;
ptrdiff_t offset = PTR_TO_OFFSET (d);
ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
+ re_match_2_internal_bytepos_calls_1++;
UPDATE_SYNTAX_TABLE (charpos);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
@@ -4848,6 +4868,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int dummy;
ptrdiff_t offset = PTR_TO_OFFSET (d);
ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset);
+ re_match_2_internal_bytepos_calls_2++;
UPDATE_SYNTAX_TABLE (charpos);
PREFETCH ();
GET_CHAR_AFTER (c2, d, dummy);
@@ -4891,6 +4912,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int dummy;
ptrdiff_t offset = PTR_TO_OFFSET (d);
ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
+ re_match_2_internal_bytepos_calls_3++;
UPDATE_SYNTAX_TABLE (charpos);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
@@ -4933,6 +4955,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
ptrdiff_t offset = PTR_TO_OFFSET (d);
ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset);
+ re_match_2_internal_bytepos_calls_4++;
UPDATE_SYNTAX_TABLE (charpos);
PREFETCH ();
c2 = RE_STRING_CHAR (d, target_multibyte);
@@ -4974,6 +4997,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
ptrdiff_t offset = PTR_TO_OFFSET (d);
ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
+ re_match_2_internal_bytepos_calls_5++;
UPDATE_SYNTAX_TABLE (charpos);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
@@ -5010,6 +5034,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
{
ptrdiff_t offset = PTR_TO_OFFSET (d);
ptrdiff_t pos1 = SYNTAX_TABLE_BYTE_TO_CHAR (offset);
+ re_match_2_internal_bytepos_calls_6++;
UPDATE_SYNTAX_TABLE (pos1);
}
{
--
2.40.0
[-- Attachment #3: Type: text/plain, Size: 224 bytes --]
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-11 11:29 ` Ihor Radchenko
@ 2023-04-11 11:51 ` Eli Zaretskii
2023-04-12 13:39 ` Ihor Radchenko
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-11 11:51 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier
> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Tue, 11 Apr 2023 11:29:26 +0000
>
> Well. I just tried, but my Emacs-C foo is not good enough.
> The attached patch fails to compile.
That's because you've put DEFVAR_INT outside of any function. They
should be inside one of the syms_of_* functions instead.
regex-emacs.c doesn't have such a function, but search.c does. So
just move those DEFVAR_INT lines to syms_of_search, and I think it
will work.
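The underlying C rule can be shown with a small mock. This is a sketch under stated assumptions, not Emacs code: `MOCK_DEFVAR_INT`, `register_int_var`, `lookup_int_var` and `syms_of_demo` are hypothetical names, and the mock declares its storage by hand, which may differ from how Emacs arranges it. The only point is that a macro expanding to a statement, like an assignment, has to sit inside a function body.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature of a variable registry; not Emacs's real one.  */
struct int_var
{
  const char *lisp_name;
  long *address;
};
static struct int_var registry[8];
static int registry_len;

static void
register_int_var (const char *lisp_name, long *address)
{
  assert (registry_len < 8);
  registry[registry_len].lisp_name = lisp_name;
  registry[registry_len].address = address;
  registry_len++;
}

/* Expands to a statement (a function call), so it is only valid
   inside a function body, never at file scope.  */
#define MOCK_DEFVAR_INT(lname, vname) register_int_var (lname, &vname)

/* Storage may live at file scope ...  */
static long bytepos_calls_1;
static long bytepos_calls_2;

/* ... but registration and the "= 0" assignments are statements, so
   they go into an init function, analogous to syms_of_search.  */
static void
syms_of_demo (void)
{
  MOCK_DEFVAR_INT ("bytepos-calls-1", bytepos_calls_1);
  MOCK_DEFVAR_INT ("bytepos-calls-2", bytepos_calls_2);
  bytepos_calls_1 = 0;
  bytepos_calls_2 = 0;
}

/* Find a registered variable by name, or return NULL.  */
static long *
lookup_int_var (const char *lisp_name)
{
  for (int i = 0; i < registry_len; i++)
    if (strcmp (registry[i].lisp_name, lisp_name) == 0)
      return registry[i].address;
  return NULL;
}
```

Moving either the macro invocations or the assignments out of `syms_of_demo` to file scope makes the compiler reject the file, which matches the failure seen with the attached patch.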
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-11 11:51 ` Eli Zaretskii
@ 2023-04-12 13:39 ` Ihor Radchenko
2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 4:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-12 13:39 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier
Eli Zaretskii <eliz@gnu.org> writes:
>> Well. I just tried, but my Emacs-C foo is not good enough.
>> The attached patch fails to compile.
>
> That's because you've put DEFVAR_INT outside of any function. They
> should be inside one of the syms_of_* functions instead.
> regex-emacs.c doesn't have such a function, but search.c does. So
> just move those DEFVAR_INT lines to syms_of_search, and I think it
> will work.
Thanks!
I now managed to define these variables + also a counter inside
buf_bytepos_to_charpos.
The results are interesting.
The call count for each SYNTAX_TABLE_BYTE_TO_CHAR inside
re_match_2_internal (there are 6 places where it is called):
- master :: 28 5011460 20 96 285 539911
- Emacs 28 :: 68 5015326 26 397 1404 558585
Master has fewer calls...
This was weird, so I also added a counter inside buf_bytepos_to_charpos:
- master :: 6,304,522
- Emacs 28 :: 593,430
Now, it is clear that it is something in SYNTAX_TABLE_BYTE_TO_CHAR that
triggers buf_bytepos_to_charpos more on master compared to Emacs 28.
I looked into the code:
INLINE ptrdiff_t
SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t bytepos)
{
return (! parse_sexp_lookup_properties
? 0
...
}
parse_sexp_lookup_properties looks suspicious, so I checked the value of
parse-sexp-lookup-properties in Org files on master vs. Emacs 28.
On master, the value is t, even though Org mode does not set this
variable. On Emacs 28, the value is nil.
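To make the effect of that guard concrete, here is a minimal mock, assuming only the first branch of the function quoted above. The names `lookup_properties`, `mock_bytepos_to_charpos`, `mock_syntax_table_byte_to_char` and `conversions_for` are hypothetical; the elided branches are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool lookup_properties;  /* mock of parse_sexp_lookup_properties */
static long conversions;        /* mock counter inside the slow path */

/* Mock slow path: counts calls; assumes one byte per character.  */
static ptrdiff_t
mock_bytepos_to_charpos (ptrdiff_t bytepos)
{
  conversions++;
  return bytepos;
}

/* Mock of the guard: with the flag off, the conversion is skipped
   entirely and 0 is returned.  */
static ptrdiff_t
mock_syntax_table_byte_to_char (ptrdiff_t bytepos)
{
  return !lookup_properties ? 0 : mock_bytepos_to_charpos (bytepos);
}

/* Call the guard N times with the given flag value and report how
   many slow-path conversions resulted.  */
static long
conversions_for (bool flag, long n)
{
  lookup_properties = flag;
  conversions = 0;
  for (long i = 0; i < n; i++)
    mock_syntax_table_byte_to_char ((ptrdiff_t) i);
  return conversions;
}
```

With the same number of call sites reached, the flag alone decides whether the slow path runs at all, which is consistent with similar per-site counts but a roughly tenfold difference inside buf_bytepos_to_charpos.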
I looked further and narrowed things down to helpful package in my
config, where the culprit is (require 'cc-langs).
It looks like for some reason cc-langs changes the default value of
parse-sexp-lookup-properties globally!
Recipe:
1. emacs -Q
2. M-: (require 'cc-langs) <RET>
3. C-x b asd <RET>
4. M-: parse-sexp-lookup-properties <RET> => t
On Emacs 28, (4) yields nil.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 13:39 ` Ihor Radchenko
@ 2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 14:30 ` Eli Zaretskii
` (2 more replies)
2023-04-13 4:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 3 replies; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 14:06 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: 58558, larsi, Ihor Radchenko, Eli Zaretskii
> 1. emacs -Q
> 2. M-: (require 'cc-langs) <RET>
> 3. C-x b asd <RET>
> 4. M-: parse-sexp-lookup-properties <RET> => t
>
> On Emacs 28, (4) yields nil.
I suspect that the patch below might fix the immediate problem.
Of course, setting `parse-sexp-lookup-properties` should not have such
a major performance impact, so maybe we should keep digging into
the problem.
Stefan
diff --git a/lisp/progmodes/cc-defs.el b/lisp/progmodes/cc-defs.el
index aa6f33e9cab..92ab0c02de1 100644
--- a/lisp/progmodes/cc-defs.el
+++ b/lisp/progmodes/cc-defs.el
@@ -2153,20 +2153,13 @@ c-emacs-features
;; Record whether the `category' text property works.
(if c-use-category (setq list (cons 'category-properties list)))
- (let ((buf (generate-new-buffer " test"))
- parse-sexp-lookup-properties
- parse-sexp-ignore-comments
- lookup-syntax-properties) ; XEmacs
- (with-current-buffer buf
+ (with-current-buffer (generate-new-buffer " test")
+ ;; Do the let-binding in the right buffer, in case they're buffer-local.
+ (let ((parse-sexp-lookup-properties t)
+ (parse-sexp-ignore-comments t)
+ (lookup-syntax-properties t)) ; XEmacs
(set-syntax-table (make-syntax-table))
- ;; For some reason we have to set some of these after the
- ;; buffer has been made current. (Specifically,
- ;; `parse-sexp-ignore-comments' in Emacs 21.)
- (setq parse-sexp-lookup-properties t
- parse-sexp-ignore-comments t
- lookup-syntax-properties t)
-
;; Find out if the `syntax-table' text property works.
(modify-syntax-entry ?< ".")
(modify-syntax-entry ?> ".")
@@ -2231,8 +2224,8 @@ c-emacs-features
(if (bobp)
(setq list (cons 'col-0-paren list)))))
- (set-buffer-modified-p nil))
- (kill-buffer buf))
+ (set-buffer-modified-p nil)
+ (kill-buffer (current-buffer))))
;; Check how many elements `parse-partial-sexp' returns.
(let ((ppss-size (or (c-safe (length
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-12 14:30 ` Eli Zaretskii
2023-04-12 14:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
` (2 more replies)
2023-04-12 14:39 ` Ihor Radchenko
2023-04-12 18:31 ` Alan Mackenzie
2 siblings, 3 replies; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-12 14:30 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, acm, yantar92, larsi
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ihor Radchenko <yantar92@posteo.net>, Eli Zaretskii <eliz@gnu.org>,
> larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 12 Apr 2023 10:06:03 -0400
>
> > 1. emacs -Q
> > 2. M-: (require 'cc-langs) <RET>
> > 3. C-x b asd <RET>
> > 4. M-: parse-sexp-lookup-properties <RET> => t
> >
> > On Emacs 28, (4) yields nil.
>
> I suspect that the patch below might fix the immediate problem.
> Of course, setting `parse-sexp-lookup-properties` should not have such
> a major performance impact, so maybe we should keep digging into
> the problem.
Also, that code was there in Emacs 28 as well, so how come it suddenly
has this effect now?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:30 ` Eli Zaretskii
@ 2023-04-12 14:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 15:22 ` Eli Zaretskii
2023-04-12 14:38 ` Stephen Berman
2023-04-12 14:42 ` Ihor Radchenko
2 siblings, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 14:38 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, acm, yantar92, larsi
>> > 1. emacs -Q
>> > 2. M-: (require 'cc-langs) <RET>
>> > 3. C-x b asd <RET>
>> > 4. M-: parse-sexp-lookup-properties <RET> => t
>> >
>> > On Emacs 28, (4) yields nil.
>>
>> I suspect that the patch below might fix the immediate problem.
>> Of course, setting `parse-sexp-lookup-properties` should not have such
>> a major performance impact, so maybe we should keep digging into
>> the problem.
>
> Also, that code was there in Emacs 28 as well, so how come it suddenly
> has this effect now?
The effect of the code depends on whether the buffer that's current when
`cc-defs.el` is loaded has set `parse-sexp-lookup-properties`
buffer-locally or not.
I don't have Emacs-28 at hand, but the value of
`parse-sexp-lookup-properties` in *scratch* is (buffer-local) t in
Emacs-29 and (global) nil in Emacs-27.
Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:30 ` Eli Zaretskii
2023-04-12 14:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-12 14:38 ` Stephen Berman
2023-04-12 14:42 ` Ihor Radchenko
2 siblings, 0 replies; 81+ messages in thread
From: Stephen Berman @ 2023-04-12 14:38 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, acm, yantar92, larsi, Stefan Monnier
On Wed, 12 Apr 2023 17:30:36 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Ihor Radchenko <yantar92@posteo.net>, Eli Zaretskii <eliz@gnu.org>,
>> larsi@gnus.org, 58558@debbugs.gnu.org
>> Date: Wed, 12 Apr 2023 10:06:03 -0400
>>
>> > 1. emacs -Q
>> > 2. M-: (require 'cc-langs) <RET>
>> > 3. C-x b asd <RET>
>> > 4. M-: parse-sexp-lookup-properties <RET> => t
>> >
>> > On Emacs 28, (4) yields nil.
>>
>> I suspect that the patch below might fix the immediate problem.
>> Of course, setting `parse-sexp-lookup-properties` should not have such
>> a major performance impact, so maybe we should keep digging into
>> the problem.
>
> Also, that code was there in Emacs 28 as well, so how come it suddenly
> has this effect now?
Note that, with emacs-28 -Q, `C-h v parse-sexp-lookup-properties' ==>
parse-sexp-lookup-properties is a variable defined in ‘C source code’.
Its value is nil
while with emacs-29 -Q, `C-h v parse-sexp-lookup-properties' ==>
parse-sexp-lookup-properties is a variable defined in ‘C source code’.
Its value is t
Local in buffer *scratch*; global value is nil
Steve Berman
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 14:30 ` Eli Zaretskii
@ 2023-04-12 14:39 ` Ihor Radchenko
2023-04-12 15:20 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 18:31 ` Alan Mackenzie
2 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-12 14:39 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> I suspect that the patch below might fix the immediate problem.
I confirm that it does fix the problem. But why not `with-temp-buffer'?
Also, how come `setq' changes the global variable value even though it
is let-bound?
> Of course, setting `parse-sexp-lookup-properties` should not have such
> a major performance impact, so maybe we should keep digging into
> the problem.
Agree. I was considering `parse-sexp-lookup-properties' in Org, but this
issue will be a blocker.
To improve the performance, the two obvious ways are reducing the number
of SYNTAX_TABLE_BYTE_TO_CHAR calls in re_match_2_internal and speeding
up buf_bytepos_to_charpos. I'd prefer the latter as it is used
ubiquitously across Emacs and making point lookup faster will thus
benefit other places as well.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:30 ` Eli Zaretskii
2023-04-12 14:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 14:38 ` Stephen Berman
@ 2023-04-12 14:42 ` Ihor Radchenko
2 siblings, 0 replies; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-12 14:42 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, acm, larsi, Stefan Monnier
Eli Zaretskii <eliz@gnu.org> writes:
> Also, that code was there in Emacs 28 as well, so how come it suddenly
> has this effect now?
Random guess: cc-langs.el loads cc-defs via (cc-require 'cc-defs).
`cc-require' is doing something extremely tricky with byte compilation.
Could Emacs 29 have some subtle changes in byte code that have an influence?
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:39 ` Ihor Radchenko
@ 2023-04-12 15:20 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 23:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 15:20 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi
> I confirm that it does fix the problem. But why not `with-temp-buffer'?
I think it's for compatibility with TECO Emacs or something like that :-)
> Also, how come `setq' changes the global variable value even though it
> is let-bound?
Because the `let` and the `setq` were not performed in the same buffer,
so if the var is buffer-local ...
> To improve the performance, the two obvious ways are reducing the number
> of SYNTAX_TABLE_BYTE_TO_CHAR calls in re_match_2_internal and speeding
> up buf_bytepos_to_charpos.
I think the behavior you experience doesn't require "speeding up" but it
requires "fixing a performance bug". Technically it's the same, but still..
> I'd prefer the latter as it is used ubiquitously across Emacs and
> making point lookup faster will thus benefit other places as well.
Why choose?
For the former, we could probably extend the `b_property` and
`e_property` fields of `gl_state` (which hold charpos) to also store
their bytepos equivalent, which should significantly reduce the number
of conversions between bytepos and charpos.
Stefan
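A rough sketch of that caching idea, outside Emacs: the struct below is a hypothetical mock, not the real `gl_state`, and the `*_byte` fields are the assumed additions. It treats the cached interval as half-open and assumes every character occupies two bytes, purely to keep the arithmetic visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock of the cached property interval.  */
struct mock_gl_state
{
  ptrdiff_t b_property;       /* first charpos where the cache is valid */
  ptrdiff_t e_property;       /* first charpos where it is no longer valid */
  ptrdiff_t b_property_byte;  /* assumed new field: bytepos of b_property */
  ptrdiff_t e_property_byte;  /* assumed new field: bytepos of e_property */
};

static long conversions;      /* counts mock bytepos->charpos conversions */

/* Mock conversion; assumes every character is exactly two bytes.  */
static ptrdiff_t
mock_byte_to_char (ptrdiff_t bytepos)
{
  conversions++;
  return bytepos / 2;
}

/* Charpos-only cache: every range check pays for a conversion.  */
static bool
in_range_via_charpos (const struct mock_gl_state *s, ptrdiff_t bytepos)
{
  ptrdiff_t charpos = mock_byte_to_char (bytepos);
  return s->b_property <= charpos && charpos < s->e_property;
}

/* With byte bounds cached as well: the check needs no conversion.  */
static bool
in_range_via_bytepos (const struct mock_gl_state *s, ptrdiff_t bytepos)
{
  return s->b_property_byte <= bytepos && bytepos < s->e_property_byte;
}
```

Both checks give the same answer, but only the first one converts; in a matcher that tests the cached range at every position, that difference is where the saved conversions would come from.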
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-12 15:22 ` Eli Zaretskii
2023-04-12 15:59 ` Alan Mackenzie
0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-12 15:22 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, acm, yantar92, larsi
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de, yantar92@posteo.net, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 12 Apr 2023 10:38:50 -0400
>
> > Also, that code was there in Emacs 28 as well, so how come it suddenly
> > has this effect now?
>
> The effect of the code depends on whether the buffer that's current when
> `cc-defs.el` is loaded has set `parse-sexp-lookup-properties`
> buffer-locally or not.
>
> I don't have Emacs-28 at hand, but the value of
> `parse-sexp-lookup-properties` in *scratch* is (buffer-local) t in
> Emacs-29 and (global) nil in Emacs-27.
Ah, okay. So in Emacs 29 we started setting this variable locally in
some buffers? Do you happen to know where's the change which caused
that, and why was it done?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 15:22 ` Eli Zaretskii
@ 2023-04-12 15:59 ` Alan Mackenzie
0 siblings, 0 replies; 81+ messages in thread
From: Alan Mackenzie @ 2023-04-12 15:59 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, yantar92, Stefan Monnier
Hello, Eli.
On Wed, Apr 12, 2023 at 18:22:22 +0300, Eli Zaretskii wrote:
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: acm@muc.de, yantar92@posteo.net, larsi@gnus.org, 58558@debbugs.gnu.org
> > Date: Wed, 12 Apr 2023 10:38:50 -0400
> > > Also, that code was there in Emacs 28 as well, so how come it suddenly
> > > has this effect now?
> > The effect of the code depends on whether the buffer that's current when
> > `cc-defs.el` is loaded has set `parse-sexp-lookup-properties`
> > buffer-locally or not.
> > I don't have Emacs-28 at hand, but the value of
> > `parse-sexp-lookup-properties` in *scratch* is (buffer-local) t in
> > Emacs-29 and (global) nil in Emacs-27.
> Ah, okay. So in Emacs 29 we started setting this variable locally in
> some buffers? Do you happen to know where's the change which caused
> that, and why was it done?
I suspect this commit as the cause:
commit 6ccc4b6bc8a14daca6b3e3250574752c90c1eb9b
Author: Noam Postavsky <npostavs@gmail.com>
Date: Fri May 6 18:31:00 2022 +0200
Handle elisp #-syntax better in Emacs Lisp mode
* elisp-mode.el (elisp-mode-syntax-propertize): New function.
(emacs-lisp-mode): Set it as syntax-propertize-function (bug#15998).
Lisp Interaction Mode is derived from Emacs Lisp Mode. Whenever there
is a non-nil syntax-propertize-function, run-mode-hooks sets
parse-sexp-lookup-properties to t.
This is probably harmless in *scratch*.
--
Alan Mackenzie (Nuremberg, Germany).
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 14:30 ` Eli Zaretskii
2023-04-12 14:39 ` Ihor Radchenko
@ 2023-04-12 18:31 ` Alan Mackenzie
2023-04-12 23:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2 siblings, 1 reply; 81+ messages in thread
From: Alan Mackenzie @ 2023-04-12 18:31 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, larsi, Ihor Radchenko, Eli Zaretskii
Hello, Stefan.
On Wed, Apr 12, 2023 at 10:06:03 -0400, Stefan Monnier wrote:
> > 1. emacs -Q
> > 2. M-: (require 'cc-langs) <RET>
> > 3. C-x b asd <RET>
> > 4. M-: parse-sexp-lookup-properties <RET> => t
> > On Emacs 28, (4) yields nil.
> I suspect that the patch below might fix the immediate problem.
> Of course, setting `parse-sexp-lookup-properties` should not have such
> a major performance impact, so maybe we should keep digging into
> the problem.
Thanks! That's a nasty little bug for somebody who hasn't seen it
before. I don't think the Elisp manual is all that explicit about what
happens in such cases.
Just as a matter of interest, have you searched cc-defs.el for any other
places the same bug might occur? If not, I will.
I suggest I apply your patch now-ish.
> Stefan
[ .... ]
--
Alan Mackenzie (Nuremberg, Germany).
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 15:20 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-12 23:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 4:33 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 4:52 ` Eli Zaretskii
0 siblings, 2 replies; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 23:23 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi
[-- Attachment #1: Type: text/plain, Size: 407 bytes --]
> For the former, we could probably extend the `b_property` and
> `e_property` fields of `gl_state` (which hold charpos) to also store
> their bytepos equivalent, which should significantly reduce the number
> of conversions between bytepos and charpos.
I.e. something like the patch below (which passes all tests except for
`test/src/comp-tests` for a reason that completely escapes me).
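The idea quoted above can be illustrated in isolation. The following is a minimal sketch, not code from the patch: `struct interval_cache`, `byte_to_char`, `in_interval_by_char` and `in_interval_by_byte` are hypothetical names, and the naive UTF-8 scan only stands in for the real byte<->char conversion. It shows that once the valid interval is also stored as byte offsets, the "is this position still covered?" test needs no conversion at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of gl_state: one interval
   over which the cached syntax information is valid, stored both as
   character positions and as byte offsets.  */
struct interval_cache
{
  ptrdiff_t b_char, e_char;  /* valid for b_char <= charpos < e_char */
  ptrdiff_t b_byte, e_byte;  /* the same interval as byte offsets */
};

/* Number of byte->char conversions performed so far.  */
static int conversions;

/* Convert BYTEOFFSET in UTF-8 TEXT to a character index by scanning
   from the start.  Deliberately naive, so that each call is costly.  */
static ptrdiff_t
byte_to_char (const char *text, ptrdiff_t byteoffset)
{
  ptrdiff_t chars = 0;
  conversions++;
  for (ptrdiff_t i = 0; i < byteoffset; i++)
    if (((unsigned char) text[i] & 0xC0) != 0x80) /* skip continuation bytes */
      chars++;
  return chars;
}

/* Char-based check: must convert on every call.  */
static int
in_interval_by_char (const struct interval_cache *c, const char *text,
                     ptrdiff_t byteoffset)
{
  ptrdiff_t charpos = byte_to_char (text, byteoffset);
  return c->b_char <= charpos && charpos < c->e_char;
}

/* Byte-based check: compares against the stored byte bounds only.  */
static int
in_interval_by_byte (const struct interval_cache *c, ptrdiff_t byteoffset)
{
  return c->b_byte <= byteoffset && byteoffset < c->e_byte;
}
```

For the text "h\xC3\xA9llo w\xC3\xB6rld" and a cache covering characters [2, 8), i.e. bytes [3, 10), both checks agree, but only the char-based one increments `conversions`. The conversion is still needed when the position leaves the interval and the cache has to be refreshed.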
Stefan
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: regmatch.patch --]
[-- Type: text/x-diff, Size: 13157 bytes --]
diff --git a/src/regex-emacs.c b/src/regex-emacs.c
index 746779490ad..f75f805cd9c 100644
--- a/src/regex-emacs.c
+++ b/src/regex-emacs.c
@@ -3979,7 +3979,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
/* Prevent shrinking and relocation of buffer text if GC happens
while we are inside this function. The calls to
- UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
+ RE_UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
`internal--syntax-propertize`); these calls are careful to defend against
buffer modifications, but even with no modifications, the buffer text may
be relocated during GC by `compact_buffer` which would invalidate
@@ -4792,12 +4792,11 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
int dummy;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
s1 = SYNTAX (c1);
- UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
PREFETCH_NOLIMIT ();
GET_CHAR_AFTER (c2, d, dummy);
nchars++;
@@ -4832,8 +4831,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
int dummy;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE (offset);
PREFETCH ();
GET_CHAR_AFTER (c2, d, dummy);
nchars++;
@@ -4848,7 +4846,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
{
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
- UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
s1 = SYNTAX (c1);
/* ... and S1 is Sword, and WORD_BOUNDARY_P (C1, C2)
@@ -4875,8 +4873,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
int dummy;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
s1 = SYNTAX (c1);
@@ -4891,13 +4888,13 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
PREFETCH_NOLIMIT ();
GET_CHAR_AFTER (c2, d, dummy);
nchars++;
- UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
s2 = SYNTAX (c2);
/* ... and S2 is Sword, and WORD_BOUNDARY_P (C1, C2)
returns 0. */
if ((s2 == Sword) && !WORD_BOUNDARY_P (c1, c2))
- goto fail;
+ goto fail;
}
}
break;
@@ -4917,8 +4914,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int c1, c2;
int s1, s2;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE (offset);
PREFETCH ();
c2 = RE_STRING_CHAR (d, target_multibyte);
nchars++;
@@ -4933,7 +4929,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
{
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
- UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
s1 = SYNTAX (c1);
/* ... and S1 is Sword or Ssymbol. */
@@ -4958,8 +4954,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int c1, c2;
int s1, s2;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
s1 = SYNTAX (c1);
@@ -4974,7 +4969,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
PREFETCH_NOLIMIT ();
c2 = RE_STRING_CHAR (d, target_multibyte);
nchars++;
- UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
s2 = SYNTAX (c2);
/* ... and S2 is Sword or Ssymbol. */
@@ -4994,8 +4989,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
PREFETCH ();
{
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t pos1 = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
- UPDATE_SYNTAX_TABLE (pos1);
+ RE_UPDATE_SYNTAX_TABLE (offset);
}
{
int len;
diff --git a/src/syntax.c b/src/syntax.c
index e9e04e2d638..0245038dc2d 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -250,6 +250,8 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
gl_state.b_property = BEGV;
gl_state.e_property = ZV + 1;
gl_state.object = Qnil;
+ gl_state.b_re_byte = -1;
+ gl_state.e_re_byte = -1;
if (parse_sexp_lookup_properties)
{
if (count > 0)
@@ -268,11 +270,12 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
FROMBYTE is an regexp-byteoffset. */
void
-RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
- ptrdiff_t frombyte)
+RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object, ptrdiff_t frombyte)
{
SETUP_BUFFER_SYNTAX_TABLE ();
gl_state.object = object;
+ gl_state.b_re_byte = -1;
+ gl_state.e_re_byte = -1;
if (BUFFERP (gl_state.object))
{
struct buffer *buf = XBUFFER (gl_state.object);
@@ -282,7 +285,7 @@ RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
else if (NILP (gl_state.object))
{
gl_state.b_property = BEG;
- gl_state.e_property = ZV; /* FIXME: Why not +1 like in SETUP_SYNTAX_TABLE? */
+ gl_state.e_property = ZV;
}
else if (EQ (gl_state.object, Qt))
{
@@ -295,8 +298,11 @@ RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
gl_state.e_property = 1 + SCHARS (gl_state.object);
}
if (parse_sexp_lookup_properties)
- update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
- 1, 1, gl_state.object);
+ {
+ update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
+ 1, 1, gl_state.object);
+ re_update_byteoffsets ();
+ }
}
/* Update gl_state to an appropriate interval which contains CHARPOS. The
diff --git a/src/syntax.h b/src/syntax.h
index 01982be25a0..3e20952053b 100644
--- a/src/syntax.h
+++ b/src/syntax.h
@@ -66,7 +66,7 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table)
struct gl_state_s
{
Lisp_Object object; /* The object we are scanning. */
- ptrdiff_t start; /* Where to stop. */
+ ptrdiff_t start; /* Where to stop(?FIXME?). */
ptrdiff_t stop; /* Where to stop. */
bool use_global; /* Whether to use global_code
or c_s_t. */
@@ -85,6 +85,11 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table)
and possibly at the
intervals too, depending
on: */
+ /* The regexp engine prefers byteoffsets over char positions, so
+ store those to try and reduce the number of byte<->char conversions.
+ This is only kept uptodate when used from the regexp engine. */
+ ptrdiff_t b_re_byte; /* First byteoffset where c_s_t is valid. */
+ ptrdiff_t e_re_byte; /* First byteoffset where c_s_t is not valid. */
};
extern struct gl_state_s gl_state;
@@ -145,19 +150,14 @@ SYNTAX (int c)
extern unsigned char const syntax_spec_code[0400];
-/* Convert the regexp's BYTEOFFSET into a character position,
- for the object recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT.
-
- The value is meant for use in code that does nothing when
- parse_sexp_lookup_properties is false, so return 0 in that case,
- for speed. */
+/* Convert the regexp's BYTEOFFSET into a character position, for
+ the object recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT. */
INLINE ptrdiff_t
RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset)
{
- return (! parse_sexp_lookup_properties
- ? 0
- : STRINGP (gl_state.object)
+ eassert (parse_sexp_lookup_properties);
+ return (STRINGP (gl_state.object)
? string_byte_to_char (gl_state.object, byteoffset)
: BUFFERP (gl_state.object)
? ((buf_bytepos_to_charpos
@@ -168,6 +168,44 @@ RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset)
: byteoffset);
}
+INLINE ptrdiff_t
+RE_SYNTAX_TABLE_CHAR_TO_BYTE (ptrdiff_t charpos)
+{
+ eassert (parse_sexp_lookup_properties);
+ return (STRINGP (gl_state.object)
+ ? string_char_to_byte (gl_state.object, charpos)
+ : BUFFERP (gl_state.object)
+ ? ((buf_charpos_to_bytepos
+ (XBUFFER (gl_state.object), charpos)
+ - BUF_BEGV_BYTE (XBUFFER (gl_state.object))))
+ : NILP (gl_state.object)
+ ? CHAR_TO_BYTE (charpos) - BEGV_BYTE
+ : charpos);
+}
+
+static void re_update_byteoffsets (void)
+{
+ gl_state.b_re_byte = RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.b_property);
+ eassert (gl_state.b_property
+ == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.b_re_byte));
+ /* `e_property` is often set to EOB+1 (or to some value
+ much further than `stop` in narrowed buffers). */
+ gl_state.e_re_byte
+ = gl_state.e_property > gl_state.stop
+ ? 1 + RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.stop)
+ : RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.e_property);
+ eassert (gl_state.e_property > gl_state.stop
+ ? gl_state.e_property
+ >= 1 + RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte - 1)
+ : gl_state.e_property
+ == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte));
+}
+
+/* The regexp-engine doesn't keep track of char positions, but instead
+ uses byteoffsets, so `syntax.c` uses `UPDATE_SYNTAX_TABLE_*` functions,
+ passing them `charpos`s whereas `regexp.c` uses `RE_UPDATE_SYNTAX_TABLE_*`
+ functions, passing them byteoffsets. */
+
/* Make syntax table state (gl_state) good for CHARPOS, assuming it is
currently good for a position before CHARPOS. */
@@ -178,6 +216,36 @@ UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t charpos)
update_syntax_table_forward (charpos, false, gl_state.object);
}
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t byteoffset)
+{ /* Performs just-in-time syntax-propertization. */
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset >= gl_state.e_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+ eassert (charpos >= gl_state.e_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+ re_update_byteoffsets ();
+ }
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (ptrdiff_t byteoffset)
+{ /* Performs just-in-time syntax-propertization. */
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset > gl_state.e_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset) - 1;
+ eassert (charpos >= gl_state.e_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+ re_update_byteoffsets ();
+ }
+}
+
/* Make syntax table state (gl_state) good for CHARPOS, assuming it is
currently good for a position after CHARPOS. */
@@ -188,6 +256,36 @@ UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t charpos)
update_syntax_table (charpos, -1, false, gl_state.object);
}
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t byteoffset)
+{
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset < gl_state.b_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+ eassert (charpos < gl_state.b_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+ re_update_byteoffsets ();
+ }
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (ptrdiff_t byteoffset)
+{
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset <= gl_state.b_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+ eassert (charpos <= gl_state.b_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos - 1);
+ re_update_byteoffsets ();
+ }
+}
+
/* Make syntax table good for CHARPOS. */
INLINE void
@@ -197,6 +295,20 @@ UPDATE_SYNTAX_TABLE (ptrdiff_t charpos)
UPDATE_SYNTAX_TABLE_FORWARD (charpos);
}
+INLINE void
+RE_UPDATE_SYNTAX_TABLE (ptrdiff_t byteoffset)
+{
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD (byteoffset);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (byteoffset);
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BEFORE (ptrdiff_t byteoffset)
+{
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (byteoffset);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (byteoffset);
+}
+
/* Set up the buffer-global syntax table. */
INLINE void
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 18:31 ` Alan Mackenzie
@ 2023-04-12 23:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 0 replies; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 23:25 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: 58558, larsi, Ihor Radchenko, Eli Zaretskii
>> I suspect that the patch below might fix the immediate problem.
>> Of course, setting `parse-sexp-lookup-properties` should not have such
>> a major performance impact, so maybe we should keep digging into
>> the problem.
>
> Thanks! That's a nasty little bug for somebody who hasn't seen it
> before. I don't think the Elisp manual is all that explicit about what
> happens in such cases.
According to my reading of the manual, it does explain what happens, but
maybe it should more specifically warn about this kind of interaction.
> Just as a matter of interest, have you searched cc-defs.el for any other
> places the same bug might occur? If not, I will.
No, among other things because I don't know a good regexp that can help
me look for it.
Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 23:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 4:33 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 20:05 ` Ihor Radchenko
2023-04-13 4:52 ` Eli Zaretskii
1 sibling, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-13 4:33 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi
[-- Attachment #1: Type: text/plain, Size: 471 bytes --]
>> For the former, we could probably extend the `b_property` and
>> `e_property` fields of `gl_state` (which hold charpos) to also store
>> their bytepos equivalent, which should significantly reduce the number
>> of conversions between bytepos and charpos.
> I.e. something like the patch below (which passes all tests except for
> `test/src/comp-tests` for a reason that completely escapes me).
Found the culprit!
The patch below passes `make check`.
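For readers following the diff: this revision lowers the sentinel used for the `Qt` case to `PTRDIFF_MAX - 1`, with the comment "-1 so we can do +1 in `re_update_byteoffsets`". A minimal sketch of why that headroom matters is given here. `succ_is_safe` and `checked_succ` are hypothetical names, not part of Emacs, and the sketch makes no claim about which change fixed the earlier test failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if POS + 1 is representable in ptrdiff_t.
   Adding 1 to PTRDIFF_MAX is signed overflow, which is undefined in C.  */
static int
succ_is_safe (ptrdiff_t pos)
{
  return pos < PTRDIFF_MAX;
}

/* Hypothetical helper: POS + 1, saturating at PTRDIFF_MAX.  */
static ptrdiff_t
checked_succ (ptrdiff_t pos)
{
  return succ_is_safe (pos) ? pos + 1 : PTRDIFF_MAX;
}
```

A sentinel of `PTRDIFF_MAX - 1` can be incremented once without overflow, whereas a sentinel of `PTRDIFF_MAX` cannot.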
Stefan
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: regmatch.patch --]
[-- Type: text/x-diff, Size: 14351 bytes --]
diff --git a/src/fns.c b/src/fns.c
index e92ef7e4c81..591b00103da 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -1194,6 +1194,8 @@ string_char_to_byte (Lisp_Object string, ptrdiff_t char_index)
if (best_above == best_above_byte)
return char_index;
+ eassert (char_index >= 0 && char_index <= best_above);
+
if (BASE_EQ (string, string_char_byte_cache_string))
{
if (string_char_byte_cache_charpos < char_index)
@@ -1254,6 +1256,8 @@ string_byte_to_char (Lisp_Object string, ptrdiff_t byte_index)
if (best_above == best_above_byte)
return byte_index;
+ eassert (byte_index >= 0 && byte_index <= best_above_byte);
+
if (BASE_EQ (string, string_char_byte_cache_string))
{
if (string_char_byte_cache_bytepos < byte_index)
diff --git a/src/regex-emacs.c b/src/regex-emacs.c
index 746779490ad..f75f805cd9c 100644
--- a/src/regex-emacs.c
+++ b/src/regex-emacs.c
@@ -3979,7 +3979,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
/* Prevent shrinking and relocation of buffer text if GC happens
while we are inside this function. The calls to
- UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
+ RE_UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
`internal--syntax-propertize`); these calls are careful to defend against
buffer modifications, but even with no modifications, the buffer text may
be relocated during GC by `compact_buffer` which would invalidate
@@ -4792,12 +4792,11 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
int dummy;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
s1 = SYNTAX (c1);
- UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
PREFETCH_NOLIMIT ();
GET_CHAR_AFTER (c2, d, dummy);
nchars++;
@@ -4832,8 +4831,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
int dummy;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE (offset);
PREFETCH ();
GET_CHAR_AFTER (c2, d, dummy);
nchars++;
@@ -4848,7 +4846,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
{
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
- UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
s1 = SYNTAX (c1);
/* ... and S1 is Sword, and WORD_BOUNDARY_P (C1, C2)
@@ -4875,8 +4873,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int s1, s2;
int dummy;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
s1 = SYNTAX (c1);
@@ -4891,13 +4888,13 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
PREFETCH_NOLIMIT ();
GET_CHAR_AFTER (c2, d, dummy);
nchars++;
- UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
s2 = SYNTAX (c2);
/* ... and S2 is Sword, and WORD_BOUNDARY_P (C1, C2)
returns 0. */
if ((s2 == Sword) && !WORD_BOUNDARY_P (c1, c2))
- goto fail;
+ goto fail;
}
}
break;
@@ -4917,8 +4914,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int c1, c2;
int s1, s2;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE (offset);
PREFETCH ();
c2 = RE_STRING_CHAR (d, target_multibyte);
nchars++;
@@ -4933,7 +4929,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
{
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
- UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
s1 = SYNTAX (c1);
/* ... and S1 is Sword or Ssymbol. */
@@ -4958,8 +4954,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
int c1, c2;
int s1, s2;
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
- UPDATE_SYNTAX_TABLE (charpos);
+ RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
nchars++;
s1 = SYNTAX (c1);
@@ -4974,7 +4969,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
PREFETCH_NOLIMIT ();
c2 = RE_STRING_CHAR (d, target_multibyte);
nchars++;
- UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
s2 = SYNTAX (c2);
/* ... and S2 is Sword or Ssymbol. */
@@ -4994,8 +4989,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
PREFETCH ();
{
ptrdiff_t offset = POINTER_TO_OFFSET (d);
- ptrdiff_t pos1 = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
- UPDATE_SYNTAX_TABLE (pos1);
+ RE_UPDATE_SYNTAX_TABLE (offset);
}
{
int len;
diff --git a/src/syntax.c b/src/syntax.c
index e9e04e2d638..fbd08c74092 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -250,6 +250,8 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
gl_state.b_property = BEGV;
gl_state.e_property = ZV + 1;
gl_state.object = Qnil;
+ gl_state.b_re_byte = -1;
+ gl_state.e_re_byte = -1;
if (parse_sexp_lookup_properties)
{
if (count > 0)
@@ -265,14 +267,15 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
/* Same as above, but in OBJECT. If OBJECT is nil, use current buffer.
If it is t (which is only used in fast_c_string_match_ignore_case),
ignore properties altogether.
- FROMBYTE is an regexp-byteoffset. */
+ FROMBYTE is a regexp-byteoffset. */
void
-RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
- ptrdiff_t frombyte)
+RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object, ptrdiff_t frombyte)
{
SETUP_BUFFER_SYNTAX_TABLE ();
gl_state.object = object;
+ gl_state.b_re_byte = -1;
+ gl_state.e_re_byte = -1;
if (BUFFERP (gl_state.object))
{
struct buffer *buf = XBUFFER (gl_state.object);
@@ -282,21 +285,25 @@ RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
else if (NILP (gl_state.object))
{
gl_state.b_property = BEG;
- gl_state.e_property = ZV; /* FIXME: Why not +1 like in SETUP_SYNTAX_TABLE? */
+ gl_state.e_property = ZV;
}
else if (EQ (gl_state.object, Qt))
{
gl_state.b_property = 0;
- gl_state.e_property = PTRDIFF_MAX;
+ /* -1 so we can do +1 in `re_update_byteoffsets`. */
+ gl_state.e_property = PTRDIFF_MAX - 1;
}
else
{
gl_state.b_property = 0;
- gl_state.e_property = 1 + SCHARS (gl_state.object);
+ gl_state.e_property = SCHARS (gl_state.object);
}
if (parse_sexp_lookup_properties)
- update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
- 1, 1, gl_state.object);
+ {
+ update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
+ 1, 1, gl_state.object);
+ re_update_byteoffsets ();
+ }
}
/* Update gl_state to an appropriate interval which contains CHARPOS. The
diff --git a/src/syntax.h b/src/syntax.h
index 01982be25a0..420ba8f31dc 100644
--- a/src/syntax.h
+++ b/src/syntax.h
@@ -66,7 +66,7 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table)
struct gl_state_s
{
Lisp_Object object; /* The object we are scanning. */
- ptrdiff_t start; /* Where to stop. */
+ ptrdiff_t start; /* Where to stop(?FIXME?). */
ptrdiff_t stop; /* Where to stop. */
bool use_global; /* Whether to use global_code
or c_s_t. */
@@ -85,6 +85,11 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table)
and possibly at the
intervals too, depending
on: */
+ /* The regexp engine prefers byteoffsets over char positions, so
+ store those to try and reduce the number of byte<->char conversions.
+ This is only kept uptodate when used from the regexp engine. */
+ ptrdiff_t b_re_byte; /* First byteoffset where c_s_t is valid. */
+ ptrdiff_t e_re_byte; /* First byteoffset where c_s_t is not valid. */
};
extern struct gl_state_s gl_state;
@@ -145,19 +150,14 @@ SYNTAX (int c)
extern unsigned char const syntax_spec_code[0400];
-/* Convert the regexp's BYTEOFFSET into a character position,
- for the object recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT.
-
- The value is meant for use in code that does nothing when
- parse_sexp_lookup_properties is false, so return 0 in that case,
- for speed. */
+/* Convert the BYTEOFFSET into a character position, for the object
+ recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT. */
INLINE ptrdiff_t
RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset)
{
- return (! parse_sexp_lookup_properties
- ? 0
- : STRINGP (gl_state.object)
+ eassert (parse_sexp_lookup_properties);
+ return (STRINGP (gl_state.object)
? string_byte_to_char (gl_state.object, byteoffset)
: BUFFERP (gl_state.object)
? ((buf_bytepos_to_charpos
@@ -168,6 +168,44 @@ RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset)
: byteoffset);
}
+INLINE ptrdiff_t
+RE_SYNTAX_TABLE_CHAR_TO_BYTE (ptrdiff_t charpos)
+{
+ eassert (parse_sexp_lookup_properties);
+ return (STRINGP (gl_state.object)
+ ? string_char_to_byte (gl_state.object, charpos)
+ : BUFFERP (gl_state.object)
+ ? ((buf_charpos_to_bytepos
+ (XBUFFER (gl_state.object), charpos)
+ - BUF_BEGV_BYTE (XBUFFER (gl_state.object))))
+ : NILP (gl_state.object)
+ ? CHAR_TO_BYTE (charpos) - BEGV_BYTE
+ : charpos);
+}
+
+static void re_update_byteoffsets (void)
+{
+ gl_state.b_re_byte = RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.b_property);
+ eassert (gl_state.b_property
+ == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.b_re_byte));
+ /* `e_property` is often set to EOB+1 (or to some value
+ much further than `stop` in narrowed buffers). */
+ gl_state.e_re_byte
+ = gl_state.e_property > gl_state.stop
+ ? 1 + RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.stop)
+ : RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.e_property);
+ eassert (gl_state.e_property > gl_state.stop
+ ? gl_state.e_property
+ >= 1 + RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte - 1)
+ : gl_state.e_property
+ == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte));
+}
+
+/* The regexp-engine doesn't keep track of char positions, but instead
+ uses byteoffsets, so `syntax.c` uses `UPDATE_SYNTAX_TABLE_*` functions,
+ passing them `charpos`s whereas `regexp.c` uses `RE_UPDATE_SYNTAX_TABLE_*`
+ functions, passing them byteoffsets. */
+
/* Make syntax table state (gl_state) good for CHARPOS, assuming it is
currently good for a position before CHARPOS. */
@@ -178,6 +216,36 @@ UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t charpos)
update_syntax_table_forward (charpos, false, gl_state.object);
}
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t byteoffset)
+{ /* Performs just-in-time syntax-propertization. */
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset >= gl_state.e_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+ eassert (charpos >= gl_state.e_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+ re_update_byteoffsets ();
+ }
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (ptrdiff_t byteoffset)
+{ /* Performs just-in-time syntax-propertization. */
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset > gl_state.e_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset) - 1;
+ eassert (charpos >= gl_state.e_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+ re_update_byteoffsets ();
+ }
+}
+
/* Make syntax table state (gl_state) good for CHARPOS, assuming it is
currently good for a position after CHARPOS. */
@@ -188,6 +256,36 @@ UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t charpos)
update_syntax_table (charpos, -1, false, gl_state.object);
}
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t byteoffset)
+{
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset < gl_state.b_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+ eassert (charpos < gl_state.b_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+ re_update_byteoffsets ();
+ }
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (ptrdiff_t byteoffset)
+{
+ if (!parse_sexp_lookup_properties)
+ return;
+ eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+ if (byteoffset <= gl_state.b_re_byte)
+ {
+ ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+ eassert (charpos <= gl_state.b_property);
+ UPDATE_SYNTAX_TABLE_FORWARD (charpos - 1);
+ re_update_byteoffsets ();
+ }
+}
+
/* Make syntax table good for CHARPOS. */
INLINE void
@@ -197,6 +295,20 @@ UPDATE_SYNTAX_TABLE (ptrdiff_t charpos)
UPDATE_SYNTAX_TABLE_FORWARD (charpos);
}
+INLINE void
+RE_UPDATE_SYNTAX_TABLE (ptrdiff_t byteoffset)
+{
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD (byteoffset);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD (byteoffset);
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BEFORE (ptrdiff_t byteoffset)
+{
+ RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (byteoffset);
+ RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (byteoffset);
+}
+
/* Set up the buffer-global syntax table. */
INLINE void
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 13:39 ` Ihor Radchenko
2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 4:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 12:09 ` Ihor Radchenko
1 sibling, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-13 4:43 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi
> parse_sexp_lookup_properties looks suspicious, so I checked the value of
> parse-sexp-lookup-properties in Org files on master vs. Emacs 28.
>
> On master, the value is t, even though Org mode does not set this
> variable. On Emacs 28, the value is nil.
Any chance you can now give a reproducible recipe for your
big&progressive slowdown?
Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-12 23:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 4:33 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 4:52 ` Eli Zaretskii
2023-04-13 5:15 ` Eli Zaretskii
1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-13 4:52 UTC (permalink / raw)
To: Stefan Monnier, Andrea Corallo; +Cc: 58558, acm, yantar92, larsi
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Alan Mackenzie <acm@muc.de>, Eli Zaretskii <eliz@gnu.org>,
> larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 12 Apr 2023 19:23:19 -0400
>
> > For the former, we could probably extend the `b_property` and
> > `e_property` fields of `gl_state` (which hold charpos) to also store
> > their bytepos equivalent, which should significantly reduce the number
> > of conversions between bytepos and charpos.
>
> I.e. something like the patch below (which passes all tests except for
> `test/src/comp-tests` for a reason that completely escapes me).
Andrea, could you please help Stefan with that test failure?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-13 4:52 ` Eli Zaretskii
@ 2023-04-13 5:15 ` Eli Zaretskii
0 siblings, 0 replies; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-13 5:15 UTC (permalink / raw)
To: akrl; +Cc: 58558, acm, yantar92, larsi, monnier
> Cc: 58558@debbugs.gnu.org, acm@muc.de, yantar92@posteo.net, larsi@gnus.org
> Date: Thu, 13 Apr 2023 07:52:43 +0300
> From: Eli Zaretskii <eliz@gnu.org>
>
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Alan Mackenzie <acm@muc.de>, Eli Zaretskii <eliz@gnu.org>,
> > larsi@gnus.org, 58558@debbugs.gnu.org
> > Date: Wed, 12 Apr 2023 19:23:19 -0400
> >
> > > For the former, we could probably extend the `b_property` and
> > > `e_property` fields of `gl_state` (which hold charpos) to also store
> > > their bytepos equivalent, which should significantly reduce the number
> > > of conversions between bytepos and charpos.
> >
> > I.e. something like the patch below (which passes all tests except for
> > `test/src/comp-tests` for a reason that completely escapes me).
>
> Andrea, could you please help Stefan with that test failure?
No need, as Stefan has found the problem.
Thanks anyway.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-13 4:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 12:09 ` Ihor Radchenko
0 siblings, 0 replies; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-13 12:09 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> Any chance you can now give a reproducible recipe for your
> big&progressive slowdown?
No, unfortunately.
I can only see the slowdown with a specific Org file.
My attempts to obfuscate it for sharing made the progressive slowdown
disappear.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
2023-04-13 4:33 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 20:05 ` Ihor Radchenko
0 siblings, 0 replies; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-13 20:05 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> diff --git a/src/fns.c b/src/fns.c
> index e92ef7e4c81..591b00103da 100644
> --- a/src/fns.c
> +++ b/src/fns.c
With this patch, I see no significant difference in time taken by re-search-forward
with and without parse-sexp-lookup-properties:
patch + workaround (setting parse-sexp-lookup-properties to nil)
Re-search time: 0.735682 sec.
patch + no workaround (leaving parse-sexp-lookup-properties at t)
Re-search time: 0.841605 sec.
no patch + workaround
Re-search time: 0.745678 sec.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
Thread overview: 81+ messages
2022-10-16 1:26 bug#58558: 29.0.50; re-search-forward is slow in some buffers Ihor Radchenko
2022-10-16 9:19 ` Lars Ingebrigtsen
2022-10-16 9:34 ` Ihor Radchenko
2022-10-16 9:37 ` Lars Ingebrigtsen
2022-10-16 10:02 ` Ihor Radchenko
2022-10-16 10:04 ` Lars Ingebrigtsen
2022-10-16 10:53 ` Ihor Radchenko
2022-10-16 11:01 ` Lars Ingebrigtsen
2022-10-16 11:21 ` Eli Zaretskii
2022-10-16 14:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-17 0:56 ` Ihor Radchenko
2022-10-18 11:50 ` Lars Ingebrigtsen
2022-10-18 14:58 ` Eli Zaretskii
2022-10-18 18:19 ` Lars Ingebrigtsen
2022-10-18 18:38 ` Eli Zaretskii
2022-12-13 10:28 ` Ihor Radchenko
2022-12-13 13:11 ` Eli Zaretskii
2022-12-13 13:32 ` Ihor Radchenko
2022-12-13 14:28 ` Eli Zaretskii
2022-12-13 15:56 ` Ihor Radchenko
2022-12-13 16:08 ` Eli Zaretskii
2022-12-13 17:43 ` Ihor Radchenko
2022-12-13 17:52 ` Eli Zaretskii
2022-12-13 18:03 ` Ihor Radchenko
2022-12-13 20:02 ` Eli Zaretskii
2022-12-14 11:40 ` Ihor Radchenko
2022-12-14 13:06 ` Eli Zaretskii
2022-12-14 13:23 ` Ihor Radchenko
2022-12-14 13:32 ` Eli Zaretskii
2022-12-14 13:39 ` Ihor Radchenko
2022-12-14 14:12 ` Eli Zaretskii
2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-13 18:40 ` Ihor Radchenko
2022-12-13 19:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-13 20:21 ` Eli Zaretskii
2022-12-14 11:42 ` Ihor Radchenko
2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-12-14 12:00 ` Ihor Radchenko
2022-12-14 12:23 ` Ihor Radchenko
2022-12-14 13:10 ` Eli Zaretskii
2022-12-14 13:26 ` Ihor Radchenko
2022-12-14 13:57 ` Eli Zaretskii
2022-12-14 14:01 ` Ihor Radchenko
2023-04-06 11:49 ` Ihor Radchenko
2023-04-06 12:05 ` Eli Zaretskii
2023-04-09 19:54 ` Ihor Radchenko
2023-04-10 4:14 ` Eli Zaretskii
2023-04-10 12:24 ` Ihor Radchenko
2023-04-10 13:40 ` Eli Zaretskii
2023-04-10 14:55 ` Ihor Radchenko
2023-04-10 16:04 ` Eli Zaretskii
2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-11 11:29 ` Ihor Radchenko
2023-04-11 11:51 ` Eli Zaretskii
2023-04-12 13:39 ` Ihor Radchenko
2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 14:30 ` Eli Zaretskii
2023-04-12 14:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 15:22 ` Eli Zaretskii
2023-04-12 15:59 ` Alan Mackenzie
2023-04-12 14:38 ` Stephen Berman
2023-04-12 14:42 ` Ihor Radchenko
2023-04-12 14:39 ` Ihor Radchenko
2023-04-12 15:20 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-12 23:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 4:33 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 20:05 ` Ihor Radchenko
2023-04-13 4:52 ` Eli Zaretskii
2023-04-13 5:15 ` Eli Zaretskii
2023-04-12 18:31 ` Alan Mackenzie
2023-04-12 23:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 4:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-04-13 12:09 ` Ihor Radchenko
2022-12-13 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-10-16 10:36 ` Eli Zaretskii
2023-02-19 12:17 ` Dmitry Gutov
2023-02-20 10:24 ` Ihor Radchenko
2023-02-20 14:54 ` Dmitry Gutov
2023-04-10 8:48 ` Mattias Engdegård
2023-04-10 9:57 ` Ihor Radchenko
2023-04-10 10:05 ` Mattias Engdegård