* bug#58558: 29.0.50; re-search-forward is slow in some buffers
@ 2022-10-16  1:26 Ihor Radchenko
  2022-10-16  9:19 ` Lars Ingebrigtsen
  2023-04-10  8:48 ` Mattias Engdegård
  0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-10-16  1:26 UTC
To: 58558

Hi,

I am consistently experiencing a significant slowdown of regexp search in large buffers in Emacs 29 (master and noverlay), but not in Emacs 28.

ELP data:

;; Emacs 29
;; re-search-forward  181593  10.090536098  5.556...e-05
;; re-search-forward  180625  8.7113028330  4.822...e-05
;; re-search-forward  177357  9.7315074570  5.486...e-05
;; Emacs 28
;; re-search-forward  171661  2.7219785009  1.585...e-05

(up to 4x slowdown)

The slowdown happens consistently in Emacs 29, but not in all buffers. Sometimes it only appears some time after Emacs startup. It is not there in Emacs 28.

The issue started a long time ago (over a year), but all my attempts to bisect the problem failed or landed on inconsistent bad commits.

The above slowdown should have nothing to do with ELP overhead. I tested agenda generation times (agenda uses a huge number of regexp searches), manually wrapping the re-search-forward calls in a time accumulator, with the following results:

Emacs 29. Note re-search time
;; Mapped over elements in #<buffer notes.org>. 33/5592 predicate matches. Total time: 8.788400 sec. Pre-process time: 0.000000 sec. Predicate time: 0.604878 sec. Re-search time: 8.023365 sec.
;; Calling parameters: :granularity headline+inlinetask :restrict-elements (headline inlinetask) :next-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :fail-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :from-pos 321 :to-pos #<marker at 21071050 in notes.org> :limit-count nil :after-element nil Emacs 28. Note re-search time ;; Mapped over elements in #<buffer notes.org>. 33/5592 predicate matches. Total time: 1.396713 sec. Pre-process time: 0.000000 sec. Predicate time: 0.544486 sec. Re-search time: 0.708682 sec. 
;; Calling parameters: :granularity headline+inlinetask :restrict-elements (headline inlinetask) :next-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :fail-re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)" :from-pos 321 :to-pos #<marker at 21071050 in notes.org> :limit-count nil :after-element nil Any idea what might be going on or how to debug this further? 
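
For anyone who wants to reproduce the measurement, here is a minimal sketch of the kind of time accumulator described above. It is not necessarily the code that produced the numbers in this report: the names `my/re-search-time`, `my/re-search-count`, `my/with-re-search-timing` and `my/count-matches-timed` are hypothetical, and the sketch relies only on the standard `float-time` and `re-search-forward`.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical names; not part of Emacs or Org.
(defvar my/re-search-time 0.0
  "Accumulated wall-clock seconds spent inside timed regexp searches.")

(defvar my/re-search-count 0
  "Number of timed regexp searches, successful or not.")

(defmacro my/with-re-search-timing (&rest body)
  "Run BODY and add its elapsed wall-clock time to `my/re-search-time'.
Return the value of the last form in BODY."
  (declare (indent 0) (debug t))
  (let ((start (make-symbol "start")))
    `(let ((,start (float-time)))
       (unwind-protect
           (progn ,@body)
         ;; Runs even if BODY exits non-locally, so no call goes uncounted.
         (setq my/re-search-time
               (+ my/re-search-time (- (float-time) ,start)))
         (setq my/re-search-count (1+ my/re-search-count))))))

(defun my/count-matches-timed (regexp)
  "Count matches of REGEXP in the current buffer, timing every search."
  (save-excursion
    (goto-char (point-min))
    (let ((n 0))
      (while (and (not (eobp))
                  (my/with-re-search-timing
                    (re-search-forward regexp nil t)))
        (setq n (1+ n))
        ;; Step over an empty match so the loop always makes progress.
        (when (and (= (match-beginning 0) (match-end 0))
                   (not (eobp)))
          (forward-char 1)))
      n)))
```

To compare two Emacs versions, one could reset both variables to zero, call `(my/count-matches-timed REGEXP)` in the same large buffer under each version, and then compare `my/re-search-time` divided by `my/re-search-count`. Because the timing uses two `float-time` calls per search rather than ELP instrumentation, it gives an independent check on whether the profiler overhead explains the difference.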
In GNU Emacs 29.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.34, cairo version 1.16.0) of 2022-10-15 built on yantar92-laptop Repository revision: b86505387480fed81629cbc81cef6b70098bd607 Repository branch: feature/noverlay Windowing system distributor 'The X.Org Foundation', version 11.0.12101004 System Description: Gentoo Linux Configured features: ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG JSON LCMS2 LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB Important settings: value of $LC_COLLATE: C value of $LANG: en_US.utf8 locale-coding-system: utf-8-unix Major mode: Lisp Interaction Minor modes in effect: windmove-mode: t TeX-PDF-mode: t pyvenv-mode: t git-email-notmuch-mode: t git-email-piem-mode: t piem-notmuch-mode: t org-edna-mode: t eros-mode: t pdf-occur-global-minor-mode: t which-key-mode: t diredfl-global-mode: t dired-async-mode: t winner-mode: t eval-sexp-fu-flash-mode: t global-flycheck-mode: t flycheck-mode: t el-patch-use-package-mode: t global-git-commit-mode: t magit-auto-revert-mode: t recentf-mode: t hl-todo-mode: t pretty-symbols-mode: t company-mode: t persistent-scratch-autosave-mode: t savehist-mode: t helm-adaptive-mode: t helm-mode: t helm-minibuffer-history-mode: t helm-ff-icon-mode: t shell-dirtrack-mode: t helm--remap-mouse-mode: t async-bytecomp-package-mode: t boon-mode: t boon-local-mode: t global-hl-line-mode: t global-page-break-lines-mode: t page-break-lines-mode: t shackle-mode: t override-global-mode: t straight-use-package-mode: t straight-package-neutering-mode: t global-eldoc-mode: t eldoc-mode: t show-paren-mode: t electric-indent-mode: t mouse-wheel-mode: t global-prettify-symbols-mode: t prettify-symbols-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t window-divider-mode: t line-number-mode: t indent-tabs-mode: t transient-mark-mode: t auto-composition-mode: t 
auto-encryption-mode: t auto-compression-mode: t abbrev-mode: t Load-path shadows: /home/yantar92/.emacs.d/straight/build/transient/transient hides /home/yantar92/Git/emacs/lisp/transient /home/yantar92/.emacs.d/straight/build/xref/xref hides /home/yantar92/Git/emacs/lisp/progmodes/xref /home/yantar92/.emacs.d/straight/build/project/project hides /home/yantar92/Git/emacs/lisp/progmodes/project /home/yantar92/.emacs.d/straight/build/org/ox-publish hides /home/yantar92/Git/emacs/lisp/org/ox-publish /home/yantar92/.emacs.d/straight/build/org/ox-org hides /home/yantar92/Git/emacs/lisp/org/ox-org /home/yantar92/.emacs.d/straight/build/org/ox-odt hides /home/yantar92/Git/emacs/lisp/org/ox-odt /home/yantar92/.emacs.d/straight/build/org/org hides /home/yantar92/Git/emacs/lisp/org/org /home/yantar92/.emacs.d/straight/build/org/ox-md hides /home/yantar92/Git/emacs/lisp/org/ox-md /home/yantar92/.emacs.d/straight/build/org/ox-man hides /home/yantar92/Git/emacs/lisp/org/ox-man /home/yantar92/.emacs.d/straight/build/org/ox-latex hides /home/yantar92/Git/emacs/lisp/org/ox-latex /home/yantar92/.emacs.d/straight/build/org/ox-koma-letter hides /home/yantar92/Git/emacs/lisp/org/ox-koma-letter /home/yantar92/.emacs.d/straight/build/org/ox-icalendar hides /home/yantar92/Git/emacs/lisp/org/ox-icalendar /home/yantar92/.emacs.d/straight/build/org/ox-html hides /home/yantar92/Git/emacs/lisp/org/ox-html /home/yantar92/.emacs.d/straight/build/org/ox-ascii hides /home/yantar92/Git/emacs/lisp/org/ox-ascii /home/yantar92/.emacs.d/straight/build/org/ox-beamer hides /home/yantar92/Git/emacs/lisp/org/ox-beamer /home/yantar92/.emacs.d/straight/build/org/org-timer hides /home/yantar92/Git/emacs/lisp/org/org-timer /home/yantar92/.emacs.d/straight/build/org/org-tempo hides /home/yantar92/Git/emacs/lisp/org/org-tempo /home/yantar92/.emacs.d/straight/build/org/org-table hides /home/yantar92/Git/emacs/lisp/org/org-table /home/yantar92/.emacs.d/straight/build/org/org-src hides 
/home/yantar92/Git/emacs/lisp/org/org-src /home/yantar92/.emacs.d/straight/build/org/org-protocol hides /home/yantar92/Git/emacs/lisp/org/org-protocol /home/yantar92/.emacs.d/straight/build/org/org-plot hides /home/yantar92/Git/emacs/lisp/org/org-plot /home/yantar92/.emacs.d/straight/build/org/org-refile hides /home/yantar92/Git/emacs/lisp/org/org-refile /home/yantar92/.emacs.d/straight/build/org/org-mouse hides /home/yantar92/Git/emacs/lisp/org/org-mouse /home/yantar92/.emacs.d/straight/build/org/org-num hides /home/yantar92/Git/emacs/lisp/org/org-num /home/yantar92/.emacs.d/straight/build/org/org-mobile hides /home/yantar92/Git/emacs/lisp/org/org-mobile /home/yantar92/.emacs.d/straight/build/org/org-lint hides /home/yantar92/Git/emacs/lisp/org/org-lint /home/yantar92/.emacs.d/straight/build/org/org-pcomplete hides /home/yantar92/Git/emacs/lisp/org/org-pcomplete /home/yantar92/.emacs.d/straight/build/org/org-inlinetask hides /home/yantar92/Git/emacs/lisp/org/org-inlinetask /home/yantar92/.emacs.d/straight/build/org/org-list hides /home/yantar92/Git/emacs/lisp/org/org-list /home/yantar92/.emacs.d/straight/build/org/org-indent hides /home/yantar92/Git/emacs/lisp/org/org-indent /home/yantar92/.emacs.d/straight/build/org/org-macs hides /home/yantar92/Git/emacs/lisp/org/org-macs /home/yantar92/.emacs.d/straight/build/org/org-id hides /home/yantar92/Git/emacs/lisp/org/org-id /home/yantar92/.emacs.d/straight/build/org/org-loaddefs hides /home/yantar92/Git/emacs/lisp/org/org-loaddefs /home/yantar92/.emacs.d/straight/build/org/org-habit hides /home/yantar92/Git/emacs/lisp/org/org-habit /home/yantar92/.emacs.d/straight/build/org/org-goto hides /home/yantar92/Git/emacs/lisp/org/org-goto /home/yantar92/.emacs.d/straight/build/org/org-keys hides /home/yantar92/Git/emacs/lisp/org/org-keys /home/yantar92/.emacs.d/straight/build/org/org-feed hides /home/yantar92/Git/emacs/lisp/org/org-feed /home/yantar92/.emacs.d/straight/build/org/org-datetree hides 
/home/yantar92/Git/emacs/lisp/org/org-datetree /home/yantar92/.emacs.d/straight/build/org/org-ctags hides /home/yantar92/Git/emacs/lisp/org/org-ctags /home/yantar92/.emacs.d/straight/build/org/org-agenda hides /home/yantar92/Git/emacs/lisp/org/org-agenda /home/yantar92/.emacs.d/straight/build/org/org-footnote hides /home/yantar92/Git/emacs/lisp/org/org-footnote /home/yantar92/.emacs.d/straight/build/org/org-faces hides /home/yantar92/Git/emacs/lisp/org/org-faces /home/yantar92/.emacs.d/straight/build/org/org-entities hides /home/yantar92/Git/emacs/lisp/org/org-entities /home/yantar92/.emacs.d/straight/build/org/org-duration hides /home/yantar92/Git/emacs/lisp/org/org-duration /home/yantar92/.emacs.d/straight/build/org/org-colview hides /home/yantar92/Git/emacs/lisp/org/org-colview /home/yantar92/.emacs.d/straight/build/org/org-compat hides /home/yantar92/Git/emacs/lisp/org/org-compat /home/yantar92/.emacs.d/straight/build/org/org-clock hides /home/yantar92/Git/emacs/lisp/org/org-clock /home/yantar92/.emacs.d/straight/build/org/org-crypt hides /home/yantar92/Git/emacs/lisp/org/org-crypt /home/yantar92/.emacs.d/straight/build/org/org-attach-git hides /home/yantar92/Git/emacs/lisp/org/org-attach-git /home/yantar92/.emacs.d/straight/build/org/org-attach hides /home/yantar92/Git/emacs/lisp/org/org-attach /home/yantar92/.emacs.d/straight/build/org/org-capture hides /home/yantar92/Git/emacs/lisp/org/org-capture /home/yantar92/.emacs.d/straight/build/org/org-archive hides /home/yantar92/Git/emacs/lisp/org/org-archive /home/yantar92/.emacs.d/straight/build/org/ol-gnus hides /home/yantar92/Git/emacs/lisp/org/ol-gnus /home/yantar92/.emacs.d/straight/build/org/ol-w3m hides /home/yantar92/Git/emacs/lisp/org/ol-w3m /home/yantar92/.emacs.d/straight/build/org/ol-mhe hides /home/yantar92/Git/emacs/lisp/org/ol-mhe /home/yantar92/.emacs.d/straight/build/org/ol-rmail hides /home/yantar92/Git/emacs/lisp/org/ol-rmail /home/yantar92/.emacs.d/straight/build/org/ol-eww hides 
/home/yantar92/Git/emacs/lisp/org/ol-eww /home/yantar92/.emacs.d/straight/build/org/ol-irc hides /home/yantar92/Git/emacs/lisp/org/ol-irc /home/yantar92/.emacs.d/straight/build/org/ol-man hides /home/yantar92/Git/emacs/lisp/org/ol-man /home/yantar92/.emacs.d/straight/build/org/ol-info hides /home/yantar92/Git/emacs/lisp/org/ol-info /home/yantar92/.emacs.d/straight/build/org/ob-fortran hides /home/yantar92/Git/emacs/lisp/org/ob-fortran /home/yantar92/.emacs.d/straight/build/org/ol-eshell hides /home/yantar92/Git/emacs/lisp/org/ol-eshell /home/yantar92/.emacs.d/straight/build/org/ol-doi hides /home/yantar92/Git/emacs/lisp/org/ol-doi /home/yantar92/.emacs.d/straight/build/org/ol-docview hides /home/yantar92/Git/emacs/lisp/org/ol-docview /home/yantar92/.emacs.d/straight/build/org/ol-bibtex hides /home/yantar92/Git/emacs/lisp/org/ol-bibtex /home/yantar92/.emacs.d/straight/build/org/ol-bbdb hides /home/yantar92/Git/emacs/lisp/org/ol-bbdb /home/yantar92/.emacs.d/straight/build/org/oc-natbib hides /home/yantar92/Git/emacs/lisp/org/oc-natbib /home/yantar92/.emacs.d/straight/build/org/oc-csl hides /home/yantar92/Git/emacs/lisp/org/oc-csl /home/yantar92/.emacs.d/straight/build/org/oc-basic hides /home/yantar92/Git/emacs/lisp/org/oc-basic /home/yantar92/.emacs.d/straight/build/org/oc-biblatex hides /home/yantar92/Git/emacs/lisp/org/oc-biblatex /home/yantar92/.emacs.d/straight/build/org/ob hides /home/yantar92/Git/emacs/lisp/org/ob /home/yantar92/.emacs.d/straight/build/org/ob-tangle hides /home/yantar92/Git/emacs/lisp/org/ob-tangle /home/yantar92/.emacs.d/straight/build/org/ob-sql hides /home/yantar92/Git/emacs/lisp/org/ob-sql /home/yantar92/.emacs.d/straight/build/org/ob-sqlite hides /home/yantar92/Git/emacs/lisp/org/ob-sqlite /home/yantar92/.emacs.d/straight/build/org/ob-table hides /home/yantar92/Git/emacs/lisp/org/ob-table /home/yantar92/.emacs.d/straight/build/org/ob-shell hides /home/yantar92/Git/emacs/lisp/org/ob-shell /home/yantar92/.emacs.d/straight/build/org/ob-sed 
hides /home/yantar92/Git/emacs/lisp/org/ob-sed /home/yantar92/.emacs.d/straight/build/org/ob-screen hides /home/yantar92/Git/emacs/lisp/org/ob-screen /home/yantar92/.emacs.d/straight/build/org/ob-scheme hides /home/yantar92/Git/emacs/lisp/org/ob-scheme /home/yantar92/.emacs.d/straight/build/org/ob-C hides /home/yantar92/Git/emacs/lisp/org/ob-C /home/yantar92/.emacs.d/straight/build/org/ob-sass hides /home/yantar92/Git/emacs/lisp/org/ob-sass /home/yantar92/.emacs.d/straight/build/org/ob-ruby hides /home/yantar92/Git/emacs/lisp/org/ob-ruby /home/yantar92/.emacs.d/straight/build/org/ob-python hides /home/yantar92/Git/emacs/lisp/org/ob-python /home/yantar92/.emacs.d/straight/build/org/ob-processing hides /home/yantar92/Git/emacs/lisp/org/ob-processing /home/yantar92/.emacs.d/straight/build/org/ob-plantuml hides /home/yantar92/Git/emacs/lisp/org/ob-plantuml /home/yantar92/.emacs.d/straight/build/org/ob-ref hides /home/yantar92/Git/emacs/lisp/org/ob-ref /home/yantar92/.emacs.d/straight/build/org/ob-perl hides /home/yantar92/Git/emacs/lisp/org/ob-perl /home/yantar92/.emacs.d/straight/build/org/ob-octave hides /home/yantar92/Git/emacs/lisp/org/ob-octave /home/yantar92/.emacs.d/straight/build/org/ob-org hides /home/yantar92/Git/emacs/lisp/org/ob-org /home/yantar92/.emacs.d/straight/build/org/ob-ocaml hides /home/yantar92/Git/emacs/lisp/org/ob-ocaml /home/yantar92/.emacs.d/straight/build/org/ob-maxima hides /home/yantar92/Git/emacs/lisp/org/ob-maxima /home/yantar92/.emacs.d/straight/build/org/ob-matlab hides /home/yantar92/Git/emacs/lisp/org/ob-matlab /home/yantar92/.emacs.d/straight/build/org/ob-makefile hides /home/yantar92/Git/emacs/lisp/org/ob-makefile /home/yantar92/.emacs.d/straight/build/org/ob-lua hides /home/yantar92/Git/emacs/lisp/org/ob-lua /home/yantar92/.emacs.d/straight/build/org/ob-lisp hides /home/yantar92/Git/emacs/lisp/org/ob-lisp /home/yantar92/.emacs.d/straight/build/org/ob-lilypond hides /home/yantar92/Git/emacs/lisp/org/ob-lilypond 
/home/yantar92/.emacs.d/straight/build/org/ob-lob hides /home/yantar92/Git/emacs/lisp/org/ob-lob /home/yantar92/.emacs.d/straight/build/org/ob-latex hides /home/yantar92/Git/emacs/lisp/org/ob-latex /home/yantar92/.emacs.d/straight/build/org/ob-julia hides /home/yantar92/Git/emacs/lisp/org/ob-julia /home/yantar92/.emacs.d/straight/build/org/ob-java hides /home/yantar92/Git/emacs/lisp/org/ob-java /home/yantar92/.emacs.d/straight/build/org/ob-js hides /home/yantar92/Git/emacs/lisp/org/ob-js /home/yantar92/.emacs.d/straight/build/org/ob-haskell hides /home/yantar92/Git/emacs/lisp/org/ob-haskell /home/yantar92/.emacs.d/straight/build/org/ob-gnuplot hides /home/yantar92/Git/emacs/lisp/org/ob-gnuplot /home/yantar92/.emacs.d/straight/build/org/ob-groovy hides /home/yantar92/Git/emacs/lisp/org/ob-groovy /home/yantar92/.emacs.d/straight/build/org/ob-forth hides /home/yantar92/Git/emacs/lisp/org/ob-forth /home/yantar92/.emacs.d/straight/build/org/ob-exp hides /home/yantar92/Git/emacs/lisp/org/ob-exp /home/yantar92/.emacs.d/straight/build/org/ob-eval hides /home/yantar92/Git/emacs/lisp/org/ob-eval /home/yantar92/.emacs.d/straight/build/org/ob-eshell hides /home/yantar92/Git/emacs/lisp/org/ob-eshell /home/yantar92/.emacs.d/straight/build/org/ob-dot hides /home/yantar92/Git/emacs/lisp/org/ob-dot /home/yantar92/.emacs.d/straight/build/org/ob-ditaa hides /home/yantar92/Git/emacs/lisp/org/ob-ditaa /home/yantar92/.emacs.d/straight/build/org/ob-css hides /home/yantar92/Git/emacs/lisp/org/ob-css /home/yantar92/.emacs.d/straight/build/org/ob-core hides /home/yantar92/Git/emacs/lisp/org/ob-core /home/yantar92/.emacs.d/straight/build/org/ob-emacs-lisp hides /home/yantar92/Git/emacs/lisp/org/ob-emacs-lisp /home/yantar92/.emacs.d/straight/build/org/ob-calc hides /home/yantar92/Git/emacs/lisp/org/ob-calc /home/yantar92/.emacs.d/straight/build/org/ob-clojure hides /home/yantar92/Git/emacs/lisp/org/ob-clojure /home/yantar92/.emacs.d/straight/build/org/ob-R hides 
/home/yantar92/Git/emacs/lisp/org/ob-R /home/yantar92/.emacs.d/straight/build/org/ob-comint hides /home/yantar92/Git/emacs/lisp/org/ob-comint /home/yantar92/.emacs.d/straight/build/org/ob-awk hides /home/yantar92/Git/emacs/lisp/org/ob-awk /home/yantar92/.emacs.d/straight/build/org/org-element hides /home/yantar92/Git/emacs/lisp/org/org-element /home/yantar92/.emacs.d/straight/build/org/ox hides /home/yantar92/Git/emacs/lisp/org/ox /home/yantar92/.emacs.d/straight/build/org/ox-texinfo hides /home/yantar92/Git/emacs/lisp/org/ox-texinfo /home/yantar92/.emacs.d/straight/build/org/ol hides /home/yantar92/Git/emacs/lisp/org/ol /home/yantar92/.emacs.d/straight/build/org/oc hides /home/yantar92/Git/emacs/lisp/org/oc /home/yantar92/.emacs.d/straight/build/org/org-macro hides /home/yantar92/Git/emacs/lisp/org/org-macro /home/yantar92/.emacs.d/straight/build/org/org-version hides /home/yantar92/Git/emacs/lisp/org/org-version /home/yantar92/.emacs.d/straight/build/map/map hides /home/yantar92/Git/emacs/lisp/emacs-lisp/map /home/yantar92/.emacs.d/straight/build/let-alist/let-alist hides /home/yantar92/Git/emacs/lisp/emacs-lisp/let-alist Features: (shadow emacsbug org-datetree elfeed-link windmove make-mode gnuplot-context gnuplot org-test ert-x ert finder autoinsert vc-hg vc-bzr vc-src vc-sccs vc-svn vc-cvs vc-rcs log-view helm-imenu latexenc oc-bibtex textsec uni-scripts idna-mapping ucs-normalize uni-confusable textsec-check helm-ring footnote descr-text dired-open all-the-icons-dired dired-filter dired-hide-dotfiles misearch multi-isearch cal-move org-learn network-stream url-cache preview font-latex w3m-form w3m-symbol tabify latex latex-flymake tex-ispell tex-style tex pdf-sync pdf-outline pdf-links pdf-history w3m doc-view w3m-hist w3m-fb bookmark-w3m w3m-ems w3m-favicon w3m-image tab-line w3m-proc w3m-util boon-moves er-basic-expansions expand-region-core expand-region-custom tex-mode compare-w mm-archive helm-command helm-elisp helm-eval helm-x-files helm-for-files 
helm-bookmark helm-external helm-net boon-main boon-hl boon-arguments multiple-cursors mc-separate-operations rectangular-region-mode mc-mark-pop mc-edit-lines mc-hide-unmatched-lines-mode mc-mark-more mc-cycle-cursors multiple-cursors-core boon-regs boon-utils cl-print tramp-archive tramp-gvfs cal-iso org-duration ffap org-table-sticky-header oc-basic highlight-indentation flymake-proc flymake elpy elpy-rpc pyvenv eshell esh-cmd esh-ext esh-opt esh-proc esh-io esh-arg esh-module esh-groups esh-util elpy-shell elpy-profile elpy-django elpy-refactor grep git-email-magit magit-patch git-email-notmuch git-email-piem git-email git-email-autoloads project-autoloads xref-autoloads piem-notmuch piem piem-maildir mail-extr piem-autoloads org-crypt helm-notmuch helm-notmuch-autoloads ol-notmuch ol-notmuch-autoloads org-eldoc org-table-sticky-header-autoloads posframe posframe-autoloads ob-async ob-async-autoloads ob-latex ob-dot ob-calc calc-store calc-trail ob-gnuplot ob-ditaa ob-C cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-langs cc-vars cc-defs cc-bytecomp ob-python ob-perl ob-org ob-shell ob-mathematica ob-mathematica-autoloads org-tempo tempo org-archive ox-md ox-beamer engrave-faces engrave-faces-autoloads ox-extra orgdiff orgdiff-autoloads doct ya-org-capture ya-org-capture-autoloads doct-autoloads org-capture-pop-frame org-capture-pop-frame-autoloads org-protocol org-analyzer-autoloads pomidor-autoloads alert-autoloads log4e-autoloads gntp-autoloads helm-org-ql helm-org org-clock org-autosort org-autosort-autoloads helm-org-contacts helm-org-contacts-autoloads org-contacts gnus-art mm-uu mml2015 gnus-sum gnus-group mm-url gnus-undo gnus-start gnus-dbus gnus-cloud nnimap nnmail mail-source utf7 nnoo gnus-spec gnus-int gnus-range gnus-win gnus org-contacts-autoloads helm-org-ql-autoloads helm-org-autoloads org-ql-search org-ql-view ov org-super-agenda org-ql peg ts org-ql-autoloads peg-autoloads ov-autoloads org-super-agenda-autoloads 
ts-autoloads org-quick-peek org-quick-peek-autoloads calfw-org calfw-org-autoloads calfw holidays holiday-loaddefs calfw-autoloads org-attach cdlatex reftex reftex-loaddefs reftex-vars texmathp cdlatex-autoloads org-capture-ref org-ref-url-utils org-ref org-ref-core org-ref-glossary org-ref-bibtex avy doi-utils org-ref-utils org-ref-export citeproc citeproc-itemgetters citeproc-biblatex citeproc-bibtex ol-bibtex citeproc-cite citeproc-subbibs citeproc-sort citeproc-name citeproc-formatters citeproc-number rst citeproc-proc citeproc-disamb citeproc-itemdata citeproc-generic-elements citeproc-macro citeproc-choose citeproc-date citeproc-context citeproc-prange citeproc-style citeproc-locale citeproc-term citeproc-rt citeproc-lib citeproc-s queue ox-pandoc ox-org ox-odt rng-loc rng-uri rng-parse rng-match rng-dt rng-util rng-pttrn nxml-parse nxml-ns nxml-enc xmltok nxml-util ox-latex ox-icalendar ox-html table ox-ascii ox-publish ox org-ref-misc-links org-ref-label-link org-ref-ref-links org-ref-citation-links org-ref-bibliography-links bibtex-completion biblio biblio-download biblio-dissemin biblio-ieee biblio-hal biblio-dblp biblio-crossref biblio-arxiv timezone biblio-doi biblio-core ido parsebib bibtex org-ref-autoloads ox-pandoc-autoloads citeproc-autoloads string-inflection-autoloads queue-autoloads bibtex-completion-autoloads biblio-autoloads biblio-core-autoloads parsebib-autoloads htmlize-autoloads scimax-inkscape scimax-inkscape-autoloads org-pdftools pdf-annot facemenu org-noter org-pdftools-autoloads org-noter-autoloads org-capture org-checklist org-habit org-edna org-edna-autoloads org-inlinetask org-drill persist org-agenda org-drill-autoloads persist-autoloads ol-info ol-w3m ol-doi org-link-doi speed-type speed-type-autoloads ement ement-notify ement-room ement-lib ement-api ement-structs ement-macros warnings dns ement-autoloads svg-lib-autoloads taxy-magit-section-autoloads taxy-autoloads map-autoloads plz plz-autoloads 0x0 0x0-autoloads 
notmuch-calendar-x notmuch-calendar-x-autoloads notmuch notmuch-tree notmuch-jump notmuch-hello notmuch-show notmuch-print notmuch-crypto notmuch-mua notmuch-message notmuch-draft notmuch-maildir-fcc notmuch-address notmuch-company notmuch-parser notmuch-wash coolj notmuch-query goto-addr icalendar diary-lib diary-loaddefs notmuch-tag notmuch-lib notmuch-version notmuch-compat w3m-autoloads elfeed-score elfeed-score-maint elfeed-score-scoring elfeed-score-serde elfeed-score-rule-stats elfeed-org org-element org-persist elfeed-org-autoloads quick-peek quick-peek-autoloads elfeed-show elfeed-search hideshow display-fill-column-indicator eros rainbow-delimiters highlight-numbers parent-mode easy-escape license-snippets yasnippet-snippets-autoloads yasnippet-snippets yasnippet elfeed-csv elfeed elfeed-curl elfeed-log elfeed-db elfeed-lib avl-tree url-queue xml-query elfeed-score-rules elfeed-score-log elfeed-score-autoloads elfeed-autoloads ytel-show-autoloads ytel ytel-autoloads qrencode-el-autoloads tb-keycast tb-keycast-autoloads gif-screencast gif-screencast-autoloads yaml-mode yaml-mode-autoloads mingus libmpdee cl mingus-autoloads libmpdee-autoloads calctex calc-sel calctex-autoloads shell-pop-autoloads eterm-256color-autoloads xterm-color-autoloads vterm term ehelp vterm-module term/xterm xterm vterm-autoloads diffpdf diffpdf-autoloads pdf-view-restore pdf-view-restore-autoloads pdf-occur ibuf-ext ibuffer ibuffer-loaddefs tablist tablist-filter semantic/wisent/comp semantic/wisent semantic/wisent/wisent semantic/util-modes semantic/util semantic semantic/tag semantic/lex semantic/fw mode-local cedet pdf-isearch pdf-misc pdf-tools pdf-roll pdf-view jka-compr pdf-cache pdf-info tq pdf-util pdf-macs pdf-tools-autoloads tablist-autoloads image-roll image-roll-autoloads wolfram-mode wolfram-mode-autoloads ledger-mode-autoloads auctex-autoloads tex-site ebuild-mode skeleton sh-script smie executable ebuild-mode-autoloads lua-mode lua-mode-autoloads gnuplot-autoloads 
eros-autoloads nameless nameless-autoloads paredit paredit-autoloads company-jedi company-jedi-autoloads jedi jedi-core python-environment epc ctable concurrent auto-complete jedi-autoloads auto-complete-autoloads jedi-core-autoloads python-environment-autoloads epc-autoloads ctable-autoloads concurrent-autoloads elpy-autoloads pyvenv-autoloads highlight-indentation-autoloads python helm-info which-key which-key-autoloads helm-descbinds helm-descbinds-autoloads elisp-demos elisp-demos-autoloads helpful info-look help-fns elisp-refs helpful-autoloads elisp-refs-autoloads tldr tldr-autoloads lsp-ui-autoloads lsp-mode-autoloads spinner-autoloads macrostep macrostep-autoloads highlight-refontification highlight-refontification-autoloads font-lock-profiler font-lock-profiler-autoloads font-lock-studio font-lock-studio-autoloads memory-usage memory-usage-autoloads bug-hunter bug-hunter-autoloads lorem-ipsum lorem-ipsum-autoloads license-snippets-autoloads yasnippet-autoloads move-text move-text-autoloads aggressive-indent aggressive-indent-autoloads visual-regexp-autoloads magit-bookmark bookmark mule-util helm-bm helm-bm-autoloads bm bm-autoloads helm-dash dash-docs helm-dash-autoloads dash-docs-autoloads disk-usage disk-usage-autoloads dired-git-info-autoloads dired-hide-dotfiles-autoloads dired-filter-autoloads diredfl diredfl-autoloads all-the-icons-dired-autoloads dired-async dired-open-autoloads dired-avfs dired-avfs-autoloads dired-narrow-autoloads dired-hacks-utils dired-hacks-utils-autoloads dired+ image-file image-converter dired-x dired-aux dired+-autoloads winner windower emacs-windower-autoloads goggles pulse skip-buffers-mode avy-autoloads eval-sexp-fu eval-sexp-fu-autoloads goggles-autoloads easy-escape-autoloads highlight-numbers-autoloads parent-mode-autoloads rainbow-delimiters-autoloads highlight-parentheses highlight-parentheses-autoloads flycheck-tip error-tip notifications dbus popup flycheck-tip-autoloads flycheck flycheck-autoloads 
pkg-info-autoloads epl-autoloads wordnut wordnut-history wordnut-u wordnut-autoloads smog smog-autoloads writegood-mode writegood-mode-autoloads langtool-ignore-fonts langtool-ignore-fonts-autoloads langtool compile langtool-autoloads el-patch-autoloads el-patch el-patch-stub flyspell ispell hi-lock ediff ediff-merg ediff-mult ediff-wind ediff-diff ediff-help ediff-init ediff-util browse-at-remote vc-git vc-dir ewoc vc vc-dispatcher f f-shortdoc shortdoc browse-at-remote-autoloads f-autoloads code-review code-review-actions code-review-comment code-review-section code-review-bitbucket code-review-faces shr pixel-fill kinsoku url-file svg xml dom emojify apropos tar-mode arc-mode archive-mode ht code-review-gitlab code-review-utils code-review-parse-hunk code-review-github code-review-db uuidgen calc-misc calc-ext calc calc-loaddefs rect calc-macs a code-review-interfaces deferred forge-list forge-commands forge-semi forge-bitbucket buck forge-gogs gogs forge-gitea gtea forge-gitlab glab forge-github ghub-graphql treepy gsexp ghub forge-notify forge-revnote forge-pullreq forge-issue forge-topic yaml bug-reference forge-post markdown-mode thingatpt forge-repo forge forge-core forge-db closql emacsql-sqlite emacsql emacsql-compiler url-http url-auth url-gw nsm magit-submodule magit-obsolete magit-blame magit-stash magit-reflog magit-bisect magit-push magit-pull magit-fetch magit-clone magit-remote magit-commit magit-sequence magit-notes magit-worktree magit-tag magit-merge magit-branch magit-reset magit-files magit-refs magit-status magit package let-alist browse-url url-handlers magit-repos magit-apply magit-wip magit-log which-func imenu edebug debug backtrace magit-diff smerge-mode diff diff-mode git-commit log-edit message sendmail yank-media rfc822 mml mailabbrev nnheader range mail-utils gmm-utils mailheader pcvs-util add-log magit-core magit-autorevert magit-margin magit-transient magit-process with-editor magit-mode transient magit-git magit-base magit-section 
crm compat-27 compat-26 code-review-autoloads emojify-autoloads ht-autoloads deferred-autoloads uuidgen-autoloads a-autoloads forge-autoloads yaml-autoloads markdown-mode-autoloads ghub-autoloads treepy-autoloads let-alist-autoloads emacsql-sqlite-autoloads emacsql-autoloads closql-autoloads magit-autoloads magit-section-autoloads git-commit-autoloads with-editor-autoloads transient-autoloads autorevert recentf tree-widget disp-table hl-todo pretty-symbols company-oddmuse company-keywords company-etags etags fileloop generator xref project company-gtags company-dabbrev-code company-dabbrev company-files company-clang company-capf company-cmake company-semantic company-template company-bbdb company persistent-scratch persistent-scratch-autoloads savehist backup-walker-autoloads company-autoloads helm-adaptive helm-mode helm-misc helm-files image-dired image-dired-tags image-dired-external image-dired-util xdg image-mode dired desktop frameset dired-loaddefs exif filenotify tramp tramp-cache time-stamp tramp-loaddefs trampver tramp-integration cus-edit pp cus-load wid-edit tramp-compat shell parse-time iso8601 ls-lisp helm-buffers helm-occur helm-tags helm-locate helm-grep helm-regexp helm-utils helm-help helm-types helm helm-global-bindings helm-easymenu helm-core async-bytecomp helm-source helm-multi-match helm-lib helm-autoloads popup-autoloads helm-core-autoloads face-remap pyim pyim-cloudim url url-proxy url-privacy url-expand url-methods url-history url-cookie url-domsuf mm-view mml-smime mml-sec epa epg rfc6068 epg-config gnus-util text-property-search smime gnutls puny dig mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr mailcap pyim-probe pyim-preview pyim-page pyim-indicator pyim-dregcache pyim-dhashcache sort pyim-dict async pyim-autoselector pyim-process pyim-punctuation pyim-outcome pyim-candidates pyim-cstring pyim-cregexp xr pyim-codes pyim-imobjs pyim-pinyin pyim-entered pyim-dcache url-util url-parse 
auth-source eieio eieio-core password-cache json map url-vars pyim-pymap pyim-scheme pyim-common pyim-autoloads xr-autoloads async-autoloads reverse-im quail reverse-im-autoloads hydra lv boon-qwerty color olivetti straight-x boon boon-keys boon-core advice boon-loaddefs boon-autoloads multiple-cursors-autoloads expand-region-autoloads meta-functions org-id org-refile dash meta-functions-autoloads dash-autoloads hl-line memoize memoize-autoloads info-colors info-colors-autoloads hl-todo-autoloads latex-pretty-symbols latex-pretty-symbols-autoloads pretty-symbols-autoloads page-break-lines page-break-lines-autoloads edmacro kmacro adaptive-wrap adaptive-wrap-autoloads olivetti-autoloads shackle trace shackle-autoloads use-package-diminish all-the-icons all-the-icons-faces data-material data-weathericons data-octicons data-fileicons data-faicons data-alltheicons all-the-icons-autoloads org ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-footnote org-src ob-comint org-pcomplete pcomplete comint files-x derived osc ansi-color ring org-list org-entities time-date noutline outline icons ob-emacs-lisp ob-core ob-eval org-cycle org-font-lock org-font-lock-core org-element-match org-faces org-table ol org-fold org-fold-core org-keys oc org-loaddefs find-func cal-menu calendar cal-loaddefs org-version org-compat org-font-lock-obsolete org-macs format-spec rx modus-operandi-theme modus-themes modus-themes-autoloads s s-autoloads asoc asoc.el-autoloads no-littering compat no-littering-autoloads compat-autoloads hydra-autoloads lv-autoloads finder-inf use-package-bind-key org-contrib-autoloads bind-key diminish diminish-autoloads use-package-core use-package-autoloads bind-key-autoloads straight-autoloads cl-extra help-mode straight info autoload loaddefs-gen generate-lisp-file radix-tree lisp-mnt easy-mmode cl-seq pcase subr-x byte-opt cl-macs gv cl-loaddefs cl-lib bytecomp byte-compile cconv server rmc iso-transl tooltip eldoc paren electric uniquify ediff-hook 
vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic indonesian philippine cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget keymap hashtable-print-readable backquote threads dbusbind inotify lcms2 dynamic-setting system-font-setting font-render-setting cairo move-toolbar gtk x-toolkit xinput2 x multi-tty make-network-process emacs) Memory information: ((conses 16 8304927 6682874) (symbols 48 111731 347) (strings 32 1998260 614327) (string-bytes 1 74322513) (vectors 16 646829) (vector-slots 8 11831847 5926232) (floats 8 156374 74631) (intervals 56 356161 102742) (buffers 984 132)) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 1:26 bug#58558: 29.0.50; re-search-forward is slow in some buffers Ihor Radchenko @ 2022-10-16 9:19 ` Lars Ingebrigtsen 2022-10-16 9:34 ` Ihor Radchenko 2023-04-10 8:48 ` Mattias Engdegård 1 sibling, 1 reply; 81+ messages in thread From: Lars Ingebrigtsen @ 2022-10-16 9:19 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558 Ihor Radchenko <yantar92@posteo.net> writes: > It happens consistently in Emacs 29, but not in all buffers. Sometimes, > it only happens after some time after Emacs startup. The slowdown is not > there in Emacs 28. Is there anything special about buffers where you see these slowdowns? For instance, a large number of text properties or overlays? (length (object-intervals (current-buffer))) will tell you how many text properties there are (sort of), and (length (overlays-in (point-min) (point-max))) should tell you the same for overlays. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 9:19 ` Lars Ingebrigtsen @ 2022-10-16 9:34 ` Ihor Radchenko 2022-10-16 9:37 ` Lars Ingebrigtsen 2023-02-19 12:17 ` Dmitry Gutov 0 siblings, 2 replies; 81+ messages in thread From: Ihor Radchenko @ 2022-10-16 9:34 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 58558 Lars Ingebrigtsen <larsi@gnus.org> writes: >> It happens consistently in Emacs 29, but not in all buffers. Sometimes, >> it only happens after some time after Emacs startup. The slowdown is not >> there in Emacs 28. > > Is there anything special about buffers where you see these slowdowns? This is a large complex Org buffer. > For instance, a large number of text properties or overlays? > > (length (object-intervals (current-buffer))) => 101075 (took over 10sec to complete the command) > will tell you how many text properties there are (sort of), and > > (length (overlays-in (point-min) (point-max))) > > should tell you the same for overlays. => 1 -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 9:34 ` Ihor Radchenko @ 2022-10-16 9:37 ` Lars Ingebrigtsen 2022-10-16 10:02 ` Ihor Radchenko 2023-02-19 12:17 ` Dmitry Gutov 1 sibling, 1 reply; 81+ messages in thread From: Lars Ingebrigtsen @ 2022-10-16 9:37 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558 Ihor Radchenko <yantar92@posteo.net> writes: >> Is there anything special about buffers where you see these slowdowns? > > This is a large complex Org buffer. > >> For instance, a large number of text properties or overlays? >> >> (length (object-intervals (current-buffer))) > > => 101075 (took over 10sec to complete the command) If you switch the buffer to `clean-mode' (which should remove all text props), does the slowdown disappear? In that case, it seems likely that the slowdown is connected to text properties, somehow. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 9:37 ` Lars Ingebrigtsen @ 2022-10-16 10:02 ` Ihor Radchenko 2022-10-16 10:04 ` Lars Ingebrigtsen 2022-10-16 10:36 ` Eli Zaretskii 0 siblings, 2 replies; 81+ messages in thread From: Ihor Radchenko @ 2022-10-16 10:02 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 58558 Lars Ingebrigtsen <larsi@gnus.org> writes: > If you switch the buffer to `clean-mode' (which should remove all text > props), does the slowdown disappear? In that case, it seems likely that > the slowdown is connected to text properties, somehow. The slowdown becomes slightly better, but nowhere close to Emacs 28: ;; Emacs 29 ;; Elapsed time: 16.953404s ;; Emacs 29 + clean-mode ;; Elapsed time: 13.290568s ;; Emacs 28 ;; Elapsed time: 0.869748s I did (setq yant/re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)") (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 10:02 ` Ihor Radchenko @ 2022-10-16 10:04 ` Lars Ingebrigtsen 2022-10-16 10:53 ` Ihor Radchenko 2022-10-16 10:36 ` Eli Zaretskii 1 sibling, 1 reply; 81+ messages in thread From: Lars Ingebrigtsen @ 2022-10-16 10:04 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558 Ihor Radchenko <yantar92@posteo.net> writes: >> If you switch the buffer to `clean-mode' (which should remove all text >> props), does the slowdown disappear? In that case, it seems likely that >> the slowdown is connected to text properties, somehow. > > The slowdown becomes slightly better, but nowhere close to Emacs 28: > > ;; Emacs 29 > ;; Elapsed time: 16.953404s > ;; Emacs 29 + clean-mode > ;; Elapsed time: 13.290568s > ;; Emacs 28 > ;; Elapsed time: 0.869748s Hm... Another test -- could you try `find-file-literally' on the Org file and repeat the search? ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 10:04 ` Lars Ingebrigtsen @ 2022-10-16 10:53 ` Ihor Radchenko 2022-10-16 11:01 ` Lars Ingebrigtsen 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2022-10-16 10:53 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 58558 Lars Ingebrigtsen <larsi@gnus.org> writes: >> The slowdown becomes slightly better, but nowhere close to Emacs 28: >> >> ;; Emacs 29 >> ;; Elapsed time: 16.953404s >> ;; Emacs 29 + clean-mode >> ;; Elapsed time: 13.290568s >> ;; Emacs 28 >> ;; Elapsed time: 0.869748s > > Hm... Another test -- could you try `find-file-literally' on the Org > file and repeat the search? I just switched between Emacs 28 and Emacs 29 and I do note that right after loading Emacs and the Org file, Emacs 29 takes similar time with Emacs 28. I do know that things will get back to slow after a while. This problem has been around on my machine for a long time. I will report once I use Emacs long enough to observe the slowdown. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 10:53 ` Ihor Radchenko @ 2022-10-16 11:01 ` Lars Ingebrigtsen 2022-10-16 11:21 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-16 11:01 UTC
To: Ihor Radchenko; +Cc: 58558

> I just switched between Emacs 28 and Emacs 29 and I do note that
> right after loading Emacs and the Org file, Emacs 29 takes similar time
> with Emacs 28.

Huh, very odd. Almost as if something is... fragmenting in the buffer? We do have many caches and stuff -- perhaps something is... degrading?

I guess some C-level perf measurements would be handy here, but that's not something I know much about. Anybody?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 11:01 ` Lars Ingebrigtsen @ 2022-10-16 11:21 ` Eli Zaretskii 2022-10-16 14:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-10-16 11:21 UTC (permalink / raw) To: Lars Ingebrigtsen, Stefan Monnier; +Cc: 58558, yantar92 > Cc: 58558@debbugs.gnu.org > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Sun, 16 Oct 2022 13:01:48 +0200 > > > I just switched between Emacs 28 and Emacs 29 and I do note that > > right after loading Emacs and the Org file, Emacs 29 takes similar time > > with Emacs 28. > > Huh, very odd. Almost as something is... fragmenting in the buffer? > We do have many caches and stuff -- perhaps something is... degrading? > > I guess some C-level perf measurements would be handy here, but that's > not something I know much about. Anybody? AFAIU, we use elaborate caching for regular expressions, so maybe that is related. Stefan, any ideas? ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 11:21 ` Eli Zaretskii @ 2022-10-16 14:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2022-10-17 0:56 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-10-16 14:23 UTC
To: Eli Zaretskii; +Cc: 58558, Lars Ingebrigtsen, yantar92

>> Huh, very odd. Almost as if something is... fragmenting in the buffer?
>> We do have many caches and stuff -- perhaps something is... degrading?
>>
>> I guess some C-level perf measurements would be handy here, but that's
>> not something I know much about. Anybody?
>
> AFAIU, we use elaborate caching for regular expressions, so maybe that
> is related. Stefan, any ideas?

The regexp cache hasn't changed between 28 and 29, so that seems unlikely to be the source of the problem. But that cache is fairly simple-minded, so it's possible that for some reason it thrashes in Emacs-29 but not in Emacs-28 (but see below).

IIUC a summary of what we know so far:
- the "yant/re" benchmark is ~20x slower in Emacs-29 than in Emacs-28.
- removing all text properties reduces the factor down to about ~15x.
- that difference is absent after a fresh start: it only appears over time.

Since this benchmark always matches the same regexp, I can't imagine how the regexp cache could thrash, so it definitely seems to come from something else.

I'd be curious to know the result of the following tests:

- Run the same benchmark twice in a row: does the second run take the
  same time as the first, or is the second run significantly faster?
  [ if it's faster it might be due to something like the on-the-fly
  `syntax-propertize`ation.
  BTW, what does the profiler-start/report say?
  Is the time 100% spent in `re-search-forward`? ]

- Try to reduce the number of "features" used in the regexp to see how
  it affects the slowdown. Maybe try a "binary search" where you try
  to reduce the regexp to something much simpler and see if some regexps
  exhibit the slowdown while others don't?

Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-16 14:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-10-17 0:56 ` Ihor Radchenko 2022-10-18 11:50 ` Lars Ingebrigtsen 0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-10-17 0:56 UTC
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, Lars Ingebrigtsen

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> IIUC a summary of what we know so far:
> - the "yant/re" benchmark is ~20x slower in Emacs-29 than in Emacs-28.
> - removing all text properties reduces the factor down to about ~15x.
> - that difference is absent after a fresh start: it only appears over time.
>
> Since this benchmark always matches the same regexp, I can't imagine how
> the regexp cache could thrash, so it definitely seems to come from
> something else.
>
> I'd be curious to know the result of the following tests:
>
> - Run the same benchmark twice in a row: does the second run take the
>   same time as the first, or is the second run significantly faster?
>   [ if it's faster it might be due to something like the on-the-fly
>   `syntax-propertize`ation.

After 11 hours of Emacs uptime and some edits in the buffer (actually, just a few hours; mostly idle), running the benchmark-progn repeatedly:

;; Elapsed time: 8.339753s
;; Elapsed time: 9.243140s
;; Elapsed time: 9.868761s
;; Elapsed time: 10.330362s
;; Elapsed time: 11.279218s
;; Elapsed time: 13.581893s
;; Elapsed time: 13.675609s
;; Elapsed time: 14.553157s
;; Elapsed time: 14.651782s
;; Elapsed time: 17.253983s

The elapsed time gradually increases. It is definitely a clue, but a very odd one.

>   BTW, what does the profiler-start/report say?
>   Is the time 100% spent in `re-search-forward`? ]

;; with CPU profiler
;; Elapsed time: 19.628828s
;; profiler:
;; 19954 99% - command-execute
;; 19926 99% - funcall-interactively
;; 19627 98% - eval-expression
;; 19627 98% - let
;; 19627 98% - progn
;; 19627 98% while
;; ------------ no more data inside while ---------

Nothing useful. It's as if the while loop is doing something bad, but how so in

(benchmark-progn (while (re-search-forward yant/re nil t)))

??

I also tried find-file-literally and the timing gets back to that of a fresh Emacs (even faster):

;; find-file-literally
;; Elapsed time: 0.592935s

Then, I re-opened the file normally.

;; re-open the file
;; Elapsed time: 7.348727s

Note how the time is back to 7-8 seconds, but not the same as in a fresh Emacs.

> - Try to reduce the number of "features" used in the regexp to see how
>   it affects the slowdown. Maybe try a "binary search" where you try
>   to reduce the regexp to something much simpler and see if some regexps
>   exhibit the slowdown while others don't?

Hmm. I tried a very simple regexp "^\\*+ " 10 times in a row:

;; Elapsed time: 0.267681s
;; Elapsed time: 0.381607s
;; Elapsed time: 0.342378s
;; Elapsed time: 0.350618s
;; Elapsed time: 0.376871s
;; Elapsed time: 0.446346s
;; Elapsed time: 0.472543s
;; Elapsed time: 0.529925s
;; Elapsed time: 0.604101s
;; Elapsed time: 0.665601s

It is generally faster, but still relatively slow and gets worse over time.

--
Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-17 0:56 ` Ihor Radchenko @ 2022-10-18 11:50 ` Lars Ingebrigtsen 2022-10-18 14:58 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-18 11:50 UTC
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, Stefan Monnier

Ihor Radchenko <yantar92@posteo.net> writes:

> After 11 hours of Emacs uptime and some edits in the buffer (actually,
> just a few hours; mostly idle), running the benchmark-progn
> repetitively:
>
> ;; Elapsed time: 8.339753s
> ;; Elapsed time: 9.243140s
> ;; Elapsed time: 9.868761s
> ;; Elapsed time: 10.330362s
> ;; Elapsed time: 11.279218s
> ;; Elapsed time: 13.581893s
> ;; Elapsed time: 13.675609s
> ;; Elapsed time: 14.553157s
> ;; Elapsed time: 14.651782s
> ;; Elapsed time: 17.253983s
>
> The elapsed time gradually increases. It is definitely a clue, but very
> odd one.

The slowdowns are so dramatic that they should show up in a profiler -- which might give us a clue as to which parts of Emacs are slowing down.

I briefly tried to use "perf" under Linux to connect to a running Emacs and get some data out of it, but... er... I've never used it before, and...

Does anybody have a recipe for how to do runtime function tracing for a running process?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-18 11:50 ` Lars Ingebrigtsen @ 2022-10-18 14:58 ` Eli Zaretskii 2022-10-18 18:19 ` Lars Ingebrigtsen 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-10-18 14:58 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 58558, yantar92, monnier > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 58558@debbugs.gnu.org, Eli > Zaretskii <eliz@gnu.org> > Date: Tue, 18 Oct 2022 13:50:02 +0200 > > The slowdowns are so dramatic that they should show up on a profiler -- > which might give us a clue which parts of Emacs is slowing down. Right. > I briefly tried to use "perf" under Linux to connect to a running > Emacs and get some data out of it, but... er... I've never used it > before, and... > > Does anybody have a recipe for how to do runtime function tracing for a > running process? The way I run perf is to start Emacs under perf to begin with. What did you try? It was quite simple, AFAIR, last time I tried. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-18 14:58 ` Eli Zaretskii @ 2022-10-18 18:19 ` Lars Ingebrigtsen 2022-10-18 18:38 ` Eli Zaretskii 2022-12-13 10:28 ` Ihor Radchenko 0 siblings, 2 replies; 81+ messages in thread
From: Lars Ingebrigtsen @ 2022-10-18 18:19 UTC
To: Eli Zaretskii; +Cc: 58558, yantar92, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> I briefly tried to use "perf" under Linux to connect to a running
>> Emacs and get some data out of it, but... er... I've never used it
>> before, and...
>>
>> Does anybody have a recipe for how to do runtime function tracing for a
>> running process?
>
> The way I run perf is to start Emacs under perf to begin with.
>
> What did you try? It was quite simple, AFAIR, last time I tried.

I thought it might be easier to see the differences in results if one first attached perf to a fresh (fast) Emacs and got the trace, and then waited until Emacs got slow, and repeated the same thing under another trace.

perf is able to do this by:

perf record -p <PID> -g

and

perf report

then shows me stuff, but I don't even know what to look for when interpreting that. Or whether perf is, indeed, the right tool for this task.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-18 18:19 ` Lars Ingebrigtsen @ 2022-10-18 18:38 ` Eli Zaretskii 2022-12-13 10:28 ` Ihor Radchenko 1 sibling, 0 replies; 81+ messages in thread From: Eli Zaretskii @ 2022-10-18 18:38 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 58558, yantar92, monnier > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: yantar92@posteo.net, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org > Date: Tue, 18 Oct 2022 20:19:24 +0200 > > > What did you try? It was quite simple, AFAIR, last time I tried. > > I thought it might be easier to see the differences in results if one > first attached perf to a fresh (fast) Emacs and got the trace, and the > waited until Emacs got slow, and repeated the same thing under another > trace. > > perf is able to do this by: > > perf record -p <PID> -g I never tried that, always started Emacs under perf to begin with. > and > > perf report > > then shows me stuff, but I don't even know what to look for when > interpreting that. I thought you wanted to compare two or more profiles taken at different times? Then looking at percentages of the same functions could tell something. Since the complaint is about regexp search, I guess re_compile_pattern and re_match_2_internal and their subroutines would be the immediate suspects. Or maybe re-search-forward, which is a couple of levels higher. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-10-18 18:19 ` Lars Ingebrigtsen 2022-10-18 18:38 ` Eli Zaretskii @ 2022-12-13 10:28 ` Ihor Radchenko 2022-12-13 13:11 ` Eli Zaretskii 2022-12-13 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 10:28 UTC
To: Lars Ingebrigtsen; +Cc: 58558, Eli Zaretskii, monnier

Lars Ingebrigtsen <larsi@gnus.org> writes:

> I thought it might be easier to see the differences in results if one
> first attached perf to a fresh (fast) Emacs and got the trace, and then
> waited until Emacs got slow, and repeated the same thing under another
> trace.
>
> perf is able to do this by:
>
> perf record -p <PID> -g
>
> and
>
> perf report
>
> then shows me stuff, but I don't even know what to look for when
> interpreting that. Or whether perf is, indeed, the right tool for this
> task.

Ok. I got around to trying perf, and it turned out to be very easy to get started.

perf record -p <PID> + perf report already appears to give a clue:

88.27% emacs emacs-30-vcs [.] buf_bytepos_to_charpos
 3.75% emacs emacs-30-vcs [.] re_match_2_internal
 1.35% emacs emacs-30-vcs [.] scan_sexps_forward
 1.03% emacs emacs-30-vcs [.] re_search_2
 0.65% emacs emacs-30-vcs [.] find_interval
 0.56% emacs emacs-30-vcs [.] sub_char_table_ref
 0.55% emacs emacs-30-vcs [.] lookup_char_property

The fraction of buf_bytepos_to_charpos increases over repeated benchmark runs.

In contrast, using find-file-literally produces

34.44% emacs emacs-30-vcs [.] re_match_2_internal
25.55% emacs emacs-30-vcs [.] scan_sexps_forward
11.09% emacs emacs-30-vcs [.] re_search_2
...
 0.59% emacs emacs-30-vcs [.] buf_bytepos_to_charpos

with buf_bytepos_to_charpos taking a diminishing fraction of the CPU samples.

Any ideas on what I can do next?

--
Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 10:28 ` Ihor Radchenko @ 2022-12-13 13:11 ` Eli Zaretskii 2022-12-13 13:32 ` Ihor Radchenko 2022-12-13 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 13:11 UTC
To: Ihor Radchenko; +Cc: 58558, larsi, monnier

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca,
> 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 10:28:57 +0000
>
> Ok. I got around to try perf, and it turned out to be very easy to get
> started.
>
> perf record -p <PID> + perf report already appear to give some clue:
>
> 88.27% emacs emacs-30-vcs [.] buf_bytepos_to_charpos
>  3.75% emacs emacs-30-vcs [.] re_match_2_internal
>  1.35% emacs emacs-30-vcs [.] scan_sexps_forward
>  1.03% emacs emacs-30-vcs [.] re_search_2
>  0.65% emacs emacs-30-vcs [.] find_interval
>  0.56% emacs emacs-30-vcs [.] sub_char_table_ref
>  0.55% emacs emacs-30-vcs [.] lookup_char_property
>
> The fraction of buf_bytepos_to_charpos increases over repeated benchmark
> runs.

So buf_bytepos_to_charpos is the main suspect now, I guess. This could happen either because (a) buf_bytepos_to_charpos is called more times as session uptime progresses, or because (b) each call to buf_bytepos_to_charpos becomes more and more expensive. So I think the first question is: how many times is buf_bytepos_to_charpos called for each search, or, equivalently, does the CPU time per call used up by buf_bytepos_to_charpos stay stable or go up? I think perf can answer these questions if you ask nicely.

If the number of calls is the same, but each call becomes more and more expensive, then the next step is to ask perf to produce a detailed profile for each line of buf_bytepos_to_charpos, and see which parts of it become more expensive. I can think of a couple of possible reasons for that, but I'd rather not speculate about profiles, as that is known to produce wrong guesses.

Is the buffer in question being edited as time advances? Or is buffer text and everything else in the buffer left unchanged?

> In contrast, using find-file-literally produces
>
> 34.44% emacs emacs-30-vcs [.] re_match_2_internal
> 25.55% emacs emacs-30-vcs [.] scan_sexps_forward
> 11.09% emacs emacs-30-vcs [.] re_search_2
> ...
>  0.59% emacs emacs-30-vcs [.] buf_bytepos_to_charpos
>
> with buf_bytepos_to_charpos taking diminishing cpu sample fraction.

That find-file-literally yields a buffer with a much faster buf_bytepos_to_charpos is not surprising: when each character is a single byte, the conversion is trivial, and buf_bytepos_to_charpos returns immediately. The puzzling part is not that buf_bytepos_to_charpos is much more expensive in a buffer with non-ASCII text; the puzzle is why it becomes more and more expensive with time.

Thanks.
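Editorial sketch (not part of the original message): the point about find-file-literally can be illustrated with a simplified model of a byte-position to character-position conversion. This is not the Emacs implementation; the function name model_bytepos_to_charpos, the 0-based positions, and the plain UTF-8 input are assumptions made for the example. It only shows why the conversion is free when every character is one byte wide and requires scanning otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model, NOT the Emacs implementation.  TEXT holds
   TOTAL_BYTES bytes of valid UTF-8 that encode TOTAL_CHARS characters.
   Positions are 0-based here, and BYTEPOS must fall on a character
   boundary.  The function name is hypothetical.  */
size_t
model_bytepos_to_charpos(const unsigned char *text, size_t total_bytes,
                         size_t total_chars, size_t bytepos)
{
    assert(bytepos <= total_bytes);

    /* Fast path: every character is one byte wide, so byte positions
       and character positions coincide and no scanning is needed.  */
    if (total_bytes == total_chars)
        return bytepos;

    /* Slow path: count the bytes that start a character, i.e. those
       that are not UTF-8 continuation bytes (bit pattern 10xxxxxx).  */
    size_t charpos = 0;
    for (size_t i = 0; i < bytepos; i++)
        if ((text[i] & 0xC0) != 0x80)
            charpos++;
    return charpos;
}
```

In this model the slow path costs time proportional to the distance scanned, which is why a real implementation would want known (byte position, character position) pairs, such as markers, to start from.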
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 13:11 ` Eli Zaretskii @ 2022-12-13 13:32 ` Ihor Radchenko 2022-12-13 14:28 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 13:32 UTC
To: Eli Zaretskii; +Cc: 58558, larsi, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> The fraction of buf_bytepos_to_charpos increases over repeated benchmark
>> runs.
>
> So buf_bytepos_to_charpos is the main suspect now, I guess. This
> could happen either because (a) buf_bytepos_to_charpos is called more
> times as session uptime progresses,

Just to clarify: the perf records I made cover roughly the duration of the benchmark-run calls. Nothing more.

> or because (b) each call to
> buf_bytepos_to_charpos becomes more and more expensive. So I think
> the first question is: how many times is buf_bytepos_to_charpos called
> for each search, or, equivalently, does the CPU time per call used up by
> buf_bytepos_to_charpos stay stable or go up? I think perf can
> answer these questions if you ask nicely.

I will look into how to do it. Maybe perf probe.
I guess it will be useful to compile Emacs with debug symbols at this point.

> Is the buffer in question being edited as time advances? Or is buffer
> text and everything else in the buffer left unchanged?

Not edited between benchmarks. Remember that I did a sequence of benchmark-run calls and the time gradually increased.

--
Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 13:32 ` Ihor Radchenko @ 2022-12-13 14:28 ` Eli Zaretskii 2022-12-13 15:56 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-12-13 14:28 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org > Date: Tue, 13 Dec 2022 13:32:13 +0000 > > > or (b) because each call to > > buf_bytepos_to_charpos becomes more and more expensive. So I think > > the first question is: how many times is buf_bytepos_to_charpos called > > for each search, or, equivalently, is the CPU time per call used up by > > buf_bytepos_to_charpos stays stable or goes up? I think perf can > > answer these questions if you ask nicely. > > I will look how to do it. Maybe perf probe. > I guess, it will be useful to compile Emacs with debug symbols at this > point. AFAIR, you can ask perf to profile a single function, and you can ask it to annotate the profile with the source code. > > Is the buffer in question being edited as time advances? Or is buffer > > text and everything else in the buffer left unchanged? > > Not edited between benchmarks. Remember that I did sequence of > benchmark-run calls and the time gradually increases. OK, so it looks more and more like each call becomes more expensive for some reason. But let's see the numbers before jumping to conclusions. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 14:28 ` Eli Zaretskii @ 2022-12-13 15:56 ` Ihor Radchenko 2022-12-13 16:08 ` Eli Zaretskii 2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 15:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> I will look how to do it. Maybe perf probe.
>> I guess, it will be useful to compile Emacs with debug symbols at this
>> point.
>
> AFAIR, you can ask perf to profile a single function, and you can ask
> it to annotate the profile with the source code.

I now compiled Emacs with debug symbols, waited enough to see observable
increase in the benchmark-run timing, and recorded the perf data.

buf_bytepos_to_charpos is still on the top

    78.06%  emacs  emacs  [.] buf_bytepos_to_charpos
     3.00%  emacs  emacs  [.] re_match_2_internal
     1.05%  emacs  emacs  [.] find_interval
     1.04%  emacs  emacs  [.] CHAR_TABLE_REF_ASCII
     0.85%  emacs  emacs  [.] make_lisp_symbol
     0.80%  emacs  emacs  [.] re_search_2
     0.76%  emacs  emacs  [.] builtin_lisp_symbol
     0.62%  emacs  emacs  [.] PSEUDOVECTORP

The specific place in the code is:

perf annotate -s buf_bytepos_to_charpos

          : 352    for (tail = BUF_MARKERS (b); tail; tail = tail->next)
     0.00 : 237e53: mov -0xe8(%rbp),%rax
     0.00 : 237e5a: mov 0x2e8(%rax),%rax
     0.01 : 237e61: mov 0x80(%rax),%rax
     0.00 : 237e68: mov %rax,-0xc0(%rbp)
     0.00 : 237e6f: jmp 237fc6 <buf_bytepos_to_charpos+0x7ba>
          : 353    {
          : 354    CONSIDER (tail->bytepos, tail->charpos);
     0.02 : 237e74: mov -0xc0(%rbp),%rax
    47.07 : 237e7b: mov 0x28(%rax),%rax
     7.27 : 237e7f: mov %rax,-0x38(%rbp)
     0.02 : 237e83: movl $0x0,-0xc4(%rbp)
     9.05 : 237e8d: mov -0x38(%rbp),%rax
     0.01 : 237e91: cmp -0xf0(%rbp),%rax
     3.73 : 237e98: jne 237eb2 <buf_bytepos_to_charpos+0x6a6>
     0.00 : 237e9a: mov -0xc0(%rbp),%rax
     0.00 : 237ea1: mov 0x20(%rax),%rax
     0.00 : 237ea5: mov %rax,-0x28(%rbp)
     0.00 : 237ea9: mov -0x28(%rbp),%rax
     0.00 : 237ead: jmp 2381cd <buf_bytepos_to_charpos+0x9c1>
     2.14 : 237eb2: mov -0x38(%rbp),%rax
     1.87 : 237eb6: cmp -0xf0(%rbp),%rax
     0.85 : 237ebd: jle 237ef5 <buf_bytepos_to_charpos+0x6e9>
     2.32 : 237ebf: mov -0x38(%rbp),%rax
     0.04 : 237ec3: cmp -0xb0(%rbp),%rax
     2.56 : 237eca: jge 237f29 <buf_bytepos_to_charpos+0x71d>

My guess: number of markers is growing somehow?

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 15:56 ` Ihor Radchenko @ 2022-12-13 16:08 ` Eli Zaretskii 2022-12-13 17:43 ` Ihor Radchenko 2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-12-13 16:08 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org > Date: Tue, 13 Dec 2022 15:56:33 +0000 > > My guess: number of markers is growing somehow? That was my guess, yeah. So now the question becomes: who creates all those additional markers if all you do is run the benchmark? If no other idea to find this out comes up, maybe run this with a breakpoint in make-marker, look at the backtrace to see the callers. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 16:08 ` Eli Zaretskii @ 2022-12-13 17:43 ` Ihor Radchenko 2022-12-13 17:52 ` Eli Zaretskii 2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 17:43 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> My guess: number of markers is growing somehow?
>
> That was my guess, yeah.
>
> So now the question becomes: who creates all those additional markers
> if all you do is run the benchmark?
>
> If no other idea to find this out comes up, maybe run this with a
> breakpoint in make-marker, look at the backtrace to see the callers.

I tried gdb now with break Fmake_marker.

The benchmark itself does not trigger the breakpoint.
However, a huge number (hundreds) of breakpoint hits is generated upon
finishing the benchmark execution.

bt:

#0  Fmake_marker () at alloc.c:3736
#1  0x00005555557bb750 in Fmatch_data (integers=0x0, reuse=0x0, reseat=0x0) at search.c:2903
#2  0x000055555580eb6d in funcall_subr (subr=0x555555e0dc20 <Smatch_data>, numargs=0, args=0x7ffff0c02070) at eval.c:3038
#3  0x00005555558634c1 in exec_byte_code (fun=0x555557370195, args_template=0, nargs=0, args=0x0) at bytecode.c:809
#4  0x000055555580ee6b in fetch_and_exec_byte_code (fun=0x555557370195, args_template=0, nargs=0, args=0x0) at eval.c:3081
#5  0x000055555580f5a8 in funcall_lambda (fun=0x555557370195, nargs=1, arg_vector=0x7ffff0c02038) at eval.c:3242
#6  0x000055555580e688 in funcall_general (fun=0x555557370195, numargs=1, args=0x7ffff0c02038) at eval.c:2945
#7  0x00005555558634e1 in exec_byte_code (fun=0x55555734c7cd, args_template=0, nargs=0, args=0x0) at bytecode.c:811
#8  0x000055555580ee6b in fetch_and_exec_byte_code (fun=0x55555734c7cd, args_template=0, nargs=0, args=0x0) at eval.c:3081
#9  0x000055555580f5a8 in funcall_lambda (fun=0x55555734c7cd, nargs=1, arg_vector=0x7fffffff6ce0) at eval.c:3242
#10 0x000055555580f00c in apply_lambda (fun=0x55555734c7cd, args=0x555557f7a2c3, count=...) at eval.c:3103
#11 0x000055555580d591 in eval_sub (form=0x555557f7a2b3) at eval.c:2545
#12 0x00005555558084f0 in Fsetq (args=0x555557f7a2a3) at eval.c:483
#13 0x000055555580cfa8 in eval_sub (form=0x555557f7a293) at eval.c:2449
#14 0x00005555558083bc in Fprogn (body=0x555557f7a363) at eval.c:436
#15 0x0000555555809b4e in Flet (args=0x555557f7a283) at eval.c:1026
#16 0x000055555580cfa8 in eval_sub (form=0x555557f7a223) at eval.c:2449
#17 0x000055555580d151 in eval_sub (form=0x555557f712b3) at eval.c:2465
#18 0x000055555580efa6 in apply_lambda (fun=0x555557f8049d, args=0x555557f712a3, count=...) at eval.c:3098
#19 0x000055555580d591 in eval_sub (form=0x555557f71883) at eval.c:2545
#20 0x000055555580cac8 in Feval (form=0x555557f71883, lexical=0x0) at eval.c:2361
#21 0x000055555580eb37 in funcall_subr (subr=0x555555e11ea0 <Seval>, numargs=1, args=0x7fffffff7788) at eval.c:3036
#22 0x000055555580e63c in funcall_general (fun=0x555555e11ea5 <Seval+5>, numargs=1, args=0x7fffffff7788) at eval.c:2941
#23 0x000055555580e909 in Ffuncall (nargs=2, args=0x7fffffff7780) at eval.c:2995
#24 0x000055555580ab30 in internal_condition_case_n (bfun=0x55555580e7eb <Ffuncall>, nargs=2, args=0x7fffffff7780, handlers=0x30, hfun=0x5555555ccfe7 <safe_eval_handler>) at eval.c:1558
#25 0x00005555555cd24c in safe__call (inhibit_quit=true, nargs=2, func=0x6900, ap=0x7fffffff7840) at xdisp.c:3024
#26 0x00005555555cd450 in safe__call1 (inhibit_quit=true, fn=0x6900) at xdisp.c:3060
#27 0x00005555555cd4e0 in safe__eval (inhibit_quit=true, sexpr=0x555557f71883) at xdisp.c:3074
#28 0x000055555561367e in display_mode_element (it=0x7fffffff7d10, depth=2, field_width=0, precision=0, elt=0x555557f71873, props=0x0, risky=false) at xdisp.c:27228
#29 0x0000555555613a28 in display_mode_element (it=0x7fffffff7d10, depth=1, field_width=0, precision=0, elt=0x555557f79cd3, props=0x0, risky=false) at xdisp.c:27314
#30 0x0000555555612210 in display_mode_line (w=0x55555628c8c0, face_id=MODE_LINE_INACTIVE_FACE_ID, format=0x555557f79cd3) at xdisp.c:26740
#31 0x0000555555611efe in display_mode_lines (w=0x55555628c8c0) at xdisp.c:26653
#32 0x00005555555fcf67 in redisplay_window (window=0x55555628c8c5, just_this_one_p=false) at xdisp.c:20345
#33 0x00005555555f2e3f in redisplay_window_0 (window=0x55555628c8c5) at xdisp.c:17434
#34 0x000055555580a994 in internal_condition_case_1 (bfun=0x5555555f2dfd <redisplay_window_0>, arg=0x55555628c8c5, handlers=0x7ffff1adb5a3, hfun=0x5555555f2d16 <redisplay_window_error>) at eval.c:1498
#35 0x00005555555f2cec in redisplay_windows (window=0x55555628c8c5) at xdisp.c:17404
#36 0x00005555555f1a9f in redisplay_internal () at xdisp.c:16854
#37 0x00005555555efb5e in redisplay () at xdisp.c:16043
#38 0x000055555574711a in read_char (commandflag=1, map=0x55556333fb33, prev_event=0x0, used_mouse_menu=0x7fffffffd2a9, end_time=0x0) at keyboard.c:2627
#39 0x0000555555758856 in read_key_sequence (keybuf=0x7fffffffd4e0, prompt=0x0, dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:10074
#40 0x00005555557438b0 in command_loop_1 () at keyboard.c:1376
#41 0x000055555580a8ed in internal_condition_case (bfun=0x5555557434a1 <command_loop_1>, handlers=0x90, hfun=0x555555742a7a <cmd_error>) at eval.c:1474
#42 0x0000555555743151 in command_loop_2 (handlers=0x90) at keyboard.c:1125
#43 0x0000555555809f61 in internal_catch (tag=0xff90, func=0x555555743127 <command_loop_2>, arg=0x90) at eval.c:1197
#44 0x00005555557430e3 in command_loop () at keyboard.c:1103
#45 0x000055555574261c in recursive_edit_1 () at keyboard.c:712
#46 0x00005555557427c8 in Frecursive_edit () at keyboard.c:795
#47 0x000055555573e88a in main (argc=1, argv=0x7fffffffd9a8) at emacs.c:2529

If I read the backtrace correctly, something in my custom mode-line is
triggering Fmatch_data that creates markers.
But that code has not changed for years, according to git log.

One suspicious thing is that my code gets called that frequently
(100s of times) by redisplay. Not sure if it is normal.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 17:43 ` Ihor Radchenko @ 2022-12-13 17:52 ` Eli Zaretskii 2022-12-13 18:03 ` Ihor Radchenko 2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-13 17:52 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Tue, 13 Dec 2022 17:43:36 +0000
>
> > If no other idea to find this out comes up, maybe run this with a
> > breakpoint in make-marker, look at the backtrace to see the callers.
>
> I tried gdb now with break Fmake_marker.
>
> The benchmark itself does not trigger the breakpoint.
> However, a huge number (hundreds) of breakpoint hits is generated upon
> finishing the benchmark execution.
>
> bt:
>
> #0 Fmake_marker () at alloc.c:3736
> #1 0x00005555557bb750 in Fmatch_data (integers=0x0, reuse=0x0, reseat=0x0) at search.c:2903

Ha-ha, shooting ourselves in the foot!

Great sleuthing job. Now we need to think what to do with this.
Hmm...

> If I read the backtrace correctly, something in my custom mode-line is
> triggering Fmatch_data that creates markers.

Yes, you have some :eval form in the mode line, it seems?

Calling xbacktrace will show a Lisp backtrace, which could be
educational here.

> But that code has not changes for years from git log.
>
> One suspicious thing is that my code gets called that much frequently
> (100s of times) by redisplay. Not sure if it is normal.

You cannot predict when redisplay decides to redraw the mode line.

^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 17:52 ` Eli Zaretskii @ 2022-12-13 18:03 ` Ihor Radchenko 2022-12-13 20:02 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2022-12-13 18:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier Eli Zaretskii <eliz@gnu.org> writes: >> If I read the backtrace correctly, something in my custom mode-line is >> triggering Fmatch_data that creates markers. > > Yes, you have sone :eval form in the mode line, it seems? Yes. For example, I call (defun yant/vc-git-current-branch () "Get current GIT branch." (and vc-mode (cadr (s-match "Git.\\([^ ]+\\)" vc-mode)))) with s-match wrapping its code into save-match-data. > Calling xbacktrace will show a Lisp backtrace, which could be > educational here. (gdb) xbacktrace Undefined command: "xbacktrace". Try "help". I am not sure what you mean by xbacktrace. Also, as Stefan pointed, number of markers may or may not be a problem here. However, I had a similar issue even with Emacs 28 when we tested creating a huge number of markers in buffer + re-search-forward. I ended up seeing similar perf logs that time. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 18:03 ` Ihor Radchenko @ 2022-12-13 20:02 ` Eli Zaretskii 2022-12-14 11:40 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-12-13 20:02 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org > Date: Tue, 13 Dec 2022 18:03:49 +0000 > > > Calling xbacktrace will show a Lisp backtrace, which could be > > educational here. > > (gdb) xbacktrace > Undefined command: "xbacktrace". Try "help". > > I am not sure what you mean by xbacktrace. It's a command we define in src/.gdbinit. Try this: (gdb) source /path/to/emacs/src/.gdbinit (gdb) xbacktrace But do that after catching Fmake_marker call from Fmatch_data, like you did before. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 20:02 ` Eli Zaretskii @ 2022-12-14 11:40 ` Ihor Radchenko 2022-12-14 13:06 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 11:40 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> I am not sure what you mean by xbacktrace.
>
> It's a command we define in src/.gdbinit. Try this:
>
> (gdb) source /path/to/emacs/src/.gdbinit
> (gdb) xbacktrace
>
> But do that after catching Fmake_marker call from Fmatch_data, like
> you did before.

Ok. Now, I disabled my custom mode-line and tried to get the backtrace
for Fmake_marker and also build_marker (as suggested by Stefan).

Disabling custom mode-line did not cause any apparent improvement in
performance.

Result:

Breakpoint is still _not_ triggered during benchmark-run call

(benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))

build_marker is not triggered, except during redisplay and completion.
Fmake_marker is triggered a dozen times when preparing M-: prompt and
later a couple of hundred times _after_ executing the benchmark:

Called a couple of hundred times
Lisp Backtrace:
"match-data" (0xf0c02130)
0x59846038 PVEC_COMPILED
"auto-revert-buffers--buffer-list-filter" (0xf0c020b8)
"apply" (0xf0c020b0)
"auto-revert-buffers" (0xf0c02058)
"apply" (0xf0c02050)
"timer-event-handler" (0xffffcd48)

not related.

I will now look into counting the number of for loop cycles, as Stefan
suggested.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 11:40 ` Ihor Radchenko @ 2022-12-14 13:06 ` Eli Zaretskii 2022-12-14 13:23 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-12-14 13:06 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org > Date: Wed, 14 Dec 2022 11:40:37 +0000 > > build_marker is not triggered, except during redisplay and completion. > Fmake_marker is triggered a dozen of times when preparing M-: prompt and > later a couple of hundreds of times _after_ executing the benchmark: > > Called a couple of hundreds of times > Lisp Backtrace: > "match-data" (0xf0c02130) > 0x59846038 PVEC_COMPILED > "auto-revert-buffers--buffer-list-filter" (0xf0c020b8) > "apply" (0xf0c020b0) > "auto-revert-buffers" (0xf0c02058) > "apply" (0xf0c02050) > "timer-event-handler" (0xffffcd48) > > not related. I think I'm confused now: what do you mean by "executing the benchmark"? I thought the problem was that each "execution of the benchmark" was slower than the one before it, in which case markers added between benchmarks _are_ relevant. But you say they aren't? What did I miss? ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 13:06 ` Eli Zaretskii @ 2022-12-14 13:23 ` Ihor Radchenko 2022-12-14 13:32 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2022-12-14 13:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier Eli Zaretskii <eliz@gnu.org> writes: > I think I'm confused now: what do you mean by "executing the > benchmark"? I thought the problem was that each "execution of the > benchmark" was slower than the one before it, in which case markers > added between benchmarks _are_ relevant. But you say they aren't? > What did I miss? Increasing time of running benchmarks is just a symptom. The real issue I am experiencing is that re-search-forward becomes slower as I keep using Emacs. `garbage-collect' helps, but not in a long term. Basically, running M-: (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))) - right after starting Emacs is taking 3-4 seconds. - after several hours -- 10-20 seconds - in Emacs 28, <1 sec. Markers may or may not be a problem. If they are, it is not necessarily related to markers created when I run the benchmarks. May also be some markers created during the Emacs session. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 13:23 ` Ihor Radchenko @ 2022-12-14 13:32 ` Eli Zaretskii 2022-12-14 13:39 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2022-12-14 13:32 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: 58558, larsi, monnier

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org
> Date: Wed, 14 Dec 2022 13:23:02 +0000
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > I think I'm confused now: what do you mean by "executing the
> > benchmark"? I thought the problem was that each "execution of the
> > benchmark" was slower than the one before it, in which case markers
> > added between benchmarks _are_ relevant. But you say they aren't?
> > What did I miss?
>
> Increasing time of running benchmarks is just a symptom.
> The real issue I am experiencing is that re-search-forward becomes
> slower as I keep using Emacs. `garbage-collect' helps, but not in a long
> term.
>
> Basically, running
>
> M-: (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
>
> - right after starting Emacs is taking 3-4 seconds.
> - after several hours -- 10-20 seconds
> - in Emacs 28, <1 sec.
>
> Markers may or may not be a problem.

What else could slow down buf_bytepos_to_charpos so much? All it does
is examine markers.

> If they are, it is not necessarily related to markers created when I
> run the benchmarks. May also be some markers created during the
> Emacs session.

Which means massive creation of markers could be the reason,
regardless of what causes such massive creation. Right? But if so,
why did you say that markers created by some timer(s) were not
relevant?

Btw, did you try to compare the number of buffer markers in Emacs 28
and Emacs 29/30, under this scenario, when the search becomes slow
enough?

^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 13:32 ` Eli Zaretskii @ 2022-12-14 13:39 ` Ihor Radchenko 2022-12-14 14:12 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-14 13:39 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 58558, larsi, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> Markers may or may not be a problem.
>
> What else could slow down buf_bytepos_to_charpos so much? All it does
> is examine markers.

Well. I believe so. But I feel confused. So, I do not exclude other
reasons. Note that I have little experience with gdb.

>> If they are, it is not necessarily related to markers created when I
>> run the benchmarks. May also be some markers created during the
>> Emacs session.
>
> Which means massive creation of markers could be the reason,
> regardless of what causes such massive creation. Right? But if so,
> why did you say that markers created by some timer(s) were not
> relevant?

Because those came from auto-revert-mode and are unlikely to
contribute to the single Org buffer I have problems with.

> Btw, did you try to compare the number of buffer markers in Emacs 28
> and Emacs 29/30, under this scenario, when the search becomes slow
> enough?

How can I find the number of buffer markers?

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 13:39 ` Ihor Radchenko @ 2022-12-14 14:12 ` Eli Zaretskii 0 siblings, 0 replies; 81+ messages in thread From: Eli Zaretskii @ 2022-12-14 14:12 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: larsi@gnus.org, monnier@iro.umontreal.ca, 58558@debbugs.gnu.org > Date: Wed, 14 Dec 2022 13:39:43 +0000 > > How can I find the number of buffer markers? Compile Emacs with -DMARKER_DEBUG, and then you can call count_markers from GDB: (gdb) print count_markers(current_buffer) But you need to make sure current_buffer is the buffer you are interested in. One trick is to do this: (gdb) break Fredraw_display and then type "M-x redraw-display" with the buffer in the selected window. Then call count_markers as above, and it should return the number of markers in the current buffer. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 17:43 ` Ihor Radchenko 2022-12-13 17:52 ` Eli Zaretskii @ 2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2022-12-13 18:40 ` Ihor Radchenko 1 sibling, 1 reply; 81+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 18:15 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi > The benchmark itself does not trigger the breakpoint. Does that mean that `Fmatch_data` is not called during a single `re-search-forward` (not a surprise: you'd need to put a breakpoint on `build_marker` to see the markers built by `buf_bytepos_to_charpos`) but is called between `re-search-forward`, or that it's not called at all during the whole benchmark where you perform several `re-search-forward` which grow progressively slower? If it's the latter, then those calls can't explain the slowdown, right? > If I read the backtrace correctly, something in my custom mode-line is > triggering Fmatch_data that creates markers. The most common calls to `match-data` are from `save-match-data`. And most uses of `save-match-data` are ill-advised (as the docstring explains `save-match-data' should be used to save *your* match data rather than your caller's match data), so you might like to double check whether that call to `match-data` can be eliminated altogether. Stefan ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 18:15 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 18:40 ` Ihor Radchenko 2022-12-13 19:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2022-12-13 18:40 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> The benchmark itself does not trigger the breakpoint.
>
> Does that mean that `Fmatch_data` is not called during a single
> `re-search-forward` (not a surprise: you'd need to put a breakpoint on
> `build_marker` to see the markers built by `buf_bytepos_to_charpos`)
> but is called between `re-search-forward`, or that it's not called at
> all during the whole benchmark where you perform several
> `re-search-forward` which grow progressively slower?

I do the benchmark via

M-: (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))) <RET>

The breakpoint triggers after the minibuffer outputs the elapsed time.
During redisplay, AFAIU.

> If it's the latter, then those calls can't explain the slowdown, right?

The slowdown manifests by increasing elapsed time upon subsequent
benchmark calls like the above. So, redisplay may or may not be a part
of it.

I tried to run

(progn
  (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
  (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))))

4 times:

Elapsed time: 16.399824s
Elapsed time: 17.009694s
nil
Elapsed time: 18.187187s
Elapsed time: 18.597610s
nil
Elapsed time: 18.851388s
Elapsed time: 19.593968s
nil
Elapsed time: 20.194616s
Elapsed time: 20.414686s
nil

Though message may still trigger the redisplay. Not sure if this small
test really reveals anything useful.

Now, with (garbage-collect):

(progn
  (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))
  (garbage-collect)
  (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t))))

Elapsed time: 20.576637s
<GC>
Elapsed time: 15.734101s

Elapsed time: 16.101646s
<GC>
Elapsed time: 16.179796s

Elapsed time: 16.545040s
<GC>
Elapsed time: 16.365847s

Elapsed time: 16.842143s
<GC>
Elapsed time: 16.726615s

So, GC does help somewhat.

Then, if I kill and re-open the Org buffer:

Elapsed time: 72.847256s ;; <- Org just did a bunch of re-search for initial folding and setup
<GC>
Elapsed time: 4.864642s

re-open again, but GC before running the benchmark:

<GC>
Elapsed time: 4.884221s
<GC>
Elapsed time: 4.368755s

>> If I read the backtrace correctly, something in my custom mode-line is
>> triggering Fmatch_data that creates markers.
>
> The most common calls to `match-data` are from `save-match-data`.
> And most uses of `save-match-data` are ill-advised (as the docstring
> explains `save-match-data' should be used to save *your* match data
> rather than your caller's match data), so you might like to double check
> whether that call to `match-data` can be eliminated altogether.

This is coming from s.el. In any case, this implementation detail did
not change as I switched from Emacs 28 to Emacs 29. It is Emacs doing
something less efficiently here.

What I can try to do is replacing s-* functions in my mode-line with
built-ins. Will it help debugging this issue?

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 18:40 ` Ihor Radchenko @ 2022-12-13 19:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2022-12-13 20:21 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 19:55 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi >> The most common calls to `match-data` are from `save-match-data`. >> And most uses of `save-match-data` are ill-advised (as the docstring >> explains `save-match-data' should be used to save *your* match data >> rather than your caller's match data), so you might like to double check >> whether that call to `match-data` can be eliminated altogether. > > This is coming from s.el. In any case, this implementation detail did > not change as I switched from Emacs 28 to Emacs 29. It is Emacs doing > something less efficiently here. > > What I can try to do is replacing s-* functions in my mode-line with > built-ins. Will it help debugging this issue? I suspect these marker allocations for the mode-line are unrelated to the actual problem. Stefan ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 19:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 20:21 ` Eli Zaretskii 2022-12-14 11:42 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-12-13 20:21 UTC (permalink / raw) To: Stefan Monnier; +Cc: 58558, yantar92, larsi > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Eli Zaretskii <eliz@gnu.org>, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Tue, 13 Dec 2022 14:55:20 -0500 > > I suspect these marker allocations for the mode-line are unrelated to the > actual problem. If this is true, then re-running the benchmarks after removing those :eval's from the mode-line-format will still show slowdown with each benchmark run. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 20:21 ` Eli Zaretskii @ 2022-12-14 11:42 ` Ihor Radchenko 0 siblings, 0 replies; 81+ messages in thread From: Ihor Radchenko @ 2022-12-14 11:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, Stefan Monnier Eli Zaretskii <eliz@gnu.org> writes: >> I suspect these marker allocations for the mode-line are unrelated to the >> actual problem. > > If this is true, then re-running the benchmarks after removing those > :eval's from the mode-line-format will still show slowdown with each > benchmark run. I still see the slowdown after falling back to default mode-line-format. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 15:56 ` Ihor Radchenko 2022-12-13 16:08 ` Eli Zaretskii @ 2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2022-12-14 12:00 ` Ihor Radchenko 2022-12-14 12:23 ` Ihor Radchenko 1 sibling, 2 replies; 81+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 17:38 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi >>> I will look how to do it. Maybe perf probe. >>> I guess, it will be useful to compile Emacs with debug symbols at this >>> point. >> >> AFAIR, you can ask perf to profile a single function, and you can ask >> it to annotate the profile with the source code. > > I now compiled Emacs with debug symbols, waited enough to see observable > increase in the benchmark-run timing, and recorded the perf data. > > buf_bytepos_to_charpos is still on the top > > 78.06% emacs emacs [.] buf_bytepos_to_charpos > 3.00% emacs emacs [.] re_match_2_internal > 1.05% emacs emacs [.] find_interval > 1.04% emacs emacs [.] CHAR_TABLE_REF_ASCII > 0.85% emacs emacs [.] make_lisp_symbol > 0.80% emacs emacs [.] re_search_2 > 0.76% emacs emacs [.] builtin_lisp_symbol > 0.62% emacs emacs [.] PSEUDOVECTORP AFAIK the main place where we call `buf_bytepos_to_charpos` from `re_match_2_internal` is via the `SYNTAX_TABLE_BYTE_TO_CHAR` macro, used for regexp elements that depend on syntax tables (i.e. \<, \>, \_<, ...). But I'd expect those to be executed "frequently&closely" enough that the `cached_(byte|char)pos` data should almost always be nearby, making the call to `buf_bytepos_to_charpos` fairly cheap (more specifically the `for (tail = BUF_MARKERS (b);...` loop should not iterate many times, regardless of how many markers there are). > My guess: number of markers is growing somehow? 
`buf_bytepos_to_charpos` itself creates markers (using them as a cache of previous conversions), so that might be why. But we only look at the first N markers where N*50 is the distance to the closest marker found so far. So growth is not sufficient (it's clearly a part of the reason, tho). Regarding growth: could you call `garbage-collect` between the calls to `re-search-forward` to see if that avoids the accumulation? [ I presume here that those markers are created/added by `buf_bytepos_to_charpos` itself, so they should be recovered by the GC because they're not referenced from anywhere else. ] I'd be interested to know how many iterations of the `for (tail = BUF_MARKERS (b);...` loop are executed on average during your `re-search-forward` (and how that average changes between runs of `re-search-forward`). Stefan PS: Of course, another approach would be to replace this code with something else. Using markers as a cache of bytepos/charpos conversions has been a source of a few performance issues over the years. Another approach could be to use a "vector with gap" managed alongside the actual buffer text. It could be indexed by "charpos divided by 1024", so conversion from charpos to bytepos could be a simple vector lookup followed by scanning at most 1kB, and conversion in the other direction would use a binary search in that same vector (or we could use 2 "vectors with gap", one per direction of conversion). ^ permalink raw reply [flat|nested] 81+ messages in thread
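[ Editorial sketch, not part of the thread. The PS above proposes indexing positions by "charpos divided by 1024". The standalone C below is one possible reading of that idea, under loud assumptions: the text is a flat, valid UTF-8 array with no gap, positions are 0-based, and every name (`pos_index`, `char_to_byte`, `byte_to_char`) is hypothetical. It is not Emacs code and ignores how the index would be kept up to date on insertion and deletion. ]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_CHARS 1024  /* one index entry per 1024 characters */

/* Hypothetical index: entry[i] is the byte offset of character
   number i * CHUNK_CHARS (0-based).  */
struct pos_index {
    size_t *entry;
    size_t n_entries;
    size_t n_chars;
    size_t n_bytes;
};

/* Nonzero if BYTE starts a UTF-8 character (is not a continuation byte).  */
static int is_char_start(unsigned char byte)
{
    return (byte & 0xC0) != 0x80;
}

/* Build the index in one pass over valid UTF-8 TEXT.
   Returns 0 on success, -1 if allocation fails.  */
static int pos_index_build(struct pos_index *idx,
                           const unsigned char *text, size_t n_bytes)
{
    /* There are never more characters than bytes, so this is enough.  */
    size_t cap = n_bytes / CHUNK_CHARS + 1;
    idx->entry = malloc(cap * sizeof *idx->entry);
    if (!idx->entry)
        return -1;
    idx->n_entries = 0;
    idx->n_chars = 0;
    idx->n_bytes = n_bytes;
    for (size_t b = 0; b < n_bytes; b++) {
        if (!is_char_start(text[b]))
            continue;
        if (idx->n_chars % CHUNK_CHARS == 0)
            idx->entry[idx->n_entries++] = b;
        idx->n_chars++;
    }
    return 0;
}

/* charpos -> bytepos: one vector lookup, then a scan over fewer than
   CHUNK_CHARS characters.  */
static size_t char_to_byte(const struct pos_index *idx,
                           const unsigned char *text, size_t charpos)
{
    assert(charpos <= idx->n_chars);
    if (charpos == idx->n_chars)
        return idx->n_bytes;          /* position just past the last char */
    size_t b = idx->entry[charpos / CHUNK_CHARS];
    for (size_t left = charpos % CHUNK_CHARS; left > 0; left--) {
        b++;                          /* step off the current char start */
        while (!is_char_start(text[b]))
            b++;                      /* skip continuation bytes */
    }
    return b;
}

/* bytepos -> charpos: binary search for the last entry <= BYTEPOS,
   then count character starts up to BYTEPOS.  */
static size_t byte_to_char(const struct pos_index *idx,
                           const unsigned char *text, size_t bytepos)
{
    assert(bytepos <= idx->n_bytes);
    if (idx->n_entries == 0)
        return 0;
    size_t lo = 0, hi = idx->n_entries;  /* entry[lo] <= bytepos < entry[hi] */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (idx->entry[mid] <= bytepos)
            lo = mid;
        else
            hi = mid;
    }
    size_t charpos = lo * CHUNK_CHARS;
    for (size_t b = idx->entry[lo]; b < bytepos; b++)
        if (is_char_start(text[b]))
            charpos++;
    return charpos;
}
```

[ In this model the cost of either conversion is bounded by the chunk size rather than by the number or placement of markers, which is the property the PS is after. A real implementation would also have to shift entries when text changes, which is where the "vector with gap" would come in. ]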
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-14 12:00 ` Ihor Radchenko 2022-12-14 12:23 ` Ihor Radchenko 1 sibling, 0 replies; 81+ messages in thread From: Ihor Radchenko @ 2022-12-14 12:00 UTC (permalink / raw) To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi Stefan Monnier <monnier@iro.umontreal.ca> writes: >> My guess: number of markers is growing somehow? > > `buf_bytepos_to_charpos` itself creates markers (using them as a cache > of previous conversions), so that might be why. > > But we only look at the first N markers where N*50 is the distance to > the closest marker found so far. So growth is not sufficient (it's > clearly a part of the reason, tho). What about the following degenerate case: - Most of the buffer markers are located near point-min; - We are searching for position near point-max; - point-max is in order of 21,677,448 (this is my actual file I use for testing) The number of for loop cycles is then min(21,677,448/50 = ~400k, BUF_MARKERS.size()) Of course, my above argument should not matter in theory, when recent search matches are cached by build_marker, but my break build_marker _never_ triggered for some reason. How can build_marker not be triggered? From my reading of the code, it happens when the following switch does not fire. bool record = bytepos - best_below_byte > 5000; I note that this condition will not trigger if all the markers are above. On the other hand, this particular condition is there for the last 25 years or so. Just brainstorming... -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
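[ Editorial sketch, not part of the thread. The degenerate case above can be played with in a small standalone model of the "first N markers where N*50 is the distance" rule. This is a simplification based only on the description in this thread: the initial distance of 50 and the increment of 50 are assumptions taken from the "N*50" wording, the function name `markers_examined` is hypothetical, and the real `buf_bytepos_to_charpos` also considers other known positions (such as the cached position mentioned earlier) before it looks at markers. ]

```c
#include <stddef.h>

#define DISTANCE_INITIAL   50  /* assumed starting window */
#define DISTANCE_INCREMENT 50  /* assumed growth per marker examined */

/* Simplified model: return how many markers the loop examines before
   it gives up and scans the text directly.  MARKER_POS lists marker
   positions in list order; BUF_START and BUF_END are always known.  */
static size_t markers_examined(const long *marker_pos, size_t n_markers,
                               long target, long buf_start, long buf_end)
{
    long best_below = buf_start;
    long best_above = buf_end;
    long distance = DISTANCE_INITIAL;
    size_t examined = 0;

    for (size_t i = 0; i < n_markers; i++) {
        long pos = marker_pos[i];
        examined++;

        if (pos == target)
            break;                      /* exact hit */
        if (pos > target) {
            if (pos < best_above)
                best_above = pos;       /* tighter upper bound */
        } else if (pos > best_below) {
            best_below = pos;           /* tighter lower bound */
        }

        /* Stop once the nearest known position is within the window;
           otherwise widen the window and look at the next marker.  */
        if (best_above - target < distance
            || target - best_below < distance)
            break;
        distance += DISTANCE_INCREMENT;
    }
    return examined;
}
```

[ In this model, with all markers clustered near the start of the buffer and a target about 10,000,000 positions away, the loop examines every marker until either the list ends or roughly 200,000 markers have been seen (10,000,000 / 50). A single marker close to the target ends the loop at once, which is why the missing cache marker discussed above matters so much. ]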
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-13 17:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2022-12-14 12:00 ` Ihor Radchenko @ 2022-12-14 12:23 ` Ihor Radchenko 2022-12-14 13:10 ` Eli Zaretskii 1 sibling, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2022-12-14 12:23 UTC (permalink / raw) To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi Stefan Monnier <monnier@iro.umontreal.ca> writes: > I'd be interested to know how many iterations of the `for (tail = > BUF_MARKERS (b);...` loop are executed on average during your > `re-search-forward` (and how that average changes between runs of > `re-search-forward`). I did not get around to measure separate re-search-forward calls, but total number of hits to CONSIDER (tail->bytepos, tail->charpos); during benchmark-run is: 18 breakpoint keep y 0x000055555578be74 in buf_bytepos_to_charpos at marker.c:353 breakpoint already hit 4,245,365 times Combined with the fact that calling `garbage-collect' between benchmarks makes the benchmark time nearly constant, this result may or may not mean something. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 12:23 ` Ihor Radchenko @ 2022-12-14 13:10 ` Eli Zaretskii 2022-12-14 13:26 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-12-14 13:10 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: Eli Zaretskii <eliz@gnu.org>, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Wed, 14 Dec 2022 12:23:50 +0000 > > 18 breakpoint keep y 0x000055555578be74 in buf_bytepos_to_charpos at marker.c:353 > breakpoint already hit 4,245,365 times > > Combined with the fact that calling `garbage-collect' between benchmarks > makes the benchmark time nearly constant, this result may or may not > mean something. Is the "almost constant" time still significantly slower than in previous versions? Or is it similar? Anyway, the fact that the time doesn't get worse when you GC between benchmarks most probably means that we produce a lot of garbage markers (i.e., temporary markers that very quickly become unreferenced), and they get in the way of buf_bytepos_to_charpos. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 13:10 ` Eli Zaretskii @ 2022-12-14 13:26 ` Ihor Radchenko 2022-12-14 13:57 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2022-12-14 13:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier Eli Zaretskii <eliz@gnu.org> writes: >> Combined with the fact that calling `garbage-collect' between benchmarks >> makes the benchmark time nearly constant, this result may or may not >> mean something. > > Is the "almost constant" time still significantly slower than in > previous versions? Or is it similar? It is orders of magnitude slower: sub-second in Emacs 28; seconds in Emacs 29 fresh session; tens of seconds after several hours of Emacs usage. > Anyway, the fact that the time doesn't get worse when you GC between > benchmarks most probably means that we produce a lot of garbage markers > (i.e., temporary markers that very quickly become unreferenced), and > they get in the way of buf_bytepos_to_charpos. Most likely, but it is only part of the problem. If these temporary markers were the only problem, I would not see gradual performance degradation as I continue Emacs session (`garbage-collect` is called anyway during normal usage). -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 13:26 ` Ihor Radchenko @ 2022-12-14 13:57 ` Eli Zaretskii 2022-12-14 14:01 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2022-12-14 13:57 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Wed, 14 Dec 2022 13:26:15 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Anyway, the fact that the time doesn't get worse when you GC between > > benchmarks most probably means that we produce a lot of garbage markers > > (i.e., temporary markers that very quickly become unreferenced), and > > they get in the way of buf_bytepos_to_charpos. > > Most likely, but it is only part of the problem. If these temporary > markers were the only problem, I would not see gradual performance > degradation as I continue Emacs session (`garbage-collect` is called > anyway during normal usage). We've only seen perf profiles for the benchmark, and they point squarely at buf_bytepos_to_charpos, which AFAIU means markers. To identify other potential causes, we need to see profiles for other patterns of usage. For example, profile collected when the benchmark is run at the beginning of the session compared with profile from benchmark after several hours. I thought you already posted such a comparison, and it, too, pointed at buf_bytepos_to_charpos? Which would probably mean that the amount of markers is increasing, albeit more slowly, even though GC collects some of them. Did you try to see how the number of markers in the buffer evolves with the up-time? ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 13:57 ` Eli Zaretskii @ 2022-12-14 14:01 ` Ihor Radchenko 2023-04-06 11:49 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2022-12-14 14:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier Eli Zaretskii <eliz@gnu.org> writes: > ... For example, profile collected when the benchmark > is run at the beginning of the session compared with profile from > benchmark after several hours. I thought you already posted such a > comparison, and it, too, pointed at buf_bytepos_to_charpos? Yes. Not exactly. I compared freshly opened buffer vs. after several hours. > Did you try to see how the number of markers in the buffer evolves > with the up-time? Is there any way to get the number of buffer markers from Elisp? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2022-12-14 14:01 ` Ihor Radchenko @ 2023-04-06 11:49 ` Ihor Radchenko 2023-04-06 12:05 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2023-04-06 11:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier [-- Attachment #1: Type: text/plain, Size: 1053 bytes --] Ihor Radchenko <yantar92@posteo.net> writes: >> Did you try to see how the number of markers in the buffer evolves >> with the up-time? > > Is there any way to get the number of buffer markers from Elisp? I finally got back to this and implemented a small subr to count the number of buffer markers:

DEFUN ("buffer-markers", Fbuffer_markers, Sbuffer_markers, 0, 0, 0,
       doc: /* Return the number of markers in current buffer.  */)
  (void)
{
  struct Lisp_Marker *tail;
  int count = 0;

  for (tail = BUF_MARKERS (current_buffer); tail; tail = tail->next)
    count++;

  return make_fixnum (count);
}

Then, I tracked how the number of markers evolves in my problematic buffer when building agenda, both on master and on Emacs 28 (where the agenda is building 10x faster). As you can see on the attached graph, the number of markers is ~1000, and it is not significantly different for the two Emacs versions. So, the number of markers itself does not look like the real culprit. I have no better ideas for now except slowly bisecting Emacs (again). [-- Attachment #2: marker-count.png --] [-- Type: image/png, Size: 27350 bytes --] [-- Attachment #3: Type: text/plain, Size: 224 bytes --] -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-06 11:49 ` Ihor Radchenko @ 2023-04-06 12:05 ` Eli Zaretskii 2023-04-09 19:54 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2023-04-06 12:05 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Thu, 06 Apr 2023 11:49:52 +0000 > > As you can see on the attached graph, the number of markers is ~1000, and > it is not significantly different for the two Emacs versions. > > So, the number of markers itself does not look like the real culprit. That's one potential reason down, thanks. > I have no better ideas for now except slowly bisecting Emacs (again). I think we should first go back to using perf. I don't think you compared profiles for Emacs which just started with one that was running long enough to show the slowdown. Comparing such profiles should at least give us a hint where to look. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-06 12:05 ` Eli Zaretskii @ 2023-04-09 19:54 ` Ihor Radchenko 2023-04-10 4:14 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2023-04-09 19:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier Eli Zaretskii <eliz@gnu.org> writes: > I think we should first go back to using perf. I don't think you > compared profiles for Emacs which just started with one that was > running long enough to show the slowdown. Comparing such profiles > should at least give us a hint where to look. I now tried perf record -g. I was able to narrow down the call tree of the problematic buf_bytepos_to_charpos calls: 43.82%--Fre_search_forward --43.81%--search_command --43.78%--search_buffer --43.78%--search_buffer_re --43.33%--re_search_2 --36.39%--re_match_2_internal --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR --21.57%--BYTE_TO_CHAR --21.49%--buf_bytepos_to_charpos Not sure if it is telling much. I also looked into git history and I can only identify significant changes in re_match_2_internal after Emacs 28 release. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-09 19:54 ` Ihor Radchenko @ 2023-04-10 4:14 ` Eli Zaretskii 2023-04-10 12:24 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2023-04-10 4:14 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Sun, 09 Apr 2023 19:54:49 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > I think we should first go back to using perf. I don't think you > > compared profiles for Emacs which just started with one that was > > running long enough to show the slowdown. Comparing such profiles > > should at least give us a hint where to look. > > I now tried perf record -g. > I was able to narrow down the call tree of the problematic > buf_bytepos_to_charpos calls: > > 43.82%--Fre_search_forward > --43.81%--search_command > --43.78%--search_buffer > --43.78%--search_buffer_re > --43.33%--re_search_2 > --36.39%--re_match_2_internal > --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR > --21.57%--BYTE_TO_CHAR > --21.49%--buf_bytepos_to_charpos > > Not sure if it is telling much. How does this compare with a "fast" session doing the same? And why are you once again focusing on buf_bytepos_to_charpos, when you previously (presumably) established that it cannot be the problem, since the number of markers doesn't change significantly? > I also looked into git history and I can only identify significant > changes in re_match_2_internal after Emacs 28 release. It sounds like most of the time is not in re_match_2_internal itself. But I think comparison with a "fast" session could help with ideas. Thanks. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-10 4:14 ` Eli Zaretskii @ 2023-04-10 12:24 ` Ihor Radchenko 2023-04-10 13:40 ` Eli Zaretskii 2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 81+ messages in thread From: Ihor Radchenko @ 2023-04-10 12:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier [-- Attachment #1: Type: text/plain, Size: 1638 bytes --] Eli Zaretskii <eliz@gnu.org> writes: >> 43.82%--Fre_search_forward >> --43.81%--search_command >> --43.78%--search_buffer >> --43.78%--search_buffer_re >> --43.33%--re_search_2 >> --36.39%--re_match_2_internal >> --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR >> --21.57%--BYTE_TO_CHAR >> --21.49%--buf_bytepos_to_charpos >> >> Not sure if it is telling much. > > How does this compare with a "fast" session doing the same? "fast" (emacs 28) session does not have this call tree contributing significantly. > And why are you once again focusing on buf_bytepos_to_charpos, when > you previously (presumably) established that it cannot be the problem, > since the number of markers doesn't change significantly? We only established that the number of markers cannot be the problem. However, buf_bytepos_to_charpos still dominates CPU samples (see the attached) in Emacs master, but not in Emacs 28. Unless there is some other place in buf_bytepos_to_charpos that may be slow, the only possible explanation is that it simply gets called more times. Then, we are interested in the callers of buf_bytepos_to_charpos. That's exactly what I provided in the previous message. >> I also looked into git history and I can only identify significant >> changes in re_match_2_internal after Emacs 28 release. > > It sounds like most of the time is not in re_match_2_internal itself. > But I think comparison with a "fast" session could help with ideas. re_match_2_internal calls SYNTAX_TABLE_BYTE_TO_CHAR in a loop. 
So, if something strange is happening with the loop, we may be calling buf_bytepos_to_charpos more. [-- Attachment #2: emacs-28-report.png --] [-- Type: image/png, Size: 69416 bytes --] [-- Attachment #3: emacs-master-report.png --] [-- Type: image/png, Size: 72939 bytes --] [-- Attachment #4: Type: text/plain, Size: 224 bytes --] -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-10 12:24 ` Ihor Radchenko @ 2023-04-10 13:40 ` Eli Zaretskii 2023-04-10 14:55 ` Ihor Radchenko 2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2023-04-10 13:40 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Mon, 10 Apr 2023 12:24:23 +0000 > > >> 43.82%--Fre_search_forward > >> --43.81%--search_command > >> --43.78%--search_buffer > >> --43.78%--search_buffer_re > >> --43.33%--re_search_2 > >> --36.39%--re_match_2_internal > >> --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR > >> --21.57%--BYTE_TO_CHAR > >> --21.49%--buf_bytepos_to_charpos > >> > >> Not sure if it is telling much. > > > > How does this compare with a "fast" session doing the same? > > "fast" (emacs 28) session does not have this call tree contributing > significantly. Hmm... I thought when you just start a new Emacs session of Emacs 30 it also is fast, and becomes progressively slower with time? Or am I confused? > re_match_2_internal calls SYNTAX_TABLE_BYTE_TO_CHAR in a loop. So, if > something strange is happening with the loop, we may be calling > buf_bytepos_to_charpos more. I believe perf is capable of showing the number of calls as well? Can you compare the number of calls between the two versions? ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-10 13:40 ` Eli Zaretskii @ 2023-04-10 14:55 ` Ihor Radchenko 2023-04-10 16:04 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2023-04-10 14:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 58558, larsi, monnier Eli Zaretskii <eliz@gnu.org> writes: >> > How does this compare with a "fast" session doing the same? >> >> "fast" (emacs 28) session does not have this call tree contributing >> significantly. > > Hmm... I thought when you just start a new Emacs session of Emacs 30 it > also is fast, and becomes progressively slower with time? Or am I > confused? My original bug report is about agenda generation being slow because of re-search-forward slowdown. Later, I tried to simplify the recipe and found that direct calls to re-search-forward become slower over time (but still with my setup). Originally, agenda generation is slower on master compared to Emacs 28 even right after startup. In my last message and perf data, I have been looking into agenda generation. > I believe perf is capable of showing the number of calls as well? Can > you compare the number of calls between the two versions? I can only see https://www.brendangregg.com/blog/2014-07-03/perf-counting.html, but it appears to be only for built-in events. Do you know how to count calls to a specific function using perf? I am not familiar at all with perf. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-10 14:55 ` Ihor Radchenko @ 2023-04-10 16:04 ` Eli Zaretskii 0 siblings, 0 replies; 81+ messages in thread From: Eli Zaretskii @ 2023-04-10 16:04 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: monnier@iro.umontreal.ca, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Mon, 10 Apr 2023 14:55:09 +0000 > > > I believe perf is capable of showing the number of calls as well? Can > > you compare the number of calls between the two versions? > > I can only see > https://www.brendangregg.com/blog/2014-07-03/perf-counting.html, but it > appears to be only for built-in events. Do you know how to count calls > to specific function using perf? I am not familiar at all with perf. I thought that was part of the profile? But if not, then maybe Stefan's "poor-man's counters" will be an easier device for answering that particular question: just increment it before every call to SYNTAX_TABLE_BYTE_TO_CHAR that you find inside re_match_2_internal, and then compare the counts with Emacs 28. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-10 12:24 ` Ihor Radchenko 2023-04-10 13:40 ` Eli Zaretskii @ 2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-04-11 11:29 ` Ihor Radchenko 1 sibling, 1 reply; 81+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-10 14:27 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi >>> 43.82%--Fre_search_forward >>> --43.81%--search_command >>> --43.78%--search_buffer >>> --43.78%--search_buffer_re >>> --43.33%--re_search_2 >>> --36.39%--re_match_2_internal >>> --21.90%--SYNTAX_TABLE_BYTE_TO_CHAR >>> --21.57%--BYTE_TO_CHAR >>> --21.49%--buf_bytepos_to_charpos >>> >>> Not sure if it is telling much. >> How does this compare with a "fast" session doing the same? > "fast" (emacs 28) session does not have this call tree contributing > significantly. And I thought we already established around Dec 13 that most of the time is spent in `buf_bytepos_to_charpos` (in other profiles). > Unless there is some other place in buf_bytepos_to_charpos that may be > slow, the only possible explanation is that it simply gets called more > times. That would be quite surprising. BTW, when debugging such performance problems, I often resort to a few `DEFVAR_INT` defining ad-hoc counter variables, then sprinkle corresponding increments of those variables from various places (typically function entry point, loops, ...). That gives me a kind of "poor man's profiler", but with the advantage that I can look at their value conveniently from within the affected Emacs session. Stefan ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-10 14:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-11 11:29 ` Ihor Radchenko 2023-04-11 11:51 ` Eli Zaretskii 0 siblings, 1 reply; 81+ messages in thread From: Ihor Radchenko @ 2023-04-11 11:29 UTC (permalink / raw) To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi [-- Attachment #1: Type: text/plain, Size: 399 bytes --] Stefan Monnier <monnier@iro.umontreal.ca> writes: > BTW, when debugging such performance problem, I often resort to > a few `DEFVAR_INT` defining ad-hoc counter variables, then sprinkle > corresponding increments of those variables from various places > (typically function entry point, loops, ...). Well. I just tried, but my Emacs-C foo is not good enough. The attached patch fails to compile. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-add-debug-vars.patch --] [-- Type: text/x-patch, Size: 3962 bytes --] From ac15ad3262ddf0a0bf459dc603cb79f7f9c737f7 Mon Sep 17 00:00:00 2001 Message-Id: <ac15ad3262ddf0a0bf459dc603cb79f7f9c737f7.1681212491.git.yantar92@posteo.net> From: Ihor Radchenko <yantar92@posteo.net> Date: Tue, 11 Apr 2023 13:27:56 +0200 Subject: [PATCH] add debug vars --- src/regex-emacs.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/src/regex-emacs.c b/src/regex-emacs.c index 2571812cb39..6bcc64d5c0a 100644 --- a/src/regex-emacs.c +++ b/src/regex-emacs.c @@ -3889,6 +3889,25 @@ unwind_re_match (void *ptr) b->text->inhibit_shrinking = 0; } +DEFVAR_INT("re-match-2-internal-bytepos-calls-1", re_match_2_internal_bytepos_calls_1, + doc: /* Call count 1. Internal use only. */); +DEFVAR_INT("re-match-2-internal-bytepos-calls-2", re_match_2_internal_bytepos_calls_2, + doc: /* Call count 1. Internal use only. */); +DEFVAR_INT("re-match-2-internal-bytepos-calls-3", re_match_2_internal_bytepos_calls_3, + doc: /* Call count 1. Internal use only. 
*/); +DEFVAR_INT("re-match-2-internal-bytepos-calls-4", re_match_2_internal_bytepos_calls_4, + doc: /* Call count 1. Internal use only. */); +DEFVAR_INT("re-match-2-internal-bytepos-calls-5", re_match_2_internal_bytepos_calls_5, + doc: /* Call count 1. Internal use only. */); +DEFVAR_INT("re-match-2-internal-bytepos-calls-6", re_match_2_internal_bytepos_calls_6, + doc: /* Call count 1. Internal use only. */); +re_match_2_internal_bytepos_calls_1 = 0; +re_match_2_internal_bytepos_calls_2 = 0; +re_match_2_internal_bytepos_calls_3 = 0; +re_match_2_internal_bytepos_calls_4 = 0; +re_match_2_internal_bytepos_calls_5 = 0; +re_match_2_internal_bytepos_calls_6 = 0; + /* This is a separate function so that we can force an alloca cleanup afterwards. */ static ptrdiff_t @@ -4808,6 +4827,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp, int dummy; ptrdiff_t offset = PTR_TO_OFFSET (d); ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1; + re_match_2_internal_bytepos_calls_1++; UPDATE_SYNTAX_TABLE (charpos); GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2); nchars++; @@ -4848,6 +4868,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp, int dummy; ptrdiff_t offset = PTR_TO_OFFSET (d); ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset); + re_match_2_internal_bytepos_calls_2++; UPDATE_SYNTAX_TABLE (charpos); PREFETCH (); GET_CHAR_AFTER (c2, d, dummy); @@ -4891,6 +4912,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp, int dummy; ptrdiff_t offset = PTR_TO_OFFSET (d); ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1; + re_match_2_internal_bytepos_calls_3++; UPDATE_SYNTAX_TABLE (charpos); GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2); nchars++; @@ -4933,6 +4955,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp, int s1, s2; ptrdiff_t offset = PTR_TO_OFFSET (d); ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset); + re_match_2_internal_bytepos_calls_4++; UPDATE_SYNTAX_TABLE (charpos); PREFETCH (); c2 = RE_STRING_CHAR 
(d, target_multibyte); @@ -4974,6 +4997,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp, int s1, s2; ptrdiff_t offset = PTR_TO_OFFSET (d); ptrdiff_t charpos = SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1; + re_match_2_internal_bytepos_calls_5++; UPDATE_SYNTAX_TABLE (charpos); GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2); nchars++; @@ -5010,6 +5034,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp, { ptrdiff_t offset = PTR_TO_OFFSET (d); ptrdiff_t pos1 = SYNTAX_TABLE_BYTE_TO_CHAR (offset); + re_match_2_internal_bytepos_calls_6++; UPDATE_SYNTAX_TABLE (pos1); } { -- 2.40.0 [-- Attachment #3: Type: text/plain, Size: 224 bytes --] -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply related [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers 2023-04-11 11:29 ` Ihor Radchenko @ 2023-04-11 11:51 ` Eli Zaretskii 2023-04-12 13:39 ` Ihor Radchenko 0 siblings, 1 reply; 81+ messages in thread From: Eli Zaretskii @ 2023-04-11 11:51 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 58558, larsi, monnier > From: Ihor Radchenko <yantar92@posteo.net> > Cc: Eli Zaretskii <eliz@gnu.org>, larsi@gnus.org, 58558@debbugs.gnu.org > Date: Tue, 11 Apr 2023 11:29:26 +0000 > > Well. I just tried, but my Emacs-C foo is not good enough. > The attached patch fails to compile. That's because you've put DEFVAR_INT outside of any function. They should be inside one of the syms_of_* functions instead. regex-emacs.c doesn't have such a function, but search.c does. So just move those DEFVAR_INT lines to syms_of_search, and I think it will work. ^ permalink raw reply [flat|nested] 81+ messages in thread
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Ihor Radchenko @ 2023-04-12 13:39 UTC
  To: Eli Zaretskii; +Cc: 58558, larsi, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> Well. I just tried, but my Emacs-C foo is not good enough.
>> The attached patch fails to compile.
>
> That's because you've put DEFVAR_INT outside of any function. They
> should be inside one of the syms_of_* functions instead.
> regex-emacs.c doesn't have such a function, but search.c does. So
> just move those DEFVAR_INT lines to syms_of_search, and I think it
> will work.

Thanks! I now managed to define these variables + also a counter inside
buf_bytepos_to_charpos.

The results are interesting.

The call count for each SYNTAX_TABLE_BYTE_TO_CHAR inside
re_match_2_internal (there are 6 places where it is called):

- master   :: 28 5011460 20 96 285 539911
- Emacs 28 :: 68 5015326 26 397 1404 558585

Master has fewer calls...

This was weird, so I also added a counter inside buf_bytepos_to_charpos:

- master   :: 6,304,522
- Emacs 28 :: 593,430

Now, it is clear that something in SYNTAX_TABLE_BYTE_TO_CHAR triggers
buf_bytepos_to_charpos more often on master than on Emacs 28.

I looked into the code:

    INLINE ptrdiff_t
    SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t bytepos)
    {
      return (! parse_sexp_lookup_properties
              ? 0
    ...
    }

parse_sexp_lookup_properties looks suspicious, so I checked the value of
parse-sexp-lookup-properties in Org files on master vs. Emacs 28.
On master, the value is t, even though Org mode does not set this
variable. On Emacs 28, the value is nil.

I looked further and narrowed things down to the helpful package in my
config, where the culprit is (require 'cc-langs).

It looks like, for some reason, cc-langs changes the default value of
parse-sexp-lookup-properties globally!

Recipe:

1. emacs -Q
2. M-: (require 'cc-langs) <RET>
3. C-x b asd <RET>
4. M-: parse-sexp-lookup-properties <RET> => t

On Emacs 28, (4) yields nil.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

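The guard quoted in the message above can be illustrated in isolation. What follows is a minimal sketch, not the Emacs source: `lookup_properties`, `conversion_calls`, `slow_bytepos_to_charpos` and `guarded_byte_to_char` are hypothetical stand-ins for `parse_sexp_lookup_properties`, the added counter, `buf_bytepos_to_charpos` and `SYNTAX_TABLE_BYTE_TO_CHAR`. It assumes UTF-8 text and uses a plain linear scan for the conversion, which is only an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for parse_sexp_lookup_properties.  */
static bool lookup_properties;

/* Hypothetical counter, like the one added to buf_bytepos_to_charpos.  */
static long conversion_calls;

/* Count the UTF-8 characters that start before BYTEPOS.
   A linear scan; this is an assumption of the sketch, not how Emacs
   performs the conversion.  */
static ptrdiff_t
slow_bytepos_to_charpos (const unsigned char *text, ptrdiff_t bytepos)
{
  assert (bytepos >= 0);
  conversion_calls++;
  ptrdiff_t chars = 0;
  for (ptrdiff_t i = 0; i < bytepos; i++)
    if ((text[i] & 0xC0) != 0x80)   /* Not a continuation byte.  */
      chars++;
  return chars;
}

/* Mirror of the quoted guard: when the flag is off, skip the
   conversion entirely and return 0.  */
static ptrdiff_t
guarded_byte_to_char (const unsigned char *text, ptrdiff_t bytepos)
{
  return !lookup_properties ? 0 : slow_bytepos_to_charpos (text, bytepos);
}
```

With the flag off, `conversion_calls` never moves no matter how often `guarded_byte_to_char` is called; once the flag is on, every call pays for a conversion. That is consistent with the counter jumping once `parse-sexp-lookup-properties` becomes t globally.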
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 14:06 UTC
  To: Alan Mackenzie; +Cc: 58558, larsi, Ihor Radchenko, Eli Zaretskii

> 1. emacs -Q
> 2. M-: (require 'cc-langs) <RET>
> 3. C-x b asd <RET>
> 4. M-: parse-sexp-lookup-properties <RET> => t
>
> On Emacs 28, (4) yields nil.

I suspect that the patch below might fix the immediate problem.
Of course, setting `parse-sexp-lookup-properties` should not have such
a major performance impact, so maybe we should keep digging into
the problem.


        Stefan


diff --git a/lisp/progmodes/cc-defs.el b/lisp/progmodes/cc-defs.el
index aa6f33e9cab..92ab0c02de1 100644
--- a/lisp/progmodes/cc-defs.el
+++ b/lisp/progmodes/cc-defs.el
@@ -2153,20 +2153,13 @@ c-emacs-features
     ;; Record whether the `category' text property works.
     (if c-use-category (setq list (cons 'category-properties list)))

-    (let ((buf (generate-new-buffer " test"))
-          parse-sexp-lookup-properties
-          parse-sexp-ignore-comments
-          lookup-syntax-properties) ; XEmacs
-      (with-current-buffer buf
+    (with-current-buffer (generate-new-buffer " test")
+      ;; Do the let-binding in the right buffer, in case they're buffer-local.
+      (let ((parse-sexp-lookup-properties t)
+            (parse-sexp-ignore-comments t)
+            (lookup-syntax-properties t)) ; XEmacs
         (set-syntax-table (make-syntax-table))

-        ;; For some reason we have to set some of these after the
-        ;; buffer has been made current. (Specifically,
-        ;; `parse-sexp-ignore-comments' in Emacs 21.)
-        (setq parse-sexp-lookup-properties t
-              parse-sexp-ignore-comments t
-              lookup-syntax-properties t)
-
         ;; Find out if the `syntax-table' text property works.
         (modify-syntax-entry ?< ".")
         (modify-syntax-entry ?> ".")
@@ -2231,8 +2224,8 @@ c-emacs-features
             (if (bobp)
                 (setq list (cons 'col-0-paren list)))))

-        (set-buffer-modified-p nil))
-      (kill-buffer buf))
+        (set-buffer-modified-p nil)
+        (kill-buffer (current-buffer))))

     ;; Check how many elements `parse-partial-sexp' returns.
     (let ((ppss-size (or (c-safe (length

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Eli Zaretskii @ 2023-04-12 14:30 UTC
  To: Stefan Monnier; +Cc: 58558, acm, yantar92, larsi

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ihor Radchenko <yantar92@posteo.net>, Eli Zaretskii <eliz@gnu.org>,
>  larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 12 Apr 2023 10:06:03 -0400
>
> > 1. emacs -Q
> > 2. M-: (require 'cc-langs) <RET>
> > 3. C-x b asd <RET>
> > 4. M-: parse-sexp-lookup-properties <RET> => t
> >
> > On Emacs 28, (4) yields nil.
>
> I suspect that the patch below might fix the immediate problem.
> Of course, setting `parse-sexp-lookup-properties` should not have such
> a major performance impact, so maybe we should keep digging into
> the problem.

Also, that code was there in Emacs 28 as well, so how come it suddenly
has this effect now?

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 14:38 UTC
  To: Eli Zaretskii; +Cc: 58558, acm, yantar92, larsi

>> > 1. emacs -Q
>> > 2. M-: (require 'cc-langs) <RET>
>> > 3. C-x b asd <RET>
>> > 4. M-: parse-sexp-lookup-properties <RET> => t
>> >
>> > On Emacs 28, (4) yields nil.
>>
>> I suspect that the patch below might fix the immediate problem.
>> Of course, setting `parse-sexp-lookup-properties` should not have such
>> a major performance impact, so maybe we should keep digging into
>> the problem.
>
> Also, that code was there in Emacs 28 as well, so how come it suddenly
> has this effect now?

The effect of the code depends on whether the buffer that's current when
`cc-defs.el` is loaded has set `parse-sexp-lookup-properties`
buffer-locally or not.

I don't have Emacs-28 at hand, but the value of
`parse-sexp-lookup-properties` in *scratch* is (buffer-local) t in
Emacs-29 and (global) nil in Emacs-27.


        Stefan

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Eli Zaretskii @ 2023-04-12 15:22 UTC
  To: Stefan Monnier; +Cc: 58558, acm, yantar92, larsi

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de, yantar92@posteo.net, larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 12 Apr 2023 10:38:50 -0400
>
> > Also, that code was there in Emacs 28 as well, so how come it suddenly
> > has this effect now?
>
> The effect of the code depends on whether the buffer that's current when
> `cc-defs.el` is loaded has set `parse-sexp-lookup-properties`
> buffer-locally or not.
>
> I don't have Emacs-28 at hand, but the value of
> `parse-sexp-lookup-properties` in *scratch* is (buffer-local) t in
> Emacs-29 and (global) nil in Emacs-27.

Ah, okay. So in Emacs 29 we started setting this variable locally in
some buffers? Do you happen to know where's the change which caused
that, and why was it done?

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Alan Mackenzie @ 2023-04-12 15:59 UTC
  To: Eli Zaretskii; +Cc: 58558, larsi, yantar92, Stefan Monnier

Hello, Eli.

On Wed, Apr 12, 2023 at 18:22:22 +0300, Eli Zaretskii wrote:
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: acm@muc.de, yantar92@posteo.net, larsi@gnus.org, 58558@debbugs.gnu.org
> > Date: Wed, 12 Apr 2023 10:38:50 -0400

> > > Also, that code was there in Emacs 28 as well, so how come it suddenly
> > > has this effect now?

> > The effect of the code depends on whether the buffer that's current when
> > `cc-defs.el` is loaded has set `parse-sexp-lookup-properties`
> > buffer-locally or not.

> > I don't have Emacs-28 at hand, but the value of
> > `parse-sexp-lookup-properties` in *scratch* is (buffer-local) t in
> > Emacs-29 and (global) nil in Emacs-27.

> Ah, okay. So in Emacs 29 we started setting this variable locally in
> some buffers? Do you happen to know where's the change which caused
> that, and why was it done?

I suspect this commit as the cause:

    commit 6ccc4b6bc8a14daca6b3e3250574752c90c1eb9b
    Author: Noam Postavsky <npostavs@gmail.com>
    Date:   Fri May 6 18:31:00 2022 +0200

        Handle elisp #-syntax better in Emacs Lisp mode

        * elisp-mode.el (elisp-mode-syntax-propertize): New function.
        (emacs-lisp-mode): Set it as syntax-propertize-function (bug#15998).

Lisp Interaction Mode is derived from Emacs Lisp Mode. Whenever there is
a non-nil syntax-propertize-function, run-mode-hooks sets
parse-sexp-lookup-properties to t.

This is probably harmless in *scratch*.

--
Alan Mackenzie (Nuremberg, Germany).

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Stephen Berman @ 2023-04-12 14:38 UTC
  To: Eli Zaretskii; +Cc: 58558, acm, yantar92, larsi, Stefan Monnier

On Wed, 12 Apr 2023 17:30:36 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Ihor Radchenko <yantar92@posteo.net>, Eli Zaretskii <eliz@gnu.org>,
>>  larsi@gnus.org, 58558@debbugs.gnu.org
>> Date: Wed, 12 Apr 2023 10:06:03 -0400
>>
>> > 1. emacs -Q
>> > 2. M-: (require 'cc-langs) <RET>
>> > 3. C-x b asd <RET>
>> > 4. M-: parse-sexp-lookup-properties <RET> => t
>> >
>> > On Emacs 28, (4) yields nil.
>>
>> I suspect that the patch below might fix the immediate problem.
>> Of course, setting `parse-sexp-lookup-properties` should not have such
>> a major performance impact, so maybe we should keep digging into
>> the problem.
>
> Also, that code was there in Emacs 28 as well, so how come it suddenly
> has this effect now?

Note that, with emacs-28 -Q, `C-h v parse-sexp-lookup-properties' ==>

    parse-sexp-lookup-properties is a variable defined in ‘C source code’.

    Its value is nil

while with emacs-29 -Q, `C-h v parse-sexp-lookup-properties' ==>

    parse-sexp-lookup-properties is a variable defined in ‘C source code’.

    Its value is t
    Local in buffer *scratch*; global value is nil

Steve Berman

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Ihor Radchenko @ 2023-04-12 14:42 UTC
  To: Eli Zaretskii; +Cc: 58558, acm, larsi, Stefan Monnier

Eli Zaretskii <eliz@gnu.org> writes:

> Also, that code was there in Emacs 28 as well, so how come it suddenly
> has this effect now?

Random guess:

cc-langs.el loads cc-defs via (cc-require 'cc-defs).
`cc-require' is doing something extremely tricky with byte compilation.
Might Emacs 29 have some subtle changes in byte code that could have an
influence?

--
Ihor Radchenko // yantar92

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Ihor Radchenko @ 2023-04-12 14:39 UTC
  To: Stefan Monnier; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I suspect that the patch below might fix the immediate problem.

I confirm that it does fix the problem. But why not `with-temp-buffer'?
Also, how come `setq' changes the global variable value even though it
is let-bound?

> Of course, setting `parse-sexp-lookup-properties` should not have such
> a major performance impact, so maybe we should keep digging into
> the problem.

Agree. I was considering `parse-sexp-lookup-properties' in Org, but this
issue will be a blocker.

To improve the performance, the two obvious ways are reducing the number
of SYNTAX_TABLE_BYTE_TO_CHAR calls in re_match_2_internal and speeding
up buf_bytepos_to_charpos. I'd prefer the latter, as it is used
ubiquitously across Emacs and making point lookup faster will thus
benefit other places as well.

--
Ihor Radchenko // yantar92

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 15:20 UTC
  To: Ihor Radchenko; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi

> I confirm that it does fix the problem. But why not `with-temp-buffer'?

I think it's for compatibility with TECO Emacs or something like that :-)

> Also, how come `setq' changes the global variable value despite it is
> let-bound?

Because the `let` and the `setq` were not performed in the same buffer,
so if the var is buffer-local ...

> To improve the performance, the two obvious ways are reducing the number
> of SYNTAX_TABLE_BYTE_TO_CHAR calls in re_match_2_internal and speeding
> up buf_bytepos_to_charpos.

I think the behavior you experience doesn't require "speeding up" but it
requires "fixing a performance bug". Technically it's the same, but
still...

> I'd prefer the latter as it is used ubiquitously across Emacs and
> making point lookup faster will thus benefit other places as well.

Why choose?

For the former, we could probably extend the `b_property` and
`e_property` fields of `gl_state` (which hold charpos) to also store
their bytepos equivalent, which should significantly reduce the number
of conversions between bytepos and charpos.


        Stefan

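The caching idea in the message above (keep the byte equivalents of the interval bounds next to the character bounds) can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not Emacs code: `struct interval_cache`, `update_for_byte`, `byte_to_char`, `char_to_byte`, `conversions` and `INTERVAL_CHARS` are hypothetical names, the text is assumed to be UTF-8, and the "property interval" is faked as a fixed run of `INTERVAL_CHARS` characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: pretend the syntax property changes every 4 chars.  */
enum { INTERVAL_CHARS = 4 };

/* Counts every byte<->char conversion performed.  */
static long conversions;

/* Number of UTF-8 characters that start before BYTEPOS (linear scan).  */
static ptrdiff_t
byte_to_char (const unsigned char *s, ptrdiff_t bytepos)
{
  ptrdiff_t chars = 0;
  conversions++;
  for (ptrdiff_t i = 0; i < bytepos; i++)
    if ((s[i] & 0xC0) != 0x80)      /* Not a continuation byte.  */
      chars++;
  return chars;
}

/* Byte offset at which character CHARPOS starts, clamped to LEN.  */
static ptrdiff_t
char_to_byte (const unsigned char *s, ptrdiff_t len, ptrdiff_t charpos)
{
  ptrdiff_t i = 0;
  conversions++;
  while (i < len && charpos > 0)
    {
      i++;
      while (i < len && (s[i] & 0xC0) == 0x80)
        i++;                        /* Skip continuation bytes.  */
      charpos--;
    }
  return i;
}

/* Character bounds of the current interval plus their byte
   equivalents, so a caller working in byte offsets can test
   membership without converting.  */
struct interval_cache
{
  ptrdiff_t b_char, e_char;   /* [b_char, e_char) in characters.  */
  ptrdiff_t b_byte, e_byte;   /* The same bounds in bytes.  */
};

/* Make CACHE valid for BYTEOFFSET.  Conversions happen only when
   BYTEOFFSET falls outside the cached byte bounds.  */
static void
update_for_byte (struct interval_cache *cache, const unsigned char *s,
                 ptrdiff_t len, ptrdiff_t byteoffset)
{
  assert (byteoffset >= 0 && byteoffset <= len);
  if (byteoffset >= cache->b_byte && byteoffset < cache->e_byte)
    return;                         /* Cache hit: no conversion at all.  */
  ptrdiff_t charpos = byte_to_char (s, byteoffset);
  cache->b_char = charpos - charpos % INTERVAL_CHARS;
  cache->e_char = cache->b_char + INTERVAL_CHARS;
  cache->b_byte = char_to_byte (s, len, cache->b_char);
  cache->e_byte = char_to_byte (s, len, cache->e_char);
}
```

A cache initialised to `{ -1, -1, -1, -1 }` misses on the first call and is filled from there. Each miss costs three conversions in this sketch, so the approach only pays off when intervals span many positions, which is the common case when few characters carry a `syntax-table` property.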
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 23:23 UTC
  To: Ihor Radchenko; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi

> For the former, we could probably extend the `b_property` and
> `e_property` fields of `gl_state` (which hold charpos) to also store
> their bytepos equivalent, which should significantly reduce the number
> of conversions between bytepos and charpos.

I.e. something like the patch below (which passes all tests except for
`test/src/comp-tests` for a reason that completely escapes me).


        Stefan

[-- Attachment #2: regmatch.patch --]
[-- Type: text/x-diff, Size: 13157 bytes --]

diff --git a/src/regex-emacs.c b/src/regex-emacs.c
index 746779490ad..f75f805cd9c 100644
--- a/src/regex-emacs.c
+++ b/src/regex-emacs.c
@@ -3979,7 +3979,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,

   /* Prevent shrinking and relocation of buffer text if GC happens
      while we are inside this function. The calls to
-     UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
+     RE_UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
      `internal--syntax-propertize`); these calls are careful to defend against
      buffer modifications, but even with no modifications, the buffer text may
      be relocated during GC by `compact_buffer` which would invalidate
@@ -4792,12 +4792,11 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                 int s1, s2;
                 int dummy;
                 ptrdiff_t offset = POINTER_TO_OFFSET (d);
-                ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
-                UPDATE_SYNTAX_TABLE (charpos);
+                RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
                 GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
                 nchars++;
                 s1 = SYNTAX (c1);
-                UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+                RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
                 PREFETCH_NOLIMIT ();
                 GET_CHAR_AFTER (c2, d, dummy);
                 nchars++;
@@ -4832,8 +4831,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int s1, s2;
               int dummy;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE (offset);
               PREFETCH ();
               GET_CHAR_AFTER (c2, d, dummy);
               nchars++;
@@ -4848,7 +4846,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                 {
                   GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+                  RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
                   s1 = SYNTAX (c1);

                   /* ... and S1 is Sword, and WORD_BOUNDARY_P (C1, C2)
@@ -4875,8 +4873,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int s1, s2;
               int dummy;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
               GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
               nchars++;
               s1 = SYNTAX (c1);
@@ -4891,13 +4888,13 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                   PREFETCH_NOLIMIT ();
                   GET_CHAR_AFTER (c2, d, dummy);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+                  RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
                   s2 = SYNTAX (c2);

                   /* ... and S2 is Sword, and WORD_BOUNDARY_P (C1, C2)
                      returns 0. */
                   if ((s2 == Sword) && !WORD_BOUNDARY_P (c1, c2))
-                    goto fail;
+                    goto fail;
                 }
             }
           break;
@@ -4917,8 +4914,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int c1, c2;
               int s1, s2;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE (offset);
               PREFETCH ();
               c2 = RE_STRING_CHAR (d, target_multibyte);
               nchars++;
@@ -4933,7 +4929,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                 {
                   GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+                  RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
                   s1 = SYNTAX (c1);

                   /* ... and S1 is Sword or Ssymbol. */
@@ -4958,8 +4954,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int c1, c2;
               int s1, s2;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
               GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
               nchars++;
               s1 = SYNTAX (c1);
@@ -4974,7 +4969,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                   PREFETCH_NOLIMIT ();
                   c2 = RE_STRING_CHAR (d, target_multibyte);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+                  RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
                   s2 = SYNTAX (c2);

                   /* ... and S2 is Sword or Ssymbol. */
@@ -4994,8 +4989,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
             PREFETCH ();
             {
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t pos1 = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
-              UPDATE_SYNTAX_TABLE (pos1);
+              RE_UPDATE_SYNTAX_TABLE (offset);
             }
             {
               int len;
diff --git a/src/syntax.c b/src/syntax.c
index e9e04e2d638..0245038dc2d 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -250,6 +250,8 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
   gl_state.b_property = BEGV;
   gl_state.e_property = ZV + 1;
   gl_state.object = Qnil;
+  gl_state.b_re_byte = -1;
+  gl_state.e_re_byte = -1;
   if (parse_sexp_lookup_properties)
     {
       if (count > 0)
@@ -268,11 +270,12 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
    FROMBYTE is an regexp-byteoffset. */

 void
-RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
-                                  ptrdiff_t frombyte)
+RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object, ptrdiff_t frombyte)
 {
   SETUP_BUFFER_SYNTAX_TABLE ();
   gl_state.object = object;
+  gl_state.b_re_byte = -1;
+  gl_state.e_re_byte = -1;
   if (BUFFERP (gl_state.object))
     {
       struct buffer *buf = XBUFFER (gl_state.object);
@@ -282,7 +285,7 @@ RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
   else if (NILP (gl_state.object))
     {
       gl_state.b_property = BEG;
-      gl_state.e_property = ZV; /* FIXME: Why not +1 like in SETUP_SYNTAX_TABLE? */
+      gl_state.e_property = ZV;
     }
   else if (EQ (gl_state.object, Qt))
     {
@@ -295,8 +298,11 @@ RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
       gl_state.e_property = 1 + SCHARS (gl_state.object);
     }
   if (parse_sexp_lookup_properties)
-    update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
-                         1, 1, gl_state.object);
+    {
+      update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
+                           1, 1, gl_state.object);
+      re_update_byteoffsets ();
+    }
 }

 /* Update gl_state to an appropriate interval which contains CHARPOS. The
diff --git a/src/syntax.h b/src/syntax.h
index 01982be25a0..3e20952053b 100644
--- a/src/syntax.h
+++ b/src/syntax.h
@@ -66,7 +66,7 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table)
 struct gl_state_s
 {
   Lisp_Object object;                   /* The object we are scanning. */
-  ptrdiff_t start;                      /* Where to stop. */
+  ptrdiff_t start;                      /* Where to stop(?FIXME?). */
   ptrdiff_t stop;                       /* Where to stop. */
   bool use_global;                      /* Whether to use global_code
                                            or c_s_t. */
@@ -85,6 +85,11 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table)
                                            and possibly at the
                                            intervals too, depending
                                            on: */
+  /* The regexp engine prefers byteoffsets over char positions, so
+     store those to try and reduce the number of byte<->char conversions.
+     This is only kept uptodate when used from the regexp engine. */
+  ptrdiff_t b_re_byte;                  /* First byteoffset where c_s_t is valid. */
+  ptrdiff_t e_re_byte;                  /* First byteoffset where c_s_t is not valid. */
 };

 extern struct gl_state_s gl_state;
@@ -145,19 +150,14 @@ SYNTAX (int c)

 extern unsigned char const syntax_spec_code[0400];

-/* Convert the regexp's BYTEOFFSET into a character position,
-   for the object recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT.
-
-   The value is meant for use in code that does nothing when
-   parse_sexp_lookup_properties is false, so return 0 in that case,
-   for speed. */
+/* Convert the regexp's BYTEOFFSET into a character position, for
+   the object recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT. */

 INLINE ptrdiff_t
 RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset)
 {
-  return (! parse_sexp_lookup_properties
-          ? 0
-          : STRINGP (gl_state.object)
+  eassert (parse_sexp_lookup_properties);
+  return (STRINGP (gl_state.object)
           ? string_byte_to_char (gl_state.object, byteoffset)
           : BUFFERP (gl_state.object)
           ? ((buf_bytepos_to_charpos
@@ -168,6 +168,44 @@ RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset)
           : byteoffset);
 }

+INLINE ptrdiff_t
+RE_SYNTAX_TABLE_CHAR_TO_BYTE (ptrdiff_t charpos)
+{
+  eassert (parse_sexp_lookup_properties);
+  return (STRINGP (gl_state.object)
+          ? string_char_to_byte (gl_state.object, charpos)
+          : BUFFERP (gl_state.object)
+          ? ((buf_charpos_to_bytepos
+              (XBUFFER (gl_state.object), charpos)
+              - BUF_BEGV_BYTE (XBUFFER (gl_state.object))))
+          : NILP (gl_state.object)
+          ? CHAR_TO_BYTE (charpos) - BEGV_BYTE
+          : charpos);
+}
+
+static void re_update_byteoffsets (void)
+{
+  gl_state.b_re_byte = RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.b_property);
+  eassert (gl_state.b_property
+           == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.b_re_byte));
+  /* `e_property` is often set to EOB+1 (or to some value
+     much further than `stop` in narrowed buffers). */
+  gl_state.e_re_byte
+    = gl_state.e_property > gl_state.stop
+      ? 1 + RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.stop)
+      : RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.e_property);
+  eassert (gl_state.e_property > gl_state.stop
+           ? gl_state.e_property
+             >= 1 + RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte - 1)
+           : gl_state.e_property
+             == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte));
+}
+
+/* The regexp-engine doesn't keep track of char positions, but instead
+   uses byteoffsets, so `syntax.c` uses `UPDATE_SYNTAX_TABLE_*` functions,
+   passing them `charpos`s whereas `regexp.c` uses `RE_UPDATE_SYNTAX_TABLE_*`
+   functions, passing them byteoffsets. */
+
 /* Make syntax table state (gl_state) good for CHARPOS, assuming it is
    currently good for a position before CHARPOS. */

@@ -178,6 +216,36 @@ UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t charpos)
     update_syntax_table_forward (charpos, false, gl_state.object);
 }

+INLINE void
+RE_UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t byteoffset)
+{ /* Performs just-in-time syntax-propertization. */
+  if (!parse_sexp_lookup_properties)
+    return;
+  eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+  if (byteoffset >= gl_state.e_re_byte)
+    {
+      ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+      eassert (charpos >= gl_state.e_property);
+      UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+      re_update_byteoffsets ();
+    }
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (ptrdiff_t byteoffset)
+{ /* Performs just-in-time syntax-propertization. */
+  if (!parse_sexp_lookup_properties)
+    return;
+  eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+  if (byteoffset > gl_state.e_re_byte)
+    {
+      ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset) - 1;
+      eassert (charpos >= gl_state.e_property);
+      UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+      re_update_byteoffsets ();
+    }
+}
+
 /* Make syntax table state (gl_state) good for CHARPOS, assuming it is
    currently good for a position after CHARPOS. */

@@ -188,6 +256,36 @@ UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t charpos)
     update_syntax_table (charpos, -1, false, gl_state.object);
 }

+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t byteoffset)
+{
+  if (!parse_sexp_lookup_properties)
+    return;
+  eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+  if (byteoffset < gl_state.b_re_byte)
+    {
+      ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+      eassert (charpos < gl_state.b_property);
+      UPDATE_SYNTAX_TABLE_FORWARD (charpos);
+      re_update_byteoffsets ();
+    }
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (ptrdiff_t byteoffset)
+{
+  if (!parse_sexp_lookup_properties)
+    return;
+  eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */
+  if (byteoffset <= gl_state.b_re_byte)
+    {
+      ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset);
+      eassert (charpos <= gl_state.b_property);
+      UPDATE_SYNTAX_TABLE_FORWARD (charpos - 1);
+      re_update_byteoffsets ();
+    }
+}
+
 /* Make syntax table good for CHARPOS. */

 INLINE void
@@ -197,6 +295,20 @@ UPDATE_SYNTAX_TABLE (ptrdiff_t charpos)
   UPDATE_SYNTAX_TABLE_FORWARD (charpos);
 }

+INLINE void
+RE_UPDATE_SYNTAX_TABLE (ptrdiff_t byteoffset)
+{
+  RE_UPDATE_SYNTAX_TABLE_BACKWARD (byteoffset);
+  RE_UPDATE_SYNTAX_TABLE_FORWARD (byteoffset);
+}
+
+INLINE void
+RE_UPDATE_SYNTAX_TABLE_BEFORE (ptrdiff_t byteoffset)
+{
+  RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (byteoffset);
+  RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (byteoffset);
+}
+
 /* Set up the buffer-global syntax table. */

 INLINE void

* bug#58558: 29.0.50; re-search-forward is slow in some buffers
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-13 4:33 UTC
  To: Ihor Radchenko; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi

>> For the former, we could probably extend the `b_property` and
>> `e_property` fields of `gl_state` (which hold charpos) to also store
>> their bytepos equivalent, which should significantly reduce the number
>> of conversions between bytepos and charpos.
> I.e. something like the patch below (which passes all tests except for
> `test/src/comp-tests` for a reason that completely escapes me).

Found the culprit! The patch below passes `make check`.


        Stefan

[-- Attachment #2: regmatch.patch --]
[-- Type: text/x-diff, Size: 14351 bytes --]

diff --git a/src/fns.c b/src/fns.c
index e92ef7e4c81..591b00103da 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -1194,6 +1194,8 @@ string_char_to_byte (Lisp_Object string, ptrdiff_t char_index)
   if (best_above == best_above_byte)
     return char_index;

+  eassert (char_index >= 0 && char_index <= best_above);
+
   if (BASE_EQ (string, string_char_byte_cache_string))
     {
       if (string_char_byte_cache_charpos < char_index)
@@ -1254,6 +1256,8 @@ string_byte_to_char (Lisp_Object string, ptrdiff_t byte_index)
   if (best_above == best_above_byte)
     return byte_index;

+  eassert (byte_index >= 0 && byte_index <= best_above_byte);
+
   if (BASE_EQ (string, string_char_byte_cache_string))
     {
       if (string_char_byte_cache_bytepos < byte_index)
diff --git a/src/regex-emacs.c b/src/regex-emacs.c
index 746779490ad..f75f805cd9c 100644
--- a/src/regex-emacs.c
+++ b/src/regex-emacs.c
@@ -3979,7 +3979,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,

   /* Prevent shrinking and relocation of buffer text if GC happens
      while we are inside this function. The calls to
-     UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
+     RE_UPDATE_SYNTAX_TABLE_* macros can call Lisp (via
      `internal--syntax-propertize`); these calls are careful to defend against
      buffer modifications, but even with no modifications, the buffer text may
      be relocated during GC by `compact_buffer` which would invalidate
@@ -4792,12 +4792,11 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                 int s1, s2;
                 int dummy;
                 ptrdiff_t offset = POINTER_TO_OFFSET (d);
-                ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
-                UPDATE_SYNTAX_TABLE (charpos);
+                RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
                 GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
                 nchars++;
                 s1 = SYNTAX (c1);
-                UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+                RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
                 PREFETCH_NOLIMIT ();
                 GET_CHAR_AFTER (c2, d, dummy);
                 nchars++;
@@ -4832,8 +4831,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int s1, s2;
               int dummy;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE (offset);
               PREFETCH ();
               GET_CHAR_AFTER (c2, d, dummy);
               nchars++;
@@ -4848,7 +4846,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                 {
                   GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+                  RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
                   s1 = SYNTAX (c1);

                   /* ... and S1 is Sword, and WORD_BOUNDARY_P (C1, C2)
@@ -4875,8 +4873,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int s1, s2;
               int dummy;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
               GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
               nchars++;
               s1 = SYNTAX (c1);
@@ -4891,13 +4888,13 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                   PREFETCH_NOLIMIT ();
                   GET_CHAR_AFTER (c2, d, dummy);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+                  RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
                   s2 = SYNTAX (c2);

                   /* ... and S2 is Sword, and WORD_BOUNDARY_P (C1, C2)
                      returns 0. */
                   if ((s2 == Sword) && !WORD_BOUNDARY_P (c1, c2))
-                    goto fail;
+                    goto fail;
                 }
             }
           break;
@@ -4917,8 +4914,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int c1, c2;
               int s1, s2;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE (offset);
               PREFETCH ();
               c2 = RE_STRING_CHAR (d, target_multibyte);
               nchars++;
@@ -4933,7 +4929,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                 {
                   GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_BACKWARD (charpos - 1);
+                  RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (offset);
                   s1 = SYNTAX (c1);

                   /* ... and S1 is Sword or Ssymbol. */
@@ -4958,8 +4954,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
               int c1, c2;
               int s1, s2;
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset) - 1;
-              UPDATE_SYNTAX_TABLE (charpos);
+              RE_UPDATE_SYNTAX_TABLE_BEFORE (offset);
               GET_CHAR_BEFORE_2 (c1, d, string1, end1, string2, end2);
               nchars++;
               s1 = SYNTAX (c1);
@@ -4974,7 +4969,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
                   PREFETCH_NOLIMIT ();
                   c2 = RE_STRING_CHAR (d, target_multibyte);
                   nchars++;
-                  UPDATE_SYNTAX_TABLE_FORWARD (charpos + 1);
+                  RE_UPDATE_SYNTAX_TABLE_FORWARD (offset);
                   s2 = SYNTAX (c2);

                   /* ... and S2 is Sword or Ssymbol. */
@@ -4994,8 +4989,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
             PREFETCH ();
             {
               ptrdiff_t offset = POINTER_TO_OFFSET (d);
-              ptrdiff_t pos1 = RE_SYNTAX_TABLE_BYTE_TO_CHAR (offset);
-              UPDATE_SYNTAX_TABLE (pos1);
+              RE_UPDATE_SYNTAX_TABLE (offset);
             }
             {
               int len;
diff --git a/src/syntax.c b/src/syntax.c
index e9e04e2d638..fbd08c74092 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -250,6 +250,8 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
   gl_state.b_property = BEGV;
   gl_state.e_property = ZV + 1;
   gl_state.object = Qnil;
+  gl_state.b_re_byte = -1;
+  gl_state.e_re_byte = -1;
   if (parse_sexp_lookup_properties)
     {
       if (count > 0)
@@ -265,14 +267,15 @@ SETUP_SYNTAX_TABLE (ptrdiff_t from, ptrdiff_t count)
 /* Same as above, but in OBJECT. If OBJECT is nil, use current buffer.
    If it is t (which is only used in fast_c_string_match_ignore_case),
    ignore properties altogether.
-   FROMBYTE is an regexp-byteoffset. */
+   FROMBYTE is a regexp-byteoffset. */

 void
-RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
-                                  ptrdiff_t frombyte)
+RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object, ptrdiff_t frombyte)
 {
   SETUP_BUFFER_SYNTAX_TABLE ();
   gl_state.object = object;
+  gl_state.b_re_byte = -1;
+  gl_state.e_re_byte = -1;
   if (BUFFERP (gl_state.object))
     {
       struct buffer *buf = XBUFFER (gl_state.object);
@@ -282,21 +285,25 @@ RE_SETUP_SYNTAX_TABLE_FOR_OBJECT (Lisp_Object object,
   else if (NILP (gl_state.object))
     {
       gl_state.b_property = BEG;
-      gl_state.e_property = ZV; /* FIXME: Why not +1 like in SETUP_SYNTAX_TABLE? */
+      gl_state.e_property = ZV;
     }
   else if (EQ (gl_state.object, Qt))
     {
       gl_state.b_property = 0;
-      gl_state.e_property = PTRDIFF_MAX;
+      /* -1 so we can do +1 in `re_update_byteoffsets`. */
+      gl_state.e_property = PTRDIFF_MAX - 1;
     }
   else
     {
       gl_state.b_property = 0;
-      gl_state.e_property = 1 + SCHARS (gl_state.object);
+      gl_state.e_property = SCHARS (gl_state.object);
     }
   if (parse_sexp_lookup_properties)
-    update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
-                         1, 1, gl_state.object);
+    {
+      update_syntax_table (RE_SYNTAX_TABLE_BYTE_TO_CHAR (frombyte),
+                           1, 1, gl_state.object);
+      re_update_byteoffsets ();
+    }
 }

 /* Update gl_state to an appropriate interval which contains CHARPOS. The
diff --git a/src/syntax.h b/src/syntax.h
index 01982be25a0..420ba8f31dc 100644
--- a/src/syntax.h
+++ b/src/syntax.h
@@ -66,7 +66,7 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table)
 struct gl_state_s
 {
   Lisp_Object object;                   /* The object we are scanning. */
-  ptrdiff_t start;                      /* Where to stop. */
+  ptrdiff_t start;                      /* Where to stop(?FIXME?). */
   ptrdiff_t stop;                       /* Where to stop. */
   bool use_global;                      /* Whether to use global_code
                                            or c_s_t.
*/ @@ -85,6 +85,11 @@ #define Vstandard_syntax_table BVAR (&buffer_defaults, syntax_table) and possibly at the intervals too, depending on: */ + /* The regexp engine prefers byteoffsets over char positions, so + store those to try and reduce the number of byte<->char conversions. + This is only kept uptodate when used from the regexp engine. */ + ptrdiff_t b_re_byte; /* First byteoffset where c_s_t is valid. */ + ptrdiff_t e_re_byte; /* First byteoffset where c_s_t is not valid. */ }; extern struct gl_state_s gl_state; @@ -145,19 +150,14 @@ SYNTAX (int c) extern unsigned char const syntax_spec_code[0400]; -/* Convert the regexp's BYTEOFFSET into a character position, - for the object recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT. - - The value is meant for use in code that does nothing when - parse_sexp_lookup_properties is false, so return 0 in that case, - for speed. */ +/* Convert the BYTEOFFSET into a character position, for the object + recorded in gl_state with RE_SETUP_SYNTAX_TABLE_FOR_OBJECT. */ INLINE ptrdiff_t RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset) { - return (! parse_sexp_lookup_properties - ? 0 - : STRINGP (gl_state.object) + eassert (parse_sexp_lookup_properties); + return (STRINGP (gl_state.object) ? string_byte_to_char (gl_state.object, byteoffset) : BUFFERP (gl_state.object) ? ((buf_bytepos_to_charpos @@ -168,6 +168,44 @@ RE_SYNTAX_TABLE_BYTE_TO_CHAR (ptrdiff_t byteoffset) : byteoffset); } +INLINE ptrdiff_t +RE_SYNTAX_TABLE_CHAR_TO_BYTE (ptrdiff_t charpos) +{ + eassert (parse_sexp_lookup_properties); + return (STRINGP (gl_state.object) + ? string_char_to_byte (gl_state.object, charpos) + : BUFFERP (gl_state.object) + ? ((buf_charpos_to_bytepos + (XBUFFER (gl_state.object), charpos) + - BUF_BEGV_BYTE (XBUFFER (gl_state.object)))) + : NILP (gl_state.object) + ? 
CHAR_TO_BYTE (charpos) - BEGV_BYTE + : charpos); +} + +static void re_update_byteoffsets (void) +{ + gl_state.b_re_byte = RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.b_property); + eassert (gl_state.b_property + == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.b_re_byte)); + /* `e_property` is often set to EOB+1 (or to some value + much further than `stop` in narrowed buffers). */ + gl_state.e_re_byte + = gl_state.e_property > gl_state.stop + ? 1 + RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.stop) + : RE_SYNTAX_TABLE_CHAR_TO_BYTE (gl_state.e_property); + eassert (gl_state.e_property > gl_state.stop + ? gl_state.e_property + >= 1 + RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte - 1) + : gl_state.e_property + == RE_SYNTAX_TABLE_BYTE_TO_CHAR (gl_state.e_re_byte)); +} + +/* The regexp-engine doesn't keep track of char positions, but instead + uses byteoffsets, so `syntax.c` uses `UPDATE_SYNTAX_TABLE_*` functions, + passing them `charpos`s whereas `regexp.c` uses `RE_UPDATE_SYNTAX_TABLE_*` + functions, passing them byteoffsets. */ + /* Make syntax table state (gl_state) good for CHARPOS, assuming it is currently good for a position before CHARPOS. */ @@ -178,6 +216,36 @@ UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t charpos) update_syntax_table_forward (charpos, false, gl_state.object); } +INLINE void +RE_UPDATE_SYNTAX_TABLE_FORWARD (ptrdiff_t byteoffset) +{ /* Performs just-in-time syntax-propertization. */ + if (!parse_sexp_lookup_properties) + return; + eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */ + if (byteoffset >= gl_state.e_re_byte) + { + ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset); + eassert (charpos >= gl_state.e_property); + UPDATE_SYNTAX_TABLE_FORWARD (charpos); + re_update_byteoffsets (); + } +} + +INLINE void +RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (ptrdiff_t byteoffset) +{ /* Performs just-in-time syntax-propertization. 
*/ + if (!parse_sexp_lookup_properties) + return; + eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */ + if (byteoffset > gl_state.e_re_byte) + { + ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset) - 1; + eassert (charpos >= gl_state.e_property); + UPDATE_SYNTAX_TABLE_FORWARD (charpos); + re_update_byteoffsets (); + } +} + /* Make syntax table state (gl_state) good for CHARPOS, assuming it is currently good for a position after CHARPOS. */ @@ -188,6 +256,36 @@ UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t charpos) update_syntax_table (charpos, -1, false, gl_state.object); } +INLINE void +RE_UPDATE_SYNTAX_TABLE_BACKWARD (ptrdiff_t byteoffset) +{ + if (!parse_sexp_lookup_properties) + return; + eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */ + if (byteoffset < gl_state.b_re_byte) + { + ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset); + eassert (charpos < gl_state.b_property); + UPDATE_SYNTAX_TABLE_FORWARD (charpos); + re_update_byteoffsets (); + } +} + +INLINE void +RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (ptrdiff_t byteoffset) +{ + if (!parse_sexp_lookup_properties) + return; + eassert (gl_state.e_re_byte >= 0); /* gl_state.b_re_byte can be negative. */ + if (byteoffset <= gl_state.b_re_byte) + { + ptrdiff_t charpos = RE_SYNTAX_TABLE_BYTE_TO_CHAR (byteoffset); + eassert (charpos <= gl_state.b_property); + UPDATE_SYNTAX_TABLE_FORWARD (charpos - 1); + re_update_byteoffsets (); + } +} + /* Make syntax table good for CHARPOS. 
*/ INLINE void @@ -197,6 +295,20 @@ UPDATE_SYNTAX_TABLE (ptrdiff_t charpos) UPDATE_SYNTAX_TABLE_FORWARD (charpos); } +INLINE void +RE_UPDATE_SYNTAX_TABLE (ptrdiff_t byteoffset) +{ + RE_UPDATE_SYNTAX_TABLE_BACKWARD (byteoffset); + RE_UPDATE_SYNTAX_TABLE_FORWARD (byteoffset); +} + +INLINE void +RE_UPDATE_SYNTAX_TABLE_BEFORE (ptrdiff_t byteoffset) +{ + RE_UPDATE_SYNTAX_TABLE_BACKWARD_BEFORE (byteoffset); + RE_UPDATE_SYNTAX_TABLE_FORWARD_BEFORE (byteoffset); +} + /* Set up the buffer-global syntax table. */ INLINE void ^ permalink raw reply related [flat|nested] 81+ messages in thread
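
Editorial sketch (not part of the posted patch): the patch above caches the byte-offset bounds of the interval in which the current syntax information is valid, so that the regexp engine can skip the byte-to-char conversion on most lookups. The following toy program illustrates only that idea. Everything in it is hypothetical: `struct toy_state`, `TOY_INTERVAL`, and the helper functions are stand-ins that assume plain UTF-8 text and fixed-size intervals, and they do not reproduce `gl_state` or the real interval logic.

```c
#include <stddef.h>

/* Hypothetical fixed interval length, in characters.  */
#define TOY_INTERVAL 4

/* Toy state, loosely modelled on b_property/e_property and
   b_re_byte/e_re_byte; this is NOT the real gl_state.  */
struct toy_state
{
  ptrdiff_t b_char, e_char;   /* Valid interval [b_char, e_char) in chars.  */
  ptrdiff_t b_byte, e_byte;   /* The same interval in byte offsets.  */
  unsigned long conversions;  /* Number of byte->char conversions done.  */
};

/* Count the characters before BYTEOFFSET in UTF-8 TEXT (the slow path).
   BYTEOFFSET is assumed to lie on a character boundary.  */
static ptrdiff_t
toy_byte_to_char (struct toy_state *st, const unsigned char *text,
                  ptrdiff_t byteoffset)
{
  ptrdiff_t chars = 0;
  st->conversions++;
  for (ptrdiff_t b = 0; b < byteoffset; b++)
    if ((text[b] & 0xC0) != 0x80)   /* Not a UTF-8 continuation byte.  */
      chars++;
  return chars;
}

/* Return the byte offset of character CHARPOS in NUL-terminated UTF-8
   TEXT, clamped to the end of the text.  */
static ptrdiff_t
toy_char_to_byte (const unsigned char *text, ptrdiff_t charpos)
{
  ptrdiff_t b = 0;
  while (charpos > 0 && text[b] != '\0')
    {
      b++;
      while ((text[b] & 0xC0) == 0x80)
        b++;
      charpos--;
    }
  return b;
}

/* Make ST valid for BYTEOFFSET, assuming the scan only moves forward.
   The conversion happens only when BYTEOFFSET leaves the cached byte
   interval, which is the point of caching the byte bounds.  */
static void
toy_update_forward (struct toy_state *st, const unsigned char *text,
                    ptrdiff_t byteoffset)
{
  if (byteoffset < st->e_byte)
    return;                     /* Fast path: no conversion needed.  */
  ptrdiff_t charpos = toy_byte_to_char (st, text, byteoffset);
  st->b_char = charpos - charpos % TOY_INTERVAL;
  st->e_char = st->b_char + TOY_INTERVAL;
  st->b_byte = toy_char_to_byte (text, st->b_char);
  st->e_byte = toy_char_to_byte (text, st->e_char);
}
```

In this toy setup, walking forward over a short string one character at a time triggers one conversion per interval rather than one per character; whether the real patch gets a comparable saving depends on how large the property intervals are in the buffer at hand.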
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-13 4:33 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 20:05 ` Ihor Radchenko
  0 siblings, 0 replies; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-13 20:05 UTC
To: Stefan Monnier; +Cc: 58558, Alan Mackenzie, Eli Zaretskii, larsi

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> diff --git a/src/fns.c b/src/fns.c
> index e92ef7e4c81..591b00103da 100644
> --- a/src/fns.c
> +++ b/src/fns.c

With this patch, I see no significant difference in time taken by
re-search-forward with and without parse-sexp-lookup-properties:

patch + workaround (setting parse-sexp-lookup-properties to nil)
Re-search time: 0.735682 sec.

patch + no workaround (leaving parse-sexp-lookup-properties to t)
Re-search time: 0.841605 sec.

no patch + workaround
Re-search time: 0.745678 sec.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-12 23:23 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-04-13 4:33 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 4:52 ` Eli Zaretskii
  2023-04-13 5:15 ` Eli Zaretskii
  1 sibling, 1 reply; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-13 4:52 UTC
To: Stefan Monnier, Andrea Corallo; +Cc: 58558, acm, yantar92, larsi

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Alan Mackenzie <acm@muc.de>, Eli Zaretskii <eliz@gnu.org>,
>  larsi@gnus.org, 58558@debbugs.gnu.org
> Date: Wed, 12 Apr 2023 19:23:19 -0400
>
> > For the former, we could probably extend the `b_property` and
> > `e_property` fields of `gl_state` (which hold charpos) to also store
> > their bytepos equivalent, which should significantly reduce the number
> > of conversions between bytepos and charpos.
>
> I.e. something like the patch below (which passes all tests except for
> `test/src/comp-tests` for a reason that completely escapes me).

Andrea, could you please help Stefan with that test failure?
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-13 4:52 ` Eli Zaretskii
@ 2023-04-13 5:15 ` Eli Zaretskii
  0 siblings, 0 replies; 81+ messages in thread
From: Eli Zaretskii @ 2023-04-13 5:15 UTC
To: akrl; +Cc: 58558, acm, yantar92, larsi, monnier

> Cc: 58558@debbugs.gnu.org, acm@muc.de, yantar92@posteo.net, larsi@gnus.org
> Date: Thu, 13 Apr 2023 07:52:43 +0300
> From: Eli Zaretskii <eliz@gnu.org>
>
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Alan Mackenzie <acm@muc.de>, Eli Zaretskii <eliz@gnu.org>,
> >  larsi@gnus.org, 58558@debbugs.gnu.org
> > Date: Wed, 12 Apr 2023 19:23:19 -0400
> >
> > > For the former, we could probably extend the `b_property` and
> > > `e_property` fields of `gl_state` (which hold charpos) to also store
> > > their bytepos equivalent, which should significantly reduce the number
> > > of conversions between bytepos and charpos.
> >
> > I.e. something like the patch below (which passes all tests except for
> > `test/src/comp-tests` for a reason that completely escapes me).
>
> Andrea, could you please help Stefan with that test failure?

No need, as Stefan has found the problem.  Thanks anyway.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-04-12 14:30 ` Eli Zaretskii
  2023-04-12 14:39 ` Ihor Radchenko
@ 2023-04-12 18:31 ` Alan Mackenzie
  2023-04-12 23:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 81+ messages in thread
From: Alan Mackenzie @ 2023-04-12 18:31 UTC
To: Stefan Monnier; +Cc: 58558, larsi, Ihor Radchenko, Eli Zaretskii

Hello, Stefan.

On Wed, Apr 12, 2023 at 10:06:03 -0400, Stefan Monnier wrote:
> > 1. emacs -Q
> > 2. M-: (require 'cc-langs) <RET>
> > 3. C-x b asd <RET>
> > 4. M-: parse-sexp-lookup-properties <RET> => t

> > On Emacs 28, (4) yields nil.

> I suspect that the patch below might fix the immediate problem.
> Of course, setting `parse-sexp-lookup-properties` should not have such
> a major performance impact, so maybe we should keep digging into
> the problem.

Thanks!  That's a nasty little bug for somebody who hasn't seen it
before.  I don't think the Elisp manual is all that explicit about what
happens in such cases.

Just as a matter of interest, have you searched cc-defs.el for any other
places the same bug might occur?  If not, I will.

I suggest I apply your patch now-ish.

>         Stefan

[ .... ]

--
Alan Mackenzie (Nuremberg, Germany).
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-12 18:31 ` Alan Mackenzie
@ 2023-04-12 23:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-12 23:25 UTC
To: Alan Mackenzie; +Cc: 58558, larsi, Ihor Radchenko, Eli Zaretskii

>> I suspect that the patch below might fix the immediate problem.
>> Of course, setting `parse-sexp-lookup-properties` should not have such
>> a major performance impact, so maybe we should keep digging into
>> the problem.
>
> Thanks!  That's a nasty little bug for somebody who hasn't seen it
> before.  I don't think the Elisp manual is all that explicit about what
> happens in such cases.

According to my reading of the manual, it does explain what happens, but
maybe it should more specifically warn about this kind of interaction.

> Just as a matter of interest, have you searched cc-defs.el for any other
> places the same bug might occur?  If not, I will.

No, among other things because I don't know a good regexp that can help
me look for it.


        Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-12 13:39 ` Ihor Radchenko
  2023-04-12 14:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 4:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-04-13 12:09 ` Ihor Radchenko
  1 sibling, 1 reply; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-04-13 4:43 UTC
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, larsi

> parse_sexp_lookup_properties looks suspicious, so I checked the value of
> parse-sexp-lookup-properties in Org files on master vs. Emacs 28.
>
> On master, the value is t, even though Org mode does not set this
> variable. On Emacs 28, the value is nil.

Any chance you can now give a reproducible recipe of your
big & progressive slowdown?


        Stefan
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-13 4:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-04-13 12:09 ` Ihor Radchenko
  0 siblings, 0 replies; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-13 12:09 UTC
To: Stefan Monnier; +Cc: 58558, Eli Zaretskii, larsi

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Any chance you can now give a reproducible recipe of your
> big & progressive slowdown?

No, unfortunately.
I can only see the slowdown with a specific Org file.
My attempts to obfuscate it for sharing made the progressive slowdown
disappear.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2022-12-13 10:28 ` Ihor Radchenko
  2022-12-13 13:11 ` Eli Zaretskii
@ 2022-12-13 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 81+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-12-13 13:27 UTC
To: Ihor Radchenko; +Cc: 58558, Lars Ingebrigtsen, Eli Zaretskii

> The fraction of buf_bytepos_to_charpos increases over repeated benchmark
> runs.
[...]
> Any ideas what I can do further?

As usual, the problem is either that we call this function too often or
that it takes too much time every time we call it so:

- Try and figure out who is the most frequent caller of
  `buf_bytepos_to_charpos` during your benchmark.
  Most calls to this function can usually be eliminated by changing the
  code to keep track of both bytes and chars at the same time.
  Actually, most of the time we already have the char info somewhere
  nearby, so it might be a simple change.
  `gprof` can often give that info.

- Try and figure out why `buf_bytepos_to_charpos` is so slow.
  Last time we tweaked that code, AFAIK, is commit
  b300052fb4ef1261519b0fd57f5eb186c2d10295.
  My debugging approach for those cases is the following: DEFVAR_LISP
  a new variable in which you put a vector of N integers (initialized to
  0), and then at various "interesting" points in the
  `buf_bytepos_to_charpos`, increment one of the vector elements.
  This way you can see from ELisp how many times each "interesting"
  point was executed.  IOW, I do the profiling counters by hand.


        Stefan
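
Editorial sketch (not from the thread): the "profiling counters by hand" approach described above can be illustrated in standalone C. This is a minimal sketch under loud assumptions: the counter array `probe_count`, the `PROBE` macro, and the toy conversion function with its one-entry cache are all hypothetical stand-ins. In Emacs itself the counters would live in a Lisp vector exposed through DEFVAR_LISP, and the probes would be placed inside the real `buf_bytepos_to_charpos`.

```c
#include <stddef.h>

/* Hypothetical "interesting points" inside the function being studied.  */
enum probe
{
  PROBE_CALLS,            /* Every call.  */
  PROBE_CACHE_HIT,        /* Answered directly from the cache.  */
  PROBE_SCAN_FROM_CACHE,  /* Scanned forward from the cached position.  */
  PROBE_SCAN_FROM_START,  /* Had to rescan from the beginning.  */
  N_PROBES
};

/* Stand-in for the vector of N integers, initialized to 0.  */
static unsigned long probe_count[N_PROBES];

#define PROBE(p) (probe_count[(p)]++)

/* One-entry cache: the last byte position converted and its char position.  */
static ptrdiff_t cached_byte, cached_char;

/* Toy byte->char conversion over UTF-8 TEXT with hand-placed counters.
   BYTEPOS is assumed to lie on a character boundary.  */
static ptrdiff_t
toy_bytepos_to_charpos (const unsigned char *text, ptrdiff_t bytepos)
{
  ptrdiff_t b, c;

  PROBE (PROBE_CALLS);
  if (bytepos == cached_byte)
    {
      PROBE (PROBE_CACHE_HIT);
      return cached_char;
    }
  if (bytepos > cached_byte)
    {
      PROBE (PROBE_SCAN_FROM_CACHE);
      b = cached_byte;
      c = cached_char;
    }
  else
    {
      PROBE (PROBE_SCAN_FROM_START);
      b = 0;
      c = 0;
    }
  for (; b < bytepos; b++)
    if ((text[b] & 0xC0) != 0x80)   /* Not a UTF-8 continuation byte.  */
      c++;
  cached_byte = bytepos;
  cached_char = c;
  return c;
}
```

After running a workload, reading `probe_count` shows which path dominates; a large share of rescans from the start would, in this toy model, point at a cache that is being defeated rather than at too many calls.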
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2022-10-16 10:02 ` Ihor Radchenko
  2022-10-16 10:04 ` Lars Ingebrigtsen
@ 2022-10-16 10:36 ` Eli Zaretskii
  1 sibling, 0 replies; 81+ messages in thread
From: Eli Zaretskii @ 2022-10-16 10:36 UTC
To: Ihor Radchenko; +Cc: 58558, larsi

> Cc: 58558@debbugs.gnu.org
> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Sun, 16 Oct 2022 10:02:25 +0000
>
> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
> > If you switch the buffer to `clean-mode' (which should remove all text
> > props), does the slowdown disappear?  In that case, it seems likely that
> > the slowdown is connected to text properties, somehow.
>
> The slowdown becomes slightly better, but nowhere close to Emacs 28:
>
> ;; Emacs 29
> ;; Elapsed time: 16.953404s
> ;; Emacs 29 + clean-mode
> ;; Elapsed time: 13.290568s
> ;; Emacs 28
> ;; Elapsed time: 0.869748s
>
> I did
>
> (setq yant/re "\\(?:\\(?:\\<DEADLINE: *\\(\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\)\\)\\|\\(?:\\(?:<\\(?:[[:digit:]]\\{4\\}-[[:digit:]]\\{2\\}-[[:digit:]]\\{2\\}\\(?: [[:alpha:]]+\\)?\\)\\(?: [[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\(?:-[[:digit:]]\\{1,2\\}:[[:digit:]]\\{2\\}\\)?\\)?\\(?:\\(?: [+.:-]\\{1,2\\}[[:digit:]]+[dhmwy]\\(?:/[[:digit:]]+[dhmwy]\\)?\\)\\{1,2\\}\\)?>\\)\\|^\\*+[[:blank:]]+\\(?:[[:upper:]]+[[:blank:]]+\\)?\\[#A]\\|^[[:space:]]*:STYLE:[[:space:]]+habit[[:space:]]*$\\)\\)")
> (benchmark-progn (goto-char (point-min)) (while (re-search-forward yant/re nil t)))

AFAICT, the changes in regex-emacs.c between these two versions are
very minor, almost non-existent.  So it sounds like the reason is
somewhere else, not in regexp search per se.

But to be absolutely sure, could you please try building Emacs 29 with
regex-emacs.c from Emacs 28, and see if the slowdown disappears or
not?

Thanks.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2022-10-16 9:34 ` Ihor Radchenko
  2022-10-16 9:37 ` Lars Ingebrigtsen
@ 2023-02-19 12:17 ` Dmitry Gutov
  2023-02-20 10:24 ` Ihor Radchenko
  1 sibling, 1 reply; 81+ messages in thread
From: Dmitry Gutov @ 2023-02-19 12:17 UTC
To: Ihor Radchenko, Lars Ingebrigtsen; +Cc: 58558

On 16/10/2022 12:34, Ihor Radchenko wrote:
> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
>>> It happens consistently in Emacs 29, but not in all buffers. Sometimes,
>>> it only happens after some time after Emacs startup. The slowdown is not
>>> there in Emacs 28.
>> Is there anything special about buffers where you see these slowdowns?
> This is a large complex Org buffer.
>

It seems like it might be helpful to upload the document somewhere, so
that people can also reproduce it on their own.

Because I tried this with an Org doc lying around, and couldn't see the
problem.

You can probably replace all the characters with X or x to anonymize any
sensitive information, if that's a concern.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-02-19 12:17 ` Dmitry Gutov
@ 2023-02-20 10:24 ` Ihor Radchenko
  2023-02-20 14:54 ` Dmitry Gutov
  0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-02-20 10:24 UTC
To: Dmitry Gutov; +Cc: 58558, Lars Ingebrigtsen

Dmitry Gutov <dgutov@yandex.ru> writes:

> It seems like it might be helpful to upload the document somewhere, so
> that people can also reproduce it on their own.

Unfortunately not. I can only reproduce using my config.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-02-20 10:24 ` Ihor Radchenko
@ 2023-02-20 14:54 ` Dmitry Gutov
  0 siblings, 0 replies; 81+ messages in thread
From: Dmitry Gutov @ 2023-02-20 14:54 UTC
To: Ihor Radchenko; +Cc: 58558, Lars Ingebrigtsen

On 20/02/2023 12:24, Ihor Radchenko wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
>
>> It seems like it might be helpful to upload the document somewhere, so
>> that people can also reproduce it on their own.
> Unfortunately not. I can only reproduce using my config.

Bisecting it could also be an option. But this can be a pain, I realize.
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2022-10-16 1:26 bug#58558: 29.0.50; re-search-forward is slow in some buffers Ihor Radchenko
  2022-10-16 9:19 ` Lars Ingebrigtsen
@ 2023-04-10 8:48 ` Mattias Engdegård
  2023-04-10 9:57 ` Ihor Radchenko
  1 sibling, 1 reply; 81+ messages in thread
From: Mattias Engdegård @ 2023-04-10 8:48 UTC
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 220 bytes --]

Ihor, would you consider the possibility of regexp cache thrashing? It does occur from time to time; that cache is quite small. Try this instrumentation patch. (We should probably have something like it permanently.)

[-- Attachment #2: 0001-Add-regexp-cache-hit-miss-counters.patch --]
[-- Type: application/octet-stream, Size: 1785 bytes --]

From 978ce66e9bd50da11997aeadcc3508549863a116 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Sat, 7 Nov 2020 17:00:53 +0100
Subject: [PATCH 1/2] Add regexp cache hit/miss counters

---
 src/search.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/search.c b/src/search.c
index 4eb634a3c0..358b82da2e 100644
--- a/src/search.c
+++ b/src/search.c
@@ -222,7 +222,10 @@ compile_pattern (Lisp_Object pattern, struct re_registers *regp,
               || EQ (cp->syntax_table, BVAR (current_buffer, syntax_table)))
           && !NILP (Fequal (cp->f_whitespace_regexp, Vsearch_spaces_regexp))
           && cp->buf.charset_unibyte == charset_unibyte)
-        break;
+        {
+          regexp_cache_hit++;
+          break;
+        }
 
       /* If we're at the end of the cache,
          compile into the last (least recently used) non-busy cell in the cache.  */
@@ -234,6 +237,7 @@ compile_pattern (Lisp_Object pattern, struct re_registers *regp,
           cp = *cpp;
         compile_it:
           eassert (!cp->busy);
+          regexp_cache_miss++;
           compile_pattern_1 (cp, pattern, translate, posix);
           break;
         }
@@ -3390,6 +3394,13 @@ syms_of_search (void)
 is to bind it with `let' around a small expression.  */);
   Vinhibit_changing_match_data = Qnil;
 
+  DEFVAR_INT("regexp-cache-hit", regexp_cache_hit,
+             doc: /* Regexp cache hit count.  Internal use only.  */);
+  regexp_cache_hit = 0;
+  DEFVAR_INT("regexp-cache-miss", regexp_cache_miss,
+             doc: /* Regexp cache miss count.  Internal use only.  */);
+  regexp_cache_miss = 0;
+
   defsubr (&Slooking_at);
   defsubr (&Sposix_looking_at);
   defsubr (&Sstring_match);
--
2.21.1 (Apple Git-122.3)
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-10 8:48 ` Mattias Engdegård
@ 2023-04-10 9:57 ` Ihor Radchenko
  2023-04-10 10:05 ` Mattias Engdegård
  0 siblings, 1 reply; 81+ messages in thread
From: Ihor Radchenko @ 2023-04-10 9:57 UTC
To: Mattias Engdegård
Cc: 58558, Eli Zaretskii, Ihor Radchenko, Stefan Monnier

Mattias Engdegård <mattias.engdegard@gmail.com> writes:

> Ihor, would you consider the possibility of regexp cache thrashing? It does occur from time to time; that cache is quite small. Try this instrumentation patch. (We should probably have something like it permanently.)

Generating agenda with Emacs master + your patch:

:regexp-cache-hit: 6225399 :regexp-cache-miss: 109490

Emacs 28.3 + your patch:
:regexp-cache-hit: 4968571 :regexp-cache-miss: 79637

Also, I tried to play around with increasing REGEXP_CACHE_SIZE in the
past. It does not make noticeable difference in my setup.

--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* bug#58558: 29.0.50; re-search-forward is slow in some buffers
  2023-04-10 9:57 ` Ihor Radchenko
@ 2023-04-10 10:05 ` Mattias Engdegård
  0 siblings, 0 replies; 81+ messages in thread
From: Mattias Engdegård @ 2023-04-10 10:05 UTC
To: Ihor Radchenko; +Cc: 58558, Eli Zaretskii, Ihor Radchenko, Stefan Monnier

On 10 Apr 2023, at 11:57, Ihor Radchenko <yantar92@posteo.net> wrote:

> Generating agenda with Emacs master + your patch:
>
> :regexp-cache-hit: 6225399 :regexp-cache-miss: 109490
>
> Emacs 28.3 + your patch:
> :regexp-cache-hit: 4968571 :regexp-cache-miss: 79637

Those miss rates are similar (1.7 % and 1.5 %, respectively) although rather higher than we'd like. Probably no serious regexp cache thrashing going on then, but it was good to be able to exclude it, thank you for humouring me!

> Also, I tried to play around with increasing REGEXP_CACHE_SIZE in the
> past. It does not make noticeable difference in my setup.

Right, that's consistent with the data collected above.
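
Editorial sketch (not from the thread): the miss rates discussed above can be recomputed from the two counters. The helper below is a hypothetical illustration that assumes the rate is defined as misses divided by total lookups (hits plus misses); the thread does not say which definition was used, and the name `regexp_cache_miss_rate` is made up for this sketch.

```c
/* Return the cache miss rate in percent, defined here (an assumption)
   as misses / (hits + misses).  Returns 0 when there were no lookups.  */
static double
regexp_cache_miss_rate (long long hits, long long misses)
{
  long long total = hits + misses;
  return total > 0 ? 100.0 * (double) misses / (double) total : 0.0;
}
```

With the counter values reported in the thread, `regexp_cache_miss_rate (6225399, 109490)` comes out at roughly 1.7 % for Emacs master and `regexp_cache_miss_rate (4968571, 79637)` at roughly 1.6 % for Emacs 28.3, which is in line with the conclusion that the two builds behave similarly in this respect.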