From: Ihor Radchenko <yantar92@gmail.com>
To: Maxim Nikulin <manikulin@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: [PATCH] Use cache in org-up-heading-safe
Date: Thu, 06 May 2021 22:34:13 +0800 [thread overview]
Message-ID: <87tung2aoq.fsf@localhost> (raw)
In-Reply-To: <s6uhpi$kek$1@ciao.gmane.io>
[-- Attachment #1: Type: text/plain, Size: 4271 bytes --]
Maxim Nikulin <manikulin@gmail.com> writes:
> I have not tested the patch due to I do not use agenda. My interest was
> stimulated solely by my attempts to make org-refile-get-targets faster.
Thanks for the feedback!
> I consider it as an improvement. I have noticed the only thing that must
> be fixed: comments with old variant of the function. Reaction to my
> other comments is optional.
Oops. Fixed.
> In org-agenda.el org-up-heading-safe function is called only from
> org-find-top-headline.
That's probably not the slowest part. org-agenda also calls
org-up-heading-safe indirectly. In particular, org-get-tags is calling
org-up-heading-safe to get inherited tags. It is org-get-tags that is
taking >50% time of agenda generation in my agendas. And
org-up-heading-safe was the largest contributor to that >50% time.
> ... So the purpose of the dance with backward
> searches is to get top level headings. My expectation is that scan
> through the whole buffer and storing result in a structure that allows
> binary search of position (array, red-black tree, etc) may be even
> faster.
Scan through the whole buffer could be faster, but it is not always
desired. Most of Org code only need information for current headline.
Re-scanning the whole buffer would be costly.
Also, I tried to compare avl-tree with hash table. However, avl-tree
does not give much benefit for my test case of an agenda with ~12000
todo items. Since avl-tree requires much more custom code, hash table
should be better.
;; No cache
;; org-up-heading-safe 180493 285.37713697 0.0015810980
;; org-up-heading-safe 134876 199.44584854 0.0014787349
;; org-up-heading-safe 134872 198.23494205 0.0014698005
;; Hash cache
;; org-up-heading-safe 180493 5.0327451709 2.788...e-05
;; org-up-heading-safe 134872 1.3568663460 1.006...e-05
;; org-up-heading-safe 134872 0.7527555679 5.581...e-06
;; AVL cache
;; org-up-heading-safe 180500 5.1044455589 2.827...e-05
;; org-up-heading-safe 134872 1.2781428280 9.476...e-06
;; org-up-heading-safe 134872 1.2866258809 9.539...e-06
> Alternatively lazily obtained position of top heading could be stored in
> cache to reduce number of iterations on org-find-top-line.
That one is used for org-agenda-filter-by-top-headline. I do not really
use it, so I have no clue if there is much need to optimise it
specifically. Yet, caching org-up-heading-safe will indeed speed it up
as well.
>> + (let ((level-cache (gethash (point) org--up-heading-cache)))
>> + (if (and level-cache
>> + (eq (buffer-chars-modified-tick) org--up-heading-cache-tick))
>
> If buffer-chars-modified-tick is faster than gethash then reordering
> might result in very slight improvement of performance.
Sure. Done.
>> + ;; Parent is inside accessible part of the buffer.
>> + (progn (goto-char level-cache)
>> + (funcall outline-level)))
>
> I do not see any reason why outline level can not be cached in pair with
> position.
Hmm. You are right. Benchmarking with and without additional caching
shows that it can give extra ~2x improvement:
;; Hash cache
;; org-up-heading-safe 180493 5.0327451709 2.788...e-05
;; org-up-heading-safe 134872 1.3568663460 1.006...e-05
;; org-up-heading-safe 134872 0.7527555679 5.581...e-06
;; Hash cache + outline-level cache
;; org-up-heading-safe 180507 4.3326449169 2.400...e-05
;; org-up-heading-safe 134872 0.4952472380 3.671...e-06
;; org-up-heading-safe 134879 0.5298789350 3.928...e-06
So, outline-level is also cached in the new version of the patch.
>> + (let (result)
>> + (setq result
>
> I am not a lisp guru, so from my point of view this can be reduced to
> (let ((result ...
Sure.
>> + (format "^\\*\\{1,%d\\} " level-up) nil t)
>
> \t as an alternative to " " is used in org-refile.el,
> e.g. "^\\*+[ \t]+". Unsure which variant is canonical one and I see that
> regexp is taken from original function, so no regression is expected.
I do not know either. Actually, I wish if Org mode code base used less
literal strings and more regexp variables from org-element.el and
org.el.
Best,
Ihor
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Use-cache-in-org-up-heading-safe.patch --]
[-- Type: text/x-diff, Size: 3775 bytes --]
From d40b2bbb1dc5113d3085be2fae52ebe5ac1b023d Mon Sep 17 00:00:00 2001
Message-Id: <d40b2bbb1dc5113d3085be2fae52ebe5ac1b023d.1620299446.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Thu, 6 May 2021 14:13:20 +0800
Subject: [PATCH] Use cache in org-up-heading-safe
* lisp/org.el (org-up-heading-safe): Use buffer-local cache to store
positions of parent headings. The cache is invalidated when buffer
text is changed, according to `buffer-chars-modified-tick'.
(org--up-heading-cache): Buffer-local hash-table storing the cache.
(org--up-heading-cache-tick): The buffer modification state for
currently active `org--up-heading-cache'.
The optimisation is critical when running agenda or org-entry-get
queries using property/tag inheritance. In such scenarios
`org-up-heading-safe' can be called thousands of times. For example,
building todo agenda on large number of headings lead to the following
benchmark results:
Benchmark:
1. (elp-instrument-function #'org-up-heading-safe)
2. Run agenda
3. (elp-results) ;; function, # calls, total time, avg time
Without cache:
org-up-heading-safe 27555 8.4234025759 0.0003056941
With cache, first run:
org-up-heading-safe 23227 0.5300747539 2.282...e-05
With cache, second run on unchanged buffer:
org-up-heading-safe 23227 0.1447754880 6.233...e-06
The final speedup is 1-2 orders of magnitude (~15-56 times).
---
lisp/org.el | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/lisp/org.el b/lisp/org.el
index 0ff13c79c..c7e87a804 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -20440,6 +20440,10 @@ (defun org-up-heading-all (arg)
With argument, move up ARG levels."
(outline-up-heading arg t))
+(defvar-local org--up-heading-cache (make-hash-table)
+ "Buffer-local `org-up-heading-safe' cache.")
+(defvar-local org--up-heading-cache-tick nil
+ "Buffer `buffer-chars-modified-tick' in `org--up-heading-cache'.")
(defun org-up-heading-safe ()
"Move to the heading line of which the present line is a subheading.
This version will not throw an error. It will return the level of the
@@ -20449,10 +20453,26 @@ (defun org-up-heading-safe ()
because it relies on stars being the outline starters. This can really
make a significant difference in outlines with very many siblings."
(when (ignore-errors (org-back-to-heading t))
- (let ((level-up (1- (funcall outline-level))))
- (and (> level-up 0)
- (re-search-backward (format "^\\*\\{1,%d\\} " level-up) nil t)
- (funcall outline-level)))))
+ (let (level-cache)
+ (if (and (eq (buffer-chars-modified-tick) org--up-heading-cache-tick)
+ (setq level-cache (gethash (point) org--up-heading-cache)))
+ (when (<= (point-min) (car level-cache) (point-max))
+ ;; Parent is inside accessible part of the buffer.
+ (progn (goto-char (car level-cache))
+ (cdr level-cache)))
+ ;; Buffer modified. Invalidate cache.
+ (unless (eq (buffer-chars-modified-tick) org--up-heading-cache-tick)
+ (setq-local org--up-heading-cache-tick
+ (buffer-chars-modified-tick))
+ (clrhash org--up-heading-cache))
+ (let* ((level-up (1- (funcall outline-level)))
+ (pos (point))
+ (result (and (> level-up 0)
+ (re-search-backward
+ (format "^\\*\\{1,%d\\} " level-up) nil t)
+ (funcall outline-level))))
+ (when result (puthash pos (cons (point) result) org--up-heading-cache))
+ result)))))
(defun org-up-heading-or-point-min ()
"Move to the heading line of which the present is a subheading, or point-min.
--
2.26.3
next prev parent reply other threads:[~2021-05-06 14:31 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-04 15:08 [PATCH] Use cache in org-up-heading-safe Ihor Radchenko
2021-05-05 16:40 ` Maxim Nikulin
2021-05-06 14:34 ` Ihor Radchenko [this message]
2021-05-06 17:02 ` Maxim Nikulin
2021-05-07 2:08 ` Ihor Radchenko
2021-05-08 11:28 ` Maxim Nikulin
2021-05-10 15:14 ` Ihor Radchenko
2021-05-15 11:58 ` Bastien
2021-05-16 6:15 ` Bastien
2021-05-16 6:36 ` Ihor Radchenko
2021-05-16 8:53 ` Bastien
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.orgmode.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87tung2aoq.fsf@localhost \
--to=yantar92@gmail.com \
--cc=emacs-orgmode@gnu.org \
--cc=manikulin@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).