From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:2:bcc0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms0.migadu.com with LMTPS id aIwzLdP9k2C8fAAAgWs5BA (envelope-from ) for ; Thu, 06 May 2021 16:31:47 +0200 Received: from aspmx1.migadu.com ([2001:41d0:2:bcc0::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id KKjZKNP9k2CwbgAAB5/wlQ (envelope-from ) for ; Thu, 06 May 2021 14:31:47 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id DFD9A20717 for ; Thu, 6 May 2021 16:31:46 +0200 (CEST) Received: from localhost ([::1]:50732 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lef2X-0005OP-UC for larch@yhetil.org; Thu, 06 May 2021 10:31:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44204) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lef0h-0005It-2O for emacs-orgmode@gnu.org; Thu, 06 May 2021 10:29:51 -0400 Received: from mail-lf1-x136.google.com ([2a00:1450:4864:20::136]:41621) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lef0e-00055C-Vo for emacs-orgmode@gnu.org; Thu, 06 May 2021 10:29:50 -0400 Received: by mail-lf1-x136.google.com with SMTP id z9so8050152lfu.8 for ; Thu, 06 May 2021 07:29:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:in-reply-to:references:cc:date:message-id :mime-version; bh=M0i/WLBBT0uQtJPRo1Eoq6s0mgWODRfbYLVdi8Umyko=; b=LIuwX8S0VNsurV2k16JTth5V80EliiHmcXrK/Z0hQB4zaeyLzYVJAX8OSnHd04IUWV k2ll/nzMvnYujalVYi18MOnIaD2MbyJWCSlSNe78WxUu2NyzhyDInzw7oTxlZoE7iC9d 1pGHYzbbAzGZaCn4n3b5XWp1hibz0pJLwtKKCc8QP6sZTIwVSt4VNuxbBTqyCOWuMWBv KOplGiB0Nula9WniGQKK1czCLl9Egs1O+z9UrXC7534guXHB+SA0mQwXAZgxpQ/P1yzT 0KLOQSxKGH7UKTNzdtHpke3VQT2NMXBlY2vVVOmCIOhV5ZHoPy4GVjyZm9cwIwJI3QN6 c8xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:in-reply-to:references:cc:date :message-id:mime-version; bh=M0i/WLBBT0uQtJPRo1Eoq6s0mgWODRfbYLVdi8Umyko=; b=et8n6C5dRWkXN51JvZ7LYVM3F2fSQSvf6BIzP5f8+QUI/sW3fBwXEIHTIfGujoMFHp DXhbHqeKtJ2BFm+QjVBaxMLyHH3S0mDMVcjVyo6MWCe36efKUiP3cBYQrC5UzR+t2HBY 1teoZ6kyIA3JfzGR/jU4XferroFN0pPHJpQNJUaNZpbiGplS3eRscs//0PRsR/J1QfCc LJ7A8ARjlDgi+JL9wa0XmrdatqvD52/kv9ktfDWn6z1+5wcdzDoof0qxwGrUvTcOjYxu qAh4SfDLhmzmPv5Q/mbNAsnH61PJVbetONvL+tqpdgT8nS2lupUl1ObGpFmQuHmavuq/ j1zw== X-Gm-Message-State: AOAM533Io6ndbHfNx44x+p6IaBTBY+ZFax6+KRAEozj/33e3YN5KPr0o eCDRQXUEPb4WotuJlJv4SE0= X-Google-Smtp-Source: ABdhPJzLaz2wVKX0liX+jvv57ccMDDrerxMT2aTTXreznm4taprZkoA3wOW6li5hzqhluQb2F6OVdw== X-Received: by 2002:a19:e21d:: with SMTP id z29mr3209221lfg.175.1620311387010; Thu, 06 May 2021 07:29:47 -0700 (PDT) Received: from localhost ([141.105.67.194]) by smtp.gmail.com with ESMTPSA id z10sm715915lfq.45.2021.05.06.07.29.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 May 2021 07:29:46 -0700 (PDT) From: Ihor Radchenko To: Maxim Nikulin Subject: Re: [PATCH] Use cache in org-up-heading-safe In-Reply-To: References: <874kfieduh.fsf@localhost> Date: Thu, 06 May 2021 22:34:13 +0800 Message-ID: <87tung2aoq.fsf@localhost> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Received-SPF: pass client-ip=2a00:1450:4864:20::136; envelope-from=yantar92@gmail.com; helo=mail-lf1-x136.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-orgmode@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "General discussions about Org-mode." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: emacs-orgmode@gnu.org Errors-To: emacs-orgmode-bounces+larch=yhetil.org@gnu.org Sender: "Emacs-orgmode" X-Migadu-Flow: FLOW_IN ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1620311507; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post:dkim-signature; bh=M0i/WLBBT0uQtJPRo1Eoq6s0mgWODRfbYLVdi8Umyko=; b=X9fFmyfGruhy5KrRfCUertpiov3J7Jw09SwBM8ETFE4I3+DxynEV33h/96LuH9WuyPLAyG KDhgipoM1JTa1xyLvrRVnPEmJUr5b20erhV0Nu+Pjfm7a9pWWvFjNrsbiWRQqT1dHxvYKD bHTSzjeWfCi+5Z2uo1Rqwq1ptD5VYMn8T4TNAqaBem2JYoY56uOAOnsvn4pl9TDWcrbYh5 d6yG+gemT2/wf9BUCr9YoSaU2Eg6yItF6W5a+fEGl7BQDNkn/7kZDJxD3zCFg6uk6Ofoa2 NraohnyA7GqaKAaNj+8yJMieJO0XXThuGLf83H3s/OV8Hne6jM2I7tEOOfk7Rw== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1620311507; a=rsa-sha256; cv=none; b=UFMpxPTifNCqOyvWMtTP5Cc2jvO/R6tAYxwWnuodI702SpMKdnaN/mNv3+CVbPgbdmvqJ3 XVXVAOWhQsLJ0TIJE2SalI+ptwCBb+cJvpz7/xDrdKq8Z5X+IumMu3/uuzMqKfvDWTuqeP 5gUgoYPHj+s6nKm3lH5AfGo/eaoQb6UZOiBY7rVP9+XNZWeGCeDHrE/yRKi/KjR3FaJaip 342Sm/GF+T69OK2rFxNG63eanF6ZUp4+pguiaFb+DSm2+TsI1YgRT72W12TCwV9Fqg6z54 ELMoQtUnEPo5jbaFfGEwOfeu6Ww6KgsuUvECxwzh9YjX3BhUSfllWZga7ag95w== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=pass header.d=gmail.com header.s=20161025 header.b=LIuwX8S0; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (aspmx1.migadu.com: domain of emacs-orgmode-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=emacs-orgmode-bounces@gnu.org X-Migadu-Spam-Score: -1.16 Authentication-Results: aspmx1.migadu.com; dkim=pass header.d=gmail.com header.s=20161025 header.b=LIuwX8S0; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (aspmx1.migadu.com: domain of emacs-orgmode-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=emacs-orgmode-bounces@gnu.org X-Migadu-Queue-Id: DFD9A20717 X-Spam-Score: -1.16 X-Migadu-Scanner: scn0.migadu.com X-TUID: lA3eqvmx/mDX --=-=-= Content-Type: text/plain Maxim Nikulin writes: > I have not tested the patch due to I do not use agenda. My interest was > stimulated solely by my attempts to make org-refile-get-targets faster. Thanks for the feedback! > I consider it as an improvement. I have noticed the only thing that must > be fixed: comments with old variant of the function. Reaction to my > other comments is optional. Oops. Fixed. > In org-agenda.el org-up-heading-safe function is called only from > org-find-top-headline. That's probably not the slowest part. org-agenda also calls org-up-heading-safe indirectly. In particular, org-get-tags is calling org-up-heading-safe to get inherited tags. It is org-get-tags that is taking >50% time of agenda generation in my agendas. And org-up-heading-safe was the largest contributor to that >50% time. > ... So the purpose of the dance with backward > searches is to get top level headings. My expectation is that scan > through the whole buffer and storing result in a structure that allows > binary search of position (array, red-black tree, etc) may be even > faster. Scan through the whole buffer could be faster, but it is not always desired. Most of Org code only need information for current headline. Re-scanning the whole buffer would be costly. Also, I tried to compare avl-tree with hash table. However, avl-tree does not give much benefit for my test case of an agenda with ~12000 todo items. Since avl-tree requires much more custom code, hash table should be better. ;; No cache ;; org-up-heading-safe 180493 285.37713697 0.0015810980 ;; org-up-heading-safe 134876 199.44584854 0.0014787349 ;; org-up-heading-safe 134872 198.23494205 0.0014698005 ;; Hash cache ;; org-up-heading-safe 180493 5.0327451709 2.788...e-05 ;; org-up-heading-safe 134872 1.3568663460 1.006...e-05 ;; org-up-heading-safe 134872 0.7527555679 5.581...e-06 ;; AVL cache ;; org-up-heading-safe 180500 5.1044455589 2.827...e-05 ;; org-up-heading-safe 134872 1.2781428280 9.476...e-06 ;; org-up-heading-safe 134872 1.2866258809 9.539...e-06 > Alternatively lazily obtained position of top heading could be stored in > cache to reduce number of iterations on org-find-top-line. That one is used for org-agenda-filter-by-top-headline. I do not really use it, so I have no clue if there is much need to optimise it specifically. Yet, caching org-up-heading-safe will indeed speed it up as well. >> + (let ((level-cache (gethash (point) org--up-heading-cache))) >> + (if (and level-cache >> + (eq (buffer-chars-modified-tick) org--up-heading-cache-tick)) > > If buffer-chars-modified-tick is faster than gethash then reordering > might result in very slight improvement of performance. Sure. Done. >> + ;; Parent is inside accessible part of the buffer. >> + (progn (goto-char level-cache) >> + (funcall outline-level))) > > I do not see any reason why outline level can not be cached in pair with > position. Hmm. You are right. Benchmarking with and without additional caching shows that it can give extra ~2x improvement: ;; Hash cache ;; org-up-heading-safe 180493 5.0327451709 2.788...e-05 ;; org-up-heading-safe 134872 1.3568663460 1.006...e-05 ;; org-up-heading-safe 134872 0.7527555679 5.581...e-06 ;; Hash cache + outline-level cache ;; org-up-heading-safe 180507 4.3326449169 2.400...e-05 ;; org-up-heading-safe 134872 0.4952472380 3.671...e-06 ;; org-up-heading-safe 134879 0.5298789350 3.928...e-06 So, outline-level is also cached in the new version of the patch. >> + (let (result) >> + (setq result > > I am not a lisp guru, so from my point of view this can be reduced to > (let ((result ... Sure. >> + (format "^\\*\\{1,%d\\} " level-up) nil t) > > \t as an alternative to " " is used in org-refile.el, > e.g. "^\\*+[ \t]+". Unsure which variant is canonical one and I see that > regexp is taken from original function, so no regression is expected. I do not know either. Actually, I wish if Org mode code base used less literal strings and more regexp variables from org-element.el and org.el. Best, Ihor --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0001-Use-cache-in-org-up-heading-safe.patch >From d40b2bbb1dc5113d3085be2fae52ebe5ac1b023d Mon Sep 17 00:00:00 2001 Message-Id: From: Ihor Radchenko Date: Thu, 6 May 2021 14:13:20 +0800 Subject: [PATCH] Use cache in org-up-heading-safe * lisp/org.el (org-up-heading-safe): Use buffer-local cache to store positions of parent headings. The cache is invalidated when buffer text is changed, according to `buffer-chars-modified-tick'. (org--up-heading-cache): Buffer-local hash-table storing the cache. (org--up-heading-cache-tick): The buffer modification state for currently active `org--up-heading-cache'. The optimisation is critical when running agenda or org-entry-get queries using property/tag inheritance. In such scenarios `org-up-heading-safe' can be called thousands of times. For example, building todo agenda on large number of headings lead to the following benchmark results: Benchmark: 1. (elp-instrument-function #'org-up-heading-safe) 2. Run agenda 3. (elp-results) ;; function, # calls, total time, avg time Without cache: org-up-heading-safe 27555 8.4234025759 0.0003056941 With cache, first run: org-up-heading-safe 23227 0.5300747539 2.282...e-05 With cache, second run on unchanged buffer: org-up-heading-safe 23227 0.1447754880 6.233...e-06 The final speedup is 1-2 orders of magnitude (~15-56 times). --- lisp/org.el | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-) diff --git a/lisp/org.el b/lisp/org.el index 0ff13c79c..c7e87a804 100644 --- a/lisp/org.el +++ b/lisp/org.el @@ -20440,6 +20440,10 @@ (defun org-up-heading-all (arg) With argument, move up ARG levels." (outline-up-heading arg t)) +(defvar-local org--up-heading-cache (make-hash-table) + "Buffer-local `org-up-heading-safe' cache.") +(defvar-local org--up-heading-cache-tick nil + "Buffer `buffer-chars-modified-tick' in `org--up-heading-cache'.") (defun org-up-heading-safe () "Move to the heading line of which the present line is a subheading. This version will not throw an error. It will return the level of the @@ -20449,10 +20453,26 @@ (defun org-up-heading-safe () because it relies on stars being the outline starters. This can really make a significant difference in outlines with very many siblings." (when (ignore-errors (org-back-to-heading t)) - (let ((level-up (1- (funcall outline-level)))) - (and (> level-up 0) - (re-search-backward (format "^\\*\\{1,%d\\} " level-up) nil t) - (funcall outline-level))))) + (let (level-cache) + (if (and (eq (buffer-chars-modified-tick) org--up-heading-cache-tick) + (setq level-cache (gethash (point) org--up-heading-cache))) + (when (<= (point-min) (car level-cache) (point-max)) + ;; Parent is inside accessible part of the buffer. + (progn (goto-char (car level-cache)) + (cdr level-cache))) + ;; Buffer modified. Invalidate cache. + (unless (eq (buffer-chars-modified-tick) org--up-heading-cache-tick) + (setq-local org--up-heading-cache-tick + (buffer-chars-modified-tick)) + (clrhash org--up-heading-cache)) + (let* ((level-up (1- (funcall outline-level))) + (pos (point)) + (result (and (> level-up 0) + (re-search-backward + (format "^\\*\\{1,%d\\} " level-up) nil t) + (funcall outline-level)))) + (when result (puthash pos (cons (point) result) org--up-heading-cache)) + result))))) (defun org-up-heading-or-point-min () "Move to the heading line of which the present is a subheading, or point-min. -- 2.26.3 --=-=-=--