emacs-orgmode@gnu.org archives
* [PATCH] Rewrite `org-clock-sum'
@ 2024-04-27 13:13 Morgan Smith
  2024-04-30 10:59 ` Ihor Radchenko
  0 siblings, 1 reply; 2+ messages in thread
From: Morgan Smith @ 2024-04-27 13:13 UTC (permalink / raw)
  To: emacs-orgmode

Hello!

I may have rewritten org-clock-sum yet again.  See attached patch.

* things I want you to tell me
1. Does this look like something that could be eventually merged upstream or am
   I wasting my time?
2. Would you like me to do more performance testing?  I basically only tested
   my use case.  If yes, should I create some test files for benchmarking that
   can be shared?
3. Do you want `org-element-cache-map' fixed before we merge this patch?  If
   yes, please be willing to wait.  I have already spent probably about 8 hours
   looking into it and it still makes my head hurt.

* todo
The patch is like 95% done.  I still gotta

1. Write a decent docstring for `org-clock-ranges'.  Maybe add a news entry for
   it too.

2. Check `org-clock-hd-marker' for open clock.

3. Figure out what to do about open clocks that aren't the current
   one. Historically we ignored them so I guess I should just do that.

4. Maybe test clocking in inlinetasks.  I honestly don't even know what these
   are.

* Benefits of my rewrite

1. New function `org-clock-ranges' which should help third party packages with
   clock range visualization stuff

2. Performance (see table below)
   - We run the filter before doing all the clock range calculations unlike
     before so aggressive filters should run much faster (I didn't test this
     though).

3. Code is easier to understand (subjective)
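
As a rough illustration of benefit 1, a third-party package could normalize
the three range formats listed in the `org-clock-ranges' docstring to minutes
along these lines.  This is only a sketch: `my/clock-range-minutes' and its
NOW argument are hypothetical and not part of the patch, and unlike
`org-clock-sum' it does not round down to whole minutes.

#+BEGIN_SRC elisp
(defun my/clock-range-minutes (range &optional now)
  "Return the length of RANGE in minutes as a float.
RANGE is (START . END), (START . now), or a float number of minutes.
NOW is a Lisp time value used as the end of a running clock; it
defaults to the current time.  Hypothetical helper, not part of Org."
  (pcase range
    ;; A bare duration is already expressed in minutes.
    ((pred floatp) range)
    ;; Running clock: the end of the range is the symbol `now'.
    (`(,start . now)
     (/ (- (float-time (or now (current-time))) (float-time start)) 60))
    ;; Closed clock: both ends are Lisp time values.
    (`(,start . ,end)
     (/ (- (float-time end) (float-time start)) 60))))
#+END_SRC

The running-clock clause has to come before the generic cons clause,
otherwise the symbol `now' would be passed to `float-time'.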

* Downsides of my rewrite

1. Does it still perform better with the cache disabled?  idk.  Probably not.

2. Radical change.  Likely has bugs

3. Dances around bugs in `org-element-cache-map' but does it actually dance
   around all of them?

* Performance
I didn't see a big difference on the third run so I assume run 1 is with a cold
cache (obtained by running `org-element-cache-reset') and run 2 is with a warm
cache.

I have an almost 3M file of clocking data.  In it I have this source block
which I use to update my 10 clocktables:

#+BEGIN_SRC elisp
(let (;; (gc-cons-threshold (* 50 1000 1000))
      (start-time (current-time)))
  (org-dblock-update t)
  (time-to-seconds (time-since start-time)))
#+END_SRC
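
A variant that makes the cold/warm distinction explicit could look roughly
like the following.  It is a sketch, assuming it is evaluated in the Org
buffer that holds the clocktables; `my/bench-dblock-update' is a hypothetical
name, and `benchmark-run' also reports how much of the time went to garbage
collection.

#+BEGIN_SRC elisp
(require 'benchmark)
(require 'org)
(require 'org-element)

(defun my/bench-dblock-update ()
  "Time updating all dynamic blocks with a cold, then a warm cache.
Return (COLD WARM), each a list (ELAPSED GC-COUNT GC-ELAPSED) as
returned by `benchmark-run'.  Hypothetical helper, not part of Org."
  ;; Drop the element cache so the first run starts cold.
  (org-element-cache-reset)
  (let* ((cold (benchmark-run 1 (org-dblock-update t)))
         ;; The second run reuses whatever the first run cached.
         (warm (benchmark-run 1 (org-dblock-update t))))
    (list cold warm)))
#+END_SRC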

The time results are as follows


| patch       | run # | gc-cons-threshold |     time (s) |
|-------------+-------+-------------------+--------------|
| origin/main |     1 |            800000 | 59.824324488 |
| mine        |     1 |            800000 | 33.397901059 |
| origin/main |     2 |            800000 | 48.354095581 |
| mine        |     2 |            800000 | 23.581749901 |
| origin/main |     1 |          50000000 | 41.856530738 |
| mine        |     1 |          50000000 | 30.237918254 |
| origin/main |     2 |          50000000 | 33.944309156 |
| mine        |     2 |          50000000 |  19.84887913 |


[-- Attachment #2: 0001-lisp-org-clock.el-org-clock-sum-Rewrite-using-elemen.patch --]
[-- Type: text/x-patch, Size: 10900 bytes --]

From bfc01710186be01aab2186762cf678d360c5476e Mon Sep 17 00:00:00 2001
From: Morgan Smith <Morgan.J.Smith@outlook.com>
Date: Thu, 11 Apr 2024 12:23:21 -0400
Subject: [PATCH] lisp/org-clock.el (org-clock-sum): Rewrite using element api

---
 lisp/org-clock.el | 191 +++++++++++++++++++++++-----------------------
 1 file changed, 94 insertions(+), 97 deletions(-)

diff --git a/lisp/org-clock.el b/lisp/org-clock.el
index 65a54579a..8731d6ee5 100644
--- a/lisp/org-clock.el
+++ b/lisp/org-clock.el
@@ -33,15 +33,13 @@
 
 (require 'cl-lib)
 (require 'org)
+(require 'org-element)
 
 (declare-function calendar-iso-to-absolute "cal-iso" (date))
 (declare-function notifications-notify "notifications" (&rest params))
 (declare-function org-element-property "org-element-ast" (property node))
-(declare-function org-element-contents-end "org-element" (node))
-(declare-function org-element-end "org-element" (node))
 (declare-function org-element-type "org-element-ast" (node &optional anonymous))
 (declare-function org-element-type-p "org-element-ast" (node types))
-(defvar org-element-use-cache)
 (declare-function org-inlinetask-at-task-p "org-inlinetask" ())
 (declare-function org-inlinetask-goto-beginning "org-inlinetask" ())
 (declare-function org-inlinetask-goto-end "org-inlinetask" ())
@@ -1998,6 +1996,9 @@ With prefix arg SELECT, offer recently clocked tasks for selection."
     (org-clock-sum (car r) (cadr r)
 		   headline-filter (or propname :org-clock-minutes-custom))))
 
+;;; TODO:
+;; Maybe add more tests?
+;; Are there tests for inlinetasks?
 ;;;###autoload
 (defun org-clock-sum (&optional tstart tend headline-filter propname)
   "Sum the times for each subtree.
@@ -2008,100 +2009,62 @@ each headline in the time range with point at the headline.  Headlines for
 which HEADLINE-FILTER returns nil are excluded from the clock summation.
 PROPNAME lets you set a custom text property instead of :org-clock-minutes."
   (with-silent-modifications
-    (let* ((re (concat "^\\(\\*+\\)[ \t]\\|^[ \t]*"
-		       org-clock-string
-		       "[ \t]*\\(?:\\(\\[.*?\\]\\)-+\\(\\[.*?\\]\\)\\|=>[ \t]+\\([0-9]+\\):\\([0-9]+\\)\\)"))
-	   (lmax 30)
-	   (ltimes (make-vector lmax 0))
-	   (level 0)
-	   (tstart (cond ((stringp tstart) (org-time-string-to-seconds tstart))
-			 ((consp tstart) (float-time tstart))
-			 (t tstart)))
-	   (tend (cond ((stringp tend) (org-time-string-to-seconds tend))
-		       ((consp tend) (float-time tend))
-		       (t tend)))
-	   (t1 0)
-	   time)
-      (remove-text-properties (point-min) (point-max)
-			      `(,(or propname :org-clock-minutes) t
-				:org-clock-force-headline-inclusion t))
-      (save-excursion
-	(goto-char (point-max))
-	(while (re-search-backward re nil t)
-          (let* ((element (save-match-data (org-element-at-point)))
-                 (element-type (org-element-type element)))
-	    (cond
-	     ((and (eq element-type 'clock) (match-end 2))
-	      ;; Two time stamps.
-	      (let* ((timestamp (org-element-property :value element))
-		     (ts (float-time
-                          (org-encode-time
-                           (list 0
-                                 (org-element-property :minute-start timestamp)
-                                 (org-element-property :hour-start timestamp)
-                                 (org-element-property :day-start timestamp)
-                                 (org-element-property :month-start timestamp)
-                                 (org-element-property :year-start timestamp)
-                                 nil -1 nil))))
-		     (te (float-time
-                          (org-encode-time
-                           (list 0
-                                 (org-element-property :minute-end timestamp)
-                                 (org-element-property :hour-end timestamp)
-                                 (org-element-property :day-end timestamp)
-                                 (org-element-property :month-end timestamp)
-                                 (org-element-property :year-end timestamp)
-                                 nil -1 nil))))
-		     (dt (- (if tend (min te tend) te)
-			    (if tstart (max ts tstart) ts))))
-	        (when (> dt 0) (cl-incf t1 (floor dt 60)))))
-	     ((match-end 4)
-	      ;; A naked time.
-	      (setq t1 (+ t1 (string-to-number (match-string 5))
-			  (* 60 (string-to-number (match-string 4))))))
-	     ((memq element-type '(headline inlinetask)) ;A headline
-	      ;; Add the currently clocking item time to the total.
-	      (when (and org-clock-report-include-clocking-task
-		         (eq (org-clocking-buffer) (current-buffer))
-		         (eq (marker-position org-clock-hd-marker) (point))
-		         tstart
-		         tend
-		         (>= (float-time org-clock-start-time) tstart)
-		         (<= (float-time org-clock-start-time) tend))
-	        (let ((time (floor (org-time-convert-to-integer
-				    (time-since org-clock-start-time))
-				   60)))
-		  (setq t1 (+ t1 time))))
-	      (let* ((headline-forced
-		      (get-text-property (point)
-				         :org-clock-force-headline-inclusion))
-		     (headline-included
-		      (or (null headline-filter)
-			  (save-excursion
-			    (save-match-data (funcall headline-filter))))))
-	        (setq level (- (match-end 1) (match-beginning 1)))
-	        (when (>= level lmax)
-		  (setq ltimes (vconcat ltimes (make-vector lmax 0)) lmax (* 2 lmax)))
-	        (when (or (> t1 0) (> (aref ltimes level) 0))
-		  (when (or headline-included headline-forced)
-		    (if headline-included
-		        (cl-loop for l from 0 to level do
-			         (aset ltimes l (+ (aref ltimes l) t1))))
-		    (setq time (aref ltimes level))
-		    (goto-char (match-beginning 0))
-                    (put-text-property (point) (line-end-position)
-				       (or propname :org-clock-minutes) time)
-		    (when headline-filter
-		      (save-excursion
-		        (save-match-data
-			  (while (org-up-heading-safe)
-			    (put-text-property
-			     (point) (line-end-position)
-			     :org-clock-force-headline-inclusion t))))))
-		  (setq t1 0)
-		  (cl-loop for l from level to (1- lmax) do
-			   (aset ltimes l 0))))))))
-	(setq org-clock-file-total-minutes (aref ltimes 0))))))
+    (let ((tstart (cond ((stringp tstart) (org-time-string-to-seconds tstart))
+                        ((consp tstart) (float-time tstart))
+                        (t tstart)))
+          (tend (cond ((stringp tend) (org-time-string-to-seconds tend))
+                      ((consp tend) (float-time tend))
+                      (t tend)))
+          (propname (or propname :org-clock-minutes))
+          (t1 0)
+          (total 0)
+          time)
+      (remove-text-properties (point-min) (point-max) `(,propname t))
+      (org-element-cache-map
+       (lambda (element)
+         (when (or (null headline-filter)
+                   (save-excursion
+                     (funcall headline-filter)))
+           (mapc
+            (lambda (range)
+              (setq time
+                    (pcase range
+                      (`(,_ . now)
+                       (when (and org-clock-report-include-clocking-task
+                                  (eq (org-clocking-buffer) (current-buffer))
+                                  ;; TODO
+                                  ;; (eq (marker-position org-clock-hd-marker) (point))
+                                  tstart
+                                  tend
+                                  (>= (float-time org-clock-start-time) tstart)
+                                  (<= (float-time org-clock-start-time) tend))
+                         (floor (org-time-convert-to-integer
+                                 (time-since org-clock-start-time))
+                                60)))
+                      ((pred floatp) range)
+                      (`(,time1 . ,time2)
+                       (let* ((ts (float-time time1))
+                              (te (float-time time2))
+                              (dt (- (if tend (min te tend) te)
+                                     (if tstart (max ts tstart) ts))))
+                         (floor dt 60)))))
+              (when (and time (> time 0)) (cl-incf t1 time)))
+            (org-clock-ranges element))
+           (when (> t1 0)
+             (setq total (+ total t1))
+             (org-element-lineage-map element
+                 (lambda (parent)
+                   (put-text-property
+                    (org-element-begin parent) (1- (org-element-contents-begin parent))
+                    propname
+                    (+ t1 (or (get-text-property
+                               (org-element-begin parent)
+                               propname)
+                              0))))
+               '(headline) t))
+           (setq t1 0)))
+       :narrow t)
+      (setq org-clock-file-total-minutes total))))
 
 (defun org-clock-sum-current-item (&optional tstart)
   "Return time, clocked on current item in total."
@@ -2116,6 +2079,40 @@ PROPNAME lets you set a custom text property instead of :org-clock-minutes."
       (org-clock-sum tstart)
       org-clock-file-total-minutes)))
 
+(defun org-clock-ranges (headline)
+  "Return the clock ranges of HEADLINE.
+Does not recurse into subheadings.
+Ranges are one of 3 formats:
+\(cons time . time)
+\(cons time . now)
+float"
+  (unless (org-element-type-p headline '(headline inlinetask))
+    (error "Argument must be a headline"))
+  (or (org-element-cache-get-key headline :clock-ranges)
+      (let ((clock-ranges
+             (org-element-cache-map
+              (lambda (elem)
+                (when (org-element-type-p elem 'clock)
+                  (if-let ((timestamp (org-element-property :value elem)))
+                      (cons (org-timestamp-to-time timestamp)
+                            (if (eq 'running (org-element-property :status elem))
+                                'now
+                              (org-timestamp-to-time timestamp t)))
+                    (org-duration-to-minutes (org-element-property :duration elem)))))
+              ;; XXX: using these arguments would be more intuitive
+              ;; but don't seem to work due to bugs in
+              ;; `org-element-cache-map'
+              ;; :restrict-elements '(clock)
+              ;; :after-element headline
+              :granularity 'element
+              :next-re org-element-clock-line-re
+              :from-pos (org-element-contents-begin headline)
+              :to-pos (save-excursion
+                        (goto-char (org-element-begin headline))
+                        (org-entry-end-position)))))
+        (org-element-cache-store-key headline :clock-ranges clock-ranges)
+        clock-ranges)))
+
 ;;;###autoload
 (defun org-clock-display (&optional arg)
   "Show subtree times in the entire buffer.
-- 
2.41.0


* Re: [PATCH] Rewrite `org-clock-sum'
  2024-04-27 13:13 [PATCH] Rewrite `org-clock-sum' Morgan Smith
@ 2024-04-30 10:59 ` Ihor Radchenko
  0 siblings, 0 replies; 2+ messages in thread
From: Ihor Radchenko @ 2024-04-30 10:59 UTC (permalink / raw)
  To: Morgan Smith; +Cc: emacs-orgmode

Morgan Smith <Morgan.J.Smith@outlook.com> writes:

> I may have rewritten org-clock-sum yet again.  See attached patch.
>
> * things I want you to tell me
> 1. Does this look like something that could be eventually merged upstream or am
>    I wasting my time?

Yes, it could be merged upstream. I do not see why not.

> 2. Would you like me to do more performance testing?  I basically only tested
>    my use case.  If yes, should I create some test files for benchmarking that
>    can be shared?

Your patch clearly provides more caching ability, so I anticipate an
improvement. I will still need to test it on my side though to be sure.

Having benchmarks would be nice, but optional.

> 3. Do you want `org-element-cache-map' fixed before we merge this patch?  If
>    yes, please be willing to wait.  I have already spent probably about 8 hours
>    looking into it and it still makes my head hurt.

A fix would be nice, but it should not be a blocker for your patch.

If necessary, we can discuss that function by screen sharing.

> * todo
> The patch is like 95% done.  I still gotta
>
> 1. Write a decent docstring for `org-clock-ranges'.  Maybe add a news entry for
>    it too.

Or make it internal. Then, no news entry will be required.
I am not 100% sure if the return value is useful for generic use outside
org-clock-sum.

> 2. Check `org-clock-hd-marker' for open clock.

You can simply compare it with org-element-begin for current headline.
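
A minimal sketch of that comparison, assuming an Org version that provides
`org-element-begin' (the patch already relies on it).  The predicate name
`my/org-clock-running-on-p' is hypothetical.

#+BEGIN_SRC elisp
(require 'org)
(require 'org-clock)
(require 'org-element)

(defun my/org-clock-running-on-p (headline)
  "Return non-nil if the running clock belongs to HEADLINE.
HEADLINE is a headline or inlinetask element of the current buffer.
Hypothetical helper, not part of Org."
  (and (markerp org-clock-hd-marker)
       ;; The marker has no buffer when nothing is clocked in.
       (eq (marker-buffer org-clock-hd-marker) (current-buffer))
       ;; Compare the clocked heading position with the element start.
       (eq (marker-position org-clock-hd-marker)
           (org-element-begin headline))))
#+END_SRC

Checking the marker's buffer first keeps the test from matching a heading
that merely sits at the same position in another file.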

> 3. Figure out what to do about open clocks that aren't the current
>    one. Historically we ignored them so I guess I should just do that.

Yes. Ideally, also document this in the docstring.

> 4. Maybe test clocking in inlinetasks.  I honestly don't even know what these
>    are.

********************************** TODO inline
********************************** END

They can appear in parallel with paragraphs.

Inlinetasks are an optional markup feature that is enabled by (require 'org-inlinetask).
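
To see how the parser treats such a task, something like the sketch below
can be evaluated.  It assumes the default `org-inlinetask-min-level'; the
function name is hypothetical, and the returned type is expected to be
`inlinetask' rather than `headline' once the feature is loaded.

#+BEGIN_SRC elisp
(require 'org)
(require 'org-element)
(require 'org-inlinetask)

(defun my/inlinetask-demo-element-type ()
  "Return the element type parsed at a sample inlinetask line.
Hypothetical helper, not part of Org."
  (with-temp-buffer
    (org-mode)
    (let ((stars (make-string org-inlinetask-min-level ?*)))
      (insert "* Heading\n"
              "Paragraph before the task.\n"
              stars " TODO inline\n"
              stars " END\n"
              "Paragraph after the task.\n"))
    ;; Move to the start of the inlinetask line before parsing.
    (goto-char (point-min))
    (search-forward "TODO inline")
    (beginning-of-line)
    (org-element-type (org-element-at-point))))
#+END_SRC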

> * Downsides of my rewrite
>
> 1. Does it still perform better with the cache disabled?  idk.  Probably not.

That should not be a problem. We are slowly moving Org code to use cache
API everywhere.

> 2. Radical change.  Likely has bugs

Then, it would be nice to add some test coverage.
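
As a starting point, a test for the simplest case might look like the sketch
below.  It uses plain `with-temp-buffer' so that it is self-contained; tests
in the Org repository would more likely use the helpers from
testing/org-test.el.  The test name is hypothetical.

#+BEGIN_SRC elisp
(require 'ert)
(require 'org)
(require 'org-clock)

(ert-deftest my/org-clock-sum-closed-clock ()
  "A single closed 90-minute clock should sum to 90 minutes."
  (with-temp-buffer
    (org-mode)
    (insert "* Task\n"
            "CLOCK: [2024-04-01 Mon 10:00]--[2024-04-01 Mon 11:30] =>  1:30\n")
    (org-clock-sum)
    ;; The file total and the heading's text property should agree.
    (should (= 90 org-clock-file-total-minutes))
    (should (= 90 (get-text-property (point-min) :org-clock-minutes)))))
#+END_SRC

Similar cases for running clocks, time ranges, HEADLINE-FILTER and
inlinetasks would cover the parts of the rewrite most likely to regress.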

> 3. Dances around bugs in `org-element-cache-map' but does it actually dance
>    around all of them?

It would be nice if you helped with this by fixing known bugs and writing
more tests, but it is generally not a concern _you_ need to worry about:
I will (sooner or later) fix bugs in `org-element-cache-map' if they arise.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

