emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* [PATCH] org-element.el; significant optimizations for org-element--interpret-affiliated-keywords
@ 2024-11-29 22:38 Dwarshuis, Nathan J
  0 siblings, 0 replies; only message in thread
From: Dwarshuis, Nathan J @ 2024-11-29 22:38 UTC (permalink / raw)
  To: org-mode-email

[-- Attachment #1: Type: text/plain, Size: 809 bytes --]

Hi all,

I noticed that calling `org-element-interpret-data' on objects is generally 5-10x faster than when calling on elements. The reason seems to be that `org-element--interpret-affiliated-keywords' (which is only called on elements) does alot of unnecessary work. Namely, it runs on all elements (including those that should never have an affiliated keyword) and also loops over :standard-properties which should not be relevant here.

The attached patch addresses this.

I also attached some benchmarks pre and post-patch and accompanying code, which shows many elements (specifically those without children) approaching the execution time of objects. For the graphs, just pay attention to the 'org-element' bars (red); the others were from me trying to optimize the `org-ml` package.

Thank you,
Nate



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-org-element.el-Make-affiliated-keyword-interpreter-f.patch --]
[-- Type: text/x-patch, Size: 5720 bytes --]

From 1b4fb607bd6e233a15a569960517a9388a80974b Mon Sep 17 00:00:00 2001
From: ndwarshuis <ndwar@yavin4.ch>
Date: Mon, 25 Nov 2024 22:04:09 -0500
Subject: [PATCH] org-element.el: Make affiliated keyword interpreter faster

* lisp/org-element.el (org-element--interpret-affiliated-keywords):
Optimize performance by bypassing unnecessary types and reducing
loop complexity. Added new constant
`org-element-elements-no-affiliated` which stores the types to
be bypassed.

This function was doing redundant work on several levels which
dramatically reduced performance of interpreting element nodes
relative to object nodes.

First, all types were interpreted regardless of if they could
possibly contain affiliated keywords. Skipping these types
dramatically speeds up typical use cases since many of these
skipped types are common (headline, item, etc).

Second, the loop was much more complex than needed. The loop included
:standard-properties which should not be necessary here. It also
duplicated some work between calls to `org-element--properties-mapc`
and `mapconcat` (the code was moved entirely under the former). The
result should be faster and more readable.

TINYCHANGE
---
 lisp/org-element.el | 82 +++++++++++++++++++++++----------------------
 1 file changed, 42 insertions(+), 40 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 9f8f52745..3b90dce2a 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -335,6 +335,11 @@ specially in `org-element--object-lex'.")
   (append org-element-recursive-objects '(paragraph table-row verse-block))
   "List of object or element types that can directly contain objects.")
 
+(defconst org-element-elements-no-affiliated
+  '(org-data comment clock headline inlinetask item
+             node-property planning property-drawer
+             section table-row))
+
 (defconst org-element-affiliated-keywords
   '("CAPTION" "DATA" "HEADER" "HEADERS" "LABEL" "NAME" "PLOT" "RESNAME" "RESULT"
     "RESULTS" "SOURCE" "SRCNAME" "TBLNAME")
@@ -5517,49 +5522,46 @@ to interpret.  Return Org syntax as a string."
 			      (make-string blank ?\n)))))))))
     (funcall fun data nil)))
 
+(defun org-element--keyword-to-org (key value)
+  (let (dual)
+    (when (member key org-element-dual-keywords)
+      (setq dual (cdr value) value (car value)))
+    (concat "#+" (downcase key)
+            (and dual
+                 (format "[%s]" (org-element-interpret-data dual)))
+            ": "
+            (if (member key org-element-parsed-keywords)
+                (org-element-interpret-data value)
+              value)
+            "\n")))
+
 (defun org-element--interpret-affiliated-keywords (element)
   "Return ELEMENT's affiliated keywords as Org syntax.
 If there is no affiliated keyword, return the empty string."
-  (let ((keyword-to-org
-	 (lambda (key value)
-	   (let (dual)
-	     (when (member key org-element-dual-keywords)
-	       (setq dual (cdr value) value (car value)))
-	     (concat "#+" (downcase key)
-		     (and dual
-			  (format "[%s]" (org-element-interpret-data dual)))
-		     ": "
-		     (if (member key org-element-parsed-keywords)
-			 (org-element-interpret-data value)
-		       value)
-		     "\n")))))
-    (mapconcat
-     (lambda (prop)
-       (let ((value (org-element-property prop element))
-	     (keyword (upcase (substring (symbol-name prop) 1))))
-	 (when value
-	   (if (or (member keyword org-element-multiple-keywords)
-		   ;; All attribute keywords can have multiple lines.
-		   (string-match-p "^ATTR_" keyword))
-	       (mapconcat (lambda (line) (funcall keyword-to-org keyword line))
-			  value "")
-	     (funcall keyword-to-org keyword value)))))
-     ;; List all ELEMENT's properties matching an attribute line or an
-     ;; affiliated keyword, but ignore translated keywords since they
-     ;; cannot belong to the property list.
-     (let (acc)
-       (org-element-properties-mapc
-        (lambda (prop _ __)
-          (let  ((keyword (upcase (substring (symbol-name prop) 1))))
-            (when (or (string-match-p "^ATTR_" keyword)
-		      (and
-		       (member keyword org-element-affiliated-keywords)
-		       (not (assoc keyword
-			         org-element-keyword-translation-alist))))
-              (push prop acc))))
-        element t)
-       (nreverse acc))
-     "")))
+  ;; there are some elements that will never have affiliated keywords,
+  ;; so do nothing for these
+  (if (member (org-element-type element) org-element-elements-no-affiliated)
+      ""
+    (let (acc)
+      (org-element-properties-resolve element t)
+      (org-element--properties-mapc
+       (lambda (prop value)
+         (when value
+           (let ((keyword (upcase (substring (symbol-name prop) 1))))
+             (when (or (string-match-p "^ATTR_" keyword)
+                       (and
+                        (member keyword org-element-affiliated-keywords)
+                        (not (assoc keyword
+                                    org-element-keyword-translation-alist))))
+               (push (if (or (member keyword org-element-multiple-keywords)
+                             ;; All attribute keywords can have multiple lines.
+                             (string-match-p "^ATTR_" keyword))
+                         (mapconcat (lambda (line) (org-element--keyword-to-org keyword line))
+                                    value "")
+                       (org-element--keyword-to-org keyword value))
+                     acc)))))
+      element nil t)
+    (apply #'concat (nreverse acc)))))
 
 ;; Because interpretation of the parse tree must return the same
 ;; number of blank lines between elements and the same number of white
-- 
2.47.0


[-- Attachment #3: org-ml-postpatch.pdf --]
[-- Type: application/pdf, Size: 10220 bytes --]

[-- Attachment #4: org-ml-prepatch.pdf --]
[-- Type: application/pdf, Size: 10187 bytes --]

[-- Attachment #5: benchmark.el --]
[-- Type: text/plain, Size: 9161 bytes --]

(defmacro time-stringy (class what thing)
  `(progn
     (let ((reps 5000)
           t0 t1 t2 t3)
       (garbage-collect)
       (setq t0 (float-time))
       (--dotimes reps (org-element-interpret-data ,thing))
       (setq t1 (float-time))
       (garbage-collect)
       (setq t2 (float-time))
       (--dotimes reps (org-ml-to-string ,thing))
       (setq t3 (float-time))
       (format "%s, %s, %f, %f" ,class ,what (- t3 t2) (- t1 t0)))))

(defun run-tests ()
  (let* ((text "abc")
         ;; value objects
         (code (org-ml-build-code "abc"))
         (verbatim (org-ml-build-verbatim "abc"))
         (target (org-ml-build-target "abc"))
         (ibcall (org-ml-build-inline-babel-call "abc"))
         (isblock (org-ml-build-inline-src-block "abc" :value "xyz"))
         (break (org-ml-build-line-break))
         (macro (org-ml-build-macro "abc"))
         (stat (org-ml-build-statistics-cookie '(1 2)))

         ;; recursive objects
         (bold (org-ml-build-bold "abc"))
         (radio-target (org-ml-build-radio-target "abc"))
         (strike (org-ml-build-strike-through "abc"))
         (italic (org-ml-build-italic "abc"))
         (underline (org-ml-build-underline "abc"))
         (subscript (org-ml-build-subscript :use-brackets-p t "abc"))
         (superscript (org-ml-build-superscript :use-brackets-p t "abc"))
         (cell (org-ml-build-table-cell "abc"))
         (link (org-ml-build-link "https://downloadmoreram.com" "abc"))
         (fnref (org-ml-build-footnote-reference "abc"))

         ;; other objects
         (latex-frag (org-ml-build-latex-fragment "abc"))
         (keyword (org-ml-build-keyword "thing" "abc"))
         (timestamp (org-ml-build-timestamp! '(2024 1 1 0 0)))
         (timestamp-long (org-ml-build-timestamp!
                          '(2024 1 1 0 0)
                          :end '(2024 1 1 1 0)
                          :collapsed nil
                          :repeater '(cumulate 10 day)
                          :deadline '(14 day)
                          :warning '(all 2 day)
                          ))
         (entity (org-ml-build-entity "Agrave"))
         (export-snip (org-ml-build-export-snippet "abc" "xyz"))

         ;; leaf elements
         (comment (org-ml-build-comment "abc"))
         (comment-block (org-ml-build-comment-block :value "abc"))
         (example-block (org-ml-build-example-block :value "abc"))
         (horz (org-ml-build-horizontal-rule))
         (export-block (org-ml-build-export-block "abc" "xyz"))
         (fixed-width (org-ml-build-fixed-width "abc"))
         (clock (org-ml-build-clock! '(2024 1 1 0 0)))
         (clock-long (org-ml-build-clock! '(2024 1 1 0 0) :end '(2024 1 1 1 1)))
         (bcall (org-ml-build-babel-call "abc"))
         (diary-sexp (org-ml-build-diary-sexp :value '(poop)))
         (latex-env (org-ml-build-latex-environment (list "abc" "xyz")))
         (node-prop (org-ml-build-node-property "abc" "xyz"))
         (plan (org-ml-build-planning! :closed '(2024 1 1)))
         (plan-long (org-ml-build-planning! :closed '(2024 1 1) :scheduled '(2024 1 1) :deadline '(2024 1 1)))
         (src-block (org-ml-build-src-block :value "abc"))

         ;; elements with objects
         (para0 (org-ml-build-paragraph))
         (row0 (org-ml-build-table-row))
         (verse0 (org-ml-build-verse-block))
         (para3 (org-ml-build-paragraph text text text))
         (row3 (org-ml-build-table-row cell cell cell))
         (verse3 (org-ml-build-verse-block text text text))
         (para6 (org-ml-build-paragraph text text text text text text))
         (row6 (org-ml-build-table-row cell cell cell cell cell cell))
         (verse6 (org-ml-build-verse-block text text text text text text))

         ;; greater elements
         (item (org-ml-build-item! :bullet '- :checkbox 'on :paragraph "abc"))
         (hl0 (org-ml-build-headline! :title-text "abc"
                                      :todo-keyword "TODO"
                                      :tags '("poopy" "butthole")))
         (hl3 (org-ml-build-headline! :title-text "abc"
                                      :todo-keyword "TODO"
                                      :tags '("poopy" "butthole")
                                      :section-children (-repeat 3 para3)))
         (hl6 (org-ml-build-headline! :title-text "abc"
                                      :todo-keyword "TODO"
                                      :tags '("poopy" "butthole")
                                      :section-children (-repeat 3 para3)))
         (pd0 (org-ml-build-property-drawer))
         (pd3 (apply #'org-ml-build-property-drawer! (-repeat 3 '("abc" "xyz"))))
         (pd6 (apply #'org-ml-build-property-drawer! (-repeat 6 '("abc" "xyz"))))
         (pl3 (apply #'org-ml-build-plain-list (-repeat 3 item)))
         (pl6 (apply #'org-ml-build-plain-list (-repeat 6 item)))
         (drawer0 (org-ml-build-drawer "abc"))
         (drawer3 (org-ml-build-drawer "abc" pl3))
         (drawer6 (org-ml-build-drawer "abc" pl6))
         (table3 (apply #'org-ml-build-table (-repeat 3 row3)))
         (table6 (apply #'org-ml-build-table (-repeat 6 row6)))
         )
    (->> (list
          (time-stringy "plain" "text" text)
          (time-stringy "object" "code" code)
          (time-stringy "object" "verbatim" verbatim)
          (time-stringy "object" "target" target)
          (time-stringy "object" "inline-babel-call" ibcall)
          (time-stringy "object" "inline-src-block" isblock)
          (time-stringy "object" "entity" entity)
          (time-stringy "object" "keyword" keyword)
          (time-stringy "object" "timestamp" timestamp)
          (time-stringy "object" "timestamp-long" timestamp-long)
          (time-stringy "object" "latex-frag" latex-frag)
          (time-stringy "object" "export-snippet" export-snip)
          (time-stringy "object" "line-break" break)
          (time-stringy "object" "macro" macro)
          (time-stringy "object" "statcookie" stat)

          (time-stringy "recursive object" "bold" bold)
          (time-stringy "recursive object" "strikethru" strike)
          (time-stringy "recursive object" "italic" italic)
          (time-stringy "recursive object" "underline" underline)
          (time-stringy "recursive object" "superscript" superscript)
          (time-stringy "recursive object" "subscript" subscript)
          (time-stringy "recursive object" "cell" cell)
          (time-stringy "recursive object" "link" link)
          (time-stringy "recursive object" "footnote-ref" fnref)
          (time-stringy "recursive object" "radio-target" radio-target)

          (time-stringy "element" "babel call" bcall)
          (time-stringy "element" "comment" comment)
          (time-stringy "element" "comment-block" comment-block)
          (time-stringy "element" "clock" clock)
          (time-stringy "element" "clock-long" clock-long)
          (time-stringy "element" "diary-sexp" diary-sexp)
          (time-stringy "element" "export-block" export-block)
          (time-stringy "element" "example-block" example-block)
          (time-stringy "element" "fixed-width" fixed-width)
          (time-stringy "element" "horizontal rule" horz)
          (time-stringy "element" "latex env" latex-env)
          (time-stringy "element" "prop" node-prop)
          (time-stringy "element" "planning" plan)
          (time-stringy "element" "planning-long" plan-long)
          (time-stringy "element" "src-block" src-block)

          (time-stringy "element with objects" "paragraph0" para0)
          (time-stringy "element with objects" "table-row0" row0)
          (time-stringy "element with objects" "verse-block0" verse0)
          (time-stringy "element with objects" "paragraph3" para3)
          (time-stringy "element with objects" "table-row3" row3)
          (time-stringy "element with objects" "verse-block3" verse3)
          (time-stringy "element with objects" "paragraph6" para6)
          (time-stringy "element with objects" "table-row6" row6)
          (time-stringy "element with objects" "verse-block6" verse6)

          (time-stringy "greater element" "headline0" hl0)
          (time-stringy "greater element" "headline3" hl3)
          ;; (time-stringy "greater element" "headline6" hl6)
          (time-stringy "greater element" "drawer0" drawer0)
          (time-stringy "greater element" "drawer3" drawer3)
          ;; (time-stringy "greater element" "drawer6" drawer6)
          (time-stringy "greater element" "item" item)
          (time-stringy "greater element" "property-drawer0" pd0)
          (time-stringy "greater element" "property-drawer3" pd3)
          ;; (time-stringy "greater element" "property-drawer6" pd6)
          (time-stringy "greater element" "plain-list3" pl3)
          ;; (time-stringy "greater element" "plain-list6" pl6)
          (time-stringy "greater element" "table3" table3)
          ;; (time-stringy "greater element" "table6" table6)
          )
         (s-join "\n"))))

;; (run-tests)
       
(--> (-repeat 5 (run-tests))
     (s-join "\n" it)
     (f-write-text it 'utf-8 "/tmp/runs_interpret.out"))

^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2024-11-29 22:44 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-29 22:38 [PATCH] org-element.el; significant optimizations for org-element--interpret-affiliated-keywords Dwarshuis, Nathan J

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).