From: Nicolas Goaziou <n.goaziou@gmail.com>
To: Achim Gratz <Stromeko@nexgo.de>
Cc: emacs-orgmode@gnu.org
Subject: Re: Exporting large documents
Date: Mon, 06 May 2013 21:17:50 +0200
Message-ID: <87haifex41.fsf@gmail.com>
In-Reply-To: <87vc6wuf0t.fsf@Rainer.invalid> (Achim Gratz's message of "Mon, 06 May 2013 20:41:54 +0200")
Hello,
Achim Gratz <Stromeko@nexgo.de> writes:
> Lawrence Mitchell writes:
>> org-element--current-element takes (on my machine) 0.0003 seconds per
>> call. However, when exporting 128x the orgmanual introduction, it's
>> called around 250000 times giving ~ 80 seconds total time (out of ~200
>> total).
>
> I've traced this a bit and the question does warrant further
> investigation. Exporting the introduction without any duplications
> already shows some interesting things: the property drawer for the
> introduction is scanned a whopping 137 times, followed by 134 times the
> cindex entry following it, followed by 125 times the "Summary" headline.
> The header options feature prominently with around 100 scans each as
> well.
>
> The rest of the calls have mostly just a single invocation, but there
> are some instances where parts of the tree are traversed multiple times
> in succession to apparently adjust the :end property to the leaf element
> in small increments or decrements. If elements are mutable during
> parsing then caching is more difficult as well, obviously.
>
>> So it sort of feels like actually what is needed is microoptimisations
>> of the bits of the export engine that are called the most.
>
> Looking at the traces I'd think if we could eliminate the repeated
> backtracking to adjust the leaves or at least skip over those elements in
> a backtrack that are already fully parsed instead of parsing them again,
> that would be a good start.
Actually, this is a bit different. Parsing doesn't backtrack. Profile
`org-element-parse-buffer' with elp and you will see that each element
is parsed only once.
The problem comes from `org-element-at-point'. To be effective, it needs
to move back to the current headline and start parsing the buffer again
from there. That means the first element after the headline (often
a property drawer) will be parsed each time we need information within
the section.
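
To make the cost concrete, here is a minimal Python model of that
behaviour. It is not Org code: `SECTION', `parse_element' and
`element_at_point' are hypothetical stand-ins, and the element spans are
made up. It only assumes what is described above, namely that every
lookup restarts at the headline and parses forward until it reaches the
element containing the position.

```python
from collections import Counter

# Hypothetical section: (start, end) spans of the elements that follow
# a headline located at position 0.
SECTION = [(1, 10), (10, 25), (25, 40), (40, 60)]

# How many times each element (by index) has been parsed.
parse_calls = Counter()


def parse_element(index):
    """Stand-in for a single `org-element--current-element' call."""
    parse_calls[index] += 1
    return SECTION[index]


def element_at_point(pos):
    """Restart at the headline and parse forward until POS is reached."""
    for index in range(len(SECTION)):
        start, end = parse_element(index)
        if start <= pos < end:
            return index
    return None


# One lookup inside each element of the section.
for pos in (5, 15, 30, 50):
    element_at_point(pos)
```

In this model the first element is parsed on every lookup while the last
one is parsed once, so the number of parses grows much faster than the
number of lookups.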
A very good improvement for the exporter and, more importantly, for the
parser, would be to cache results from `org-element--current-element'.
However, this cache would also need to be refreshed after each buffer
modification. This is the tricky part.
One solution would be to use `after-change-functions' and
`before-change-functions' to record the intervals of modified areas in
the buffer. Then, during idle time, a `maphash' over the cache could
update the boundaries of cached values or remove them completely,
according to those intervals.
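
A rough sketch of that idea, again as a Python model rather than Emacs
Lisp. Everything here is an assumption made for illustration: `cache',
`pending_changes', `record_change' and `sync_cache' are hypothetical
names, a dict stands in for the hash table, and each change is recorded
as the modified region in pre-change coordinates plus the resulting
length difference. Entries before a change are kept, entries after it
are shifted, and entries overlapping it are dropped.

```python
# Hypothetical cache: element start position -> (start, end, type).
cache = {
    1: (1, 10, "property-drawer"),
    10: (10, 25, "keyword"),
    25: (25, 40, "paragraph"),
}

# Modified intervals waiting to be processed: (beg, end, delta).
pending_changes = []


def record_change(beg, end, delta):
    """Stand-in for the change hooks: remember the modified interval.

    BEG and END delimit the region before the edit; DELTA is the
    difference between the new and the old length of that region.
    """
    pending_changes.append((beg, end, delta))


def sync_cache():
    """Stand-in for the idle-time pass over the cached values."""
    for beg, end, delta in pending_changes:
        updated = {}
        for start, stop, kind in cache.values():
            if stop <= beg:
                # Entirely before the change: still valid as is.
                updated[start] = (start, stop, kind)
            elif start >= end:
                # Entirely after the change: shift both boundaries.
                updated[start + delta] = (start + delta, stop + delta, kind)
            # Otherwise the element overlaps the change: drop it.
        cache.clear()
        cache.update(updated)
    pending_changes.clear()


# Insert 5 characters at position 12, inside the keyword element.
record_change(12, 12, 5)
sync_cache()
```

After the pass, the property drawer entry is untouched, the keyword
entry is gone and will be parsed again on demand, and the paragraph
entry has moved by five positions. A real implementation would also have
to decide what happens when a change lands exactly on an element
boundary, which this sketch treats as not affecting the element before
it.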
Regards,
--
Nicolas Goaziou