emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* multipage html output
@ 2024-07-03  9:44 Orm Finnendahl
  2024-07-03 10:33 ` Dr. Arne Babenhauserheide
                   ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-03  9:44 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

 after my clunky publishing chain from org to gitbook with multipage
page output broke down recently I finally decided to tackle adding an
export backend for multipage html output to org-export.

It is done now and mainly working. The backend uses all the
funcionality of the ox html exporter, only slightly modifying the code
in places where it is necessary for multipage output. In addition I
tried to make it as general, as possible to enable adding other
multipage backends (like for md output) easily.

Before sharing it I thought it might be a good idea to think about
integrating it properly/officially into org. I would be willing to
provide the code, docs, patches, etc.

There are a couple of decisions to make (should it be integrated as an
option into the html output backend or should it be a separate backend
altogether?  What options concerning footnotes, toc, etc. should be
provided?  etc...) and this mail is basically asking about how to
proceed.

My questions:

- Is there widespread interest to fully integrate it into org mode?

- If so, whom should I contact, or is it expected that I just go ahead
  and supply merge requests?

I'm a bit hesitant putting in the extra work of fully integrating it
without approval by the maintainers to go ahead.

In case someone wants to take a peek at the current state of the code
you can check out my github repository here:

https://github.com/ormf/ox-html-multipage

Be aware and warned that the code is in constant flux, not finalized
and there still are some open questions for me what would be the best
way to integrate the code into the old export engine, like whether
adding optional args to the transcoding functions or using properties
in the info channel, etc... Once it is finalized, the current single
page html export will work exactly as before (it already does, but
while checking it out I am modifying the html templates for the
multipage navigation, toc, etc.)

Hope to hear from you, especially if the maintainers are reading this.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-03  9:44 multipage html output Orm Finnendahl
@ 2024-07-03 10:33 ` Dr. Arne Babenhauserheide
  2024-07-03 10:58 ` Christian Moe
  2024-07-03 21:11 ` Rudolf Adamkovič
  2 siblings, 0 replies; 46+ messages in thread
From: Dr. Arne Babenhauserheide @ 2024-07-03 10:33 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 440 bytes --]

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> https://github.com/ormf/ox-html-multipage

Do I understand it right, that this exports a single org file into
multiple HTML files in the html subfolder?

In the interest of making it possible to build upon the code, can you
make the license GPL v2.0 *or later*?

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1125 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-03  9:44 multipage html output Orm Finnendahl
  2024-07-03 10:33 ` Dr. Arne Babenhauserheide
@ 2024-07-03 10:58 ` Christian Moe
  2024-07-03 11:05   ` Ihor Radchenko
  2024-07-03 21:11 ` Rudolf Adamkovič
  2 siblings, 1 reply; 46+ messages in thread
From: Christian Moe @ 2024-07-03 10:58 UTC (permalink / raw)
  To: emacs-orgmode


Orm Finnendahl writes:

> Hi,
>
>  after my clunky publishing chain from org to gitbook with multipage
> page output broke down recently I finally decided to tackle adding an
> export backend for multipage html output to org-export.
>
> (... snip ...)
>
> - Is there widespread interest to fully integrate it into org mode?

It would be nice to have.

Conceptually, I'd see it as fitting into org-publish, perhaps, rather
than as an exporter? With org-publish-project-alist as a convenient
place to set up various options?

Yours,
Christian


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-03 10:58 ` Christian Moe
@ 2024-07-03 11:05   ` Ihor Radchenko
  2024-07-03 14:34     ` Christian Moe
  2024-07-04  9:50     ` Orm Finnendahl
  0 siblings, 2 replies; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-03 11:05 UTC (permalink / raw)
  To: Christian Moe; +Cc: emacs-orgmode

Christian Moe <mail@christianmoe.com> writes:

>>  after my clunky publishing chain from org to gitbook with multipage
>> page output broke down recently I finally decided to tackle adding an
>> export backend for multipage html output to org-export.
>>
>> (... snip ...)
>>
>> - Is there widespread interest to fully integrate it into org mode?
>
> It would be nice to have.
>
> Conceptually, I'd see it as fitting into org-publish, perhaps, rather
> than as an exporter? With org-publish-project-alist as a convenient
> place to set up various options?

Not really. ox-publish is more about exporting multiple input
.org/non-.org files into outputs.

I'd rather see this kind of feature being a part of ox.el - an option to
export one .org to many smaller files. Currently, we only have an option
to export one .org (or part of it) to a single string/file. (And then,
ox-odt has to try various kludges to make things work as expected with
.odt, which consist of multiple files under the hood).

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-03 11:05   ` Ihor Radchenko
@ 2024-07-03 14:34     ` Christian Moe
  2024-07-04  9:50     ` Orm Finnendahl
  1 sibling, 0 replies; 46+ messages in thread
From: Christian Moe @ 2024-07-03 14:34 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode


Ihor Radchenko writes:

> Christian Moe <mail@christianmoe.com> writes:
>
>>>  after my clunky publishing chain from org to gitbook with multipage
>>> page output broke down recently I finally decided to tackle adding an
>>> export backend for multipage html output to org-export.
>>>
>>> (... snip ...)
>>>
>>> - Is there widespread interest to fully integrate it into org mode?
>>
>> It would be nice to have.
>>
>> Conceptually, I'd see it as fitting into org-publish, perhaps, rather
>> than as an exporter? With org-publish-project-alist as a convenient
>> place to set up various options?
>
> Not really. ox-publish is more about exporting multiple input
> .org/non-.org files into outputs.

I was thinking in terms of purpose: organizing export of multiple
outputs to be published together. It does that with multiple inputs
because, as you say, one-to-one export is the option we currently have.

> I'd rather see this kind of feature being a part of ox.el - an option to
> export one .org to many smaller files. Currently, we only have an option
> to export one .org (or part of it) to a single string/file. (And then,
> ox-odt has to try various kludges to make things work as expected with
> .odt, which consist of multiple files under the hood).

Yes, I suppose the code for multipage export belongs on the ox.el
level. And then one would want to be able to use it out of the box
without necessarily having to configure a publishing project, just
relying on sensible defaults. So I take that back.

(There might be some considerations for ox-publish when using
multipage/chunked export *inside* a publishing project, e.g. regarding
which levels of output to include in a sitemap, but that's for another
day.)

Yours,
Christian


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-03  9:44 multipage html output Orm Finnendahl
  2024-07-03 10:33 ` Dr. Arne Babenhauserheide
  2024-07-03 10:58 ` Christian Moe
@ 2024-07-03 21:11 ` Rudolf Adamkovič
  2 siblings, 0 replies; 46+ messages in thread
From: Rudolf Adamkovič @ 2024-07-03 21:11 UTC (permalink / raw)
  To: Orm Finnendahl, emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> - Is there widespread interest to fully integrate it into org mode?

Definitely. :)

Rudy
-- 
"It is no paradox to say that in our most theoretical moods we may be
nearest to our most practical applications."  --- Alfred North
Whitehead, 1861-1947

Rudolf Adamkovič <rudolf@adamkovic.org> [he/him]
http://adamkovic.org


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-03 11:05   ` Ihor Radchenko
  2024-07-03 14:34     ` Christian Moe
@ 2024-07-04  9:50     ` Orm Finnendahl
  2024-07-04 11:41       ` Ihor Radchenko
  1 sibling, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-04  9:50 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

Am Mittwoch, den 03. Juli 2024 um 11:05:39 Uhr (+0000) schrieb Ihor
Radchenko:
> 
> Not really. ox-publish is more about exporting multiple input
> .org/non-.org files into outputs.
> 
> I'd rather see this kind of feature being a part of ox.el - an option to
> export one .org to many smaller files. Currently, we only have an option
> to export one .org (or part of it) to a single string/file. (And then,
> ox-odt has to try various kludges to make things work as expected with
> .odt, which consist of multiple files under the hood).

 that is/was my intention: Basically there was only a very small
change to ox.el necessary to make it work (it's mentioned in the
comment on top of ox-multipage-html in my github repository):

Currently `org-export-as' combines parsing the org document into a
global parse tree with all additional options applied and serializing
that into the final output target format. My code simply splits the
code sections of these tasks into two separate functions, which are
called by org-export-as, `org-export--collect-tree-info' and
`org-export--transcode-headline'. The advantage of this approach is
that it is fully compatible with the prior code, but gives the
necessary flexibility to the backend export code to split up the
global parse tree before serializing.

The multipage html backend (ox-html-multipage.el) takes care of
generating the global parse tree with org-export--headline, divides
that tree into the subtrees of the individual pages, then calls the
serializing function for each of the subtrees and writes the results
to file. Is that along the lines of what you meant?

In the meantime I thought about the proposed backend. Maybe it's a
good idea to integrate the single page *and* the multipage backend
into one backend altogether: The Backend *always* produces multipage
output, but you can define the level at which the pages are split with
an #+OPTION: in the org file. Setting the default level to 0 if the
option is not set will generate the exact same output as the old
backend without breaking anything for anybody. I'm quite sure it'll
work and as I said it's mainly done and wouldn't require a lot of
work.

What do you think?

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-04  9:50     ` Orm Finnendahl
@ 2024-07-04 11:41       ` Ihor Radchenko
  2024-07-04 13:33         ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-04 11:41 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> I'd rather see this kind of feature being a part of ox.el - an option to
>> export one .org to many smaller files. Currently, we only have an option
>> to export one .org (or part of it) to a single string/file. (And then,
>> ox-odt has to try various kludges to make things work as expected with
>> .odt, which consist of multiple files under the hood).
>
>  that is/was my intention: Basically there was only a very small
> change to ox.el necessary to make it work (it's mentioned in the
> comment on top of ox-multipage-html in my github repository):
>
> Currently `org-export-as' combines parsing the org document into a
> global parse tree with all additional options applied and serializing
> that into the final output target format. My code simply splits the
> code sections of these tasks into two separate functions, which are
> called by org-export-as, `org-export--collect-tree-info' and
> `org-export--transcode-headline'. The advantage of this approach is
> that it is fully compatible with the prior code, but gives the
> necessary flexibility to the backend export code to split up the
> global parse tree before serializing.

This makes sense.

Although, multipage export may imply two different things:
1. An ability to produce multiple pages from parts of the original Org
   file.
2. An ability to produce multiple pages from a single part of Org file.
   For example, consider an Org document with images exported to
   ODT. The images should be stored alongside XML content file and
   referenced from there. So, export produces multiple files from the
   same document/subtree.
   
Your approach only addresses (1), but not (2).

That said, even having (1) is a welcome improvement.

> The multipage html backend (ox-html-multipage.el) takes care of
> generating the global parse tree with org-export--headline, divides
> that tree into the subtrees of the individual pages, then calls the
> serializing function for each of the subtrees and writes the results
> to file. Is that along the lines of what you meant?

Yes, but we also need to carefully discuss the rules how the full parse
tree is separated into subtrees. Your proof of concept code hard-codes
these rules.

> In the meantime I thought about the proposed backend. Maybe it's a
> good idea to integrate the single page *and* the multipage backend
> into one backend altogether: The Backend *always* produces multipage
> output, but you can define the level at which the pages are split with
> an #+OPTION: in the org file. Setting the default level to 0 if the
> option is not set will generate the exact same output as the old
> backend without breaking anything for anybody. I'm quite sure it'll
> work and as I said it's mainly done and wouldn't require a lot of
> work.

1. Most of the existing backends are written to produce a single
   page. So, our design of ox.el part should be able to handle
   those. What you proposed (calling the same backend on pre-split parse
   tree) sounds good in this context.

2. Some backends, as you proposed, may target multipage export from the
   very beginning. So, we need to provide some way for the backend (in
   org-export-define*-backend) to specify that it wants to split the
   original parse tree. I imagine some kind of option with default
   values configured via backend, but optionally overwritten by user
   settings/in-buffer keywords.

3. Your suggestion to add a new export option for splitting based on
   headline level is one idea.

   Another idea is to split out subtrees with :EXPORT_FILE_NAME:
   property.

4. One possible extra feature might be exporting only a part of the
   original Org file to separate pages. Say, only pages with specific
   tag. The whole original Org file is also exported, replacing the
   split-out parts with, for example, links. This will generalize
   "index" pages from ox-publish.

5. We need to consider the rules used to generate export file names.
   Currently, we choose between :EXPORT_FILE_NAME: property,
   #+EXPORT_FILE_NAME: keyword, and the original file name.

   As I see in your code, you also introduced deriving file name from
   the headline title.

6. I can see people flipping between exporting the whole document and
   multipage document. We probably need some kind of easy switch in M-x
   org-export-dispatch to choose how to export.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-04 11:41       ` Ihor Radchenko
@ 2024-07-04 13:33         ` Orm Finnendahl
  2024-07-04 16:20           ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-04 13:33 UTC (permalink / raw)
  To: emacs-orgmode

Hi Ihor,

 thanks for your time to study the code and your very valuable input,
much appreciated!

Am Donnerstag, den 04. Juli 2024 um 11:41:35 Uhr (+0000) schrieb Ihor
Radchenko:
>
> 2. An ability to produce multiple pages from a single part of Org file.
>    For example, consider an Org document with images exported to
>    ODT. The images should be stored alongside XML content file and
>    referenced from there. So, export produces multiple files from the
>    same document/subtree.
>    
> Your approach only addresses (1), but not (2).

Sure. I'm not at all familiar with the peculiarities of other output
backends, but see your point. If you can give any hints or have any
ideas *how* we could find general rules for separating the subtrees,
which cover foreseeable use cases, or devise a flexible mechanism for
doing so, I'd be glad to help setting them up and implementing them. I
definitely agree, the code should be as general as possible while
providing complete backward compatibility.

> 1. Most of the existing backends are written to produce a single
>    page. So, our design of ox.el part should be able to handle
>    those. What you proposed (calling the same backend on pre-split parse
>    tree) sounds good in this context.

Ok.

> 2. Some backends, as you proposed, may target multipage export from the
>    very beginning. So, we need to provide some way for the backend (in
>    org-export-define*-backend) to specify that it wants to split the
>    original parse tree. I imagine some kind of option with default
>    values configured via backend, but optionally overwritten by user
>    settings/in-buffer keywords.

I'll look into that and maybe I can come up with something. I was
hesitant to propose anything as I tried to stay as limited as possible
and not get too deep into changing things. If you have suggestions,
please let me know.

> 3. Your suggestion to add a new export option for splitting based on
>    headline level is one idea.
> 
>    Another idea is to split out subtrees with :EXPORT_FILE_NAME:
>    property.

I'm not sure I fully understand what you mean: Do you mean specifying
different :EXPORT_FILE_NAME: properties throughout the same document
and then export accordingly?

> 4. One possible extra feature might be exporting only a part of the
>    original Org file to separate pages. Say, only pages with specific
>    tag. The whole original Org file is also exported, replacing the
>    split-out parts with, for example, links. This will generalize
>    "index" pages from ox-publish.

Very nice idea! MAybe along these lines is that I thought about
"Master" org files which combine different documentations by linking
to them in some sort of top menu which is included on every page of
all these documentations and then being able to generate a single
documentation without having to recompile everything. But for now I'd
prefer to first get it working and then think about such extensions (I
have more ideas for different extensions and "plugins" which could be
useful). It shouldn't be too hard to implement at a later point and
probably also wouldn't need a complete rewrite.

> 5. We need to consider the rules used to generate export file names.
>    Currently, we choose between :EXPORT_FILE_NAME: property,
>    #+EXPORT_FILE_NAME: keyword, and the original file name.
> 
>    As I see in your code, you also introduced deriving file name from
>    the headline title.

Exactly. I wanted to make sure, the file names are sorted correctly,
are unique and the title is relatable to the section it names on the
directory level. I also thought about making it user-configurable, but
first wanted to implement a working solution.

> 6. I can see people flipping between exporting the whole document and
>    multipage document. We probably need some kind of easy switch in M-x
>    org-export-dispatch to choose how to export.

Sure, that is the disadvantage of my proposal to make everything a
"multipage" document. Another disadvantage is that when the user
chooses to open the final document or display it in a buffer the user
can't choose whether to only open/display one page or every exported
page. In most circumstances it should be advisable to just
open/display the first page. We can also just add a switch between
single-page and multipage, with multipage always just exporting to
file, but that also has disadvantages.

As the code I proposed is encapsulated in the html backend and not
spreading all over the place, I will now first go ahead to finalize
the existing code to a fully working setup. ASFAICT adapting that to
other needs shouldn't require a complete rewrite. And I might be
around for a while ;-)

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-04 13:33         ` Orm Finnendahl
@ 2024-07-04 16:20           ` Ihor Radchenko
  2024-07-07 19:33             ` Orm Finnendahl
  2024-07-07 20:50             ` Orm Finnendahl
  0 siblings, 2 replies; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-04 16:20 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> Sure. I'm not at all familiar with the peculiarities of other output
> backends, but see your point. If you can give any hints or have any
> ideas *how* we could find general rules for separating the subtrees,
> which cover foreseeable use cases, or devise a flexible mechanism for
> doing so, I'd be glad to help setting them up and implementing them. I
> definitely agree, the code should be as general as possible while
> providing complete backward compatibility.

I think that the easiest would be adding a new option to
`org-export-options-alist' - it is already extendable for individual
backends and allows users to tweak things via in-buffer keywords,
properties, variables, and export options.

The most generic rule would be some kind of function that takes AST
node as input and returns whether that node should be going to a separate
file or not, and if yes, tell (1) which export backend to use to export
that subtree to a file (may as well allow exporting to different
formats, while we are at it); (2) what are the export parameters to be
used for that export, (possibly) including the file path.

Then, in addition to the most generic (and most flexible) "rule being an
Elisp function", we can allow some simplified semantics to define rules.

The semantics should probably give a couple of toggles to customize:
(1) which subtrees are selected for export; (2) which export backend is
used (3) how their file names are generated; (4) (optional) how they are
represented when exporting the whole original file; e.g. whether to put
links to exported files in place of their subtrees; (5) (optional) how
the original file is represented in the exported subtrees; e.g. whether
to put backlink to parent file

The subtree selection may boil down to the usual TAGS matcher (or
function), as described in "11.3.3 Matching tags and properties" section
of the manual. This will cover the previously discussed separation based
on headline level, a tag, or a property.

The export backend selection may be realized by allowing multiple rules
with each rule defining selection/backend/file name/....

In terms of the value semantics in Elisp, I am thinking about something
re-using backend definition format:

(setq org-export-pages
      '(:selector "LEVEL=2+blog+TODO=DONE"
        :backend html
         ;; completely remove the exported subtree is original document
         ;; is being exported.
        :page-transcoder nil
         ;; or :page-transcoder #'org-export-page-as-heading-with-link
        :export-file-name "%{TITLE}-%{page-number}" ;; or some other kind of template syntax
        )

       '(:selector a-function-accepting-ast-node
         :source-backend any 
         :backend
         (:parent html ;; `org-export-define-derived-backend'-like semantics
          :options-alist
          ;; Do not export private headings in HTML pages.
          ((:exclude-tags "EXCLUDE_TAGS" nil (cons "private" org-export-exclude-tags) split))))

        '(:selector "+export_ascii_page"
          :source-backend html ; only use this rule when exporting to html
          :backend
          (:parent ascii
           ((template .
              (lambda (contents info)
                (format "Paged out from %s\n%s"
                   (plist-get
                     ;; INFO channel for parent document
                     (plist-get info :page-source)
                     :title)
                   (org-ascii-template contents info)))))))))

>> 2. Some backends, as you proposed, may target multipage export from the
>>    very beginning. So, we need to provide some way for the backend (in
>>    org-export-define*-backend) to specify that it wants to split the
>>    original parse tree. I imagine some kind of option with default
>>    values configured via backend, but optionally overwritten by user
>>    settings/in-buffer keywords.
>
> I'll look into that and maybe I can come up with something. I was
> hesitant to propose anything as I tried to stay as limited as possible
> and not get too deep into changing things. If you have suggestions,
> please let me know.

One way could be simply adding an option like :selector above to the
backend definition. Then, it will be used as default selector:

(setq org-export-pages
  (:selector default :backend html) ; export pages to html with default selector
)

or even

(setq org-export-pages
  (:backend html) ; export pages to html with default selector
)

or just

;; export using the same target backend as selected in the export menu
(setq org-export-pages t)
;; (setq org-export-pages nil) - existing single page export
;; (setq org-export-pages 'only-pages) - only export pages, ignore original file

>> 3. Your suggestion to add a new export option for splitting based on
>>    headline level is one idea.
>> 
>>    Another idea is to split out subtrees with :EXPORT_FILE_NAME:
>>    property.
>
> I'm not sure I fully understand what you mean: Do you mean specifying
> different :EXPORT_FILE_NAME: properties throughout the same document
> and then export accordingly?

Yes. It is re-using the existing idea with subtree export

13.2 Export Settings

‘EXPORT_FILE_NAME’
     The name of the output file to be generated.  Otherwise, Org
     generates the file name based on the buffer name and the extension
     based on the backend format.

If a subtree has that property set, it is used as output file name.
Since there is usually no reason to set this property unless you also
want to export subtree to individual file, it makes sense to use this as
selector for what to export as pages.

Example:

#+TITLE: Index document

* Emacs notes
** Emacs blog post #1
:PROPERTIES:
:EXPORT_FILE_NAME: my-first-post
:END:
...
** Fleeting note at [2024-06-20 Thu 22:16]
Some notes, no need to export them.

* Personal notes
** Personal blog post #1
:PROPERTIES:
:EXPORT_FILE_NAME: private/personal-post-trial
:END:
...

>> 6. I can see people flipping between exporting the whole document and
>>    multipage document. We probably need some kind of easy switch in M-x
>>    org-export-dispatch to choose how to export.
>
> Sure, that is the disadvantage of my proposal to make everything a
> "multipage" document. Another disadvantage is that when the user
> chooses to open the final document or display it in a buffer the user
> can't choose whether to only open/display one page or every exported
> page. In most circumstances it should be advisable to just
> open/display the first page. We can also just add a switch between
> single-page and multipage, with multipage always just exporting to
> file, but that also has disadvantages.

What to open is a minor detail, really. It can be worked out any moment
we need to. The most sensible default, IMHO, it to open dired with the
containing directory with all the exported pages.

> As the code I proposed is encapsulated in the html backend and not
> spreading all over the place, I will now first go ahead to finalize
> the existing code to a fully working setup. ASFAICT adapting that to
> other needs shouldn't require a complete rewrite. And I might be
> around for a while ;-)

I advice against doing this.
While reading your code, I saw that you used some html-specific
functions for modifications in ox.el. If you start by modifying ox.el in
Org git repo directly, simply doing "make compile" will warn about
instances of using functions not defined in ox.el.
Another advantage of editing the ox.el and using Org repository is that
you can run "make test" any time and see if you managed to break Org :)

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
@ 2024-07-06  5:47 Pedro Andres Aranda Gutierrez
  2024-07-06  9:04 ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Pedro Andres Aranda Gutierrez @ 2024-07-06  5:47 UTC (permalink / raw)
  To: orm.finnendahl; +Cc: Ihor Radchenko, Org Mode List

[-- Attachment #1: Type: text/plain, Size: 586 bytes --]

Sorry for bumping in, I've been more off than on in the last couple of
weeks...
Just a stupid question: have you considered any marker to force a page
break?
That would make this functionality portable to other exporters like LaTeX,
where
you can force a page break with \clearpage or \cleardoublepage.

(Hopefully) my .2 cents, /PA

-- 
Fragen sind nicht da, um beantwortet zu werden,
Fragen sind da um gestellt zu werden
Georg Kreisler

Headaches with a Juju log:
unit-basic-16: 09:17:36 WARNING juju.worker.uniter.operation we should run
a leader-deposed hook here, but we can't yet

[-- Attachment #2: Type: text/html, Size: 947 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-06  5:47 Pedro Andres Aranda Gutierrez
@ 2024-07-06  9:04 ` Orm Finnendahl
  0 siblings, 0 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-06  9:04 UTC (permalink / raw)
  To: Org Mode List

Hi,

Am Samstag, den 06. Juli 2024 um 07:47:43 Uhr (+0200) schrieb Pedro Andres Aranda Gutierrez:
> Sorry for bumping in, I've been more off than on in the last couple of
> weeks...
> Just a stupid question: have you considered any marker to force a page
> break?
> That would make this functionality portable to other exporters like LaTeX,
> where
> you can force a page break with \clearpage or \cleardoublepage.

 although this is of course possible, currently I'm not planning to
implement it.

Regarding html export I see some problems with that idea:

1. It would either open a new can of worms if this page would be added
   to the toc with all sorts of ensuing problems like naming, etc. and
   getting out of sync with the Latex document's toc.

or

2. Those additinal pages don't get added to the toc and are only
   reachable by navigation elements, which I consider suboptimal (and
   you'd still have to name them).

In any case, currently I'm facing many problems concerning the
glorious hairy details and am glad if I can sort them out in a way
that they are general enough to be added to ox. Adding additional
engines to handle page breaks the way you envision should then be
feasible without reinventing the wheel.

--
Orm



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-04 16:20           ` Ihor Radchenko
@ 2024-07-07 19:33             ` Orm Finnendahl
  2024-07-08 15:29               ` Ihor Radchenko
  2024-07-07 20:50             ` Orm Finnendahl
  1 sibling, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-07 19:33 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Hi,

 this is a report of my current state with the html multipage export
backend: I finished most of the heavy lifting and am currently trying
to integrate it with the old backend into a single file.

For now I plan to use a custom menu-entry ('m') in the export dialog
rather than doing it with an option in the file. The main reason is
that I like to be able to switch between output formats easily without
having to change the document. But that's debatable. I could also
implement it with an option in the document and I'm open for opinions.

For the backend I'm planning to realize the following options
(implemented as custom variables, which can be overwritten in the
document):

- org-html-multipage-export-directory

  The directory for the exported files (relative or absolute).

- org-html-multipage-head

  (similar to HTML_HEAD but will be used instead of the HTML_HEAD for
  custom css/js)

- org-html-multipage-front-matter

  A list to specify pages in front of the headlines of the
  document. Possible values are 'title, 'title-toc and 'toc. title-toc
  is a combined page containing the title and the toc. Multiple
  entries are possible.

- org-html-multipage-join-first-subsection

  Boolean: Non-nil means that the first subsection of a section
  without a body will be joined on the section page (recursively). See
  my generated example pages linked below (Chapters 4, 5 and 7 for a
  recursive example)

- org-html-multipage-split

  How to split the document. Possible values are

  'toc for generating a page for each toc entry.
  
  'export-filename for splitting into pages along :EXPORT_FILENAME:
  properties. The autogenerated filename mechanism for the other
  options will be overwritten in this case.

  A number for the depth to split (similar to the value for h: or
  toc:) I haven't tested all options yet but will see whether/how it
  works.

- org-html-multipage-open

  Whether and where to open the first page of the document after
  export. Possible values are 'browser 'buffer or nil. (As Ihor
  mentioned this is a minor issue).

This is fairly straightforward for me to realize (it's mostly done
already). The suggestions of Ihor are excellent, but IIUC they
implement a larger and more general context, which of course is
desirable. I have to study the ideas more thoroughly to see, how
difficult/time consuming it will be to implement. It might be that it
is better to do it in two steps to keep it manageable for me. I'm
pretty sure that the current approach can be adapted to the larger
context easily so the work is not in vain.

In addition I have a question about the html output layout
structure. Here is an example of a file generated with the current
code with some preliminary layout. It might give an idea about my use
case:

https://www.selma.hfmdk-frankfurt.de/finnendahl/klangsynthesebuch/01_00_00_vorwort.html#orge24571b

Regardless of the colours, the file has a slightly different hierarchy
than the single page html template of ORGMODE and is more oriented
towards the layout of documentation nowadays with a (hideable) toc at
the side on every page rather than the texinfo oriented layout used by
the orgmode manual. If my code gets accepted/merged to org what should
be the default layout shipped with multipage output? FYI: The
visibility of the toc entries is managed by the css and the whole toc
is included on each page (and its visibility could be managed with js
as well). Should I rather go for the classic texinfo view?

And now just a short answer to Ihor's remarks.

Am Donnerstag, den 04. Juli 2024 um 16:20:29 Uhr (+0000) schrieb Ihor
Radchenko:
> While reading your code, I saw that you used some html-specific
> functions for modifications in ox.el. If you start by modifying ox.el in
> Org git repo directly, simply doing "make compile" will warn about
> instances of using functions not defined in ox.el.
> Another advantage of editing the ox.el and using Org repository is that
> you can run "make test" any time and see if you managed to break Org :)

Of course. I never intended to corrupt ox.el with html specific stuff,
that was just preliminary while getting acquainted with the
code. Currently I'm in the process of separating everything and
reducing it to the minimal requirements for change. I'll let you know
when it's done.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-04 16:20           ` Ihor Radchenko
  2024-07-07 19:33             ` Orm Finnendahl
@ 2024-07-07 20:50             ` Orm Finnendahl
  2024-07-08 15:05               ` Ihor Radchenko
  1 sibling, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-07 20:50 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Hi Ihor,

 I'm trying to grasp what you are proposing and have some questions to
make sure I've understood (please correct me if I'm wrong):

- Your idea is to add an option to the backend definition called
  org-export-pages which is a plist containing information about the
  way to export the document in case some "multipage" option is chosen
  in the export dialog.

- Am I right that you suggest that all these org-export-pages
  properties can be overwritten in the header of the org file?

- If that is correct I assume multipage export should then be a
  generic option common to different export backends (if defined)
  (something like "export-as-multipage") and the question is how to
  specify that when exporting. Should this option just be listed in
  the export dialog for every export backend which supports it (like
  in my current approach for html) and when choosing it the rules of
  the current definition of org-export-pages in the current context
  are used?

- This implies that the code handling this is done in ox.el like this:

  The export-pages function in ox.el
  
  1. generates the parse-tree
  
  2. extracts the subtrees according to the rules

  3. calls org-export-to-file on the backends for each of them.

  4. optionally also exports the whole document, maybe stripped from
     its exported sections (replaced by links, etc.)

If this is the way you suggest it, it doesn't sound too complicated as
most of it is done already.

My only concern is that in this case org-export-pages is not really
backend specific and therefore the place for it semantically shouldn't
be in the definition of the backend, but separate from it.

The backend should just define a general function for exporting a
subtree to a file for the multipage case as this might differ from the
definition for single file output of the complete parse-tree (with the
name of this general multipage export function being the same in all
backends which support multipage output).

This would also imply a mechanism to define different org-export-pages
plists and select from them before exporting by calling a generic
backend-agnostic org-export-to-pages function in ox.el. This is very
elegant but also somewhat different from the current layout of
org-export which is single-page single-backend centered. Hmm...

--
Orm


Am Donnerstag, den 04. Juli 2024 um 16:20:29 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> > Sure. I'm not at all familiar with the peculiarities of other output
> > backends, but see your point. If you can give any hints or have any
> > ideas *how* we could find general rules for separating the subtrees,
> > which cover foreseeable use cases, or devise a flexible mechanism for
> > doing so, I'd be glad to help setting them up and implementing them. I
> > definitely agree, the code should be as general as possible while
> > providing complete backward compatibility.
> 
> I think that the easiest would be adding a new option to
> `org-export-options-alist' - it is already extendable for individual
> backends and allows users to tweak things via in-buffer keywords,
> properties, variables, and export options.
> 
> The most generic rule would be some kind of function that takes AST
> node as input and returns whether that node should be going to a separate
> file or not, and if yes, tell (1) which export backend to use to export
> that subtree to a file (may as well allow exporting to different
> formats, while we are at it); (2) what are the export parameters to be
> used for that export, (possibly) including the file path.
> 
> Then, in addition to the most generic (and most flexible) "rule being an
> Elisp function", we can allow some simplified semantics to define rules.
> 
> The semantics should probably give a couple of toggles to customize:
> (1) which subtrees are selected for export; (2) which export backend is
> used (3) how their file names are generated; (4) (optional) how they are
> represented when exporting the whole original file; e.g. whether to put
> links to exported files in place of their subtrees; (5) (optional) how
> the original file is represented in the exported subtrees; e.g. whether
> to put backlink to parent file
> 
> The subtree selection may boil down to the usual TAGS matcher (or
> function), as described in "11.3.3 Matching tags and properties" section
> of the manual. This will cover the previously discussed separation based
> on headline level, a tag, or a property.
> 
> The export backend selection may be realized by allowing multiple rules
> with each rule defining selection/backend/file name/....
> 
> In terms of the value semantics in Elisp, I am thinking about something
> re-using backend definition format:
> 
> (setq org-export-pages
>       '(:selector "LEVEL=2+blog+TODO=DONE"
>         :backend html
>          ;; completely remove the exported subtree is original document
>          ;; is being exported.
>         :page-transcoder nil
>          ;; or :page-transcoder #'org-export-page-as-heading-with-link
>         :export-file-name "%{TITLE}-%{page-number}" ;; or some other kind of template syntax
>         )
> 
>        '(:selector a-function-accepting-ast-node
>          :source-backend any 
>          :backend
>          (:parent html ;; `org-export-define-derived-backend'-like semantics
>           :options-alist
>           ;; Do not export private headings in HTML pages.
>           ((:exclude-tags "EXCLUDE_TAGS" nil (cons "private" org-export-exclude-tags) split))))
> 
>         '(:selector "+export_ascii_page"
>           :source-backend html ; only use this rule when exporting to html
>           :backend
>           (:parent ascii
>            ((template .
>               (lambda (contents info)
>                 (format "Paged out from %s\n%s"
>                    (plist-get
>                      ;; INFO channel for parent document
>                      (plist-get info :page-source)
>                      :title)
>                    (org-ascii-template contents info)))))))))
> 
> >> 2. Some backends, as you proposed, may target multipage export from the
> >>    very beginning. So, we need to provide some way for the backend (in
> >>    org-export-define*-backend) to specify that it wants to split the
> >>    original parse tree. I imagine some kind of option with default
> >>    values configured via backend, but optionally overwritten by user
> >>    settings/in-buffer keywords.
> >
> > I'll look into that and maybe I can come up with something. I was
> > hesitant to propose anything as I tried to stay as limited as possible
> > and not get too deep into changing things. If you have suggestions,
> > please let me know.
> 
> One way could be simply adding an option like :selector above to the
> backend definition. Then, it will be used as default selector:
> 
> (setq org-export-pages
>   (:selector default :backend html) ; export pages to html with default selector
> )
> 
> or even
> 
> (setq org-export-pages
>   (:backend html) ; export pages to html with default selector
> )
> 
> or just
> 
> ;; export using the same target backend as selected in the export menu
> (setq org-export-pages t)
> ;; (setq org-export-pages nil) - existing single page export
> ;; (setq org-export-pages 'only-pages) - only export pages, ignore original file
> 
> >> 3. Your suggestion to add a new export option for splitting based on
> >>    headline level is one idea.
> >> 
> >>    Another idea is to split out subtrees with :EXPORT_FILE_NAME:
> >>    property.
> >
> > I'm not sure I fully understand what you mean: Do you mean specifying
> > different :EXPORT_FILE_NAME: properties throughout the same document
> > and then export accordingly?
> 
> Yes. It is re-using the existing idea with subtree export
> 
> 13.2 Export Settings
> 
> ‘EXPORT_FILE_NAME’
>      The name of the output file to be generated.  Otherwise, Org
>      generates the file name based on the buffer name and the extension
>      based on the backend format.
> 
> If a subtree has that property set, it is used as output file name.
> Since there is usually no reason to set this property unless you also
> want to export subtree to individual file, it makes sense to use this as
> selector for what to export as pages.
> 
> Example:
> 
> #+TITLE: Index document
> 
> * Emacs notes
> ** Emacs blog post #1
> :PROPERTIES:
> :EXPORT_FILE_NAME: my-first-post
> :END:
> ...
> ** Fleeting note at [2024-06-20 Thu 22:16]
> Some notes, no need to export them.
> 
> * Personal notes
> ** Personal blog post #1
> :PROPERTIES:
> :EXPORT_FILE_NAME: private/personal-post-trial
> :END:
> ...
> 
> >> 6. I can see people flipping between exporting the whole document and
> >>    multipage document. We probably need some kind of easy switch in M-x
> >>    org-export-dispatch to choose how to export.
> >
> > Sure, that is the disadvantage of my proposal to make everything a
> > "multipage" document. Another disadvantage is that when the user
> > chooses to open the final document or display it in a buffer the user
> > can't choose whether to only open/display one page or every exported
> > page. In most circumstances it should be advisable to just
> > open/display the first page. We can also just add a switch between
> > single-page and multipage, with multipage always just exporting to
> > file, but that also has disadvantages.
> 
> What to open is a minor detail, really. It can be worked out any moment
> we need to. The most sensible default, IMHO, it to open dired with the
> containing directory with all the exported pages.
> 
> > As the code I proposed is encapsulated in the html backend and not
> > spreading all over the place, I will now first go ahead to finalize
> > the existing code to a fully working setup. ASFAICT adapting that to
> > other needs shouldn't require a complete rewrite. And I might be
> > around for a while ;-)
> 
> I advice against doing this.
> While reading your code, I saw that you used some html-specific
> functions for modifications in ox.el. If you start by modifying ox.el in
> Org git repo directly, simply doing "make compile" will warn about
> instances of using functions not defined in ox.el.
> Another advantage of editing the ox.el and using Org repository is that
> you can run "make test" any time and see if you managed to break Org :)
> 
> -- 
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-07 20:50             ` Orm Finnendahl
@ 2024-07-08 15:05               ` Ihor Radchenko
  2024-07-08 15:41                 ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-08 15:05 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>  I'm trying to grasp what you are proposing and have some questions to
> make sure I've understood (please correct me if I'm wrong):

(Just for some context, do not take my ideas as something you must
follow 100% accurately. I am largely brainstorming here. So, feel free
to disagree, propose anything alternative, etc; My main focus in this
discussion is that multipage export should be backend-agnostic if
possible)

> - Your idea is to add an option to the backend definition called
>   org-export-pages which is a plist containing information about the
>   way to export the document in case some "multipage" option is chosen
>   in the export dialog.

Yup. Not an "option" in a sense of variable, but a proper export option
that can be set via (1) variable; (2) backend option plist (in other
words, overridden by backends); (3) in-buffer keyword, locally.

> - Am I right that you suggest that all these org-export-pages
>   properties can be overwritten in the header of the org file?

Yes. But that may be controlled by the backends, as with any other
export option. To illustrate, there is CREATOR option that ox-html
re-defines like the following:

;; Original global definition in ox.el
    (:creator "CREATOR" nil org-export-creator-string)

;; Override inside ox.el.  In this example, it uses a backend-specific
;; customization instead of `org-export-creator-string', but anything
;; at all can be overridden.
    (:creator "CREATOR" nil org-html-creator-string)

In both cases, the :creator export option can be set in buffer via,
#+CREATOR: name

> - If that is correct I assume multipage export should then be a
>   generic option common to different export backends (if defined)
>   (something like "export-as-multipage") and the question is how to
>   specify that when exporting. Should this option just be listed in
>   the export dialog for every export backend which supports it (like
>   in my current approach for html) and when choosing it the rules of
>   the current definition of org-export-pages in the current context
>   are used?

Yes. Something similar to `org-export-visible-only',
`org-export-body-only', etc. These customizations can be toggled
interactively, from `org-export-dispatch'.

A question for future is whether we want more than just "t" or "nil"
toggle, but it should not be too hard to generalize if we simply start
from just t/nil.

We might also consider adding MULTIPAGE as an additional argument to the
API function (just like BODY-ONLY, VISIBLE-ONLY, SUBTREEP that we
already use), but that's probably an implementation idea we may or may
not need to use.

> - This implies that the code handling this is done in ox.el like this:
>
>   The export-pages function in ox.el
>   
>   1. generates the parse-tree
>   
>   2. extracts the subtrees according to the rules
>
>   3. calls org-export-to-file on the backends for each of them.
>
>   4. optionally also exports the whole document, maybe stripped from
>      its exported sections (replaced by links, etc.)
>
> If this is the way you suggest it, it doesn't sound too complicated as
> most of it is done already.

Yes, roughly like this.
Ideally, we should simply modify `org-export-as', but handling output
file name may be a bit tricky - it is somewhat awkwardly placed in the
current ox.el API (see the discussion in https://list.orgmode.org/orgmode/25393.61240.135445.401251@gargle.gargle.HOWL/T/#u).

> My only concern is that in this case org-export-pages is not really
> backend specific and therefore the place for it semantically shouldn't
> be in the definition of the backend, but separate from it.

I guess that backends may provide some defaults that make more sense for
those backends only. But otherwise splitting the full AST before
individual page export might be simply handled in ox.el.

> The backend should just define a general function for exporting a
> subtree to a file for the multipage case as this might differ from the
> definition for single file output of the complete parse-tree (with the
> name of this general multipage export function being the same in all
> backends which support multipage output).

All the built-in backends already have such function. For example,

(defun org-html-export-to-html
    (&optional async subtreep visible-only body-only ext-plist)
                     ^^^^^^^^

If subtree export is good enough to handle multi-page export, we may not
even need to do much. (Although, deriving the file names is currently
hard-coded for subtrees and is not very customizable; see the link I
shared above)

> This would also imply a mechanism to define different org-export-pages
> plists and select from them before exporting by calling a generic
> backend-agnostic org-export-to-pages function in ox.el. This is very
> elegant but also somewhat different from the current layout of
> org-export which is single-page single-backend centered. Hmm...

I do not think that we need to go too deep into this rabbit hole for
now. A simple toggle based on `org-export-dispatch' might be good
enough. It can be easily extended to something like multi-state switch
(t/nil vs. t -> option A -> option B -> nil -> t -> ...).

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-07 19:33             ` Orm Finnendahl
@ 2024-07-08 15:29               ` Ihor Radchenko
  2024-07-08 19:12                 ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-08 15:29 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> For the backend I'm planning to realize the following options
> (implemented as custom variables, which can be overwritten in the
> document):
>
> - org-html-multipage-export-directory
>
>   The directory for the exported files (relative or absolute).

I am wondering about the reasoning behind not re-using
#+EXPORT_FILE_NAME: here (its directory part) and simply defaulting to
 `default-directory'.

Is there any situation when you need to export the full document
vs. multipage to different places?

> - org-html-multipage-head
>
>   (similar to HTML_HEAD but will be used instead of the HTML_HEAD for
>   custom css/js)

Again, why not directly using #+HTML_HEAD?

> - org-html-multipage-front-matter
>
>   A list to specify pages in front of the headlines of the
>   document. Possible values are 'title, 'title-toc and 'toc. title-toc
>   is a combined page containing the title and the toc. Multiple
>   entries are possible.

This sounds orthogonal to multipage export. May you please illustrate
what you want to achieve by introducing this option? Maybe there is an
existing feature that can be re-used instead of creating something new?

> - org-html-multipage-join-first-subsection
>
>   Boolean: Non-nil means that the first subsection of a section
>   without a body will be joined on the section page (recursively). See
>   my generated example pages linked below (Chapters 4, 5 and 7 for a
>   recursive example)

Sorry, but I cannot understand anything from there. May you explain in
words?

> - org-html-multipage-split
>
>   How to split the document. Possible values are
>
>   'toc for generating a page for each toc entry.

May I guess that the previous option may have something do with
situation when #+TOC: keyword is in the middle of a text?
   
> In addition I have a question about the html output layout
> structure. Here is an example of a file generated with the current
> code with some preliminary layout. It might give an idea about my use
> case:
>
> https://www.selma.hfmdk-frankfurt.de/finnendahl/klangsynthesebuch/01_00_00_vorwort.html#orge24571b
>
> Regardless of the colours, the file has a slightly different hierarchy
> than the single page html template of ORGMODE and is more oriented
> towards the layout of documentation nowadays with a (hideable) toc at
> the side on every page rather than the texinfo oriented layout used by
> the orgmode manual. If my code gets accepted/merged to org what should
> be the default layout shipped with multipage output? FYI: The
> visibility of the toc entries is managed by the css and the whole toc
> is included on each page (and its visibility could be managed with js
> as well). Should I rather go for the classic texinfo view?

Do I understand correctly that your alternative layout is simply a
question of custom #+HTML_HEADER? Or is there something more to it?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-08 15:05               ` Ihor Radchenko
@ 2024-07-08 15:41                 ` Orm Finnendahl
  2024-07-08 15:56                   ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-08 15:41 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

Am Montag, den 08. Juli 2024 um 15:05:58 Uhr (+0000) schrieb Ihor Radchenko:
> 
> We might also consider adding MULTIPAGE as an additional argument to the
> API function (just like BODY-ONLY, VISIBLE-ONLY, SUBTREEP that we
> already use), but that's probably an implementation idea we may or may
> not need to use.

Currently I set the :multipage property in info, but that's a detail
that can be sorted out later.

> Yes, roughly like this.  Ideally, we should simply modify
> `org-export-as', but handling output file name may be a bit tricky -
> it is somewhat awkwardly placed in the current ox.el API (see the
> discussion in
> https://list.orgmode.org/orgmode/25393.61240.135445.401251@gargle.gargle.HOWL/T/#u).

Today I had a look at ox.el when upgrading my code to
9.8-pre. Unfortunately the code (and behaviour of org-element, etc.)
has changed quite a bit and I had to fix many things.

Especially in org-export-as the parsing of the tree is now done in the
lexical context of a copy of the buffer which makes implementing a
multipage backend even more awkward.

IMHO the code is just the wrong way around: org-export-to-file calls
org-export-as which combines the parsing with generating the output
string. The multipage code has to split that part and that doesn't get
easier when both parts have to be evaluated in the context of
org-export-with-buffer-copy. I'd rather have that turned inside out:
Instead of org-export-as being a part of
org-export-to-file/buffer/etc., its functionality could be at the
top-level and then call org-export-to... appropriately (either for
multipage output, single-page output, buffer-output...). I will handle
it by splitting org-export-as just before the
org-export-with-buffer-copy, but consider it a bit ugly.

> I do not think that we need to go too deep into this rabbit hole for
> now. A simple toggle based on `org-export-dispatch' might be good
> enough. It can be easily extended to something like multi-state switch
> (t/nil vs. t -> option A -> option B -> nil -> t -> ...).

There is something else: A lot of my energy in the multipage backend
went into getting links and footnotes correct. Footnotes aren't a big
deal, but I have no idea how to handle cross document links if
different backends are present (e.g. linking from html to a pdf
document and vice versa ;-) I think this requires quite a bit more
thinking and maybe is unrealistic altogether, but at least the
framework could be changed to be able to tackle that in the distant
future...

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-08 15:41                 ` Orm Finnendahl
@ 2024-07-08 15:56                   ` Ihor Radchenko
  2024-07-08 19:18                     ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-08 15:56 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> Yes, roughly like this.  Ideally, we should simply modify
>> `org-export-as', but handling output file name may be a bit tricky -
>> it is somewhat awkwardly placed in the current ox.el API (see the
>> discussion in
>> https://list.orgmode.org/orgmode/25393.61240.135445.401251@gargle.gargle.HOWL/T/#u).
>
> Today I had a look at ox.el when upgrading my code to
> 9.8-pre. Unfortunately the code (and behaviour of org-element, etc.)
> has changed quite a bit and I had to fix many things.
>
> Especially in org-export-as the parsing of the tree is now done in the
> lexical context of a copy of the buffer which makes implementing a
> multipage backend even more awkward.
>
> IMHO the code is just the wrong way around: org-export-to-file calls
> org-export-as which combines the parsing with generating the output
> string. The multipage code has to split that part and that doesn't get
> easier when both parts have to be evaluated in the context of
> org-export-with-buffer-copy. I'd rather have that turned inside out:
> Instead of org-export-as being a part of
> org-export-to-file/buffer/etc., its functionality could be at the
> top-level and then call org-export-to... appropriately (either for
> multipage output, single-page output, buffer-output...). I will handle
> it by splitting org-export-as just before the
> org-export-with-buffer-copy, but consider it a bit ugly.

Or we can make `org-export-as' retain INFO channel when returning the
output. Then, we can make `org-export-to-file' make use of the INFO
channel to decide the file name. This way, there will be no need to
decide the file name before running the parsing.

> There is something else: A lot of my energy in the multipage backend
> went into getting links and footnotes correct. Footnotes aren't a big
> deal, but I have no idea how to handle cross document links if
> different backends are present (e.g. linking from html to a pdf
> document and vice versa ;-) I think this requires quite a bit more
> thinking and maybe is unrealistic altogether, but at least the
> framework could be changed to be able to tackle that in the distant
> future...

Yes, it is an important feature we would need to implement - turning
internal links into external when they no longer point inside the same
document.

Somewhat relevant code: `org-export--update-included-link' and ox-publish.

For links to external pdfs and co, we have discussed what can be done in
https://list.orgmode.org/orgmode/87a5rpoi4c.fsf@localhost/
TL;DR: In latex, \href{file.pdf#anchor} works; In web, anchors should
also work with pdfjs.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-08 15:29               ` Ihor Radchenko
@ 2024-07-08 19:12                 ` Orm Finnendahl
  2024-07-09 17:55                   ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-08 19:12 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Am Montag, den 08. Juli 2024 um 15:29:47 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> > For the backend I'm planning to realize the following options
> > (implemented as custom variables, which can be overwritten in the
> > document):
> >
> > - org-html-multipage-export-directory
> >
> >   The directory for the exported files (relative or absolute).
> 
> I am wondering about the reasoning behind not re-using
> #+EXPORT_FILE_NAME: here (its directory part) and simply defaulting to
>  `default-directory'.
> 
> Is there any situation when you need to export the full document
> vs. multipage to different places?

Actually that is what I'm currently doing (and what I need for my
publishing chain): The single-page document is not in the html folder
used for the multipage document. Both files happen to have the same
name so it wouldn't work out, if I want to generate single-page along
the multipage version, without having to change the document.

> > - org-html-multipage-head
> >
> >   (similar to HTML_HEAD but will be used instead of the HTML_HEAD for
> >   custom css/js)
> 
> Again, why not directly using #+HTML_HEAD?

Same as above: My multipage has a completely different css and js and
I think this is unavoidable. All this is just for being able to do
both exports without interfering.

> > - org-html-multipage-front-matter
> >
> >   A list to specify pages in front of the headlines of the
> >   document. Possible values are 'title, 'title-toc and 'toc. title-toc
> >   is a combined page containing the title and the toc. Multiple
> >   entries are possible.
> 
> This sounds orthogonal to multipage export. May you please illustrate
> what you want to achieve by introducing this option? Maybe there is an
> existing feature that can be re-used instead of creating something new?

Could be: The toc as a first page is needed, when you don't want a toc
on the side of each html page, e.g. when using the classical info
layout. And it might be necessary to be able to distinguish between a
separate title page with author and the toc on the next page (or a
combined page with title and toc or no front matter at all because the
title appears on every page). If this is possible with already
existing options, even better. I just think that it might be necessary
to be able to distinguish between the needs for html output format
vs. the needs for LaTex or single-page output without having to edit
the document (I need that as my publishing chain is going to export
info, html multipage, pdf output and html single-page output using the
same org file).

> > - org-html-multipage-join-first-subsection
> >
> >   Boolean: Non-nil means that the first subsection of a section
> >   without a body will be joined on the section page
> >   (recursively). See my generated example pages linked below
> >   (Chapters 4, 5 and 7 for a recursive example)
> 
> Sorry, but I cannot understand anything from there. May you explain in
> words?

Consider a case like this:

* Headline 1
** Headline 2
*** Headline 3
    Text for Headline 3

Without the above option, Headline 1, Headline 2 and Headline 3 would
be on separate pages with Headline 1 and Headline 2 being empty pages
with just the Headline. The option puts all three Headlines and the
Contents of Headline 3 on the same page. See here:

https://www.selma.hfmdk-frankfurt.de/finnendahl/klangsynthesebuch

Chapters 4, 4.8, 5, 5.4 and 6 (two Headline levels combined) and
Chapter 7 (three Headline levels combined) are examples of joined
headlines and the other (sub)chapters are examples, how Chapters
containing body text are handled. It's mainly a matter of style but in
some situations it doesn't make much sense to me to add content below
a headline just to avoid an empty page in multipage html output.

> > - org-html-multipage-split
> >
> >   How to split the document. Possible values are
> >
> >   'toc for generating a page for each toc entry.
> 
> May I guess that the previous option may have something do with
> situation when #+TOC: keyword is in the middle of a text?

No: In the online document of the link above the page splitting
follows the toc (with the exception of the page joining explained
above), meaning that each visible toc entry will generate one page. Be
aware that this is not obvious on the online page as subfolders are
folded automatically using the css (folded elements have the class
"toc-hidden"). If you look at the html page source you can see that
every page contains the full toc to enable other css or js based
styling decisions.

> Do I understand correctly that your alternative layout is simply a
> question of custom #+HTML_HEADER? Or is there something more to it?

In my layout the main difference is that the nav left and nav right
elements are part of the page-main-body rather than part of
<content>. I'm not positive this is elegantly manageable with css,
when the navigation is outside the page-main-body.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-08 15:56                   ` Ihor Radchenko
@ 2024-07-08 19:18                     ` Orm Finnendahl
  2024-07-09 18:08                       ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-08 19:18 UTC (permalink / raw)
  To: emacs-orgmode

Hi Ihor,

Am Montag, den 08. Juli 2024 um 15:56:48 Uhr (+0000) schrieb Ihor
Radchenko:
> 
> Or we can make `org-export-as' retain INFO channel when returning the
> output. Then, we can make `org-export-to-file' make use of the INFO
> channel to decide the file name. This way, there will be no need to
> decide the file name before running the parsing.

Are you sure that works? org-export-as currently returns a string. It
could in addition return the parse-tree in info, plus the smaller
parts which need to be exported, but we should not forget, that
org-export-as is an inferior function called from org-export-to-file
or org-export-to-buffer. But maybe I misunderstand what you mean.

Here is what is needed from my perspective:

1. parse the tree of the whole document

2. split the tree up.

3. call the export backend on each of the split parts to generate the
   string and save it to disk or do whatever is appropriate.

For me the most natural way would be that a central function
(export-according-to-org-property-list) does the parsing and then call
the different backend functions to export according to their rules
(the trees being converted in the central function or in backend
code).

If toplevel functions like org-export-to-file use org-export-as, than
org-export-as should only be concerned with generating the string but
not with reparsing.

Alternatively we can do the conversion to a string in the central
function as now with org-export-as, but there still needs to be a
mechanism to generate the different files for multipage output and
call the export backend on them to save them or whatever. Or what did
you have in mind?

> For links to external pdfs and co, we have discussed what can be done in
> https://list.orgmode.org/orgmode/87a5rpoi4c.fsf@localhost/
> TL;DR: In latex, \href{file.pdf#anchor} works; In web, anchors should
> also work with pdfjs.

Thanks, I'll check that out.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-08 19:12                 ` Orm Finnendahl
@ 2024-07-09 17:55                   ` Ihor Radchenko
  2024-07-10 18:03                     ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-09 17:55 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> Is there any situation when you need to export the full document
>> vs. multipage to different places?
>
> Actually that is what I'm currently doing (and what I need for my
> publishing chain): The single-page document is not in the html folder
> used for the multipage document. Both files happen to have the same
> name so it wouldn't work out, if I want to generate single-page along
> the multipage version, without having to change the document.

If this is the case, users may potentially need similar diverging
settings for single- vs. multi- page documents for almost any given
export option, not just the ones you mentioned.

To address such situations, we may, for example, allow an alternative
"multi" version of each export keyword to act specially when multipage
export is used. Consider that there is an export option #+SAMPLEOPTION.
If the document has only "#+SAMPLEOPTION: value", exporter will use it
for both normal and multipage export. However, we may allow an
alternative #+SAMPLEOPTION[multipage]: multipage value that will be used
instead when defined.

In addition to defining alternative variants of in-buffer settings, we
also need to provide the equivalent feature for custom variables
defining the export options. We can do it by treating the value of such
export-related variables specially - we may allow special values like
[org-export-variants :default default-value :multipage multipage-value]
and provide helper functions like

(org-export-set-option option-name  value) ; :default
(org-export-set-option option-name :multipage value) ; for multipage export only
(org-export-set-option option-name :singlepage value) ; just for singlepage export

(Or can be some other consistent way to define alternatives; feel free
to brainstorm)

>> > - org-html-multipage-front-matter
>> >
>> >   A list to specify pages in front of the headlines of the
>> >   document. Possible values are 'title, 'title-toc and 'toc. title-toc
>> >   is a combined page containing the title and the toc. Multiple
>> >   entries are possible.
>> 
>> This sounds orthogonal to multipage export. May you please illustrate
>> what you want to achieve by introducing this option? Maybe there is an
>> existing feature that can be re-used instead of creating something new?
>
> Could be: The toc as a first page is needed, when you don't want a toc
> on the side of each html page, e.g. when using the classical info
> layout. And it might be necessary to be able to distinguish between a
> separate title page with author and the toc on the next page (or a
> combined page with title and toc or no front matter at all because the
> title appears on every page). If this is possible with already
> existing options, even better. I just think that it might be necessary
> to be able to distinguish between the needs for html output format
> vs. the needs for LaTex or single-page output without having to edit
> the document (I need that as my publishing chain is going to export
> info, html multipage, pdf output and html single-page output using the
> same org file).

Sorry, but I still do not quite understand. May you please illustrate a
bit more with some kind of simple example?

>> > - org-html-multipage-join-first-subsection
>> >
>> >   Boolean: Non-nil means that the first subsection of a section
>> >   without a body will be joined on the section page
>> >   (recursively). See my generated example pages linked below
>> >   (Chapters 4, 5 and 7 for a recursive example)
>> 
>> Sorry, but I cannot understand anything from there. May you explain in
>> words?
>
> Consider a case like this:
>
> * Headline 1
> ** Headline 2
> *** Headline 3
>     Text for Headline 3
>
> Without the above option, Headline 1, Headline 2 and Headline 3 would
> be on separate pages with Headline 1 and Headline 2 being empty pages
> with just the Headline. The option puts all three Headlines and the
> Contents of Headline 3 on the same page. See here:

I see. It sounds useful given that your strategy to split the document
into pages is "on each headline on each level".

Conceptually, I see this as one of possible customizations for paging
strategies. Your `org-html-multipage-join-first-subsection' simply tells
to split off pages only when there is non-empty contents inside the
containing headings.

This also reveals that we may sometimes want more than just to tell how
to split the document. After splitting, we may want to rearrange the
pages differently (maybe even re-order?). In other words, multipage
export may need to:

1. Take document AST
2. Split it into multiple parts
3. Filter the obtained part list (post-process)
4. Perform actual per-page export
...

>> > - org-html-multipage-split
>> >
>> >   How to split the document. Possible values are
>> >
>> >   'toc for generating a page for each toc entry.
>> 
>> May I guess that the previous option may have something do with
>> situation when #+TOC: keyword is in the middle of a text?
>
> No: In the online document of the link above the page splitting
> follows the toc (with the exception of the page joining explained
> above), meaning that each visible toc entry will generate one page. Be
> aware that this is not obvious on the online page as subfolders are
> folded automatically using the css (folded elements have the class
> "toc-hidden"). If you look at the html page source you can see that
> every page contains the full toc to enable other css or js based
> styling decisions.

Sounds reasonable. I guess that the docstring can be improved :)

>> Do I understand correctly that your alternative layout is simply a
>> question of custom #+HTML_HEADER? Or is there something more to it?
>
> In my layout the main difference is that the nav left and nav right
> elements are part of the page-main-body rather than part of
> <content>. I'm not positive this is elegantly manageable with css,
> when the navigation is outside the page-main-body.

Sorry, but I am lost. What do you mean by "content" and what do you mean
by "page-main-body"?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-08 19:18                     ` Orm Finnendahl
@ 2024-07-09 18:08                       ` Ihor Radchenko
  2024-07-10 19:37                         ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-09 18:08 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> Or we can make `org-export-as' retain INFO channel when returning the
>> output. Then, we can make `org-export-to-file' make use of the INFO
>> channel to decide the file name. This way, there will be no need to
>> decide the file name before running the parsing.
>
> Are you sure that works? org-export-as currently returns a string. It
> could in addition return the parse-tree in info, plus the smaller
> parts which need to be exported, but we should not forget, that
> org-export-as is an inferior function called from org-export-to-file
> or org-export-to-buffer. But maybe I misunderstand what you mean.

That's exactly what I mean.

> Here is what is needed from my perspective:
>
> 1. parse the tree of the whole document
>
> 2. split the tree up.
>
> 3. call the export backend on each of the split parts to generate the
>    string and save it to disk or do whatever is appropriate.
>
> For me the most natural way would be that a central function
> (export-according-to-org-property-list) does the parsing and then call
> the different backend functions to export according to their rules
> (the trees being converted in the central function or in backend
> code).
>
> If toplevel functions like org-export-to-file use org-export-as, than
> org-export-as should only be concerned with generating the string but
> not with reparsing.

Sorry, but I do not understand your concern.

> Alternatively we can do the conversion to a string in the central
> function as now with org-export-as, but there still needs to be a
> mechanism to generate the different files for multipage output and
> call the export backend on them to save them or whatever. Or what did
> you have in mind?

What I have in mind is that `org-export-as' will return a list of
strings + INFO. INFO will contain data about which files to use for
saving the strings. Then, the caller does the saving and whatever is
necessary. If we write to files from `org-export-as' it will be a
massive breaking change in the expected behavior.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-09 17:55                   ` Ihor Radchenko
@ 2024-07-10 18:03                     ` Orm Finnendahl
  2024-07-10 18:53                       ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-10 18:03 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Am Dienstag, den 09. Juli 2024 um 17:55:51 Uhr (+0000) schrieb Ihor Radchenko:
> 
> To address such situations, we may, for example, allow an alternative
> "multi" version of each export keyword to act specially when multipage
> export is used. Consider that there is an export option #+SAMPLEOPTION.
> If the document has only "#+SAMPLEOPTION: value", exporter will use it
> for both normal and multipage export. However, we may allow an
> alternative #+SAMPLEOPTION[multipage]: multipage value that will be used
> instead when defined.
> 
> In addition to defining alternative variants of in-buffer settings, we
> also need to provide the equivalent feature for custom variables
> defining the export options. We can do it by treating the value of such
> export-related variables specially - we may allow special values like
> [org-export-variants :default default-value :multipage multipage-value]
> and provide helper functions like
> 
> (org-export-set-option option-name  value) ; :default
> (org-export-set-option option-name :multipage value) ; for multipage export only
> (org-export-set-option option-name :singlepage value) ; just for singlepage export
> 
> (Or can be some other consistent way to define alternatives; feel free
> to brainstorm)

Yes. Currently I' more concerned with the architecural
layout. Everything else is a matter of taste and easily configured
(and hopefully agreed upon) once the structure is done. I'm very
relaxed and unopinionated about how to handle options as long as they
don't involve changing the org document for each export backend.

> >> > - org-html-multipage-front-matter
> >> >
> >> >   A list to specify pages in front of the headlines of the
> >> >   document. Possible values are 'title, 'title-toc and 'toc. title-toc
> >> >   is a combined page containing the title and the toc. Multiple
> >> >   entries are possible.
> >> 
> >> This sounds orthogonal to multipage export. May you please illustrate
> >> what you want to achieve by introducing this option? Maybe there is an
> >> existing feature that can be re-used instead of creating something new?
> >
> > Could be: The toc as a first page is needed, when you don't want a toc
> > on the side of each html page, e.g. when using the classical info
> > layout. And it might be necessary to be able to distinguish between a
> > separate title page with author and the toc on the next page (or a
> > combined page with title and toc or no front matter at all because the
> > title appears on every page). If this is possible with already
> > existing options, even better. I just think that it might be necessary
> > to be able to distinguish between the needs for html output format
> > vs. the needs for LaTex or single-page output without having to edit
> > the document (I need that as my publishing chain is going to export
> > info, html multipage, pdf output and html single-page output using the
> > same org file).
> 
> Sorry, but I still do not quite understand. May you please illustrate a
> bit more with some kind of simple example?

Consider a doucument like this:

** Headline 1
** Headline 2
*** Subheadline 2.1

In the multipage export you want a front page with booktitle, author,
date, etc. (maybe even an image...) and as a second page after the
front page you want to have a full toc. Both pages should be reachable
by the side toc but shouldn't get numbered so the toc on the side
would appear like this:

My Booktitle
Contents
1 Headline 1
2 Headline 1
  2.1 Subheadline 1

On the other hand you might always print the booktitle on every page
and as the toc is always at the side you might not need titlepage and
toc as seperate pages.

Or you like the layout of the info mode with just navigation buttons
and no side toc. In these documents, the toc is normally on the first
"home" page. This would also imply a seperate html page with a toc and
possibly the title on it. As there are always different preferences
for this I thought to introduce a list which specifies, what king of
documents should appear at the front of the document which aren't
counted in the toc. All these are -in my opinion- legitimate decisions
not at all unusual in publication situations so I thought I accomodate
for that. Is my explanation somewhat clearer?

> 
> >> > - org-html-multipage-join-first-subsection
> >> >
> >> >   Boolean: Non-nil means that the first subsection of a section
> >> >   without a body will be joined on the section page
> >> >   (recursively). See my generated example pages linked below
> >> >   (Chapters 4, 5 and 7 for a recursive example)
> >> 
> >> Sorry, but I cannot understand anything from there. May you explain in
> >> words?
> >
> > Consider a case like this:
> >
> > * Headline 1
> > ** Headline 2
> > *** Headline 3
> >     Text for Headline 3
> >
> > Without the above option, Headline 1, Headline 2 and Headline 3 would
> > be on separate pages with Headline 1 and Headline 2 being empty pages
> > with just the Headline. The option puts all three Headlines and the
> > Contents of Headline 3 on the same page. See here:
> 
> I see. It sounds useful given that your strategy to split the document
> into pages is "on each headline on each level".
> 
> Conceptually, I see this as one of possible customizations for paging
> strategies. Your `org-html-multipage-join-first-subsection' simply tells
> to split off pages only when there is non-empty contents inside the
> containing headings.
> 
> This also reveals that we may sometimes want more than just to tell how
> to split the document. After splitting, we may want to rearrange the
> pages differently (maybe even re-order?). In other words, multipage
> export may need to:
> 
> 1. Take document AST
> 2. Split it into multiple parts
> 3. Filter the obtained part list (post-process)
> 4. Perform actual per-page export
> ...

yes. we can build a complete machinery around all that, but currently
I fear that this gets a bit out of control for me: I really have to
get going with other things and currently I'd prefer to realize
something that works with the architecture built in way that it is
easlily extendable in the future without having to redo everything
again.

> 
> >> > - org-html-multipage-split
> >> >
> >> >   How to split the document. Possible values are
> >> >
> >> >   'toc for generating a page for each toc entry.
> >> 
> >> May I guess that the previous option may have something do with
> >> situation when #+TOC: keyword is in the middle of a text?
> >
> > No: In the online document of the link above the page splitting
> > follows the toc (with the exception of the page joining explained
> > above), meaning that each visible toc entry will generate one page. Be
> > aware that this is not obvious on the online page as subfolders are
> > folded automatically using the css (folded elements have the class
> > "toc-hidden"). If you look at the html page source you can see that
> > every page contains the full toc to enable other css or js based
> > styling decisions.
> 
> Sounds reasonable. I guess that the docstring can be improved :)
>

:-)

> >> Do I understand correctly that your alternative layout is simply a
> >> question of custom #+HTML_HEADER? Or is there something more to it?
> >
> > In my layout the main difference is that the nav left and nav right
> > elements are part of the page-main-body rather than part of
> > <content>. I'm not positive this is elegantly manageable with css,
> > when the navigation is outside the page-main-body.
> 
> Sorry, but I am lost. What do you mean by "content" and what do you mean
> by "page-main-body"?

Look at the tags and ids of the html code on these pages.


--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-10 18:03                     ` Orm Finnendahl
@ 2024-07-10 18:53                       ` Ihor Radchenko
  0 siblings, 0 replies; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-10 18:53 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> >> > - org-html-multipage-front-matter
>> >> >
>> >> >   A list to specify pages in front of the headlines of the
>> >> >   document. Possible values are 'title, 'title-toc and 'toc. title-toc
>> >> >   is a combined page containing the title and the toc. Multiple
>> >> >   entries are possible.
> ...
> Consider a doucument like this:
>
> ** Headline 1
> ** Headline 2
> *** Subheadline 2.1
>
> In the multipage export you want a front page with booktitle, author,
> date, etc. (maybe even an image...) and as a second page after the
> front page you want to have a full toc. Both pages should be reachable
> by the side toc but shouldn't get numbered so the toc on the side
> would appear like this:
>
> My Booktitle
> Contents
> 1 Headline 1
> 2 Headline 1
>   2.1 Subheadline 1
> ...Is my explanation somewhat clearer?

Yup. Clear now.
In the nutshell, you want

1. Special export settings for certain pages (inline toc vs. side toc)
2. Extend TOC generation rules to be more automatic than they are now
   (insert TOC inline automatically for certain headings vs side TOC for
   the rest)

Looks doable using the available means + previously discussed multipage
setting ideas.

>> 1. Take document AST
>> 2. Split it into multiple parts
>> 3. Filter the obtained part list (post-process)
>> 4. Perform actual per-page export
>> ...
>
> yes. we can build a complete machinery around all that, but currently
> I fear that this gets a bit out of control for me: I really have to
> get going with other things and currently I'd prefer to realize
> something that works with the architecture built in way that it is
> easlily extendable in the future without having to redo everything
> again.

Sure. Feel free to do things that work better for you. Just keep in mind
the ideas we discuss - we will eventually need to get them going and the
preliminary implementation should not cause hard blockers.

I am looking forward to your future contributions.

(And, BTW, feel free to check out
https://orgmode.org/worg/org-contribute.html - we provide some important
information about our patch conventions. Please pay attention to
https://orgmode.org/worg/org-contribute.html#copyright)

> ...
>> Sorry, but I am lost. What do you mean by "content" and what do you mean
>> by "page-main-body"?
>
> Look at the tags and ids of the html code on these pages.

Ok. More clear now.

What you have is
<body>
<nav ...>
{TOC element aligned left}
</nav>
{TITLE}
<div contents>
{document body aligned right}
<div>
</body>

Now, I think I can answer your original question:

	> In addition I have a question about the html output layout
	> structure. Here is an example of a file generated with the current
	> code with some preliminary layout. It might give an idea about my use
	> case:
	>
	> https://www.selma.hfmdk-frankfurt.de/finnendahl/klangsynthesebuch/01_00_00_vorwort.html#orge24571b
	>
	> Regardless of the colours, the file has a slightly different hierarchy
	> than the single page html template of ORGMODE and is more oriented
	> towards the layout of documentation nowadays with a (hideable) toc at
	> the side on every page rather than the texinfo oriented layout used by
	> the orgmode manual. If my code gets accepted/merged to org what should
	> be the default layout shipped with multipage output? FYI: The
	> visibility of the toc entries is managed by the css and the whole toc
	> is included on each page (and its visibility could be managed with js
	> as well). Should I rather go for the classic texinfo view?

For context, the vanilla exported HTML body is
(see `org-html-template', `org-html-inner-template', `org-html-toc')

<body>
{LINK UP}
{PREAMBLE}
<div content>
{TITLE}
{TOC}
{exported CONTENTS}
{Footnotes}
</div>
{POSTAMBLE}
{JS scripts}
</body>

I see no major problem customizing TOC position in the HTML template.
In fact, it would be rather desirable to provide a set of less
bare-bones defaults (as long as they do not get too complex).

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-09 18:08                       ` Ihor Radchenko
@ 2024-07-10 19:37                         ` Orm Finnendahl
  2024-07-11 12:35                           ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-10 19:37 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Am Dienstag, den 09. Juli 2024 um 18:08:10 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> > If toplevel functions like org-export-to-file use org-export-as, than
> > org-export-as should only be concerned with generating the string but
> > not with reparsing.
> 
> Sorry, but I do not understand your concern.

If org-export-as returns just one string, then it will reparse the
parse tree each time it needs to generate an output string. But as you
say below, you rather think org-export-as returns a list of strings
for the multipage case.

> > Alternatively we can do the conversion to a string in the central
> > function as now with org-export-as, but there still needs to be a
> > mechanism to generate the different files for multipage output and
> > call the export backend on them to save them or whatever. Or what did
> > you have in mind?
> 
> What I have in mind is that `org-export-as' will return a list of
> strings + INFO. INFO will contain data about which files to use for
> saving the strings. Then, the caller does the saving and whatever is
> necessary. If we write to files from `org-export-as' it will be a
> massive breaking change in the expected behavior.

ok, that's what you mean. I can do this, but don't you think it'd be
more consistent with the general layout of ox, if org-export-as uses a
callback function to call on each generated string with the filename
as argument nad we agree on names for multipage file output which have
to get implemented by multipage backends?

Whatever, both ways will do what's needed, just let me know what you
prefer and I will provide a suggestion, ok? I try to find time on the
weekend, otherwise I'll have time after the end of next week.

--
Orm



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-10 19:37                         ` Orm Finnendahl
@ 2024-07-11 12:35                           ` Ihor Radchenko
  2024-07-13  7:44                             ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-11 12:35 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> If org-export-as returns just one string, then it will reparse the
> parse tree each time it needs to generate an output string. But as you
> say below, you rather think org-export-as returns a list of strings
> for the multipage case.

Got it now.

>> What I have in mind is that `org-export-as' will return a list of
>> strings + INFO. INFO will contain data about which files to use for
>> saving the strings. Then, the caller does the saving and whatever is
>> necessary. If we write to files from `org-export-as' it will be a
>> massive breaking change in the expected behavior.
>
> ok, that's what you mean. I can do this, but don't you think it'd be
> more consistent with the general layout of ox, if org-export-as uses a
> callback function to call on each generated string with the filename
> as argument nad we agree on names for multipage file output which have
> to get implemented by multipage backends?

This sounds like some kind of extension to :filter-final-output.
I think it should also be an ok option.

> Whatever, both ways will do what's needed, just let me know what you
> prefer and I will provide a suggestion, ok? I try to find time on the
> weekend, otherwise I'll have time after the end of next week.

I am ok with what you propose. So, please go ahead.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-11 12:35                           ` Ihor Radchenko
@ 2024-07-13  7:44                             ` Orm Finnendahl
  2024-07-13 10:13                               ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-13  7:44 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

Am Donnerstag, den 11. Juli 2024 um 12:35:21 Uhr (+0000) schrieb Ihor Radchenko:
> > ok, that's what you mean. I can do this, but don't you think it'd be
> > more consistent with the general layout of ox, if org-export-as uses a
> > callback function to call on each generated string with the filename
> > as argument and we agree on names for multipage file output which have
> > to get implemented by multipage backends?
> 
> This sounds like some kind of extension to :filter-final-output.
> I think it should also be an ok option.

:filter-final-output functions could be used, but the name is a bit
misleading. Therefore I'd suggest to extend the
org-export-filters-alist with :export-final-output which only gets
called if non-nil. Otherwise org-export-as will return a single string
as before, so we don't break anything.

In the multipage case we still need a hook to split the parse tree
before transcoding. The place for this should probably be in
org-export--annotate-info. I don't see any mechanism/alist function to
use so I would suggest to add an option :multipage-process-hook to
org-export-filters-alist.

In addition the backend will set a :multipage option at the beginning
of the export, when exporting to multipage.

I will go ahead and implement a proposal. Let me know if something
sounds bad/unreasonable.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-13  7:44                             ` Orm Finnendahl
@ 2024-07-13 10:13                               ` Ihor Radchenko
  2024-07-13 11:01                                 ` Orm Finnendahl
  2024-07-23  8:56                                 ` Orm Finnendahl
  0 siblings, 2 replies; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-13 10:13 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> This sounds like some kind of extension to :filter-final-output.
>> I think it should also be an ok option.
>
> :filter-final-output functions could be used, but the name is a bit
> misleading. Therefore I'd suggest to extend the
> org-export-filters-alist with :export-final-output which only gets
> called if non-nil. Otherwise org-export-as will return a single string
> as before, so we don't break anything.

It is not very clear for me from the name how :export-final-output would
differ from :filter-final-output. Maybe :finalize-export-functions?

> In the multipage case we still need a hook to split the parse tree
> before transcoding. The place for this should probably be in
> org-export--annotate-info. I don't see any mechanism/alist function to
> use so I would suggest to add an option :multipage-process-hook to
> org-export-filters-alist.

We should better use a new option, yes. Many existing options can be
modified by users by accident, and we do not want that.

> In addition the backend will set a :multipage option at the beginning
> of the export, when exporting to multipage.

It will be more in-line with the existing design to set
:export-options. See `org-export--get-export-attributes'.

> I will go ahead and implement a proposal. Let me know if something
> sounds bad/unreasonable.

As you see, nothing major. We can work out the details after you get a
prototype to work with.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-13 10:13                               ` Ihor Radchenko
@ 2024-07-13 11:01                                 ` Orm Finnendahl
  2024-07-23  8:56                                 ` Orm Finnendahl
  1 sibling, 0 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-13 11:01 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

 while doing it, I found out it only needs one function call in
org-export-as for splitting, transcoding and writing the
file. Currently it's called :process-multipage, but we can change that
later.

I'll take your advice and use an :export-option for multipage.

--
Orm

Am Samstag, den 13. Juli 2024 um 10:13:56 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> >> This sounds like some kind of extension to :filter-final-output.
> >> I think it should also be an ok option.
> >
> > :filter-final-output functions could be used, but the name is a bit
> > misleading. Therefore I'd suggest to extend the
> > org-export-filters-alist with :export-final-output which only gets
> > called if non-nil. Otherwise org-export-as will return a single string
> > as before, so we don't break anything.
> 
> It is not very clear for me from the name how :export-final-output would
> differ from :filter-final-output. Maybe :finalize-export-functions?
> 
> > In the multipage case we still need a hook to split the parse tree
> > before transcoding. The place for this should probably be in
> > org-export--annotate-info. I don't see any mechanism/alist function to
> > use so I would suggest to add an option :multipage-process-hook to
> > org-export-filters-alist.
> 
> We should better use a new option, yes. Many existing options can be
> modified by users by accident, and we do not want that.
> 
> > In addition the backend will set a :multipage option at the beginning
> > of the export, when exporting to multipage.
> 
> It will be more in-line with the existing design to set
> :export-options. See `org-export--get-export-attributes'.
> 
> > I will go ahead and implement a proposal. Let me know if something
> > sounds bad/unreasonable.
> 
> As you see, nothing major. We can work out the details after you get a
> prototype to work with.
> 
> -- 
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-13 10:13                               ` Ihor Radchenko
  2024-07-13 11:01                                 ` Orm Finnendahl
@ 2024-07-23  8:56                                 ` Orm Finnendahl
  2024-07-23 10:24                                   ` Ihor Radchenko
  1 sibling, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23  8:56 UTC (permalink / raw)
  To: emacs-orgmode

Hi,

 I managed to get the proposal for the multipage output done. Please
review it and let me know what you think/would prefer to change. I'm
pretty open about it.

You can find it here:

https://github.com/ormf/ox-html-multipage

The code is intended to replace ox.el and ox-html.el. The repository
contains a pretty exhaustive CHANGELOG.txt to show what I did.

I also found a way to tackle the problem with the correct output
template by integrating both approaches into one template with the
option of customizing it simply with css. Here are the two layouts,
the first being just the plain output with a css styling similar to
the plain singlepage output, the second with the navigation elements
integrated into the main-text-body:

1. Plain

   https://www.selma.hfmdk-frankfurt.de/finnendahl/klangsynthesebuch-plain/

2. Inline navigation

   https://www.selma.hfmdk-frankfurt.de/finnendahl/klangsynthesebuch/

My code proposal shouldn't break anything in the single-page export
for any backend and produce the exact same output as before with one
exception:

There now is an option :html-numbered-link-format which applies to
numbered links to Chapters, Sections or Images. If the link doesn't
have a label, in previous versions of ox-html the link label just
consisted of a number. With the chenge, the link label will be
replaced by a customizable string for the three cases. The default
setting now is "Chapter %s", "Section %s" and "Fig. %s", which will
get translated using the org-export-dictionary (I added those entries
in ox.el). The customizable strings can be set to "%s" if the previous
behaviour is preferred, but I consider it an enhancement and assume,
the new behaviour is preferred by most users.

In addition I found a minor bug regarding infojs and implemented a
more general way to determine footnote numbers (which is not a bug in
single-page output, but in my opinion a more concise method aligning
with the way footnote numbers are created in the first place).

The new multipage output will get triggered with 'C-c C-e h m'.

Whether the first page opens in a buffer, browser or the output just
get written to file can be controlled with the :html-multipage-open
option in the file (or as a customized variable).

In addition these customizable options are implemented:

- :html-multipage-head-include-default-style

  default css style for multipage documents.

- :html-multipage-join-empty-bodies / org-html-multipage-join-empty-bodies

  Whether to join subheadlines on the same page in case a headline has
  no body text (I tried to clarify that in the doc string of the
  defcustom).

- :html-multipage-export-directory / org-html-multipage-export-direcotry

  The directory for the multipage output (relative or absolute).

- :html-multipage-nav-format / org-multipage-nav-format

  Html snippets for the top navigation elements.

- :html-multipage-split / org-multipage-split

  Where to split the document. Possible values are 'toc to split at
  the toc entries or a number indicating the headline level.  

- :html-multipage-toc-to-top / org-html-multipage-toc-to-top
  
  link destination from toc (either directly to the headline, or to
  the top of the page, more convenient in the standard case with the
  navigation on top).

I did *not* implement:

- Front matter options as I think the standard tools for org mode
  cover most cases I thought of very elegantly and it seemed somewhat
  clunky to me.

- Page split at section-filenames. The main reason for this is that it
  needs a longer discussion, how this should get implemented correctly
  to cover all use cases. In principle it is not very complicated,
  especially with my better understanding of the underlying principles
  of ox. But if I understand Ihor's ideas correctly, it is a separate
  issue altogether which won't be handled properly in the html backend
  but rather in a general multipage backend which is backend
  agnostic. I'm perfectly willing to tackle this and to contribute,
  but currently I think it is better to make the proposed code with
  applied improvements available, as it is useful and pretty complete
  for the use case of publishing an org document to multiple html
  pages.

If the code gets reviewed and accepted I have some questions regarding
final submittal:

1. How do I provide the code? Is there a mechanism like issuing a
   merge-request or how is it normally done?

2. How do I add documentation to the org manual?

3. Should there be test functions for the code added and are there
   recommendations how to do that?

I'm glad that I finally got it done. Hope you like it and please let
me know what you think.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23  8:56                                 ` Orm Finnendahl
@ 2024-07-23 10:24                                   ` Ihor Radchenko
  2024-07-23 11:35                                     ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-23 10:24 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>  I managed to get the proposal for the multipage output done. Please
> review it and let me know what you think/would prefer to change. I'm
> pretty open about it.
>
> You can find it here:
>
> https://github.com/ormf/ox-html-multipage
>
> The code is intended to replace ox.el and ox-html.el. The repository
> contains a pretty exhaustive CHANGELOG.txt to show what I did.

Thanks!
I will first focus on reviewing changes to ox.el.

> ox.el
> 
> - added `org-export-collect-tree-info', and

And it is not used anywhere... What is the purpose?

>   org-export-transcode-headline, extracted from `org-export-as'

How does it have anything to do with "headline"? Maybe `org-export-transcode-page'?
 
> - added :multipage case to `org-export-as', calling :process-multipage
>   callback submitted in info. In the multipage case, org-export-as
>   returns nil relying on :process-multipage to do the exporting, while
>   in the single page case it returns the transcoded string to the
>   caller from the backend.

Does it mean that you do not want page splitting to be controlled
globally, in ox.el, as we discussed?
 
> - changed `org-export-collect-footnote-definitions' to get its
>   numbering using `org-export-get-footnote-number' rather than always
>   counting from 1 as before. This should always work for single-page
>   and multipage export.

This looks reasonable. Maybe even as a separate patch we can merge earlier.
 
> - changed that `org-export-numbered-headline-p' always returns t for
>   headlines in the multipage case to ensure headline numbering is
>   collected.

>   NOTE: The single-page case will be handled like before, but it might
>   be a better idea to change the behaviour and do it the same way as
>   in the multipage case: Always collect the headline-numbering and
>   only decide at the transcoding stage whether the headline should be
>   numbered in the output.

Why? What if one wants headlines to be not numbered?

> If the code gets reviewed and accepted I have some questions regarding
> final submittal:
>
> 1. How do I provide the code? Is there a mechanism like issuing a
>    merge-request or how is it normally done?

https://orgmode.org/worg/org-contribute.html#orge044121

You need to clone Org mode repository and modify it on a public
branch. Then, just share it. Also, please make sure that you track the
latest main branch. Your version of ox.el already diverged from the
latest main.

> 2. How do I add documentation to the org manual?

Edit doc/org-manual.org in the Org repository.

> 3. Should there be test functions for the code added and are there
>    recommendations how to do that?

Yes, ideally.
See testing/README, testing/lisp/test-ox.el, and
testing/lisp/text-ox-html.el files in the Org repo.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 10:24                                   ` Ihor Radchenko
@ 2024-07-23 11:35                                     ` Orm Finnendahl
  2024-07-23 12:52                                       ` Ihor Radchenko
                                                         ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23 11:35 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Hi,

 thanks for the quick response!

Am Dienstag, den 23. Juli 2024 um 10:24:54 Uhr (+0000) schrieb Ihor
Radchenko:
> I will first focus on reviewing changes to ox.el.
> 
> > ox.el
> > 
> > - added `org-export-collect-tree-info', and
> 
> And it is not used anywhere... What is the purpose?

That was not cleaned up from a previous stage. Removed, thanks!

> 
> >   org-export-transcode-headline, extracted from `org-export-as'
> 
> How does it have anything to do with "headline"? Maybe `org-export-transcode-page'?

changed.

>  
> > - added :multipage case to `org-export-as', calling :process-multipage
> >   callback submitted in info. In the multipage case, org-export-as
> >   returns nil relying on :process-multipage to do the exporting, while
> >   in the single page case it returns the transcoded string to the
> >   caller from the backend.
> 
> Does it mean that you do not want page splitting to be controlled
> globally, in ox.el, as we discussed?

Not for now (I mention that later in the mail when I talk about
section-filenames and future generalizations, where this definitely
has to be done). In the html multipage export are many peculiarities
which don't apply to other backends, so ox.el wouldn't be the place
for it, so we will need some callback mechanism anyway.

Right now this gets accomplished with a small branch in org-export-as
in order to change as little as possible. It'll be easy to change if
we find a good way to get this done using a more general approach. But
I'm open for suggestions, if you have an idea how to already do it
now.

> This looks reasonable. Maybe even as a separate patch we can merge
> earlier.

Sure.

>  
> > - changed that `org-export-numbered-headline-p' always returns t for
> >   headlines in the multipage case to ensure headline numbering is
> >   collected.
> 
> >   NOTE: The single-page case will be handled like before, but it might
> >   be a better idea to change the behaviour and do it the same way as
> >   in the multipage case: Always collect the headline-numbering and
> >   only decide at the transcoding stage whether the headline should be
> >   numbered in the output.
> 
> Why? What if one wants headlines to be not numbered?

Just set num:nil in the options. As mentioned, I think printing
headline numbers should get handled in the transcoding stage of the
backend and not before. Multipage export behind the scenes is
completely dependant on headline numbering, even if headlines aren't
displayed, so the code in ox.el first proceeds, as if headline
numbering is turned on and moves the check for headline numbering to
the transcoding stage. I didn't change the behaviour in the
single-page html situation. Although I think that it might make sense
that headline-numbering in general only gets checked at the
transcoding stage that would affect all backends, so I didn't change
anything.

Thanks also for the info regarding how to contribute. It'd be nice if
you could gibe me a go in case you approve the proposal.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 11:35                                     ` Orm Finnendahl
@ 2024-07-23 12:52                                       ` Ihor Radchenko
  2024-07-23 14:56                                         ` Orm Finnendahl
       [not found]                                         ` <Zp_EhDDxxYRWKFPL@orm-t14s>
  2024-07-23 14:13                                       ` Ihor Radchenko
  2024-07-23 14:19                                       ` Ihor Radchenko
  2 siblings, 2 replies; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-23 12:52 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> Thanks also for the info regarding how to contribute. It'd be nice if
> you could gibe me a go in case you approve the proposal.

May you please elaborate?

All you need to do is cloning/forking Org mode repository, making edits
there, and sharing the link to your branch.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 11:35                                     ` Orm Finnendahl
  2024-07-23 12:52                                       ` Ihor Radchenko
@ 2024-07-23 14:13                                       ` Ihor Radchenko
       [not found]                                         ` <Zp_b2lL2SzDswa-w@orm-t14s>
  2024-07-23 14:19                                       ` Ihor Radchenko
  2 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-23 14:13 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> > - added :multipage case to `org-export-as', calling :process-multipage
>> >   callback submitted in info. In the multipage case, org-export-as
>> >   returns nil relying on :process-multipage to do the exporting, while
>> >   in the single page case it returns the transcoded string to the
>> >   caller from the backend.
>> 
>> Does it mean that you do not want page splitting to be controlled
>> globally, in ox.el, as we discussed?
>
> Not for now (I mention that later in the mail when I talk about
> section-filenames and future generalizations, where this definitely
> has to be done). In the html multipage export are many peculiarities
> which don't apply to other backends, so ox.el wouldn't be the place
> for it, so we will need some callback mechanism anyway.
>
> Right now this gets accomplished with a small branch in org-export-as
> in order to change as little as possible. It'll be easy to change if
> we find a good way to get this done using a more general approach. But
> I'm open for suggestions, if you have an idea how to already do it
> now.

Then, a more natural way to achieve custom document-wide transcoder will
be introducing "org-data" transcoder into `org-export-transcoder':

(defun org-export-transcoder (blob info)
  "Return appropriate transcoder for BLOB.
INFO is a plist containing export directives."
  (let ((type (org-element-type blob)))
    ;; Return contents only for complete parse trees.
    (if (eq type 'org-data) (lambda (_datum contents _info) contents) ; <=------------------
      (let ((transcoder (cdr (assq type (plist-get info :translate-alist)))))
	(and (functionp transcoder) transcoder)))))

For now, we have a hard-coded identity CONTENTS -> CONTENTS transcoder
when exporting the whole document, followed by applying inner/outer
templates. We may instead allow the export backends to introduce
"org-data" transcoder as a part of exporter definition. When non-nil, it
will be used instead of what you extracted into
`org-export-transcode-page'. And `org-export-transcode-page' will be
used as the fallback.

WDYT?

>> > - changed that `org-export-numbered-headline-p' always returns t for
>> >   headlines in the multipage case to ensure headline numbering is
>> >   collected.
>> 
>> >   NOTE: The single-page case will be handled like before, but it might
>> >   be a better idea to change the behaviour and do it the same way as
>> >   in the multipage case: Always collect the headline-numbering and
>> >   only decide at the transcoding stage whether the headline should be
>> >   numbered in the output.
>> 
>> Why? What if one wants headlines to be not numbered?
>
> Just set num:nil in the options.

But your code ignores num:nil, does it not?

(defun org-export-numbered-headline-p (headline info)
  "Return a non-nil value if HEADLINE element should be numbered.
INFO is a plist used as a communication channel."
  (unless (org-not-nil (org-export-get-node-property :UNNUMBERED headline t))
    (let ((sec-num (or (plist-get info :section-numbers)
                       (plist-get info :multipage))) ; <-- overrides num:nil
	  (level (org-export-get-relative-level headline info)))
      (if (wholenump sec-num) (<= level sec-num) sec-num))))


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 11:35                                     ` Orm Finnendahl
  2024-07-23 12:52                                       ` Ihor Radchenko
  2024-07-23 14:13                                       ` Ihor Radchenko
@ 2024-07-23 14:19                                       ` Ihor Radchenko
  2024-07-23 15:13                                         ` Orm Finnendahl
  2 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-23 14:19 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> ... I think printing
> headline numbers should get handled in the transcoding stage of the
> backend and not before.

I am confused here.
What do you mean by "printing"?

> ... Multipage export behind the scenes is
> completely dependant on headline numbering, even if headlines aren't
> displayed, so the code in ox.el first proceeds, as if headline
> numbering is turned on and moves the check for headline numbering to
> the transcoding stage. I didn't change the behaviour in the
> single-page html situation. Although I think that it might make sense
> that headline-numbering in general only gets checked at the
> transcoding stage that would affect all backends, so I didn't change
> anything.

I am again confused.
There are three main functions handling headline numbering:
1. `org-export--collect-headline-numbering'
2. `org-export-get-headline-number'
3. `org-export-numbered-headline-p'

`org-export--collect-headline-numbering' is evaluated unconditionally,
regardless of num:t or num:nil settings.

`org-export-get-headline-number' and `org-export-numbered-headline-p'
are API functions that get called by the individual backends as needed.
If they deem it necessary to ignore :section-numbers setting, they are
free to.

What is wrong with these three functions? 

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 12:52                                       ` Ihor Radchenko
@ 2024-07-23 14:56                                         ` Orm Finnendahl
       [not found]                                         ` <Zp_EhDDxxYRWKFPL@orm-t14s>
  1 sibling, 0 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23 14:56 UTC (permalink / raw)
  To: emacs-orgmode

Am Dienstag, den 23. Juli 2024 um 12:52:51 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> > Thanks also for the info regarding how to contribute. It'd be nice if
> > you could gibe me a go in case you approve the proposal.
> 
> May you please elaborate?

Writing documentation and test functions doesn't make a lot of sense
if the code doesn't get integrated into an org release and in addition
I'd like to start working on the doc after finalizing the design and
names, therefore I'm asking (I don't need the documentation for myself
and wouldn't write it if it didn't get published ;-)

> All you need to do is cloning/forking Org mode repository, making edits
> there, and sharing the link to your branch.

ok, I will do that.

Best,
Orm
----------------------------------------------------------------------
Prof. Orm Finnendahl
Komposition
Hochschule für Musik und Darstellende Kunst
Eschersheimer Landstr. 29-39
60322 Frankfurt am Main

https://www.youtube.com/watch?v=2rWha1HTfFE&list=PLiGfneJSWmNw6dTUvcTHbTkCYOOTiB_N6


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 14:19                                       ` Ihor Radchenko
@ 2024-07-23 15:13                                         ` Orm Finnendahl
  2024-07-23 16:20                                           ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23 15:13 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

tAm Dienstag, den 23. Juli 2024 um 14:19:00 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> > ... I think printing
> > headline numbers should get handled in the transcoding stage of the
> > backend and not before.
> 
> I am confused here.
> What do you mean by "printing"?

I mean creating the output string.

> `org-export--collect-headline-numbering' is evaluated unconditionally,
> regardless of num:t or num:nil settings.

Are you sure? org-export--collect-headline-numbering has this in its
body:

(org-element-map data 'headline
      (lambda (headline)
        (when (and (org-export-numbered-headline-p headline options)
	   (not (org-element-property :footnote-section-p headline)))
           ...)))

If num:nil headline numbers don't get collected, or am I missing
something?

But I will doublecheck just to be sure...

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
       [not found]                                           ` <874j8g2lvq.fsf@localhost>
@ 2024-07-23 15:36                                             ` Orm Finnendahl
  0 siblings, 0 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23 15:36 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode

Am Dienstag, den 23. Juli 2024 um 15:01:13 Uhr (+0000) schrieb Ihor Radchenko:
> 
> Multipage export is something I want to see as a part of Org mode.
> I thought that you were aiming for upstream from the very beginning. I
> never opposed that.

Ok, thanks. You're right, I was aiming at that from the very
beginning, but it was unclear to me how integration of code is handled
in org development and whether my code is considered acceptable or
aligns with design guidelines.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 15:13                                         ` Orm Finnendahl
@ 2024-07-23 16:20                                           ` Ihor Radchenko
  2024-07-23 17:02                                             ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-23 16:20 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> `org-export--collect-headline-numbering' is evaluated unconditionally,
>> regardless of num:t or num:nil settings.
>
> Are you sure? org-export--collect-headline-numbering has this in its
> body:
>
> (org-element-map data 'headline
>       (lambda (headline)
>         (when (and (org-export-numbered-headline-p headline options)
> 	   (not (org-element-property :footnote-section-p headline)))
>            ...)))
>
> If num:nil headline numbers don't get collected, or am I missing
> something?

You are right.
However, changing `org-export-numbered-headline-p' to use :multipage is
not the right approach to achieve what you need.

If you think that multipage export should use a different set of
options, we need to implement it differently.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 16:20                                           ` Ihor Radchenko
@ 2024-07-23 17:02                                             ` Orm Finnendahl
  2024-07-23 17:13                                               ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23 17:02 UTC (permalink / raw)
  To: emacs-orgmode

Am Dienstag, den 23. Juli 2024 um 16:20:39 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> >> `org-export--collect-headline-numbering' is evaluated unconditionally,
> >> regardless of num:t or num:nil settings.
> >
> > Are you sure? org-export--collect-headline-numbering has this in its
> > body:
> >
> > (org-element-map data 'headline
> >       (lambda (headline)
> >         (when (and (org-export-numbered-headline-p headline options)
> > 	   (not (org-element-property :footnote-section-p headline)))
> >            ...)))
> >
> > If num:nil headline numbers don't get collected, or am I missing
> > something?
> 
> You are right.
> However, changing `org-export-numbered-headline-p' to use :multipage is
> not the right approach to achieve what you need.
> 
> If you think that multipage export should use a different set of
> options, we need to implement it differently.

Is that a semantic problem so we need to implement an option like
:always-collect-headline-numbering instead of :multipage in
org-export-numbered-headline-p? If on the other hand we define a
replacement of org-export--collect-headline-numbering, we also have to
do so for all functions up the stack, like org-export--annotate-info
and org-export--collect-tree-properties. Or what did you have in mind?

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
       [not found]                                         ` <Zp_b2lL2SzDswa-w@orm-t14s>
@ 2024-07-23 17:10                                           ` Ihor Radchenko
  2024-07-23 20:35                                             ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-23 17:10 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

[ Adding the mailing list back to CC ]

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> Let me recapitulate to make sure I understand you completely:
>
> 1. Replace the call to org-export-transcode-page at the end of
>    ord-export-as by a call to org-export-data

Yes.

> 2. If a transcoder for org-data is defined, call that and return nil
>    from org-export-date.
>
>    Otherwise return the transcoded string.

> 3. In case a string is returned, process it as it is done in
>    org-export-transcode-page (only that the output string will be
>    supplied in place of the headline and we will find a better name for
>    org-export-transcode-page as it is called *after* the transcoding.

No.

If a transcoder for org-data is defined, call it and return whatever it
returns. Otherwise, call `org-export-transcode-page' (adjusted to follow
transcoder arguments).

> 1. org-export-data has to be modified to catch the case of
>    org-export-transcoder being called on org-data in the
>    multipage-case (after ;; Element/Object with contents.). This seems
>    a bit complicated as there is memoization going on in
>    :exported-data of info further down in org-export-data which
>    probably should get circumvented in the multipage case (e.g. by
>    checking the value of results).

I do not fully understand the problem you are describing here, but hope
that my clarification above resolved it.

> 2. The code has to define/provide a transcoding function in the
>    multipage case but should *not* provide such a function in the
>    single page case, which means (in the multipage case) to modify the
>    alist of the backend on-the-fly before calling org-export-as.

I propose to allow custom org-data transcoder for single page case as well.
If there is no need to have custom transcoder for single page, the
custom transcoder can check :multipage property and fall back to
calling `org-export-transcode-page' if it is nil.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 17:02                                             ` Orm Finnendahl
@ 2024-07-23 17:13                                               ` Ihor Radchenko
  2024-07-23 19:00                                                 ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-23 17:13 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: emacs-orgmode

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

>> If you think that multipage export should use a different set of
>> options, we need to implement it differently.
>
> Is that a semantic problem so we need to implement an option like
> :always-collect-headline-numbering instead of :multipage in
> org-export-numbered-headline-p? If on the other hand we define a
> replacement of org-export--collect-headline-numbering, we also have to
> do so for all functions up the stack, like org-export--annotate-info
> and org-export--collect-tree-properties. Or what did you have in mind?

What I had in mind if using backend-specific :filter-options.
If a backends needs to enable headline numbering unconditionally, when
:multipage is used, it can install :filter-options filter that will set
:section-numbers to t.

Does it make sense?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 17:13                                               ` Ihor Radchenko
@ 2024-07-23 19:00                                                 ` Orm Finnendahl
  0 siblings, 0 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23 19:00 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Org mailing list

Am Dienstag, den 23. Juli 2024 um 17:13:56 Uhr (+0000) schrieb Ihor Radchenko:
> 
> What I had in mind if using backend-specific :filter-options.
> If a backends needs to enable headline numbering unconditionally, when
> :multipage is used, it can install :filter-options filter that will set
> :section-numbers to t.
> 
> Does it make sense?

Yes, but I have to check whether this can be reverted afterwards, as
in the :multipage case we still need the information of the initial
setting of :section-numbers when transcoding finally happens.

It's a bit late today to digest it all, but I'll look into it
tomorrow and will get back to you with questions in case I don't get
it.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 17:10                                           ` Ihor Radchenko
@ 2024-07-23 20:35                                             ` Orm Finnendahl
  2024-07-24 10:20                                               ` Ihor Radchenko
  0 siblings, 1 reply; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-23 20:35 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Org mailing list

Am Dienstag, den 23. Juli 2024 um 17:10:17 Uhr (+0000) schrieb Ihor Radchenko:
> > 2. If a transcoder for org-data is defined, call that and return nil
> >    from org-export-date.
> >
> >    Otherwise return the transcoded string.
> 
> > 3. In case a string is returned, process it as it is done in
> >    org-export-transcode-page (only that the output string will be
> >    supplied in place of the headline and we will find a better name for
> >    org-export-transcode-page as it is called *after* the transcoding.
> 
> No.
> 
> If a transcoder for org-data is defined, call it and return whatever it
> returns. Otherwise, call `org-export-transcode-page' (adjusted to follow
> transcoder arguments).

Sorry, this is still quite obscure to me: Why should a transcoder for
org-data return anything in the multipage case and who should handle
the return value(s)? The transcoder could return a list of strings
which can get returned by org-export-as and then handled in the
function which called org-export-as. If that's what you mean I can
implement it, although I'm admittedly not really convinced, especially
as there are hairy details to solve, when we really want to use
org-export-data to generate multiple return values:

- what should 'results in org-export-data be when calling the
  transcoding function for multipage? A list of strings returned by
  the transcoding of the individual pages? Shall each string be
  memoized?  How? How to deal with assigning a file-name to each
  string, should we rather return a (filename . transcoded-string)
  alist?

To recapitulate: In my code, org-export-as calls process-multipage in
the backend. This function:

- collects and adds information necessary for org-multipage to do its
  job, splitting the document into different parts, etc. and

- then calls org-export-data on the subtrees and exports each returned
  string to an individual file.

- It finally issues a done string and executes a browser open/visit
  file or simply exits nil.

For me this is rather clean and it seems unnecessary to go through all
the hassle of dealing with a multipage transcoder within
org-export-data. Anyway, I will try to follow your recommendation once
I fully understand what you're up to, although I fear this will open a
can of worms...

> 
> > 1. org-export-data has to be modified to catch the case of
> >    org-export-transcoder being called on org-data in the
> >    multipage-case (after ;; Element/Object with contents.). This seems
> >    a bit complicated as there is memoization going on in
> >    :exported-data of info further down in org-export-data which
> >    probably should get circumvented in the multipage case (e.g. by
> >    checking the value of results).
> 
> I do not fully understand the problem you are describing here, but hope
> that my clarification above resolved it.
>

Unfortunately not :-( Sorry that I can't really make sense of your
explanations. Somehow we seem to think from quite different
perspectives and it is really hard for me to get your point (although
it is also fascinating and I'm not willing to give up ;-)

> > 2. The code has to define/provide a transcoding function in the
> >    multipage case but should *not* provide such a function in the
> >    single page case, which means (in the multipage case) to modify the
> >    alist of the backend on-the-fly before calling org-export-as.
> 
> I propose to allow custom org-data transcoder for single page case as well.
> If there is no need to have custom transcoder for single page, the
> custom transcoder can check :multipage property and fall back to
> calling `org-export-transcode-page' if it is nil.

ok, that much is clear.

--
Orm


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-23 20:35                                             ` Orm Finnendahl
@ 2024-07-24 10:20                                               ` Ihor Radchenko
  2024-07-24 11:24                                                 ` Orm Finnendahl
  0 siblings, 1 reply; 46+ messages in thread
From: Ihor Radchenko @ 2024-07-24 10:20 UTC (permalink / raw)
  To: Orm Finnendahl; +Cc: Org mailing list

[-- Attachment #1: Type: text/plain, Size: 3467 bytes --]

Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:

> To recapitulate: In my code, org-export-as calls process-multipage in
> the backend. This function:
>
> - collects and adds information necessary for org-multipage to do its
>   job, splitting the document into different parts, etc. and
>
> - then calls org-export-data on the subtrees and exports each returned
>   string to an individual file.
>
> - It finally issues a done string and executes a browser open/visit
>   file or simply exits nil.

Currently, org-export-as does the following:

1. Compute global export attributes, according to the selected export backend
2. Copy original buffer into working copy
3. Process and parse the copy, generating AST
4. Do the actual export

You plugged your multipage processing into (4), but what it actually
does involves (3), (4), and also a new kind of post-processing.
I do not think that it is a good design from the point of view of ox.el.
I prefer to reuse or extend the existing mechanisms if at all possible -
this makes new features less confusing for users and backend developers.

> - collects and adds information necessary for org-multipage to do its
>   job, splitting the document into different parts, etc. and

What you describe here is more or less what :filter-parse-tree filters
do - they can rearrange the parse tree before passing it to the
transcoders. Why not reusing it for multipage export?

> - then calls org-export-data on the subtrees and exports each returned
>   string to an individual file.

And you simply call `org-export-transcode-page' for this, followed by
writing the returned string to file.

The first part can fit within `org-export-as', but writing to file is
going a step beyond, duplicating what `org-export-to-file' does.

> - It finally issues a done string and executes a browser open/visit
>   file or simply exits nil.

... which again steps beyond `org-export-as' scope - post-processing is
currently done as a part of `org-export-to-file'/`org-export-to-buffer'.

----

Let me propose the following changes to ox.el:

1. org-data will be transcoded using `org-export-transcode-org-data',
   which can be overridden by setting org-data transcoders in the
   individual backends.

2. org-export-as will understand transcoded output to be a list of
   strings and will transfer INFO plist as text property in the return
   values

3. org-export-to-file will make use of the text properties to retrieve
   the file name to write.  This way, export backend itself can assign
   the file names where each exporter string should go.

I believe that my changes should allow you to implement multipage export
in the following way:

1. You can use :filter-parse-tree in ox-html backend to replace the
   original (org-data ...) AST with a list of
   ((org-page ...) (org-page ...) ...) pseudo-elements and populate INFO
   channel with auxiliary information you now compute in `org-html-process-multipage'

2. You can define org-page transcoder to render individual pages as
needed

3. You can assign :output-file text property to the returned org-page
   strings and use org-export-to-file to generate the multipage output
   on disk

4. You can handle opening exported files by augmenting POST-PROCESS
   argument in `org-html-export-to-multipage-html' and calling
   `org-export-file' instead of `org-export-as'.

The tentative patches (against Org mode main branch) implementing my
changes are attached.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-ox-Factor-out-org-data-transcoding-into-dedicated-ov.patch --]
[-- Type: text/x-patch, Size: 4398 bytes --]

From 540c8ef21c26df79cf48f58afb4e88130985e2f7 Mon Sep 17 00:00:00 2001
Message-ID: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Wed, 24 Jul 2024 11:40:57 +0200
Subject: [PATCH 1/3] ox: Factor out org-data transcoding into dedicated
 overrideable transcoder

* lisp/ox.el (org-export-transcode-org-data): New function serving as
the default transcoder for org-data export.
(org-export-transcoder): Use `org-export-transcode-org-data' when no
org-data transcoder is defined.
(org-export-as): Rely upon org-data transcoder to do its job.
---
 lisp/ox.el | 55 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index fbd9bb0df..bdee71082 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1883,9 +1883,11 @@ (defun org-export-transcoder (blob info)
 INFO is a plist containing export directives."
   (let ((type (org-element-type blob)))
     ;; Return contents only for complete parse trees.
-    (if (eq type 'org-data) (lambda (_datum contents _info) contents)
-      (let ((transcoder (cdr (assq type (plist-get info :translate-alist)))))
-	(and (functionp transcoder) transcoder)))))
+    (let ((transcoder (cdr (assq type (plist-get info :translate-alist)))))
+      (cond
+       ((functionp transcoder) transcoder)
+       ;; Use default org-data transcoder unless specified.
+       ((eq type 'org-data) #'org-export-transcode-org-data)))))
 
 (defun org-export--keep-spaces (data info)
   "Non-nil, when post-blank spaces after removing DATA should be preserved.
@@ -3004,31 +3006,34 @@ (defun org-export-as
                        backend info subtreep visible-only ext-plist))
 	   ;; Eventually transcode TREE.  Wrap the resulting string into
 	   ;; a template.
-	   (let* ((body (org-element-normalize-string
-		         (or (org-export-data (plist-get info :parse-tree) info)
-                             "")))
-		  (inner-template (cdr (assq 'inner-template
-					     (plist-get info :translate-alist))))
-		  (full-body (org-export-filter-apply-functions
-			      (plist-get info :filter-body)
-			      (if (not (functionp inner-template)) body
-			        (funcall inner-template body info))
-			      info))
-		  (template (cdr (assq 'template
-				       (plist-get info :translate-alist))))
-                  (output
-                   (if (or (not (functionp template)) body-only) full-body
-	             (funcall template full-body info))))
+	   (let ((output
+                  (or (org-export-data (plist-get info :parse-tree) info)
+                      "")))
              ;; Call citation export finalizer.
              (when (plist-get info :with-cite-processors)
                (setq output (org-cite-finalize-export output info)))
-	     ;; Remove all text properties since they cannot be
-	     ;; retrieved from an external process.  Finally call
-	     ;; final-output filter and return result.
-	     (org-no-properties
-	      (org-export-filter-apply-functions
-	       (plist-get info :filter-final-output)
-	       output info)))))))))
+             (let ((filters (plist-get info :filter-final-output)))
+               ;; Remove all text properties since they cannot be
+	       ;; retrieved from an external process.  Finally call
+	       ;; final-output filter and return result.
+               (org-no-properties
+                (org-export-filter-apply-functions filters output info))))))))))
+
+(defun org-export-transcode-org-data (_ body info)
+  "Transcode `org-data' node with BODY.  Return transcoded string.
+INFO is the communication channel plist."
+  (let* ((inner-template (cdr (assq 'inner-template
+				    (plist-get info :translate-alist))))
+	 (full-body (org-export-filter-apply-functions
+		     (plist-get info :filter-body)
+		     (if (not (functionp inner-template)) body
+		       (funcall inner-template body info))
+		     info))
+	 (template (cdr (assq 'template
+			      (plist-get info :translate-alist))))
+         (body-only (memq 'body-only (plist-get info :export-options))))
+    (if (or (not (functionp template)) body-only) full-body
+      (funcall template full-body info))))
 
 (defun org-export--annotate-info (backend info &optional subtreep visible-only ext-plist)
   "Annotate the INFO plist according to the BACKEND.
-- 
2.45.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-org-export-as-Allow-the-return-value-to-be-a-list-of.patch --]
[-- Type: text/x-patch, Size: 3375 bytes --]

From 1b0b331f92abc1ca7e04f71fe7ff60da57c719b8 Mon Sep 17 00:00:00 2001
Message-ID: <1b0b331f92abc1ca7e04f71fe7ff60da57c719b8.1721815865.git.yantar92@posteo.net>
In-Reply-To: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
References: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Wed, 24 Jul 2024 11:51:21 +0200
Subject: [PATCH 2/3] org-export-as: Allow the return value to be a list of
 strings; add INFO

* lisp/ox.el (org-export-as): Allow the transcoders to return list of
strings and return it.  When returning a string, put INFO plist as
text property.  Do not remove text properties assigned by the
transcoders.
(org-export-data): Document that list of strings may be returned.
---
 lisp/ox.el | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index bdee71082..a76b3b353 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1930,7 +1930,7 @@ (defun org-export-data (data info)
 
 The `:filter-parse-tree' filters are not applied.
 
-Return a string."
+Return a string or a list of strings."
   (or (gethash data (plist-get info :exported-data))
       ;; Handle broken links according to
       ;; `org-export-with-broken-links'.
@@ -2969,7 +2969,9 @@ (defun org-export-as
 with external parameters overriding Org default settings, but
 still inferior to file-local settings.
 
-Return code as a string."
+Return code as a string or a list of strings.
+The returned strings will have their `org-export-info' property set to
+export information channel."
   (when (symbolp backend) (setq backend (org-export-get-backend backend)))
   (org-export-barf-if-invalid-backend backend)
   (org-fold-core-ignore-modifications
@@ -3009,15 +3011,25 @@ (defun org-export-as
 	   (let ((output
                   (or (org-export-data (plist-get info :parse-tree) info)
                       "")))
+             (setq output (ensure-list output))
              ;; Call citation export finalizer.
              (when (plist-get info :with-cite-processors)
-               (setq output (org-cite-finalize-export output info)))
+               (setq output
+                     (mapcar
+                      (lambda (o) (org-cite-finalize-export o info))
+                      output)))
              (let ((filters (plist-get info :filter-final-output)))
-               ;; Remove all text properties since they cannot be
-	       ;; retrieved from an external process.  Finally call
-	       ;; final-output filter and return result.
-               (org-no-properties
-                (org-export-filter-apply-functions filters output info))))))))))
+               ;; Call final-output filter and return result.
+               (setq output
+                     (mapcar
+                      (lambda (o) (org-export-filter-apply-functions filters o info))
+                      output)))
+             ;; Apply org-export-info property.
+             (setq output
+                   (mapcar
+                    (lambda (o) (org-add-props o nil 'org-export-info info))
+                    output))
+             (if (length= output 1) (car output) output))))))))
 
 (defun org-export-transcode-org-data (_ body info)
   "Transcode `org-data' node with BODY.  Return transcoded string.
-- 
2.45.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0003-org-export-to-file-Derive-file-name-to-write-from-ex.patch --]
[-- Type: text/x-patch, Size: 4627 bytes --]

From 6fa2efadd229a667fba1b18aecc9d1ead5f284ac Mon Sep 17 00:00:00 2001
Message-ID: <6fa2efadd229a667fba1b18aecc9d1ead5f284ac.1721815865.git.yantar92@posteo.net>
In-Reply-To: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
References: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Wed, 24 Jul 2024 12:09:36 +0200
Subject: [PATCH 3/3] org-export-to-file: Derive file name to write from export
 output

* lisp/ox.el (org-export--write-output): New helper function
performing writing an export output or a list of outputs to file.  It
derives the file name from :output-file property in the output string
or INFO plist stored in the output string.
(org-export-to-file): Handle export output being a list of strings.
Use `org-export--write-output'.
---
 lisp/ox.el | 61 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 38 insertions(+), 23 deletions(-)

diff --git a/lisp/ox.el b/lisp/ox.el
index a76b3b353..d78c04998 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -6830,6 +6830,31 @@   (defun org-latex-export-as-latex
 	(switch-to-buffer-other-window buffer))
       buffer)))
 
+(defun org-export--write-output (output encoding)
+  "Write OUTPUT to file with ENCODING.
+OUTPUT may be a string or a list of strings.
+The target file is retrieved from :output-file OUTPUT property or
+:output-file property in plist stored in `org-export-info' property of
+each string.
+
+Return the file name or a list of file names."
+  (if (listp output) (mapcar #'org-export--write-output output)
+    (let ((file (or
+                 (get-text-property 0 :output-file output)
+                 (plist-get
+                  (get-text-property 0 'org-export-info output)
+                  :output-file))))
+      (with-temp-buffer
+        (insert output)
+        ;; Ensure final newline.  This is what was done
+        ;; historically, when we used `write-file'.
+        ;; Note that adding a newline is only safe for
+        ;; non-binary data.
+        (unless (bolp) (insert "\n"))
+        (let ((coding-system-for-write encoding))
+	  (write-region nil nil file))
+        file))))
+
 ;;;###autoload
 (defun org-export-to-file
     (backend file &optional async subtreep visible-only body-only ext-plist
@@ -6878,33 +6903,23 @@   (defun org-latex-export-to-latex
 	    `(let ((output
 		    (org-export-as
 		     ',backend ,subtreep ,visible-only ,body-only
-		     ',ext-plist)))
-	       (with-temp-buffer
-		 (insert output)
-                 ;; Ensure final newline.  This is what was done
-                 ;; historically, when we used `write-file'.
-                 ;; Note that adding a newline is only safe for
-                 ;; non-binary data.
-                 (unless (bolp) (insert "\n"))
-		 (let ((coding-system-for-write ',encoding))
-		   (write-region nil nil ,file)))
-	       (or (ignore-errors (funcall ',post-process ,file)) ,file)))
+		     ',ext-plist))
+                   file)
+               (setq file (org-export--write-output output ',encoding))
+               (let ((post (lambda (f) (or (ignore-errors (funcall ',post-process f)) f))))
+                 (if (listp file) (mapcar post file) (funcall post file)))))
         (let ((output (org-export-as
-                       backend subtreep visible-only body-only ext-plist)))
-          (with-temp-buffer
-            (insert output)
-            ;; Ensure final newline.  This is what was done
-            ;; historically, when we used `write-file'.
-            ;; Note that adding a newline is only safe for
-            ;; non-binary data.
-            (unless (bolp) (insert "\n"))
-            (let ((coding-system-for-write encoding))
-	      (write-region nil nil file)))
+                       backend subtreep visible-only body-only ext-plist))
+              file)
+          (setq file (org-export--write-output output encoding))
           (when (and (org-export--copy-to-kill-ring-p) (org-string-nw-p output))
             (org-kill-new output))
           ;; Get proper return value.
-          (or (and (functionp post-process) (funcall post-process file))
-	      file))))))
+          (let ((post (lambda (f)
+                        (or (and (functionp post-process)
+                                 (funcall post-process f))
+	                    f))))
+            (if (listp file) (mapcar post file) (funcall post file))))))))
 
 (defun org-export-output-file-name (extension &optional subtreep pub-dir)
   "Return output file's name according to buffer specifications.
-- 
2.45.2


[-- Attachment #5: Type: text/plain, Size: 224 bytes --]


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: multipage html output
  2024-07-24 10:20                                               ` Ihor Radchenko
@ 2024-07-24 11:24                                                 ` Orm Finnendahl
  0 siblings, 0 replies; 46+ messages in thread
From: Orm Finnendahl @ 2024-07-24 11:24 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Org mailing list

Hi Ihor,

 thanks a lot for the patches and explanations. Your assessment makes
sense and I agree that we should stick to the current design as much
as possible. I will go ahead and adapt my ox-html.el code the way you
propose and use your patched ox.el for this. It shouldn't take too
much effort and I will get back as soon as I have results (or
questions ;-).

--
Orm

Am Mittwoch, den 24. Juli 2024 um 10:20:16 Uhr (+0000) schrieb Ihor Radchenko:
> Orm Finnendahl <orm.finnendahl@selma.hfmdk-frankfurt.de> writes:
> 
> > To recapitulate: In my code, org-export-as calls process-multipage in
> > the backend. This function:
> >
> > - collects and adds information necessary for org-multipage to do its
> >   job, splitting the document into different parts, etc. and
> >
> > - then calls org-export-data on the subtrees and exports each returned
> >   string to an individual file.
> >
> > - It finally issues a done string and executes a browser open/visit
> >   file or simply exits nil.
> 
> Currently, org-export-as does the following:
> 
> 1. Compute global export attributes, according to the selected export backend
> 2. Copy original buffer into working copy
> 3. Process and parse the copy, generating AST
> 4. Do the actual export
> 
> You plugged your multipage processing into (4), but what it actually
> does involves (3), (4), and also a new kind of post-processing.
> I do not think that it is a good design from the point of view of ox.el.
> I prefer to reuse or extend the existing mechanisms if at all possible -
> this makes new features less confusing for users and backend developers.
> 
> > - collects and adds information necessary for org-multipage to do its
> >   job, splitting the document into different parts, etc. and
> 
> What you describe here is more or less what :filter-parse-tree filters
> do - they can rearrange the parse tree before passing it to the
> transcoders. Why not reusing it for multipage export?
> 
> > - then calls org-export-data on the subtrees and exports each returned
> >   string to an individual file.
> 
> And you simply call `org-export-transcode-page' for this, followed by
> writing the returned string to file.
> 
> The first part can fit within `org-export-as', but writing to file is
> going a step beyond, duplicating what `org-export-to-file' does.
> 
> > - It finally issues a done string and executes a browser open/visit
> >   file or simply exits nil.
> 
> ... which again steps beyond `org-export-as' scope - post-processing is
> currently done as a part of `org-export-to-file'/`org-export-to-buffer'.
> 
> ----
> 
> Let me propose the following changes to ox.el:
> 
> 1. org-data will be transcoded using `org-export-transcode-org-data',
>    which can be overridden by setting org-data transcoders in the
>    individual backends.
> 
> 2. org-export-as will understand transcoded output to be a list of
>    strings and will transfer INFO plist as text property in the return
>    values
> 
> 3. org-export-to-file will make use of the text properties to retrieve
>    the file name to write.  This way, export backend itself can assign
>    the file names where each exporter string should go.
> 
> I believe that my changes should allow you to implement multipage export
> in the following way:
> 
> 1. You can use :filter-parse-tree in ox-html backend to replace the
>    original (org-data ...) AST with a list of
>    ((org-page ...) (org-page ...) ...) pseudo-elements and populate INFO
>    channel with auxiliary information you now compute in `org-html-process-multipage'
> 
> 2. You can define org-page transcoder to render individual pages as
> needed
> 
> 3. You can assign :output-file text property to the returned org-page
>    strings and use org-export-to-file to generate the multipage output
>    on disk
> 
> 4. You can handle opening exported files by augmenting POST-PROCESS
>    argument in `org-html-export-to-multipage-html' and calling
>    `org-export-file' instead of `org-export-as'.
> 
> The tentative patches (against Org mode main branch) implementing my
> changes are attached.
> 

> From 540c8ef21c26df79cf48f58afb4e88130985e2f7 Mon Sep 17 00:00:00 2001
> Message-ID: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Wed, 24 Jul 2024 11:40:57 +0200
> Subject: [PATCH 1/3] ox: Factor out org-data transcoding into dedicated
>  overrideable transcoder
> 
> * lisp/ox.el (org-export-transcode-org-data): New function serving as
> the default transcoder for org-data export.
> (org-export-transcoder): Use `org-export-transcode-org-data' when no
> org-data transcoder is defined.
> (org-export-as): Rely upon org-data transcoder to do its job.
> ---
>  lisp/ox.el | 55 +++++++++++++++++++++++++++++-------------------------
>  1 file changed, 30 insertions(+), 25 deletions(-)
> 
> diff --git a/lisp/ox.el b/lisp/ox.el
> index fbd9bb0df..bdee71082 100644
> --- a/lisp/ox.el
> +++ b/lisp/ox.el
> @@ -1883,9 +1883,11 @@ (defun org-export-transcoder (blob info)
>  INFO is a plist containing export directives."
>    (let ((type (org-element-type blob)))
>      ;; Return contents only for complete parse trees.
> -    (if (eq type 'org-data) (lambda (_datum contents _info) contents)
> -      (let ((transcoder (cdr (assq type (plist-get info :translate-alist)))))
> -	(and (functionp transcoder) transcoder)))))
> +    (let ((transcoder (cdr (assq type (plist-get info :translate-alist)))))
> +      (cond
> +       ((functionp transcoder) transcoder)
> +       ;; Use default org-data transcoder unless specified.
> +       ((eq type 'org-data) #'org-export-transcode-org-data)))))
>  
>  (defun org-export--keep-spaces (data info)
>    "Non-nil, when post-blank spaces after removing DATA should be preserved.
> @@ -3004,31 +3006,34 @@ (defun org-export-as
>                         backend info subtreep visible-only ext-plist))
>  	   ;; Eventually transcode TREE.  Wrap the resulting string into
>  	   ;; a template.
> -	   (let* ((body (org-element-normalize-string
> -		         (or (org-export-data (plist-get info :parse-tree) info)
> -                             "")))
> -		  (inner-template (cdr (assq 'inner-template
> -					     (plist-get info :translate-alist))))
> -		  (full-body (org-export-filter-apply-functions
> -			      (plist-get info :filter-body)
> -			      (if (not (functionp inner-template)) body
> -			        (funcall inner-template body info))
> -			      info))
> -		  (template (cdr (assq 'template
> -				       (plist-get info :translate-alist))))
> -                  (output
> -                   (if (or (not (functionp template)) body-only) full-body
> -	             (funcall template full-body info))))
> +	   (let ((output
> +                  (or (org-export-data (plist-get info :parse-tree) info)
> +                      "")))
>               ;; Call citation export finalizer.
>               (when (plist-get info :with-cite-processors)
>                 (setq output (org-cite-finalize-export output info)))
> -	     ;; Remove all text properties since they cannot be
> -	     ;; retrieved from an external process.  Finally call
> -	     ;; final-output filter and return result.
> -	     (org-no-properties
> -	      (org-export-filter-apply-functions
> -	       (plist-get info :filter-final-output)
> -	       output info)))))))))
> +             (let ((filters (plist-get info :filter-final-output)))
> +               ;; Remove all text properties since they cannot be
> +	       ;; retrieved from an external process.  Finally call
> +	       ;; final-output filter and return result.
> +               (org-no-properties
> +                (org-export-filter-apply-functions filters output info))))))))))
> +
> +(defun org-export-transcode-org-data (_ body info)
> +  "Transcode `org-data' node with BODY.  Return transcoded string.
> +INFO is the communication channel plist."
> +  (let* ((inner-template (cdr (assq 'inner-template
> +				    (plist-get info :translate-alist))))
> +	 (full-body (org-export-filter-apply-functions
> +		     (plist-get info :filter-body)
> +		     (if (not (functionp inner-template)) body
> +		       (funcall inner-template body info))
> +		     info))
> +	 (template (cdr (assq 'template
> +			      (plist-get info :translate-alist))))
> +         (body-only (memq 'body-only (plist-get info :export-options))))
> +    (if (or (not (functionp template)) body-only) full-body
> +      (funcall template full-body info))))
>  
>  (defun org-export--annotate-info (backend info &optional subtreep visible-only ext-plist)
>    "Annotate the INFO plist according to the BACKEND.
> -- 
> 2.45.2
> 

> From 1b0b331f92abc1ca7e04f71fe7ff60da57c719b8 Mon Sep 17 00:00:00 2001
> Message-ID: <1b0b331f92abc1ca7e04f71fe7ff60da57c719b8.1721815865.git.yantar92@posteo.net>
> In-Reply-To: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
> References: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Wed, 24 Jul 2024 11:51:21 +0200
> Subject: [PATCH 2/3] org-export-as: Allow the return value to be a list of
>  strings; add INFO
> 
> * lisp/ox.el (org-export-as): Allow the transcoders to return list of
> strings and return it.  When returning a string, put INFO plist as
> text property.  Do not remove text properties assigned by the
> transcoders.
> (org-export-data): Document that list of strings may be returned.
> ---
>  lisp/ox.el | 28 ++++++++++++++++++++--------
>  1 file changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/lisp/ox.el b/lisp/ox.el
> index bdee71082..a76b3b353 100644
> --- a/lisp/ox.el
> +++ b/lisp/ox.el
> @@ -1930,7 +1930,7 @@ (defun org-export-data (data info)
>  
>  The `:filter-parse-tree' filters are not applied.
>  
> -Return a string."
> +Return a string or a list of strings."
>    (or (gethash data (plist-get info :exported-data))
>        ;; Handle broken links according to
>        ;; `org-export-with-broken-links'.
> @@ -2969,7 +2969,9 @@ (defun org-export-as
>  with external parameters overriding Org default settings, but
>  still inferior to file-local settings.
>  
> -Return code as a string."
> +Return code as a string or a list of strings.
> +The returned strings will have their `org-export-info' property set to
> +export information channel."
>    (when (symbolp backend) (setq backend (org-export-get-backend backend)))
>    (org-export-barf-if-invalid-backend backend)
>    (org-fold-core-ignore-modifications
> @@ -3009,15 +3011,25 @@ (defun org-export-as
>  	   (let ((output
>                    (or (org-export-data (plist-get info :parse-tree) info)
>                        "")))
> +             (setq output (ensure-list output))
>               ;; Call citation export finalizer.
>               (when (plist-get info :with-cite-processors)
> -               (setq output (org-cite-finalize-export output info)))
> +               (setq output
> +                     (mapcar
> +                      (lambda (o) (org-cite-finalize-export o info))
> +                      output)))
>               (let ((filters (plist-get info :filter-final-output)))
> -               ;; Remove all text properties since they cannot be
> -	       ;; retrieved from an external process.  Finally call
> -	       ;; final-output filter and return result.
> -               (org-no-properties
> -                (org-export-filter-apply-functions filters output info))))))))))
> +               ;; Call final-output filter and return result.
> +               (setq output
> +                     (mapcar
> +                      (lambda (o) (org-export-filter-apply-functions filters o info))
> +                      output)))
> +             ;; Apply org-export-info property.
> +             (setq output
> +                   (mapcar
> +                    (lambda (o) (org-add-props o nil 'org-export-info info))
> +                    output))
> +             (if (length= output 1) (car output) output))))))))
>  
>  (defun org-export-transcode-org-data (_ body info)
>    "Transcode `org-data' node with BODY.  Return transcoded string.
> -- 
> 2.45.2
> 

> From 6fa2efadd229a667fba1b18aecc9d1ead5f284ac Mon Sep 17 00:00:00 2001
> Message-ID: <6fa2efadd229a667fba1b18aecc9d1ead5f284ac.1721815865.git.yantar92@posteo.net>
> In-Reply-To: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
> References: <540c8ef21c26df79cf48f58afb4e88130985e2f7.1721815865.git.yantar92@posteo.net>
> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Wed, 24 Jul 2024 12:09:36 +0200
> Subject: [PATCH 3/3] org-export-to-file: Derive file name to write from export
>  output
> 
> * lisp/ox.el (org-export--write-output): New helper function
> performing writing an export output or a list of outputs to file.  It
> derives the file name from :output-file property in the output string
> or INFO plist stored in the output string.
> (org-export-to-file): Handle export output being a list of strings.
> Use `org-export--write-output'.
> ---
>  lisp/ox.el | 61 ++++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 38 insertions(+), 23 deletions(-)
> 
> diff --git a/lisp/ox.el b/lisp/ox.el
> index a76b3b353..d78c04998 100644
> --- a/lisp/ox.el
> +++ b/lisp/ox.el
> @@ -6830,6 +6830,31 @@   (defun org-latex-export-as-latex
>  	(switch-to-buffer-other-window buffer))
>        buffer)))
>  
> +(defun org-export--write-output (output encoding)
> +  "Write OUTPUT to file with ENCODING.
> +OUTPUT may be a string or a list of strings.
> +The target file is retrieved from :output-file OUTPUT property or
> +:output-file property in plist stored in `org-export-info' property of
> +each string.
> +
> +Return the file name or a list of file names."
> +  (if (listp output) (mapcar #'org-export--write-output output)
> +    (let ((file (or
> +                 (get-text-property 0 :output-file output)
> +                 (plist-get
> +                  (get-text-property 0 'org-export-info output)
> +                  :output-file))))
> +      (with-temp-buffer
> +        (insert output)
> +        ;; Ensure final newline.  This is what was done
> +        ;; historically, when we used `write-file'.
> +        ;; Note that adding a newline is only safe for
> +        ;; non-binary data.
> +        (unless (bolp) (insert "\n"))
> +        (let ((coding-system-for-write encoding))
> +	  (write-region nil nil file))
> +        file))))
> +
>  ;;;###autoload
>  (defun org-export-to-file
>      (backend file &optional async subtreep visible-only body-only ext-plist
> @@ -6878,33 +6903,23 @@   (defun org-latex-export-to-latex
>  	    `(let ((output
>  		    (org-export-as
>  		     ',backend ,subtreep ,visible-only ,body-only
> -		     ',ext-plist)))
> -	       (with-temp-buffer
> -		 (insert output)
> -                 ;; Ensure final newline.  This is what was done
> -                 ;; historically, when we used `write-file'.
> -                 ;; Note that adding a newline is only safe for
> -                 ;; non-binary data.
> -                 (unless (bolp) (insert "\n"))
> -		 (let ((coding-system-for-write ',encoding))
> -		   (write-region nil nil ,file)))
> -	       (or (ignore-errors (funcall ',post-process ,file)) ,file)))
> +		     ',ext-plist))
> +                   file)
> +               (setq file (org-export--write-output output ',encoding))
> +               (let ((post (lambda (f) (or (ignore-errors (funcall ',post-process f)) f))))
> +                 (if (listp file) (mapcar post file) (funcall post file)))))
>          (let ((output (org-export-as
> -                       backend subtreep visible-only body-only ext-plist)))
> -          (with-temp-buffer
> -            (insert output)
> -            ;; Ensure final newline.  This is what was done
> -            ;; historically, when we used `write-file'.
> -            ;; Note that adding a newline is only safe for
> -            ;; non-binary data.
> -            (unless (bolp) (insert "\n"))
> -            (let ((coding-system-for-write encoding))
> -	      (write-region nil nil file)))
> +                       backend subtreep visible-only body-only ext-plist))
> +              file)
> +          (setq file (org-export--write-output output encoding))
>            (when (and (org-export--copy-to-kill-ring-p) (org-string-nw-p output))
>              (org-kill-new output))
>            ;; Get proper return value.
> -          (or (and (functionp post-process) (funcall post-process file))
> -	      file))))))
> +          (let ((post (lambda (f)
> +                        (or (and (functionp post-process)
> +                                 (funcall post-process f))
> +	                    f))))
> +            (if (listp file) (mapcar post file) (funcall post file))))))))
>  
>  (defun org-export-output-file-name (extension &optional subtreep pub-dir)
>    "Return output file's name according to buffer specifications.
> -- 
> 2.45.2
> 

> 
> -- 
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2024-07-24 11:25 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-03  9:44 multipage html output Orm Finnendahl
2024-07-03 10:33 ` Dr. Arne Babenhauserheide
2024-07-03 10:58 ` Christian Moe
2024-07-03 11:05   ` Ihor Radchenko
2024-07-03 14:34     ` Christian Moe
2024-07-04  9:50     ` Orm Finnendahl
2024-07-04 11:41       ` Ihor Radchenko
2024-07-04 13:33         ` Orm Finnendahl
2024-07-04 16:20           ` Ihor Radchenko
2024-07-07 19:33             ` Orm Finnendahl
2024-07-08 15:29               ` Ihor Radchenko
2024-07-08 19:12                 ` Orm Finnendahl
2024-07-09 17:55                   ` Ihor Radchenko
2024-07-10 18:03                     ` Orm Finnendahl
2024-07-10 18:53                       ` Ihor Radchenko
2024-07-07 20:50             ` Orm Finnendahl
2024-07-08 15:05               ` Ihor Radchenko
2024-07-08 15:41                 ` Orm Finnendahl
2024-07-08 15:56                   ` Ihor Radchenko
2024-07-08 19:18                     ` Orm Finnendahl
2024-07-09 18:08                       ` Ihor Radchenko
2024-07-10 19:37                         ` Orm Finnendahl
2024-07-11 12:35                           ` Ihor Radchenko
2024-07-13  7:44                             ` Orm Finnendahl
2024-07-13 10:13                               ` Ihor Radchenko
2024-07-13 11:01                                 ` Orm Finnendahl
2024-07-23  8:56                                 ` Orm Finnendahl
2024-07-23 10:24                                   ` Ihor Radchenko
2024-07-23 11:35                                     ` Orm Finnendahl
2024-07-23 12:52                                       ` Ihor Radchenko
2024-07-23 14:56                                         ` Orm Finnendahl
     [not found]                                         ` <Zp_EhDDxxYRWKFPL@orm-t14s>
     [not found]                                           ` <874j8g2lvq.fsf@localhost>
2024-07-23 15:36                                             ` Orm Finnendahl
2024-07-23 14:13                                       ` Ihor Radchenko
     [not found]                                         ` <Zp_b2lL2SzDswa-w@orm-t14s>
2024-07-23 17:10                                           ` Ihor Radchenko
2024-07-23 20:35                                             ` Orm Finnendahl
2024-07-24 10:20                                               ` Ihor Radchenko
2024-07-24 11:24                                                 ` Orm Finnendahl
2024-07-23 14:19                                       ` Ihor Radchenko
2024-07-23 15:13                                         ` Orm Finnendahl
2024-07-23 16:20                                           ` Ihor Radchenko
2024-07-23 17:02                                             ` Orm Finnendahl
2024-07-23 17:13                                               ` Ihor Radchenko
2024-07-23 19:00                                                 ` Orm Finnendahl
2024-07-03 21:11 ` Rudolf Adamkovič
  -- strict thread matches above, loose matches on Subject: below --
2024-07-06  5:47 Pedro Andres Aranda Gutierrez
2024-07-06  9:04 ` Orm Finnendahl

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).