From: Nicolas Goaziou <n.goaziou@gmail.com>
To: Org Mode List <emacs-orgmode@gnu.org>
Subject: [ANN] Org Export in contrib
Date: Fri, 25 Nov 2011 18:32:16 +0100
Message-ID: <87ipm8w1jz.fsf@gmail.com>
Hello,
I've pushed org-export.el to contrib. It's a general export engine,
built on top of org-elements, aiming at simplifying the life of both
developers and maintainers (and, therefore, of end-users).
It is meant to supersede org-exp.el and org-special-blocks.el. Keep in
mind, though, that, as advanced as it is, it isn't yet a drop-in
replacement for them. It still lacks an interface (à la `org-export'),
back-ends, and tons of testing and improvements. That being said, it's
usable anyway and one can already write back-ends for it. I'll show
a silly example later in this mail.
Now, let's have a peek into the guts of that beast.
Besides the parser, the generic exporter is made of three distinct
parts:
- The communication channel consists of a property list, which is
created and updated during the process. Its purpose is to offer every
piece of information, be it export options or contextual data,
all in a single place. The exhaustive list of properties is given in
"The Communication Channel" section of the file.
- The transcoder walks the parse tree, ignores or treats as plain text
elements and objects according to export options, and eventually calls
back-end specific functions to do the real transcoding, concatenating
their return values along the way.
- The filter system is activated at the very beginning and the very end
of the export process, and each time an element or an object has been
converted. It is the entry point to fine-tune standard output from
back-end transcoders.
The core function is `org-export-as'. It returns the transcoded buffer
as a string.
In order to derive an exporter out of this generic implementation, one
can define a transcode function for each element or object. Such a
function should return a string for the corresponding element, without
any trailing space, or nil. It must accept three arguments:
1. the element or object itself,
2. its contents, or nil when it isn't recursive,
3. the property list used as a communication channel.
If no such function is found, that element or object type will simply be
ignored, along with any separating blank line. The same will happen if
the function returns the nil value. If that function returns the empty
string, the type will be ignored, but the blank lines will be kept.
Contents, when not nil, are stripped of any global indentation
(although the relative one is preserved). They also always end with
a single newline character.
These functions must follow a strict naming convention:
`org-BACKEND-TYPE' where, obviously, BACKEND is the name of the export
back-end and TYPE the type of the element or object handled.
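To make the contract above concrete, here is a minimal, self-contained
sketch of how a dispatcher could find and call an `org-BACKEND-TYPE'
function. It is not the code from org-export.el: `my-transcode',
`org-toy-bold' and `org-toy-comment' are hypothetical names, and the
element argument is left as nil for the sake of the example.

#+begin_src emacs-lisp
;; Hypothetical illustration, not the actual org-export.el code.
(defun org-toy-bold (_element contents _info)
  "Toy transcoder for `bold' objects in a made-up `toy' back-end."
  (format "*%s*" contents))

(defun org-toy-comment (_element _contents _info)
  "Toy transcoder returning the empty string.
The output is dropped but surrounding blank lines would be kept."
  "")

(defun my-transcode (backend type element contents info)
  "Call the `org-BACKEND-TYPE' function on ELEMENT, if there is one.
Return nil when no such function exists, meaning that TYPE is
ignored along with its separating blank lines."
  (let ((fun (intern (format "org-%s-%s" backend type))))
    (when (fboundp fun)
      (funcall fun element contents info))))

(my-transcode 'toy 'bold nil "text" nil)    ; "*text*"
(my-transcode 'toy 'comment nil nil nil)    ; ""
(my-transcode 'toy 'table nil nil nil)      ; nil, no `org-toy-table'
#+end_src

The lookup by name is what makes the naming convention strict: a
function that does not follow it is simply never found.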
Moreover, two additional functions can be defined. On the one hand,
`org-BACKEND-template' returns the final transcoded string, and can be
used to add a preamble and a postamble to document's body. It must
accept two arguments: the transcoded string and the property list
containing export options. On the other hand, `org-BACKEND-plain-text',
when defined, is to be called on every text not recognized as an element
or an object. It must accept two arguments: the text string and the
information channel.
Any back-end can define its own variables. Among them, the
customizable ones should belong to the `org-export-BACKEND' group. Also,
a special variable, `org-BACKEND-option-alist', allows defining buffer
keywords and "#+options:" items specific to that back-end. See
`org-export-option-alist' for supported defaults and syntax.
Tools for common tasks across back-ends are implemented in the last
part of the file.
* For Maintainers
To word it differently, this exporter doesn't rely on any
text-property during the process. Thus, it makes
`org-if-unprotected-at' and the like obsolete in the whole code base. Org
core doesn't have to bother anymore about its exporter's weaknesses.
Also, buffer pre-processing is reduced to its strict minimum: Babel
code expansion. No footnote normalization, no list markers to add and
remove...
Since the exporter is only a beefed-up parse tree reader, any element
or object added to Elements is available through it with no further
modification. Back-ends just have to create the appropriate new
transcoders, unless that element or object should be ignored anyway.
* For Developers
All data needed is available in two places: the properties associated
to the element being transcoded, through the use of
`org-element-get-property', and the communication channel, with the
help of `plist-get'. Period.
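As a rough illustration of the second access path, the channel can be
pictured as an ordinary property list. The sketch below is
self-contained: `my-make-channel' is a hypothetical helper, and while
the property names are the ones used in this mail, the values are made
up.

#+begin_src emacs-lisp
;; Toy channel: property names come from this mail, values are made up.
(defun my-make-channel ()
  "Return a fresh property list standing in for the communication channel."
  (list :headline-levels 2 :preamble "I hope\nthat it is working"))

(let ((info (my-make-channel)))
  (list (plist-get info :headline-levels)   ; export option
        (plist-get info :preamble)          ; back-end specific keyword
        (plist-get info :title)))           ; absent property
;; (2 "I hope\nthat it is working" nil)
#+end_src

A property missing from the list simply reads as nil, so a transcoder
may want to guard against that before doing arithmetic on a value.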
Also, the exporter takes care of all the filtering required by
options, and enforces the same number of blank lines in the Org buffer
and in the source code (though this can be overcome with the use of
filters). It's easier this way to concentrate on the shape of the
output.
Tools for common tasks (like building table of contents or listings,
or numbering headlines) are provided in the library.
* For Users
Hooks are gone. Sorry. Most of them happened during a pre-process part
that doesn't exist anymore.
Though, there are three ways to configure the output, in order of
increasing power:
1. Variables (customizable or not) are still there, provided by either
the exporter itself or its back-ends.
2. Filter sets are provided to fine-tune output of any
back-end. A filter set is a list of functions, applied in FIFO
order. Each function takes two arguments: the string returned by the
previous function (or the back-end output for the first one) and the
back-end as a symbol. The return value of the last function
replaces back-end's output. If one of the return values is nil, the
element or object on which the filters are applied is ignored in
the final output.
Also, three special filter sets apply on the parse tree, on plain
text, and on the final output.
For example, the LaTeX back-end has the bad habit of "boldifying"
deadline, scheduled and closed strings close to time-stamps in the
buffer. I'd rather have them emphasized. Obviously, I don't want to
annoy other back-ends with this. The following will do the trick.
#+begin_src emacs-lisp
(add-to-list 'org-export-filter-time-stamp-functions
             (lambda (ts backend)
               (if (not (eq backend 'latex))
                   ts
                 (replace-regexp-in-string "textbf" "emph" ts))))
#+end_src
3. Whole parts of any back-end can be redefined (or advised). For
example, if I don't like the way the LaTeX back-end transcodes
verbatim text, I can always create an `org-latex-verbatim' function
of my own.
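The way a filter set (item 2 above) is run can be sketched in a few
lines. This is only an illustration of the semantics described in this
mail, not the exporter's own code: `my-apply-filters' and
`my-emphasize-latex' are hypothetical names.

#+begin_src emacs-lisp
;; Hypothetical illustration of the filter-set semantics.
(defun my-apply-filters (filters value backend)
  "Apply FILTERS, a list of functions, to VALUE in list order.
Each function receives the previous result and BACKEND, a symbol.
Once a filter returns nil, remaining filters are skipped and nil
is returned."
  (dolist (filter filters value)
    (when value
      (setq value (funcall filter value backend)))))

(defun my-emphasize-latex (ts backend)
  "Replace \"textbf\" with \"emph\" in TS for the `latex' back-end only."
  (if (eq backend 'latex)
      (replace-regexp-in-string "textbf" "emph" ts)
    ts))

(my-apply-filters '(my-emphasize-latex) "\\textbf{DEADLINE:}" 'latex)
;; "\\emph{DEADLINE:}"
(my-apply-filters '(my-emphasize-latex) "\\textbf{DEADLINE:}" 'html)
;; "\\textbf{DEADLINE:}"
(my-apply-filters (list (lambda (_s _b) nil) #'my-emphasize-latex)
                  "\\textbf{DEADLINE:}" 'latex)
;; nil
#+end_src

Passing the back-end as the second argument is what lets one filter
stay out of the way of every other back-end.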
* A (silly) Back-End: `summary'
I want a back-end, `summary', which only exports headlines of the
current buffer, in a markdown format. I would like to have the
opportunity to add a few lines of text before the first headline. It
should also delimit beginning and end of output by ASCII scissors. Oh,
and scissors string should be customizable!
As I only want headlines, I only need to implement an
`org-summary-headline' function. Though, text before the first
headline in my buffer will be ignored (it isn't a headline).
So this back-end will have to define its own buffer keyword:
"#+PREAMBLE:". I need to be able to encounter this keyword more than
once in the buffer as my preamble will probably span more than one
line. The following snippet will do this, and provide the text as the
value of the `:preamble' property in the communication channel. It
also creates a customizable `org-summary-scissors' variable, which is
rightfully added to the `org-export-summary' group.
#+begin_src emacs-lisp
(defcustom org-summary-scissors "--%<--------------------------------\n"
  "String used as a delimiter for summary output.
It should end with a newline character."
  :group 'org-export-summary
  :type 'string)

;; Give the variable an initial value: a bare (defvar VAR) leaves it
;; unbound, and `add-to-list' would then signal a void-variable error.
(defvar org-summary-option-alist nil)
(add-to-list 'org-summary-option-alist
             '(:preamble "PREAMBLE" nil nil newline))
#+end_src
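For intuition only, here is a self-contained sketch of the kind of
accumulation expected for a keyword that appears several times: every
value is collected and the values are joined with newlines. That this
is what the `newline' behavior amounts to is an assumption drawn from
the description above, and `my-collect-keyword' is a hypothetical
helper, not part of the exporter.

#+begin_src emacs-lisp
;; Hypothetical helper mimicking the accumulation described above.
(defun my-collect-keyword (lines keyword)
  "Join with newlines the values of every \"#+KEYWORD:\" line in LINES.
LINES is a list of strings.  Matching is case-insensitive."
  (let ((re (format "\\`#\\+%s:[ \t]*\\(.*\\)\\'" (regexp-quote keyword)))
        (case-fold-search t)
        (acc nil))
    (dolist (line lines)
      (when (string-match re line)
        (push (match-string 1 line) acc)))
    (mapconcat #'identity (nreverse acc) "\n")))

(my-collect-keyword '("#+Title: My Summary Test"
                      "#+Preamble: I hope"
                      "#+Preamble: that it is working")
                    "PREAMBLE")
;; "I hope\nthat it is working"
#+end_src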
Now onto the headline transcoder. A quick look at the
`org-element-headline-parser' function tells me that `:raw-value'
property should be enough, as I need no fancy transformation. I might
want to also use `:level' to get the number of "equal" signs before
the text, but a longer look at the list of properties offered in the
communication channel tells me that `org-export-get-relative-level'
may be more adequate. So be it.
#+begin_src emacs-lisp
(defun org-summary-headline (headline contents info)
  "Return HEADLINE in a Markdown syntax.
CONTENTS is the contents of the headline. INFO is the property
list used as a communication channel."
  (let ((title (org-element-get-property :raw-value headline))
        (pre-blank (org-element-get-property :pre-blank headline))
        (level (org-export-get-relative-level headline info))
        ;; Depth of 6 is a hard limit in HTML (and therefore Markdown)
        ;; syntax.
        (limit (min (plist-get info :headline-levels) 6)))
    (when (<= level limit)
      (concat (make-string level ?=) " " title
              (make-string (1+ pre-blank) ?\n)
              contents))))
#+end_src
#+end_src
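To see what the string-building part produces without loading the
exporter, the same `concat' logic can be isolated in a stand-alone
function. `my-format-headline' is a hypothetical name, and its
arguments replace the values that the real transcoder reads from the
headline and the communication channel.

#+begin_src emacs-lisp
;; Stand-alone copy of the string-building logic, for experimentation.
(defun my-format-headline (title level pre-blank limit contents)
  "Return TITLE as a headline of depth LEVEL followed by CONTENTS.
PRE-BLANK is the number of blank lines after the title.  Return
nil when LEVEL is greater than LIMIT."
  (when (<= level limit)
    (concat (make-string level ?=) " " title
            (make-string (1+ pre-blank) ?\n)
            contents)))

(my-format-headline "Head 1.1" 2 0 2 nil)     ; "== Head 1.1\n"
(my-format-headline "Head 1" 1 1 2 "body\n")  ; "= Head 1\n\nbody\n"
(my-format-headline "Head 1.1.1" 3 0 2 nil)   ; nil
#+end_src

Returning nil for a headline deeper than the limit relies on the rule
stated earlier: a nil return value makes the exporter ignore that
headline.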
This should be sufficient to take care of document's body. Now, I only
need to add the scissors, the preamble, and the title in the final
output. This all happens with the help of the `org-summary-template'
function.
I remember that "#+TITLE:" belongs to `org-element-parsed-keywords',
which means that its value isn't a string but a secondary string. As
I don't want to transcode it (my back-end only knows about headlines),
I'll get the raw value back with the `org-element-interpret-secondary'
function (if I had wanted to transcode it, I would have used
`org-export-secondary-string' instead).
#+begin_src emacs-lisp
(defun org-summary-template (contents info)
  "Return complete document transcoded with summary back-end.
CONTENTS is the body of the document. INFO is the plist used as
a communication channel."
  (let ((title (org-element-interpret-secondary (plist-get info :title)))
        (preamble (plist-get info :preamble)))
    (concat org-summary-scissors
            (upcase title) "\n\n"
            preamble "\n\n"
            contents
            org-summary-scissors)))
#+end_src
Now, I can test all of this with M-: (org-export-as 'summary) on
a test buffer[1]. So far, so good. But I know better and define an
interactive function for that important action. While I'm at it, it
will display my summary in a buffer named "*Summary*".
#+begin_src emacs-lisp
(defun org-export-as-summary ()
  "Create the summary of the current Org buffer.
Summary is displayed in a buffer called \"*Summary*\"."
  (interactive)
  (when (eq major-mode 'org-mode)
    (switch-to-buffer (org-export-to-buffer 'summary "*Summary*"))))
#+end_src
That's all, folks.
I'll try to package its first back-end, org-latex.el, into experimental/
before Monday.
Feedback, as always, is welcome.
Regards,

[1] For example, this one:

--8<---------------cut here---------------start------------->8---
#+Title: My Summary Test
#+Preamble: I hope
#+Preamble: that it is working
#+Options: H:2

Some text that will probably be ignored.

* Head 1
some text
** Head 1.1
some text too
*** Head 1.1.1
Some text again
** Head 1.2
some text
* Head 2 :noexport:
some text
--8<---------------cut here---------------end--------------->8---

--
Nicolas Goaziou