From: Ihor Radchenko <yantar92@gmail.com>
To: "Przemysław Kamiński" <pk@intrepidus.pl>, emacs-orgmode@gnu.org
Subject: Re: official orgmode parser
Date: Wed, 16 Sep 2020 20:27:36 +0800
Message-ID: <87bli5nbyf.fsf@localhost>
In-Reply-To: <fb792b49-7387-7a43-640c-5e76b91b50b1@intrepidus.pl>

FYI: You may find https://github.com/ndwarshuis/org-ml helpful.

Przemysław Kamiński <pk@intrepidus.pl> writes:

> On 9/15/20 2:37 PM, tomas@tuxteam.de wrote:
>> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:
>>
>> [...]
>>
>>> There's the org-json (or ox-json) package but for some reason I
>>> wasn't able to run it successfully. I guess export to S-exps would
>>> be best here. But yes I'll check that out.
>>
>> If that's your route, perhaps the "Org element API" [1] might be
>> helpful. Especially `org-element-parse-buffer' gives you a Lisp
>> data structure which is supposed to be a parse of your Org buffer.
>>
>> From there to S-expression can be trivial (e.g. `print' or `pp'),
>> depending on what you want to do.
>>
>> Walking the structure should be nice in Lisp, too.
>>
>> The topic of (non-Emacs) parsing of Org comes up regularly, and
>> there is a good (but AFAIK not-quite-complete) Org syntax spec
>> in Worg [2], but there are a couple of difficulties to be mastered
>> before such a thing can become really enjoyable and useful.
>>
>> The loose specification of Org's format (arguably its second
>> or third strongest asset, the first two being its incredible
>> community and Emacs itself) is something which makes this
>> problem "interesting". People have invented lots of usages
>> which might be broken should Org change to a strict formal
>> spec. You don't want to break those people.
>>
>> But yes, perhaps some day someone nails it. Perhaps it's you :)
>>
>> Cheers
>>
>> [1] https://orgmode.org/worg/dev/org-element-api.html
>> [2] https://orgmode.org/worg/dev/org-syntax.html
>>
>> - t
>>
>
> So I looked at (pp (org-element-parse-buffer)); however, it prints out
> recursive structures which other schemes have trouble parsing.
>
> My code looks more or less like this:
>
> (defun org-parse (f)
>   (with-temp-buffer
>     (find-file f)
>     (let* ((parsed (org-element-parse-buffer))
>            (all (append org-element-all-elements org-element-all-objects))
>            (mapped (org-element-map parsed all
>                      (lambda (item)
>                        (strip-parent item)))))
>       (pp mapped))))
>
>
> strip-parent is basically (plist-put props :parent nil) on each element's
> properties. However, it turns out there are more recursive objects, like
>
> :title
> #("Headline 1" 0 10
>   (:parent
>    (headline #2
>     (section
>
> So I'm wondering: do I have to do it by hand for all cases, or is there
> some way to output only a simple AST without those nested objects?
>
> Best,
> Przemek
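
For the question quoted above, one possible approach is a generic walk over
the parse tree that drops every :parent entry, both in property lists and in
the text properties attached to strings such as :title. The following is only
a minimal sketch under stated assumptions: it assumes the Org 9.x layout of
that period, where each node is (TYPE PROPS . CONTENTS) and PROPS is a plain
plist, and it assumes :parent is the only source of circular references. The
names my/org-strip-parents and my/org-file-ast are hypothetical and not part
of Org.

```elisp
(require 'org)
(require 'org-element)

(defun my/org-strip-parents (data)
  "Return a copy of DATA with every :parent key/value pair removed.
Strings are copied without text properties, because the parser also
stores a :parent back-reference there."
  (cond
   ((stringp data) (substring-no-properties data))
   ((consp data)
    (let (result)
      ;; Iterate along the list spine so that long sibling lists do
      ;; not exhaust the recursion limit; recurse only into elements.
      (while (consp data)
        (if (eq (car data) :parent)
            ;; Assumption: :parent only occurs as a plist key, so
            ;; skip the key together with its value.
            (setq data (cdr-safe (cdr data)))
          (push (my/org-strip-parents (car data)) result)
          (setq data (cdr data))))
      ;; DATA is now nil or the atom ending a dotted pair.
      (nconc (nreverse result) (my/org-strip-parents data))))
   (t data)))

(defun my/org-file-ast (file)
  "Parse FILE as Org and return its tree without :parent links."
  (with-temp-buffer
    ;; `insert-file-contents' keeps the text in the temporary buffer,
    ;; whereas `find-file' would visit FILE in a separate buffer.
    (insert-file-contents file)
    (org-mode)
    (my/org-strip-parents (org-element-parse-buffer))))

;; Example call (the file name is a placeholder):
;; (pp (my/org-file-ast "notes.org"))
```

Because the walk copies the structure instead of modifying it with plist-put,
the original parse tree is left untouched. The sketch has not been checked
against later Org releases, which may store node properties differently.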