Hi,
I have a question.
At FOSDEM, someone asked me if there was a formal description of the structure of Org files, in some language
that would be the input for a parser (or parser generator?) so that Org files could be easily parsed.
Unfortunately I did not catch the name of the format description language that could be
used for this, nor did I catch the name of the person who talked to me.
Can anyone help out here? Let me know what language to use, and maybe help work on such a formal description? I
think it would be useful to have....
Tools like yacc (bison, antlr, etc.) are all 'executable BNF' languages. When they work, they can make the code an order of magnitude smaller and development correspondingly easier.
That said, I see a few hitches.
1. Grammatical handling of languages is based on the assumption of a clear and well-defined set of tokens/lexemes. I expect this would be harder in Org than in the typical programming languages for which yacc etc. are used. For example, most 'normal' languages have comments and strings. These involve some non-trivial handling, which is entirely hidden from the grammar by being pushed into the lexer.
2. As a rule, parsing is done on the full program (IDEs are the exception). Sensible parsing of program fragments, where the fragmenting could be quite arbitrary, is a bit of a research problem.
3. As I see it, the main declarative tool (somewhat akin to grammars) that Org uses is regular expressions. In other words, Org is written with regexps strung together with programming logic, i.e. vanilla elisp. An alternative that stays within the regular framework (without the heavy guns of context-free parsing) may be ragel: http://www.complang.org/ragel/
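
To make point 3 concrete, here is a rough sketch of the "regexps strung together with programming logic" style, written in Python rather than elisp for brevity. The pattern and the name parse_headlines are made up for illustration and are much simpler than the regexps Org actually uses; it assumes only TODO/DONE keywords and tags of the form :a:b: at the end of the line.

```python
import re

# Hypothetical, simplified headline pattern (not Org's real regexp):
# stars, optional TODO/DONE keyword, title, optional trailing :tag:list:
HEADLINE_RE = re.compile(
    r"^(\*+)\s+(?:(TODO|DONE)\s+)?(.*?)(?:\s+(:[\w@:]+:))?\s*$"
)

def parse_headlines(text):
    """Return (level, keyword, title, tags) for each headline line in text."""
    headlines = []
    for line in text.splitlines():
        m = HEADLINE_RE.match(line)
        if m is None:
            continue  # body text, or something like *bold* with no space after the star
        stars, keyword, title, tags = m.groups()
        tag_list = tags.strip(":").split(":") if tags else []
        headlines.append((len(stars), keyword, title, tag_list))
    return headlines

sample = (
    "* TODO Write grammar :org:parser:\n"
    "Some body text\n"
    "** Notes\n"
    "*bold* is not a headline\n"
)
for entry in parse_headlines(sample):
    print(entry)
```

The regexp does the declarative part, and the surrounding loop does what a grammar would otherwise express (skipping body lines, splitting tags). Nesting by star level, drawers, blocks and so on would each need more of this hand-written logic, which is where a formal description might pay off.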
Rusi