* Org-syntax: Intra-word markup
@ 2021-12-02 10:50 Denis Maier
2021-12-02 11:18 ` Ihor Radchenko
` (2 more replies)
0 siblings, 3 replies; 72+ messages in thread
From: Denis Maier @ 2021-12-02 10:50 UTC (permalink / raw)
To: Org Mode List
Hi everyone,
while we're discussing Org syntax anyway, I thought it was time to
bring up another syntax question:
Currently, org syntax doesn't officially seem to support intra-word
emphasis. Am I missing something?
If the assessment is correct: Is there a reason for this? And, shouldn't
that be officially added?
Best,
Denis
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
@ 2021-12-02 11:18 ` Ihor Radchenko
2021-12-02 11:30 ` Juan Manuel Macías
2021-12-02 11:58 ` Timothy
2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
2 siblings, 1 reply; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 11:18 UTC (permalink / raw)
To: Denis Maier; +Cc: Org Mode List
Denis Maier <denismaier@mailbox.org> writes:
> Currently, org syntax doesn't officially seem to support intra-word
> emphasis. Am I missing something?
intra-*word* works just fine for me.
Best,
Ihor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:18 ` Ihor Radchenko
@ 2021-12-02 11:30 ` Juan Manuel Macías
2021-12-02 11:36 ` Denis Maier
` (2 more replies)
0 siblings, 3 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 11:30 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: orgmode, Denis Maier
Hi Denis and Ihor,
Ihor Radchenko writes:
> Denis Maier <denismaier@mailbox.org> writes:
>
>> Currently, org syntax doesn't officially seem to support intra-word
>> emphasis. Am I missing something?
>
> intra-*word* works just fine for me.
>
> Best,
> Ihor
I think what Denis is referring to is a construction of the type
*intra*word, which, if I'm not mistaken, is not supported and can only
be achieved by inserting a zero width space.
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:30 ` Juan Manuel Macías
@ 2021-12-02 11:36 ` Denis Maier
2021-12-02 12:01 ` Ihor Radchenko
2021-12-02 11:42 ` Marco Wahl
2021-12-02 12:00 ` Ihor Radchenko
2 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 11:36 UTC (permalink / raw)
To: Juan Manuel Macías, Ihor Radchenko; +Cc: orgmode
Yes, Juan Manuel. That's it.
See for reference:
https://stackoverflow.com/questions/1218238/how-to-make-part-of-a-word-bold-in-org-mode
Best,
Denis
On 02.12.2021 at 12:30, Juan Manuel Macías wrote:
> Hi Denis and Ihor,
>
> Ihor Radchenko writes:
>
>> Denis Maier <denismaier@mailbox.org> writes:
>>
>>> Currently, org syntax doesn't officially seem to support intra-word
>>> emphasis. Am I missing something?
>> intra-*word* works just fine for me.
>>
>> Best,
>> Ihor
> I think what Denis is referring to is a construction of the type
> *intra*word, which, if I'm not mistaken, is not supported and can only
> be achieved by inserting a zero width space.
>
> Best regards,
>
> Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:30 ` Juan Manuel Macías
2021-12-02 11:36 ` Denis Maier
@ 2021-12-02 11:42 ` Marco Wahl
2021-12-02 11:50 ` Denis Maier
2021-12-02 12:02 ` Ihor Radchenko
2021-12-02 12:00 ` Ihor Radchenko
2 siblings, 2 replies; 72+ messages in thread
From: Marco Wahl @ 2021-12-02 11:42 UTC (permalink / raw)
To: Juan Manuel Macías; +Cc: orgmode, Ihor Radchenko, Denis Maier
Hi!
>>> Currently, org syntax doesn't officially seem to support intra-word
>>> emphasis. Am I missing something?
>>
>> intra-*word* works just fine for me.
>>
>> Best,
>> Ihor
>
> I think what Denis is referring to is a construction of the type
> *intra*word, which, if I'm not mistaken, is not supported and can only
> be achieved by inserting a zero width space.
Is there a recommended way to insert a zero width space?
BTW occasionally I use
(defun mw-insert-zero-width-whitespace ()
"Insert a space with zero width."
(interactive)
(insert ?\x200B))
Thanks and ciao,
--
Marco
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:42 ` Marco Wahl
@ 2021-12-02 11:50 ` Denis Maier
2021-12-02 12:10 ` Ihor Radchenko
2021-12-02 12:02 ` Ihor Radchenko
1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 11:50 UTC (permalink / raw)
To: Marco Wahl, Juan Manuel Macías; +Cc: orgmode, Ihor Radchenko
On 02.12.2021 at 12:42, Marco Wahl wrote:
> Hi!
>
>>>> Currently, org syntax doesn't officially seem to support intra-word
>>>> emphasis. Am I missing something?
>>>
>>> intra-*word* works just fine for me.
>>>
>>> Best,
>>> Ihor
>>
>> I think what Denis is referring to is a construction of the type
>> *intra*word, which, if I'm not mistaken, is not supported and can only
>> be achieved by inserting a zero width space.
>
> Is there a recommended way to insert a zero width space?
>
> BTW occasionally I use
>
> (defun mw-insert-zero-width-whitespace ()
> "Insert a space with zero width."
> (interactive)
> (insert ?\x200B))
>
>
> Thanks and ciao,
Just a further remark: while zero width spaces can be used as a
workaround, they may create problems in some export formats. E.g., they
will mess up hyphenation in LaTeX. I think I read somewhere that they
can be removed with hooks or filters, but I think that shouldn't be
necessary.
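
For reference, the filter approach mentioned above might look roughly like
the following. This is only a sketch: the function name
`my/org-export-strip-zwsp' is made up for illustration, and it removes
*every* zero width space from the final output, including any that were
inserted on purpose, so it is a blunt instrument.

```elisp
;; Sketch only: strip every zero width space (U+200B) from exported output.
;; `my/org-export-strip-zwsp' is a hypothetical name, not part of Org.
(require 'ox)

(defun my/org-export-strip-zwsp (text backend _info)
  "Return TEXT with all zero width spaces removed.
Leave TEXT untouched when BACKEND is derived from `org', so that
Org-to-Org export keeps the characters the markup depends on."
  (if (org-export-derived-backend-p backend 'org)
      text
    (replace-regexp-in-string "\u200B" "" text)))

;; Run the function on the final output of every export.
(add-to-list 'org-export-filter-final-output-functions
             #'my/org-export-strip-zwsp)
```

Restricting the filter to specific backends (for example only those
derived from `latex') would be a small change to the condition.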
Denis
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
2021-12-02 11:18 ` Ihor Radchenko
@ 2021-12-02 11:58 ` Timothy
2021-12-02 12:26 ` Denis Maier
2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
2 siblings, 1 reply; 72+ messages in thread
From: Timothy @ 2021-12-02 11:58 UTC (permalink / raw)
To: Denis Maier; +Cc: emacs-orgmode
[-- Attachment #1: Type: text/plain, Size: 568 bytes --]
Hi Denis,
> Currently, org syntax doesn’t officially seem to support intra-word emphasis. Am
> I missing something?
I’d describe it as supported via-zero width spaces.
You may be interested in <https://blog.tecosaur.com/tmio/2021-05-31-async.html#easy-zero-width>.
> If the assessment is correct: Is there a reason for this? And, shouldn’t that
> be officially added?
Do you happen to have any ideas on how this could be achieved? I’d rather not
resort to having to do things like `\ast{}' and `\tilde{}' too much.
All the best,
Timothy
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:30 ` Juan Manuel Macías
2021-12-02 11:36 ` Denis Maier
2021-12-02 11:42 ` Marco Wahl
@ 2021-12-02 12:00 ` Ihor Radchenko
[not found] ` <87r1avtdjy.fsf@ucl.ac.uk>
2021-12-02 12:28 ` Denis Maier
2 siblings, 2 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:00 UTC (permalink / raw)
To: Juan Manuel Macías; +Cc: orgmode, Denis Maier
Juan Manuel Macías <maciaschain@posteo.net> writes:
>> intra-*word* works just fine for me.
>>
> I think what Denis is referring to is a construction of the type
> *intra*word, which, if I'm not mistaken, is not supported and can only
> be achieved by inserting a zero width space.
I see. We had a discussion about emphasis issues in
https://orgmode.org/list/8735nnq73n.fsf@localhost
The conclusion from there is that supporting such scenarios will
introduce various edge cases. We would need to make the emphasis parser
more and more complex, inevitably introducing errors.
An alternative may be some kind of "forced" emphasis syntax where Org
does not have to guess about the emphasis using non-transparent rules.
But it's what zero width space is for and it is what we recommend in the
Org manual.
Best,
Ihor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:36 ` Denis Maier
@ 2021-12-02 12:01 ` Ihor Radchenko
0 siblings, 0 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:01 UTC (permalink / raw)
To: Denis Maier; +Cc: Juan Manuel Macías, orgmode
Denis Maier <denismaier@mailbox.org> writes:
> Yes, Juan Manuel. That's it.
>
> See for reference:
> https://stackoverflow.com/questions/1218238/how-to-make-part-of-a-word-bold-in-org-mode
Please, do not use that stackoverflow answer. It is not officially
supported, breaks exporting, and will not work anymore in future Org
versions.
Best,
Ihor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:42 ` Marco Wahl
2021-12-02 11:50 ` Denis Maier
@ 2021-12-02 12:02 ` Ihor Radchenko
1 sibling, 0 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:02 UTC (permalink / raw)
To: Marco Wahl; +Cc: Juan Manuel Macías, orgmode, Denis Maier
Marco Wahl <marcowahlsoft@gmail.com> writes:
> Is there a recommended way to insert a zero width space?
C-x 8 <RET> zero width space <RET> (or C-x 8 <RET> 200b <RET>)
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:50 ` Denis Maier
@ 2021-12-02 12:10 ` Ihor Radchenko
2021-12-02 12:40 ` Denis Maier
2021-12-02 12:48 ` Max Nikulin
0 siblings, 2 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:10 UTC (permalink / raw)
To: Denis Maier; +Cc: Juan Manuel Macías, Marco Wahl, orgmode
Denis Maier <denismaier@mailbox.org> writes:
>
> Just a further remark: while zero width spaces can be used as a
> workaround, they may create problems in some export formats. E.g., they
> will mess up hyphenation in LaTeX. I think I read somewhere that they
> can be removed with hooks or filters, but I think that shouldn't be
> necessary.
Can you create an example of such scenario and post it as a bug?
Probably, we just need to strip all zero-width spaces at the basic ox.el
level.
Best,
Ihor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 11:58 ` Timothy
@ 2021-12-02 12:26 ` Denis Maier
2021-12-02 13:07 ` Ihor Radchenko
0 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:26 UTC (permalink / raw)
To: Timothy; +Cc: emacs-orgmode
Hi Timothy,
On 02.12.2021 at 12:58, Timothy wrote:
> Hi Denis,
>
>> Currently, org syntax doesn’t officially seem to support intra-word emphasis. Am
>> I missing something?
> I’d describe it as supported via-zero width spaces.
>
> You may be interested in <https://blog.tecosaur.com/tmio/2021-05-31-async.html#easy-zero-width>.
Thanks, that's helpful.
>
>> If the assessment is correct: Is there a reason for this? And, shouldn’t that
>> be officially added?
> Do you happen to have any ideas on how this could be achieved? I’d rather not
> resort to having to do things like `\ast{}' and `\tilde{}' too much.
Well, not really. I just don't understand why /intra/word shouldn't mean
\emph{intra}word. Pandoc's markdown supports *intra*word, asciidoc
supports it via unconstrained formatting pairs:
https://docs.asciidoctor.org/asciidoc/latest/text/#unconstrained; so
__intra__word.
And, as org syntax is said to be the superior markup language, I thought
that must be possible ;-)
I understand zero width spaces are the official workaround, but I don't
really like having invisible characters in my documents. Automatically
removing all of them on export might also introduce problems. Perhaps
some have been added on purpose, and not just to help org?
As for suggestions: If just using /intra/word creates ambiguities, what
about the asciidoc solution? So //intra//word?
In fact, I'd even use raw LaTeX for these things. It's true, they are
rare enough. So I wouldn't mind an occasional `\emph{}`.
Best,
Denis
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
[not found] ` <87r1avtdjy.fsf@ucl.ac.uk>
@ 2021-12-02 12:27 ` Denis Maier
2021-12-02 13:06 ` Eric S Fraga
0 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:27 UTC (permalink / raw)
To: Org Mode List
Hi Eric,
On 02.12.2021 at 13:08, Eric S Fraga wrote:
> My solution, in these cases, is to fall back to LaTeX using @@latex:...@@
> (and equivalent for HTML, if desired). Not pretty but I need this so
> seldom that I am happy with the org emphasis support generally.
>
This works if your target is just latex, but not if you have multiple
targets, right?
Denis
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:00 ` Ihor Radchenko
[not found] ` <87r1avtdjy.fsf@ucl.ac.uk>
@ 2021-12-02 12:28 ` Denis Maier
2021-12-02 12:55 ` Ihor Radchenko
1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:28 UTC (permalink / raw)
To: Ihor Radchenko, Juan Manuel Macías; +Cc: orgmode
On 02.12.2021 at 13:00, Ihor Radchenko wrote:
> Juan Manuel Macías <maciaschain@posteo.net> writes:
>
>>> intra-*word* works just fine for me.
>>>
>> I think what Denis is referring to is a construction of the type
>> *intra*word, which, if I'm not mistaken, is not supported and can only
>> be achieved by inserting a zero width space.
> I see. We had a discussion about emphasis issues in
> https://orgmode.org/list/8735nnq73n.fsf@localhost
>
> The conclusion from there is that supporting such scenarios will
> introduce various edge cases. We would need to make the emphasis parser
> more and more complex, inevitably introducing errors.
Thanks, I'll try to read that thread in due time.
>
> An alternative may be some kind of "forced" emphasis syntax where Org
> does not have to guess about the emphasis using non-transparent rules.
> But it's what zero width space is for and it is what we recommend in the
> Org manual.
As for the forced syntax. What do you think about the asciidoc solution?
Denis
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:10 ` Ihor Radchenko
@ 2021-12-02 12:40 ` Denis Maier
2021-12-02 12:54 ` Ihor Radchenko
2021-12-02 12:48 ` Max Nikulin
1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:40 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: Marco Wahl, Juan Manuel Macías, orgmode
[-- Attachment #1: Type: text/plain, Size: 767 bytes --]
On 02.12.2021 at 13:10, Ihor Radchenko wrote:
> Denis Maier<denismaier@mailbox.org> writes:
>
>> Just a further remark: while zero width spaces can be used as a
>> workaround, they may create problems in some export formats. E.g., they
>> will mess up hyphenation in LaTeX. I think I read somewhere that they
>> can be removed with hooks or filters, but I think that shouldn't be
>> necessary.
> Can you create an example of such scenario and post it as a bug?
> Probably, we just need to strip all zero-width spaces at the basic ox.el
> level.
To be clear: That's not an Org bug. It's just that LaTeX won't be able
to hyphenate such a word. If | is a zero width space, the word
"hyphen|ation" is not the same as "hyphenation".
1. hyphenation
2. hyphen|ation
Best,
Denis
[-- Attachment #2.1: Type: text/html, Size: 1431 bytes --]
[-- Attachment #2.2: b7OGd2OT4Kkun0eA.png --]
[-- Type: image/png, Size: 4888 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:10 ` Ihor Radchenko
2021-12-02 12:40 ` Denis Maier
@ 2021-12-02 12:48 ` Max Nikulin
1 sibling, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-02 12:48 UTC (permalink / raw)
To: emacs-orgmode
On 02/12/2021 19:10, Ihor Radchenko wrote:
> Denis Maier writes:
>
>> Just a further remark: while zero width spaces can be used as a
>> workaround, they may create problems in some export formats. E.g., they
>> will mess up hyphenation in LaTeX. I think I read somewhere that they
>> can be removed with hooks or filters, but I think that shouldn't be
>> necessary.
>
> Probably, we just need to strip all zero-width spaces at the basic ox.el
> level.
I think, legitimate cases when zero-width spaces should be preserved in
a document may exist, so unconditionally stripping them is not a perfect
solution.
I am afraid regexps detecting the start and end of emphasis are like a
blanket that is too short: they will always fail for some cases, especially
since verbatim, URLs and similar contexts (which differ significantly from
prose with respect to punctuation) do not have higher priority for the parser.
An extensive test set is required for tuning the heuristics. Failures should
be reported in such a way that overall quality can be estimated before
and after a change. Ideally, the format of the file with such tests should
allow the *same* input data to be used by other tools like ruby-org.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:40 ` Denis Maier
@ 2021-12-02 12:54 ` Ihor Radchenko
2021-12-02 13:14 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:54 UTC (permalink / raw)
To: Denis Maier; +Cc: Juan Manuel Macías, Marco Wahl, orgmode
Denis Maier <denismaier@mailbox.org> writes:
>> Can you create an example of such scenario and post it as a bug?
>> Probably, we just need to strip all zero-width spaces at the basic ox.el
>> level.
> To be clear: That's not an Org bug. It's just that LaTeX won't be able
> to hyphenate such a word. If | is a zero width space, the word
> "hyphen|ation" is not the same as "hyphenation".
> 1. hyphenation
> 2. hyphen|ation
You are right for your example, but if we force the user to put
*hyphen*|ation to create bold emphasis, it should not be any different
compared to @@latex:\textbf{hyphen}ation@@. Meanwhile the *hyphen*|ation
gets exported as \textbf{hyphen}|ation keeping the zero width space.
Best,
Ihor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:28 ` Denis Maier
@ 2021-12-02 12:55 ` Ihor Radchenko
0 siblings, 0 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:55 UTC (permalink / raw)
To: Denis Maier; +Cc: Juan Manuel Macías, orgmode
Denis Maier <denismaier@mailbox.org> writes:
>> An alternative may be some kind of "forced" emphasis syntax where Org
>> does not have to guess about the emphasis using non-transparent rules.
>> But it's what zero width space is for and it is what we recommend in the
>> Org manual.
> As for the forced syntax. What do you think about the asciidoc solution?
Can you elaborate?
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:27 ` Denis Maier
@ 2021-12-02 13:06 ` Eric S Fraga
0 siblings, 0 replies; 72+ messages in thread
From: Eric S Fraga @ 2021-12-02 13:06 UTC (permalink / raw)
To: Denis Maier; +Cc: Org Mode List
On Thursday, 2 Dec 2021 at 13:27, Denis Maier wrote:
> This works if your target is just latex, but not if you have multiple
> targets, right?
Multiple targets are possible:
@@latex:\textbf{@@@@html:<strong>@@intra@@latex:}@@@@html:</strong>@@word.
Just very ugly! 🤣
Of course, if you do this more than once, a macro can help...
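
One possible shape for such a macro, as a sketch only: the helper name
`my/intra-word-bold' and the macro name `b' are made up here, and the
helper has to be loaded before export. The text itself stays outside the
export snippets, so backends other than LaTeX and HTML still export it,
just without the emphasis.

```elisp
;; Sketch only: `my/intra-word-bold' is a hypothetical helper, not part of Org.
(defun my/intra-word-bold (text)
  "Wrap TEXT in export snippets that make it bold in LaTeX and HTML.
TEXT itself is left outside the snippets, so other backends still
export it as plain text."
  (concat "@@latex:\\textbf{@@@@html:<strong>@@"
          text
          "@@latex:}@@@@html:</strong>@@"))

;; In the Org file (assuming the helper above is loaded before export):
;;   #+MACRO: b (eval (my/intra-word-bold $1))
;;   {{{b(intra)}}}word
```

The long snippet chain then appears only once, inside the helper.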
--
: Eric S Fraga, with org release_9.5.1-231-g6766c4 in Emacs 29.0.50
: Latest paper written in org: https://arxiv.org/abs/2106.05096
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:26 ` Denis Maier
@ 2021-12-02 13:07 ` Ihor Radchenko
2021-12-02 15:51 ` Max Nikulin
2021-12-02 19:03 ` Nicolas Goaziou
0 siblings, 2 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 13:07 UTC (permalink / raw)
To: Denis Maier; +Cc: emacs-orgmode, Nicolas Goaziou, Timothy
Denis Maier <denismaier@mailbox.org> writes:
> As for suggestions: If just using /intra/word creates ambiguities, what
> about the asciidoc solution? So //intra//word?
I do like this idea.
Though I would also like to hear Nicolas' opinion.
Best,
Ihor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 12:54 ` Ihor Radchenko
@ 2021-12-02 13:14 ` Juan Manuel Macías
2021-12-02 13:28 ` Denis Maier
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 13:14 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: orgmode, denismaier
Ihor Radchenko writes:
> Denis Maier <denismaier@mailbox.org> writes:
>
>>> Can you create an example of such scenario and post it as a bug?
>>> Probably, we just need to strip all zero-width spaces at the basic ox.el
>>> level.
>> To be clear: That's not an Org bug. It's just that LaTeX won't be able
>> to hyphenate such a word. If | is a zero width space, the word
>> "hyphen|ation" is not the same as "hyphenation".
>> 1. hyphenation
>> 2. hyphen|ation
>
> You are right for your example, but if we force the user to put
> *hyphen*|ation to create bold emphasis, it should not be any different
> compared to @@latex:\textbf{hyphen}ation@@. Meanwhile the *hyphen*|ation
> gets exported as \textbf{hyphen}|ation keeping the zero width space.
--
I would say that they are very random cases, and therefore difficult to
reproduce. In the 'hyphenation' example, if we load the package
showhyphens, you see that:
/hyphen/ation (with zero width sp)
and
\emph{hyphen}ation
they are cut in the same way. But differently from
hyphenation (without emphasis)
(compiled with LuaTeX).
Anyway, I have come across some curious cases. For example, a long time
ago I had defined a macro for text in other languages:
#+MACRO: lg (eval (if (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\foreignlanguage{@@" $1 "@@latex:}{@@" "\u200B" $2 "\u200B" "@@latex:}@@") $2))
I needed to add before and after a zero width space, but doing so, the
shape of the text was altered. That can be reproduced with this example:
#+LaTeX_Header: \usepackage{showhyphens}
#+LaTeX_Header:\usepackage{lipsum,multicol}
#+LaTeX_Header:\usepackage[spanish]{babel}
#+LaTeX_Header: \def\example{\lipsum[1]}
#+LaTeX_Header: \def\zwsp{\char"200B{}}
#+OPTIONS: toc:nil
@@latex:\begin{multicols}{2}@@
@@latex:\foreignlanguage{italian}{\zwsp\example\zwsp}@@
@@latex:\foreignlanguage{italian}{\example}@@
@@latex:\end{multicols}@@
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 13:14 ` Juan Manuel Macías
@ 2021-12-02 13:28 ` Denis Maier
0 siblings, 0 replies; 72+ messages in thread
From: Denis Maier @ 2021-12-02 13:28 UTC (permalink / raw)
To: Juan Manuel Macías, Ihor Radchenko; +Cc: orgmode
[-- Attachment #1: Type: text/plain, Size: 2468 bytes --]
On 02.12.2021 at 14:14, Juan Manuel Macías wrote:
> Ihor Radchenko writes:
>
>> Denis Maier<denismaier@mailbox.org> writes:
>>
>>>> Can you create an example of such scenario and post it as a bug?
>>>> Probably, we just need to strip all zero-width spaces at the basic ox.el
>>>> level.
>>> To be clear: That's not an Org bug. It's just that LaTeX won't be able
>>> to hyphenate such a word. If | is a zero width space, the word
>>> "hyphen|ation" is not the same as "hyphenation".
>>> 1. hyphenation
>>> 2. hyphen|ation
>> You are right for your example, but if we force the user to put
>> *hyphen*|ation to create bold emphasis, it should not be any different
>> compared to @@latex:\textbf{hyphen}ation@@. Meanwhile the *hyphen*|ation
>> gets exported as \textbf{hyphen}|ation keeping the zero width space.
>
> I would say that they are very random cases, and therefore difficult to
> reproduce. In the 'hyphenation' example, if we load the package
> showhyphens, you see that:
>
> /hyphen/ation (with zero width sp)
>
> and
>
> \emph{hyphen}ation
>
> they are cut in the same way. But differently from
>
> hyphenation (without emphasis)
>
> (compiled with LuaTeX).
>
> Anyway, I have come across some curious cases. For example, a long time
> ago I had defined a macro for text in other languages:
>
> #+MACRO: lg (eval (if (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\foreignlanguage{@@" $1 "@@latex:}{@@" "\u200B" $2 "\u200B" "@@latex:}@@") $2))
>
> I needed to add before and after a zero width space, but doing so, the
> shape of the text was altered. That can be reproduced with this example:
>
> #+LaTeX_Header: \usepackage{showhyphens}
> #+LaTeX_Header:\usepackage{lipsum,multicol}
> #+LaTeX_Header:\usepackage[spanish]{babel}
> #+LaTeX_Header: \def\example{\lipsum[1]}
> #+LaTeX_Header: \def\zwsp{\char"200B{}}
> #+OPTIONS: toc:nil
>
> @@latex:\begin{multicols}{2}@@
> @@latex:\foreignlanguage{italian}{\zwsp\example\zwsp}@@
> @@latex:\foreignlanguage{italian}{\example}@@
> @@latex:\end{multicols}@@
>
> Best regards,
> Juan Manuel
Thanks Juan Manuel. I should have tried that first. Hyphenation is the
same for both /hyphen/ation (with zero width sp) and
\emph{hyphen}ation. (Maybe I can nudge Hans Hagen to add some low level
trickery in context that removes the groups before doing the
hyphenation... but that's a different story.) Anyway, as Juan Manuel
shows there can be cases where zero width spaces cause problems.
Denis
[-- Attachment #2: Type: text/html, Size: 3801 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
@ 2021-12-02 13:36 autofrettage
2021-12-02 15:24 ` Robert Pluim
0 siblings, 1 reply; 72+ messages in thread
From: autofrettage @ 2021-12-02 13:36 UTC (permalink / raw)
To: emacs-orgmode@gnu.org
Someone brought up edge and corner cases, so I simply have to mention the German gender stars ("Gendersternchen").
In an effort to make German gender neutral, some individuals use '*' in the midst of some words, e.g. rower.
Ordinary German:
male rower = Ruderer
female rower = Ruderin
Gender neutral German with gender star:
any kind of rower = Ruder*in
Yours
Rasmus
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 13:36 Org-syntax: Intra-word markup autofrettage
@ 2021-12-02 15:24 ` Robert Pluim
2021-12-02 17:11 ` autofrettage
0 siblings, 1 reply; 72+ messages in thread
From: Robert Pluim @ 2021-12-02 15:24 UTC (permalink / raw)
To: autofrettage; +Cc: emacs-orgmode@gnu.org
>>>>> On Thu, 02 Dec 2021 13:36:48 +0000, autofrettage <autofrettage@protonmail.ch> said:
autofrettage> Someone brought up edge and corner cases, so I simply have to mention the German gender stars ("Gendersternchen").
autofrettage> In an effort to make German gender neutral, some individuals use '*' in the midst of some words, e.g. rower.
autofrettage> Ordinary German:
autofrettage> male rower = Ruderer
autofrettage> female rower = Ruderin
autofrettage> Gender neutral German with gender star:
autofrettage> any kind of rower = Ruder*in
But with the 'female' suffix? Thatʼs almost as bad as 'écriture
inclusive'. Surely 'Ruder**'? 😇
Robert
--
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 13:07 ` Ihor Radchenko
@ 2021-12-02 15:51 ` Max Nikulin
2021-12-02 18:11 ` Tom Gillespie
2021-12-02 19:03 ` Nicolas Goaziou
1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-02 15:51 UTC (permalink / raw)
To: emacs-orgmode
On 02/12/2021 20:07, Ihor Radchenko wrote:
>
>> As for suggestions: If just using /intra/word creates ambiguities, what
>> about the asciidoc solution? So //intra//word?
>
> I do like this idea.
- Some //text <https://orgmode.org/> surprise//
- ++another ~i++~ problem++
First wins...
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 15:24 ` Robert Pluim
@ 2021-12-02 17:11 ` autofrettage
0 siblings, 0 replies; 72+ messages in thread
From: autofrettage @ 2021-12-02 17:11 UTC (permalink / raw)
To: Robert Pluim; +Cc: emacs-orgmode@gnu.org
On Thursday, December 2nd, 2021 at 4:24 PM, Robert Pluim <rpluim@gmail.com> wrote:
>> autofrettage> any kind of rower = Ruder*in
>
> But with the 'female' suffix? Thatʼs almost as bad as 'écriture
> inclusive'. Surely 'Ruder**'? 😇
The German wikipedia page* about gender neutral language is well
over 30 k words long, and there are almost 250 bibliographic
references. It lists a number of alternatives, such as (based
on Lehrer and Lehrerin, the German words for teacher):
+ Lehrx
+ Lehry
+ Lehrerin
+ Lehrer/-in
+ Lehrer/in
+ LehrerIn
+ Lehrer(in)
+ Lehrer:in
+ Lehrer*in
+ Lehrer_in
+ Lehrer_In
+ Lehrer•in
+ Lehrkraft
+ Lehrperson
+ Lehrende
+ ...
So, by all means, join the party. They will consider all aspects
of your suggestion, and being dead serious about it.
Yours
Rasmus
* https://de.wikipedia.org/wiki/Geschlechtergerechte_Sprache
p.s. There are even browser plug-ins, removing all of this
political correctness, making texts _much_ easier to read.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 15:51 ` Max Nikulin
@ 2021-12-02 18:11 ` Tom Gillespie
2021-12-02 19:09 ` Juan Manuel Macías
` (3 more replies)
0 siblings, 4 replies; 72+ messages in thread
From: Tom Gillespie @ 2021-12-02 18:11 UTC (permalink / raw)
To: emacs-orgmode
I don't mean to be a wet blanket, but the edge cases for
the current markup syntax are already hard enough to
implement correctly, to the point where different parts of
Org mode are inconsistent. Intra-word markup isn't viable
because there simply isn't any sane way to parse something
like *hello world*/hrm/oh no*. The other issue is that this will
degrade parsing performance because almost every
character could precede the start of a markup section.
I recommend anyone suggesting solutions try to implement
something that can parse the markup unambiguously with
lots of nasty test cases. You will likely find that it is impossible
to consistently tokenize markup, and that you have to hand
write a whole bunch of heuristics, making Org syntax even
harder to implement correctly.
Any solution that suggests extending how =/*~+_ can be
used gets a hard no from me. I could see teaching other
exporters how to interpret \emph{hello}world, but trying
to get any sane behavior for something like
why *hello*world oh no a wild askterisk*
is not worth it.
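
To make the ambiguity concrete, here is a toy matcher. It is a
deliberately simplified stand-in, not Org's actual emphasis regexp: the
names `my/bold-re' and `my/bold-span' and the exact PRE/POST character
sets are assumptions chosen for illustration. The opening marker must
follow the start of the line or a PRE character, and the closing marker
must be followed by the end of the line or a POST character (which here
includes the zero width space).

```elisp
;; Toy model only: NOT the regexp Org uses, just the same general shape.
(defconst my/bold-re
  (concat "\\(?:^\\|[- \t('\"{]\\)"                        ; PRE context
          "\\(\\*[^ \t\n*]\\(?:[^\n]*?[^ \t\n]\\)?\\*\\)"  ; *body*
          "\\(?:$\\|[- \t.,:!?;'\")}\u200B]\\)")           ; POST context
  "Simplified, hypothetical pattern for bold markup on one line.")

(defun my/bold-span (string)
  "Return the first bold span `my/bold-re' finds in STRING, or nil."
  (when (string-match my/bold-re string)
    (match-string 1 string)))

;; Marker preceded by a hyphen and followed by a space: matches.
(my/bold-span "intra-*word* ok")
;; Closing marker followed by a letter: no match at all.
(my/bold-span "*intra*word")
;; Same text with a zero width space after the marker: matches.
(my/bold-span "*intra*\u200Bword")
;; A stray asterisk later on the line silently extends the span.
(my/bold-span "why *hello*world oh no a wild asterisk*")
```

In this toy model the last call returns everything from the first
asterisk to the stray one at the end of the line, which is the kind of
surprise that relaxing the POST rule would make far more common.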
Best,
Tom
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 13:07 ` Ihor Radchenko
2021-12-02 15:51 ` Max Nikulin
@ 2021-12-02 19:03 ` Nicolas Goaziou
2021-12-02 19:34 ` Juan Manuel Macías
2021-12-03 14:24 ` Max Nikulin
1 sibling, 2 replies; 72+ messages in thread
From: Nicolas Goaziou @ 2021-12-02 19:03 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: Timothy, emacs-orgmode, Denis Maier
Hello,
Ihor Radchenko <yantar92@gmail.com> writes:
> Denis Maier <denismaier@mailbox.org> writes:
>
>> As for suggestions: If just using /intra/word creates ambiguities, what
>> about the asciidoc solution? So //intra//word?
>
> I do like this idea.
>
> Though I would also like to hear Nicolas' opinion.
I sympathize with the idea of intra-word emphasis, but the syntax above is
going to cause some ambiguous situations.
I do think the marker + zero-width space is one way to go. We could, as
an improvement, consider zero-width spaces around emphasis markers to be
part of the markup, and replace them along with the markers during export.
Another solution is to introduce a less-subtle, but less prone to
ambiguity, syntax, e.g.,
/{bold}/markup or /|bold|/markup
where /{ }/ or /| |/ become "extended" markers.
I find the zero-width space solution much more elegant. It also doesn't
change current syntax, which is a big advantage.
Regards,
--
Nicolas Goaziou
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 18:11 ` Tom Gillespie
@ 2021-12-02 19:09 ` Juan Manuel Macías
2021-12-04 13:07 ` Org-syntax: emphasis and not English punctuation Max Nikulin
2021-12-02 20:47 ` Org-syntax: Intra-word markup Denis Maier
` (2 subsequent siblings)
3 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 19:09 UTC (permalink / raw)
To: Tom Gillespie; +Cc: orgmode
Tom Gillespie writes:
> I don't mean to be a wet blanket, but the edge cases for
> the current markup syntax are already hard enough to
> implement correctly, to the point where different parts of
> Org mode are inconsistent. Intra-word markup isn't viable
> because there simply isn't any sane way to parse something
> like *hello world*/hrm/oh no*. The other issue is that this will
> degrade parsing performance because almost every
> character could precede the start of a markup section.
>
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.
>
> Any solution that suggests extending how =/*~+_ can be
> used gets a hard no from me. I could see teaching other
> exporters how to interpret \emph{hello}world, but trying for
> to have any sane behavior for something like
> why *hello*world oh no a wild askterisk*
> is not worth it.
I believe that emphasis marks are a part of Org that can be very
shocking to new users. I mean, there is a series of behaviors that seem
obvious and trivial in the emphasized text, but that in Org are not
possible out of the box, unless you configure
`org-emphasis-regexp-components'. Three quick examples. This in Org is
not possible out of the box:
#+begin_example
[/emphasis/]
¡/emphasis/!
¿/Emphasis/?
#+end_example
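Why these three fail can be sketched in Python. The PRE and POST character sets below approximate the default value of `org-emphasis-regexp-components' and are written from memory, so treat them as assumptions and check your Org version. The regexp is a simplified single-line stand-in, not the one Org builds.

```python
import re

# Assumed defaults: characters allowed before the opening marker (PRE)
# and after the closing marker (POST), plus any whitespace.
PRE = r"[\s" + re.escape("-('\"{") + "]"
POST = r"[\s" + re.escape("-.,:!?;'\")}[") + "]"

EMPHASIS = re.compile(
    rf"(?:^|{PRE})"      # start of line or a PRE character
    r"([*/_=~+])"        # opening marker
    r"(\S|\S.*?\S)"      # body with no whitespace at either border
    r"\1"                # the same marker closes
    rf"(?={POST}|$)"     # followed by a POST character or end of line
)

def is_emphasized(line):
    """True if the simplified regexp finds emphasis somewhere in the line."""
    return EMPHASIS.search(line) is not None

for sample in ["(/emphasis/)", "[/emphasis/]", "¡/emphasis/!", "¿/Emphasis/?"]:
    print(sample, is_emphasized(sample))
```

Under these assumed sets, only the parenthesized sample matches, because "[", "¡" and "¿" are not in PRE.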
Nor is it possible ---out of the box--- to extend emphasis beyond a
certain number of lines. New users who come from other forms of markup
may expect the obvious to be something like:
some-text begin-emphasis whatever-is-in-between end-emphasis more-text
Over time one ends up seeing these things more as a feature than as a
bug :-) But those little inconsistencies make the Org syntax a bit ugly,
IMHO. I can't think of how to improve that, though.
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 19:03 ` Nicolas Goaziou
@ 2021-12-02 19:34 ` Juan Manuel Macías
2021-12-02 23:05 ` Nicolas Goaziou
2021-12-03 14:24 ` Max Nikulin
1 sibling, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 19:34 UTC (permalink / raw)
To: Nicolas Goaziou; +Cc: Denis Maier, emacs-orgmode, Ihor Radchenko, Timothy
Hi Nicolas and all,
Nicolas Goaziou writes:
> I find zero-with spaces solution much more elegant. It also doesn't
> change current syntax, which is a big advantage.
I agree that zero width spaces work fine as a solution, but I think they
should not be understood as part of the syntax but as a one-off
(temporary?) remedy for certain scenarios. As mentioned before, in LaTeX
zero width spaces can produce unexpected effects and modify the final
form of the text (at least in luatex). I also don't know if it would be
useful to remove all zero width spaces in the export process, because in
some cases the user may want to keep them, as I think Maxim commented in
a previous message.
As for the solution of using complementary marks ("//...//", etc.), I
think it would undermine consistency, as those marks would only be to
fix exceptions.
It's a tricky subject...
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 18:11 ` Tom Gillespie
2021-12-02 19:09 ` Juan Manuel Macías
@ 2021-12-02 20:47 ` Denis Maier
2021-12-02 22:44 ` Samuel Wales
2021-12-03 14:53 ` Max Nikulin
2021-12-03 23:51 ` Tim Cross
3 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 20:47 UTC (permalink / raw)
To: Tom Gillespie, emacs-orgmode
Am 02.12.2021 um 19:11 schrieb Tom Gillespie:
> I don't mean to be a wet blanket, but the edge cases for
> the current markup syntax are already hard enough to
> implement correctly, to the point where different parts of
> Org mode are inconsistent. Intra-word markup isn't viable
> because there simply isn't any sane way to parse something
> like *hello world*/hrm/oh no*. The other issue is that this will
> degrade parsing performance because almost every
> character could precede the start of a markup section.
>
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.
>
> Any solution that suggests extending how =/*~+_ can be
> used gets a hard no from me. I could see teaching other
> exporters how to interpret \emph{hello}world, but trying for
> to have any sane behavior for something like
> why *hello*world oh no a wild askterisk*
> is not worth it.
As I've said before, I could well live with \emph{what}ever or something
similar.
Denis
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 20:47 ` Org-syntax: Intra-word markup Denis Maier
@ 2021-12-02 22:44 ` Samuel Wales
0 siblings, 0 replies; 72+ messages in thread
From: Samuel Wales @ 2021-12-02 22:44 UTC (permalink / raw)
To: Denis Maier; +Cc: Tom Gillespie, emacs-orgmode
a silly question. don't we already use something kinda similar to
\emph{what}ever for all backends? could we do so?
On 12/2/21, Denis Maier <denismaier@mailbox.org> wrote:
> Am 02.12.2021 um 19:11 schrieb Tom Gillespie:
>> I don't mean to be a wet blanket, but the edge cases for
>> the current markup syntax are already hard enough to
>> implement correctly, to the point where different parts of
>> Org mode are inconsistent. Intra-word markup isn't viable
>> because there simply isn't any sane way to parse something
>> like *hello world*/hrm/oh no*. The other issue is that this will
>> degrade parsing performance because almost every
>> character could precede the start of a markup section.
>>
>> I recommend anyone suggesting solutions try to implement
>> something that can parse the markup unambiguously with
>> lots of nasty test cases. You will likely find that it is impossible
>> to consistently tokenize markup, and that you have to hand
>> write a whole bunch of heuristics, making Org syntax even
>> harder to implement correctly.
>>
>> Any solution that suggests extending how =/*~+_ can be
>> used gets a hard no from me. I could see teaching other
>> exporters how to interpret \emph{hello}world, but trying for
>> to have any sane behavior for something like
>> why *hello*world oh no a wild askterisk*
>> is not worth it.
>
> As I've said before, I could well live with \emph{what}ever or something
> similar.
>
> Denis
>
>
--
The Kafka Pandemic
Please learn what misopathy is.
https://thekafkapandemic.blogspot.com/2013/10/why-some-diseases-are-wronged.html
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 19:34 ` Juan Manuel Macías
@ 2021-12-02 23:05 ` Nicolas Goaziou
2021-12-02 23:24 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: Nicolas Goaziou @ 2021-12-02 23:05 UTC (permalink / raw)
To: Juan Manuel Macías
Cc: Timothy, emacs-orgmode, Ihor Radchenko, Denis Maier
Hello,
Juan Manuel Macías <maciaschain@posteo.net> writes:
> I agree that zero width spaces work fine as a solution, but I think they
> should not be understood as part of the syntax but as a punctual
> (temporal?) remedy to certain scenarios. As mentioned before, in LaTeX
> zero width spaces can produce unexpected effects and modify the final
> form of the text (at least in luatex). I also don't know if it would be
> useful to remove all zero width spaces in the export process, because in
> some cases the user may want to keep them, as I think Maxim commented in
> a previous message.
We may be misunderstanding each other.
I'm suggesting to remove zero-width spaces contiguous to emphasis
markers only. Therefore the LaTeX process would not see them. Other zero
width spaces, e.g., inserted by the user, are kept. AFAICT, the last two
points you mention are not relevant to my proposal.
Besides, they are already part of the syntax, in some way. So that ship has
sailed long ago.
Regards,
--
Nicolas Goaziou
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 23:05 ` Nicolas Goaziou
@ 2021-12-02 23:24 ` Juan Manuel Macías
0 siblings, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 23:24 UTC (permalink / raw)
To: Nicolas Goaziou; +Cc: orgmode
Nicolas Goaziou writes:
> I'm suggesting to remove zero-width spaces contiguous to emphasis
> markers only. Therefore LaTeX process would npot see them. Other zero
> width spaces, e.g., inserted by user, are kept. AFAICT, the two last
> points you mention are not relevant with my proposal.
>
> Besides, they already part of the syntax, in some way. So that ship has
> sailed long ago.
I understand that it is too late to change certain things, but that is
not an impediment for me to continue to think that using the character
U+200B as a part (at least /de facto/) of the syntax is still shocking
and weird.
On the other hand, what was expected in Org would have been to have the
emphasis marks and at the same time have a universal escape character
for those emphasis marks. In the same way as I can write in markdown:
*foo* AND \*foo\*. In Org we have the emphasis marks but not the escape
character. That was probably the cause of many issues that are being
discussed here. But that means also entering the realm of assumptions.
Still, I wanted to leave an opinion on this question in particular.
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 19:03 ` Nicolas Goaziou
2021-12-02 19:34 ` Juan Manuel Macías
@ 2021-12-03 14:24 ` Max Nikulin
2021-12-03 15:01 ` Juan Manuel Macías
2021-12-04 15:57 ` Denis Maier
1 sibling, 2 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-03 14:24 UTC (permalink / raw)
To: emacs-orgmode
On 03/12/2021 02:03, Nicolas Goaziou wrote:
>> Denis Maier writes:
>>
>>> As for suggestions: If just using /intra/word creates ambiguities, what
>>> about the asciidoc solution? So //intra//word?
>
> I sympathize to the idea of intra-word emphasis, but the syntax above is
> going to cause some ambiguous situations.
I suppose, some more general solution is required.
> I do think the marker + zero-width space is one way to go. We could, as
> an improvement, consider zero-width spaces around emphasis markers to be
> part of the markup, and replace them along during export.
Zero-width spaces adjacent to emphasis markers are a better idea than
replacing every zero-width space. However I agree with Juan Manuel that white
space characters, especially completely invisible (I am not Eli who sees
such special characters by moving cursor through them) should not be
overloaded. From my point of view, it is acceptable to use zero width
spaces as a workaround but they should not become official part of Org
syntax.
> Another solution is to introduce a less-subtle, but less prone to
> ambiguity, syntax, e.g.,
>
> /{bold}/markup or /|bold|/markup
>
> where /{ }/ or /| |/ become "extended" markers.
More explicit markup leaves less room for ambiguities, and I like the
idea for this reason. On the other hand, it diverges from the principle of
lightweight markup. Almost the only special character in TeX is "\",
HTML has three, "&<>", with simple escape rules. Org uses many
special characters to avoid verbosity and requires some tricks to escape
them. Markers like "\{" make Org more verbose but do not make it more
strict; a lot of things still rely on heuristics.
I have an idea of what can be done when some special markup is required
that does not fit into the current syntax. Unfortunately, some new constructs
would have to be introduced anyway: inline objects and multiline elements that
represent a simplified result of parsed Org structures:
((italic "intra") "word")
wrapped with some markup. It should satisfy any special needs (and should
even allow creating otherwise impossible, invalid constructs). Maybe the idea of
combining lightweight markup with low-level blocks better suits
some other project with a more expressive internal representation. In Org
it may become the most hated feature.
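The point that such a tree is backend-neutral can be sketched in Python. The `Node` class, the tag and command tables, and both render functions are hypothetical names for this illustration; they are not Org's internal representation or exporter API, and the LaTeX branch skips escaping for brevity.

```python
from dataclasses import dataclass
from html import escape

@dataclass
class Node:
    kind: str        # e.g. "italic" or "bold"
    children: tuple  # strings or nested Node objects

HTML_TAGS = {"italic": "i", "bold": "b"}
LATEX_COMMANDS = {"italic": "emph", "bold": "textbf"}

def to_html(item):
    """Render a string, a Node, or a plain sequence of those as HTML."""
    if isinstance(item, str):
        return escape(item)
    if isinstance(item, Node):
        tag = HTML_TAGS[item.kind]
        return f"<{tag}>{to_html(item.children)}</{tag}>"
    return "".join(to_html(child) for child in item)

def to_latex(item):
    """Same walk for LaTeX (no escaping of special characters here)."""
    if isinstance(item, str):
        return item
    if isinstance(item, Node):
        return "\\" + LATEX_COMMANDS[item.kind] + "{" + to_latex(item.children) + "}"
    return "".join(to_latex(child) for child in item)

# Counterpart of ((italic "intra") "word")
tree = [Node("italic", ("intra",)), "word"]
print(to_html(tree))
print(to_latex(tree))
```

Nothing in the tree says where word boundaries are, so intra-word emphasis needs no special handling once parsing is done.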
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 18:11 ` Tom Gillespie
2021-12-02 19:09 ` Juan Manuel Macías
2021-12-02 20:47 ` Org-syntax: Intra-word markup Denis Maier
@ 2021-12-03 14:53 ` Max Nikulin
2021-12-03 23:51 ` Tim Cross
3 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-03 14:53 UTC (permalink / raw)
To: emacs-orgmode
On 03/12/2021 01:11, Tom Gillespie wrote:
>
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.
Tom, I see and share your point; however, sometimes more specific and
convincing arguments are necessary.
Why does unconstrained markup ("//") not cause problems in asciidoc?
Maybe it does, but they are not immediately obvious. I do not know, since I
have never used asciidoc. Maybe the parser behaves differently from
org-element. Maybe plain text links are not allowed at all. Almost any
URL contains such pair of markers: https://orgmode.org/, so it should be
addressed somehow.
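The URL problem is easy to reproduce with a small Python sketch. The pattern below is a hypothetical, deliberately naive rule ("anything between two // pairs is italic"); it says nothing about how asciidoc itself handles plain links.

```python
import re

# Hypothetical unconstrained italic rule: anything between two "//" pairs.
UNCONSTRAINED_ITALIC = re.compile(r"//(.+?)//")

text = "See https://orgmode.org/ and https://list.orgmode.org/ for details."
match = UNCONSTRAINED_ITALIC.search(text)
false_positive = match.group(1) if match else None
print(repr(false_positive))
```

The text between the two URLs is picked up as "italic", so any such rule needs extra context checks or must recognize links before emphasis.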
Examples of corner cases that are used for tests should be more visible
to users otherwise it is hard to use such samples in discussions. They
should be annotated (arbitrary examples from recent discussions):
- input: [[https://first/-/url/][pre]] text [[https://second-url/?][post]]
parsed: (
(link :target "https://first/-/url/" :description "pre")
" text "
(link :target "https://second-url/?" :description "post"))
comment: "Regexp-based syntax highlighting falsely finds italic text
because slashes in URLs resemble the start and end of italics"
- input: A _b =c_ d= e_ f
parsed: (
"A "
(underline "b =c")
" d= e_ f")
comment: "Users of markdown may falsely expect that c_ is protected
by verbatim markers and underlined text is ended at e_"
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-03 14:24 ` Max Nikulin
@ 2021-12-03 15:01 ` Juan Manuel Macías
2021-12-04 15:57 ` Denis Maier
1 sibling, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-03 15:01 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Hi Maxim,
Max Nikulin writes:
> More explicit markup leaves less room for ambiguities, and I like the
> idea due to this reason. On the other hand it diverges from principle
> of lightweight markup. The almost only special character in TeX is
> "\", HTML has three ones "&<>" with simple escape rules. Org uses many
> special characters to avoid verbosity and requires some tricks to
> escape them. Markers like "\{" make Org more verbose but do not make
> it more strict, a lot of things still rely on heuristics.
Excellent explanation. Thanks for the clarification.
> I have an idea what can be done when some special markup is required
> that is not fit into current syntax. Unfortunately some new constructs
> should be introduced anyway: inline objects and multiline elements
> that represent simplified result of parsed Org structures:
>
> ((italic "intra") "word")
>
> wrapped with some markup. It should satisfy any special needs (and
> even should allow to create invalid impossible constructs). Maybe idea
> of combination of lightweight markup and low-level blocks better suits
> for some other project with more expressive internal representation.
> In Org it may become the most hated feature.
I really would like a solution in this direction. In LaTeX there is a
command called \protect (which has nothing to do with this topic and is
used for other things, but I like the 'protection' concept); we could
perhaps think of a type of mark to protect the 'usual' marks when syntax
consistency is compromised in some way by the context. Maybe something
like enclosing the normal marks between two double single quotes ''...''
---or a single set of single quotes before the leading marker--- as I
proposed in another thread:
#+begin_example
''*protected emphasis*''
#+end_example
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-02 18:11 ` Tom Gillespie
` (2 preceding siblings ...)
2021-12-03 14:53 ` Max Nikulin
@ 2021-12-03 23:51 ` Tim Cross
2021-12-04 15:01 ` Max Nikulin
2021-12-05 23:37 ` Russell Adams
3 siblings, 2 replies; 72+ messages in thread
From: Tim Cross @ 2021-12-03 23:51 UTC (permalink / raw)
To: emacs-orgmode
Tom Gillespie <tgbugs@gmail.com> writes:
> I don't mean to be a wet blanket, but the edge cases for
> the current markup syntax are already hard enough to
> implement correctly, to the point where different parts of
> Org mode are inconsistent. Intra-word markup isn't viable
> because there simply isn't any sane way to parse something
> like *hello world*/hrm/oh no*. The other issue is that this will
> degrade parsing performance because almost every
> character could precede the start of a markup section.
>
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.
>
> Any solution that suggests extending how =/*~+_ can be
> used gets a hard no from me. I could see teaching other
> exporters how to interpret \emph{hello}world, but trying for
> to have any sane behavior for something like
> why *hello*world oh no a wild askterisk*
> is not worth it.
>
+infinity!
Please, please can we stop trying to satisfy every edge case or extend
the markup to satisfy every possible scenario.
Org's big strength is in its simplicity. This comes at a price -
limitations in what can be done. If those limitations are unacceptable,
then use a richer markup format like LaTeX, XML, HTML etc.
The point about back end exporter support is very relevant. The 'richer'
the markup, the harder it is to get a consistent mapping for back end
exporters. Things quickly become more complex and difficult to maintain.
In 18 years, I've seen requests for inner word markup less than 4 times.
This is not a feature we should even be considering adding to the markup
syntax.
Org provides a light weight markup, not a fully flexible rich markup
designed to meet any need. It makes the easy stuff simple.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: emphasis and not English punctuation
2021-12-02 19:09 ` Juan Manuel Macías
@ 2021-12-04 13:07 ` Max Nikulin
2021-12-04 16:42 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-04 13:07 UTC (permalink / raw)
To: emacs-orgmode
On 03/12/2021 02:09, Juan Manuel Macías wrote:
>
> I believe, that emphasis marks are a part of Org that can be very
> shocking to new users. I mean, there is a series of behaviors that seem
> obvious and trivial in the emphasized text, but that in Org are not
> possible out of the box, unless you configure
> `org-emphasis-regexp-components'. Three quick examples. This in Org is
> not possible out of the box:
>
> #+begin_example
> [/emphasis/]
> ¡/emphasis/!
> ¿/Emphasis/?
> #+end_example
Maybe this issue should be considered independently of intra-word emphasis.
The second and third examples look like they should be supported. Ihor
mentioned treating punctuation in a more general way. It requires a rich
test set to estimate changes in heuristics. I suspect some problems,
since start and end patterns are not symmetric and I have not found a
way to specify in a regexp only the punctuation marks that normally appear
in front of words. Square brackets likely should be excluded somehow as
well, since they are part of Org syntax. I am unsure if it is possible to
use just a regexp without additional checks of candidates.
Ihor Radchenko. [PATCH] Re: c47b535bb origin/main org-element: Remove
dependency on ‘org-emphasis-regexp-components’
Sun, 21 Nov 2021 17:28:57 +0800.
https://list.orgmode.org/87v90lzwkm.fsf@localhost
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-03 23:51 ` Tim Cross
@ 2021-12-04 15:01 ` Max Nikulin
2021-12-05 23:34 ` Russell Adams
2021-12-05 23:37 ` Russell Adams
1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-04 15:01 UTC (permalink / raw)
To: emacs-orgmode
On 04/12/2021 06:51, Tim Cross wrote:
>
> Please, please can we stop trying to satisfy every edge case or extend
> the markup to satisfy every possible scenario.
>
> Org's big strength is in its simplicity. This comes at a price -
> limitations in what can be done. If those limitations are unacceptable,
> then use a richer markup format like Latex, XML, HTML etc.
It is ridiculous to throw away a nice tool and start to struggle with
another bunch of problems when a small missed feature is really required.
> The point about back end exporter support is very relevant.
Notice that this particular feature does not require extending the
underlying intermediate representation. There may be some subtle points,
but generally export backends are ready for intra-word markup.
> In 18 years, I've seen requests for inner word markup less than 4 times.
> this is not a feature we should even be considering adding to the markup
> syntax.
>
> Org provides a light weight markup, not a fully flexible rich markup
> designed to meet any need. It makes the easy stuff simple.
Different users wish to have different minor features. It would be great
to have a way to include a fragment with more verbose markup that allows
expressing special needs unsupported by lightweight markup. I am
discussing a more general solution, not a syntax extension specifically for
intra-word markup.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-03 14:24 ` Max Nikulin
2021-12-03 15:01 ` Juan Manuel Macías
@ 2021-12-04 15:57 ` Denis Maier
2021-12-04 17:53 ` Tom Gillespie
1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-04 15:57 UTC (permalink / raw)
To: Max Nikulin, emacs-orgmode
Am 03.12.2021 um 15:24 schrieb Max Nikulin:
> On 03/12/2021 02:03, Nicolas Goaziou wrote:
>>> Denis Maier writes:
>>>
>>>> As for suggestions: If just using /intra/word creates ambiguities, what
>>>> about the asciidoc solution? So //intra//word?
>>
>> I sympathize to the idea of intra-word emphasis, but the syntax above is
>> going to cause some ambiguous situations.
>
> I suppose, some more general solution is required.
>
>> I do think the marker + zero-width space is one way to go. We could, as
>> an improvement, consider zero-width spaces around emphasis markers to be
>> part of the markup, and replace them along during export.
>
> Zero-space characters adjacent to emphasis markers is a better idea than
> replacing any zero space. However I agree with Juan Manuel that white
> space characters, especially completely invisible (I am not Eli who sees
> such special characters by moving cursor through them) should not be
> overloaded. From my point of view, it is acceptable to use zero width
> spaces as a workaround but they should not become official part of Org
> syntax.
>
>> Another solution is to introduce a less-subtle, but less prone to
>> ambiguity, syntax, e.g.,
>>
>> /{bold}/markup or /|bold|/markup
>>
>> where /{ }/ or /| |/ become "extended" markers.
>
> More explicit markup leaves less room for ambiguities, and I like the
> idea due to this reason. On the other hand it diverges from principle of
> lightweight markup. The almost only special character in TeX is "\",
> HTML has three ones "&<>" with simple escape rules. Org uses many
> special characters to avoid verbosity and requires some tricks to escape
> them. Markers like "\{" make Org more verbose but do not make it more
> strict, a lot of things still rely on heuristics.
>
> I have an idea what can be done when some special markup is required
> that is not fit into current syntax. Unfortunately some new constructs
> should be introduced anyway: inline objects and multiline elements that
> represent simplified result of parsed Org structures:
>
> ((italic "intra") "word")
>
> wrapped with some markup. It should satisfy any special needs (and even
> should allow to create invalid impossible constructs). Maybe idea of
> combination of lightweight markup and low-level blocks better suits for
> some other project with more expressive internal representation. In Org
> it may become the most hated feature.
I have to admit I like this idea. That brings a lot of flexibility to
accommodate even the most obscure needs, yet it makes the discussion
about escape characters or new symbols much less pressing. After all,
most markup languages face the same problem, i.e., special characters
are limited, and beyond the usual /*_ the meaning of characters becomes
much less obvious.
This idea reminds me a bit of Scribble/Racket where every document is
just inverted code, which makes it possible to insert arbitrary Racket
code in your prose...
Denis
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: emphasis and not English punctuation
2021-12-04 13:07 ` Org-syntax: emphasis and not English punctuation Max Nikulin
@ 2021-12-04 16:42 ` Juan Manuel Macías
0 siblings, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-04 16:42 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Max Nikulin writes:
> Maybe this issue should be considered independently of itra-word emphasis.
Yes I agree. Apologies for mixing up this topic in the discussion about
intra-word emphasis...
> Second and third examples looks like they should be supported. Ihor
> mentioned treating punctuation in a more general way. It requires rich
> test set to estimate changes in heuristics. I suspect some problems
> since start and end patterns are not symmetric and I have not found a
> way to specify in regexp only punctuation marks that normally appears
> in front of words. Square brackets likely should be excluded somehow
> as well since they are part of Org syntax. I am unsure if it is
> possible to use just regexp without additional checks of candidates.
Ihor's idea seems interesting to me, although I understand the possible
problems you mention. By the way, I'm afraid that initial inverted
punctuation marks (¡¿) are only used in Castilian Spanish and other languages
of Spain, such as Galician or Asturian, due to the Castilian influence
(we go backwards from the rest of the world ;-):
https://en.wikipedia.org/wiki/Inverted_question_and_exclamation_marks
> Ihor Radchenko. [PATCH] Re: c47b535bb origin/main org-element: Remove
> dependency on ‘org-emphasis-regexp-components’
> Sun, 21 Nov 2021 17:28:57 +0800.
> https://list.orgmode.org/87v90lzwkm.fsf@localhost
I see. I believe it's a sensible decision to get rid of the dependency
on org-emphasis-regexp-components. I understand that now everything
related to the structure of emphases is the competence of org-element?
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-04 15:57 ` Denis Maier
@ 2021-12-04 17:53 ` Tom Gillespie
2021-12-04 18:37 ` John Kitchin
` (2 more replies)
0 siblings, 3 replies; 72+ messages in thread
From: Tom Gillespie @ 2021-12-04 17:53 UTC (permalink / raw)
To: emacs-orgmode
Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, Denis Maier
Hi all,
After a bunch of rambling (see below if interested), I think I have
a solution that should work for everyone. The key realization is that
what we really want is the ability to have a "parse me separately"
type of syntax. This meets the intra-word syntax needs and might
meet some other needs as well.
The solution is to make @@org:...@@ a "parse me separately"
block! It nearly works that way already too! To minimize typing
we could have @@:...@@ the empty type default to org.
This seems like a winner to me. The syntax for it already exists
and won't conflict. It requires relatively minimal additional typing,
the implication is clear, and there are other places where such
behavior could be useful.
For example:
@@org:/hello/@@world
@@:/hello/@@world
You can also do things like
#+begin_src org
I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
#+end_src
Which would render to
#+begin_src org
I want a number in this number3word!
#+end_src
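A minimal Python sketch of the "parse me separately" pre-pass, assuming the proposed @@org:...@@ and @@:...@@ forms: it only splits a line into plain runs and snippets that would each be handed to the Org parser on their own. The function name and the ("text" / "org") labels are made up for this illustration, and nesting or multi-line snippets are not handled.

```python
import re

# "@@org:...@@" or the proposed shorthand "@@:...@@"
SNIPPET = re.compile(r"@@(?:org)?:(.*?)@@")

def split_snippets(text):
    """Return (kind, content) pieces: 'text' for plain runs, 'org' for snippets."""
    pieces, pos = [], 0
    for m in SNIPPET.finditer(text):
        if m.start() > pos:
            pieces.append(("text", text[pos:m.start()]))
        pieces.append(("org", m.group(1)))
        pos = m.end()
    if pos < len(text):
        pieces.append(("text", text[pos:]))
    return pieces

print(split_snippets("@@:/hello/@@world"))
print(split_snippets("number@@org:src_elisp{(+ 1 2)}@@word!"))
```

Each "org" piece can then be parsed as if it stood alone, so the usual rules about what may precede or follow a marker never see the neighbouring text.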
Thoughts?
Best!
Tom
--------------- rambling below -------------
> This idea reminds me a bit of Scribble/Racket where every document is
> just inverted code, which makes it possible to insert arbitrary Racket
> code in your prose...
I will say, despite some of my comments elsewhere, that I think
exploring certain features of Scribble syntax for use in Org mode
would simplify certain parts of the syntax immensely.
For example
various inline blocks are an absolute pain to parse because they
allow nested delimiters /if they are matched/. The implementation
of the /if they are matched/ clause is currently a nasty hack which
generates a regular expression that can only actually handle nesting
to depth 3. Actually implementing the recursive grammar adds a lot
of complexity to the syntax and is hard to get right.
It would be vastly simpler to use Scribble's |<{hello }} world}>|
style syntax and always terminate at the first matching delimiter.
I'm sure that this would break some Org files, but it would make
dealing with latex fragments and inline source blocks and inline
footnotes SO much simpler. Matching an arbitrary number of
angle brackets does add some complexity, but it is tiny compared
to the complexity of enforcing matched parens and their failure cases
especially because many of the places where nesting is required
probably only see use of the nesting feature in a tiny fraction of
all cases.
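The two strategies can be contrasted in a short Python sketch. Both functions are hypothetical helpers for illustration: the first tracks nesting depth, as a recursive grammar would require, and the second simply stops at the first closing delimiter, in the spirit of the |<{ ... }>| form mentioned above.

```python
def find_balanced_close(text, start):
    """Index of the '}' matching the '{' at `start`, honouring nesting; -1 if unbalanced."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1

def find_first_close(text, start, closer="}>|"):
    """Alternative: index of the first occurrence of the closing delimiter, or -1."""
    return text.find(closer, start)

nested = "x{a{b}c}d"
print(find_balanced_close(nested, 1))

wrapped = "|<{hello }} world}>| rest"
end = find_first_close(wrapped, 3)
print(repr(wrapped[3:end]))
```

The second version needs no depth bookkeeping and has no "unbalanced" failure case for stray braces inside the body, which is the simplification being argued for.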
One other reason why this is attractive is that all the instances
where nested delimiters can appear on a line are preceded by
some non-whitespace character. This means that using the
pipe syntax does not conflict with table syntax!
Now the question comes. If we could implement this for
delimiters, could we also implement something similar
for markup? The issue with the proposed markup-outside,
delimiter-inside approach is that it will change existing
behavior for files that want the delimiters to be included
in the markup, i.e. /{oops}/ becoming /oops/ is bad. A
second issue is that putting the delimiter inside the markup
cannot work for verbatim and code ={oops}= is ={oops}= no
matter what. Therefore the solution is not uniform across all
types of markup. We need another solution that works for
all types of markup.
What if we put the "start arbitrary markup" char outside
the markup? Say something like |/ital/|icks? Or what if
we went whole hog and used |{/ital/}|ics and made the
|{...}| syntax trigger a generalized feature where the
contents of the |{...}| block are parsed by themselves
and can abut any other text? This would be generally
useful in a variety of situations beyond just intra-word
markup.
What are the issues with this approach? The first issue
is that there is a conflict with table syntax if we were to
use the pipe character because markup can appear at
the start of a line. The second issue is that it might be
confusing for users if |{}| also worked like {} when in the
context of latex elements or inline src blocks, or maybe
that is ok because |{}| never renders as text. Hrm. Ok.
Second issue resolved, but what to do about the first?
If we want generalized "parse this by itself" syntax so
that we can write hello|{/world/}|ok, then we need a
solution that can appear at the start of a line. So we
can't use pipe because that is always a table line even
if a zero width space is put before it ;). What other
options do we have? How about #+|{/hello/}|world for
the start of a line? As long as there is no trailing colon
it isn't a keyword, so it could work ... except that if
someone reflows the text and it is no longer at the
start of a line then the syntax breaks. That is to say
using #+| at the start of a line is not uniform, so we
can't take that approach.
What other chars do we have at our disposal? Hrm.
How about @@? Could we use that? What happens
if we use @@org:/hello/@@world? Or maybe if we
want to minimize the number of chars we could do
@@:/hello/@@world and have the empty prefix in
@@ blocks mean org?
* Re: Org-syntax: Intra-word markup
2021-12-04 17:53 ` Tom Gillespie
@ 2021-12-04 18:37 ` John Kitchin
2021-12-04 21:16 ` Juan Manuel Macías
2021-12-06 10:57 ` Raw Org AST snippets for "impossible" markup Max Nikulin
2021-12-04 19:04 ` Org-syntax: Intra-word markup Timothy
2021-12-06 11:01 ` Denis Maier
2 siblings, 2 replies; 72+ messages in thread
From: John Kitchin @ 2021-12-04 18:37 UTC (permalink / raw)
To: Tom Gillespie
Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, emacs-orgmode,
Denis Maier
Along these lines (and combining the s-exp suggestion from Max), you can
achieve something like this with links.
This is lightly tested, and I am not thrilled with the eval for exporting,
but I couldn't get a macro to work on the export function to avoid it, and
this is just a proof of concept idea. This might only be suitable for
individual solutions, since you have to define this markup yourself.
#+BEGIN_SRC emacs-lisp :results silent
(defun italic (s)
  ;; `backend' is not an argument here: it is picked up from the caller,
  ;; `@@-export', which requires dynamic binding.
  (pcase backend
    ('latex (format "{\\textit{%s}}" s))
    ('html (format "<i>%s</i>" s))
    (_ s)))

(defun @@-export (path desc backend)
  (eval `(concat ,@(read path))))

(org-link-set-parameters
 "@@"
 :export #'@@-export)
#+END_SRC
In org, it would look like Here is a [[@@:((italic "part") "ial")]] markup.
And in exports this is what this implementation does.
#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'latex t)
#+END_SRC
#+RESULTS:
: Here is a {\textit{part}}ial markup.
#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'html t)
#+END_SRC
#+RESULTS:
: <p>
: Here is a <i>part</i>ial markup.</p>
#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'ascii t)
#+END_SRC
#+RESULTS:
: Here is a partial markup.
Of course, you are free to do what you want with the path, including parse
it yourself to generate the output, and since it is a link, you could do
all kinds of things to make it look the way you want with faces, overlays,
etc.
John
-----------------------------------
Professor John Kitchin (he/him/his)
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu
On Sat, Dec 4, 2021 at 12:54 PM Tom Gillespie <tgbugs@gmail.com> wrote:
> Hi all,
> After a bunch of rambling (see below if interested), I think I have
> a solution that should work for everyone. The key realization is that
> what we really want is the ability to have a "parse me separately"
> type of syntax. This meets the intra-word syntax needs and might
> meet some other needs as well.
>
> The solution is to make @@org:...@@ "parse me separately"
> block! It nearly works that way already too! To minimize typing
> we could have @@:...@@ the empty type default to org.
>
> This seems like a winner to me. The syntax for it already exists
> and won't conflict. It requires relatively minimal additional typing
> the implication is clear, and there are other places where such
> behavior could be useful.
>
> This syntax seems like a winner to me
> @@org:/hello/@@world
> @@:/hello/@@world
>
> You can also do things like
> #+begin_src org
> I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
> #+end_src
>
> Which would render to
> #+begin_src org
> I want a number in this number3word!
> #+end_src
>
> Thoughts?
>
> Best!
> Tom
>
> --------------- rambling below -------------
>
>
> > This idea reminds me a bit of Scribble/Racket where every document is
> > just inverted code, which makes it possible to insert arbitrary Racket
> > code in your prose...
>
> I will say, despite some of my comments elsewhere, that I think
> exploring certain features of Scribble syntax for use in Org mode
> would simplify certain parts of the syntax immensely.
>
> For example
> various inline blocks are an absolute pain to parse because they
> allow nested delimiters /if they are matched/. The implementation
> of the /if they are matched/ clause is currently a nasty hack which
> generates a regular expression that can only actually handle nesting
> to depth 3. Actually implementing the recursive grammar adds a lot
> of complexity to the syntax and is hard to get right.
>
> It would be vastly simpler to use Scribble's |<{hello }} world}>|
> style syntax and always terminate at the first matching delimiter.
> I'm sure that this would break some Org files, but it would make
> dealing with latex fragments and inline source blocks and inline
> footnotes SO much simpler. Matching an arbitrary number of
> angle brackets does add some complexity, but it is tiny compared
> to the complexity of enforcing matched parens and their failure cases
> especially because many of the places where nesting is required
> probably only see use of the nesting feature in a tiny fraction of
> all cases.
>
> One other reason why this is attractive is that all the instances
> where nested delimiters can appear on a line are preceded by
> some non-whitespace character. This means that using the
> pipe syntax does not conflict with table syntax!
>
> Now the question comes. If we could implement this for
> delimiters, could we also implement something similar
> for markup? The issue with the proposed markup outside
> delimiter inside approach is that it will change existing
> behavior for files that want the delimiters to be included
> in the markup, i.e. /{oops}/ becoming /oops/ is bad. A
> second issue is that putting the delimiter inside the markup
> cannot work for verbatim and code ={oops}= is ={oops}= no
> matter what. Therefore the solution is not uniform across all
> types of markup. We need another solution that works for
> all types of markup.
>
> What if we put the "start arbitrary markup" char outside
> the markup? Say something like |/ital/|ics? Or what if
> we went whole hog and used |{/ital/}|ics and made the
> |{...}| syntax trigger a generalized feature where the
> contents of the |{...}| block are parsed by themselves
> and can abut any other text? This would be generally
> useful in a variety of situations beyond just intra-word
> markup.
>
> What are the issues with this approach? The first issue
> is that there is a conflict with table syntax if we were to
> use the pipe character because markup can appear at
> the start of a line. The second issue is that it might be
> confusing for users if |{}| also worked like {} when in the
> context of latex elements or inline src blocks, or maybe
> that is ok because |{}| never renders as text. Hrm. Ok.
> Second issue resolved, but what to do about the first?
>
> If we want generalized "parse this by itself" syntax so
> that we can write hello|{/world/}|ok, then we need a
> solution that can appear at the start of a line. So we
> can't use pipe because that is always a table line even
> if a zero width space is put before it ;). What other
> options do we have? How about #+|{/hello/}|world for
> the start of a line? As long as there is no trailing colon
> it isn't a keyword, so it could work ... except that if
> someone reflows the text and it is no longer at the
> start of a line then the syntax breaks. That is to say
> using #+| at the start of a line is not uniform, so we
> can't take that approach.
>
> What other chars do we have at our disposal? Hrm.
> How about @@? Could we use that? What happens
> if we use @@org:/hello/@@world? Or maybe if we
> want to minimize the number of chars we could do
> @@:/hello/@@world and have the empty prefix in
> @@ blocks mean org?
>
>
* Re: Org-syntax: Intra-word markup
2021-12-04 17:53 ` Tom Gillespie
2021-12-04 18:37 ` John Kitchin
@ 2021-12-04 19:04 ` Timothy
2021-12-04 21:48 ` Tom Gillespie
2021-12-06 11:01 ` Denis Maier
2 siblings, 1 reply; 72+ messages in thread
From: Timothy @ 2021-12-04 19:04 UTC (permalink / raw)
To: Tom Gillespie
Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, emacs-orgmode,
Denis Maier
Hi Tom,
> After a bunch of rambling (see below if interested), I think I have
> a solution that should work for everyone. The key realization is that
> what we really want is the ability to have a “parse me separately”
> type of syntax. This meets the intra-word syntax needs and might
> meet some other needs as well.
>
> The solution is to make @@org:…@@ “parse me separately”
> block! It nearly works that way already too! To minimize typing
> we could have @@:…@@ the empty type default to org.
>
> Thoughts?
This isn’t quite as succinct as the ascii-doc inspired suggestions, but it’s
barely an extension on the current syntax — I like it!
Since org is a valid export backend though, perhaps this behaviour should be
reserved for @@:…@@, i.e. no export backend, which I think semantically fits
fairly nicely.
All the best,
Timothy
* Re: Org-syntax: Intra-word markup
2021-12-04 18:37 ` John Kitchin
@ 2021-12-04 21:16 ` Juan Manuel Macías
2021-12-06 10:57 ` Raw Org AST snippets for "impossible" markup Max Nikulin
1 sibling, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-04 21:16 UTC (permalink / raw)
To: John Kitchin; +Cc: orgmode
Hi John,
John Kitchin writes:
> Along these lines (and combining the s-exp suggestion from Max) , you
> can achieve something like this with links.
I like this idea of merging Maxim's proposal with the power of links.
In any case, this and the other workarounds offered here make it clear that
Org does not lack good and useful resources. I usually use macros
(taking advantage of the fact that macros are expanded early). For example
(in this case only with the LaTeX backend):
#+MACRO: emph (eval (when (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\emph{@@" $1 "@@latex:}@@")))
With the macro defined this way, I can also nest emphasis in both ways,
with Org markup or with the macro itself:
#+begin_src example
{{{emph(lorem *ipsum* /dolor/ {{{emph(sit)}}} amet)}}}
#+end_src
==> \emph{lorem \textbf{ipsum} \emph{dolor} \emph{sit} amet}
Best regards,
Juan Manuel
* Re: Org-syntax: Intra-word markup
2021-12-04 19:04 ` Org-syntax: Intra-word markup Timothy
@ 2021-12-04 21:48 ` Tom Gillespie
2021-12-06 10:59 ` Max Nikulin
2022-01-28 14:52 ` Max Nikulin
0 siblings, 2 replies; 72+ messages in thread
From: Tom Gillespie @ 2021-12-04 21:48 UTC (permalink / raw)
To: Timothy; +Cc: emacs-orgmode
> Since org is a valid export backend though, perhaps this behaviour should be
> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
> fairly nicely.
This ends up being even more convenient than I initially realized.
The current spec for export snippets is ambiguous when it says
"NAME can contain any alpha-numeric character and hyphens"
but the implementation behavior requires that "any" means "at
least one" and is implemented using the + regex operator.
What this means is that @@:...@@ syntax is not actually used
in Org at all at the moment and renders as plain text. I agree that
we need to avoid @@org:...@@ because it has legitimate uses.
Making the empty string a valid back-end for "parse separately"
syntax thus makes @@ syntax more regular overall, and allows
@@:...@@ to be processed separately because it currently
never enters the export snippet processing.
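The difference between "any" and "at least one" can be shown with two regular expressions. This is only a sketch in Python that models the opening of an export snippet as described above; the pattern names and the helper snippet_backend are made up for the example, and the real implementation lives in Emacs Lisp.

```python
import re

# Assumed model of the snippet opener: "@@", a back-end NAME, then ":".
NAME_PLUS = re.compile(r"@@([A-Za-z0-9-]+):")  # "at least one" character
NAME_STAR = re.compile(r"@@([A-Za-z0-9-]*):")  # hypothetical: "" is allowed


def snippet_backend(text, pattern):
    """Return the back-end name if text opens a snippet, else None."""
    m = pattern.match(text)
    return m.group(1) if m else None
```

With the "+" form, "@@:/hello/@@world" is not recognised as a snippet opener at all, which is why the empty name is free to be given a new meaning.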
This is important because export snippets do not seem to be easily
accessible to earlier phases of the org-export machinery, i.e. there
isn't a nice centralized place to preprocess @@org:...@@ even
if we wanted to. On the other hand @@:...@@ isn't processed
at all. I could be missing something in the org export code though.
It will take a bit of work to get this behavior implemented I think,
but it doesn't seem to have any conflicts. Some users may have
set the empty backend to expand manually via
org-export-snippet-translation-alist, but as long as we give
org-export-snippet-translation-alist priority and warn people
that setting "" manually will disable the new functionality
then there shouldn't be any disruption. The behavior also sort
of matches what we would want the empty string to be in this
case, which is "all backends" and of course the only markup
that makes sense for "all backends" is org itself!
Best,
Tom
* Re: Org-syntax: Intra-word markup
2021-12-04 15:01 ` Max Nikulin
@ 2021-12-05 23:34 ` Russell Adams
0 siblings, 0 replies; 72+ messages in thread
From: Russell Adams @ 2021-12-05 23:34 UTC (permalink / raw)
To: emacs-orgmode
On Sat, Dec 04, 2021 at 10:01:15PM +0700, Max Nikulin wrote:
> On 04/12/2021 06:51, Tim Cross wrote:
> >
> > Please, please can we stop trying to satisfy every edge case or extend
> > the markup to satisfy every possible scenario.
>
> It is ridiculous to throw away a nice tool and start to struggle with
> another bunch of problems when a small missed feature is really required.
I think this is a problem of expectations. I don't expect Org to
export perfect documents in every language. I expect Org to make a
simple subset of features available consistently.
With HTML or LaTeX you can create those words, and you can insert that
code into your Org document. Why does the Org syntax need to be
further extended to support this?
Part of the reason Org is a nice tool is that it is simple, and we
should be cautious trying to make it any more complex.
------------------------------------------------------------------
Russell Adams RLAdams@AdamsInfoServ.com
PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/
Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3
* Re: Org-syntax: Intra-word markup
2021-12-03 23:51 ` Tim Cross
2021-12-04 15:01 ` Max Nikulin
@ 2021-12-05 23:37 ` Russell Adams
2021-12-06 1:39 ` Samuel Wales
1 sibling, 1 reply; 72+ messages in thread
From: Russell Adams @ 2021-12-05 23:37 UTC (permalink / raw)
To: emacs-orgmode
On Sat, Dec 04, 2021 at 10:51:47AM +1100, Tim Cross wrote:
>
> Tom Gillespie <tgbugs@gmail.com> writes:
>
> > I don't mean to be a wet blanket...
I'd like to be a wet blanket.
> +infinity!
>
> Please, please can we stop trying to satisfy every edge case or extend
> the markup to satisfy every possible scenario.
+infinity^2
I've often thought Org needs to hit the brakes and stop adding
features, or cut out features that have a high support/maintenance
cost. We need to respect our maintainers' time.
------------------------------------------------------------------
Russell Adams RLAdams@AdamsInfoServ.com
PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/
Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3
* Re: Org-syntax: Intra-word markup
2021-12-05 23:37 ` Russell Adams
@ 2021-12-06 1:39 ` Samuel Wales
0 siblings, 0 replies; 72+ messages in thread
From: Samuel Wales @ 2021-12-06 1:39 UTC (permalink / raw)
To: emacs-orgmode
i think i can't add much useful to these threads, i agree with the
simplicity, but, a nuance, want for org to have had a bit more
consistency growing up. e.g. quoting/escaping, demarcation, and
applicability of features in different contexts.
sort of a "mentally factored user interface" where the user's
expectation is pretty straightforwardly met. e.g. works here so
should also work there. or, there is only one rule for doing this.
that kind of thing. orthogonality also. few exceptions.
it is understandable in context that inconsistencies exist, and that
might apply to various maintenance-over-heavy things users want.
if we are to remove features as suggested below, then i suggest, where
possible, consistency be a desideratum for final result.
On 12/5/21, Russell Adams <RLAdams@adamsinfoserv.com> wrote:
> On Sat, Dec 04, 2021 at 10:51:47AM +1100, Tim Cross wrote:
>>
>> Tom Gillespie <tgbugs@gmail.com> writes:
>>
>> > I don't mean to be a wet blanket...
>
> I'd like to be a wet blanket.
>
>> +infinity!
>>
>> Please, please can we stop trying to satisfy every edge case or extend
>> the markup to satisfy every possible scenario.
>
> +infinity^2
>
> I've often thought Org needs to hit the brakes and stop adding
> features, or cut out features that have a high support/maintenance
> cost. We need to respect our maintainers' time.
>
> ------------------------------------------------------------------
> Russell Adams RLAdams@AdamsInfoServ.com
>
> PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/
>
> Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3
>
>
--
The Kafka Pandemic
A blog about science, health, human rights, and misopathy:
https://thekafkapandemic.blogspot.com
* Raw Org AST snippets for "impossible" markup
2021-12-04 18:37 ` John Kitchin
2021-12-04 21:16 ` Juan Manuel Macías
@ 2021-12-06 10:57 ` Max Nikulin
2021-12-06 15:45 ` Juan Manuel Macías
1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-06 10:57 UTC (permalink / raw)
To: emacs-orgmode
On 05/12/2021 01:37, John Kitchin wrote:
> Along these lines (and combining the s-exp suggestion from Max) , you
> can achieve something like this with links.
>
> #+BEGIN_SRC emacs-lisp :results silent
> (defun italic (s)
> (pcase backend ;; lexical
> ('latex (format "{\\textit{%s}}" s))
> ('html (format "<i>%s</i>" s))
> (_ s)))
>
> (defun @@-export (path desc backend)
> (eval `(concat ,@(read path))))
>
> (org-link-set-parameters
> "@@"
> :export #'@@-export)
> #+END_SRC
John, thank you for reminding me of Juan Manuel's idea that
everything missing from Org may be polyfilled by (ab)using links.
That is enough for a proof of concept; special markers may be introduced
later. After some time spent exercising in monkey-typing,
I have some code that illustrates my idea.
So the goal is to reduce the demand for extending the current syntax.
While simple cases should be easy,
special cases should not be impossible.
- Raw AST snippets should be processed without ~eval~ to give
other tools such as =pandoc= a chance to support the feature.
If you desperately need ~eval~ then you can use source blocks.
- The idea is to use existing backends by passing structures
similar to the ones generated by the ~org-element~ parser.
- I would prefer to avoid "@@" as the link prefix since such sequences
are already a part of Org syntax. In the following example the
export snippet is prematurely terminated by such a link:
#+begin_src elisp :results pp
(org-element-parse-secondary-string
"@@latex:[[@@:(italics \"i\")]]@@"
(org-element-restriction 'paragraph))
#+end_src
#+RESULTS:
: ((export-snippet
: (:back-end "latex" :value "[[" :begin 1 :end 13 :post-blank 0 :parent #0))
: #(":(italics \"i\")]]@@" 0 18
: (:parent #0)))
Let's take a link prefix that makes it clear that the proposal
is a draft; a sane variant will be chosen later, once agreement
on the details of such a feature is reached. Until then
it is named "orgia".
#+begin_src elisp :results silent
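(defun orgia-export (path desc backend)
  ;; A path that does not start with "(" is not an AST snippet: return it as is.
  (if (not (eq ?\( (aref path 0)))
      path
    ;; Read the path as data (no `eval') and let BACKEND transcode the tree.
    (let ((tree (read path))
          (info (org-export-get-environment backend nil nil)))
      (org-no-properties
       (org-export-data-with-backend tree backend info)))))

(org-link-set-parameters
 "orgia"
 :export #'orgia-export)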
#+end_src
Either [[orgia:("inter" (bold () "word"))]]
or <orgia:((italic () "inter") "word")>
links may be used. Certainly plain text may be outside:
#+begin_src elisp
(org-export-string-as "A <orgia:(italic () \"inter\")>word" 'html t)
#+end_src
#+RESULTS:
: <p>
: A <i>inter</i>word</p>
- Error handling is required.
- Elements (blocks) should be considered an error
in object (inline) context.
- The passed tree should be preprocessed to glue together strings that were
split to avoid being interpreted as terminating an outer construct or the
link itself (=]]= and =][= should be written ="]" "]"= and ="]" "["= inside
bracket links). This is especially important for property values.
- For convenience, a =parse= element may be added to parse a string
according to Org markup.
- There should be a similar element (block-level markup structure).
- Symbols and structures used by ~org-element~ become part of the
public API, but they already are, since they are used
by export backends.
- ~org-cite~ is likely to be a problem.
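For readers outside Emacs, the "data only, no eval" point can be sketched in Python. Everything here is an assumption made for illustration: the tuple shape (type, props, children...), the WRAPPERS table, and the function export_snippet are hypothetical and are not part of ox.el or of pandoc.

```python
# Start/end wrappers per back-end; an unknown back-end falls back to plain text.
WRAPPERS = {
    "html": {"italic": ("<i>", "</i>"), "bold": ("<b>", "</b>")},
    "latex": {"italic": ("\\emph{", "}"), "bold": ("\\textbf{", "}")},
}
KNOWN_OBJECTS = {"italic", "bold"}


def export_snippet(node, backend):
    """Transcode a data-only tree: str, list of nodes, or (type, props, *children)."""
    if isinstance(node, str):
        return node  # plain text; a real exporter would also escape it
    if isinstance(node, list):
        return "".join(export_snippet(child, backend) for child in node)
    if (isinstance(node, tuple) and len(node) >= 2
            and isinstance(node[0], str) and node[0] in KNOWN_OBJECTS):
        kind, _props, *children = node
        body = "".join(export_snippet(child, backend) for child in children)
        start, end = WRAPPERS.get(backend, {}).get(kind, ("", ""))
        return start + body + end
    # Block-level or unknown types are rejected instead of being evaluated.
    raise ValueError(f"unsupported node in inline context: {node!r}")
```

Because the tree is walked as plain data, a tool that cannot run Emacs Lisp could still support it, and anything outside the known object types is an error rather than code to run.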
* Re: Org-syntax: Intra-word markup
2021-12-04 21:48 ` Tom Gillespie
@ 2021-12-06 10:59 ` Max Nikulin
2022-01-28 14:52 ` Max Nikulin
1 sibling, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-06 10:59 UTC (permalink / raw)
To: emacs-orgmode
On 05/12/2021 04:48, Tom Gillespie wrote:
>> Since org is a valid export backend though, perhaps this behaviour should be
>> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
>> fairly nicely.
>
> This ends up being even more convenient than I initially realized.
It is a bright idea. The only drawback I see is that it is impossible to
put a new "@@:@@" fragment inside an export snippet, as in "@@latex:some
@@:special@@thing@@", or vice versa.
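The nesting problem follows from the snippet ending at the first "@@" after its opener. A small Python sketch of that behaviour, under the assumption that "@@NAME:" opens a snippet and the next "@@" closes it (the pattern and the helper first_snippet are illustrative, not Org code):

```python
import re

# Assumed model: "@@NAME:" opens a snippet and the next "@@" closes it.
SNIPPET = re.compile(r"@@([A-Za-z0-9-]+):(.*?)@@", re.DOTALL)


def first_snippet(text):
    """Return (backend, value, rest) for a snippet at the start of text, else None."""
    m = SNIPPET.match(text)
    if m is None:
        return None
    return m.group(1), m.group(2), text[m.end():]
```

Applied to the example above, the value is cut off right before the inner fragment, and the remainder is left over as ordinary text.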
* Re: Org-syntax: Intra-word markup
2021-12-04 17:53 ` Tom Gillespie
2021-12-04 18:37 ` John Kitchin
2021-12-04 19:04 ` Org-syntax: Intra-word markup Timothy
@ 2021-12-06 11:01 ` Denis Maier
2 siblings, 0 replies; 72+ messages in thread
From: Denis Maier @ 2021-12-06 11:01 UTC (permalink / raw)
To: Tom Gillespie, emacs-orgmode
Cc: Juan Manuel Macías, Max Nikulin, Tim Cross
Hi Tom
Am 04.12.2021 um 18:53 schrieb Tom Gillespie:
> Hi all,
> After a bunch of rambling (see below if interested), I think I have
> a solution that should work for everyone. The key realization is that
> what we really want is the ability to have a "parse me separately"
> type of syntax. This meets the intra-word syntax needs and might
> meet some other needs as well.
>
> The solution is to make @@org:...@@ "parse me separately"
> block! It nearly works that way already too! To minimize typing
> we could have @@:...@@ the empty type default to org.
>
> This seems like a winner to me. The syntax for it already exists
> and won't conflict. It requires relatively minimal additional typing
> the implication is clear, and there are other places where such
> behavior could be useful.
>
> This syntax seems like a winner to me
> @@org:/hello/@@world
> @@:/hello/@@world
>
> You can also do things like
> #+begin_src org
> I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
> #+end_src
>
> Which would render to
> #+begin_src org
> I want a number in this number3word!
> #+end_src
>
> Thoughts?
>
> Best!
> Tom
>
Thanks for the suggestion. I think that sounds like a good idea. Of
course not as terse as the asciidoc inspired suggestion, but entirely
appropriate for a case like this one! I also like that there might be
other cases where this might be handy.
Best,
Denis
* Re: Raw Org AST snippets for "impossible" markup
2021-12-06 10:57 ` Raw Org AST snippets for "impossible" markup Max Nikulin
@ 2021-12-06 15:45 ` Juan Manuel Macías
2021-12-06 16:56 ` Juan Manuel Macías
2021-12-08 13:09 ` Max Nikulin
0 siblings, 2 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-06 15:45 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Max Nikulin writes:
> John, thank you for the reminding me of Juan Manuel's idea that
> everything missed in Org may be polyfilled (ab)using links.
> It is enough for proof of concept, special markers may be introduced
> later. After some time spent exercising in monkey-typing,
> I have got some code that illustrates my idea.
>
> So the goal is to mitigate demand to extend current syntax.
> While simple cases should be easy,
> special cases should not be impossible.
>
> - Raw AST snippets should be processed without ~eval~ to give
> other tools such as =pandoc= a chance to support the feature.
> If you desperately need ~eval~ then you can use source blocks.
> - The idea is to use existing backends by passing structures
> similar to ones generated by ~org-element~ parser.
> - I would prefer to avoid "@@" for link prefix since such sequences
> are already a part of Org syntax. In the following example
> export snippet is preliminary terminated by such link:
>
> #+begin_src elisp :results pp
> (org-element-parse-secondary-string
> "@@latex:[[@@:(italics \"i\")]]@@"
> (org-element-restriction 'paragraph))
> #+end_src
>
>
> #+RESULTS:
> : ((export-snippet
> : (:back-end "latex" :value "[[" :begin 1 :end 13 :post-blank 0
> :parent #0))
> : #(":(italics \"i\")]]@@" 0 18
> : (:parent #0)))
>
> Let's take some link prefix that makes it clear that the proposal
> is a draft and a sane variant will be chosen later when agreement
> concerning details of such feature is achieved. Till that moment
> it is named "orgia".
>
> #+begin_src elisp :results silent
> (defun orgia-export (path desc backend)
> (if (not (eq ?\( (aref path 0)))
> path
> (let ((tree (read path))
> (info (org-export-get-environment backend nil nil)))
> (org-no-properties
> (org-export-data-with-backend tree backend info)))))
>
> (org-link-set-parameters
> "orgia"
> :export #'orgia-export)
> #+end_src
>
>
> Either [[orgia:("inter" (bold () "word"))]]
> or <orgia:((italic () "inter") "word")>
> links may be used. Certainly plain text may be outside:
>
> #+begin_src elisp
> (org-export-string-as "A <orgia:(italic () \"inter\")>word" 'html t)
> #+end_src
>
> #+RESULTS:
> : <p>
> : A <i>inter</i>word</p>
>
> - Error handling is required.
> - Elements (blocks) should be considered as an error
> in object (inline) context.
> - Passed tree should be preprocessed to glue strings split to
> avoid interpreting them as terminating outer construct or link itself
> (=]]= =][= should be ="]" "]"= ="]" "["= inside bracket links).
> It is especially important for property values.
> - For convenience =parse= element may be added to parse a string
> accordingly to Org markup.
> - There should be a similar element (block-level markup structure).
> - Symbols and structures used by ~org-element~ becomes a part of
> public API, but they are already are since they are used
> by export backends.
> - ~org-cite~ is likely will be a problem.
Hi Maxim,
I understand that with this method the emphases could be nested, which
also seems very productive. I like it.
I would suggest, however, not using the term 'italics', since it is a
'typographic' term, but rather a term that is agnostic of format and
typography, something like 'emphasis' or 'emph'. For example, in a
format-agnostic environment like Org, which is concerned only with
structure, an emphasis is always an emphasis. But in a typographic
environment that emphasis may or may not be in italics. That is why
in LaTeX you can write constructions like:
#+begin_src latex
\emph{The Making Off of \emph{Star Wars}}
#+end_src
In this context 'Star Wars' would appear in an upright font. Naturally,
these things are only possible in LaTeX, but it's nice to keep Org
typographically agnostic.
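The toggling behaviour of nested \emph can be imitated for other targets. The following is only a sketch in Python under stated assumptions: the tuple form ("emph", child, ...) and the function render_emph are hypothetical, inline CSS is used purely for illustration, and no Org back-end is claimed to work this way.

```python
def render_emph(node, depth=0):
    """Render nested semantic emphasis, alternating italic and upright."""
    if isinstance(node, str):
        return node
    tag, *children = node
    if tag != "emph":
        raise ValueError(f"unknown node type: {tag!r}")
    # Even nesting depth is italic, odd depth switches back to upright.
    style = "italic" if depth % 2 == 0 else "normal"
    inner = "".join(render_emph(child, depth + 1) for child in children)
    return f'<span style="font-style: {style}">{inner}</span>'
```

The markup only says "emphasis"; whether that means italic or upright is decided at rendering time from the nesting depth.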
Anyway, I find all this very interesting as proof of concept, although
in my workflow I prefer to use macros for these types of scenarios (yes,
a rare case where I don't use links! :-D):
#+begin_src emacs-lisp
(defun my-macro-emph (arg)
  (cond ((org-export-derived-backend-p org-export-current-backend 'latex)
         (concat "@@latex:\\emph{@@" arg "@@latex:}@@"))
        ((org-export-derived-backend-p org-export-current-backend 'html)
         (concat "@@html:<em>@@" arg "@@html:</em>@@"))
        ((org-export-derived-backend-p org-export-current-backend 'odt)
         (concat "@@odt:<text:span text:style-name=\"Emphasis\">@@" arg "@@odt:</text:span>@@"))))

(setq org-export-global-macros
      '(("emph" . "(eval (my-macro-emph $1))")))
#+end_src
{{{emph(The Making Off of {{{emph(Star Wars)}}})}}}
Best regards,
Juan Manuel
* Re: Raw Org AST snippets for "impossible" markup
2021-12-06 15:45 ` Juan Manuel Macías
@ 2021-12-06 16:56 ` Juan Manuel Macías
2021-12-08 13:09 ` Max Nikulin
1 sibling, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-06 16:56 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Juan Manuel Macías writes:
> I would suggest, however, not to use the term 'italics [...blah blah...]'
Sorry for the noise! I think I messed myself up...
Naturally, 'italic' (or 'bold') is required: (italic () \"inter\")
Best regards,
Juan Manuel
* Re: Raw Org AST snippets for "impossible" markup
2021-12-06 15:45 ` Juan Manuel Macías
2021-12-06 16:56 ` Juan Manuel Macías
@ 2021-12-08 13:09 ` Max Nikulin
2021-12-08 23:19 ` Juan Manuel Macías
1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-08 13:09 UTC (permalink / raw)
To: emacs-orgmode
On 06/12/2021 22:45, Juan Manuel Macías wrote:
>
> I understand that with this method the emphases could be nested, which
> it seems also very productive. I like it.
>
> I would suggest, however, not to use the term 'italics', since is a
> 'typographic' term, but a term that is agnostic of format and
> typography, something like as 'emphasis' or 'emph'. For example, in a
> format agnostic environment like Org, which is concerned only with
> structure, an emphasis is always an emphasis. But in a typographic
> environment that emphasis may or may not be be in italics. That is why
> in LaTeX you can write constructions like:
As you have guessed, it is not my choice; it is the interface of ox.el and
org-element.el.
However if you strongly want to use proper terminology in markup, you
may try to trade it for +your soul+ compatibility and portability
issues. The following almost works:
#+begin_src elisp :results silent
(defun orgia-link (link-data desc info)
  (let* ((backend-struct (plist-get info :back-end))
         (backend-name (org-export-backend-name backend-struct)))
    (or
     (org-export-custom-protocol-maybe link-data desc backend-name info)
     (let* ((parent (org-export-backend-parent backend-struct))
            (transcoders-alist (org-export-get-all-transcoders parent))
            (link-transcoder (alist-get 'link transcoders-alist)))
       (if link-transcoder
           (funcall link-transcoder link-data desc info)
         desc)))))

(defun evilatex-emph (_emph content info)
  ;; I have no idea yet why newline is appended.
  (format "\\textit{%s}%%" content))

(org-export-define-derived-backend 'evilatex 'latex
  :translate-alist '((emph . evilatex-emph)
                     (link . orgia-link)))
#+end_src
#+begin_src elisp
(let ((org-export-with-broken-links 'mark))
(org-export-string-as
"An [[orgia:(italic () \"ex\")]]ample of <orgia:(emph ()
\"inter\")>word and [[http://te.st][link]] [[unknown:prefix][desc]]!"
'evilatex t))
#+end_src
#+RESULTS:
: An \emph{ex}ample of \textit{inter}%
: word and \href{http://te.st}{link} [BROKEN LINK: unknown:prefix]!
Actually, I believe that something like the orgia-link code should be added
by `org-export-define-derived-backend' if "link" is missing from the
translate-alist. I suspect that `org-export-get-all-transcoders' may be
avoided.
> (setq org-export-global-macros
> '(("emph" . "(eval (my-macro-emph $1))")))
Sorry, I have not yet prepared a better variant to solve the
comma-in-macro problem.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
2021-12-08 13:09 ` Max Nikulin
@ 2021-12-08 23:19 ` Juan Manuel Macías
2021-12-08 23:35 ` John Kitchin
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-08 23:19 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Max Nikulin writes:
> As you have guessed, It is not my choice, it is interface of ox.el and
> org-element.el.
Indeed. Sorry for my haste: it's the consequence of not reading the code
carefully :-)
Of course, your orgia-link-procedure could be extended to more org elements.
I can't think of what kind of scenario that might fit in, but as a proof
of concept I find it really stimulating. E.g:
#+begin_src elisp
(org-export-string-as "<orgia:(verse-block () \"Lorem\\nipsum\\ndolor\")>" 'html t)
#+end_src
#+RESULTS:
: <p>
: <p class="verse">
: Lorem<br />
: ipsum<br />
: dolor</p>
: </p>
#+begin_src elisp
(org-export-string-as "<orgia:(quote-block (:attr_latex
(\":environment foreigndisplayquote :options {greek}\"))
\"Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲν
Ἀρταξέρξης, νεώτερος δὲ Κῦρος·\")>" 'latex t)
#+end_src
#+RESULTS:
: \begin{foreigndisplayquote}{greek}
: Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲνἈρταξέρξης, νεώτερος δὲ Κῦρος·
: \end{foreigndisplayquote}
> However if you strongly want to use proper terminology in markup, you
> may try to trade it for +your soul+ compatibility and portability
> issues. The following almost works:
Interesting, thank you.
Yes, it is strange the new line added in `evilatex-emph' ... I have no
idea why that happens.
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
2021-12-08 23:19 ` Juan Manuel Macías
@ 2021-12-08 23:35 ` John Kitchin
2021-12-09 7:01 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: John Kitchin @ 2021-12-08 23:35 UTC (permalink / raw)
To: Juan Manuel Macías; +Cc: Max Nikulin, orgmode
Have you seen
https://github.com/tj64/org-dp? It seems to do a lot with creating and
manipulating org elements. It might either be handy or lead to some
inspiration.
On Wed, Dec 8, 2021 at 6:20 PM Juan Manuel Macías <maciaschain@posteo.net>
wrote:
> Max Nikulin writes:
>
> > As you have guessed, It is not my choice, it is interface of ox.el and
> > org-element.el.
>
> Indeed. Sorry for my haste: it's the consequences of not read the code
> carefully :-)
>
> Of course, your orgia-link-procedure could be extended to more org
> elements.
> I can't think of what kind of scenario that might fit in, but as a proof
> of concept I find it really stimulating. E.g:
>
> #+begin_src elisp
> (org-export-string-as "<orgia:(verse-block ()
> \"Lorem\\nipsum\\ndolor\")>" 'html t)
> #+end_src
>
> #+RESULTS:
> : <p>
> : <p class="verse">
> : Lorem<br />
> : ipsum<br />
> : dolor</p>
> : </p>
>
> #+begin_src elisp
> (org-export-string-as "<orgia:(quote-block (:attr_latex
> (\":environment foreigndisplayquote :options {greek}\"))
> \"Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲν
> Ἀρταξέρξης, νεώτερος δὲ Κῦρος·\")>" 'latex t)
> #+end_src
>
> #+RESULTS:
> : \begin{foreigndisplayquote}{greek}
> : Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲνἈρταξέρξης,
> νεώτερος δὲ Κῦρος·
> : \end{foreigndisplayquote}
>
>
> > However if you strongly want to use proper terminology in markup, you
> > may try to trade it for +your soul+ compatibility and portability
> > issues. The following almost works:
>
> Interesting, thank you.
>
> Yes, it is strange the new line added in `evilatex-emph' ... I have no
> idea why that happens.
>
> Best regards,
>
> Juan Manuel
>
--
John
-----------------------------------
Professor John Kitchin (he/him/his)
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
2021-12-08 23:35 ` John Kitchin
@ 2021-12-09 7:01 ` Juan Manuel Macías
2021-12-09 14:56 ` Max Nikulin
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-09 7:01 UTC (permalink / raw)
To: John Kitchin; +Cc: Maxim Nikulin, orgmode
John Kitchin writes:
> Have you seen
> https://github.com/tj64/org-dp? It seems to do a lot with creating and
> manipulating org elements. It might either be handy or lead to some
> inspiration.
Interesting package. Thanks for sharing.
It gave me an idea, also borrowing part of Maxim's code, but evaluating
in this case the path. To continue playing with links... The goal is
to obtain a link with this structure `[[quote-lang:lang][quote]]':
#+BEGIN_SRC emacs-lisp :results silent
(org-link-set-parameters
 "quote-lang"
 :display 'full
 :export (lambda (path desc bck)
           (let* ((bck org-export-current-backend)
                  (attr (list (format
                               ":environment foreigndisplayquote :options {%s}"
                               path)))
                  (info (org-export-get-environment
                         bck nil nil)))
             (org-no-properties
              (org-export-data-with-backend
               `(quote-block (:attr_latex ,attr)
                             ,desc)
               bck info)))))
#+END_SRC
#+begin_src emacs-lisp
(setq backends '(latex html odt))
(setq results nil)
(mapc (lambda (backend)
        (add-to-list 'results
                     (org-export-string-as
                      "[[quote-lang:spanish][Publicamos nuestro libros
para librarnos de ellos, para no pasar el resto de nuestras vidas
corrigiendo borradores.]]" backend t) t))
      backends)
(mapconcat 'identity results "\n")
#+end_src
#+RESULTS:
#+begin_example
\begin{foreigndisplayquote}{spanish}
Publicamos nuestro libros
para librarnos de ellos, para no pasar el resto de nuestras vidas
corrigiendo borradores.
\end{foreigndisplayquote}
<p>
<blockquote>
Publicamos nuestro libros
para librarnos de ellos, para no pasar el resto de nuestras vidas
corrigiendo borradores.
</blockquote>
</p>
<text:p text:style-name="Text_20_body">Publicamos nuestro libros
para librarnos de ellos, para no pasar el resto de nuestras vidas
corrigiendo borradores.</text:p>
#+end_example
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
2021-12-09 7:01 ` Juan Manuel Macías
@ 2021-12-09 14:56 ` Max Nikulin
2021-12-09 16:11 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-09 14:56 UTC (permalink / raw)
To: emacs-orgmode
On 09/12/2021 14:01, Juan Manuel Macías wrote:
> John Kitchin writes:
>
>> Have you seen
>> https://github.com/tj64/org-dp? It seems to do a lot with creating and
>> manipulating org elements. It might either be handy or lead to some
>> inspiration.
>
> Interesting package. Thanks for sharing.
Either I missed something or its purpose is completely different: it
maps Org markup to Org markup. I am experimenting with fragments that
should make it possible to get something that is really tricky or even
impossible with the established syntax, so the code has to run immediately
before the exporters.
> It gave me an idea, also borrowing part of Maxim's code, but evaluating
> in this case the path. To continue playing with links... The goal is
> to obtain a link with this structure `[[quote-lang:lang][quote]]':
>
> #+BEGIN_SRC emacs-lisp :results silent
> (org-link-set-parameters
> "quote-lang"
> :display 'full
> :export (lambda (path desc bck)
> (let* ((bck org-export-current-backend)
> (attr (list (format
> ":environment foreigndisplayquote :options {%s}"
> path)))
> (info (org-export-get-environment
> bck nil nil)))
> (org-no-properties
> (org-export-data-with-backend
> `(quote-block (:attr_latex ,attr)
> ,desc)
> bck info)))))
> #+END_SRC
Looking into your code, I have realized that it should be implemented
using a filter, not through the :export property of links. Maybe without
a working proof of concept with link exporters, this session of
monkey-typing would not have been successful.
#+begin_src elisp :results silent
(defun orgia-element-replace (current new destructive?)
  (if (eq current new)
      current
    (let* ((lst? (and (listp new) (not (symbolp (car new)))))
           (new-lst (if lst?
                        ;; Nodes are inserted before CURRENT one by one, so
                        ;; the list must stay in document order; copy it
                        ;; unless the caller allows NEW to be used as is.
                        (if destructive? new (copy-sequence new))
                      (list new))))
      (dolist (element new-lst)
        (org-element-insert-before element current)))
    (org-element-extract-element current)
    new))
(defun orgia--transform-link (data)
  (if (not (string-equal "orgia" (org-element-property :type data)))
      data
    (let* ((path (org-element-property :path data)))
      (if (not (eq ?\( (aref path 0)))
          (or path (org-element-contents data))
        (read path)))))
(defun orgia-parse-tree-filter (data _backend info)
  (org-element-map data 'link
    (lambda (data)
      (orgia-element-replace data (orgia--transform-link data) t))
    info nil nil t)
  data)
#+end_src
#+begin_src elisp :results silent
(add-to-list 'org-export-filter-parse-tree-functions
#'orgia-parse-tree-filter)
(org-link-set-parameters "orgia")
#+end_src
#+begin_src elisp
(org-export-string-as "An <orgia:(\"in\" (italic () \"ter\"))>word"
'html t)
#+end_src
#+RESULTS:
: <p>
: An in<i>ter</i>word</p>
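To illustrate the idea, the core of the trick does not depend on Org at
all: the link path is plain text that `read' turns into Lisp data, and
the same test as in `orgia-element-replace' decides whether that data is
a single node or a list of sibling nodes. A minimal sketch, assuming
nothing beyond core Emacs Lisp (the name `my/orgia-fragment-list-p' is
hypothetical and not part of the code above):
#+begin_src elisp
;; Hypothetical helper mirroring the `lst?' test in `orgia-element-replace'.
(defun my/orgia-fragment-list-p (new)
  "Return non-nil when NEW is a list of nodes rather than a single node."
  (and (listp new) (not (symbolp (car new)))))

;; A path holding two sibling nodes: a string and an italic object.
(my/orgia-fragment-list-p (read "(\"in\" (italic () \"ter\"))")) ; => t
;; A path holding a single object, whose car is its type symbol.
(my/orgia-fragment-list-p (read "(italic () \"ex\")"))           ; => nil
#+end_src
Only in the first case does every item of the list have to be inserted
separately before the link is removed from the tree.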
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
2021-12-09 14:56 ` Max Nikulin
@ 2021-12-09 16:11 ` Juan Manuel Macías
2021-12-09 22:27 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-09 16:11 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Max Nikulin writes:
> Looking into your code I have realized that it should be implemented
> using filter, not through :export property of links. Maybe without
> working proof of concept with link exporters, this session of
> monkey-typing would not be successful.
Jumping into the "real world", how about these two examples of nested emphasis?
#+begin_src org :results latex :results replace
[[orgia:(italic () "The English versions of the " (italic () "Iliad") " and the " (italic () "Odyssey"))]]
#+end_src
#+RESULTS:
#+begin_export latex
\emph{The English versions of the \emph{Iliad} and the \emph{Odyssey}}
#+end_export
This one more complex:
#+begin_src org :results latex :results replace
[[orgia:(italic () "The English versions of the " (bold () (italic () "Iliad")) " and the " (bold () (italic () "Odyssey")))]]
#+end_src
#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the \textbf{\emph{Odyssey}}}
#+end_export
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
2021-12-09 16:11 ` Juan Manuel Macías
@ 2021-12-09 22:27 ` Juan Manuel Macías
2022-01-03 14:34 ` Max Nikulin
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-09 22:27 UTC (permalink / raw)
To: Maxim Nikulin; +Cc: orgmode
Juan Manuel Macías writes:
> Jumping into the "real world", how about these two examples of nested emphasis?
By the way, what do you think about allowing the use of some kind of
aliases, so that the markup looks less verbose? Maybe something like "(i::"
instead of "(italic () ..."? I came up with this hasty sketch on top of your
latest code, *just* to see how it looks (I don't know whether I prefer it to
stay verbose):
#+begin_src emacs-lisp :results silent
(setq orgia-alias-alist '(("i" "italic")
                          ("b" "bold")
                          ("u" "underline")
                          ("s" "strike-through")))
(defun orgia-replace (before after)
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward before nil t)
      (replace-match after t nil))))
(defun orgia--transform-path (path)
  (with-temp-buffer
    (insert path)
    (mapc (lambda (el)
            (orgia-replace (concat "(" (car el) "::") (concat "(" (cadr el) " () ")))
          orgia-alias-alist)
    (buffer-string)))
(defun orgia--transform-link (data)
  (if (not (string-equal "orgia" (org-element-property :type data)))
      data
    (let* ((path (org-element-property :path data)))
      (if (not (eq ?\( (aref path 0)))
          (or path (org-element-contents data))
        (read (orgia--transform-path path)))))) ;; <====
;;;;;;;;;;;;;;;;;;
#+end_src
#+begin_src elisp
(org-export-string-as "An <orgia:(\"in\" (s:: \"ter\"))>word"
'odt t)
#+end_src
#+RESULTS:
:
: <text:p text:style-name="Text_20_body">An in<text:span text:style-name="Strikethrough">ter</text:span>word</text:p>
#+begin_src org :results latex :results replace
[[orgia:(i:: "The English versions of the " (b:: (i:: "Iliad")) " and the " (b:: (i::
"Odyssey")))]]
#+end_src
#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the \textbf{\emph{Odyssey}}}
#+end_export
------------------------------------------------------
Juan Manuel Macías
https://juanmanuelmacias.com/
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
2021-12-09 22:27 ` Juan Manuel Macías
@ 2022-01-03 14:34 ` Max Nikulin
0 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2022-01-03 14:34 UTC (permalink / raw)
To: emacs-orgmode
[-- Attachment #1: Type: text/plain, Size: 1899 bytes --]
On 10/12/2021 05:27, Juan Manuel Macías wrote:
> Juan Manuel Macías writes:
>
>> Jumping into the "real world", how about these two examples of nested emphasis?
>
> By the way, what do you think about allowing the use of some kind of
> aliases, so that the aspect is less verbose?
I have no particular opinion concerning aliases, but they certainly
should not work through string search and replace when a parsed tree is
available.
> (defun orgia--transform-path (path)
> (with-temp-buffer
> (insert path)
> (mapc (lambda (el)
> (orgia-replace (concat "(" (car el) "::") (concat "(" (cadr el) " () ")))
By the way, is there any problem with `replace-regexp-in-string'?
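For comparison, a minimal sketch of the same string-based expansion
written with `replace-regexp-in-string' instead of a temporary buffer.
The function name `my/orgia-expand-aliases' is hypothetical; the alist
has the same shape as `orgia-alias-alist' quoted above. Like any string
replacement, it would also rewrite an alias prefix that happens to occur
inside a string literal.
#+begin_src elisp
;; Hypothetical string-based variant of `orgia--transform-path'.
(defun my/orgia-expand-aliases (path alist)
  "Expand \"(ALIAS::\" prefixes in PATH using ALIST of (ALIAS NAME) lists."
  (dolist (el alist path)
    (setq path (replace-regexp-in-string
                ;; Quote the prefix so that it is matched literally.
                (regexp-quote (concat "(" (car el) "::"))
                (concat "(" (cadr el) " () ")
                path
                t      ; FIXEDCASE: do not adapt letter case
                t))))  ; LITERAL: no backslash processing in the replacement

(read (my/orgia-expand-aliases
       "(i:: \"The \" (b:: \"Iliad\"))"
       '(("i" "italic") ("b" "bold"))))
;; => (italic nil "The " (bold nil "Iliad"))
#+end_src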
See the attached file for definitions of some helper functions. Final setup:
#+begin_src elisp :results silent
(setq orgia-demo-alias-alist
      '((b . bold)
        (i . italic)
        (s . strike-through)
        (_ . underline)))
(defun orgia-demo-alias-post-filter (node &optional _children)
  (when (listp node)
    (let ((sym (and (symbolp (car node))
                    (assq (car node) orgia-demo-alias-alist))))
      (when sym
        (setcar node (cdr sym)))))
  node)
(defun orgia-demo-alias (tree)
  (orgia-transform-tree-deep tree nil #'orgia-demo-alias-post-filter))
#+end_src
#+begin_src elisp :results silent
(require 'ox)
(add-to-list 'org-export-filter-parse-tree-functions
#'orgia-parse-tree-filter)
(org-link-set-parameters "orgia")
(require 'ob-org)
(add-to-list 'orgia-transform-functions #'orgia-demo-alias)
#+end_src
And your test sample, slightly modified:
#+begin_src org :results latex :results replace
[[orgia:(i nil "The English versions of the " (b nil (i () "Iliad"))
" and the " (b () (i ()
"Odyssey")))]]
#+end_src
#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the
\textbf{\emph{Odyssey}}}
#+end_export
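For readers who only want the gist of the alias expansion on parsed
data, here is a simplified sketch. It is not the attached code: it is
recursive and non-destructive (it builds a new tree), assumes that the
tree consists of proper lists, and may hit the Lisp recursion limit on
very deep trees. The name `my/orgia-alias-expand' is hypothetical; the
alist has the same shape as `orgia-demo-alias-alist'. Note that a symbol
at the head of a list-valued property would be expanded as well.
#+begin_src elisp
;; Hypothetical recursive counterpart of the alias post-filter.
(defun my/orgia-alias-expand (node alist)
  "Return a copy of NODE with head symbols replaced according to ALIST."
  (if (not (consp node))
      node                              ; strings, nil and other atoms
    (let ((head (car node)))
      (cons (if (symbolp head)
                ;; Replace the type symbol if it is a known alias.
                (or (cdr (assq head alist)) head)
              (my/orgia-alias-expand head alist))
            (mapcar (lambda (child) (my/orgia-alias-expand child alist))
                    (cdr node))))))

(my/orgia-alias-expand
 '(i nil "The " (b nil (i () "Iliad")))
 '((b . bold) (i . italic)))
;; => (italic nil "The " (bold nil (italic nil "Iliad")))
#+end_src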
[-- Attachment #2: orgia-draft.el --]
[-- Type: text/x-emacs-lisp, Size: 2080 bytes --]
(defvar orgia-transform-functions nil)
(defun orgia-default-pre-filter (node)
  "Returns (node . children)"
  (if (listp node)
      (cons node node)
    (cons node nil)))
(defun orgia-transform-tree-deep (tree &optional pre-filter post-filter)
  "Depth-first walk."
  ;; Queue items: ((node-cell . children) . next-list)
  (let* ((pre-filter (or pre-filter #'orgia-default-pre-filter))
         (top (list tree))
         (queue (list (cons (cons top top) top))))
    (while queue
      (let* ((item (pop queue))
             (next-list (cdr item)))
        (if (not next-list)
            ;; post; skip POST-FILTER for the list wrapping TREE
            (when (and queue post-filter)
              (let* ((node-cell-children (car item))
                     (children (cdr node-cell-children)))
                (setcar (car node-cell-children)
                        (funcall post-filter
                                 (caar node-cell-children)
                                 children))))
          ;; pre
          (setcdr item (cdr next-list))
          (push item queue)
          (let* ((node-children
                  (funcall pre-filter (car next-list)))
                 (node (car node-children))
                 (children (cdr node-children)))
            (setcar next-list node)
            (push (cons (cons next-list children) children) queue)))))
    (car top)))
(defun orgia-element-replace (current new destructive?)
  (if (eq current new)
      current
    (let* ((lst? (and (listp new) (not (symbolp (car new)))))
           (new-lst (if lst?
                        ;; Nodes are inserted before CURRENT one by one, so
                        ;; the list must stay in document order; copy it
                        ;; unless the caller allows NEW to be used as is.
                        (if destructive? new (copy-sequence new))
                      (list new))))
      (dolist (element new-lst)
        (org-element-insert-before element current)))
    (org-element-extract-element current)
    new))
(defun orgia--transform-link (data)
  (if (not (string-equal "orgia" (org-element-property :type data)))
      data
    (let* ((path (org-element-property :path data)))
      (if (not (eq ?\( (aref path 0)))
          (or path (org-element-contents data))
        (let ((tree (read path)))
          (dolist (f orgia-transform-functions tree)
            (setq tree (funcall f tree))))))))
(defun orgia-parse-tree-filter (data _backend info)
  (org-element-map data 'link
    (lambda (data)
      (orgia-element-replace data (orgia--transform-link data) t))
    info nil nil t)
  data)
^ permalink raw reply [flat|nested] 72+ messages in thread
* [PATCH] Intra-word markup: \relax
2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
2021-12-02 11:18 ` Ihor Radchenko
2021-12-02 11:58 ` Timothy
@ 2022-01-28 12:12 ` Max Nikulin
2022-01-28 13:13 ` Juan Manuel Macías
2 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2022-01-28 12:12 UTC (permalink / raw)
To: emacs-orgmode
[-- Attachment #1: Type: text/plain, Size: 1214 bytes --]
On 02/12/2021 17:50, Denis Maier wrote:
>
> Currently, org syntax doesn't officially seem to support intra-word
> emphasis. Am I missing something?
> If the assessment is correct: Is there a reason for this? And, shouldn't
> that be officially added?
I have an idea how to implement *intra*/word/ markup with a minimal change
to Org syntax. At first I hoped that it would be enough to introduce a
\relax entity that expands to an empty string, but it does not work for the
second part of a word: *intra*\relax{}/word/ is exported to
<b>intra</b>/word/.
So it is necessary to support consuming spaces after such an entity,
similar to TeX commands:
*intra*\relax /word/
In Org "a\_ b" already behaves in the same way.
I do not like zero-width spaces since they are invisible, so they are
not really "text" markup. Moreover, it is better to filter them out
during export.
Another failed idea was to use an export snippet or a macro for this purpose:
#+macro: sep $1
*intra*{{{sep()}}}/word/, *intra*@@html:@@/word/
An important point is that the suggested solution works for all export
backends. I do not consider explicit export snippets a workaround since
they require code for all backends in Org files.
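As an illustration of what the attached patch adds, the sketch below
evaluates the entity-generating expression on its own and then checks a
deliberately simplified regexp (not the one from the patch) against the
example text. Only core Emacs Lisp is assumed.
#+begin_src elisp
;; The entity family from the patch: "relax" followed by 0..20 spaces.
(let ((relax-entities
       (mapcar (lambda (n)
                 ;; Entity format: (NAME LATEX MATH-P HTML ASCII LATIN1 UTF-8)
                 (list (concat "relax" (make-string n ?\s)) "" nil "" "" "" ""))
               (number-sequence 0 20))))
  (list (length relax-entities)
        (car relax-entities)
        (car (nth 2 relax-entities))))
;; => (21 ("relax" "" nil "" "" "" "") "relax  ")

;; Simplified stand-in for the parser regexp: the name together with
;; the spaces that follow it forms the entity that is looked up.
(let ((text "*intra*\\relax /word/"))
  (and (string-match "\\\\\\(relax +\\)" text)
       (match-string 1 text)))
;; => "relax "
#+end_src
Since every replacement is an empty string, the entity and the spaces it
consumes disappear from the exported text.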
[-- Attachment #2: 0001-Intra-word-markup-relax.patch --]
[-- Type: text/x-patch, Size: 2278 bytes --]
From 95a0dcb1370577409388e137dae98ec4c1af5bbd Mon Sep 17 00:00:00 2001
From: Max Nikulin <manikulin@gmail.com>
Date: Fri, 28 Jan 2022 18:55:54 +0700
Subject: [PATCH] Intra-word markup: \relax
lisp/org-element.el (org-element-entity-parser): Parse \relax entity
with following spaces.
lisp/org-entities.el (org-entities): Add "\relax " entities with various
number of spaces expanding to nothing.
Allow "*intra*\relax /word/" markup change within a continuous word. It
is not enough to just add a "relax" entity: while it allows
"*intra*\relax{}word", characters after "{}" are not considered
emphasis markers in "intra\relax{}/word/". The name is similar to the TeX
command. Consuming spaces following a command is the usual behavior of TeX
commands as well.
---
lisp/org-element.el | 2 +-
lisp/org-entities.el | 7 ++++++-
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/lisp/org-element.el b/lisp/org-element.el
index b82475a14..83001fd74 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -3159,7 +3159,7 @@ a plist with `:begin', `:end', `:latex', `:latex-math-p',
Assume point is at the beginning of the entity."
(catch 'no-object
- (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
+ (when (looking-at "\\\\\\(?:\\(?1:\\(?:_\\|relax\\) +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
(save-excursion
(let* ((value (or (org-entity-get (match-string 1))
(throw 'no-object nil)))
diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index 2bd4f2fe3..f6177c471 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -526,7 +526,12 @@ packages to be loaded, add these packages to `org-latex-packages-alist'."
spaces
spaces
(make-string n ?\x2002))
- space-entities)))))
+ space-entities))))
+ ;; Add "\relax " space-eating entity family for "intra\relax *word*" markup.
+ (mapcar (lambda (n)
+ (list (concat "relax" (make-string n ? )) "" nil "" "" "" ""))
+ (number-sequence 0 20)))
+
"Default entities used in Org mode to produce special characters.
For details see `org-entities-user'.")
--
2.25.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* Re: [PATCH] Intra-word markup: \relax
2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
@ 2022-01-28 13:13 ` Juan Manuel Macías
2022-02-02 15:42 ` Max Nikulin
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2022-01-28 13:13 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Max Nikulin writes:
> I have an idea how to implement *intra*/word/ markup with minimal
> change of Org syntax. At first I had a hope that it is enough to
> introduce \relax entity that expands to empty string, but it does not
> work for second part of words: *intra*\relax{}/word/ is exported to
> <b>intra</b>/word/.
> So it is necessary to support consuming spaces after such entity
> similar to TeX commands:
> *intra*\relax /word/
> In Org "a\_ b" already behaves in the same way.
>
> I do not like zero-width spaces since they are invisible, so they are
> not really "text" markup. Moreover, it is better to filter them out
> during export.
>
> Another failed idea was to use export snippet or a macro for such purpose:
> #+macro sep $1
> *intra*{{{sep()}}}/word/, *intra*@@html:@@/word/
>
> Important point that suggested solution works for all export backends.
> I do not consider explicit export snippets as a workaround since it
> requires code for all backends in org files.
Maxim, I find the idea of \relax entity interesting. The only (minor)
drawback I find (in normal use, I mean) is the verbosity it adds.
In my case, I have already given up on the problem of marks inside words
:-(. My personal opinion: I think that, unless a completely
'revolutionary' solution emerges, it is better to leave the matter as it
is, and consider this a feature of Org rather than a bug. I suspect that
a single solution could not satisfy all tastes or all possible
scenarios, so maybe it would be nice to put a list of solutions
(including this one and also the zero space thing, and others that have
arisen or may arise) somewhere (perhaps in the manual?). What doesn't
quite convince me (and I agree with you on that) is recommending zero
width space as a sort of 'official' escape character. For the reasons
you have expressed, which I think are very fair.
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2021-12-04 21:48 ` Tom Gillespie
2021-12-06 10:59 ` Max Nikulin
@ 2022-01-28 14:52 ` Max Nikulin
2022-01-29 3:13 ` Ihor Radchenko
1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2022-01-28 14:52 UTC (permalink / raw)
To: Tom Gillespie; +Cc: emacs-orgmode
On 05/12/2021 04:48, Tom Gillespie wrote:
>> Since org is a valid export backend though, perhaps this behaviour should be
>> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
>> fairly nicely.
>
> ...
>
> What this means is that @@:...@@ syntax is not actually used
> in Org at all at the moment and renders as plain text. I agree that
> we need to avoid @@org:..@@ because it has legitimate uses.
> Making a back-end of empty string valid for parse separately
> syntax thus makes @@ syntax more regular overall, and allows
> @@:...@@ to be processed separately because it currently
> never enters the export snippet processing.
It seems that @@:...@@ should behave significantly differently from a
regular export snippet since Org markup should be parsed inside.
It could be used for one more purpose. I miss a "fallback" option for
export snippets. E.g. if explicit raw markup is specified for HTML and
LaTeX, it would be nice to have something for other backends such as
ascii or odt. In a series of adjacent export snippets, @@:...@@ may be
taken when the backends of earlier snippets do not match:
@@html:HTML 1@@@@latex:LaTeX 1@@@@:ascii and odt 1@@@@html: HTML
2@@@@:LaTeX, ascii, and odt 2@@.
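One possible reading of this rule, as a minimal sketch: a fallback
snippet closes a group of alternatives and is used only if no earlier
snippet of the same group matched the backend. Snippets are modelled
here as plain (BACKEND . VALUE) pairs with "" standing for @@:...@@, so
parsing of Org markup inside the fallback is left out. The function name
`my/snippet-fallback-select' is hypothetical; nothing like it exists in
Org.
#+begin_src elisp
;; Hypothetical model of the proposed fallback semantics.
(defun my/snippet-fallback-select (snippets backend)
  "Concatenate the values of SNIPPETS that apply to BACKEND.
SNIPPETS is a list of (NAME . VALUE) pairs; NAME \"\" marks a fallback."
  (let ((matched nil)
        (out '()))
    (dolist (snippet snippets)
      (let ((name (car snippet))
            (value (cdr snippet)))
        (cond
         ((string= name "")
          ;; The fallback is taken only if nothing matched in this
          ;; group; it also starts a new group.
          (unless matched (push value out))
          (setq matched nil))
         ((string= name backend)
          (push value out)
          (setq matched t)))))
    (apply #'concat (nreverse out))))

(let ((snippets '(("html" . "HTML 1") ("latex" . "LaTeX 1")
                  ("" . "ascii and odt 1")
                  ("html" . " HTML 2")
                  ("" . "LaTeX, ascii, and odt 2"))))
  (list (my/snippet-fallback-select snippets "html")
        (my/snippet-fallback-select snippets "latex")
        (my/snippet-fallback-select snippets "odt")))
;; => ("HTML 1 HTML 2"
;;     "LaTeX 1LaTeX, ascii, and odt 2"
;;     "ascii and odt 1LaTeX, ascii, and odt 2")
#+end_src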
At first I complained that it would be impossible to put export snippets
in a "parse separately" construct with the @@:...@@ syntax. Likely that is
not necessary. It is a bit verbose, but "parse separately" may be split:
@@:part 1@@@@html:html-only@@@@:@@@@:part 2@@
Empty @@:@@ is added to avoid considering @@:part 2@@ as a fallback for
"html-only".
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2022-01-28 14:52 ` Max Nikulin
@ 2022-01-29 3:13 ` Ihor Radchenko
2022-01-29 13:05 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: Ihor Radchenko @ 2022-01-29 3:13 UTC (permalink / raw)
To: Max Nikulin; +Cc: Tom Gillespie, emacs-orgmode
Max Nikulin <manikulin@gmail.com> writes:
> It could be used for one more purpose. I miss "fallback" option for
> export snippets. E.g. if explicit raw markup is specified for HTML and
> LaTeX, it would be nice to have something for other backends such as
> ascii or odt. In the series of adjacent export snippets @@:...@@ may be
> taken when backends in earlier snippets are not matched:
This reminds me about our #+begin_export export blocks and #+begin_*
special blocks. We can think of @@backend:...@@ snippets as inline
equivalent of export blocks. Special blocks do not have inline
equivalent (except maybe links abused for export by some people).
Keeping in mind the above analogy, note that export blocks do not have
fallbacks, while special blocks do (for example, see
https://github.com/alhassy/org-special-block-extras/).
Maybe we should introduce an equivalent of special blocks, but for
inline use? Or should we modify _both_ inline export snippets and export
blocks to allow fallback mechanism?
Best,
Ihor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2022-01-29 3:13 ` Ihor Radchenko
@ 2022-01-29 13:05 ` Juan Manuel Macías
2022-02-02 15:28 ` Max Nikulin
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2022-01-29 13:05 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: orgmode
Ihor Radchenko writes:
> Maybe we should introduce an equivalent of special blocks, but for
> inline use? Or should we modify _both_ inline export snippets and export
> blocks to allow fallback mechanism?
I find the idea of inline special blocks very interesting, but I think
there are a couple of drawbacks: since special blocks support ATTR_X,
how would that be implemented in the inline version? The most obvious
thing I can think of is to mimic inline code blocks:
my_special_block[attributes list]{content}
But the result would often be too verbose. Another risk that
this would entail, IMHO, is that of the "LaTeXification" of Org...
In any case, for things like that, aren't links and macros enough? I'm
one of those who 'abuse' links for many export scenarios (I even have
written this package:
https://gitlab.com/maciaschain/org-critical-edition), and I think links
have enormous potential and versatility. John Kitchin's blog has really
helped me open my mind and explore that very productive Org component.
Macros are also a very powerful tool, except for the comma issue, which
I think is still an unfinished business and a solution should be found
one day. Still, the possibility of a special inline block is very
interesting to me.
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2022-01-29 13:05 ` Juan Manuel Macías
@ 2022-02-02 15:28 ` Max Nikulin
2022-02-02 20:01 ` Juan Manuel Macías
0 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2022-02-02 15:28 UTC (permalink / raw)
To: emacs-orgmode
> Ihor Radchenko writes:
>> Keeping in mind the above analogy, note that export blocks do not have
>> fallbacks, while special blocks do (for example, see
>> https://github.com/alhassy/org-special-block-extras/)
Ihor, I am sorry, but I missed your point. That project provides some
set of defined link+block pairs and some macros to define new
links/pairs. I do not see relation to export snippets or blocks that are
used when their content is not intended to be reusable.
>> Maybe we should introduce an equivalent of special blocks, but for
>> inline use? Or should we modify _both_ inline export snippets and export
>> blocks to allow fallback mechanism?
I suppose it would be consistent to consider adjacent export blocks as
alternatives and to allow a "fallback" or "default" block. Again, similar
to @@:...@@ snippets, the block content should be parsed as Org markup.
On 29/01/2022 20:05, Juan Manuel Macías wrote:
> I find the idea of inline special blocks very interesting, but I think
> there are a couple of drawbacks: since special blocks support ATTR_X,
> how would that be implemented in the inline version? The most obvious
> thing I can think of is to mimic inline code blocks:
>
> my_special_block[attributes list]{content}
ATTR_X attributes are supported for links as well, see
info "(org) Links in HTML export"
https://orgmode.org/manual/Links-in-HTML-export.html
However, it is rather verbose, may have problems with LaTeX, and I am
unsure if they can be accessed from export link handlers.
Actually I do not like the src_something[...]{...} syntax since there is no
clear mark (such as "\") at the beginning showing that it is a special
construct.
> In any case, for things like that, aren't links and macros enough?
Ad hoc code for particular backends (and the discussed fallback for other
backends) is a somewhat different thing. It may be used in macros, but
macros cannot replace it. Moreover, the @@:...@@ construct proposed by Tom
would allow e.g.
[[https://orgmode.org][@@:*inter*@@@@:/word/@@]]
to be half-word bold and half-word italics without invisible zero width
spaces and filters to remove them.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH] Intra-word markup: \relax
2022-01-28 13:13 ` Juan Manuel Macías
@ 2022-02-02 15:42 ` Max Nikulin
0 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2022-02-02 15:42 UTC (permalink / raw)
To: emacs-orgmode
On 28/01/2022 20:13, Juan Manuel Macías wrote:
> Max Nikulin writes:
>
>> I have an idea how to implement *intra*/word/ markup with minimal
>> change of Org syntax. At first I had a hope that it is enough to
>> introduce \relax entity that expands to empty string, but it does not
>> work for second part of words: *intra*\relax{}/word/ is exported to
>> <b>intra</b>/word/.
>> So it is necessary to support consuming spaces after such entity
>> similar to TeX commands:
>> *intra*\relax /word/
>> In Org "a\_ b" already behaves in the same way.
>
> Maxim, I find the idea of \relax entity interesting. The only (minor)
> drawback I find (in normal use, I mean) is the verbosity it adds.
"Relax" is just a name known to TeX users. Certainly another shorter
word may be used instead. I am just too lazy to look through HTML
named entities and LaTeX commands to avoid conflicts and thus behavior
unexpected by some users.
> In my case, I have already given up on the problem of marks inside words
> :-(. My personal opinion: I think that, unless a completely
> 'revolutionary' solution emerges, it is better to leave the matter as it
> is, and consider this a feature of Org rather than a bug. I suspect that
> a single solution could not satisfy all tastes or all possible
> scenarios, so maybe it would be nice to put a list of solutions
> (including this one and also the zero space thing, and others that have
> arisen or may arise) somewhere (perhaps in the manual?).
A day earlier I posted my current summary of why export snippets and macros
do not help with intra-word markup (previously I expected that they could);
only custom links are a workaround (with some limitations, as usual):
[RFC] Creole-style / Support for **emphasis**__within__**a word**
Tue, 25 Jan 2022 23:27:50 +0700.
https://list.orgmode.org/ssp8e7$ah2$1@ciao.gmane.io/
But at that moment I forgot about entities. Another topic served as a
reminder, and I spent some time experimenting with them.
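To show why the space after \relax matters, here is a minimal Python
sketch. It is not Org's implementation: PRE and POST only approximate the
characters Org allows around emphasis markers, the \relax entity is the
proposal from this thread and not an existing Org entity, and the names
export_html, PRE, POST and EMPHASIS are made up for the illustration.

```python
import re

# Rough approximation (an assumption) of the characters Org accepts before
# an opening emphasis marker and after a closing one.
PRE = r"""(?:^|(?<=[\s\-('"{]))"""
POST = r"""(?=$|[\s\-.,:!?;'")}\\\[])"""
TAGS = {"*": "b", "/": "i"}
EMPHASIS = re.compile(PRE + r"([*/])(\S|\S.*?\S)\1" + POST)


def export_html(text):
    """Toy exporter: mark up emphasis first, then expand \\relax to nothing."""
    marked = EMPHASIS.sub(
        lambda m: "<{0}>{1}</{0}>".format(TAGS[m.group(1)], m.group(2)),
        text,
    )
    # The hypothetical entity swallows either "{}" or the spaces after it,
    # the way a TeX command would.
    return re.sub(r"\\relax(?:\{\}|[ \t]*)", "", marked)


print(export_html(r"*intra*\relax{}/word/"))  # <b>intra</b>/word/
print(export_html(r"*intra*\relax /word/"))   # <b>intra</b><i>word</i>
```

In the first call the "/" follows "}", which is not an allowed preceding
character, so the italic part is never recognized. In the second call the
space makes "/word/" valid emphasis, and the entity then removes that space.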
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2022-02-02 15:28 ` Max Nikulin
@ 2022-02-02 20:01 ` Juan Manuel Macías
2022-02-03 12:10 ` Max Nikulin
0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2022-02-02 20:01 UTC (permalink / raw)
To: Max Nikulin; +Cc: orgmode
Max Nikulin writes:
> ATTR_X attributes are supported for links as well, see
> info "(org) Links in HTML export"
> https://orgmode.org/manual/Links-in-HTML-export.html
> However it is rather verbose, may have problems with LaTeX, and I am
> unsure if they can be accessed from export link handlers
Yes, I know. I often use constructions of this type in my blogs:
#+ATTR_HTML: :target _blank
some link...
But, as far as I know, its use is line-oriented. I mean, you can't use
multiple ATTR_X constructs inside a paragraph for different links
within that paragraph.
As for links and their multiple possible or future uses (I say *uses*
and never *abuses*: it's a tool, it's there to be used, and it works
great), of course I see them more as a resource (and a quite powerful
and versatile one, by the way) than as a matter of syntax. But the thing is
that for me Org is, in addition to a syntax, above all a set of
coherently assembled resources to prepare my documents, take my
notes, organize my work and do a lot of other things.
Best regards,
Juan Manuel
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Org-syntax: Intra-word markup
2022-02-02 20:01 ` Juan Manuel Macías
@ 2022-02-03 12:10 ` Max Nikulin
0 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2022-02-03 12:10 UTC (permalink / raw)
To: emacs-orgmode
On 03/02/2022 03:01, Juan Manuel Macías wrote:
> Max Nikulin writes:
>
>> ATTR_X attributes are supported for links as well, see
>> info "(org) Links in HTML export"
>> https://orgmode.org/manual/Links-in-HTML-export.html
>> However it is rather verbose, may have problems with LaTeX, and I am
>> unsure if they can be accessed from export link handlers
>
> Yes, I know. I use a lot in my blogs constructions of this type:
>
> #+ATTR_HTML: :target _blank
> some link...
I have just realized that the example in the manual does not work as
described. I will start a new thread. The attributes are applied to the
enclosing paragraph, not just to the link:
#+ATTR_HTML: :title The Org mode homepage :style color:red;
[[https://orgmode.org]]
<p title="The Org mode homepage" style="color:red;">
<a href="https://orgmode.org" title="The Org mode homepage"
style="color:red;">https://orgmode.org</a>
</p>
> But, as far as I know, its use is line-oriented. I mean, you can't use
> multiple ATTR_X constructs inside a paragraph and for different links
> inside the paragraph.
Thank you, I confused this with export issues that arise when keywords and
export blocks are used. For some reason I believed that affiliated keywords
have a dedicated section in https://orgmode.org/worg/dev/org-syntax.html
because they can be applied to inline objects, but you are right, they
set properties on the next block-level element.
Attributes from several lines are combined, however.
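As a rough illustration of that combining, here is a minimal Python sketch
that merges consecutive #+attr_html lines into one set of attributes for
the element that follows. It is a toy parser under simplifying assumptions,
not how Org reads attributes, and merge_attr_html is a made-up name.

```python
import re


def merge_attr_html(lines):
    """Merge leading '#+attr_html:' lines into one dict (toy parser)."""
    attrs = {}
    for line in lines:
        m = re.match(r"\s*#\+attr_html:\s*(.*)$", line, re.IGNORECASE)
        if not m:
            break  # the first non-keyword line starts the element itself
        # Assumption: each ':key' is followed by its value, which runs up to
        # the next ' :key' or the end of the line.
        for key, value in re.findall(r":(\S+)\s+(.*?)(?=\s+:\S|\s*$)", m.group(1)):
            attrs[key] = value
    return attrs


print(merge_attr_html([
    "#+attr_html: :rel noopener",
    "#+attr_html: :title GNU web site",
    "[[https://www.gnu.org/][GNU]].",
]))
# {'rel': 'noopener', 'title': 'GNU web site'}
```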
The following snippets illustrate bugs in the LaTeX exporter that I
remember from an earlier discussion:
---- >8 ----
This is a single paragraph in LaTeX export, but 3 HTML paragraphs.
First link (with =rel= attribute) is to
#+attr_html: :rel nofollow :title Org Mode web site
[[https://orgmode.org/][Org Mode]].
Another one is to
#+attr_html: :rel noopener
#+attr_html: :title GNU web site
[[https://www.gnu.org/][GNU]]. Both links have =title= HTML attributes.
This is single paragraph in HTML
@@odt:@@
but 2 paragraphs in LaTeX.
---- 8< ----
This is a single paragraph in \LaTeX{} export, but 3 HTML paragraphs.
First link (with \texttt{rel} attribute) is to
\href{https://orgmode.org/}{Org Mode}.
Another one is to
\href{https://www.gnu.org/}{GNU}. Both links have \texttt{title} HTML
attributes.
This is single paragraph in HTML
but 2 paragraphs in \LaTeX{}.
---- >8 ----
<p>
This is a single paragraph in LaTeX export, but 3 HTML paragraphs.
First link (with <code>rel</code> attribute) is to
</p>
<p rel="nofollow" title="Org Mode web site">
<a href="https://orgmode.org/" rel="nofollow" title="Org Mode web
site">Org Mode</a>.
Another one is to
</p>
<p title="GNU web site" rel="noopener">
<a href="https://www.gnu.org/" title="GNU web site"
rel="noopener">GNU</a>. Both links have <code>title</code> HTML attributes.
</p>
<p>
This is single paragraph in HTML
but 2 paragraphs in LaTeX.</p>
^ permalink raw reply [flat|nested] 72+ messages in thread
Thread overview: 72+ messages
2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
2021-12-02 11:18 ` Ihor Radchenko
2021-12-02 11:30 ` Juan Manuel Macías
2021-12-02 11:36 ` Denis Maier
2021-12-02 12:01 ` Ihor Radchenko
2021-12-02 11:42 ` Marco Wahl
2021-12-02 11:50 ` Denis Maier
2021-12-02 12:10 ` Ihor Radchenko
2021-12-02 12:40 ` Denis Maier
2021-12-02 12:54 ` Ihor Radchenko
2021-12-02 13:14 ` Juan Manuel Macías
2021-12-02 13:28 ` Denis Maier
2021-12-02 12:48 ` Max Nikulin
2021-12-02 12:02 ` Ihor Radchenko
2021-12-02 12:00 ` Ihor Radchenko
[not found] ` <87r1avtdjy.fsf@ucl.ac.uk>
2021-12-02 12:27 ` Denis Maier
2021-12-02 13:06 ` Eric S Fraga
2021-12-02 12:28 ` Denis Maier
2021-12-02 12:55 ` Ihor Radchenko
2021-12-02 11:58 ` Timothy
2021-12-02 12:26 ` Denis Maier
2021-12-02 13:07 ` Ihor Radchenko
2021-12-02 15:51 ` Max Nikulin
2021-12-02 18:11 ` Tom Gillespie
2021-12-02 19:09 ` Juan Manuel Macías
2021-12-04 13:07 ` Org-syntax: emphasis and not English punctuation Max Nikulin
2021-12-04 16:42 ` Juan Manuel Macías
2021-12-02 20:47 ` Org-syntax: Intra-word markup Denis Maier
2021-12-02 22:44 ` Samuel Wales
2021-12-03 14:53 ` Max Nikulin
2021-12-03 23:51 ` Tim Cross
2021-12-04 15:01 ` Max Nikulin
2021-12-05 23:34 ` Russell Adams
2021-12-05 23:37 ` Russell Adams
2021-12-06 1:39 ` Samuel Wales
2021-12-02 19:03 ` Nicolas Goaziou
2021-12-02 19:34 ` Juan Manuel Macías
2021-12-02 23:05 ` Nicolas Goaziou
2021-12-02 23:24 ` Juan Manuel Macías
2021-12-03 14:24 ` Max Nikulin
2021-12-03 15:01 ` Juan Manuel Macías
2021-12-04 15:57 ` Denis Maier
2021-12-04 17:53 ` Tom Gillespie
2021-12-04 18:37 ` John Kitchin
2021-12-04 21:16 ` Juan Manuel Macías
2021-12-06 10:57 ` Raw Org AST snippets for "impossible" markup Max Nikulin
2021-12-06 15:45 ` Juan Manuel Macías
2021-12-06 16:56 ` Juan Manuel Macías
2021-12-08 13:09 ` Max Nikulin
2021-12-08 23:19 ` Juan Manuel Macías
2021-12-08 23:35 ` John Kitchin
2021-12-09 7:01 ` Juan Manuel Macías
2021-12-09 14:56 ` Max Nikulin
2021-12-09 16:11 ` Juan Manuel Macías
2021-12-09 22:27 ` Juan Manuel Macías
2022-01-03 14:34 ` Max Nikulin
2021-12-04 19:04 ` Org-syntax: Intra-word markup Timothy
2021-12-04 21:48 ` Tom Gillespie
2021-12-06 10:59 ` Max Nikulin
2022-01-28 14:52 ` Max Nikulin
2022-01-29 3:13 ` Ihor Radchenko
2022-01-29 13:05 ` Juan Manuel Macías
2022-02-02 15:28 ` Max Nikulin
2022-02-02 20:01 ` Juan Manuel Macías
2022-02-03 12:10 ` Max Nikulin
2021-12-06 11:01 ` Denis Maier
2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
2022-01-28 13:13 ` Juan Manuel Macías
2022-02-02 15:42 ` Max Nikulin
-- strict thread matches above, loose matches on Subject: below --
2021-12-02 13:36 Org-syntax: Intra-word markup autofrettage
2021-12-02 15:24 ` Robert Pluim
2021-12-02 17:11 ` autofrettage