From: John Kitchin
Date: Sat, 4 Dec 2021 13:37:41 -0500
Subject: Re: Org-syntax: Intra-word markup
To: Tom Gillespie
Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, emacs-orgmode, Denis Maier

Along these lines (and combining the s-exp suggestion from Max), you can achieve something like this with links.

This is lightly tested, and I am not thrilled with the eval for exporting, but I couldn't get a macro to work on the export function to avoid it, and this is just a proof of concept idea. This might only be suitable for individual solutions, since you have to define this markup yourself.

#+BEGIN_SRC emacs-lisp :results silent
(defun italic (s)
  "Wrap S in italic markup for the current export backend."
  (pcase backend ;; `backend' comes from the caller, `@@-export'
    ('latex (format "{\\textit{%s}}" s))
    ('html (format "<i>%s</i>" s))
    (_ s)))

(defun @@-export (path desc backend)
  "Read PATH as a list of forms, evaluate them, and concatenate the results."
  (eval `(concat ,@(read path))))

(org-link-set-parameters
 "@@"
 :export #'@@-export)
#+END_SRC

In Org, it would look like:

Here is a [[@@:((italic "part") "ial")]] markup.

And in exports, this is what this implementation does.

#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'latex t)
#+END_SRC

#+RESULTS:
: Here is a {\textit{part}}ial markup.


#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'html t)
#+END_SRC

#+RESULTS:
: <p>
: Here is a <i>part</i>ial markup.</p>

#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'ascii t)
#+END_SRC

#+RESULTS:
: Here is a partial markup.

Of course, you are free to do what you want with the path, including parsing it yourself to generate the output, and since it is a link, you could do all kinds of things to make it look the way you want with faces, overlays, etc.
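
As a rough illustration of the "parse it yourself" route, the path could be read once and rendered through a lookup table, so that nothing from the link is evaluated. This is only a sketch under assumptions: the names my/@@-markup-table, my/@@-render and my/@@-export-noeval are hypothetical, and it has not been tried against a full export.

#+BEGIN_SRC emacs-lisp :results silent
;; Sketch only: every name starting with my/ is hypothetical, not part of Org.
(defvar my/@@-markup-table
  '((italic . ((latex . "{\\textit{%s}}")
               (html  . "<i>%s</i>"))))
  "Alist mapping a markup symbol to per-backend format strings.")

(defun my/@@-render (item backend)
  "Render ITEM, a string or a list (MARKUP STRING), for BACKEND."
  (if (stringp item)
      item
    (let* ((formats (cdr (assq (car item) my/@@-markup-table)))
           (fmt (cdr (assq backend formats))))
      ;; Unknown markup or backend: fall back to the plain text.
      (if fmt (format fmt (cadr item)) (cadr item)))))

(defun my/@@-export-noeval (path _desc backend &optional _info)
  "Export PATH for BACKEND by walking the parsed list; nothing is evaluated."
  (mapconcat (lambda (item) (my/@@-render item backend))
             (read path) ""))
#+END_SRC

It could be registered with (org-link-set-parameters "@@" :export #'my/@@-export-noeval) in place of the eval-based function. The trade-off is that every kind of markup has to be listed in the table instead of being an arbitrary function call.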


John

-----------------------------------
Professor John Kitchin (he/him/his)
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu

On Sat, Dec 4, 2021 at 12:54 PM Tom Gillespie <tgbugs@gmail.com> wrote:
Hi all,
After a bunch of rambling (see below if interested), I think I have
a solution that should work for everyone. The key realization is that
what we really want is the ability to have a "parse me separately"
type of syntax. This meets the intra-word syntax needs and might
meet some other needs as well.

The solution is to make @@org:...@@ a "parse me separately"
block! It nearly works that way already, too! To minimize typing,
we could have the empty type, @@:...@@, default to org.

This seems like a winner to me. The syntax for it already exists
and won't conflict. It requires relatively minimal additional typing,
the implication is clear, and there are other places where such
behavior could be useful.

This syntax seems like a winner to me:
@@org:/hello/@@world
@@:/hello/@@world

You can also do things like
#+begin_src org
I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
#+end_src

Which would render to
#+begin_src org
I want a number in this number3word!
#+end_src
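
To make the proposal a little more concrete, here is a rough sketch of how a line could be split into ordinary text and "parse me separately" pieces. The function name and the regexp are assumptions made for illustration; a real implementation would live in the Org parser's own grammar and would then parse each separate piece as Org.

#+begin_src emacs-lisp
;; Sketch only: my/split-parse-separately is a hypothetical helper.
(defun my/split-parse-separately (text)
  "Split TEXT into a list of (TYPE . STRING) pairs.
TYPE is `plain' for ordinary text and `separate' for the contents
of an @@org:...@@ or @@:...@@ snippet on a single line."
  (let ((pos 0) (parts '()))
    ;; Non-greedy match up to the first closing @@; the empty
    ;; back-end name is treated like org.
    (while (string-match "@@\\(?:org\\)?:\\(.*?\\)@@" text pos)
      (when (> (match-beginning 0) pos)
        (push (cons 'plain (substring text pos (match-beginning 0))) parts))
      (push (cons 'separate (match-string 1 text)) parts)
      (setq pos (match-end 0)))
    ;; Whatever follows the last snippet is ordinary text.
    (when (< pos (length text))
      (push (cons 'plain (substring text pos)) parts))
    (nreverse parts)))
#+end_src

With this sketch, "@@:/hello/@@world" splits into a separate piece "/hello/" followed by the plain piece "world", so the emphasis can be parsed without looking at the adjacent characters.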

Thoughts?

Best!
Tom

--------------- rambling below -------------


> This idea reminds me a bit of Scribble/Racket where every document is
> just inverted code, which makes it possible to insert arbitrary Racket
> code in your prose...

I will say, despite some of my comments elsewhere, that I think
exploring certain features of Scribble syntax for use in Org mode
would simplify certain parts of the syntax immensely.

For example, various inline blocks are an absolute pain to parse
because they allow nested delimiters /if they are matched/. The
implementation of the /if they are matched/ clause is currently a
nasty hack which generates a regular expression that can only
actually handle nesting to depth 3. Actually implementing the
recursive grammar adds a lot of complexity to the syntax and is
hard to get right.
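
For comparison, matching balanced braces to arbitrary depth can be done with a simple counter instead of a generated regular expression. The following is only an illustrative sketch (the function name is made up, and this is not how the Org parser is implemented); it also ignores escaping and the failure cases mentioned above.

#+begin_src emacs-lisp
;; Sketch only: my/matching-brace-end is a hypothetical helper.
(defun my/matching-brace-end (text start)
  "Return the index just after the brace closing the one at START.
TEXT must have an opening brace at index START.  Return nil when
the braces are unbalanced."
  (let ((depth 0) (i start) (len (length text)) (end nil))
    (while (and (< i len) (not end))
      (let ((c (aref text i)))
        (cond ((eq c ?\{) (setq depth (1+ depth)))
              ((eq c ?\}) (setq depth (1- depth))
               ;; Depth back at zero: this brace closes the first one.
               (when (= depth 0) (setq end (1+ i))))))
      (setq i (1+ i)))
    end))
#+end_src

The counter has no depth limit, but the scanner must still decide what to do when it returns nil, which is where much of the complexity comes from.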

It would be vastly simpler to use Scribble's |<{hello }} world}>|
style syntax and always terminate at the first matching delimiter.
I'm sure that this would break some Org files, but it would make
dealing with latex fragments and inline source blocks and inline
footnotes SO much simpler. Matching an arbitrary number of
angle brackets does add some complexity, but it is tiny compared
to the complexity of enforcing matched parens and their failure cases,
especially because many of the places where nesting is required
probably only see use of the nesting feature in a tiny fraction of
all cases.
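
A sketch of the "terminate at the first matching delimiter" rule, assuming the opener is a pipe, any number of angle brackets, and a brace, and that the closer is its mirror image. The function name is hypothetical and the delimiter shape is only my reading of the example above, not a specification.

#+begin_src emacs-lisp
;; Sketch only: my/scribble-style-body is a hypothetical helper.
(defun my/scribble-style-body (text)
  "Return the body of a |<{...}>| style block at the start of TEXT.
The opener is a pipe, zero or more <, then an opening brace.  The
closer is the mirror image, and the body ends at its first
occurrence.  Return nil when TEXT has no such block."
  (when (string-match "\\`|\\(<*\\){" text)
    (let* ((body-start (match-end 0))
           (n (length (match-string 1 text)))
           ;; Mirror the opener: closing brace, N times >, then a pipe.
           (closer (concat "}" (make-string n ?>) "|"))
           (body-end (string-match (regexp-quote closer) text body-start)))
      (when body-end
        (substring text body-start body-end)))))
#+end_src

No depth counting is needed: unmatched braces inside the body are simply part of the text, and an author who needs the closer itself inside the body adds one more angle bracket to both delimiters.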

One other reason why this is attractive is that all the instances
where nested delimiters can appear on a line are preceded by
some non-whitespace character. This means that using the
pipe syntax does not conflict with table syntax!

Now the question comes. If we could implement this for
delimiters, could we also implement something similar
for markup? The issue with the proposed markup-outside,
delimiter-inside approach is that it will change existing
behavior for files that want the delimiters to be included
in the markup, i.e. /{oops}/ becoming /oops/ is bad. A
second issue is that putting the delimiter inside the markup
cannot work for verbatim and code: ={oops}= is ={oops}= no
matter what. Therefore the solution is not uniform across all
types of markup. We need another solution that works for
all types of markup.

What if we put the "start arbitrary markup" char outside
the markup? Say something like |/ital/|ics? Or what if
we went whole hog and used |{/ital/}|ics and made the
|{...}| syntax trigger a generalized feature where the
contents of the |{...}| block are parsed by themselves
and can abut any other text? This would be generally
useful in a variety of situations beyond just intra-word
markup.

What are the issues with this approach? The first issue
is that there is a conflict with table syntax if we were to
use the pipe character because markup can appear at
the start of a line. The second issue is that it might be
confusing for users if |{}| also worked like {} when in the
context of latex elements or inline src blocks, or maybe
that is ok because |{}| never renders as text. Hrm. Ok.
Second issue resolved, but what to do about the first?

If we want generalized "parse this by itself" syntax so
that we can write hello|{/world/}|ok, then we need a
solution that can appear at the start of a line. So we
can't use pipe because that is always a table line even
if a zero-width space is put before it ;). What other
options do we have? How about #+|{/hello/}|world for
the start of a line? As long as there is no trailing colon
it isn't a keyword, so it could work ... except that if
someone reflows the text and it is no longer at the
start of a line then the syntax breaks. That is to say
using #+| at the start of a line is not uniform, so we
can't take that approach.

What other chars do we have at our disposal? Hrm.
How about @@? Could we use that? What happens
if we use @@org:/hello/@@world? Or maybe if we
want to minimize the number of chars we could do
@@:/hello/@@world and have the empty prefix in
@@ blocks mean org?
