From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ihor Radchenko <yantar92@gmail.com>
To: K K <k_foreign@outlook.com>
Cc: Max Nikulin <manikulin@gmail.com>,  "emacs-orgmode@gnu.org"
 <emacs-orgmode@gnu.org>
Subject: [PATCH] Add new entity \-- serving as markup separator/escape symbol
In-Reply-To: <87v8rkav2x.fsf@localhost>
References: <BY5PR10MB4289167298649297E045360996959@BY5PR10MB4289.namprd10.prod.outlook.com>
 <87r128d5pp.fsf@localhost> <tbnj6u$11sv$1@ciao.gmane.io>
 <80f0990042a564556cc6b047a94f7e9dddf5a280.camel@outlook.com>
 <87v8rkav2x.fsf@localhost>
Date: Thu, 28 Jul 2022 21:17:32 +0800
Message-ID: <87mtct9y1f.fsf@localhost>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-BeenThere: emacs-orgmode@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-orgmode>,
 <mailto:emacs-orgmode-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/emacs-orgmode>
List-Post: <mailto:emacs-orgmode@gnu.org>
List-Help: <mailto:emacs-orgmode-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-orgmode>,
 <mailto:emacs-orgmode-request@gnu.org?subject=subscribe>
Errors-To: emacs-orgmode-bounces+larch=yhetil.org@gnu.org
Sender: "Emacs-orgmode" <emacs-orgmode-bounces+larch=yhetil.org@gnu.org>

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Ihor Radchenko <yantar92@gmail.com> writes:

> I am attaching a tentative patch that will make Org export remove
> zero-width spaces when those spaces actually separate the object
> boundaries.
>
> Any objections?

Given the objections raised, the zero-width space does not appear to be a
useful escape symbol, because it also has valid uses as a standalone
space character.

The objections could be addressed with some kind of intricate
heuristics, but I do not think that is a good direction to go. The
code would be too complex and fragile.

Therefore, I am proposing a different approach for shielding
fontification: introducing a special entity.

The new entity is \--, which is a valid boundary between emphasis
markup. It will be removed during export (replaced by "").

"\--" specifically is a somewhat arbitrary choice. The actual requirements
for the entity name are: (1) No clash with LaTeX (which is why the simpler
\- would not cut it); (2) Being a valid markup boundary: the entity must
end with a character matching (any space ?- ?\( ?' ?\" ?\{).
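
Requirement (2) can be checked mechanically. Below is a minimal sketch,
not part of the patch. The function name my/ends-with-markup-boundary-p
and the candidate name "zwsp" are made up for illustration, and the
character list simplifies the rx form above ("space" is narrowed to a
literal space and a tab):

```elisp
;; Hypothetical helper, not part of Org.
(defun my/ends-with-markup-boundary-p (name)
  "Return t when entity NAME ends with a char allowed before emphasis markup."
  (and (> (length name) 0)
       ;; Simplified boundary set: space, tab, and the punctuation
       ;; characters listed in the message.
       (memq (aref name (1- (length name)))
             '(?\s ?\t ?- ?\( ?\' ?\" ?\{))
       t))

(my/ends-with-markup-boundary-p "--")    ; t: the name ends with "-"
(my/ends-with-markup-boundary-p "zwsp")  ; nil: the name ends with a letter
```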

I am attaching a tentative patch introducing the new entity. Note that
some minor tweaks to the parser were needed. I do not see it as a big
deal - the current entity regexp has much more cumbersome exceptions.
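
To illustrate the regexp tweak, here is a small sketch comparing the
old and new patterns from org-fontify-entities (copied from the patch
below) on sample strings. The variable names my/old-entity-re and
my/new-entity-re and the sample strings are only for illustration:

```elisp
;; Pattern before the patch: entity names are letters only.
(defvar my/old-entity-re
  "\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)")
;; Pattern after the patch: "-" is also allowed in entity names.
(defvar my/new-entity-re
  "\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)")

;; Sample line: bold markup followed by the new entity.
(let ((sample "*bold*\\--{}text"))
  (list (string-match-p my/old-entity-re sample)    ; nil: "-" is not a name char
        (string-match-p my/new-entity-re sample)))  ; 6: matches at the backslash

;; The widened class is greedy, so a name may now extend across hyphens.
(let ((sample "\\alpha-beta "))
  (string-match my/new-entity-re sample)
  (match-string 1 sample))  ; "alpha-beta"
```

The second form shows a side effect of the wider character class: the
name group now captures "alpha-beta" where the old pattern captured
"alpha". Whether this matters for existing documents is probably worth
checking.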

Also, the patch will not work correctly on org → org export, similar to
what was pointed out in one of the replies to the previous, abandoned
approach. I do not want to address it here because a much more
appropriate solution for this issue is changing
org-element-interpret-data.

Consider (org-element-interpret-data '("asd" (bold () "bold") "bsd"))
This will return "asd*bold*bsd", which is not correct: the asterisks are
not next to valid markup boundaries, so the text will not be parsed back
as bold. The given Org datum is not wrong by itself, though - such things
can easily appear when user filters are applied to the parse tree during
org→org export.
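
The round-trip problem can be reproduced with a short sketch. It
assumes Org is available. The re-parsing step in a temporary buffer is
only an illustration and is not part of the patch:

```elisp
(require 'org)
(require 'org-element)

(let ((text (org-element-interpret-data
             '("asd" (bold () "bold") "bsd"))))
  (with-temp-buffer
    (org-mode)
    (insert text)
    ;; Return the interpreted text together with the bold objects
    ;; found when that text is parsed again.
    (list text
          (org-element-map (org-element-parse-buffer) 'bold #'identity))))
```

If the second element of the result is nil, the bold object did not
survive the round trip.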

Otherwise, the patch should be good enough to play around with and to
kick-start the discussion.

WDYT?

Best,
Ihor


--=-=-=
Content-Type: text/x-patch; charset=utf-8
Content-Disposition: inline;
 filename=0001-Add-new-entity-serving-as-markup-separator-escape-sy.patch
Content-Transfer-Encoding: quoted-printable

>From 521a4b06578cf37f22e9f33d2f45b967419ad3a3 Mon Sep 17 00:00:00 2001
Message-Id: <521a4b06578cf37f22e9f33d2f45b967419ad3a3.1659013441.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Thu, 28 Jul 2022 21:02:26 +0800
Subject: [PATCH] Add new entity \-- serving as markup separator/escape symbol

* lisp/org-entities.el (org-entities): Add \-- entity.  This entity is
exported as an empty string and simply serves as markup separator if
the user needs any.
* lisp/org.el (org-fontify-entities):
* lisp/org-element.el (org-element-entity-parser):
(org-element--set-regexps): Update entity regexp to match "-".
---
 lisp/org-element.el  | 4 ++--
 lisp/org-entities.el | 4 ++++
 lisp/org.el          | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 9e9b7c5ec..6405b4db8 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -258,7 +258,7 @@ (defun org-element--set-regexps ()
 		      "\\$"
 		      ;; Objects starting with "\": line break,
 		      ;; entity, latex fragment.
-		      "\\\\\\(?:[a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)"
+		      "\\\\\\(?:[-a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)"
 		      ;; Objects starting with raw text: inline Babel
 		      ;; source block, inline Babel call.
 		      "\\(?:call\\|src\\)_"))
@@ -3158,7 +3158,7 @@ (defun org-element-entity-parser ()
 
 Assume point is at the beginning of the entity."
   (catch 'no-object
-    (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
+    (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
       (save-excursion
 	(let* ((value (or (org-entity-get (match-string 1))
 			  (throw 'no-object nil)))
diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index d35e3fa8a..9d79d23fc 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -264,6 +264,10 @@ (defconst org-entities
      ("rsaquo" "\\guilsinglright{}" nil "&rsaquo;" ">" ">" "›")
 
      "* Other"
+     
+     "** Escaping Org markup"
+     ("--" "" nil "" "" "" "")
+     
      "** Misc. (often used)"
      ("circ" "\\^{}" nil "&circ;" "^" "^" "∘")
      ("vert" "\\vert{}" t "&vert;" "|" "|" "|")
diff --git a/lisp/org.el b/lisp/org.el
index 937892ef3..29ccff83b 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5828,7 +5828,7 @@ (defun org-fontify-entities (limit)
 	;; i.e., "\_ ", could be fontified anyway, and it would be
 	;; confusing when adding a second white space character.
 	(while (re-search-forward
-		"\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
+		"\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
 		limit t)
 	  (when (and (not (org-at-comment-p))
 		     (setq ee (org-entity-get (match-string 1)))
-- 
2.35.1


--=-=-=--