emacs-orgmode@gnu.org archives
From: Ihor Radchenko <yantar92@gmail.com>
To: K K <k_foreign@outlook.com>
Cc: Max Nikulin <manikulin@gmail.com>,
	 "emacs-orgmode@gnu.org" <emacs-orgmode@gnu.org>
Subject: [PATCH] Add new entity \-- serving as markup separator/escape symbol
Date: Thu, 28 Jul 2022 21:17:32 +0800
Message-ID: <87mtct9y1f.fsf@localhost>
In-Reply-To: <87v8rkav2x.fsf@localhost>

[-- Attachment #1: Type: text/plain, Size: 1948 bytes --]

Ihor Radchenko <yantar92@gmail.com> writes:

> I am attaching a tentative patch that will make Org export remove
> zero-width spaces when those spaces actually separate the object
> boundaries.
>
> Any objections?

Given the raised objections, a zero-width space does not appear to be a
useful escape symbol, because it also has valid uses as a standalone
space character.

The raised objections could be addressed with some kind of intricate
heuristics, but I do not think that is a good direction to go: the code
would be too complex and fragile.

Therefore, I am proposing a different approach for shielding
fontification: introducing a special entity.

The new entity is \--, which is a valid boundary between emphasis
markup. It will be removed during export (replaced by "").

"\--" specifically is somewhat arbitrary choice. The actual requirements
for the entity name are: (1) no clash with LaTeX (which is why the simpler
\- would not cut it); (2) being a valid markup boundary: the entity must
end with a character matching (any space ?- ?\( ?' ?\" ?\{).
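
Requirement (2) can be checked mechanically. The following is only a
minimal sketch: the function name demo-entity-ends-with-boundary-p is
hypothetical and not part of Org or of the attached patch, and the bracket
expression merely restates the character set listed above.

```elisp
(defun demo-entity-ends-with-boundary-p (name)
  "Return t if NAME ends with a character allowed before emphasis markup."
  ;; "-" comes first in the bracket expression so it is taken literally;
  ;; "\\'" anchors the match at the end of the string.
  (and (string-match-p "[-[:space:]('\"{]\\'" name) t))

(demo-entity-ends-with-boundary-p "--")   ; non-nil: ends with "-"
(demo-entity-ends-with-boundary-p "nbsp") ; nil: ends with a letter
```

Note that a bare "-" also passes this check; it is requirement (1), the
clash with LaTeX's \-, that rules it out.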

I am attaching a tentative patch introducing the new entity. Note that
some minor tweaks to the parser were needed. I do not see it as a big
deal - the current entity regexp has much more cumbersome exceptions.
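
To illustrate the kind of tweak involved, the sketch below compares the
"objects starting with a backslash" fragment of the object regexp before
and after the change. The two regexp strings are copied from the attached
diff; the function name is made up for this example, and it only covers
this one fragment, not the entity parser or the fontification regexp.

```elisp
(defun demo-compare-object-start (text)
  "Return (OLD NEW) match positions of the two regexp fragments in TEXT."
  (let ((old-re "\\\\\\(?:[a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)")
        (new-re "\\\\\\(?:[-a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)"))
    (list (string-match-p old-re text)
          (string-match-p new-re text))))

(demo-compare-object-start "\\--")    ; => (nil 0)
(demo-compare-object-start "\\alpha") ; => (0 0)
```

With the old fragment, a backslash followed by "-" is never considered a
possible object start, so the entity parser would not even be tried on
"\--"; existing entities such as \alpha are unaffected by the change.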

Also, the patch will not work correctly on org → org export, similar to
the problem pointed out in one of the replies to the previous, abandoned
approach. I do not want to address it here because a much more
appropriate solution for this issue is changing
org-element-interpret-data.

Consider (org-element-interpret-data '("asd" (bold () "bold") "bsd")).
This returns "asd*bold*bsd", which is not correct (it will not be parsed
back as bold) even though the given Org datum is not wrong by itself -
such things can easily appear when user filters are applied to the parse
tree during org → org export.

Otherwise, the patch should be good enough to play around with and
kick-start the discussion.

WDYT?

Best,
Ihor


[-- Attachment #2: 0001-Add-new-entity-serving-as-markup-separator-escape-sy.patch --]
[-- Type: text/x-patch, Size: 2994 bytes --]

From 521a4b06578cf37f22e9f33d2f45b967419ad3a3 Mon Sep 17 00:00:00 2001
Message-Id: <521a4b06578cf37f22e9f33d2f45b967419ad3a3.1659013441.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Thu, 28 Jul 2022 21:02:26 +0800
Subject: [PATCH] Add new entity \-- serving as markup separator/escape symbol

* lisp/org-entities.el (org-entities): Add \-- entity.  This entity is
exported as an empty string and simply serves as markup separator if
the user needs any.
* lisp/org.el (org-fontify-entities):
* lisp/org-element.el (org-element-entity-parser):
(org-element--set-regexps): Update entity regexp to match "-".
---
 lisp/org-element.el  | 4 ++--
 lisp/org-entities.el | 4 ++++
 lisp/org.el          | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 9e9b7c5ec..6405b4db8 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -258,7 +258,7 @@ (defun org-element--set-regexps ()
 		      "\\$"
 		      ;; Objects starting with "\": line break,
 		      ;; entity, latex fragment.
-		      "\\\\\\(?:[a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)"
+		      "\\\\\\(?:[-a-zA-Z[(]\\|\\\\[ \t]*$\\|_ +\\)"
 		      ;; Objects starting with raw text: inline Babel
 		      ;; source block, inline Babel call.
 		      "\\(?:call\\|src\\)_"))
@@ -3158,7 +3158,7 @@ (defun org-element-entity-parser ()
 
 Assume point is at the beginning of the entity."
   (catch 'no-object
-    (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
+    (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
       (save-excursion
 	(let* ((value (or (org-entity-get (match-string 1))
 			  (throw 'no-object nil)))
diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index d35e3fa8a..9d79d23fc 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -264,6 +264,10 @@ (defconst org-entities
      ("rsaquo" "\\guilsinglright{}" nil "&rsaquo;" ">" ">" "›")
 
      "* Other"
+     
+     "** Escaping Org markup"
+     ("--" "" nil "" "" "" "")
+     
      "** Misc. (often used)"
      ("circ" "\\^{}" nil "&circ;" "^" "^" "∘")
      ("vert" "\\vert{}" t "&vert;" "|" "|" "|")
diff --git a/lisp/org.el b/lisp/org.el
index 937892ef3..29ccff83b 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5828,7 +5828,7 @@ (defun org-fontify-entities (limit)
 	;; i.e., "\_ ", could be fontified anyway, and it would be
 	;; confusing when adding a second white space character.
 	(while (re-search-forward
-		"\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
+		"\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z-]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
 		limit t)
 	  (when (and (not (org-at-comment-p))
 		     (setq ee (org-entity-get (match-string 1)))
-- 
2.35.1

