From: Teemu Likonen <tlikonen@iki.fi>
To: notmuch@notmuchmail.org
Cc: tomi.ollila@iki.fi
Subject: [PATCH 1/2] Emacs: Add a new function for balancing bidi control chars
Date: Sat, 15 Aug 2020 12:30:35 +0300
Message-ID: <20200815093036.5930-2-tlikonen@iki.fi>
In-Reply-To: <20200815093036.5930-1-tlikonen@iki.fi>

The following Unicode bidirectional control characters are modal, so
each one pushes a new bidirectional rendering mode onto a stack:

    U+202A LEFT-TO-RIGHT EMBEDDING
    U+202B RIGHT-TO-LEFT EMBEDDING
    U+202D LEFT-TO-RIGHT OVERRIDE
    U+202E RIGHT-TO-LEFT OVERRIDE

Every mode must be terminated with the character U+202C POP
DIRECTIONAL FORMATTING, which pops the mode from the stack. The stack
is per paragraph: a new text paragraph resets any rendering mode
changed by these control characters.

This change adds a new function "notmuch-balance-bidi-ctrl-chars"
which reads its STRING argument and ensures that every push
character (U+202A, U+202B, U+202D, U+202E) has a matching pop
character (U+202C). The function may append U+202C characters to the
end of the returned string, or it may remove unmatched U+202C
characters. The returned string is safe in the sense that it won't
change the surrounding bidirectional rendering mode. This function
should be used when sanitizing arbitrary input.
---
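
The behavior described above can be sketched as follows. This is a
minimal illustration, not part of the commit: it assumes the patch is
applied and that notmuch-lib.el is on the load-path. The local names
"rle" and "pdf" are only illustrative.

```elisp
;; Assumes this patch is applied, so that notmuch-lib.el provides
;; `notmuch-balance-bidi-ctrl-chars'.
(require 'notmuch-lib)

(let ((rle (string ?\u202b))   ; U+202B RIGHT-TO-LEFT EMBEDDING (push)
      (pdf (string ?\u202c)))  ; U+202C POP DIRECTIONAL FORMATTING (pop)
  (list
   ;; Unterminated push: one pop character is appended at the end.
   (equal (notmuch-balance-bidi-ctrl-chars (concat rle "abc"))
          (concat rle "abc" pdf))
   ;; Stray pop while the stack is empty: the pop character is removed.
   (equal (notmuch-balance-bidi-ctrl-chars (concat "abc" pdf "def"))
          "abcdef")
   ;; Already balanced input is returned unchanged.
   (equal (notmuch-balance-bidi-ctrl-chars (concat rle "abc" pdf))
          (concat rle "abc" pdf))))
```

Each comparison should evaluate to t. The three cases correspond to
the three situations the function handles: a missing pop, an
unmatched pop, and balanced input.
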
emacs/notmuch-lib.el | 54 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el
index 118faf1e..e6252c6c 100644
--- a/emacs/notmuch-lib.el
+++ b/emacs/notmuch-lib.el
@@ -469,6 +469,60 @@ be displayed."
"[No Subject]"
subject)))
+
+(defun notmuch-balance-bidi-ctrl-chars (string)
+  "Balance bidirectional control chars in STRING.
+
+The following Unicode bidirectional control chars are modal, so
+each one pushes a new bidirectional rendering mode onto a stack:
+U+202A LEFT-TO-RIGHT EMBEDDING, U+202B RIGHT-TO-LEFT EMBEDDING,
+U+202D LEFT-TO-RIGHT OVERRIDE and U+202E RIGHT-TO-LEFT OVERRIDE.
+Every mode must be terminated with the character U+202C POP
+DIRECTIONAL FORMATTING, which pops the mode from the stack. The
+stack is per paragraph: a new text paragraph resets any rendering
+mode changed by these control characters.
+
+This function reads the STRING argument and ensures that every
+push character (U+202A, U+202B, U+202D, U+202E) has a matching
+pop character (U+202C). The function may append U+202C characters
+to the end of the returned string, or it may remove unmatched
+U+202C characters. The returned string is safe in the sense that
+it won't change the surrounding bidirectional rendering mode.
+This function should be used when sanitizing arbitrary input."
+
+  (let ((new-string nil)
+        (stack-count 0))
+
+    (cl-flet ((push-char-p (c)
+                ;; U+202A LEFT-TO-RIGHT EMBEDDING
+                ;; U+202B RIGHT-TO-LEFT EMBEDDING
+                ;; U+202D LEFT-TO-RIGHT OVERRIDE
+                ;; U+202E RIGHT-TO-LEFT OVERRIDE
+                (cl-find c '(?\u202a ?\u202b ?\u202d ?\u202e)))
+              (pop-char-p (c)
+                ;; U+202C POP DIRECTIONAL FORMATTING
+                (eql c ?\u202c)))
+
+      (cl-loop for char across string
+               do (cond ((push-char-p char)
+                         (cl-incf stack-count)
+                         (push char new-string))
+                        ((and (pop-char-p char)
+                              (cl-plusp stack-count))
+                         (cl-decf stack-count)
+                         (push char new-string))
+                        ((and (pop-char-p char)
+                              (not (cl-plusp stack-count)))
+                         ;; The stack is empty. Ignore this pop character.
+                         )
+                        (t (push char new-string)))))
+
+    ;; Add possible missing pop characters.
+    (cl-loop repeat stack-count
+             do (push ?\x202c new-string))
+
+    (seq-into (nreverse new-string) 'string)))
+
(defun notmuch-sanitize (str)
"Sanitize control character in STR.
--
2.20.1