From mboxrd@z Thu Jan 1 00:00:00 1970
From: handa
Newsgroups: gmane.emacs.bugs
Subject: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Fri, 06 Oct 2017 19:05:41 +0900
Message-ID: <87infszj4a.fsf@gnu.org>
In-Reply-To: (message from Nima Aryan on Tue, 19 Sep 2017 13:53:31 +0000)
To: Nima Aryan
Cc: b.riefenstahl@turtle-trading.net, 28339@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:138011

Nima Aryan writes:

> As a user I prefer absorb mode by default but some times thin-space (and
> not simple space) might be a good option to consider.

The attached patch introduces a customizable variable,
arabic-shaper-ZWNJ-handling.  Shall I install it?

---
K. Handa
handa@gnu.org

------------------------------------------------------------
diff --git a/lisp/composite.el b/lisp/composite.el
index ab39e08..72b0ffc 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -442,8 +442,10 @@ lglyph-set-width
 (defsubst lglyph-set-adjustment (glyph &optional xoff yoff wadjust)
   (aset glyph 9 (vector (or xoff 0) (or yoff 0) (or wadjust 0))))
 
+;; Return a shallow copy of GLYPH.
 (defsubst lglyph-copy (glyph) (copy-sequence glyph))
 
+;; Insert GLYPH at index IDX of GSTRING.
 (defun lgstring-insert-glyph (gstring idx glyph)
   (let ((nglyphs (lgstring-glyph-len gstring))
         (i idx))
@@ -459,6 +461,18 @@ lgstring-insert-glyph
     (lgstring-set-glyph gstring i glyph)
     gstring))
 
+;; Remove glyph at IDX from GSTRING.
+(defun lgstring-remove-glyph (gstring idx)
+  (setq gstring (copy-sequence gstring))
+  (lgstring-set-id gstring nil)
+  (let ((len (length gstring)))
+    (setq idx (+ idx 3))
+    (while (< idx len)
+      (aset gstring (1- idx) (aref gstring idx))
+      (setq idx (1+ idx)))
+    (aset gstring (1- len) nil))
+  gstring)
+
 (defun compose-glyph-string (gstring from to)
   (let ((glyph (lgstring-glyph gstring from))
         from-pos to-pos)
diff --git a/lisp/language/misc-lang.el b/lisp/language/misc-lang.el
index 2843c7c..4e10227 100644
--- a/lisp/language/misc-lang.el
+++ b/lisp/language/misc-lang.el
@@ -75,12 +75,72 @@ 'cp1256
              (sample-text . "Persian فارسی")
              (documentation . "Bidirectional editing is supported.")))
 
+(defcustom arabic-shaper-ZWNJ-handling nil
+  "How to handle ZWNJ in Arabic text rendering.
+This variable controls the way to handle a glyph for ZWNJ
+returned by the underlying shaping engine.
+
+The default value is nil, which means that the ZWNJ glyph is
+displayed as is.
+
+If the value is `absorb', ZWNJ is absorbed into the previous
+grapheme cluster, and not displayed.
+
+If the value is `as-space', the glyph is displayed by a
+thin (i.e. 1-dot width) space.
+
+Customizing the value takes effect when you start Emacs next time."
+  :group 'mule
+  :version "27.1"
+  :type '(choice
+          (const :tag "default" nil)
+          (const :tag "as space" as-space)
+          (const :tag "absorb" absorb)))
+
+(defvar arabic-shape-log nil)
+
+(defun arabic-shape-gstring (gstring)
+  (setq gstring (font-shape-gstring gstring))
+  (push arabic-shaper-ZWNJ-handling arabic-shape-log)
+  (condition-case err
+      (when arabic-shaper-ZWNJ-handling
+        (let ((font (lgstring-font gstring))
+              (i 1)
+              (len (lgstring-glyph-len gstring))
+              (modified nil))
+          (while (< i len)
+            (let ((glyph (lgstring-glyph gstring i)))
+              (when (eq (lglyph-char glyph) #x200c)
+                (cond
+                 ((eq arabic-shaper-ZWNJ-handling 'as-space)
+                  (if (> (- (lglyph-rbearing glyph) (lglyph-lbearing glyph)) 0)
+                      (let ((space-glyph (aref (font-get-glyphs font 0 1 " ") 0)))
+                        (when space-glyph
+                          (lglyph-set-code glyph (aref space-glyph 3))
+                          (lglyph-set-width glyph (aref space-glyph 4)))))
+                  (lglyph-set-adjustment glyph 0 0 1)
+                  (setq modified t))
+                 ((eq arabic-shaper-ZWNJ-handling 'absorb)
+                  (let ((prev (lgstring-glyph gstring (1- i))))
+                    (lglyph-set-from-to prev (lglyph-from prev) (lglyph-to glyph))
+                    (push (cons "remove" (lgstring-glyph gstring i))
+                          arabic-shape-log)
+                    (setq gstring (lgstring-remove-glyph gstring i))
+                    (setq len (1- len)))
+                  (setq modified t)))))
+            (setq i (1+ i)))
+          (if modified
+              (lgstring-set-id gstring nil))))
+    (error (push err arabic-shape-log)))
+  gstring)
+
 (set-char-table-range
  composition-function-table
  '(#x600 . #x74F)
- (list (vector "[\u0600-\u074F\u200C\u200D]+" 0 'font-shape-gstring)
-       (vector "[\u200C\u200D][\u0600-\u074F\u200C\u200D]+"
-               1 'font-shape-gstring)))
+ (list (vector "[\u0600-\u074F\u200C\u200D]+" 0
+               'arabic-shape-gstring)
+       (vector "[\u200C\u200D][\u0600-\u074F\u200C\u200D]+" 1
+               'arabic-shape-gstring)))
 
 (provide 'misc-lang)
 
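
A note on the index arithmetic in lgstring-remove-glyph, since the
"(+ idx 3)" may look odd at first.  A gstring is a vector laid out as
[HEADER ID GLYPH0 GLYPH1 ...], so glyph IDX sits at vector index
IDX + 2, and the copy loop starts from its right-hand neighbour at
IDX + 3.  The sketch below imitates that shift on a mock vector of
symbols.  The name my-gstring-remove-glyph and the sample data are
made up for illustration only; it is not the patch itself, and it
leaves out the lgstring-set-id call.

```elisp
;; Illustrative only: `my-gstring-remove-glyph' is a hypothetical name, and
;; the vector below merely imitates the [HEADER ID GLYPH0 GLYPH1 ...] layout.
(defun my-gstring-remove-glyph (gstring idx)
  "Return a copy of GSTRING with glyph number IDX removed.
Later glyphs shift down by one slot and the last slot becomes nil.
Assumes IDX names an existing glyph slot."
  (let* ((copy (copy-sequence gstring))
         (len (length copy))
         ;; Glyph IDX lives at vector index IDX + 2, so copying
         ;; starts from its right-hand neighbour at IDX + 3.
         (i (+ idx 3)))
    (while (< i len)
      (aset copy (1- i) (aref copy i))
      (setq i (1+ i)))
    (aset copy (1- len) nil)
    copy))

;; Remove glyph 1 (`g1') from a mock gstring holding three glyphs.
(my-gstring-remove-glyph [header id g0 g1 g2] 1)
;; => [header id g0 g2 nil]
```

The vector keeps its length and the freed slot at the end is set to
nil, which is why the caller in arabic-shape-gstring decrements its
own glyph count after each removal.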