From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Bidirectional text and URLs
Date: Sat, 29 Nov 2014 20:18:29 +0200
Message-ID: <83zjb9an0q.fsf@gnu.org>
References: <87a93cngwv.fsf@uwakimon.sk.tsukuba.ac.jp> <837fyfml31.fsf@gnu.org>
 <874mtio7wh.fsf@uwakimon.sk.tsukuba.ac.jp> <83r3wml8kq.fsf@gnu.org>
To: Lars Magne Ingebrigtsen
Cc: emacs-devel@gnu.org

> From: Lars Magne Ingebrigtsen
> Date: Sat, 29 Nov 2014 18:49:21 +0100
>
> It seems pretty clear that stuff like
>
> <U+202E>http://myspace.com/#/segami/moc.koobecaf//:sptth
>
> where you have a buffer with only left-to-right text, but then you have
> a single right-to-left indicator, is suspicious.

The "single right-to-left indicator" is a fallacy: the correct use of
these formatting controls calls for a U+202E RIGHT-TO-LEFT OVERRIDE
(RLO) character before the text and a U+202C POP DIRECTIONAL
FORMATTING (PDF) character after the text.  Your example only works
because the UBA mandates that all embeddings end at the end of a
physical line, so omitting the PDF here doesn't affect the display,
since the URL stands alone on its own line.

So you could actually see a URL enclosed in the RLO..PDF pair as well,
and we need to handle that in the same manner.
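
To make the two cases concrete (an override closed by a PDF, and one that
is left open and so runs to the end of the line), here is a minimal Python
sketch.  It is not Emacs code and not a proposed implementation: the
function names and the loose URL pattern are hypothetical, and it treats
every PDF as closing an override, ignoring the LRE/RLE embeddings that a
PDF can also close.

```python
import re

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Hypothetical, deliberately loose pattern: a URL scheme spelled
# forwards or backwards.
SCHEME = re.compile(r"https?://|//:s?ptth")


def overridden_spans(line):
    """Return the pieces of LINE covered by an RLO or LRO override.

    A span ends at the matching PDF, or at the end of the line when
    the PDF is omitted.
    """
    spans = []
    open_starts = []  # start indices of overrides not yet closed
    for i, ch in enumerate(line):
        if ch in (RLO, LRO):
            open_starts.append(i + 1)
        elif ch == PDF and open_starts:
            spans.append(line[open_starts.pop():i])
    # Overrides without a PDF extend to the end of the line.
    spans.extend(line[start:] for start in open_starts)
    return spans


def suspicious_urls(line):
    """Return the overridden spans of LINE that look like a URL."""
    return [span for span in overridden_spans(line) if SCHEME.search(span)]
```

With this sketch, the example URL is reported whether or not the PDF is
present, and a URL with no override around it is not reported at all.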

> And since Latin characters are strongly left-to-right, you don't get
> confusing URLs in the middle of right-to-left text:

As Stephen pointed out earlier, the same effect can be achieved with
RTL text by using the LRO..PDF embedding (LRO is U+202D).

> So... would a possible solution here be as simple as removing all
> right-to-left indicators in mail and web modes if those right-to-left
> indicators apply to URLs?

I think instead of removing them it is better to display them
prominently, e.g., by changing their entry in the
glyphless-char-display char-table.  The advantage is that you don't
accidentally harm the display where these controls are used
legitimately, and OTOH you make their presence acutely evident.

> But currently Emacs doesn't really have a mechanism for querying the
> directionality of a buffer region, I think?

What do you mean by "directionality of a buffer region"?  At least
under some definitions of that, I can think of a very easy
implementation.
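
As an illustration only, here is a Python sketch of both ideas: showing
the controls instead of deleting them, and one possible definition of
"directionality of a region".  The function names and the <RLO>-style
markers are hypothetical, this is not how glyphless-char-display works
internally, and the definition used (look only at characters with a
strong bidi class, ignoring any override controls) is just one assumption
among several that could be made.

```python
import unicodedata

# Directional formatting controls and the labels used to show them.
BIDI_CONTROLS = {
    "\u202a": "LRE",
    "\u202b": "RLE",
    "\u202c": "PDF",
    "\u202d": "LRO",
    "\u202e": "RLO",
}


def reveal_bidi_controls(text):
    """Replace each directional control in TEXT with a visible label."""
    return "".join(
        f"<{BIDI_CONTROLS[ch]}>" if ch in BIDI_CONTROLS else ch
        for ch in text
    )


def region_directionality(text):
    """Classify TEXT as 'ltr', 'rtl', 'mixed' or 'neutral'.

    Assumed definition: only characters with a strong bidi class count
    (L for left-to-right, R and AL for right-to-left).
    """
    has_ltr = has_rtl = False
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            has_ltr = True
        elif bidi_class in ("R", "AL"):
            has_rtl = True
    if has_ltr and has_rtl:
        return "mixed"
    if has_ltr:
        return "ltr"
    if has_rtl:
        return "rtl"
    return "neutral"
```

Under this definition a URL made of Latin characters is "ltr" even when
an RLO precedes it, which is exactly the combination the example above
makes suspicious.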