From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Bidirectional text and URLs
Date: Fri, 28 Nov 2014 16:54:58 +0200
Message-ID: <837fyfml31.fsf@gnu.org>
In-reply-to: <87a93cngwv.fsf@uwakimon.sk.tsukuba.ac.jp>
To: "Stephen J. Turnbull"
Cc: larsi@gnus.org, emacs-devel@gnu.org

> From: "Stephen J. Turnbull"
> Date: Fri, 28 Nov 2014 12:27:28 +0900
> Cc: emacs-devel@gnu.org
>
> Lars Magne Ingebrigtsen writes:
>
> > Using right-to-left markers to do phishing and obscure URLs has gotten
> > some attention on the webs today. For instance, can you easily tell
> > where the link below takes you if you click on it in Gnus and
> > (presumably) rmail?
>
> Eli's the expert

Not really, not in this particular field.

> but I would say that given that the UAX#9 bidi algorithm does what's
> wanted 99.44% of the time, it makes sense to mark text reordered by
> RTL markers with a warning face

That might be considered an annoyance by users of bidi scripts.
There's any number of perfectly valid URLs that use the same
formatting control characters.

What you suggest might be TRT when left-to-right text is enclosed
within directional override controls (which is what Lars did in his
example). These controls assign right-to-left directionality to all
the enclosed characters, which is indeed highly suspicious in URLs.

In addition to using a special face, another possibility is to present
the directional overrides in these cases in percent-hex notation,
which will disable their effect on the enclosed text.
Of course, this should only be done when the enclosed text is entirely
made of LTR characters and neutrals.

Like I said: we should first decide what we want to do in these cases,
and then look around for machinery to implement that.

> You do need a way to turn it off, or to make it reasonably smart, in
> the case of ASCII which is often mixed with other charsets.

Not sure what you mean here. Care to elaborate? "Turn off" how? And
how do you do that without unduly punishing perfectly valid URLs that
need these controls to avoid visual "jumbles"?
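
To make the percent-hex idea above a bit more concrete, here is a rough sketch in Python (not Emacs Lisp, and not a proposal for the actual implementation). It assumes the suspicious case is a RIGHT-TO-LEFT OVERRIDE (U+202E), optionally closed by POP DIRECTIONAL FORMATTING (U+202C), and that "LTR characters and neutrals" can be approximated as "no character of bidi class R or AL". The names `defang_overrides` and `only_ltr_and_neutrals` are made up for illustration.

```python
import re
import unicodedata
from urllib.parse import quote

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Assumption: "strong RTL" means bidi class R (e.g. Hebrew) or AL (Arabic letters).
STRONG_RTL = {"R", "AL"}

# An RLO, the text it governs (up to the next PDF or RLO), and an optional PDF.
_OVERRIDE_SPAN = re.compile("\u202e([^\u202c\u202e]*)(\u202c?)")


def only_ltr_and_neutrals(text):
    """Return True if TEXT contains no strong right-to-left character."""
    return not any(unicodedata.bidirectional(ch) in STRONG_RTL for ch in text)


def defang_overrides(url):
    """Percent-encode RLO/PDF around spans that hold only LTR text and neutrals.

    Spans that contain genuine RTL characters are returned unchanged, so
    URLs that legitimately use these controls are left alone.
    """
    def replace(match):
        inner, closer = match.group(1), match.group(2)
        if not only_ltr_and_neutrals(inner):
            return match.group(0)
        # quote() encodes the control as UTF-8 bytes in %XX form,
        # so it no longer affects the display order of INNER.
        return quote(RLO) + inner + quote(closer)

    return _OVERRIDE_SPAN.sub(replace, url)
```

With this sketch, "http://example.com/" followed by RLO, "gpj.exe" and PDF comes out as "http://example.com/%E2%80%AEgpj.exe%E2%80%AC", while the same URL with Hebrew letters between the controls is returned untouched. Whether that is the behavior we want is exactly the question that should be decided first.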