* Bidirectional text and URLs
@ 2014-11-28  2:51 Lars Magne Ingebrigtsen
  2014-11-28  3:27 ` Stephen J. Turnbull
                   ` (3 more replies)
  0 siblings, 4 replies; 133+ messages in thread
From: Lars Magne Ingebrigtsen @ 2014-11-28  2:51 UTC (permalink / raw)
  To: emacs-devel

Using right-to-left markers to do phishing and obscure URLs has gotten
some attention on the webs today. For instance, can you easily tell
where the link below takes you if you click on it in Gnus and
(presumably) rmail?

Works on URLs too.

http://myspace.com/#/segami/moc.koobecaf//:sptth

Unless I messed something up while cut'n'pasting that, you should see
the problem.

Now, should we do something about that? And if so -- what?

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply [flat|nested] 133+ messages in thread
* Bidirectional text and URLs
From: Stephen J. Turnbull @ 2014-11-28  3:27 UTC
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

Lars Magne Ingebrigtsen writes:

> Using right-to-left markers to do phishing and obscure URLs has gotten
> some attention on the webs today. For instance, can you easily tell
> where the link below takes you if you click on it in Gnus and
> (presumably) rmail?

Eli's the expert, but I would say that given that the UAX#9 bidi
algorithm does what's wanted 99.44% of the time, it makes sense to mark
text reordered by RTL markers with a warning face, and to the extent
that your UI recognizes URLs, you could even query the user:

    This link appears to have been obfuscated by using unusual
    characters or presentation techniques. This link points to

        http://myspace.com/#/...

    Is that your intended destination?

if you recognize that the URL was obfuscated (not limited to RTL, but
also out-of-block confusable characters such as a Cyrillic A in an
otherwise ASCII URL and HTML A elements where the displayed text
appears to be a URL that doesn't match the href, etc).

Personally I'll probably just add RTL characters to my .procmailrc, and
never see them in the first place. :-) Sorry about not noticing your
post, larsi! ;^)

> Works on URLs too.
>
> http://myspace.com/#/segami/moc.koobecaf//:sptth
>
> Unless I messed something up while cut'n'pasting that, you should see
> the problem.

Interestingly, it worked temporarily in Terminal.app but then stopped,
I'm not sure why. A wormy Apple, I guess! ;-)

> Now, should we do something about that? And if so -- what?

I think that the query and the statistical analysis of confusables are
likely to be a fair amount of work, if you want to avoid confusing the
user more than the obfuscation does.

A different face should be easy enough in cases where you have RTL
markers or mixed charset blocks. You do need a way to turn it off, or
to make it reasonably smart, in the case of ASCII which is often mixed
with other charsets.
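A minimal sketch of the "mixed charset blocks" check mentioned above, in Python rather than Emacs Lisp. Everything here is an assumption made for illustration: the function names are hypothetical, the "script" of a character is approximated by the first word of its Unicode character name, and only the host part of the URL is examined. Checking each dot-separated label separately is one way to be "reasonably smart" about ASCII mixed with other scripts: a non-ASCII label next to an ASCII one is not flagged, but a single label that mixes scripts is.

```python
import unicodedata
from urllib.parse import urlsplit


def script_of(ch):
    """Rough script label: the first word of the Unicode character name.

    This is an approximation for illustration, not a real Script
    property lookup.
    """
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def mixed_script_labels(url):
    """Return the host labels of URL whose letters come from several scripts."""
    host = urlsplit(url).hostname or ""
    flagged = []
    for label in host.split("."):
        scripts = {script_of(ch) for ch in label if ch.isalpha()}
        if len(scripts) > 1:
            flagged.append(label)
    return flagged


if __name__ == "__main__":
    # U+0430 is CYRILLIC SMALL LETTER A, hidden in an otherwise Latin label.
    print(mixed_script_labels("http://ex\u0430mple.com/"))
    print(mixed_script_labels("http://example.com/"))
```

A caller could apply a warning face only to the returned labels, which keeps the marking off hosts that are entirely in one script.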
* Re: Bidirectional text and URLs
From: Eli Zaretskii @ 2014-11-28 14:54 UTC
  To: Stephen J. Turnbull; +Cc: larsi, emacs-devel

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Date: Fri, 28 Nov 2014 12:27:28 +0900
> Cc: emacs-devel@gnu.org
>
> Lars Magne Ingebrigtsen writes:
>
> > Using right-to-left markers to do phishing and obscure URLs has gotten
> > some attention on the webs today. For instance, can you easily tell
> > where the link below takes you if you click on it in Gnus and
> > (presumably) rmail?
>
> Eli's the expert

Not really, not in this particular field.

> but I would say that given that the UAX#9 bidi algorithm does what's
> wanted 99.44% of the time, it makes sense to mark text reordered by
> RTL markers with a warning face

That might be considered an annoyance by users of bidi scripts.
There's any number of perfectly valid URLs that use the same
formatting control characters.

What you suggest might be TRT when left-to-right text is enclosed
within directional override controls (which is what Lars did in his
example). These controls assign right-to-left directionality to all
the enclosed characters, which is indeed highly suspicious in URLs.

In addition to using a special face, another possibility is to present
the directional overrides in these cases in percent-hex notation,
which will disable their effect on the enclosed text. Of course, this
should be only done when the enclosed text is entirely made of LTR
characters and neutrals.

Like I said: we should first decide what we want to do in these cases,
and then look around for machinery to implement that.

> You do need a way to turn it off, or to make it reasonably smart, in
> the case of ASCII which is often mixed with other charsets.

Not sure what you mean here. Care to elaborate?

"Turn off" how? And how do you do that without unduly punishing
perfectly valid URLs that need these controls to avoid visual
"jumbles"?
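A minimal sketch of the percent-hex idea from the message above, in Python rather than Emacs Lisp. The function name and the exact set of controls handled (RLO, LRO, PDF) are assumptions for illustration. Following the stated condition, the controls are rewritten only when the rest of the URL has no strong right-to-left characters (bidi classes R and AL); otherwise the string is returned unchanged so that legitimate RTL URLs are not disturbed.

```python
import unicodedata
from urllib.parse import quote

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
OVERRIDE_CONTROLS = {RLO, LRO, PDF}


def percent_hex_overrides(url):
    """Show override controls as %XX escapes when URL has no strong RTL text."""
    rest = [ch for ch in url if ch not in OVERRIDE_CONTROLS]
    if any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in rest):
        # Strong RTL characters are present: leave the URL alone.
        return url
    # quote() percent-encodes the UTF-8 bytes of each control character.
    return "".join(quote(ch) if ch in OVERRIDE_CONTROLS else ch for ch in url)


if __name__ == "__main__":
    print(percent_hex_overrides(RLO + "http://myspace.com/#/segami/moc.koobecaf//:sptth"))
```

Because the escaped form contains no formatting character, the enclosed text is displayed in its logical order.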
* Re: Bidirectional text and URLs
From: Stephen J. Turnbull @ 2014-11-29  6:09 UTC
  To: Eli Zaretskii; +Cc: larsi, emacs-devel

Eli Zaretskii writes:

> Not really, not in this particular field.
>
> > but I would say that given that the UAX#9 bidi algorithm does what's
> > wanted 99.44% of the time, it makes sense to mark text reordered by
> > RTL markers with a warning face
>
> That might be considered an annoyance by users of bidi scripts.
> There's any number of perfectly valid URLs that use the same
> formatting control characters.

Why? Because many displays don't implement UAX#9? Or is it because
UAX#9 defines segments in a way that would reorder the components of a
domain name or path? That is, the logical URL

    http://www.example.com/ABC/DEF/

is expected by a bidi reader to appear as

    http://www.example.com/CBA/FED/

but UAX#9 would display it as

    http://www.example.com/FED/CBA/

(here lowercase stands for characters whose natural direction is LTR,
and uppercase for characters whose natural direction is RTL)? (Or
perhaps the reverse misdisplay.)

Whatever the reason, I'd have to say that's too bad for users of bidi
languages, because that means *any* bidi URL is ambiguous, and
therefore subject to being deliberately obfuscated by reflection
and/or jumbling, regardless of the presence of directional controls.

> What you suggest might be TRT when left-to-right text is enclosed
> within directional override controls (which is what Lars did in his
> example). These controls assign right-to-left directionality to all
> the enclosed characters, which is indeed highly suspicious in URLs.

This isn't hard to detect. But there is also the case where you have
a word which is a different word when reflected. I assume that this
is the case in bidi languages as well, and of course any jumble is
possible as a domain or path component which is an abbreviation. And
any useful jumble can probably be registered as a domain, and
certainly incorporated in a path.

> In addition to using a special face, another possibility is to present
> the directional overrides in these cases in percent-hex notation,
> which will disable their effect on the enclosed text. Of course, this
> should be only done when the enclosed text is entirely made of LTR
> characters and neutrals.

Well, no. I assume that bidi readers are as vulnerable to phishing
and other frauds as non-bidi readers (hard as that may be to believe
for you bidi readers). That is not yet clear.

> > You do need a way to turn it off, or to make it reasonably smart, in
> > the case of ASCII which is often mixed with other charsets.
>
> Not sure what you mean here.

As above, where the domain name is ASCII and the path is RTL. Or the
path (or the domain) might be mixed.

> "Turn off" how?

"We need to decide what we want to do, and then look for a mechanism."

> And how do you do that without unduly punishing perfectly valid
> URLs that need these controls to avoid visual "jumbles"?

I hate to tell you, but the phishers have *already* started punishing
those perfectly valid URLs. You have a choice of punishment, that's
all: "jumbled display" vs. "defrauded users".

Except that as I say above, apparently all bidi URLs must now be
considered to offer suspicious display under some circumstances, so
maybe you have no choice about the defrauded users. In that case I
suppose avoiding jumbles does take precedence.
* Re: Bidirectional text and URLs
From: Eli Zaretskii @ 2014-11-29  8:22 UTC
  To: Stephen J. Turnbull; +Cc: larsi, emacs-devel

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: larsi@gnus.org,
>     emacs-devel@gnu.org
> Date: Sat, 29 Nov 2014 15:09:02 +0900
>
> > > but I would say that given that the UAX#9 bidi algorithm does what's
> > > wanted 99.44% of the time, it makes sense to mark text reordered by
> > > RTL markers with a warning face
> >
> > That might be considered an annoyance by users of bidi scripts.
> > There's any number of perfectly valid URLs that use the same
> > formatting control characters.
>
> Why? Because many displays don't implement UAX#9? Or is it because
> UAX#9 defines segments in a way that would reorder the components of a
> domain name or path? That is, the logical URL
>
> http://www.example.com/ABC/DEF/
>
> is expected by a bidi reader to appear as
>
> http://www.example.com/CBA/FED/
>
> but UAX#9 would display it as
>
> http://www.example.com/FED/CBA/

Yes. And there are worse examples (e.g., try an HTML link which
includes both a URL and a link text).

The problem here is that all those /, :, <, and > characters are
neutrals, so they take the direction of surrounding text, i.e. are
reversed for display when the surrounding text is RTL. In addition, <
and > are mirrored in that case. That can make quite a jumble.
(Unicode 6.3 added special handling for "paired-bracket" characters,
which makes the situation with < and > somewhat better, but we only
support that on master, Emacs 24.4 doesn't.)

> Whatever the reason, I'd have to say that's too bad for users of bidi
> languages, because that means *any* bidi URL is ambiguous, and
> therefore subject to being deliberately obfuscated by reflection
> and/or jumbling, regardless of the presence of directional controls.

I agree, but the issue discussed here is different: it's AFAIU about
users of LTR scripts that can fall victim to use of directional
controls that are by default (almost) invisible on Emacs display. I
think we would like to have at least that situation "handled" in some
way. My point above was that the way we handle that should not unduly
punish users of bidi scripts, i.e. legitimate uses of these controls.

> > What you suggest might be TRT when left-to-right text is enclosed
> > within directional override controls (which is what Lars did in his
> > example). These controls assign right-to-left directionality to all
> > the enclosed characters, which is indeed highly suspicious in URLs.
>
> This isn't hard to detect. But there is also the case where you have
> a word which is a different word when reflected.

If we have a dictionary, we can detect that, too. If we don't, then
detecting only the enclosed-LTR case is better than nothing, I think.

Another possibility is to modify the way these control characters are
displayed by manipulating their entries in the glyphless-char-display
char-table. It should probably be enough to display them as hex-code
in a box, to make the user aware of the possible problem. This should
be done by applications that display URLs, like eww, Gnus, Rmail,
etc.; not globally.

> I assume that this is the case in bidi languages as well

Yes, but that would require RTL text embedded in a left-to-right
overriding embedding, which is easily detectable, like the opposite
case that started this thread.

> and of course any jumble is possible as a domain or path component
> which is an abbreviation. And any useful jumble can probably be
> registered as a domain, and certainly incorporated in a path.

I doubt that a domain like this could be registered, as using such
characters in a domain name is AFAIU against the regulations, see
RFC3987.

> > In addition to using a special face, another possibility is to present
> > the directional overrides in these cases in percent-hex notation,
> > which will disable their effect on the enclosed text. Of course, this
> > should be only done when the enclosed text is entirely made of LTR
> > characters and neutrals.
>
> Well, no. I assume that bidi readers are as vulnerable to phishing
> and other frauds as non-bidi readers (hard as that may be to believe
> for you bidi readers). That is not yet clear.

The easy cases with RTL text, as mentioned above, should be also
easily detectable, and I agree they should get the same treatment.

> > > You do need a way to turn it off, or to make it reasonably smart, in
> > > the case of ASCII which is often mixed with other charsets.
> >
> > Not sure what you mean here.
>
> As above, where the domain name is ASCII and the path is RTL. Or the
> path (or the domain) might be mixed.
>
> > "Turn off" how?
>
> "We need to decide what we want to do, and then look for a mechanism."

OK, let me rephrase: what effect will "turning off" have on display?

> > And how do you do that without unduly punishing perfectly valid
> > URLs that need these controls to avoid visual "jumbles"?
>
> I hate to tell you, but the phishers have *already* started punishing
> those perfectly valid URLs. You have a choice of punishment, that's
> all: "jumbled display" vs. "defrauded users".

I very much hope we will find a sane middle ground, possibly subject
to user control. I'd hate to see Emacs become another case of the TSA
disaster.

> Except that as I say above, apparently all bidi URLs must now be
> considered to offer suspicious display under some circumstances, so
> maybe you have no choice about the defrauded users. In that case I
> suppose avoiding jumbles does take precedence.

Once we decide which cases we want to avoid or flag, we could be smart
there, by comparing the original and reordered strings, perhaps aided
by some dictionary lookup.

The infrastructure is either already there or easy to add. It's
"just" a matter of deciding what to do and when. Someone(TM) should
present a list of well-thought-out requirements, and we can take it
from there.
* Re: Bidirectional text and URLs
From: Richard Stallman @ 2014-11-29 17:05 UTC
  To: Eli Zaretskii; +Cc: stephen, larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > Whatever the reason, I'd have to say that's too bad for users of bidi
> > languages, because that means *any* bidi URL is ambiguous, and
> > therefore subject to being deliberately obfuscated by reflection
> > and/or jumbling, regardless of the presence of directional controls.

> I agree, but the issue discussed here is different: it's AFAIU about
> users of LTR scripts that can fall victim to use of directional
> controls that are by default (almost) invisible on Emacs display.

We need to address both issues --- with two different solutions, if
necessary.

I have a feeling that the problem that LTR URLs get reordered
strangely must have presented itself in other software, such as
browsers. What do they do about it?

If the host NAME isn't confused, perhaps it is not really dangerous.
So perhaps it is enough to make sure to avoid confusion about the host
name.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call.
* Re: Bidirectional text and URLs
From: Lars Magne Ingebrigtsen @ 2014-11-29 17:13 UTC
  To: Richard Stallman; +Cc: Eli Zaretskii, stephen, emacs-devel

Richard Stallman <rms@gnu.org> writes:

> I have a feeling that the problem that LTR URLs get reordered
> strangely must have presented itself in other software, such as
> browsers. What do they do about it?

Most browsers do nothing about it -- Firefox, for instance, will just
display the reordered URL, and clicking it will take you to unexpected
places.

While this problem has existed for years, it seems like it's only been
getting attention lately, and perhaps the other browser maintainers
are also scratching their heads about what the right approach to take
here is...
* Re: Bidirectional text and URLs
From: Lars Magne Ingebrigtsen @ 2014-11-29 17:49 UTC
  To: emacs-devel

Phishing using this method is a problem mainly on the web and in mail,
so I wonder whether the solution we're looking for would be applied to
mail and web modes instead of having a more general mechanism.

It seems pretty clear that stuff like

http://myspace.com/#/segami/moc.koobecaf//:sptth

where you have a buffer with only left-to-right text, but then you
have a single right-to-left indicator, is suspicious. And since Latin
characters are strongly left-to-right, you don't get confusing URLs in
the middle of right-to-left text:

הממשלה בכך שהוא http://myspace.com/#/segami/moc.koobecaf//:sptth "משתף פעולה עם

(I hope that's nothing rude, I just cut'n'pasted text at random from a
Hebrew web page.)

So... would a possible solution here be as simple as removing all
right-to-left indicators in mail and web modes if those right-to-left
indicators apply to URLs? That is, after the modes mark the regions
they think are URLs, they would check whether any RTL characters apply
to those regions?

But currently Emacs doesn't really have a mechanism for querying the
directionality of a buffer region, I think?
* Re: Bidirectional text and URLs
From: Lars Magne Ingebrigtsen @ 2014-11-29 17:54 UTC
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> So... would a possible solution here be as simple as removing all
> right-to-left indicators in mail and web modes if those right-to-left
> indicators apply to URLs?

Or even simpler: The URL-finding functions would explicitly place
left-to-right markers over the bits of the URL that have left-to-right
characters if there are any RTL markers in the buffer.

This would make all the bits that say "http://example.com" etc be
left-to-right, and if there are bits later in the URL that contain,
say, Hebrew, those would still be displayed correctly.
* Re: Bidirectional text and URLs
From: Eli Zaretskii @ 2014-11-29 18:24 UTC
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Sat, 29 Nov 2014 18:54:43 +0100
>
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
> > So... would a possible solution here be as simple as removing all
> > right-to-left indicators in mail and web modes if those right-to-left
> > indicators apply to URLs?
>
> Or even simpler: The URL-finding functions would explicitly place
> left-to-right markers over the bits of the URL that have left-to-right
> characters if there are any RTL markers in the buffer.
>
> This would make all the bits that say "http://example.com" etc be
> left-to-right, and if there are bits in the URL later that contains,
> say, Hebrew, those would still be displayed correctly.

Please don't: you will never be able to do that correctly without
re-implementing bidi.c in Lisp. The UBA rules are much more complex
than what you seem to envision; in particular, a character can be
neither RTL nor LTR (so-called "weak" and "neutral" characters, like
the slash and the period).

In any case, I think what you suggest is too drastic. We don't need
to change the display of these URLs from their intended one, we just
need to make the user aware of the possible phishing. E.g., with your
suggestion, a Web page that explains how the URL you posted at the
beginning could be dangerous won't be able to make its point clearly
visible ;-)
* Re: Bidirectional text and URLs
From: Lars Magne Ingebrigtsen @ 2014-11-29 18:29 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> In any case, I think what you suggest is too drastic. We don't need
> to change the display of these URLs from their intended one, we just
> need to make the user aware of the possible phishing. E.g., with your
> suggestion, a Web page that explain how the URL you posted at the
> beginning could be dangerous won't be able to make its point clearly
> visible ;-)

Yeah, that's true.
* Re: Bidirectional text and URLs
From: Richard Stallman @ 2014-11-30  9:38 UTC
  To: Eli Zaretskii; +Cc: larsi, emacs-devel

> Please don't: you will never be able to do that correctly without
> re-implementing bidi.c in Lisp.

Rather than re-implementing bidi.c in Lisp, I suggest we provide
primitives to make all the relevant inquiries from Lisp code
through the same code in bidi.c.
* Re: Bidirectional text and URLs
From: Eli Zaretskii @ 2014-11-30 15:21 UTC
  To: rms; +Cc: larsi, emacs-devel

> Date: Sun, 30 Nov 2014 04:38:22 -0500
> From: Richard Stallman <rms@gnu.org>
> CC: larsi@gnus.org, emacs-devel@gnu.org
>
> Rather than re-implementing bidi.c in Lisp, I suggest we provide
> primitives to make all the relevant inquiries from Lisp code
> through the same code in bidi.c.

I agree, and we already have that for every inquiry of this kind that
surfaced until now. One example is current-bidi-paragraph-direction.

The issue here is what exactly is the inquiry we are talking about
this time. I don't yet see what exactly is required. Maybe someone
else does.
* Re: Bidirectional text and URLs
From: Eli Zaretskii @ 2014-11-29 18:18 UTC
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Date: Sat, 29 Nov 2014 18:49:21 +0100
>
> It seems pretty clear that stuff like
>
> http://myspace.com/#/segami/moc.koobecaf//:sptth
>
> where you have a buffer with only left-to-right text, but then you have
> a single right-to-left indicator, is suspicious.

The "single right-to-left indicator" is a fallacy: the correct use of
these formatting controls calls for a U+202E RIGHT-TO-LEFT OVERRIDE
(RLO) character before the text and a U+202C POP DIRECTIONAL
FORMATTING (PDF) character after the text. Your example only works
because the UBA mandates that all embeddings end at the end of a
physical line, so omitting a PDF here doesn't affect the display,
since the URL stands out on its own line.

So you could actually see a URL enclosed in the RLO..PDF pair as well,
and we need to handle that in the same manner.

> And since Latin characters are strongly left-to-right, you don't get
> confusing URLs in the middle of right-to-left text:

As Stephen pointed out earlier, the same effect can be achieved with
RTL text by using the LRO..PDF embedding (LRO is U+202D).

> So... would a possible solution here be as simple as removing all
> right-to-left indicators in mail and web modes if those right-to-left
> indicators apply to URLs?

I think instead of removing them it is better to display them
prominently, e.g., by changing their entry in the
glyphless-char-display char-table. The advantage is that you don't
accidentally harm the display where these controls are used
legitimately, and OTOH make their presence acutely evident.

> But currently Emacs doesn't really have a mechanism for querying the
> directionality of a buffer region, I think?

What do you mean by "directionality of a buffer region"? At least
under some definitions of that, I can think of a very easy
implementation.
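A minimal sketch of the detection described in the message above, in Python rather than Emacs Lisp. The function names are hypothetical, and the model is deliberately simplified: an override run starts at RLO or LRO and ends at the next PDF or, if there is none, at the end of the line; nested embeddings and the rest of the UBA are ignored. A run is treated as worth flagging when an RLO encloses only left-to-right and neutral/weak characters, or when an LRO encloses strong right-to-left characters and no left-to-right ones.

```python
import unicodedata

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def override_runs(text):
    """Return (control, enclosed_text) pairs, one per override run.

    A missing PDF is closed implicitly at the end of the line.
    """
    runs = []
    for line in text.splitlines():
        i = 0
        while i < len(line):
            if line[i] in (RLO, LRO):
                end = line.find(PDF, i + 1)
                stop = len(line) if end == -1 else end
                runs.append((line[i], line[i + 1:stop]))
                i = stop + 1
            else:
                i += 1
    return runs


def looks_forced(control, enclosed):
    """True if the override reverses text of the opposite natural direction."""
    classes = {unicodedata.bidirectional(ch) for ch in enclosed}
    has_ltr = "L" in classes
    has_rtl = bool(classes & {"R", "AL"})
    if control == RLO:
        return has_ltr and not has_rtl
    return has_rtl and not has_ltr


if __name__ == "__main__":
    sample = "see " + RLO + "http://myspace.com/#/segami/moc.koobecaf//:sptth"
    for control, enclosed in override_runs(sample):
        print(hex(ord(control)), looks_forced(control, enclosed))
```

A mail or web mode could use such a predicate to decide when to display the controls prominently, leaving runs that enclose text of the matching direction untouched.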
* Re: Bidirectional text and URLs
From: Lars Magne Ingebrigtsen @ 2014-11-29 18:33 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> I think instead of removing them it is better to display them
> prominently, e.g., by changing their entry in the
> glyphless-char-display char-table. The advantage is that you don't
> accidentally harm the display where these controls are used
> legitimately, and OTOH make their presence acutely evident.

Yeah, isn't that a bit too intrusive if done generally? If we display
these markers very visibly, then buffers where they are legitimately
used would be kinda ugly. And I don't think users would necessarily
know that the URL is displayed the wrong way around just because
there's an ugly control character displayed before or after the URL...

>> But currently Emacs doesn't really have a mechanism for querying the
>> directionality of a buffer region, I think?
>
> What do you mean by "directionality of a buffer region"? At least
> under some definitions of that, I can think of a very easy
> implementation.

When hitting RET on a URL, the function that handles that could ask
Emacs "is the http://domain.com bit displayed RTL or LTR"? If it's
RTL, then that function could "are you sure?" the user.
* Re: Bidirectional text and URLs
From: Eli Zaretskii @ 2014-11-29 18:47 UTC
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Sat, 29 Nov 2014 19:33:47 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > I think instead of removing them it is better to display them
> > prominently, e.g., by changing their entry in the
> > glyphless-char-display char-table. The advantage is that you don't
> > accidentally harm the display where these controls are used
> > legitimately, and OTOH make their presence acutely evident.
>
> Yeah, isn't that a bit too intrusive if done generally?

I didn't suggest to do that generally, just in Web pages. These
format controls are discouraged in Web pages anyway; the use of HTML
bidi markup dir="rtl" etc. is advised instead.

> If we display these markers very visibly, then buffers where they
> are legitimately used would be kinda ugly.

I don't know why it would be "ugly". The text will still be displayed
correctly, so it will be as legible as with our current bidi display.

> And I don't think users would necessarily know that the URL is
> displayed the wrong way around just because there's an ugly control
> character displayed before or after the URL...

I think the existence of a strange unprintable character in or around
a URL should attract attention, which is all we need to accomplish.

> >> But currently Emacs doesn't really have a mechanism for querying the
> >> directionality of a buffer region, I think?
> >
> > What do you mean by "directionality of a buffer region"? At least
> > under some definitions of that, I can think of a very easy
> > implementation.
>
> When hitting RET on an URL, the function that handles that could ask
> Emacs "is the http://domain.com bit displayed RTL or LTR"? If it's RTL,
> then that function could "are you sure?" the user.

You just replaced one not well-defined term with another. So now my
question becomes what do you mean by "displayed RTL or LTR"?

And mind you: the "domain" part can legitimately consist of RTL
characters, if my reading of the respective RFCs is correct.
* Re: Bidirectional text and URLs
From: Andreas Schwab @ 2014-11-29 19:12 UTC
  To: Eli Zaretskii; +Cc: Lars Magne Ingebrigtsen, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Lars Magne Ingebrigtsen <larsi@gnus.org>
>> Cc: emacs-devel@gnu.org
>> Date: Sat, 29 Nov 2014 19:33:47 +0100
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> > I think instead of removing them it is better to display them
>> > prominently, e.g., by changing their entry in the
>> > glyphless-char-display char-table. The advantage is that you don't
>> > accidentally harm the display where these controls are used
>> > legitimately, and OTOH make their presence acutely evident.
>>
>> Yeah, isn't that a bit too intrusive if done generally?
>
> I didn't suggest to do that generally, just in Web pages. These
> format controls are discouraged in Web pages anyway; the use of HTML
> bidi markup dir="rtl" etc. is advised instead.

But the problem at hand is not relevant to Web pages. The URL in an
anchor is always a separate entity. Only non-HTML text, where URLs are
made active by heuristics, is the case to worry about.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Bidirectional text and URLs 2014-11-29 19:12 ` Andreas Schwab @ 2014-11-29 19:31 ` Lars Magne Ingebrigtsen 2014-11-29 19:39 ` Andreas Schwab 2014-11-29 20:13 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-29 19:31 UTC (permalink / raw) To: Andreas Schwab; +Cc: Eli Zaretskii, emacs-devel Andreas Schwab <schwab@linux-m68k.org> writes: > But the problem at hand is not relevant to Web pages. The URL in an > anchor is always a separate entity. Only non-HTML text where URLs are > made active by heuristics are the case to worry about. It's sort of relevant to web pages, too: `M-x eww RET http://permalink.gmane.org/gmane.emacs.devel/178392 RET' Of course, the <a> text could just contain "http://facebook.com" without any RTL, like <a href="http://myspace.com">http://facebook.com</a> shr should warn when the <a> text is also an URL and when it's different from the href. But in this case, the <a> text and the href are identical, so that check wouldn't do anything helpful, and the user would end up on the dreaded Myspace... -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
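The check Lars describes for shr (warn when the <a> text is itself a URL but differs from the href) might look something like the sketch below. The function name my-link-text-mismatch-p is hypothetical, and comparing only the host parts is an assumption made to keep the example short; it is not how shr is known to behave.

```elisp
(require 'url-parse)
(require 'subr-x)

(defun my-link-text-mismatch-p (text href)
  "Return t if link TEXT looks like a URL whose host differs from HREF's.
Return nil if TEXT does not look like an http(s) URL at all."
  (let ((text (string-trim text)))
    (and (string-match-p "\\`https?://" text)
         (not (string= (downcase (or (url-host (url-generic-parse-url text))
                                     ""))
                       (downcase (or (url-host (url-generic-parse-url href))
                                     "")))))))

;; Example calls:
;; (my-link-text-mismatch-p "http://facebook.com" "http://myspace.com")
;; (my-link-text-mismatch-p "http://myspace.com" "http://myspace.com")
```

As the message above points out, a check like this does nothing for the reversed-URL case, because there the text and the href are identical.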
* Re: Bidirectional text and URLs 2014-11-29 19:31 ` Lars Magne Ingebrigtsen @ 2014-11-29 19:39 ` Andreas Schwab 0 siblings, 0 replies; 133+ messages in thread From: Andreas Schwab @ 2014-11-29 19:39 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: Eli Zaretskii, emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > shr should warn when the <a> text is also an URL and when it's > different from the href. But in this case, the <a> text and the href > are identical, so that check wouldn't do anything helpful, and the user > would end up on the dreaded Myspace... You can always show the URL unambiguously before following it. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 133+ messages in thread
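One way to show a URL unambiguously before following it is to print it in logical order with everything outside printable ASCII escaped, so that no character in the prompt can trigger reordering. The sketch below assumes that approach; the names my-url-escape-non-ascii and my-confirm-url are hypothetical, and Andreas did not say which presentation he had in mind.

```elisp
(defun my-url-escape-non-ascii (url)
  "Return URL with each character outside printable ASCII shown as \\uXXXX."
  (mapconcat (lambda (char)
               (if (and (>= char 32) (< char 127))
                   (string char)
                 (format "\\u%04X" char)))
             url ""))

(defun my-confirm-url (url)
  "Ask whether to follow URL, showing it in logical order with escapes."
  (y-or-n-p (format "Follow %s? " (my-url-escape-non-ascii url))))
```

The drawback is that legitimate non-ASCII host names become unreadable in the prompt, so this is probably only suitable as a confirmation step for links already flagged as suspicious.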
* Re: Bidirectional text and URLs 2014-11-29 19:12 ` Andreas Schwab 2014-11-29 19:31 ` Lars Magne Ingebrigtsen @ 2014-11-29 20:13 ` Eli Zaretskii 1 sibling, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-29 20:13 UTC (permalink / raw) To: Andreas Schwab; +Cc: larsi, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: Lars Magne Ingebrigtsen <larsi@gnus.org>, emacs-devel@gnu.org > Date: Sat, 29 Nov 2014 20:12:24 +0100 > > >> Yeah, isn't that a bit too intrusive if done generally? > > > > I didn't suggest to do that generally, just in Web pages. These > > format controls are discouraged in Web pages anyway; the use of HTML > > bidi markup dir="rtl" etc. is advised instead. > > But the problem at hand is not relevant to Web pages. The URL in an > anchor is always a separate entity. Only non-HTML text where URLs are > made active by heuristics are the case to worry about. Then I guess it's even easier. Of course, we still have ffap and the likes, which do their thing even in general-purpose text. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-29 18:18 ` Eli Zaretskii 2014-11-29 18:33 ` Lars Magne Ingebrigtsen @ 2014-11-30 16:26 ` Lars Magne Ingebrigtsen 2014-11-30 17:29 ` Yuri Khan 2014-11-30 17:53 ` Eli Zaretskii 1 sibling, 2 replies; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-30 16:26 UTC (permalink / raw) To: emacs-devel Just a point of clarification: When people embed URLs in paragraphs with mainly right-to-left script (like Hebrew), do they expect to see http://myspace.com or ?http://myspace.com (If I did that correctly, the latter URL should have an RLO character preceding it so that it reads right to left.) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 16:26 ` Lars Magne Ingebrigtsen @ 2014-11-30 17:29 ` Yuri Khan 2014-11-30 17:57 ` Lars Magne Ingebrigtsen 2014-11-30 17:53 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Yuri Khan @ 2014-11-30 17:29 UTC (permalink / raw) To: Emacs developers On Sun, Nov 30, 2014 at 10:26 PM, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: > Just a point of clarification: When people embed URLs in paragraphs with > mainly right-to-left script (like Hebrew), do they expect to see > http://myspace.com or ?http://myspace.com As a person who has never spoken or written an RTL language but who understands the logic behind RTL, I think in an RTL context I might expect a rendering which is visually identical to that of com.myspace//:http or maybe com.myspace\\:http. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 17:29 ` Yuri Khan @ 2014-11-30 17:57 ` Lars Magne Ingebrigtsen 2014-11-30 18:18 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-30 17:57 UTC (permalink / raw) To: Yuri Khan; +Cc: Emacs developers Yuri Khan <yuri.v.khan@gmail.com> writes: > As a person who has never spoken or written an RTL language but who > understands the logic behind RTL, I think in an RTL context I might > expect a rendering which is visually identical to that of > com.myspace//:http or maybe com.myspace\\:http. Well, I had a look at a Hebrew mailing list, and I found paragraphs like המילון שאמור לרכז את המונחים ולתקנן את המינוח העברי במיזמי הקוד הפתוח הינו מילון כרמ"ל (כרמל איננה רשימת מילים לתרגום), ניתן למצוא את המילון בכתובת: http://carmel.whatsup.org.il and בניתי חבילה לקבצי התרגום לעברית של אופן אופיס לארצ'. הבעייה שאני לא יודע מה הרשיון שלה. http://aur.archlinux.org/packages.php?do_Details=1&ID=9791 where all the URLs are displayed left-to-right. I don't know whether this is a representative sample, though. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 17:57 ` Lars Magne Ingebrigtsen @ 2014-11-30 18:18 ` Eli Zaretskii 0 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 18:18 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel, yuri.v.khan > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Sun, 30 Nov 2014 18:57:33 +0100 > Cc: Emacs developers <emacs-devel@gnu.org> > > Well, I had a look at a Hebrew mailing list, and I found paragraphs like > המילון שאמור לרכז את המונחים ולתקנן את המינוח העברי במיזמי הקוד הפתוח > הינו מילון כרמ"ל (כרמל איננה רשימת מילים לתרגום), ניתן למצוא את המילון > בכתובת: http://carmel.whatsup.org.il > and > בניתי חבילה לקבצי התרגום לעברית של אופן אופיס לארצ'. הבעייה שאני לא > יודע מה הרשיון שלה. > http://aur.archlinux.org/packages.php?do_Details=1&ID=9791 > where all the URLs are displayed left-to-right. I don't know whether > this is a representative sample, though. They are. Since all the characters in the URL are either strong LTR or weak/neutral characters, the entire URL is displayed left to right, no matter whether the paragraph's base direction is LTR or RTL. But if the part after the "?" will include RTL characters, that part will be rendered right to left. ^ permalink raw reply [flat|nested] 133+ messages in thread
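Eli's point, that a URL made only of strong LTR and weak/neutral characters is displayed left to right whatever the paragraph direction, can be checked per character with get-char-code-property, the same call used elsewhere in this thread. The helper names below are hypothetical; this is a small sketch, not an implementation of the reordering itself.

```elisp
(require 'seq)

(defun my-bidi-classes (string)
  "Return the list of Unicode bidi classes of the characters in STRING."
  (mapcar (lambda (char) (get-char-code-property char 'bidi-class))
          string))

(defun my-string-has-strong-rtl-p (string)
  "Return t if STRING contains a strong right-to-left character.
Strong RTL characters are those whose bidi class is R or AL."
  (and (seq-some (lambda (char)
                   (memq (get-char-code-property char 'bidi-class) '(R AL)))
                 string)
       t))

;; Example calls:
;; (my-string-has-strong-rtl-p "http://carmel.whatsup.org.il")
;; (my-string-has-strong-rtl-p "http://אבג.דהוזחט.קום")
```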
* Re: Bidirectional text and URLs 2014-11-30 16:26 ` Lars Magne Ingebrigtsen 2014-11-30 17:29 ` Yuri Khan @ 2014-11-30 17:53 ` Eli Zaretskii 2014-11-30 18:13 ` Lars Magne Ingebrigtsen 1 sibling, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 17:53 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Sun, 30 Nov 2014 17:26:33 +0100 > > Just a point of clarification: When people embed URLs in paragraphs with > mainly right-to-left script (like Hebrew) Let's clear up terminology first, OK? There's no distinction in bidi display and bidi scripts between "paragraphs with mainly right-to-left scripts" and "paragraphs with mainly left-to-right scripts". Instead, there's "the base direction of a paragraph", which can be either left-to-right (LTR) or right-to-left (RTL). The former is displayed with the first character (in the _visual_ order!) at the left edge of the window, while the latter at the right edge. It is true that the LTR paragraphs make most sense when most of the paragraph text is made of LTR characters, and the RTL paragraphs in the opposite case. But nothing prevents me from having a paragraph whose base direction is LTR which is nevertheless full of RTL characters. It is entirely legitimate and sometimes even necessary. Emacs determines the base direction of a paragraph by searching for the first strong directional character in the paragraph (this is a simplification, the actual rules described in the UBA are more complex). Buffer-local variable bidi-paragraph-direction overrides this dynamic calculation and forces a specific base direction on all paragraphs of the buffer. With this out of our way, I will assume that you were asking about URLs that are part of paragraphs whose base direction is RTL. Now let's go back to your question: > do they expect to see http://myspace.com or ?http://myspace.com The answer to your question is "it depends". 
Here are 3 examples; to see them as I intended, make sure you are viewing them in a buffer whose bidi-paragraph-direction is set to nil:

abc http://אבג.דהוזחט.קום

אבג http://foo.bar.com

אבג http://אבג.דהוזחט.קום

The leading 3 letters (1 would be enough) cause Emacs to decide that the paragraph has LTR base direction in the 1st example and RTL base direction in the last 2 examples. Now move the cursor with C-f from the beginning of each of these three lines (you can get to the beginning of a line with C-a or Home, as usual), and I hope you will see what's going on: cursor movement with C-f follows the "reading order", i.e. the order in which a human is supposed to read these URLs.

To summarize: Latin characters are displayed left to right, even in RTL paragraphs, while right-to-left characters are always displayed right to left. Neutral characters (slash, period) take the direction of the surrounding text.

> (If I did that correctly, the latter URL should have an RLO character > preceding it so that it reads right to left.)

As you see above, there's no need to use any directional overrides to see what users expect: Emacs does that automatically, by following the Unicode Bidirectional Algorithm (UBA). You just need to arrange for the paragraph to have an RTL base direction, which is very easy, as shown above.

RLO and LRO (and the other directional control characters) are needed when you need to override the normal reordering for some reason, typically because you want punctuation characters to take a different directionality from their default. This is rarely needed when rendering URLs.

HTH

May I ask why you came up with the question? ^ permalink raw reply [flat|nested] 133+ messages in thread
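The base direction Emacs picks for each of the examples above can also be queried from Lisp with current-bidi-paragraph-direction, instead of being inferred from cursor movement. The wrapper name my-paragraph-direction-of is hypothetical, and the sketch assumes each example is inserted as a paragraph of its own.

```elisp
(defun my-paragraph-direction-of (text)
  "Return the base direction Emacs computes for TEXT as one paragraph.
The value is the symbol `left-to-right' or `right-to-left'."
  (with-temp-buffer
    ;; nil means: decide dynamically from the paragraph's text.
    (setq bidi-paragraph-direction nil)
    (insert text)
    (goto-char (point-min))
    (current-bidi-paragraph-direction)))

;; Example calls, one per example paragraph:
;; (my-paragraph-direction-of "abc http://אבג.דהוזחט.קום")
;; (my-paragraph-direction-of "אבג http://foo.bar.com")
```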
* Re: Bidirectional text and URLs 2014-11-30 17:53 ` Eli Zaretskii @ 2014-11-30 18:13 ` Lars Magne Ingebrigtsen 2014-11-30 19:06 ` Lars Magne Ingebrigtsen ` (2 more replies) 0 siblings, 3 replies; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-30 18:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Let's clear up terminology first, OK? Thanks for the explanation. > To summarize: Latin characters are displayed left to right, even in > RTL paragraphs, while right-to-left characters are always displayed > right to left. Neutral characters (slash, period) take the direction > of the surrounding text. Right. > HTH It does, yes. > May I ask why you came up with the question? Because I was wondering whether my suggestion from yesterday (that we insert LRO/PDF characters into URLs if there is an LRO present in the buffer when recognising URLs) is at all feasible, and from your explanation, it seems like it would be. And it would not require reimplementing bidi.c in Lisp. I agreed with your objection that if we used such a scheme, then the discussion we're doing here would look pretty incomprehensible. However, thinking about it a bit more, this is really favouring meta-discussion over usage, and I think we should be leery of doing that. Here's my proposal again, fleshed out with examples, for the algorithms that recognise (and make buttons out of) URLs and the like in email (etc.) buffers: 1) If there are no right-to-left overrides in the buffer, then do nothing special. This will cover 99.996% of all buffers. 2) If there is an LRO in the buffer, then, after recognising an URL, it is further treated. * If it contains no strongly right-to-left characters, we just wrap it in an LRO/PDF pair. URLs like "http://myspace.com" will then be guaranteed to be displayed reading left-to-right. 
* If the URL is like http://אבג.דהוזחט.קום, we would segment the URL into strongly-left-to-right-with-weak-chars and strongly-right-to-left-with-weak-chars segments. We wrap each left-to-right-with-weak-chars segment in an LRO/PDF pair. For that URL, this would be LRO http:// PDF אבג.דהוזחט.קום Emacs already exposes the weak/strong/LTR/RTL status of each character, so a function to do this LRO/PDF insertion is trivial. It's like a seven-line Elisp function or something. From what you say, it sounds like it would make the display of these URLs acceptable for bidi readers, too -- this would be the normal display of these URLs, anyway. The only thing we're protecting the users from is shenanigans. And discussions like this, of course, since all the URLs would display "correctly". :-) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 18:13 ` Lars Magne Ingebrigtsen @ 2014-11-30 19:06 ` Lars Magne Ingebrigtsen 2014-11-30 19:10 ` Lars Magne Ingebrigtsen 2014-11-30 19:19 ` Lars Magne Ingebrigtsen 2014-11-30 21:05 ` Eli Zaretskii 2 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-30 19:06 UTC (permalink / raw) To: emacs-devel

Ok, it was a bit longer than 7 lines, and I'm not quite sure what to do about further embedded LRM, RLM, ALM, LRE, RLE, LRO, RLO, PDF, LRI, RLI, FSI, PDI characters (perhaps just drop them?), but here's a kinda weak proof of concept. Eval the following, and you sort of get what you'd expect.

  (concat (string ?\x202e)
          "---"
          (ensure-left-to-right-string
           "http://אבג.דהוזחט.קום/yes/indeed.קום///"))

?\x202e is the right-to-left override. Compare with the output you get if you don't hack up the URL and sprinkle LROs:

  (concat (string ?\x202e)
          "---"
          "http://אבג.דהוזחט.קום/yes/indeed.קום///")

  (require 'cl-lib)

  (defun ensure-left-to-right-string (string)
    (let ((prev (get-char-code-property (aref string 0) 'bidi-class))
          (start 0)
          (pos 0)
          (bits nil)
          current)                      ; bidi class of the char at POS
      (while (< pos (length string))
        (setq current (get-char-code-property (aref string pos) 'bidi-class))
        ;; Cut a new segment whenever the strong direction flips.
        (when (or (and (eq prev 'L)
                       (memq current '(R AL)))
                  (and (memq prev '(R AL))
                       (eq current 'L)))
          (push (substring string start pos) bits)
          (when (memq current '(L R AL))
            (setq prev current))
          (setq start pos))
        (cl-incf pos))
      (push (substring string start pos) bits)
      (mapconcat
       (lambda (bit)
         (if (cl-notany (lambda (char)
                          (memq (get-char-code-property char 'bidi-class)
                                '(R AL)))
                        bit)
             ;; Wrap the string in LRO and PDF.
             (concat (string ?\x202d) bit (string ?\x202C))
           ;; And RLO and PDF for the right-to-left bits.
           (concat (string ?\x202e) bit (string ?\x202C))))
       (nreverse bits)
       "")))

-- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 19:06 ` Lars Magne Ingebrigtsen @ 2014-11-30 19:10 ` Lars Magne Ingebrigtsen 2014-11-30 20:41 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-30 19:10 UTC (permalink / raw) To: emacs-devel

Bug fix! Leading neutralish characters would defeat it.

  (require 'cl-lib)

  (defun ensure-left-to-right-string (string)
    (let ((prev (get-char-code-property (aref string 0) 'bidi-class))
          (start 0)
          (pos 0)
          (bits nil)
          current)                      ; bidi class of the char at POS
      (while (< pos (length string))
        (setq current (get-char-code-property (aref string pos) 'bidi-class))
        ;; Cut a new segment whenever the strong direction flips.
        (when (or (and (eq prev 'L)
                       (memq current '(R AL)))
                  (and (memq prev '(R AL))
                       (eq current 'L)))
          (push (substring string start pos) bits)
          (setq start pos))
        ;; Track the last strong class on every character, so that
        ;; leading neutral characters no longer pin PREV.
        (when (memq current '(L R AL))
          (setq prev current))
        (cl-incf pos))
      (push (substring string start pos) bits)
      (mapconcat
       (lambda (bit)
         (if (cl-notany (lambda (char)
                          (memq (get-char-code-property char 'bidi-class)
                                '(R AL)))
                        bit)
             ;; Wrap the string in LRO and PDF.
             (concat (string ?\x202d) bit (string ?\x202C))
           ;; And RLO and PDF for the right-to-left bits.
           (concat (string ?\x202e) bit (string ?\x202C))))
       (nreverse bits)
       "")))

-- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 19:10 ` Lars Magne Ingebrigtsen @ 2014-11-30 20:41 ` Eli Zaretskii 0 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 20:41 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Sun, 30 Nov 2014 20:10:29 +0100 > > Bug fix! Leading neutralish characters would defeat it. You are well on your way to re-implement bidi.c. Good luck. That's not how this problem should be handled. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 18:13 ` Lars Magne Ingebrigtsen 2014-11-30 19:06 ` Lars Magne Ingebrigtsen @ 2014-11-30 19:19 ` Lars Magne Ingebrigtsen 2014-11-30 21:05 ` Eli Zaretskii 2 siblings, 0 replies; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-30 19:19 UTC (permalink / raw) To: emacs-devel Lars Magne Ingebrigtsen <larsi@gnus.org> writes: > 1) If there are no right-to-left overrides in the buffer, then do > nothing special. This will cover 99.996% of all buffers. And with that I mean all the right-to-left indicators characters, I think. RLE, RLM, ALM, etc. Probably. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
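Step 1 of the proposal, with the clarification above that it should cover all the directional formatting characters, amounts to a cheap buffer scan. A minimal sketch follows; the names my-bidi-control-regexp and my-buffer-has-bidi-controls-p are hypothetical, and the exact character set (the twelve controls listed earlier in the thread) is an assumption about what "etc." should include.

```elisp
(defconst my-bidi-control-regexp
  ;; ALM, LRM, RLM, LRE..RLO (U+202A-U+202E), LRI..PDI (U+2066-U+2069)
  "[\u061c\u200e\u200f\u202a-\u202e\u2066-\u2069]"
  "Regexp matching one bidirectional formatting character.")

(defun my-buffer-has-bidi-controls-p ()
  "Return t if the current buffer contains a bidi formatting character."
  (save-excursion
    (goto-char (point-min))
    (and (re-search-forward my-bidi-control-regexp nil t) t)))
```

Buffers for which this returns nil would need no further URL treatment under the proposal.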
* Re: Bidirectional text and URLs 2014-11-30 18:13 ` Lars Magne Ingebrigtsen 2014-11-30 19:06 ` Lars Magne Ingebrigtsen 2014-11-30 19:19 ` Lars Magne Ingebrigtsen @ 2014-11-30 21:05 ` Eli Zaretskii 2014-11-30 21:36 ` Lars Magne Ingebrigtsen ` (2 more replies) 2 siblings, 3 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 21:05 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Sun, 30 Nov 2014 19:13:54 +0100 > Cc: emacs-devel@gnu.org > > Because I was wondering whether my suggestion from yesterday (that we > insert LRO/PDF characters into URLs if there is an LRO present in the > buffer when recognising URLs) is at all feasible, and from your > explanation, it seems like it would be. IMO, you are jumping to solutions too early, without a good understanding of the real problem. I also guess that you meant RLO, not LRO. The latter makes the embedded text render like strict left-to-right characters, so it doesn't need any special handling and cannot do any harm in URLs that use left-to-right characters (which is 99.99% of URLs). Can we please take a step back and try to identify the real problem here? What exactly are we trying to detect and handle? Is it true that we are trying to detect URLs whose characters got their "normal" bidirectional properties overridden by some directional control characters? If so, I can write a primitive that will take a region of buffer text and examine it to detect this. If it is something else, please tell what that is, and chances are you can have it without having to go through a crash course in UBA. In any way, it is IMO wrong to look for specific controls that you just happened to learn yesterday. They are not what you need to look for, they are just one sign of what you are looking for. The UBA is too complex an algorithm, and it keeps evolving, so chances are there will be more ways to do these tricks. 
You need to define what is it that you are looking for, not search for this or that sign. Next, given that you have detected the spoofed URL, what do you want to do with it? Do you want to highlight it, do you want to de-spoof (i.e. undo the spoofing) in some way, but still leave some indication of the fact that it was spoofed, or maybe you want to remove any trace of the spoofing as if it never happened (and leave the user oblivious to the fact it did)? Given the answers to those questions, there's any number of possible solutions that do NOT require inserting more directional controls. Some of the possible solutions were already mentioned in this thread. Here's another: cover the offending RLO with a display property showing whatever you want -- a warning sign, a smiley, a string made of a SPC character, anything. You can try it with your example: you will see the spoofing gone immediately. Why is this worse than inserting directional controls whose effect on the surrounding text can be far reaching? > 2) If there is an LRO in the buffer, then, after recognising an URL, it > is further treated. > > * If it contains no strongly right-to-left characters, we just wrap it > in an LRO/PDF pair. URLs like "http://myspace.com" will then be > guaranteed to be displayed reading left-to-right. > > * If the URL is like http://אבג.דהוזחט.קום, we would segment the URL > into strongly-left-to-right-with-weak-chars and > strongly-right-to-left-with-weak-chars segments. We wrap each > left-to-right-with-weak-chars in LRO/PDF pairs. This will change how these URLs are displayed, in a way that users will not like, and personally it sounds to me like another kind of phishing. > Emacs already exposes the weak/strong/LTR/RTL status of each character, > so function to do this LRO/PDF insertion is trivial. It's like a > seven-line Elisp function or something. It's easy to insert them, yes. But the effect is not what you or our users necessarily want. 
More importantly, there are better ways to deal with that, provided that we DEFINE WHAT PROBLEMS WE WANT TO SOLVE, AND HOW. > From what you say, sounds like it would make the display of these URLs > acceptable for bidi readers, too -- this would be the normal display of > these URLs, anyway. No, it isn't. You cannot get the correct display by overriding the bidi properties with LRO or its ilk. You can see the differences by moving point with C-f. ^ permalink raw reply [flat|nested] 133+ messages in thread
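The display-property approach suggested in this message (cover the offending RLO with a display string) could be tried with something like the sketch below. The function name my-cover-rlo is hypothetical, and covering every RLO in the region rather than only a suspicious one is a simplification made for illustration.

```elisp
(defun my-cover-rlo (start end)
  "Cover each U+202E between START and END with a one-space display string.
Return the number of characters covered.  The buffer text is unchanged;
only a `display' text property is added."
  (let ((count 0))
    (save-excursion
      (goto-char start)
      (while (search-forward (string #x202e) end t)
        (put-text-property (match-beginning 0) (match-end 0) 'display " ")
        (setq count (1+ count))))
    count))

;; To try it on a whole buffer:
;; (my-cover-rlo (point-min) (point-max))
```

Unlike inserting LRO/PDF pairs, this adds no new directional controls to the text, which is the property being argued for here.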
* Re: Bidirectional text and URLs 2014-11-30 21:05 ` Eli Zaretskii @ 2014-11-30 21:36 ` Lars Magne Ingebrigtsen 2014-12-01 3:45 ` Eli Zaretskii 2014-12-01 19:15 ` Richard Stallman 2014-12-01 19:15 ` Richard Stallman 2 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-30 21:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Can we please take a step back and try to identify the real problem > here? What exactly are we trying to detect and handle? Is it true > that we are trying to detect URLs whose characters got their "normal" > bidirectional properties overridden by some directional control > characters? If so, I can write a primitive that will take a region of > buffer text and examine it to detect this. Oh, great. My impression was that such functionality was off the table. > Next, given that you have detected the spoofed URL, what do you want > to do with it? Do you want to highlight it, do you want to de-spoof > (i.e. undo the spoofing) in some way, but still leave some indication > of the fact that it was spoofed, or maybe you want to remove any trace > of the spoofing as if it never happened (and leave the user oblivious > to the fact it did)? Yes, I want to unspoof the URL. Adding some markings to notify that this has been done would also be nice, perhaps by adding a 'warning face to the text or the like. > Given the answers to those questions, there's any number of possible > solutions that do NOT require inserting more directional controls. > Some of the possible solutions were already mentioned in this thread. > Here's another: cover the offending RLO with a display property > showing whatever you want -- a warning sign, a smiley, a string made > of a SPC character, anything. You can try it with your example: you > will see the spoofing gone immediately. Why is this worse than > inserting directional controls whose effect on the surrounding text > can be far reaching? 
RLOs are used legitimately, and I think the display you've selected for them now (a thin blank line) is good. So I don't want to uglify mail mode buffers just to handle this quite obscure URL UI problem. I mean, why shouldn't people be able to do this if they want to in a smooth way? (Ok, bad example, but these overrides are used legitimately in the bidi community, if I understand my extensive research correctly.) And displaying http://myspace.com/#/segami/moc.koobecaf//:sptth with a couple of visible control characters doesn't really solve the problem, because most people will still assume that that's a link to Facebook, not to Myspace. Most people are not even aware that this bidi stuff exists. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 21:36 ` Lars Magne Ingebrigtsen @ 2014-12-01 3:45 ` Eli Zaretskii 2014-12-01 16:19 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 3:45 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Cc: emacs-devel@gnu.org > Date: Sun, 30 Nov 2014 22:36:41 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Can we please take a step back and try to identify the real problem > > here? What exactly are we trying to detect and handle? Is it true > > that we are trying to detect URLs whose characters got their "normal" > > bidirectional properties overridden by some directional control > > characters? If so, I can write a primitive that will take a region of > > buffer text and examine it to detect this. > > Oh, great. My impression was that such functionality was off the table. Why would it be off the table? Anyway, if you want this, please show the API of the function -- what it should return and how. > > Next, given that you have detected the spoofed URL, what do you want > > to do with it? Do you want to highlight it, do you want to de-spoof > > (i.e. undo the spoofing) in some way, but still leave some indication > > of the fact that it was spoofed, or maybe you want to remove any trace > > of the spoofing as if it never happened (and leave the user oblivious > > to the fact it did)? > > Yes, I want to unspoof the URL. Adding some markings to notify that > this has been done would also be nice, perhaps by adding a 'warning face > to the text or the like. Then putting a display property on the offending RLO might be the best solution. > > Given the answers to those questions, there's any number of possible > > solutions that do NOT require inserting more directional controls. > > Some of the possible solutions were already mentioned in this thread. 
> > Here's another: cover the offending RLO with a display property > > showing whatever you want -- a warning sign, a smiley, a string made > > of a SPC character, anything. You can try it with your example: you > > will see the spoofing gone immediately. Why is this worse than > > inserting directional controls whose effect on the surrounding text > > can be far reaching? > > RLOs are used legitimately, and I think they display you've selected for > them now (a thin blank line) is good. Yes, but adding RLOs or LROs just to undo some evil effect is something I think we should avoid, because its effect is non-local and can frequently be surprising and unintended. It is better to use other means we have. > So I don't want to uglify mail mode buffers just to handle this > quite obscure URL UI problem. Where do you see uglification in my suggestions? > (Ok, bad example, but these overrides are used legitimately in the bidi > community, if I understand my extensive research correctly.) They are meant for very specific situations, and this one isn't one of them. > And displaying http://myspace.com/#/segami/moc.koobecaf//:sptth with a > couple of visible control characters doesn't really solve the problem, > because most people will still assume that that's a link to Facebook, > not to Myspace. Most people are not even aware that this bidi stuff > exists. Under my suggestion to cover the overrides with a display property, the URL will not be reversed on display. Did you try that? ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 3:45 ` Eli Zaretskii @ 2014-12-01 16:19 ` Lars Magne Ingebrigtsen 2014-12-01 17:39 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-12-01 16:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Anyway, if you want this, please show the API of the function -- what > it should return and how. Actually, I'm not sure. :-) Would it make any sense to have a function like `(displayed-directionality POSITION)' that returns either `right-to-left' or `left-to-right'? If so, the URL-finding function would query about the start of the URL (which would normally be the HTTP part), and if that's `right-to-left', Here There Be Shenanigans. >> Yes, I want to unspoof the URL. Adding some markings to notify that >> this has been done would also be nice, perhaps by adding a 'warning face >> to the text or the like. > > Then putting a display property on the offending RLO might be the best > solution. On the RLO character itself or the URL affected by the RLO? I'd rather limit the impact of whatever we do to the URL itself, since the presentation of the URL is the user interface question here. > Yes, but adding RLOs or LROs just to undo some evil effect is > something I think we should avoid, because its effect is non-local and > can frequently be surprising and unintended. It is better to use > other means we have. Sure, if a different method is available that allows us to display these URLs in a non-spoofed way, I'm all for that. >> And displaying http://myspace.com/#/segami/moc.koobecaf//:sptth with a >> couple of visible control characters doesn't really solve the problem, >> because most people will still assume that that's a link to Facebook, >> not to Myspace. Most people are not even aware that this bidi stuff >> exists. > > Under my suggestion to cover the overrides with a display property, > the URL will not be reversed on display.
Did you try that? Oh, they won't? I thought you meant adding a display property to the RLO in addition to having it do what it normally does. So is your suggestion here to disable all RLO (etc.) characters in mail buffers? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 16:19 ` Lars Magne Ingebrigtsen @ 2014-12-01 17:39 ` Eli Zaretskii 2014-12-01 17:49 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 17:39 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Cc: emacs-devel@gnu.org > Date: Mon, 01 Dec 2014 17:19:30 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Anyway, if you want this, please show the API of the function -- what > > it should return and how. > > Actually, I'm not sure. :-) Would it make any sense to have a function > like `(displayed-directionality POSITION)' that returns either > `right-to-left' or `left-to-right? If so, the URL-finding function > would query about the start of the URL (which would normally be the HTTP > part), and if that's `right-to-left', Here There Be Shenanigans. How is this different from the previous suggestion? > >> Yes, I want to unspoof the URL. Adding some markings to notify that > >> this has been done would also be nice, perhaps by adding a 'warning face > >> to the text or the like. > > > > Then putting a display property on the offending RLO might be the best > > solution. > > On the RLO character itself or the URL affected by the RLO? On the RLO. The URL will be left intact, and will show correctly after you put the display property. > >> And displaying http://myspace.com/#/segami/moc.koobecaf//:sptth with a > >> couple of visible control characters doesn't really solve the problem, > >> because most people will still assume that that's a link to Facebook, > >> not to Myspace. Most people are not even aware that this bidi stuff > >> exists. > > > > Under my suggestion to cover the overrides with a display property, > > the URL will not be reversed on display. Did you try that? > > Oh, they won't? I thought you meant adding a display property to the > RLO in addition to having it do what it normally does. 
Any character covered by a display property effectively loses its bidi properties, as described by this paragraph in the ELisp manual: Text covered by `display' text properties, by overlays with `display' properties whose value is a string, and by any other properties that replace buffer text, is treated as a single unit when it is reordered for display. That is, the entire chunk of text covered by these properties is reordered together. Moreover, the bidirectional properties of the characters in such a chunk of text are ignored, and Emacs reorders them as if they were replaced with a single character `U+FFFC', known as the "Object Replacement Character". This means that placing a display property over a portion of text may change the way that the surrounding text is reordered for display. To prevent this unexpected effect, always place such properties on text whose directionality is identical with text that surrounds it. > So is your suggestion here to disable all RLO (etc.) characters in mail > buffers? No, only RLOs that affect URLs. Specifically, I suggest to look for RLO before a URL on the same physical line, and PDF or hard newline after it, and if found, cover it by a display property whose value is e.g. a string " ". Since just the fact that you find an RLO before doesn't yet mean that it's a malicious RLO (other bidirectional controls which you don't want to know about can countermand the RLO before it affects the URL display), I suggest to augment that by checking that the URL's host and domain parts consist of LTR characters whose directionality was overridden. The latter part is to be done by calling a new primitive mentioned above. Given all this evidence, I think it's pretty much certain that we found our offending RLO. ^ permalink raw reply [flat|nested] 133+ messages in thread
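A minimal sketch of the display-property step described above, under stated assumptions: the helper name `my-neutralize-rlo-before` is hypothetical, the caller is assumed to already know where the URL starts, and the additional checks the message recommends (a PDF or newline after the URL, and overridden directionality of the host part) are left out.

```elisp
;; Hypothetical helper, not part of Emacs.  It covers every RLO
;; (U+202E) found between the beginning of the line and URL-START
;; with a display string, so the covered character no longer acts
;; as a directional override when the line is reordered for display.
;; In a read-only article buffer the caller would also need to bind
;; `inhibit-read-only' (assumption about the calling context).
(defun my-neutralize-rlo-before (url-start)
  "Cover RLO characters between line start and URL-START with \" \".
Return the number of characters covered."
  (let ((count 0))
    (save-excursion
      (goto-char url-start)
      (goto-char (line-beginning-position))
      ;; Search only up to URL-START; the buffer text itself is unchanged.
      (while (search-forward "\u202e" url-start t)
        (put-text-property (1- (point)) (point) 'display " ")
        (setq count (1+ count))))
    count))
```

Because only a text property is added, the URL text in the buffer stays intact, which matches the point made above that the URL itself is left alone.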
* Re: Bidirectional text and URLs 2014-12-01 17:39 ` Eli Zaretskii @ 2014-12-01 17:49 ` Lars Magne Ingebrigtsen 2014-12-01 18:22 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-12-01 17:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> > Anyway, if you want this, please show the API of the function -- what >> > it should return and how. >> >> Actually, I'm not sure. :-) Would it make any sense to have a function >> like `(displayed-directionality POSITION)' that returns either >> `right-to-left' or `left-to-right? If so, the URL-finding function >> would query about the start of the URL (which would normally be the HTTP >> part), and if that's `right-to-left', Here There Be Shenanigans. > > How is this different from the previous suggestion? I'm not sure what you are referring to. >> So is your suggestion here to disable all RLO (etc.) characters in mail >> buffers? > > No, only RLOs that affect URLs. > > Specifically, I suggest to look for RLO before a URL on the same > physical line, and PDF or hard newline after it, and if found, cover > it by a display property whose value is e.g. a string " ". Since just > the fact that you find an RLO before doesn't yet mean that it's a > malicious RLO (other bidirectional controls which you don't want to > know about can countermand the RLO before it affects the URL display), > I suggest to augment that by checking that the URL's host and domain > parts consist of LTR characters whose directionality was overridden. > The latter part is to be done by calling a new primitive mentioned > above. > > Given all this evidence, I think it's pretty much certain that we > found our offending RLO. If you think that that's sufficient (that we only need to look for preceding RLOs on the same line), then this sounds like a good solution to me. -- (domestic pets only, the antidote for overdose, milk.) 
bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 17:49 ` Lars Magne Ingebrigtsen @ 2014-12-01 18:22 ` Eli Zaretskii 2014-12-01 18:28 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 18:22 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Cc: emacs-devel@gnu.org > Date: Mon, 01 Dec 2014 18:49:58 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> > Anyway, if you want this, please show the API of the function -- what > >> > it should return and how. > >> > >> Actually, I'm not sure. :-) Would it make any sense to have a function > >> like `(displayed-directionality POSITION)' that returns either > >> `right-to-left' or `left-to-right? If so, the URL-finding function > >> would query about the start of the URL (which would normally be the HTTP > >> part), and if that's `right-to-left', Here There Be Shenanigans. > > > > How is this different from the previous suggestion? > > I'm not sure what you are referring to. I'm saying that asking about "characters between FROM and TO that were supposed to be LTR, but were forced to display as RTL", and asking essentially the same question about a character at POS, is actually asking the same question. IOW, the same API will be able to satisfy both needs. (defun bidi-find-overridden-directionality (from to) "Return position between FROM and TO where directionality was overridden. This function returns the first character position in the specified region where there is a character whose `bidi-class' property is `L', but which was forced to display as `R' by a directional override, and likewise with characters whose `bidi-class' is `R' or `AL' that were forced to display as `L'. Strong directional characters `L', `R', and `AL' can have their intrinsic directionality overridden by directional override control characters RLO \(u+202e) and LRO \(u+202d)." OK? If you want, the function can return a cons cell (POS .
DIR), where POS is the position and DIR is the intrinsic directionality of the overridden character. Or even (POS DIR-ORIG DIR-OVERRIDDEN). > > No, only RLOs that affect URLs. > > > > Specifically, I suggest to look for RLO before a URL on the same > > physical line, and PDF or hard newline after it, and if found, cover > > it by a display property whose value is e.g. a string " ". Since just > > the fact that you find an RLO before doesn't yet mean that it's a > > malicious RLO (other bidirectional controls which you don't want to > > know about can countermand the RLO before it affects the URL display), > > I suggest to augment that by checking that the URL's host and domain > > parts consist of LTR characters whose directionality was overridden. > > The latter part is to be done by calling a new primitive mentioned > > above. > > > > Given all this evidence, I think it's pretty much certain that we > > found our offending RLO. > > If you think that that's sufficient (that we only need to look for > preceding RLOs on the same line), then this sounds like a good solution > to me. We need to look for an RLO on the same line when a LTR character was forced to display as RTL, and for LRO in the opposite case. This will detect the case you've demonstrated at the beginning of this thread. I don't know about other similar cases, so if you don't know either, I suggest to treat this problem, and take it from there. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 18:22 ` Eli Zaretskii @ 2014-12-01 18:28 ` Lars Magne Ingebrigtsen 2014-12-02 14:17 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-12-01 18:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > (defun bidi-find-overridden-directionality (from to) > "Return position between FROM and TO where directionality was overridden. > > This function returns the first character position in the specified > region where there is a character whose `bidi-class' property is `L', > but which was forced to display as `R' by a directional override, > and likewise with characters whose `bidi-class' is `R' or `AL' > that were forced to display as `L'. > > Strong directional characters `L', `R', and `AL' can have their > intrinsic directionality overridden by directional override > control characters RLO \(u+202e) and LRO \(u+202d)." > > OK? Yes, that sounds perfect. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 18:28 ` Lars Magne Ingebrigtsen @ 2014-12-02 14:17 ` Eli Zaretskii 2014-12-02 16:31 ` Lars Magne Ingebrigtsen 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-02 14:17 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Cc: emacs-devel@gnu.org > Date: Mon, 01 Dec 2014 19:28:31 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > (defun bidi-find-overridden-directionality (from to) > > "Return position between FROM and TO where directionality was overridden. > > > > This function returns the first character position in the specified > > region where there is a character whose `bidi-class' property is `L', > > but which was forced to display as `R' by a directional override, > > and likewise with characters whose `bidi-class' is `R' or `AL' > > that were forced to display as `L'. > > > > Strong directional characters `L', `R', and `AL' can have their > > intrinsic directionality overridden by directional override > > control characters RLO \(u+202e) and LRO \(u+202d)." > > > > OK? > > Yes, that sounds perfect. It is now implemented on master. (Please read the doc string, as I did slightly more than I promised, hope you will find those additions useful.) ^ permalink raw reply [flat|nested] 133+ messages in thread
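As a rough illustration of how URL-recognising code might call the new primitive, here is a small sketch. The wrapper name `my-overridden-position` is hypothetical. The quoted doc string only shows FROM and TO; passing a third OBJECT argument of nil (meaning the current buffer) is an assumption about the calling convention as finally implemented, so check the doc string on master before relying on it.

```elisp
;; Hypothetical wrapper around the primitive announced above.
;; START and END would typically delimit the host part of a URL
;; that the caller has already located.
(defun my-overridden-position (start end)
  "Return first position in START..END whose directionality was overridden.
Return nil if no such position is found."
  ;; The third argument (OBJECT) is passed as nil on the assumption
  ;; that it selects the current buffer.
  (bidi-find-overridden-directionality start end nil))
```

A caller could treat a non-nil result as the cue to warn the user or to look for the responsible RLO or LRO on the same line.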
* Re: Bidirectional text and URLs 2014-12-02 14:17 ` Eli Zaretskii @ 2014-12-02 16:31 ` Lars Magne Ingebrigtsen 0 siblings, 0 replies; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-12-02 16:31 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > It is now implemented on master. (Please read the doc string, as I > did slightly more than I promised, hope you will find those additions > useful.) Great! It looks like I won't get a chance to do much, if any, work on the URL-recognising code until the weekend, so if somebody else wants to handle that bit -- please do. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 21:05 ` Eli Zaretskii 2014-11-30 21:36 ` Lars Magne Ingebrigtsen @ 2014-12-01 19:15 ` Richard Stallman 2014-12-01 19:15 ` Richard Stallman 2 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-01 19:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Next, given that you have detected the spoofed URL, what do you want > to do with it? Do you want to highlight it, do you want to de-spoof > (i.e. undo the spoofing) in some way, but still leave some indication > of the fact that it was spoofed, or maybe you want to remove any trace > of the spoofing as if it never happened (and leave the user oblivious > to the fact it did)? I think that all commands to fetch a URL should ask for confirmation about a URL whose display may have been confusing due to bidi. The message should appear in a window, so it doesn't have to be terse. It should present everything that is interesting, including the URL as it appears in the actual context, and the URL as it would appear in a normal LTR context, and the real URL that will be fetched. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 21:05 ` Eli Zaretskii 2014-11-30 21:36 ` Lars Magne Ingebrigtsen 2014-12-01 19:15 ` Richard Stallman @ 2014-12-01 19:15 ` Richard Stallman 2014-12-01 19:34 ` Eli Zaretskii 2 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-01 19:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] To be able to copy some text into another buffer and have that text display there just as it displayed in the original buffer is important for user messages about what's going on with bidi. I think it requires two facilities. (Please correct me if I'm wrong.) 1. A way for a Lisp program to get, for a specified region, a short description of the outside bidi context that affects bidi treatment of that region. The result should be a small amount of data, computed solely from the text outside the specified region. The result should encapsulate everything about the text outside the specified region that can possibly affect the bidi treatment of whatever text might be inside the region. Thus, any change in the text outside the specified region, which gives the same encapsulated data, will not affect bidi treatment of text inside the region. Ideally, this data should have a transparent documented format. It could be called 'bidi-context'. If this can't be done in a way that is independent of the text inside the specified region, as a fallback it could be done in a way that works only for the current text inside that region. 2. Given such encapsulated context data, a straightforward way to create an equivalent bidi context in the current buffer. I expect it would work by inserting some magic bidi characters. (Can all such contexts be replicated by inserting some magic bidi characters?)
It could be called 'replicate-bidi-context'. Are these feasible to implement? -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 19:15 ` Richard Stallman @ 2014-12-01 19:34 ` Eli Zaretskii 2014-12-01 20:21 ` Eli Zaretskii 2014-12-02 14:44 ` Richard Stallman 0 siblings, 2 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 19:34 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Mon, 01 Dec 2014 14:15:41 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > 1. A way for a Lisp program to get, for a specified region, a > short description of the outside bidi context that affects bidi > treatment of that region. > > The result should be a small amount of data, computed solely from the > text outside the specified region. The result should encapsulate > everything about the text outside the specified region that can > possibly affect the bidi treatment of whatever text might be inside > the region. > > Thus, any change in the text outside the specified region, which gives > the same encapsulated data, will not affect bidi treatment of text > inside the region. > > Ideally, this data should have a transparent documented format. > > It could be called 'bidi-context'. > > If this can't be done in a way that is independent of the text inside > the specified region, as a fallback it could be done in a way that > works only for the current text inside that region. > > 2. Given such encapsulated context data, a straightforward way to > create an equivalent bidi context in the current buffer. I expect it > would work by inserting some magic bidi characters. (Can all such > contexts be replicated by inserting some magic bidi characters?) > > It could be called 'replicate-bidi-context'. > > Are these feasible to implement? The first one sounds pretty complicated. I need to think about its feasibility. It could require analysis of a very large chunk of buffer text, at least in theory. What's more, the UBA specifies how to reorder text given the contents, but not how to do the reverse. 
Anyway, what's more important: you can have 2 without 1. The trick is to capture the visual order of the text you want to copy (can be done by looking at the current glyph matrix), and then create a string whose logical order is identical to the captured visual order, and embed that string in LRO..PDF, which will ensure the visual order will not change on display. The disadvantage of this is that you recreate the order, but not the reordering, so e.g. cursor motion will be different -- you won't see the jumps as in the URL phishing example. ^ permalink raw reply [flat|nested] 133+ messages in thread
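The LRO..PDF trick described above can be sketched in a few lines. This is only an illustration: the helper name `my-freeze-visual-order` is hypothetical, and the sketch assumes the caller has already captured the characters in visual (left-to-right on screen) order and that the captured string contains no directional control characters of its own. How to read the visual order from the glyph matrix is not shown.

```elisp
;; Hypothetical helper, not part of Emacs.  VISUAL holds characters
;; in the order they appear on screen, left to right.
(defun my-freeze-visual-order (visual)
  "Return VISUAL wrapped in LRO..PDF so it displays strictly left to right."
  ;; U+202D is LEFT-TO-RIGHT OVERRIDE, U+202C is POP DIRECTIONAL FORMATTING.
  (concat "\u202d" visual "\u202c"))
```

As the message notes, this recreates the order but not the reordering, so cursor motion over the resulting text will differ from the original.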
* Re: Bidirectional text and URLs 2014-12-01 19:34 ` Eli Zaretskii @ 2014-12-01 20:21 ` Eli Zaretskii 2014-12-01 20:30 ` David Kastrup 2014-12-02 14:45 ` Richard Stallman 2014-12-02 14:44 ` Richard Stallman 1 sibling, 2 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 20:21 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Mon, 01 Dec 2014 21:34:46 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: larsi@gnus.org, emacs-devel@gnu.org > > The first one sounds pretty complicated. I need to think about its > feasibility. A simple (as in "KISS") strategy that should always work is to copy the entire physical line around the region. The disadvantage is, of course, that it could be very long in some rare cases. Optimizing that would probably require replacing runs of certain types of characters with a single representative character of the same type, and keeping all the directional controls. We could also replace strong directional characters L/R/AL with the corresponding mark (LRM/RLM/ALM), which are displayed as (thin) spaces, and so will be almost invisible, keeping an illusion of copying just the region of text and nothing else. Is this good enough? ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 20:21 ` Eli Zaretskii @ 2014-12-01 20:30 ` David Kastrup 2014-12-01 20:45 ` Eli Zaretskii 2014-12-02 14:45 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: David Kastrup @ 2014-12-01 20:30 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Mon, 01 Dec 2014 21:34:46 +0200 >> From: Eli Zaretskii <eliz@gnu.org> >> Cc: larsi@gnus.org, emacs-devel@gnu.org >> >> The first one sounds pretty complicated. I need to think about its >> feasibility. > > A simple (as in "KISS") strategy that should always work is to copy > the entire physical line around the region. The disadvantage is, of > course, that it could be very long in some rare cases. Optimizing > that would probably require replacing runs of certain types of > characters with a single representative character of the same type, > and keeping all the directional controls. > > We could also replace strong directional characters L/R/AL with the > corresponding mark (LRM/RLM/ALM), which are displayed as (thin) > spaces, and so will be almost invisible, keeping an illusion of > copying just the region of text and nothing else. > > Is this good enough? Wouldn't it just be enough to turn off bidi-display-reordering in the minibuffer when inputting/displaying the URL? -- David Kastrup ^ permalink raw reply [flat|nested] 133+ messages in thread
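David's suggestion could be sketched roughly as follows, using an ordinary buffer instead of the minibuffer for simplicity. The function name `my-url-logical-order-buffer` and the buffer name are hypothetical; `bidi-display-reordering` is the existing per-buffer variable named in the message.

```elisp
;; Hypothetical helper, not part of Emacs.  It puts URL into a
;; dedicated buffer in which bidirectional reordering is switched
;; off, so the characters are laid out in buffer (logical) order.
(defun my-url-logical-order-buffer (url)
  "Return a buffer holding URL with `bidi-display-reordering' set to nil."
  (let ((buf (get-buffer-create "*URL in logical order*")))
    (with-current-buffer buf
      (erase-buffer)
      ;; Setting this variable makes it local to this buffer.
      (setq bidi-display-reordering nil)
      (insert url))
    buf))
```

A caller would then show the result with something like `(pop-to-buffer (my-url-logical-order-buffer url))`.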
* Re: Bidirectional text and URLs 2014-12-01 20:30 ` David Kastrup @ 2014-12-01 20:45 ` Eli Zaretskii 2014-12-02 14:45 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 20:45 UTC (permalink / raw) To: David Kastrup; +Cc: emacs-devel > From: David Kastrup <dak@gnu.org> > Date: Mon, 01 Dec 2014 21:30:03 +0100 > > Wouldn't it just be enough to turn off bidi-display-reordering in the > minibuffer when inputting/displaying the URL? That's not what Richard wanted, AFAIU. He wanted a way of citing a chunk of text in a mail message, in a way that ensures the cited text will have the same visual order as the original. And it is not only about URLs. Or maybe I'm confused. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 20:45 ` Eli Zaretskii @ 2014-12-02 14:45 ` Richard Stallman 0 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-02 14:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dak, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > That's not what Richard wanted, AFAIU. He wanted a way of citing a > chunk of text in a mail message, Actually, the messages I am thinking about are not email. I am thinking about messages to display (in an Emacs temp buffer) to give information to the user, or query the user. We may want to include a pertinent part of the buffer into the message, and we should make sure it gets bidi-formatted the same way in the message that it does in its original context. The text to copy from the buffer might be a URL, or anything. This facility would be general, but we might want to use it as part of handling strange URLs. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 20:21 ` Eli Zaretskii 2014-12-01 20:30 ` David Kastrup @ 2014-12-02 14:45 ` Richard Stallman 2014-12-02 15:03 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-02 14:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > A simple (as in "KISS") strategy that should always work is to copy > the entire physical line around the region. 1. Is that physical line sufficient to determine the bidi context for the region? I don't know. If you say it is, I believe you. 2. It would be unclear to include the whole line in the message if the message is about just part of it (such as, a URL). So what I am looking for is a way to simplify the rest of that line into something that would create an equivalent bidi context for the region to be copied. > Optimizing > that would probably require replacing runs of certain types of > characters with a single representative character of the same type, > and keeping all the directional controls. > We could also replace strong directional characters L/R/AL with the > corresponding mark (LRM/RLM/ALM), which are displayed as (thin) > spaces, and so will be almost invisible, keeping an illusion of > copying just the region of text and nothing else. This sounds like the sort of thing I proposed. Another possible interface would be 'buffer-substring-preserve-bidi-context'. It would copy a specified part of the buffer, but prefix and suffix it with whatever is necessary to cause that part to display the same, bidi-wise, as it did in its original buffer. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. 
Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:45 ` Richard Stallman @ 2014-12-02 15:03 ` Eli Zaretskii 2014-12-03 8:39 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-02 15:03 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Tue, 02 Dec 2014 09:45:08 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > A simple (as in "KISS") strategy that should always work is to copy > > the entire physical line around the region. > > 1. Is that physical line sufficient to determine the bidi context for > the region? I don't know. If you say it is, I believe you. I say yes. For this purpose, line == paragraph. > > Optimizing > > that would probably require replacing runs of certain types of > > characters with a single representative character of the same type, > > and keeping all the directional controls. > > > We could also replace strong directional characters L/R/AL with the > > corresponding mark (LRM/RLM/ALM), which are displayed as (thin) > > spaces, and so will be almost invisible, keeping an illusion of > > copying just the region of text and nothing else. > > This sounds like the sort of thing I proposed. OK, I will work on it. > Another possible interface would be > 'buffer-substring-preserve-bidi-context'. > It would copy a specified part of the buffer, but prefix and suffix it > with whatever is necessary to cause that part to display the same, > bidi-wise, as it did in its original buffer. How is this different (you say "another possible interface")? ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 15:03 ` Eli Zaretskii @ 2014-12-03 8:39 ` Richard Stallman 2014-12-03 17:39 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-03 8:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Another possible interface would be > > 'buffer-substring-preserve-bidi-context'. > > It would copy a specified part of the buffer, but prefix and suffix it > > with whatever is necessary to cause that part to display the same, > > bidi-wise, as it did in its original buffer. > How is this different (you say "another possible interface")? First I proposed an interface that would return a representation of the bidi context that affects a certain region. This representation would NOT include the text of that region. It would only represent the context _around_ that region, not the contents of that region. Along with that I proposed a function to convert that representation of context into magic bidi characters that will reproduce that context. The second proposed interface would copy the text of a region, while adding to it something to reproduce the bidi effect of its context. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 8:39 ` Richard Stallman @ 2014-12-03 17:39 ` Eli Zaretskii 2014-12-04 9:41 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-03 17:39 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Wed, 03 Dec 2014 03:39:03 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > The second proposed interface would copy the text of a region, while > adding to it something to reproduce the bidi effect of its context. That was how I understood the first suggestion, so that's what I'm working on. It is easier to do that than invent a representation. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 17:39 ` Eli Zaretskii @ 2014-12-04 9:41 ` Eli Zaretskii 2014-12-05 11:16 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-04 9:41 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Wed, 03 Dec 2014 19:39:42 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: larsi@gnus.org, emacs-devel@gnu.org > > > Date: Wed, 03 Dec 2014 03:39:03 -0500 > > From: Richard Stallman <rms@gnu.org> > > CC: larsi@gnus.org, emacs-devel@gnu.org > > > > The second proposed interface would copy the text of a region, while > > adding to it something to reproduce the bidi effect of its context. > > That was how I understood the first suggestion, so that's what I'm > working on. It is easier to do that than invent a representation. I have now implemented on master: (defun buffer-substring-with-bidi-context (start end &optional no-properties) "Return portion of current buffer between START and END with bidi context. This function works similar to `buffer-substring', but it prepends and appends to the text bidi directional control characters necessary to preserve the visual appearance of the text if it is inserted at another place. This is useful when the buffer substring includes bidirectional text and control characters that cause non-trivial reordering on display. If copied verbatim, such text can have a very different visual appearance, and can also change the visual appearance of the surrounding text at the destination of the copy. Optional argument NO-PROPERTIES, if non-nil, means copy the text without the text properties." Based on the fuss this generated, I now expect to see Lisp programs using this to start popping like mushrooms after the rain ;-) ^ permalink raw reply [flat|nested] 133+ messages in thread
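A short sketch of how a warning message might quote part of a buffer with the function announced above. The helper name `my-quote-region-for-warning` and the wording of the message are hypothetical; only `buffer-substring-with-bidi-context` and its START, END and NO-PROPERTIES arguments come from the quoted doc string.

```elisp
;; Hypothetical helper, not part of Emacs.  It builds the text of a
;; warning that quotes the region START..END of the current buffer,
;; using the new function so the quoted text keeps the visual order
;; it had in its original context.
(defun my-quote-region-for-warning (start end)
  "Return a warning string quoting the text between START and END."
  (concat "This link may have been obfuscated:\n  "
          ;; Third argument non-nil: copy without text properties.
          (buffer-substring-with-bidi-context start end t)
          "\n"))
```

The returned string can be inserted into a temporary buffer that is shown to the user before a URL is fetched.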
* Re: Bidirectional text and URLs 2014-12-04 9:41 ` Eli Zaretskii @ 2014-12-05 11:16 ` Richard Stallman 2014-12-05 11:28 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-05 11:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] Thanks. This construct will be useful for warning users about strange bidi in URLs. Do we need any new features to make it possible to show how the strange bidi text would really be interpreted? I think the feature you proposed, which would examine how text is actually displayed and represent that with text that is straightforward, may be useful too. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-05 11:16 ` Richard Stallman @ 2014-12-05 11:28 ` Eli Zaretskii 2014-12-05 22:43 ` Richard Stallman 2014-12-05 22:43 ` Richard Stallman 0 siblings, 2 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-05 11:28 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Fri, 05 Dec 2014 06:16:22 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > Do we need any new features to make it possible to show > how the strange bidi text would really be interpreted? Not sure I understand what you mean here, but if I do, then this is up to applications, because only they know the meaning of a particular piece of displayed text and its interpretation. > I think the feature you proposed, which would examine how text is > actually displayed and represent that with text that is > straightforward, may be useful too. Again, not sure what proposition you allude to here. Doesn't buffer-substring-with-bidi-context already do that? If not, what is missing? ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-05 11:28 ` Eli Zaretskii @ 2014-12-05 22:43 ` Richard Stallman 2014-12-05 23:15 ` Eli Zaretskii 2014-12-05 22:43 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-05 22:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Do we need any new features to make it possible to show > > how the strange bidi text would really be interpreted? > Not sure I understand what you mean here, but if I do, then this is up > to applications, because only they know the meaning of a particular > piece of displayed text and its interpretation. In principle they might vary, but in practice I think most of them will use the characters in the order they appear in the buffer. So we need a way to show what a certain piece of text would look like with all bidi effects suppressed. One that would force them to display in strict LTR order. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-05 22:43 ` Richard Stallman @ 2014-12-05 23:15 ` Eli Zaretskii 2014-12-06 12:06 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-05 23:15 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Fri, 05 Dec 2014 17:43:42 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > > Do we need any new features to make it possible to show > > > how the strange bidi text would really be interpreted? > > > Not sure I understand what you mean here, but if I do, then this is up > > to applications, because only they know the meaning of a particular > > piece of displayed text and its interpretation. > > In principle they might vary, but in practice I think most of them > will use the characters in the order they appear in the buffer. That's true, but that still doesn't say how each application should show that to the user. > So we need a way to show what a certain piece of text would look like > with all bidi effects suppressed. One that would force them to > display in strict LTR order. We were through this: it won't help, unless the logical-order text consists only of LTR characters. And for that, we already have a solution that detects the fraud. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-05 23:15 ` Eli Zaretskii @ 2014-12-06 12:06 ` Richard Stallman 2014-12-06 12:59 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-06 12:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > In principle they might vary, but in practice I think most of them > > will use the characters in the order they appear in the buffer. > That's true, but that still doesn't say how each application should > show that to the user. I don't entirely understand what sort of variation you have in mind, but I think we should make all such applications handle this as uniformly as possible. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-06 12:06 ` Richard Stallman @ 2014-12-06 12:59 ` Eli Zaretskii 0 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-06 12:59 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Sat, 06 Dec 2014 07:06:50 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > > In principle they might vary, but in practice I think most of them > > > will use the characters in the order they appear in the buffer. > > > That's true, but that still doesn't say how each application should > > show that to the user. > > I don't entirely understand what sort of variation you have in mind, > but I think we should make all such applications handle this > as uniformly as possible. The danger in using such obfuscated strings is different in each application. That's because each application assigns different semantics to the various portions of the string, and does different things with each portion. IOW, the semantics of these strings depends on the application, and thus our solution to warn the user about the dangers is probably going to be different in each case. Until now we had only one use case: the URL. For that use case, we understood the implications, and we now have the infrastructure to detect the obfuscation. We still don't know what the application using URLs (in this case, eww) will want to do to warn the user and ask for their permission. One way is to show the "real" URL to the user, which will automatically solve the obfuscation problem and display the URL in its "normal" form -- without the need to turn off the bidi reordering. Maybe there are other, better ways -- we just need to wait and see. And that's just a single application for which we have a use case we understand quite well. Other use cases are yet to come. When they do, we should analyze them as we did with this one. 
It could be that eventually we come to the conclusion you are proposing now: that we need a way to display some string in its logical order of characters. If and when we arrive at such a conclusion, there will be sufficient weight to it to justify the change in the code. We are not there yet, and it is not clear to me that we will indeed arrive at that conclusion. We have at least partial evidence that this might not be required: no other application out there does this, AFAIK. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-05 11:28 ` Eli Zaretskii 2014-12-05 22:43 ` Richard Stallman @ 2014-12-05 22:43 ` Richard Stallman 2014-12-05 23:17 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-05 22:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I think the feature you proposed, which would examine how text is > > actually displayed and represent that with text that is > > straightforward, may be useful too. > Again, not sure what proposal you allude to here. Doesn't > buffer-substring-with-bidi-context already do that? If not, what is > missing? A few days ago we had a misunderstanding -- I proposed the feature which you've now implemented, but you proposed a different feature. You proposed that Emacs would examine the text as actually reordered by display, and present that as a string in the display order. That was a different thing from what I had proposed. But I think it is a good idea. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
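As an illustration of the "show the real URL" idea above, here is a minimal sketch in Python (not Emacs Lisp, and not what eww or Emacs actually does). It assumes that spelling out every Unicode format character (general category Cf, which covers the directional controls) is an acceptable way to present a URL in logical order; the function name is hypothetical.

```python
import unicodedata

def reveal_format_chars(url):
    """Return url in logical order, with each format character
    (general category Cf, e.g. U+202E) spelled out as <U+XXXX>."""
    return "".join(
        "<U+%04X>" % ord(ch) if unicodedata.category(ch) == "Cf" else ch
        for ch in url
    )

if __name__ == "__main__":
    # Hypothetical obfuscated link: an override hidden in front of the path.
    print(reveal_format_chars("http://example.org/\u202egnp.exe"))
```

Because the override itself is replaced by visible ASCII text, a URL that otherwise consists of left-to-right characters is shown in the order in which it will be used, without turning reordering off. A URL that legitimately contains RTL characters would still be reordered, so this only covers the case discussed here.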
* Re: Bidirectional text and URLs 2014-12-05 22:43 ` Richard Stallman @ 2014-12-05 23:17 ` Eli Zaretskii 2014-12-06 12:06 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-05 23:17 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Fri, 05 Dec 2014 17:43:43 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > A few days ago we had a misunderstanding -- I proposed the feature > which you've now implemented, but you proposed a different feature. > You proposed that Emacs would examine the text as actually reordered > by display, and present that as a string in the display order. > > That was a different thing from what I had proposed. > But I think it is a good idea. I can do that. But since the feature you suggested is already implemented, what would be the use of the alternative? They both try to achieve the same goal. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-05 23:17 ` Eli Zaretskii @ 2014-12-06 12:06 ` Richard Stallman 0 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-06 12:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > A few days ago we had a misunderstanding -- I proposed the feature > > which you've now implemented, but you proposed a different feature. > > You proposed that Emacs would examine the text as actually reordered > > by display, and present that as a string in the display order. > > > > That was a different thing from what I had proposed. > > But I think it is a good idea. > I can do that. But since the feature you suggested is already > implemented, what would be the use of the alternative? They both try > to achieve the same goal. Maybe you are right, since they would look the same in display. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 19:34 ` Eli Zaretskii 2014-12-01 20:21 ` Eli Zaretskii @ 2014-12-02 14:44 ` Richard Stallman 2014-12-02 15:00 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-02 14:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The first one sounds pretty complicated. I need to think about its > feasibility. It could require analysis of a very large chunk of > buffer text, at least in theory. Doesn't each paragraph do bidi separately? If so, at most this requires analyzing one paragraph before and after the region. What's more, the UBA specifies how > to reorder text given the contents, but not how to do the reverse. How does this relate to what I proposed? I don't see it so I suspect a misunderstanding. > Anyway, what's more important: you can have 2 without 1. I don't understand what that would mean. > The trick is > to capture the visual order of the text you want to copy (can be done > by looking at the current glyph matrix), and then create a string > whose logical order is identical to the captured visual order, That seems more complicated and less desirable. For the job I have in mind, it is more elegant to COPY the text in question into the message. But one needs to make sure it will display the same in this new context as in the original context. That's what the proposed feature is for. The facility you propose here might be useful too, for other purposes. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:44 ` Richard Stallman @ 2014-12-02 15:00 ` Eli Zaretskii 2014-12-03 8:39 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-02 15:00 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Tue, 02 Dec 2014 09:44:17 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > The first one sounds pretty complicated. I need to think about its > > feasibility. It could require analysis of a very large chunk of > > buffer text, at least in theory. > > Doesn't each paragraph do bidi separately? Yes. > If so, at most this requires analyzing one paragraph before and > after the region. That's correct, but a paragraph can be very long in some specialized cases. E.g., log files written by software frequently have very long paragraphs. > What's more, the UBA specifies how > > to reorder text given the contents, but not how to do the reverse. > > How does this relate to what I proposed? I don't see it so I suspect > a misunderstanding. One way of looking at your request is to think of it as an interface that takes reordered text in the visual order and reconstructs the bidi context that leads to it. The way the UBA is described doesn't lend itself easily to such a reconstruction. > > Anyway, what's more important: you can have 2 without 1. > > I don't understand what that would mean. It means we can display the copied text in the same visual order without analyzing the context that caused that visual order. > The facility you propose here might be useful too, for other purposes. It is already being used (I needed it in the Emacs test suite to visually compare the results of reordering with the reference implementation). ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 15:00 ` Eli Zaretskii @ 2014-12-03 8:39 ` Richard Stallman 0 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-03 8:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > One way of looking at your request is to think of it as an interface > that takes reordered text in the visual order and reconstructs the > bidi context that leads to it. However, that's not what I requested. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-29 17:49 ` Lars Magne Ingebrigtsen 2014-11-29 17:54 ` Lars Magne Ingebrigtsen 2014-11-29 18:18 ` Eli Zaretskii @ 2014-11-30 9:38 ` Richard Stallman 2014-11-30 15:27 ` Eli Zaretskii 2 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-11-30 9:38 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > It seems pretty clear that stuff like > http://myspace.com/#/segami/moc.koobecaf//:sptth This is the first time I've observed RTL display in Emacs. I don't see any way to detect the magic character that specifies it. (I am using a terminal as usual.) I think we need to provide a way to make them visible. Perhaps it should even be the default. Also, is there a way to disable bidi in the current buffer? If not, I think we need one. > But currently Emacs doesn't really have a mechanism for querying the > directionality of a buffer region, I think? I think we need to add this. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 9:38 ` Richard Stallman @ 2014-11-30 15:27 ` Eli Zaretskii 2014-12-01 10:17 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 15:27 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Sun, 30 Nov 2014 04:38:14 -0500 > From: Richard Stallman <rms@gnu.org> > Cc: emacs-devel@gnu.org > > > It seems pretty clear that stuff like > > > http://myspace.com/#/segami/moc.koobecaf//:sptth > > This is the first time I've observe RTL display in Emacs. I don't see > any way to detect the magic character that specifies it. That's because there isn't one, in the citation you provided. The original example was this: http://myspace.com/#/segami/moc.koobecaf//:sptth where there is a u+202e character at the rightmost (visual) edge of the line. If you move point with C-f from the beginning of that line, you should see it jump to the right edge of the line after the leading whitespace, and then continue to "advance backwards", i.e. to the left. You can search for this character by typing C-s C-x 8 RET 202e RET After typing this, you should see the offending character highlighted in some reddish background. > (I am using a terminal as usual.) These characters are by default displayed as spaces on a TTY, and as a very thin (1-pixel) space on GUI frames. > I think we need to provide a way to make them visible. We already have it: the glyphless-char-display char-table. > Perhaps it should even be the default. I don't think so: these controls should normally be all but invisible. The Unicode Standard actually recommends to remove them from display, but when I worked on the bidi display engine, I decided that removing characters by infrastructure is un-Emacsy, so I left them alone. Lisp programs and specialized major modes can make them invisible by using text properties, if they want. Making these controls visible by default will uglify the display for no good reason. 
These controls are perfectly valid in email messages with RTL text, for example. > Also, is there a way to disable bidi in the current buffer? > If not, I think we need one. There is a way, but it is not meant for Lisp programs, only for debugging the display engine. In any case, I don't think disabling display reordering is the right solution for the problem at hand. It's a cure that is worse than the disease, since Web pages and email messages with RTL text will be displayed incorrectly, and be almost illegible. > > But currently Emacs doesn't really have a mechanism for querying the > > directionality of a buffer region, I think? > > I think we need to add this. We are still discussing what that means, exactly. When we reach conclusions, we can start working on implementing whatever is needed. ^ permalink raw reply [flat|nested] 133+ messages in thread
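To make the point about hidden directional controls concrete, here is a minimal sketch in Python (an illustration only, not Emacs code and not how the display engine works) that lists the directional formatting characters found in a string. The set of code points and the function name are assumptions chosen for this example.

```python
import unicodedata

# Directional formatting characters from the Unicode Bidirectional
# Algorithm (UAX #9): the implicit marks plus the explicit controls.
BIDI_CONTROLS = {
    "\u200e", "\u200f",                                # LRM, RLM
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}

def find_bidi_controls(text):
    """Return (index, code point, Unicode name) for each bidi control in text."""
    return [
        (i, "U+%04X" % ord(ch), unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]

if __name__ == "__main__":
    # Hypothetical sample: a U+202E placed in front of a URL.
    for hit in find_bidi_controls("see \u202ehttp://example.org/"):
        print(hit)
```

Finding such a character says nothing about intent: as noted above, these controls are perfectly valid in messages that contain RTL text, so a hit is a reason to look closer, not proof of obfuscation.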
* Re: Bidirectional text and URLs 2014-11-30 15:27 ` Eli Zaretskii @ 2014-12-01 10:17 ` Richard Stallman 2014-12-01 16:17 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-01 10:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] We need to make Emacs safe and clear for users who don't know anything about bidi and don't want to. One idea: change the mode line color when there is any RTL text (in the buffer, or on the screen, whichever is easier). Another idea: make magic bidi characters visible by default. People who edit in RTL languages and get used to bidi could set a user option to make them invisible. > This is the first time I've observed RTL display in Emacs. I don't see > any way to detect the magic character that specifies it. That's because there isn't one, in the citation you provided. Yes there was -- you said so yourself: > where there is a u+202e character The point is that I could not tell what it was, or where it was, or anything about it, from my ordinary Emacs commands -- even though I knew I was observing RTL text display and that some magic bidi character was probably the reason for it. Plenty of users wouldn't even know that much. at the rightmost (visual) edge of > the line. If you move point with C-f from the beginning of that line, > you should see it jump to the right edge of the line after the leading > whitespace, and then continue to "advance backwards", i.e. to the left. Yes, I observed that strange behavior. As I said, it was the first time I saw Emacs's bidi display functionality actually operate. But I could not tell how to detect the presence of the magic character directly. I could see the bidi effect, but I could not tell what was causing it. 
> These characters are by default displayed as spaces on a TTY, and as a > very thin (1-pixel) space on GUI frames. > > I think we need to provide a way to make them visible. > We already have it: the glyphless-char-display char-table. We need a convenient _user-level_ feature to make them visible. > I don't think so: these controls should normally be all but invisible. We need to make it easy to see them. Otherwise people can't tell why strangeness is happening on their screens. > > Also, is there a way to disable bidi in the current buffer? > > If not, I think we need one. > There is a way, but it is not meant for Lisp programs, only for > debugging the display engine. It needs to be made convenient for users. Especially for users who never use bidi. You use an RTL language, so you see bidi text often and it doesn't surprise you. When you see it, you know what is going on. You know what in the buffer is likely to cause what visual results. I don't speak any RTL language (and those characters won't display on this tty anyway). So I never see bidi at work, or at least not in a way I would notice. I get mail that might be in Arabic script, but that's just a guess. The messages are spam, so I delete them. Even so, I am more knowledgeable about bidi than most Emacs users. I once read the Unicode bidi rules, I just don't remember them. I think most Emacs users have even less knowledge of this issue. We need to make Emacs safe and clear for users who don't know anything about bidi and don't want to. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 10:17 ` Richard Stallman @ 2014-12-01 16:17 ` Eli Zaretskii 2014-12-02 14:42 ` Richard Stallman 2014-12-02 14:42 ` Richard Stallman 0 siblings, 2 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 16:17 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Mon, 01 Dec 2014 05:17:58 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > We need to make Emacs safe and clear for users who don't know anything > about bidi and don't want to. I think we are in violent agreement here. The question is how to do that, not whether or not to do it. > One idea: change the mode line color when there is any RTL text > (in the buffer, or on the screen, whichever is easier). That's possible, but I think it's too drastic. Just having RTL text doesn't yet constitute any danger or require special vigilance on the part of the user, even if she doesn't want to know anything about bidi, let alone if she does. And of course, the display engine only examines the visible portion of the buffer and sometimes a small region above and below, so it cannot really tell what's in the rest of the buffer. OTOH, we have indications on the mode line, such as "(DOS)", which users in the past said they didn't pay attention to. My conclusion from that is that mode-line indication is only effective when we know users will look at the mode line at the right moment. > Another idea: make magic bidi characters visible by default. People > who edit in RTL languages and get used to bidi could set a user option > to make them invisible. This is both possible and easy, we already have infrastructure for this. Not sure it's enough, though: the reordering effect on URLs, like in the example that started this thread, will still be there, and seeing the actual URL where the link will take the user if clicked upon will still not be easy enough, IMO. > > This is the first time I've observed RTL display in Emacs. 
I don't see > > any way to detect the magic character that specifies it. > > That's because there isn't one, in the citation you provided. > > Yes there was -- you said so yourself: > > > where there is a u+202e character There was no such character in your mail, only in the one sent by Lars. So I assumed you somehow lost it. My bad. > > > I think we need to provide a way to make them visible. > > > We already have it: the glyphless-char-display char-table. > > We need a convenient _user-level_ feature to make them visible. We have glyphless-char-display-control, which is a defcustom. If that is still too technical, we can have a minor mode to set that for these directional controls, or maybe just for some subset of them (most of them cannot cause such disastrous effects on display). > > I don't think so: these controls should normally be all but invisible. > > We need to make it easy to see them. Otherwise people can't tell why > strangeness is happening on their screens. I think we should prefer making them visible only in the context where they could cause harm. Making them visible everywhere could be an annoyance. > > > Also, is there a way to disable bidi in the current buffer? > > > If not, I think we need one. > > > There is a way, but it is not meant for Lisp programs, only for > > debugging the display engine. > > It needs to be made convenient for users. I don't think this is needed. People who don't read and don't understand about bidi will not find it useful, because they cannot read text affected by the reordering anyway, regardless of its order. This could only help in the rare situations such as the one discussed here. But in those cases, I think we all agree that Emacs should detect them and act on them automatically; passing the buck to the user would be a mistake on our part. 
But even if I'd agree with you, making a convenient and reliable way of going back to unidirectional display of Emacs 23 and before would require a lot of work, because the current display engine no longer supports unidirectional display without reordering, at least not reliably. The old unidirectional code was left in some of the places, either as a debugging aid or for special corner cases, like unibyte buffers. In other places, the code was simply rewritten to work only through the reordering engine, and the old code no longer exists. For example, display strings and overlay strings are rendered exclusively by the reordering engine. IOW, the unidirectional display code is for all practical purposes gone; what's left is not reliable enough for users to use it. So we simply cannot turn reordering off and get an otherwise identical Emacs. > I don't speak any RTL language (and those characters won't display on > this tty anyway). So I never see bidi at work, or at least not in a > way I would notice. I get mail that might be in Arabic script, but > that's just a guess. The messages are spam, so I delete them. Once again, the only dangerous situation with bidi we are aware of is the one that started this thread: a malicious use of directional overrides that changes the visual appearance of what is otherwise strict left-to-right text. Let's concentrate on solving this rather unique situation. It is IMO wrong to try to generalize these rare cases into a view that bidi reordering is somehow a menace that users need to turn off every now and then; it isn't. > We need to make Emacs safe and clear for users who don't know anything > about bidi and don't want to. Again, I think we are in violent agreement here. The question is how to do that, not whether or not to do it. But disabling bidi is not the way. Several useful ideas were raised in this discussion. I suggest that we implement some of them and see if they are enough. 
^ permalink raw reply [flat|nested] 133+ messages in thread
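The dangerous case singled out above (a directional override changing the appearance of otherwise strict left-to-right text) can be sketched as a simple scan. The following is a simplified illustration in Python, not the Emacs implementation: it assumes that tracking only the explicit embedding and override controls is enough, ignores isolates and the depth limits of UAX #9, and uses a hypothetical function name.

```python
import unicodedata

RLO, LRO = "\u202e", "\u202d"   # right-to-left / left-to-right override
LRE, RLE = "\u202a", "\u202b"   # embeddings (these reset the override status)
PDF = "\u202c"                  # pop directional formatting

def overridden_ltr_positions(text):
    """Return indexes of strong left-to-right characters that sit inside
    a right-to-left override and would therefore be displayed reversed."""
    stack = []  # one entry per open embedding or override; True means RLO
    hits = []
    for i, ch in enumerate(text):
        if ch == RLO:
            stack.append(True)
        elif ch in (LRO, LRE, RLE):
            stack.append(False)
        elif ch == PDF:
            if stack:
                stack.pop()
        elif ch == "\n":
            stack.clear()  # overrides do not extend past the paragraph
        elif stack and stack[-1] and unicodedata.bidirectional(ch) == "L":
            hits.append(i)
    return hits

if __name__ == "__main__":
    # Hypothetical sample: "def" is forced to display right to left.
    print(overridden_ltr_positions("abc\u202edef\u202cghi"))
```

A check of this shape warns only when Latin-script text is actually forced into reverse order. Ordinary RTL text, including Hebrew or Arabic letters inside an override, produces no hits, which is in line with narrowing warnings to the harmful cases.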
* Re: Bidirectional text and URLs 2014-12-01 16:17 ` Eli Zaretskii @ 2014-12-02 14:42 ` Richard Stallman 2014-12-02 14:48 ` Eli Zaretskii 2014-12-02 14:42 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-02 14:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > One idea: change the mode line color when there is any RTL text > > (in the buffer, or on the screen, whichever is easier). > That's possible, but I think it's too drastic. Just having RTL text > doesn't yet constitute any danger or require special vigilance on the > part of the user, It requires special vigilance if the user isn't expecting it! I am not saying that RTL per se is dangerous. I'm suggesting we should warn users very visibly about RTL text if they don't normally use it and are perhaps not expecting it. Changing the color of the mode line was my first idea. Another idea is to display "This buffer contains right-to-left text\n\n" at the start of the buffer. People like you who are accustomed to RTL editing would set a flag to disable those messages. > > Another idea: make magic bidi characters visible by default. People > > who edit in RTL languages and get used to bidi could set a user option > > to make them invisible. > This is both possible and easy, we already have infrastructure for > this. Not sure it's enough, though: I don't think it is enough by itself. We should continue with the other proposed measures too. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:42 ` Richard Stallman @ 2014-12-02 14:48 ` Eli Zaretskii 2014-12-03 8:38 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-02 14:48 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Tue, 02 Dec 2014 09:42:38 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > > One idea: change the mode line color when there is any RTL text > > > (in the buffer, or on the screen, whichever is easier). > > > That's possible, but I think it's too drastic. Just having RTL text > > doesn't yet constitute any danger or require special vigilance on the > > part of the user, > > It requires special vigilance if the user isn't expecting it! > > I am not saying that RTL per se is dangerous. I'm suggesting we > should warn users very visibly about RTL text if they don't > normally use it and are perhaps not expecting it. We don't know if this particular user normally uses RTL. We could introduce an option through which users could tell us that they want such warnings. But in general, things that are not dangerous don't warrant a warning. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:48 ` Eli Zaretskii @ 2014-12-03 8:38 ` Richard Stallman 2014-12-03 11:56 ` Nicolas Richard 2014-12-03 17:38 ` Eli Zaretskii 0 siblings, 2 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-03 8:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I am not saying that RTL per se is dangerous. I'm suggesting we > > should warn users very visibly about RTL text if they don't > > normally use it and are perhaps not expecting it. > We don't know if this particular user normally uses RTL. We could > introduce an option through which users could tell us that they want > such warnings. Exactly. If we introduce a variable to set if you use RTL text, we will know who normally uses RTL text. But in general, things that are not dangerous don't > warrant a warning. RTL is dangerous in SOME CASES, and that's enough reason to warn about it. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 8:38 ` Richard Stallman @ 2014-12-03 11:56 ` Nicolas Richard 2014-12-03 17:12 ` Richard Stallman 2014-12-03 17:38 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Nicolas Richard @ 2014-12-03 11:56 UTC (permalink / raw) To: Richard Stallman; +Cc: Eli Zaretskii, larsi, emacs-devel

Richard Stallman <rms@gnu.org> writes:

> RTL is dangerous in SOME CASES, and that's enough reason to warn
> about it.

IMO this implies that RTL users are the ones that should adjust to the
"normal" LTR world. While it may reflect the current state of the
world, I don't think it's the right thing to do.

--
Nicolas

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 11:56 ` Nicolas Richard @ 2014-12-03 17:12 ` Richard Stallman 0 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-03 17:12 UTC (permalink / raw) To: Nicolas Richard; +Cc: eliz, larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> IMO this implies that RTL users are the ones that should adjust to the
> "normal" LTR world. While it may reflect the current state of the
> world, I don't think it's the right thing to do.

This is not a symbolic gesture. It's a matter of practicality.

There are already warnings and notifications that Emacs gives by
default, and that you can turn off by setting a flag. This would be
one more kind.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 8:38 ` Richard Stallman 2014-12-03 11:56 ` Nicolas Richard @ 2014-12-03 17:38 ` Eli Zaretskii 2014-12-04 14:30 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-03 17:38 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel

> Date: Wed, 03 Dec 2014 03:38:59 -0500
> From: Richard Stallman <rms@gnu.org>
> CC: larsi@gnus.org, emacs-devel@gnu.org
>
> > But in general, things that are not dangerous don't
> > warrant a warning.
>
> RTL is dangerous in SOME CASES, and that's enough reason to warn
> about it.

My point is that we should try to narrow down the cases where we issue
a warning, ideally only to those SOME CASES where they can actually be
harmful.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 17:38 ` Eli Zaretskii @ 2014-12-04 14:30 ` Richard Stallman 2014-12-04 15:53 ` Stefan Monnier 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-04 14:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> My point is that we should try to narrow down the cases where we issue
> a warning, ideally only to those SOME CASES where they can actually be
> harmful.

I agree that we should do this. But it is also useful to warn users
that a buffer contains RTL text when they don't expect any.

If the buffer is all RTL text, the user will see that, and none of it
will make sense to him anyway. So no warning is needed.

But if the buffer is mostly ordinary LTR text, but has a little RTL
text in it, the non-bidi user will probably not notice that and could
get fooled. That is the case for which I think a warning is useful.

But there is no harm in giving the warning in both of these cases.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-04 14:30 ` Richard Stallman @ 2014-12-04 15:53 ` Stefan Monnier 2014-12-04 17:30 ` Eli Zaretskii 2014-12-04 20:25 ` Paul Eggert 0 siblings, 2 replies; 133+ messages in thread From: Stefan Monnier @ 2014-12-04 15:53 UTC (permalink / raw) To: Richard Stallman; +Cc: Eli Zaretskii, larsi, emacs-devel

> But if the buffer is mostly ordinary LTR text, but has a little RTL
> text in it, the non-bidi user will probably not notice that and could
> get fooled. That is the case for which I think a warning is useful.

When I see a bit of Hebrew text in a buffer, I wouldn't know if it's
displayed L2R or R2L and either way wouldn't make any difference to me,
so I'm definitely not "fooled". This happens reasonably often, and
I wouldn't want to be "warned" that there's some R2L script in my
buffer, since I can see it plainly since the characters are different
anyway.

The problematic case that started this thread was because strongly L2R
characters were displayed in R2L fashion because of their context.
And *that* is indeed a problem, because there was no obvious visual
clue: the reversed chars were all Latin chars.

So, if we want to emit a warning, it should not be when "there's some
R2L text in an L2R context" but only when L2R characters end up laid
out in R2L because of the context.

I'm not familiar enough with bidi uses to know for sure whether such
"forced wrong-way layout" is something that can occur regularly in
normal/legitimate situations, but at least it's something that would
fool me every time, so I think a warning would be OK for those cases.

Stefan

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-04 15:53 ` Stefan Monnier @ 2014-12-04 17:30 ` Eli Zaretskii 2014-12-04 20:25 ` Paul Eggert 1 sibling, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-04 17:30 UTC (permalink / raw) To: Stefan Monnier; +Cc: larsi, rms, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>, larsi@gnus.org, emacs-devel@gnu.org
> Date: Thu, 04 Dec 2014 10:53:29 -0500
>
> The problematic case that started this thread was because strongly L2R
> characters were displayed in R2L fashion because of their context.
> And *that* is indeed a problem, because there was no obvious visual
> clue: the reversed chars were all Latin chars.

We now have a primitive that can be used to detect such regions in a
buffer. So we can implement a warning in those cases.

> So, if we want to emit a warning, it should not be when "there's some
> R2L text in an L2R context" but only when L2R characters end up laid out in
> R2L because of the context.

And likewise with R2L characters that end up displayed left to right
(although the target audience for this would be much smaller).

> I'm not familiar enough with bidi uses to know for sure whether such
> "forced wrong-way layout" is something that can occur regularly in
> normal/legitimate situations

There's no reason for it to occur regularly. Its main purpose is to
satisfy very specific and rare circumstances, like when you need to
show R2L text in logical order (e.g., for didactic reasons), or force
punctuation characters to display in a particular visual order.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-04 15:53 ` Stefan Monnier 2014-12-04 17:30 ` Eli Zaretskii @ 2014-12-04 20:25 ` Paul Eggert 1 sibling, 0 replies; 133+ messages in thread From: Paul Eggert @ 2014-12-04 20:25 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel

On 12/04/2014 07:53 AM, Stefan Monnier wrote:
> So, if we want to emit a warning, it should not be when "there's some
> R2L text in an L2R context" but only when L2R characters end up laid out in
> R2L because of the context.

How about if we reverse the letters as well as issue a warning? That
is, instead of merely displaying "ces" for a reversed "sec", we also
display the individual characters reversed (so it would display like
"ↄɘƨ"). On a graphical display we should be able to do that
reasonably well, and it'd be a strong visual cue.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 16:17 ` Eli Zaretskii 2014-12-02 14:42 ` Richard Stallman @ 2014-12-02 14:42 ` Richard Stallman 2014-12-02 14:52 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-02 14:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > We need a convenient _user-level_ feature to make them visible.

> We have glyphless-char-display-control, which is a defcustom. If that
> is still too technical, we can have a minor mode to set that for these
> directional controls,

A minor mode would be convenient enough for non-wizard users.

> I think we should prefer making them visible only in the context where
> they could cause harm. Making them visible everywhere could be an
> annoyance.

It would only be an annoyance for users who really use bidi,
and they would turn it off so it would not annoy them again.

> But even if I'd agree with you, making a convenient and reliable way
> of going back to unidirectional display of Emacs 23 and before would
> require a lot of work, because the current display engine no longer
> supports unidirectional display without reordering, at least not
> reliably.

It is easy to make an option turn off bidi processing.
All it has to do is make all characters seem LTR.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:42 ` Richard Stallman @ 2014-12-02 14:52 ` Eli Zaretskii 2014-12-02 18:05 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-02 14:52 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel

> Date: Tue, 02 Dec 2014 09:42:42 -0500
> From: Richard Stallman <rms@gnu.org>
> CC: larsi@gnus.org, emacs-devel@gnu.org
>
> > > We need a convenient _user-level_ feature to make them visible.
>
> > We have glyphless-char-display-control, which is a defcustom. If that
> > is still too technical, we can have a minor mode to set that for these
> > directional controls,
>
> A minor mode would be convenient enough for non-wizard users.
>
> > I think we should prefer making them visible only in the context where
> > they could cause harm. Making them visible everywhere could be an
> > annoyance.
>
> It would only be an annoyance for users who really use bidi,
> and they would turn it off so it would not annoy them again.

But even users who do use bidi would like to be warned when these
controls are part of potential URL phishing. So there's a
contradiction here, at least for those users: they would like a
warning when these controls could be harmful, but would like to avoid
the warning when they aren't.

> > But even if I'd agree with you, making a convenient and reliable way
> > of going back to unidirectional display of Emacs 23 and before would
> > require a lot of work, because the current display engine no longer
> > supports unidirectional display without reordering, at least not
> > reliably.
>
> It is easy to make an option turn off bidi processing.
> All it has to do is make all characters seem LTR.

That doesn't disable reordering, it just makes the results
indistinguishable. Perhaps I don't understand what you want to do
with this option.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:52 ` Eli Zaretskii @ 2014-12-02 18:05 ` Eli Zaretskii 2014-12-03 17:13 ` Richard Stallman 2014-12-03 17:13 ` Richard Stallman 2014-12-03 17:13 ` Richard Stallman 2 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-02 18:05 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel

> Date: Tue, 02 Dec 2014 16:52:15 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: larsi@gnus.org, emacs-devel@gnu.org
>
> > It is easy to make an option turn off bidi processing.
> > All it has to do is make all characters seem LTR.
>
> That doesn't disable reordering, it just makes the results
> indistinguishable.

Actually, even this is not true: the directional overrides will still
have their effect. So deeper changes are needed to countermand that
as well.

And I still don't understand the purpose of such a feature. Users who
cannot read RTL won't be able to understand the text either way, and
don't know what is "the right" display to make any sense out of what
will be presented when bidi processing is "turned off".

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 18:05 ` Eli Zaretskii @ 2014-12-03 17:13 ` Richard Stallman 2014-12-03 18:14 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-03 17:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > > It is easy to make an option turn off bidi processing.
> > > All it has to do is make all characters seem LTR.
> >
> > That doesn't disable reordering, it just makes the results
> > indistinguishable.

> Actually, even this is not true: the directional overrides will still
> have their effect. So deeper changes are needed to countermand that
> as well.

It should not be hard for the same flag to tell the code not to
recognize those characters.

> And I still don't understand the purpose of such a feature. Users who
> cannot read RTL won't be able to understand the text either way, and
> don't know what is "the right" display to make any sense out of what
> will be presented when bidi processing is "turned off".

One use of disabling bidi is that you'll see what the strange URL
really consists of.

And likewise any other texts that involve bidi: you'll see what the
real sequence of characters is.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 17:13 ` Richard Stallman @ 2014-12-03 18:14 ` Eli Zaretskii 2014-12-05 22:44 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-03 18:14 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel

> Date: Wed, 03 Dec 2014 12:13:04 -0500
> From: Richard Stallman <rms@gnu.org>
> CC: larsi@gnus.org, emacs-devel@gnu.org
>
> > Actually, even this is not true: the directional overrides will still
> > have their effect. So deeper changes are needed to countermand that
> > as well.
>
> It should not be hard for the same flag to tell the code not to
> recognize those characters.

The point is it's not just a change in some table. The code needs to
be changed as well, then tested, debugged, and maintained. Without a
good reason, that's just a waste of resources.

> > And I still don't understand the purpose of such a feature. Users who
> > cannot read RTL won't be able to understand the text either way, and
> > don't know what is "the right" display to make any sense out of what
> > will be presented when bidi processing is "turned off".
>
> One use of disabling bidi is that you'll see what the strange URL
> really consists of.

We already have a better solution for that, I just added yesterday the
infrastructure that enables such a solution. We can now stop talking
about the "reversed URL" case, it's a problem that is all but solved.

> And likewise any other texts that involve bidi: you'll see what the
> real sequence of characters is.

If it's the same case as with reversed URL, i.e. obfuscation by using
directional overrides, then the same solution will work there. If
it's something else, seeing RTL text in logical order will not help
anyone who doesn't already know how to read that text in its
reordered-for-display form.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-03 18:14 ` Eli Zaretskii @ 2014-12-05 22:44 ` Richard Stallman 2014-12-05 23:19 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-05 22:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > One use of disabling bidi is that you'll see what the strange URL
> > really consists of.

> We already have a better solution for that, I just added yesterday the
> infrastructure that enables such a solution.

Could you tell me what that solution is? I'm concerned that we
may be miscommunicating again.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-05 22:44 ` Richard Stallman @ 2014-12-05 23:19 ` Eli Zaretskii 2014-12-07 9:20 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-05 23:19 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel

> Date: Fri, 05 Dec 2014 17:44:37 -0500
> From: Richard Stallman <rms@gnu.org>
> CC: larsi@gnus.org, emacs-devel@gnu.org
>
> > > One use of disabling bidi is that you'll see what the strange URL
> > > really consists of.
>
> > We already have a better solution for that, I just added yesterday the
> > infrastructure that enables such a solution.
>
> Could you tell me what that solution is? I'm concerned that we
> may be miscommunicating again.

I meant this primitive:

  (bidi-find-overridden-directionality FROM TO &optional OBJECT)

  Return position between FROM and TO where directionality was overridden.

  This function returns the first character position in the specified
  region of OBJECT where there is a character whose `bidi-class' property
  is `L', but which was forced to display as `R' by a directional
  override, and likewise with characters whose `bidi-class' is `R'
  or `AL' that were forced to display as `L'.

  If no such character is found, the function returns nil.

  OBJECT is a Lisp string or buffer to search for overridden
  directionality, and defaults to the current buffer if nil or omitted.
  OBJECT can also be a window, in which case the function will search
  the buffer displayed in that window. Passing the window instead of
  a buffer is preferable when the buffer is displayed in some window,
  because this function will then be able to correctly account for
  window-specific overlays, which can affect the results.

  Strong directional characters `L', `R', and `AL' can have their
  intrinsic directionality overridden by directional override
  control characters RLO (u+202e) and LRO (u+202d). See the function
  `get-char-code-property' for a way to inquire about the `bidi-class'
  property of a character.
^ permalink raw reply [flat|nested] 133+ messages in thread
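For readers following along outside Emacs, the check that the docstring above describes can be approximated in a few lines of Python. This is only a rough sketch under stated assumptions, not the Emacs implementation: the function name `find_overridden_directionality` is made up for illustration, it works on a plain string with 0-based indices (Emacs buffer positions are 1-based), it tracks only LRE/RLE/LRO/RLO/PDF and ignores isolates, paragraph boundaries, overlays and display properties.

```python
import unicodedata

# Explicit directional formatting characters and the direction they force.
# Embeddings (LRE/RLE) open a new level but do NOT force a direction.
OPENERS = {
    "\u202a": None,  # LRE, LEFT-TO-RIGHT EMBEDDING
    "\u202b": None,  # RLE, RIGHT-TO-LEFT EMBEDDING
    "\u202d": "L",   # LRO, LEFT-TO-RIGHT OVERRIDE
    "\u202e": "R",   # RLO, RIGHT-TO-LEFT OVERRIDE
}
PDF = "\u202c"       # POP DIRECTIONAL FORMATTING


def find_overridden_directionality(text):
    """Return the 0-based index of the first strong character whose
    intrinsic direction is contradicted by an enclosing override,
    or None if there is no such character (simplified model)."""
    stack = []  # forced direction of each open embedding/override
    for index, char in enumerate(text):
        if char in OPENERS:
            stack.append(OPENERS[char])
        elif char == PDF:
            if stack:
                stack.pop()
        elif stack and stack[-1] is not None:
            forced = stack[-1]
            bidi_class = unicodedata.bidirectional(char)
            if forced == "R" and bidi_class == "L":
                return index
            if forced == "L" and bidi_class in ("R", "AL"):
                return index
    return None


if __name__ == "__main__":
    # Hypothetical sample: Latin text placed under an RLO.
    print(find_overridden_directionality("see \u202ehttp://example.org"))
```

As in the docstring, the result points at the first affected character, which here is the one right after the override control; ordinary Hebrew or Arabic text with no override yields None.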
* Re: Bidirectional text and URLs 2014-12-05 23:19 ` Eli Zaretskii @ 2014-12-07 9:20 ` Richard Stallman 2014-12-07 15:50 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-07 9:20 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > > We already have a better solution for that, I just added yesterday the
> > > infrastructure that enables such a solution.
> >
> > Could you tell me what that solution is? I'm concerned that we
> > may be miscommunicating again.

> I meant this primitive:

> (bidi-find-overridden-directionality FROM TO &optional OBJECT)

> Return position between FROM and TO where directionality was overridden.

This looks like a way to _test_ part of a buffer or string to see if
it has any bidi strangeness. Could you confirm?

If so, the question is: once you detect the strangeness, what then?

I suppose the next step is either an error message or a query.
In either case, I think we should show the user (1) what the text
looks like and (2) what's actually in it.

With your implementation of context-regeneration, we can show what
the text looks like.

How can we show what it really is?

Perhaps what we want is a suppress-bidi property, or a bidi property
that would specify the direction for certain text. These properties
would override all bidi attributes of the characters themselves.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-07 9:20 ` Richard Stallman @ 2014-12-07 15:50 ` Eli Zaretskii 2014-12-08 0:26 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-07 15:50 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel

> Date: Sun, 07 Dec 2014 04:20:31 -0500
> From: Richard Stallman <rms@gnu.org>
> Cc: larsi@gnus.org, emacs-devel@gnu.org
>
> > > > We already have a better solution for that, I just added yesterday the
> > > > infrastructure that enables such a solution.
> > >
> > > Could you tell me what that solution is? I'm concerned that we
> > > may be miscommunicating again.
>
> > I meant this primitive:
>
> > (bidi-find-overridden-directionality FROM TO &optional OBJECT)
>
> > Return position between FROM and TO where directionality was overridden.
>
> This looks like a way to _test_ part of a buffer or string to see if
> it has any bidi strangeness. Could you confirm?

Yes, that's the purpose of that primitive.

> If so, the question is: once you detect the strangeness, what then?

It's up to the application. Lars requested the above infrastructure
for eww, so I guess we will need to see what eww does to handle these
"reversed" URLs. It's possible that eww will need some further
assistance in that matter, in which case it should come up with the
requirements, and we (probably I) should implement whatever is needed.

> I suppose the next step is either an error message or a query.
> In either case, I think we should show the user (1) what the text
> looks like and (2) what's actually in it.
>
> With your implementation of context-regeneration, we can show what
> the text looks like.
>
> How can we show what it really is?

That's easy: copy the text without the directional override and
display it in some other buffer. The position returned by
bidi-find-overridden-directionality is of the 1st character following
the override control, so copying the text starting at that position
will exclude the override and avoid its effects.

The advantage of this method as compared to presenting the text
non-reordered (a.k.a. "disable bidi") is that the above method works
for RTL text that is similarly obfuscated by the LRO character,
whereas disabling bidi reordering will show RTL text in an order that
is very hard, sometimes impossible, to read correctly (it has the same
effect as showing words in reversed order to a user of a left-to-right
script).

> Perhaps what we want is a suppress-bidi property, or a bidi property
> that would specify the direction for certain text. These properties
> would override all bidi attributes of the characters themselves.

I think this won't be needed, but if it is, then it certainly can be
done.

^ permalink raw reply [flat|nested] 133+ messages in thread
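The "copy the text starting at that position" idea in the message above can be illustrated outside Emacs with a small Python sketch. The assumptions are loud ones: the helper name `text_after_first_override` is hypothetical, the input is a plain string rather than a buffer, only the first LRO/RLO is considered, nesting is ignored, and the copy stops at the next PDF if there is one.

```python
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def text_after_first_override(text):
    """Return the text governed by the first LRO/RLO, in storage
    (logical) order and without the control itself, or None when
    the text contains no override control."""
    for index, char in enumerate(text):
        if char in (LRO, RLO):
            tail = text[index + 1:]   # first character after the override
            end = tail.find(PDF)      # stop at the matching pop, if any
            return tail if end == -1 else tail[:end]
    return None


if __name__ == "__main__":
    # Hypothetical sample: the host name is stored reversed after an RLO,
    # so a bidi-aware display would show it as "example.com".
    sample = "click http://\u202emoc.elpmaxe\u202c now"
    print(text_after_first_override(sample))
```

Because the copied slice no longer contains the override, displaying it elsewhere shows the characters in the order they are actually stored, for LRO-obfuscated RTL text as well as RLO-obfuscated LTR text.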
* Re: Bidirectional text and URLs 2014-12-07 15:50 ` Eli Zaretskii @ 2014-12-08 0:26 ` Richard Stallman 2014-12-08 15:46 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-08 0:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > If so, the question is: once you detect the strangeness, what then?

> It's up to the application.

Alas, that's ducking the issue. We need to confront this issue.

> That's easy: copy the text without the directional override and
> display it in some other buffer. The position returned by
> bidi-find-overridden-directionality is of the 1st character following
> the override control, so copying the text starting at that position
> will exclude the override and avoid its effects.

That is the first magic bidi char, but there could be more.
It would be necessary to remove them all.

However, is simply removing them correct?

In general, do magic bidi characters get included in the URL that is
passed to the browser? I would expect so.

If so, a string which does not include them is inaccurate, and the
accurate thing to do is to include them and display them (perhaps in
hex) while suppressing their bidi effect.

Also, don't some RTL characters cause some normally LTR characters to
display RTL? That too could cause confusion, right?

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-08 0:26 ` Richard Stallman @ 2014-12-08 15:46 ` Eli Zaretskii 0 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-08 15:46 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel

> Date: Sun, 07 Dec 2014 19:26:33 -0500
> From: Richard Stallman <rms@gnu.org>
> CC: larsi@gnus.org, emacs-devel@gnu.org
>
> > > If so, the question is: once you detect the strangeness, what then?
>
> > It's up to the application.
>
> Alas, that's ducking the issue. We need to confront this issue.

We _are_ confronting it. We are methodically analyzing the issue
piecemeal, identifying the separate parts of it, and providing
solutions to each part as soon as it is well-defined and understood.

The problem we are dealing with is a very complex one. It involves
multiple disciplines: bidi reordering, URL construction and display,
Internet security, cultural differences, human perception of visual
cues, etc. Part of the solution should be in the infrastructure and
primitives, part on the application and UI level. Moreover, we are in
uncharted territory, with no prior art or standards to guide us.
Plus, we don't have any single individual on board who'd have a good
understanding of all the aspects of the problem.

When dealing with such hard issues, it is IME methodologically wrong
to charge ahead without a sufficiently clear definition and
understanding of each part of the problem and the alternatives for
their solutions.

We have now identified the first part: how to find the potentially
fraudulent URL, and we have a clear understanding of it. We have a
solution for that part of the problem that seems to satisfy the
requirements of the application programmer who brought up this issue.
The next step should be for the application to try using this
infrastructure to address the issue on the application and UI levels.
It is possible that such an attempt will result in feedback that will
require changes in the infrastructure, or some additional
functionality there. Or the application developers will decide that
this part of the problem is successfully solved, and will request
assistance in solving the next part, which will need to be defined in
clear terms. And so on and so forth -- we will break this complex
issue into individual parts and solve them one by one on the level
each part belongs to.

That's not "ducking the issue" in my book.

What you seem to expect is that we start coding solutions to problems
that are at best very vaguely defined, without any practical
experience to back that up, guided only by some intuition. IME, this
is a recipe for wrong solutions and for waste of time and energy. I
submit that there's no one around here, including myself, whose
intuition in this matter I would trust, because intuition is only
reliable when it is based on knowledge and experience in the subject
matter, and we don't have such individuals at our disposal. So I
don't see any reasons to rush into coding under the circumstances.

> > That's easy: copy the text without the directional override and
> > display it in some other buffer. The position returned by
> > bidi-find-overridden-directionality is of the 1st character following
> > the override control, so copying the text starting at that position
> > will exclude the override and avoid its effects.
>
> That is the first magic bidi char, but there could be more.

Inside the URL? Extremely unlikely, see below. In any case, the
presented use case didn't have them. I'd like to see a complete
solution for this simple use case, before we move to more complex ones
(if they exist).

> It would be necessary to remove them all.

I don't think it's a problem, not a likely one anyway. But if it is,
it should be almost trivial to use that primitive iteratively to
reconstruct the string with all the overrides removed.
> However, is simply removing them correct?

Yes, I think so.

> In general, do magic bidi characters get included in the URL that is
> passed to the browser? I would expect so.

Using the directional control characters as part of the URL is
forbidden by the relevant standards. The authorities that approve
domain names will reject them if they include such characters. So I
think URLs which include them will be non-existent, or at least very
rare. The use case which started this thread of discussion had the
control characters outside the URL itself, even outside the protocol
part of it.

> If so, a string which does not include them is inaccurate, and the
> accurate thing to do is to include them and display them (perhaps in
> hex) while suppressing their bidi effect.

Removing them and suppressing their effect give rise to the same
visual appearance, since these controls display as very thin spaces,
and thus are almost invisible on the screen. That's why this type of
fraud came into existence in the first place.

As for using hex, that was one alternative I suggested earlier in this
thread. It is still on the table, and doesn't require any
infrastructure changes to do its job. But people liked this proposal
less, so eventually I coded the primitive to find the spoofed
characters as a means for supporting other solutions.

> Also, don't some RTL characters cause some normally LTR characters to
> display RTL?

No. LTR characters always display left to right, unless overridden by
the RLO control (which simply makes every character act as an RTL
character).

^ permalink raw reply [flat|nested] 133+ messages in thread
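The two presentation options weighed in this exchange, removing the directional controls or keeping them but showing them in hex, are easy to compare side by side. The Python sketch below is only an illustration under stated assumptions: the function names and the `<U+XXXX>` notation are made up here, the set of controls is the usual list of Unicode bidi formatting characters, and none of this is what Emacs or eww actually does.

```python
# Unicode characters whose only job is to steer bidi reordering.
BIDI_CONTROLS = frozenset(
    "\u061c"                          # ALM, ARABIC LETTER MARK
    "\u200e\u200f"                    # LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
)


def reveal_bidi_controls(text):
    """Replace each bidi control with a visible <U+XXXX> marker, so the
    stored character sequence can be read with no reordering effect."""
    return "".join(
        f"<U+{ord(char):04X}>" if char in BIDI_CONTROLS else char
        for char in text
    )


def remove_bidi_controls(text):
    """Drop every bidi control, leaving the other characters in
    storage (logical) order."""
    return "".join(char for char in text if char not in BIDI_CONTROLS)


if __name__ == "__main__":
    # Hypothetical sample with an RLO ... PDF pair around reversed text.
    sample = "http://\u202emoc.elpmaxe\u202c/"
    print(reveal_bidi_controls(sample))
    print(remove_bidi_controls(sample))
```

The revealed form keeps the string accurate, which is the concern raised above, while the stripped form is shorter to read; which one a warning or query should show is a UI decision this sketch does not settle.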
* Re: Bidirectional text and URLs 2014-12-02 14:52 ` Eli Zaretskii 2014-12-02 18:05 ` Eli Zaretskii @ 2014-12-03 17:13 ` Richard Stallman 2014-12-03 17:13 ` Richard Stallman 2 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-03 17:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > It would only be an annoyance for users who really use bidi,
> > and they would turn it off so it would not annoy them again.

> But even users who do use bidi would like to be warned when these
> controls are part of potential URL phishing. So there's a
> contradiction here, at least for those users: they would like a
> warning when these controls could be harmful, but would like to avoid
> the warning when they aren't.

I agree we want other features to deal specifically with these
confusing URLs.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call.

^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:52 ` Eli Zaretskii 2014-12-02 18:05 ` Eli Zaretskii 2014-12-03 17:13 ` Richard Stallman @ 2014-12-03 17:13 ` Richard Stallman 2 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-03 17:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > It is easy to make an option turn off bidi processing. > > All it has to do is make all characters seem LTR. > That doesn't disable reordering, it just makes the results > indistinguishable. Perhaps I don't understand what you want to do > with this option. I think we are miscommunicating. If every character is considered to imply left-to-right, the ordering will be what it was before we had bidi support. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-29 8:22 ` Eli Zaretskii 2014-11-29 17:05 ` Richard Stallman @ 2014-11-29 17:14 ` Ted Zlatanov 2014-11-30 13:42 ` Stephen J. Turnbull 2 siblings, 0 replies; 133+ messages in thread From: Ted Zlatanov @ 2014-11-29 17:14 UTC (permalink / raw) To: emacs-devel On Sat, 29 Nov 2014 10:22:45 +0200 Eli Zaretskii <eliz@gnu.org> wrote: EZ> Once we decide which cases we want to avoid or flag, we could be smart EZ> there, by comparing the original and reordered strings, perhaps aided EZ> by some dictionary lookup. The infrastructure is either already there EZ> or easy to add. It's "just" a matter of deciding what to do and when. EZ> Someone(TM) should present a list of well-thought requirements, and we EZ> can take it from there. Well, here are the pieces I think will be useful for SHR and EWW. I don't claim they are well-thought :) Items 1-3 could be used through font-lock and just set some special text properties in the buffer in text modes that request it (so this will be an optional piece that is always available). Then themes and packages can add special highlighting or handling for those properties. 1) bring uni-confusables into the core. In regular expressions, support either a new syntax char class \s~ to mean "confusable" or a new character class [:confusable:] (or some other way to easily search for such characters, especially if they are used outside of their native script). Possible text property: 'uni-confusable 2) in regular expressions, support a new character class [:unicodemeta:] for any characters that have meta meaning in Unicode and no printable representation, from bidi markers to composition. I'm not sure if that's already possible. That will allow packages to detect these characters in places where they are not expected, e.g. inside URL buttons. Possible text property: 'uni-meta 3) make it easy in the core to scan the buffer for places where scripts are mixed in a single sentence, string, word, symbol, etc.
syntactic unit. markchars.el does that but only inside words. Possible text property: 'uni-mixedscripts 4) modify `browse-url' to intercept suspicious URLs where any of the above happened in the source buffer. I think the calling package will have to help set the context. I don't know if it can be automated... maybe the function could look for those special text properties around point in the buffer where it was invoked? 5) modify SHR/EWW to highlight these text properties and interrupt the user when the text or content of the URL button has them. Does that seem useful? Ted ^ permalink raw reply [flat|nested] 133+ messages in thread
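To make items 2 and 3 of the list above concrete, here is a minimal Python sketch of the two checks, written outside Emacs purely for illustration. The function names are hypothetical, "format character" is approximated by Unicode general category Cf, and the script of a letter is guessed from the first word of its Unicode name because Python's standard library has no script property. None of this is what the thread settled on; it only shows the kind of test a caller such as `browse-url' might run.

```python
import unicodedata


def format_chars(text):
    """Return (index, "U+XXXX") for each invisible format character.

    Assumption: general category Cf stands in for the proposed
    [:unicodemeta:] class; it covers the bidi controls among others.
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def scripts_in(text):
    """Rough script guess: first word of each letter's Unicode name.

    This is a heuristic, not the real Unicode Script property.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }


def looks_suspicious(url):
    """Flag a URL that hides format characters or mixes scripts."""
    return bool(format_chars(url)) or len(scripts_in(url)) > 1
```

A caller would treat a true result from `looks_suspicious` as a reason to ask for confirmation, not as proof of fraud, since legitimate internationalized URLs can also mix scripts.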
* Re: Bidirectional text and URLs 2014-11-29 8:22 ` Eli Zaretskii 2014-11-29 17:05 ` Richard Stallman 2014-11-29 17:14 ` Ted Zlatanov @ 2014-11-30 13:42 ` Stephen J. Turnbull 2014-11-30 15:36 ` Eli Zaretskii 2014-12-01 10:18 ` Richard Stallman 2 siblings, 2 replies; 133+ messages in thread From: Stephen J. Turnbull @ 2014-11-30 13:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel Eli Zaretskii writes: > I agree, but the issue discussed here is different: I have to disagree. The issue is about *any* technology that can be used to convince the user that one URL is being accessed when in fact another one is. Whether one should try to warn the user is a separate question, which depends on the probabilities of legitimate vs. fraudulent displays, and the cost of annoyance vs the *avoidable* cost to fraud victims. Unfortunately, the HCI evidence suggests that few potential victims listen to warnings (or even understand them), so you're probably right that it's a bad idea to warn if RTL characters are present. > detecting only the enclosed-LTR case is better than nothing, I > think. Agreed. > > and of course any jumble is possible as a domain or path component > > which is an abbreviation. And any useful jumble can probably be > > registered as a domain, and certainly incorporated in a path. > > I doubt that a domain like this could be registered, as using such > characters in a domain name is AFAIU against the regulations, see > RFC3987. If you mean the controls, you're probably right, although RFC3987 has been updated for international domain names. I suppose those controls are not permitted, though. > The easy cases with RTL text, as mentioned above, should be also > easily detectable, and I agree they should get the same treatment. OK, good enough for me. > > "We need to decide what we want to do, and then look for a mechanism." > > OK, let me rephrase: what effect will "turning off" have on > display? 
Whatever the display would be in the absence of an attempt to detect and warn about instances of possibly fraudulent use of directional controls. > I very much hope we will find a sane middle ground, possibly subject > to user control. I'd hate to see Emacs become another case of the TSA > disaster. The best I've been able to come up with given the unfortunate conflict between UAX#9 and the "normal" display of URLs as I understand it is a one-off warning (or use of something like the novice mechanism so the user can easily "turn it off" as defined above as soon as it becomes annoying -- I expect your judgment to be that it would *always* be annoying, just mentioning the possibility for completeness). > Someone(TM) should present a list of well-thought requirements, and we > can take it from there. Unfortunately, besides LTR in RTL control, and RTL in LTR control, I can't help, not being familiar with the expected display. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 13:42 ` Stephen J. Turnbull @ 2014-11-30 15:36 ` Eli Zaretskii 2014-12-01 10:18 ` Richard Stallman 1 sibling, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 15:36 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: larsi, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: larsi@gnus.org, > emacs-devel@gnu.org > Date: Sun, 30 Nov 2014 22:42:18 +0900 > > Eli Zaretskii writes: > > > I agree, but the issue discussed here is different: > > I have to disagree. The issue is about *any* technology that can be > used to convince the user that one URL is being accessed when in fact > another one is. Well, I thought "bidirectional" in the subject does mean just that. > Whether one should try to warn the user is a separate question, which > depends on the probabilities of legitimate vs. fraudulent displays, > and the cost of annoyance vs the *avoidable* cost to fraud victims. I don't think the probability of legitimate vs fraudulent displays is so low that it justifies the annoyance. > > > "We need to decide what we want to do, and then look for a mechanism." > > > > OK, let me rephrase: what effect will "turning off" have on > > display? > > Whatever the display would be in the absence of an attempt to detect > and warn about instances of possibly fraudulent use of directional > controls. Sorry, couldn't parse this. > > Someone(TM) should present a list of well-thought requirements, and we > > can take it from there. > > Unfortunately, besides LTR in RTL control, and RTL in LTR control, I > can't help, not being familiar with the expected display. Maybe we should simply start with that, and take it from there if needed. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 13:42 ` Stephen J. Turnbull 2014-11-30 15:36 ` Eli Zaretskii @ 2014-12-01 10:18 ` Richard Stallman 2014-12-01 16:18 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-01 10:18 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: eliz, larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I agree, but the issue discussed here is different: > I have to disagree. The issue is about *any* technology that can be > used to convince the user that one URL is being accessed when in fact > another one is. In general, yes, but at present we're looking at two specific cases of that. They may need different solutions. 1. There are magic bidi characters inside the URL. 2. The bidi context of the URL could cause the URL to appear strangely even though the URL itself does not contain any magic bidi characters. Mixing up these two cases has caused a lot of confusion in this discussion. Things said about one of them were mistakenly applied to the other, resulting in nonsense. I proposed checking the URL for bidi magic, for case 1, and someone interpreted the suggestion based on case 2 and said it would be ineffective. For case 2 I proposed the user could insert newlines around the URL to see what it really says. Someone replied that this would be ineffective because he interpreted it based on case 1. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 10:18 ` Richard Stallman @ 2014-12-01 16:18 ` Eli Zaretskii 2014-12-01 18:32 ` Stephen J. Turnbull 2014-12-02 14:42 ` Richard Stallman 0 siblings, 2 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 16:18 UTC (permalink / raw) To: rms; +Cc: stephen, larsi, emacs-devel > Date: Mon, 01 Dec 2014 05:18:07 -0500 > From: Richard Stallman <rms@gnu.org> > CC: eliz@gnu.org, larsi@gnus.org, emacs-devel@gnu.org > > 1. There are magic bidi characters inside the URL. By "magic bidi characters" do you mean printable characters from RTL scripts, or do you mean the directional controls? (RTL characters are also "magic" in some sense, because they might cause reordering of surrounding text, e.g. if it contains numerical characters.) > 2. The bidi context of the URL could cause the URL to appear strangely > even though the URL itself does not contain any magic bidi characters. > > Mixing up these two cases has caused a lot of confusion in this > discussion. Things said about one of them were mistakenly applied to > the other, resulting in nonsense. > > I proposed checking the URL for bidi magic, for case 1, and someone > interpreted the suggestion based on case 2 and said it would be > ineffective. I, for one, don't understand how such a check would help us. As I wrote elsewhere, at least some parts of a legitimate URL can include such characters, and we shouldn't treat those as suspicious. Maybe you are talking only about some parts of the URL, like the host and the domain. > For case 2 I proposed the user could insert newlines around the URL to > see what it really says. Someone replied that this would be > ineffective because he interpreted it based on case 1. I think it's impractical to insert newlines before and after each URL. It will make Web pages and HTML mail all but illegible, because modern Web text includes URLs in the normal flow of text, which will be interrupted by these newlines.
We might do that for URLs where we detect an attempt at spoofing/phishing, but once those are detected, there are better methods to undo the effects of phishing. They were suggested earlier in this thread, let me reiterate the alternatives: . modify the way the relevant directional controls are displayed to make them prominently apparent . allow the user to request a temporary display of the URL in its original logical order, before the reordering, or maybe do that automatically in a tooltip . replace the relevant directional controls with percent-hex encoded representation, which will as a result disable the reordering . cover the relevant directional controls with a display property (e.g., with a display string " "), which will also disable reordering Let's pick one of these alternatives and use it, or maybe allow users to choose any one of them. ^ permalink raw reply [flat|nested] 133+ messages in thread
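The third alternative above (replacing directional controls with a percent-hex representation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `escape_bidi_controls` is hypothetical, the set of controls is one reasonable reading of "the relevant directional controls", and percent-encoding the UTF-8 bytes is an assumed choice of "percent-hex" form that the message does not spell out.

```python
from urllib.parse import quote

# Assumed set of directional controls: LRM, RLM, ALM, the embedding and
# override controls (LRE, RLE, PDF, LRO, RLO) and the isolates
# (LRI, RLI, FSI, PDI).
BIDI_CONTROLS = frozenset(
    "\u200e\u200f\u061c"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)


def escape_bidi_controls(text):
    """Replace each directional control with its percent-encoded UTF-8 bytes.

    Everything else is left untouched, so the visible text keeps its
    logical order and the controls become plainly visible.
    """
    return "".join(quote(ch) if ch in BIDI_CONTROLS else ch for ch in text)
```

Because the escaped form contains only ASCII in place of each control, the reordering those controls would have triggered no longer happens, which is the effect the alternative is after.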
* Re: Bidirectional text and URLs 2014-12-01 16:18 ` Eli Zaretskii @ 2014-12-01 18:32 ` Stephen J. Turnbull 2014-12-01 19:12 ` Eli Zaretskii 2014-12-02 14:42 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: Stephen J. Turnbull @ 2014-12-01 18:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, rms, emacs-devel Eli Zaretskii writes: > . modify the way the relevant directional controls are displayed to > make them prominently apparent -0 I don't think this will help enough, especially for the users who would most benefit from Emacs's automated paranoia (ie, those who read bidi but not RFCs). > . allow the user to request a temporary display of the URL in its > original logical order, before the reordering, or maybe do that > automatically in a tooltip +1 for the tooltip, with url-encoding for format characters, which are non-conforming to RFC 3987 anyway. Note that RFC 3987 specifies that bidirectional IRIs must *always* be displayed with the UBA, and as if in an LRE embedding. I'm not sure how you would enforce it, but I believe this would defang larsi's example (ie, at the start of the URI proper in logical order insert a LRE, and at the end a PDF -- any directional format characters between those points are nonconforming to RFC 3987, section 4.1, last paragraph). > . replace the relevant directional controls with percent-hex encoded > representation, which will as result disable the reordering -1 If they're outside of the IRI, this will just make things ugly. If they're inside the IRI, they're non-conforming and therefore bogus, and would be caught by the tooltip. > . cover the relevant directional controls with a display property > (e.g., with a display string " "), which will also disable > reordering -0 This is just a specific implementation of the first option above, right? ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 18:32 ` Stephen J. Turnbull @ 2014-12-01 19:12 ` Eli Zaretskii 2014-12-01 20:08 ` Stephen J. Turnbull 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 19:12 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: larsi, rms, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: rms@gnu.org, > larsi@gnus.org, > emacs-devel@gnu.org > Date: Tue, 02 Dec 2014 03:32:07 +0900 > > Eli Zaretskii writes: > > > . modify the way the relevant directional controls are displayed to > > make them prominently apparent > > -0 I don't think this will help enough, especially for the users who > would most benefit from Emacs's automated paranoia (ie, those who read > bidi but not RFCs). This alternative makes the least changes on display. > Note that RFC 3987 specifies that bidirectional IRIs must *always* be > displayed with the UBA, and as if in an LRE embedding. I'm not sure > how you would enforce it, but I believe this would defang larsi's > example (ie, at the start of the URI proper in logical order insert a > LRE, and at the end a PDF -- any directional format characters between > those points are nonconforming to RFC 3987, section 4.1, last > paragraph). Using an LRE..PDF embedding is a possibility, but it can be defeated: the UBA mandates that any embeddings above some predefined fixed depth are to be ignored. So a malicious code could insert a large enough number of RLOs such that any LRE would be ignored. That's one of the reasons why I prefer not to poke the text with additional directional controls. > > . replace the relevant directional controls with percent-hex encoded > > representation, which will as result disable the reordering > > -1 If they're outside of the IRI, this will just make things ugly. Ugly, yes. But if these cases are sufficiently rare, that ugliness is useful, I think, as it will attract attention. 
> If they're inside the IRI, they're non-conforming and therefore bogus, > and would be caught by the tooltip. Yes, but tooltips could be overlooked (or even disabled globally by the user). > > . cover the relevant directional controls with a display property > > (e.g., with a display string " "), which will also disable > > reordering > > -0 This is just a specific implementation of the first option above, right? No, it also disables reordering, whereas the first one doesn't. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 19:12 ` Eli Zaretskii @ 2014-12-01 20:08 ` Stephen J. Turnbull 2014-12-01 20:42 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Stephen J. Turnbull @ 2014-12-01 20:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, rms, emacs-devel Eli Zaretskii writes: > > Note that RFC 3987 specifies that bidirectional IRIs must *always* be > > displayed with the UBA, and as if in an LRE embedding. I'm not sure > > how you would enforce it, but I believe this would defang larsi's > > example (ie, at the start of the URI proper in logical order insert a > > LRE, and at the end a PDF -- any directional format characters between > > those points are nonconforming to RFC 3987, section 4.1, last > > paragraph). > > Using an LRE..PDF embedding is a possibility, but it can be defeated: > the UBA mandates that any embeddings above some predefined fixed depth > are to be ignored. So a malicious code could insert a large enough > number of RLOs such that any LRE would be ignored. Note that RFC 3987 is a MUST, and OTOH does not specify an implementation (probably precisely because of the nesting issue). > That's one of the reasons why I prefer not to poke the text with > additional directional controls. You don't need to poke them into the text. You just MUST display IRIs "as if" there were an effective embedding. I'm aware of the GNU mantra "standards are sometimes not a terrible idea -- but only sometimes". But in this case I think conformance is a very good idea. > > If they're inside the IRI, they're non-conforming and therefore bogus, > > and would be caught by the tooltip. > > Yes, but tooltips could be overlooked (or even disabled globally by > the user). 
I think for the cases we've identified so far (LTR-only text in a RTL context, RTL-only text in an LTR context, and directional controls embedded in an IRI) you probably want to require the user who clicks on them to confirm that they want to follow this misleading link, anyway. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 20:08 ` Stephen J. Turnbull @ 2014-12-01 20:42 ` Eli Zaretskii 0 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 20:42 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: larsi, rms, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: larsi@gnus.org, > rms@gnu.org, > emacs-devel@gnu.org > Date: Tue, 02 Dec 2014 05:08:11 +0900 > > > That's one of the reasons why I prefer not to poke the text with > > additional directional controls. > > You don't need to poke them into the text. You just MUST display IRIs > "as if" there were an effective embedding. We don't (yet) have the machinery to do that, except by inserting an LRE. > I think for the cases we've identified so far (LTR-only text in a RTL > context, RTL-only text in an LTR context, and directional controls > embedded in an IRI) you probably want to require the user who clicks > on them to confirm that they want to follow this misleading link, > anyway. That's something for Lars to worry about, I will just provide the detection infrastructure. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 16:18 ` Eli Zaretskii 2014-12-01 18:32 ` Stephen J. Turnbull @ 2014-12-02 14:42 ` Richard Stallman 2014-12-02 14:54 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-02 14:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stephen, larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > 1. There are magic bidi characters inside the URL. > By "magic bidi characters" do you mean printable characters from RTL > scripts, or do you mean the directional controls? I think I mean the directional controls, but I can't be sure. I don't know this terminology enough. > I think it's impractical to insert newlines before and after each > URL. We are miscommunicating. What I said is that the USER can insert newlines in order to see what a certain URL looks like, free of influence from its surroundings. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:42 ` Richard Stallman @ 2014-12-02 14:54 ` Eli Zaretskii 2014-12-03 8:39 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-02 14:54 UTC (permalink / raw) To: rms; +Cc: stephen, larsi, emacs-devel > Date: Tue, 02 Dec 2014 09:42:54 -0500 > From: Richard Stallman <rms@gnu.org> > CC: stephen@xemacs.org, larsi@gnus.org, emacs-devel@gnu.org > > > I think it's impractical to insert newlines before and after each > > URL. > > We are miscommunicating. What I said is that the USER can insert > newlines in order to see what a certain URL looks like, free of > influence from its surroundings. In that case, it's not a very good idea, IMO. First, some buffers are read-only. Second, when the display is sufficiently jumbled by directional controls, users who are not acquainted with bidi will have trouble figuring out where to insert the newlines. Even I sometimes fail to insert them in the correct position. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-02 14:54 ` Eli Zaretskii @ 2014-12-03 8:39 ` Richard Stallman 0 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-03 8:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stephen, larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > We are miscommunicating. What I said is that the USER can insert > > newlines in order to see what a certain URL looks like, free of > > influence from its surroundings. > In that case, it's not a very good idea, IMO. First, some buffers are > read-only. Second, when the display is sufficiently jumbled by > directional controls, users who are not acquainted with bidi will have > trouble figuring out where to insert the newlines. Even I sometimes > fail to insert them in the correct position. This increases the need to do something else about the problem. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 2:51 Bidirectional text and URLs Lars Magne Ingebrigtsen 2014-11-28 3:27 ` Stephen J. Turnbull @ 2014-11-28 11:19 ` Ted Zlatanov 2014-11-28 13:58 ` Lars Magne Ingebrigtsen ` (3 more replies) 2014-11-28 14:45 ` Eli Zaretskii 2014-11-28 17:09 ` Richard Stallman 3 siblings, 4 replies; 133+ messages in thread From: Ted Zlatanov @ 2014-11-28 11:19 UTC (permalink / raw) To: emacs-devel On Fri, 28 Nov 2014 03:51:14 +0100 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: LMI> Using right-to-left markers to do phishing and obscure URLs has gotten LMI> some attention on the webs today. For instance, can you easily tell LMI> where the link below takes you if you click on it in Gnus and LMI> (presumably) rmail? LMI> Works on URLs too. LMI> http://myspace.com/#/segami/moc.koobecaf//:sptth LMI> Unless I messed something up while cut'n'pasting that, you should see LMI> the problem. LMI> Now, should we do something about that? And if so -- what? My uni-confusables package in the GNU ELPA would help detect things like б (CYRILLIC SMALL LETTER BE) confused with the number 6. The relevant line from confusables.txt is: 0431 ; 0036 ; SL # ( б → 6 ) CYRILLIC SMALL LETTER BE → DIGIT SIX # which maps to (1073 "6") in `uni-confusables-char-table-single'. EWW and SHR could opportunistically use that table to highlight such characters. I could also add RTL markers and other useful things to uni-confusables if you think it's the right place, and maybe provide the function for EWW and SHR and others to use when looking for suspicious characters. Or I could keep the package to a single purpose. I'm not sure of the right thing because this feels a little bit like core functionality. Ted ^ permalink raw reply [flat|nested] 133+ messages in thread
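For readers unfamiliar with the confusables.txt format quoted above, here is a small Python sketch of how one data line turns into the (1073 "6") pair mentioned in the message. The function name `parse_confusable` is hypothetical and this is not how the uni-confusables package itself is generated; it only illustrates the field layout (source code point, target sequence, type, then a comment).

```python
def parse_confusable(line):
    """Parse one confusables.txt data line into (source_code, target_string).

    Returns None for blank lines and comment-only lines.
    """
    data = line.split("#", 1)[0]          # drop the trailing comment
    fields = [field.strip() for field in data.split(";")]
    if len(fields) < 2 or not fields[0]:
        return None
    source = int(fields[0], 16)           # single source code point, hex
    # The target may be a sequence of space-separated hex code points.
    target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
    return source, target


line = "0431 ; 0036 ; SL # ( б → 6 ) CYRILLIC SMALL LETTER BE → DIGIT SIX #"
result = parse_confusable(line)
```

Here `result` is the pair (1073, "6"), since hexadecimal 0431 is decimal 1073 and 0036 is the digit six.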
* Re: Bidirectional text and URLs 2014-11-28 11:19 ` Ted Zlatanov @ 2014-11-28 13:58 ` Lars Magne Ingebrigtsen 2014-11-28 19:49 ` Ted Zlatanov 2014-11-28 14:24 ` Stefan Monnier ` (2 subsequent siblings) 3 siblings, 1 reply; 133+ messages in thread From: Lars Magne Ingebrigtsen @ 2014-11-28 13:58 UTC (permalink / raw) To: emacs-devel Ted Zlatanov <tzz@lifelogs.com> writes: > My uni-confusables package in the GNU ELPA would help detect things like > б (CYRILLIC SMALL LETTER BE) confused with the number 6. The relevant > line from confusables.txt is: > > 0431 ; 0036 ; SL # ( б → 6 ) CYRILLIC SMALL LETTER BE → DIGIT SIX # > > which maps to (1073 "6") in `uni-confusables-char-table-single'. EWW and > SHR could opportunistically use that table to highlight such characters. Yes, and perhaps use that to do a "are you sure?" if a user tries to visit https://𝐩𝐚𝐲𝐩𝐚𝐥.com or https://paypal.com. But then uni-confusables should perhaps be moved from ELPA to Emacs so that we can use it generally? > I could also add RTL markers and other useful things to uni-confusables > if you think it's the right place, and maybe provide the function for > EWW and SHR and others to use when looking for suspicious characters. Or > I could keep the package to a single purpose. I'm not sure of the right > thing because this feels a little bit like core functionality. Yeah, I think the RTL stuff sounds kinda like a separate issue that's even more fundamental than the confusables, perhaps. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 13:58 ` Lars Magne Ingebrigtsen @ 2014-11-28 19:49 ` Ted Zlatanov 2014-11-28 21:02 ` Stefan Monnier 2014-11-28 22:26 ` Eli Zaretskii 0 siblings, 2 replies; 133+ messages in thread From: Ted Zlatanov @ 2014-11-28 19:49 UTC (permalink / raw) To: emacs-devel On Fri, 28 Nov 2014 14:58:27 +0100 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: LMI> Ted Zlatanov <tzz@lifelogs.com> writes: >> My uni-confusables package in the GNU ELPA would help detect things like >> б (CYRILLIC SMALL LETTER BE) confused with the number 6. The relevant >> line from confusables.txt is: >> >> 0431 ; 0036 ; SL # ( б → 6 ) CYRILLIC SMALL LETTER BE → DIGIT SIX # >> >> which maps to (1073 "6") in `uni-confusables-char-table-single'. EWW and >> SHR could opportunistically use that table to highlight such characters. LMI> Yes, and perhaps use that to do a "are you sure?" if a user tries to LMI> visit https://𝐩𝐚𝐲𝐩𝐚𝐥.com or https://paypal.com. Right. At least in the SHR/EWW context we can control that experience, and also perhaps in places like `browse-url' or `ffap-url-at-point'. LMI> But then uni-confusables should perhaps be moved from ELPA to Emacs so LMI> that we can use it generally? It would probably improve the user experience, yes. Stefan, WDYT? On Fri, 28 Nov 2014 09:24:21 -0500 Stefan Monnier <monnier@IRO.UMontreal.CA> wrote: >> which maps to (1073 "6") in `uni-confusables-char-table-single'. EWW and >> SHR could opportunistically use that table to highlight such characters. SM> I don't think SHR/EWW can really do that for the buffer's main text, SM> since AFAIK it doesn't know whether what it displays is supposed to be SM> a URL or just plain human text (or rather, to do it well it would have SM> to somehow detect a particular mix of characters). For Gnus users, for instance, the buffer would be using SHR so there's some control over the experience and metadata about the content.
You're right that in general this is not clear, which is why interactive functions like `browse-url' and others may need to be advised. SM> OTOH it can&should indeed do something (including a bigfat warning for SM> bidi-ordering codes) when displaying something it knows to be a URL. I'm not sure about the bidi markers, Eli can discuss that side. I'll try to get the confusables in there and maybe write general code that bidi markers and others can hook into. On Fri, 28 Nov 2014 16:57:27 +0200 Eli Zaretskii <eliz@gnu.org> wrote: >> I could also add RTL markers and other useful things to uni-confusables >> if you think it's the right place EZ> I don't think it's TRT to highlight these controls regardless of what EZ> characters they affect. See my other message for why. OK. >> I'm not sure of the right thing because this feels a little bit like >> core functionality. EZ> What is "core functionality" here? Things that work in Emacs without customization. EZ> For that matter, what functionality are we talking about? The uni-confusables package from the GNU ELPA and glue code to let SHR and EWW know that a URL includes such characters. EZ> We should first decide what we want to do with these cases, and only EZ> then discuss whether that functionality belongs to the core. I think Lars' suggestion is decent, see above. Ted ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 19:49 ` Ted Zlatanov @ 2014-11-28 21:02 ` Stefan Monnier 2014-11-29 0:26 ` Ted Zlatanov 2014-11-28 22:26 ` Eli Zaretskii 1 sibling, 1 reply; 133+ messages in thread From: Stefan Monnier @ 2014-11-28 21:02 UTC (permalink / raw) To: emacs-devel > For Gnus users, for instance, the buffer would be using SHR so there's > some control over the experience and metadata about the content. You're > right that in general this is not clear, which is why interactive > functions like `browse-url' and others may need to be advised. What I meant is that in the SHR case, the text displayed is not the URL but some random piece of text that should be highlighted as a button (although in some cases it is the same text as the URL itself, SHR has no idea whether that's the case or not). We can do something in the Gnus case rendering non-HTML contents, where the URL is highlighted as a button, because at that point we do display something which we know is supposed to be interpreted by the user as a URL. Stefan ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 21:02 ` Stefan Monnier @ 2014-11-29 0:26 ` Ted Zlatanov 0 siblings, 0 replies; 133+ messages in thread From: Ted Zlatanov @ 2014-11-29 0:26 UTC (permalink / raw) To: emacs-devel On Fri, 28 Nov 2014 16:02:02 -0500 Stefan Monnier <monnier@IRO.UMontreal.CA> wrote: >> For Gnus users, for instance, the buffer would be using SHR so there's >> some control over the experience and metadata about the content. You're >> right that in general this is not clear, which is why interactive >> functions like `browse-url' and others may need to be advised. SM> What I meant is that in the SHR case, the text displayed is not the URL SM> but some random piece of text that should be highlighted as a button SM> (although in some cases it is the same text as the URL itself, SHR has SM> no idea whether that's the case or not). SM> We can do something in the Gnus case rendering non-HTML contents, where SM> the URL is highlighted as a button, because at that point we do display SM> something which we know is supposed to be interpreted by the user as a URL. I see what you mean. You're right that it should be rendered differently, but there's too many ways the rendering can be modified by the buffer mode, so whatever SHR does will not be enough. Intercepting the `browse-url' action, on the other hand, is definitely going to interrupt the user in order to warn them, no matter how they got that URL. For rendering, I think some help from the core would be nice for modes that want it; see below about "markchars" and `prettify-symbols-mode' etc. On Sat, 29 Nov 2014 00:26:01 +0200 Eli Zaretskii <eliz@gnu.org> wrote: >> From: Ted Zlatanov <tzz@lifelogs.com> >> Date: Fri, 28 Nov 2014 14:49:59 -0500 >> >> I'm not sure about the bidi markers, Eli can discuss that side. I'll >> try to get the confusables in there and maybe write general code that >> bidi markers and others can hook into. EZ> I cannot say I can follow that. 
Those "bidi markers" are just EZ> characters, so how can they hook into something? Sorry, I meant "code that detects suspicious bidi markers" instead of "bidi markers." We have the "markchars" package in the GNU ELPA, which can currently highlight Unicode confusables and others with a special face (magenta underline by default). For confusables specifically, it just looks for more than one Unicode script within a word, so it's not exactly what Lars asked originally. There was an epic discussion about "markchars" back in 2011: http://comments.gmane.org/gmane.emacs.devel/122200 Anyhow, I was thinking of bringing something like "markchars" into the core and also making the "uni-confusables" package (which is just a conversion of the Unicode confusables.txt) available by default as a char-table. I'm not sure what it will look like, so if anyone can think of precedents, let me know. I think the `prettify-symbols-mode' approach is one possibility, and in fact it was just suggested recently that it should support regexps... any others? EZ> For that matter, what functionality are we talking about? >> >> The uni-confusables package from the GNU ELPA and glue code to let SHR >> and EWW know that a URL includes such characters. EZ> Once again, these characters are not confusables. Their use around EZ> the URL is. So highlighting them wherever we see them is not EZ> necessarily the best way. OK, understood. See above about rendering vs. interrupting UI flow. The latter is what Lars suggested and I agree is more useful. Ted ^ permalink raw reply [flat|nested] 133+ messages in thread
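To make the "more than one Unicode script within a word" check concrete, here is a minimal sketch in Python rather than Emacs Lisp. It is not the markchars implementation. Python's standard library exposes no Script property, so the first word of each character's Unicode name is used as a rough stand-in for the script; the function names `scripts_in_word` and `is_mixed_script` are hypothetical.

```python
import unicodedata


def scripts_in_word(word):
    """Approximate the set of scripts used by the letters in WORD.

    Assumption: the first word of a character's Unicode name
    (e.g. "LATIN", "CYRILLIC", "HEBREW") approximates its script.
    """
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def is_mixed_script(word):
    """Return True if the letters of WORD come from more than one script."""
    return len(scripts_in_word(word)) > 1
```

A word such as "facebook" with its "a" replaced by CYRILLIC SMALL LETTER A (U+0430) is reported as mixed, while an all-Latin or all-Hebrew word is not. As noted in the thread, this only catches confusables; it says nothing about directional controls.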
* Re: Bidirectional text and URLs 2014-11-28 19:49 ` Ted Zlatanov 2014-11-28 21:02 ` Stefan Monnier @ 2014-11-28 22:26 ` Eli Zaretskii 1 sibling, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-28 22:26 UTC (permalink / raw) To: emacs-devel > From: Ted Zlatanov <tzz@lifelogs.com> > Date: Fri, 28 Nov 2014 14:49:59 -0500 > > I'm not sure about the bidi markers, Eli can discuss that side. I'll > try to get the confusables in there and maybe write general code that > bidi markers and others can hook into. I cannot say I can follow that. Those "bidi markers" are just characters, so how can they hook into something? > EZ> For that matter, what functionality are we talking about? > > The uni-confusables package from the GNU ELPA and glue code to let SHR > and EWW know that a URL includes such characters. Once again, these characters are not confusables. Their use around the URL is. So highlighting them wherever we see them is not necessarily the best way. > EZ> We should first decide what we want to do with these cases, and only > EZ> then discuss whether that functionality belongs to the core. > > I think Lars' suggestion is decent, see above. What suggestion? And what does decency have to do with this? ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 11:19 ` Ted Zlatanov 2014-11-28 13:58 ` Lars Magne Ingebrigtsen @ 2014-11-28 14:24 ` Stefan Monnier 2014-11-28 14:57 ` Eli Zaretskii 2014-11-29 6:17 ` Stephen J. Turnbull 3 siblings, 0 replies; 133+ messages in thread From: Stefan Monnier @ 2014-11-28 14:24 UTC (permalink / raw) To: emacs-devel > which maps to (1073 "6") in `uni-confusables-char-table-single'. EWW and > SHR could opportunistically use that table to highlight such characters. I don't think SHR/EWW can really do that for the buffer's main text, since AFAIK it doesn't know whether what it displays is supposed to be a URL or just plain human text (or rather, to do it well it would have to somehow detect a particular mix of characters). OTOH it can&should indeed do something (including a bigfat warning for bidi-ordering codes) when displaying something it knows to be a URL. Stefan ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 11:19 ` Ted Zlatanov 2014-11-28 13:58 ` Lars Magne Ingebrigtsen 2014-11-28 14:24 ` Stefan Monnier @ 2014-11-28 14:57 ` Eli Zaretskii 2014-11-29 6:17 ` Stephen J. Turnbull 3 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-28 14:57 UTC (permalink / raw) To: emacs-devel > From: Ted Zlatanov <tzz@lifelogs.com> > Date: Fri, 28 Nov 2014 06:19:31 -0500 > > I could also add RTL markers and other useful things to uni-confusables > if you think it's the right place I don't think it's TRT to highlight these controls regardless of what characters they affect. See my other message for why. > I'm not sure of the right thing because this feels a little bit like > core functionality. What is "core functionality" here? For that matter, what functionality are we talking about? We should first decide what we want to do with these cases, and only then discuss whether that functionality belongs to the core. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 11:19 ` Ted Zlatanov ` (2 preceding siblings ...) 2014-11-28 14:57 ` Eli Zaretskii @ 2014-11-29 6:17 ` Stephen J. Turnbull 3 siblings, 0 replies; 133+ messages in thread From: Stephen J. Turnbull @ 2014-11-29 6:17 UTC (permalink / raw) To: emacs-devel Ted Zlatanov writes: > I could also add RTL markers and other useful things to uni-confusables If you do, change the name of the package or at least use a different library name. "Confusable" is a technical term in Unicode, and people familiar with Unicode would not expect directionality related features to be in the uni-confusables library. > suspicious Eureka! How about the "uni-suspicious" package, with uni-confusables and uni-directional libraries? > I'm not sure of the right thing because this feels a little bit > like core functionality. +1 Any text (including web documents and programs) might contain a URL or other "problematic if copied and pasted" phrase. I've also seen many students copy math symbols and the like from different blocks, so "confusables" might be useful in lexing such documents. Steve ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 2:51 Bidirectional text and URLs Lars Magne Ingebrigtsen 2014-11-28 3:27 ` Stephen J. Turnbull 2014-11-28 11:19 ` Ted Zlatanov @ 2014-11-28 14:45 ` Eli Zaretskii 2014-11-28 17:09 ` Richard Stallman 3 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-28 14:45 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel > From: Lars Magne Ingebrigtsen <larsi@gnus.org> > Date: Fri, 28 Nov 2014 03:51:14 +0100 > > Using right-to-left markers to do phishing and obscure URLs has gotten > some attention on the webs today. For instance, can you easily tell > where the link below takes you if you click on it in Gnus and > (presumably) rmail? > > Works on URLs too. > > http://myspace.com/#/segami/moc.koobecaf//:sptth > > Unless I messed something up while cut'n'pasting that, you should see > the problem. > > Now, should we do something about that? And if so -- what? It depends on what do we _want_ to do. All I can do at this stage is point to the relevant resources (which unfortunately are not helpful enough IMO when it comes to recommendations for browser-type applications that need to display such URLs without fooling users): http://www.unicode.org/reports/tr36/#Bidirectional_Text_Spoofing http://www.unicode.org/reports/tr39/ http://www.ietf.org/rfc/rfc3987.txt ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 2:51 Bidirectional text and URLs Lars Magne Ingebrigtsen ` (2 preceding siblings ...) 2014-11-28 14:45 ` Eli Zaretskii @ 2014-11-28 17:09 ` Richard Stallman 2014-11-28 18:28 ` Eli Zaretskii 2014-11-28 19:28 ` Andreas Schwab 3 siblings, 2 replies; 133+ messages in thread From: Richard Stallman @ 2014-11-28 17:09 UTC (permalink / raw) To: Lars Magne Ingebrigtsen; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] There is no legitimate need for such URLs to "work." Perhaps the Emacs programs that follow a URL should give an error if there is any special RTL flag character in the URL. Or anything else strange or dangerous. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 17:09 ` Richard Stallman @ 2014-11-28 18:28 ` Eli Zaretskii 2014-11-29 17:03 ` Richard Stallman 2014-11-28 19:28 ` Andreas Schwab 1 sibling, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-11-28 18:28 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Fri, 28 Nov 2014 12:09:51 -0500 > From: Richard Stallman <rms@gnu.org> > Cc: emacs-devel@gnu.org > > There is no legitimate need for such URLs to "work." Yes, there is. Some bidirectional texts can be hard to read without these control characters. > Perhaps the Emacs programs that follow a URL > should give an error if there is any special RTL flag character > in the URL. Or anything else strange or dangerous. That'd be a mistake, IMO. If we can detect unreasonable or suspicious uses of these control characters (like when strictly left-to-right text is included in a right-to-left override embedding), then we should flag only those. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 18:28 ` Eli Zaretskii @ 2014-11-29 17:03 ` Richard Stallman 2014-11-29 17:06 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-11-29 17:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > There is no legitimate need for such URLs to "work." > Yes, there is. Some bidirectional texts can be hard to read without > these control characters. We seem to be talking about different questions. You're talking about "some...text" but the question was specifically URLs. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-29 17:03 ` Richard Stallman @ 2014-11-29 17:06 ` Eli Zaretskii 2014-11-30 9:37 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-11-29 17:06 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Sat, 29 Nov 2014 12:03:43 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > > There is no legitimate need for such URLs to "work." > > > Yes, there is. Some bidirectional texts can be hard to read without > > these control characters. > > We seem to be talking about different questions. > You're talking about "some...text" but the question was specifically URLs. URLs are a special case of human-readable text. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-29 17:06 ` Eli Zaretskii @ 2014-11-30 9:37 ` Richard Stallman 2014-11-30 15:16 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-11-30 9:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > We seem to be talking about different questions. > > You're talking about "some...text" but the question was specifically URLs. > URLs are a special case of human-readable text. Yes, but that's not the point. The places where bidi characters should work are human-readable text. The point is that your special cases don't overlap with URLs. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 9:37 ` Richard Stallman @ 2014-11-30 15:16 ` Eli Zaretskii 2014-12-01 10:18 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 15:16 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Sun, 30 Nov 2014 04:37:42 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > > We seem to be talking about different questions. > > > You're talking about "some...text" but the question was specifically URLs. > > > URLs are a special case of human-readable text. > > Yes, but that's not the point. The point is that your special cases > > The places where bidi characters should work are human-readable text. > > don't overlap with URLs. My conclusion is the opposite: This issue happens _precisely_ _because_ humans review the URLs presented to them before they decide to follow the link to those URLs. The issue here is that bidirectional display features are being (ab)used to trick humans into thinking they will follow a link to some place, while in fact the link leads to a very different place. This problem would not have existed without humans reading the URLs, and without the discrepancy between what those humans perceive visually and the actual URL as seen by the program which interprets it. A program always reads and processes a URL in the logical order of its characters, i.e. in the strictly increasing order of the character positions in the string, so a program will never see any strangeness here. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 15:16 ` Eli Zaretskii @ 2014-12-01 10:18 ` Richard Stallman 2014-12-01 16:02 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-12-01 10:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The issue here is that bidirectional display features are being > (ab)used to trick humans into thinking they will follow a link to some > place, while in fact the link leads to a very different place. This > problem would not have existed without humans reading the URLs, and > without the discrepancy between what those humans perceive visually > and the actual URL as seen by the program which interprets it. That is true. These magic characters have the same effect in URLs as everywhere else, because Emacs display does not distinguish. But URLs are not the places where these magic characters are useful and meant to be used. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 10:18 ` Richard Stallman @ 2014-12-01 16:02 ` Eli Zaretskii 0 siblings, 0 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 16:02 UTC (permalink / raw) To: rms; +Cc: larsi, emacs-devel > Date: Mon, 01 Dec 2014 05:18:01 -0500 > From: Richard Stallman <rms@gnu.org> > CC: larsi@gnus.org, emacs-devel@gnu.org > > > The issue here is that bidirectional display features are being > > (ab)used to trick humans into thinking they will follow a link to some > > place, while in fact the link leads to a very different place. This > > problem would not have existed without humans reading the URLs, and > > without the discrepancy between what those humans perceive visually > > and the actual URL as seen by the program which interprets it. > > That is true. These magic characters have the same effect in URLs > as everywhere else, because Emacs display does not distinguish. > > But URLs are not the places where these magic characters are useful > and meant to be used. Not in the host.domain parts, but URLs can hold more than just that. The query part, the one after the "?", might very well use it. Anyway, if we want to detect the cases that are simple for detection, we can start there; it's probably better than nothing. But we need to have a very specific definition of those cases. Many people in this thread talk in terms of vague concepts, such as "directionality", which sound intuitive, but break down as soon as we need to translate them into requirements for what Emacs should do. Not their fault, of course: the issue is complex and most people don't know the details, or need to. But it does make the discussion more difficult. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 17:09 ` Richard Stallman 2014-11-28 18:28 ` Eli Zaretskii @ 2014-11-28 19:28 ` Andreas Schwab 2014-11-29 17:04 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: Andreas Schwab @ 2014-11-28 19:28 UTC (permalink / raw) To: Richard Stallman; +Cc: Lars Magne Ingebrigtsen, emacs-devel Richard Stallman <rms@gnu.org> writes: > There is no legitimate need for such URLs to "work." > Perhaps the Emacs programs that follow a URL > should give an error if there is any special RTL flag character > in the URL. Or anything else strange or dangerous. The RTL flag character in the example isn't part of the URL, it only precedes it. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-28 19:28 ` Andreas Schwab @ 2014-11-29 17:04 ` Richard Stallman 2014-11-29 17:11 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-11-29 17:04 UTC (permalink / raw) To: Andreas Schwab; +Cc: larsi, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The RTL flag character in the example isn't part of the URL, it only > precedes it. (I couldn't see it in any case.) This suggests we need to provide a primitive to tell Lisp programs a guaranteed answer for which direction the text at a certain point is displayed in. Also, a primitive to verify that a certain region of text has no bidi strangeness within it. It could return the position of the first bidi strangeness in the region, or nil. On issues like this, better safe than sorry. The user who wants to override the safety measure can easily do that. For instance, inserting line breaks around the URL would make it be considered safe, right? I think the precaution I suggested about bidi flags inside the URL is needed also. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-29 17:04 ` Richard Stallman @ 2014-11-29 17:11 ` Eli Zaretskii 2014-11-30 9:38 ` Richard Stallman 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-11-29 17:11 UTC (permalink / raw) To: rms; +Cc: larsi, schwab, emacs-devel > Date: Sat, 29 Nov 2014 12:04:17 -0500 > From: Richard Stallman <rms@gnu.org> > Cc: larsi@gnus.org, emacs-devel@gnu.org > > > The RTL flag character in the example isn't part of the URL, it only > > precedes it. > > (I couldn't see it in any case.) It's displayed as a very thin space. > This suggests we need to provide a primitive to tell Lisp programs a > guaranteed answer for which direction the text at a certain point is > displayed in. The directionality of the text is determined by the display engine, and by design is not subject to control by Lisp programs, with two notable exceptions (neither of which is relevant to the issue at hand): . Lisp programs can disable bidi reordering in a buffer . Lisp programs can define the base paragraph direction > Also, a primitive to verify that a certain region of text has no > bidi strangeness within it. We need to have a good instrumental definition of "bidi strangeness" for that. The simple job of determining whether the region of text includes RTL characters or bidi formatting controls is already possible by using suitable regular expressions, of course. > On issues like this, better safe than sorry. The user who wants to > override the safety measure can easily do that. For instance, > inserting line breaks around the URL would make it be considered safe, > right? No. In fact, it won't change the (jumbled) display of the example presented by Lars at all. ^ permalink raw reply [flat|nested] 133+ messages in thread
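The "simple job" described above, finding bidi formatting controls or RTL characters in a region of text, might look roughly like the following. This is a sketch in Python rather than Emacs Lisp, and the names `BIDI_CONTROLS`, `first_bidi_control` and `has_rtl_chars` are hypothetical. It reports only the presence of such characters and makes no judgment about whether their use is suspicious.

```python
import re
import unicodedata

# Bidi formatting controls: ALM (U+061C), LRM/RLM (U+200E/U+200F),
# LRE/RLE/PDF/LRO/RLO (U+202A..U+202E), LRI/RLI/FSI/PDI (U+2066..U+2069).
BIDI_CONTROLS = re.compile("[\u061c\u200e\u200f\u202a-\u202e\u2066-\u2069]")


def first_bidi_control(text):
    """Return the index of the first bidi formatting control, or None."""
    match = BIDI_CONTROLS.search(text)
    return match.start() if match else None


def has_rtl_chars(text):
    """Return True if TEXT contains a character of bidi class R or AL."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)
```

Returning a position or None mirrors the shape of the primitive requested in the quoted text, but an instrumental definition of "bidi strangeness" is still needed before such a check can drive a warning.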
* Re: Bidirectional text and URLs 2014-11-29 17:11 ` Eli Zaretskii @ 2014-11-30 9:38 ` Richard Stallman 2014-11-30 15:20 ` Eli Zaretskii 0 siblings, 1 reply; 133+ messages in thread From: Richard Stallman @ 2014-11-30 9:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, schwab, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > This suggests we need to provide a primitive to tell Lisp programs a > > guaranteed answer for which direction the text at a certain point is > > displayed in. > The directionality of the text is determined by the display engine, > and by design is not subject to control by Lisp programs, I think we are talking about different issues. You're talking about whether Lisp programs control the directionality. I'm talking about providing a way for them to inquire what display will do. > > Also, a primitive to verify that a certain region of text has no > > bidi strangeness within it. > We need to have a good instrumental definition of "bidi strangeness" > for that. I suggest the definition: whatever would cause the displayed order of characters to be perhaps misleading if the text is interpreted as a URL or anything else with programatic significance. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-11-30 9:38 ` Richard Stallman @ 2014-11-30 15:20 ` Eli Zaretskii 2014-11-30 23:39 ` chad 2014-12-01 10:18 ` Richard Stallman 0 siblings, 2 replies; 133+ messages in thread From: Eli Zaretskii @ 2014-11-30 15:20 UTC (permalink / raw) To: rms; +Cc: larsi, schwab, emacs-devel > Date: Sun, 30 Nov 2014 04:38:04 -0500 > From: Richard Stallman <rms@gnu.org> > CC: schwab@linux-m68k.org, larsi@gnus.org, emacs-devel@gnu.org > > > > This suggests we need to provide a primitive to tell Lisp programs a > > > guaranteed answer for which direction the text at a certain point is > > > displayed in. > > > The directionality of the text is determined by the display engine, > > and by design is not subject to control by Lisp programs, > > I think we are talking about different issues. You're talking about > whether Lisp programs control the directionality. I'm talking about > providing a way for them to inquire what display will do. I apologize for my misunderstanding. > > > Also, a primitive to verify that a certain region of text has no > > > bidi strangeness within it. > > > We need to have a good instrumental definition of "bidi strangeness" > > for that. > > I suggest the definition: whatever would cause the displayed order of > characters to be perhaps misleading if the text is interpreted as a > URL or anything else with programatic significance. I'm sorry, but this is not instrumental: it doesn't specify what "misleading" means. We need a detailed spec for that. The underlying problem here is that many cases of what readers of RTL scripts will perceive as perfectly valid reordering might appear "misleading" to people who don't read those scripts. We should strive to arrive at a definition that detects unreasonable and suspicious reordering, not just any reordering. One possible definition of "misleading" was suggested earlier: strict left-to-right text which is reordered for display due to directional control characters. 
If this is what we want, I can work on providing infrastructure for detecting these cases (and perhaps also similar ones for when similar games are played with URLs that use RTL characters). If that is not what we want, then we need to continue discussing the requirements. ^ permalink raw reply [flat|nested] 133+ messages in thread
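One way to read the candidate definition above (strictly left-to-right text that gets reordered because of directional control characters) is sketched below in Python. This is an illustration under stated assumptions, not the infrastructure offered in the message: it treats RLE, RLO and RLI as the controls that can force reordering, and treats text as "strictly left-to-right" when it contains no strong RTL character other than invisible format controls. The function name `ltr_text_forced_rtl` is hypothetical.

```python
import unicodedata


def ltr_text_forced_rtl(text):
    """Return True if TEXT has no strong RTL characters of its own
    but contains a control (RLE, RLO or RLI) that forces RTL ordering."""
    forced = any(
        unicodedata.bidirectional(ch) in ("RLE", "RLO", "RLI") for ch in text
    )
    strong_rtl = any(
        unicodedata.bidirectional(ch) in ("R", "AL")
        # RLM and ALM are format characters (category Cf) with a strong
        # RTL class; they are not visible text, so they are ignored here.
        and unicodedata.category(ch) != "Cf"
        for ch in text
    )
    return forced and not strong_rtl
```

Under this reading, an all-ASCII URL preceded by a right-to-left override is flagged, while Hebrew or Arabic text wrapped in the same control is not, which keeps legitimate RTL usage out of the warning path.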
* Re: Bidirectional text and URLs 2014-11-30 15:20 ` Eli Zaretskii @ 2014-11-30 23:39 ` chad 2014-12-01 3:49 ` Eli Zaretskii 2014-12-01 10:18 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: chad @ 2014-11-30 23:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: larsi, schwab, Richard Stallman, emacs-devel > On 30 Nov 2014, at 07:20, Eli Zaretskii <eliz@gnu.org> wrote: > > I'm sorry, but this is not instrumental: it doesn't specify what > "misleading" means. We need a detailed spec for that. Given things we're already identifying the URL in text, is it possible/easy to check for a different directionality of any part of a URL text (including the entire url) compared to the text (not whitespace) before and after the URL? In order to make phishing-style surprises work, the mal-ordered text probably wants to have the left-side string "http[s]://" and the right-side string "//:[s]ptth", right? That should be reasonably easy to check, and would be a good heuristic. I suppose there are non-HTTP schemes that might be troublesome also. Some that come to mind are: ftp, file, imap, jabber, nntp, sip, sips, and xmpp. I can't think of a way offhand to abuse mailto: or about:, but I might just be missing it. Hope that helps, ~Chad ^ permalink raw reply [flat|nested] 133+ messages in thread
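The reversed-scheme heuristic proposed above could be sketched as follows. This is Python rather than Emacs Lisp, it operates on the text in logical order (as a program sees it), and it only covers schemes written with "://"; the scheme list and the name `reversed_scheme` are assumptions made for illustration.

```python
# Hypothetical list of schemes to look for; only "scheme://" forms are handled.
SCHEMES = ("https", "http", "ftp", "file", "imap", "nntp")


def reversed_scheme(text):
    """Return the first scheme whose reversed prefix (e.g. "//:sptth"
    for "https://") occurs in TEXT, or None if there is none."""
    for scheme in SCHEMES:
        if (scheme + "://")[::-1] in text:
            return scheme
    return None
```

Applied to the example that opened the thread, `reversed_scheme("http://myspace.com/#/segami/moc.koobecaf//:sptth")` returns `"https"`. A match is only a hint that the text may be presented in reversed order; it does not prove malicious intent.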
* Re: Bidirectional text and URLs 2014-11-30 23:39 ` chad @ 2014-12-01 3:49 ` Eli Zaretskii 2014-12-01 8:01 ` chad 0 siblings, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 3:49 UTC (permalink / raw) To: chad; +Cc: larsi, schwab, rms, emacs-devel > From: chad <yandros@gmail.com> > Date: Sun, 30 Nov 2014 15:39:15 -0800 > Cc: Richard Stallman <rms@gnu.org>, > larsi@gnus.org, > schwab@linux-m68k.org, > emacs-devel@gnu.org > > > > On 30 Nov 2014, at 07:20, Eli Zaretskii <eliz@gnu.org> wrote: > > > > I'm sorry, but this is not instrumental: it doesn't specify what > > "misleading" means. We need a detailed spec for that. > > Given things we're already identifying the URL in text, is it > possible/easy to check for a different directionality of any part > of a URL text (including the entire url) compared to the text (not > whitespace) before and after the URL? Yes, but this would only be a sign of trouble if the rest of buffer text is strictly left to right. And even then, there are legitimate URLs that have RTL characters, e.g. in Google queries. So I don't see how this would help. > In order to make phishing-style surprises work, the mal-ordered > text probably wants to have the left-side string "http[s]://" and > the right-side string "//:[s]ptth", right? I don't think we can count on that. Villains might surprise us. This is just one example. But if someone does the research and comes up with such a conclusion, then yes, it makes our job easier. ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 3:49 ` Eli Zaretskii @ 2014-12-01 8:01 ` chad 2014-12-01 15:58 ` Eli Zaretskii 2014-12-01 19:17 ` Richard Stallman 0 siblings, 2 replies; 133+ messages in thread From: chad @ 2014-12-01 8:01 UTC (permalink / raw) To: Eli Zaretskii, emacs > On 30 Nov 2014, at 19:49, Eli Zaretskii <eliz@gnu.org> wrote: >> Given things we're already identifying the URL in text, is it >> possible/easy to check for a different directionality of any part >> of a URL text (including the entire url) compared to the text (not >> whitespace) before and after the URL? > > Yes, but this would only be a sign of trouble if the rest of buffer > text is strictly left to right. And even then, there are legitimate > URLs that have RTL characters, e.g. in Google queries. This is a great point. Does this happen often enough that it would be troublesome to add a warning or prompt about it? ~Chad ^ permalink raw reply [flat|nested] 133+ messages in thread
* Re: Bidirectional text and URLs 2014-12-01 8:01 ` chad @ 2014-12-01 15:58 ` Eli Zaretskii 2014-12-02 14:41 ` Richard Stallman 2014-12-01 19:17 ` Richard Stallman 1 sibling, 1 reply; 133+ messages in thread From: Eli Zaretskii @ 2014-12-01 15:58 UTC (permalink / raw) To: chad; +Cc: emacs-devel > From: chad <yandros@gmail.com> > Date: Mon, 1 Dec 2014 00:01:43 -0800 > > > Yes, but this would only be a sign of trouble if the rest of buffer > > text is strictly left to right. And even then, there are legitimate > > URLs that have RTL characters, e.g. in Google queries. > > This is a great point. Does this happen often enough that it would > be troublesome to add a warning or prompt about it? What happens often enough? Google queries with RTL text? For me, all the time. Here's a random example: https://www.google.co.il/search?q=זאפ+השוואת+מחירים&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb&gfe_rd=cr&ei=g498VJeSJ8Sg8wfD5IDwBQ Note that copy-pasting this from the Firefox's address bar actually pastes this instead: https://www.google.co.il/search?q=%D7%96%D7%90%D7%A4+%D7%94%D7%A9%D7%95%D7%95%D7%90%D7%AA+%D7%9E%D7%97%D7%99%D7%A8%D7%99%D7%9D&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb&gfe_rd=cr&ei=g498VJeSJ8Sg8wfD5IDwBQ which might be something we could consider (I suggested that earlier as one of the possible ways to fight the malicious directional overrides). ^ permalink raw reply [flat|nested] 133+ messages in thread
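The percent-encoded form shown above can be produced mechanically. The sketch below, in Python rather than Emacs Lisp, escapes every non-ASCII character as UTF-8 percent escapes and leaves ASCII untouched, so directional controls and RTL letters can no longer affect display order. The function name `percent_encode_non_ascii` is hypothetical, and this is one possible approach rather than what Firefox actually does internally.

```python
from urllib.parse import quote


def percent_encode_non_ascii(url):
    """Replace each non-ASCII character of URL with its UTF-8 percent
    escapes, leaving all ASCII characters as they are."""
    return "".join(ch if ord(ch) < 128 else quote(ch) for ch in url)
```

The trade-off is readability: a user who reads Hebrew loses the legible query text in exchange for an unambiguous left-to-right display.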
* Re: Bidirectional text and URLs 2014-12-01 15:58 ` Eli Zaretskii @ 2014-12-02 14:41 ` Richard Stallman 0 siblings, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-02 14:41 UTC To: Eli Zaretskii; +Cc: yandros, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > https://www.google.co.il/search?q= + + &ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb&gfe_rd=cr&ei=g498VJeSJ8Sg8wfD5IDwBQ In this example, one of the arguments is RTL. Maybe we can consider the case where some arguments are RTL to be safe enough, and avoid warning for it. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call.
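One possible reading of the suggestion above is: find which parts of a URL contain right-to-left characters, and stay quiet when only the query arguments do. The sketch below illustrates that reading in Python; it is not an agreed-upon rule from this thread, and the names `rtl_components` and `should_warn` are hypothetical. It treats the Unicode bidi classes R and AL as "RTL" and ignores explicit directional controls entirely.

```python
import unicodedata
from urllib.parse import urlsplit

# Strong right-to-left bidi classes (Hebrew is "R", Arabic is "AL").
RTL_CLASSES = {"R", "AL"}


def has_rtl(text: str) -> bool:
    """True if any character in TEXT has a strong RTL bidi class."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def rtl_components(url: str) -> set:
    """Return the names of the URL components that contain RTL characters."""
    parts = urlsplit(url)
    named = {
        "host": parts.hostname or "",
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }
    return {name for name, value in named.items() if has_rtl(value)}


def should_warn(url: str) -> bool:
    """Warn only when RTL text appears somewhere other than the query."""
    return bool(rtl_components(url) - {"query"})
```

Under this rule the Google search URL quoted above would pass silently, while a URL whose host or path contains Hebrew or Arabic letters would trigger a warning. Whether the path and fragment should be treated as strictly as the host is left open here.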
* Re: Bidirectional text and URLs 2014-12-01 8:01 ` chad 2014-12-01 15:58 ` Eli Zaretskii @ 2014-12-01 19:17 ` Richard Stallman 1 sibling, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-01 19:17 UTC To: chad; +Cc: eliz, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Yes, but this would only be a sign of trouble if the rest of buffer > > text is strictly left to right. And even then, there are legitimate > > URLs that have RTL characters, e.g. in Google queries. > This is a great point. Does this happen often enough that it would > be troublesome to add a warning or prompt about it? Maybe it is enough to ensure that the host name is not confused. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call.
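A host-name-only check, as floated above, could look roughly like the following Python sketch. It is an illustration under stated assumptions, not a proposal from the thread: the function name `host_may_be_reordered` and the set of bidi classes treated as suspicious are choices made here. It flags hosts containing strong RTL letters, Arabic digits, or explicit embedding, override, and isolate controls. Two caveats apply: a legitimate Hebrew or Arabic internationalized domain name would also be flagged, and a directional override placed in the text before the URL (outside the host) would not be caught.

```python
import unicodedata
from urllib.parse import urlsplit

# Bidi classes that can change the visual order of a host name:
# strong RTL letters, Arabic numbers, and explicit directional controls.
SUSPECT_CLASSES = {
    "R", "AL", "AN",
    "LRE", "RLE", "LRO", "RLO", "PDF",
    "LRI", "RLI", "FSI", "PDI",
}


def host_may_be_reordered(url: str) -> bool:
    """True if the host part of URL contains characters that could
    make its displayed order differ from its logical order."""
    host = urlsplit(url).hostname or ""
    return any(unicodedata.bidirectional(ch) in SUSPECT_CLASSES for ch in host)
```

For example, a host with U+202E (RIGHT-TO-LEFT OVERRIDE) embedded in it is reported, while RTL text confined to the query string is ignored.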
* Re: Bidirectional text and URLs 2014-11-30 15:20 ` Eli Zaretskii 2014-11-30 23:39 ` chad @ 2014-12-01 10:18 ` Richard Stallman 1 sibling, 0 replies; 133+ messages in thread From: Richard Stallman @ 2014-12-01 10:18 UTC To: Eli Zaretskii; +Cc: larsi, schwab, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > We need to have a good instrumental definition of "bidi strangeness" > > > for that. > > > > I suggest the definition: whatever would cause the displayed order of > > characters to be perhaps misleading if the text is interpreted as a > > URL or anything else with programmatic significance. > I'm sorry, but this is not instrumental: it doesn't specify what > "misleading" means. We need a detailed spec for that. Yes, my proposal is a first step that needs to be fleshed out. -- Dr Richard Stallman President, Free Software Foundation 51 Franklin St Boston MA 02110 USA www.fsf.org www.gnu.org Skype: No way! That's nonfree (freedom-denying) software. Use Ekiga or an ordinary phone call.