From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: bidi-display-reordering is now non-nil by default
Date: Fri, 05 Aug 2011 12:38:21 +0900
Message-ID: <87bow4h6j6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20110731.082721.451360942.wl@gnu.org>
	<20110731.085115.40009301.wl@gnu.org>
	<877h6yanje.fsf@fencepost.gnu.org>
	<878vre95g3.fsf@fencepost.gnu.org>
	<87fwlm7fam.fsf@fencepost.gnu.org>
	<87bowa7dza.fsf@fencepost.gnu.org>
	<877h6y7chn.fsf@fencepost.gnu.org>
	<831ux6cv5o.fsf@gnu.org>
	<87d3gpku3o.fsf@gnus.org>
	<834o1ypa2b.fsf@gnu.org>
	<87sjphhnbj.fsf@uwakimon.sk.tsukuba.ac.jp>
	<87k4ath4rd.fsf@uwakimon.sk.tsukuba.ac.jp>
	<87ipqdgu1e.fsf@uwakimon.sk.tsukuba.ac.jp>
	<87fwlhglqy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<83ty9xnkcu.fsf@gnu.org>
In-Reply-To: <83ty9xnkcu.fsf@gnu.org>
To: Eli Zaretskii
Cc: larsi@gnus.org, list-general@mohsen.1.banan.byname.net,
	emacs-devel@gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: VM 8.1.93a under 21.5 (beta31) "ginger" cd1f8c4e81cd XEmacs Lucid
	(x86_64-unknown-linux)
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:142893

Eli Zaretskii writes:

> They are not irrelevant.  What you suggest runs the risk of adding or
> removing LRM/RLM characters to/from a file against user
> expectations.

Sure, but byte-level equality is not part of that; character-level
equality is.

> Again, what if the user inserts another LRM?

Insert another non-character "marker" in the buffer, using whichever
non-character strategy we're using.

> > What's wrong with reparsing the buffer from the beginning, treating
> > each change of value of the direction property as insertion of the
> > appropriate direction mark?
>
> Reparsing the whole buffer upon each insertion?  Is that the way to
> make redisplay fast and efficient?

No, that's a proof that it's *possible*, where your words claim it's
*im*possible.  Making it fast is a SMOP.  You say it's beyond you, and
that probably means it's beyond anybody competent enough in bidi to do
the implementation.  But let's not discourage anyone from trying. ;-)

> How do you indicate them, exactly?  Emacs has no features, except
> again text properties, to indicate something like that.  In any case,
> isn't it beginning to sound more and more complicated?

Sure.  And the presence of non-graphic characters in the buffer is
going to make other code more complicated.  The question is, which
complexity is preferable?  You've made your choice, and Emacs has a
bidi implementation.  That's good, very good.  Nevertheless, I am
going to reserve judgment.

> > But if that doesn't work, I don't see how having explicit mark
> > characters in the buffer can work either.
>
> Explicit marks work because the reordering algorithm does TRT with
> them, whether they are redundant or not.  It doesn't care.  By not
> caring it makes it very easy to preserve the byte stream and not risk
> changing it behind user's back.

The algorithm will be the same, except that it needs to work with a
"virtual" stream where some characters are not present in the buffer.
This is no different from handling faces, which *could* be represented
as characters in the buffer (and *are* in HTML, for example -- which
of course has been deprecated in favor of CSS!  Hmm... :-).  The
necessary buffering is a relatively small amount of complexity
compared to the bidi algorithm itself.

> The _value_ doesn't matter.  It's the property symbol that cannot be
> the same in overlapping regions, unless the values are identical.

Of course the value matters.  A 'direction property with a sequence
value can encode the whole stack, up to 61 levels.
Again, I wouldn't want to maintain that design (space-inefficiency and
the question of consistency of neighboring regions are killers, I
think), but there are surely lighter-weight, more efficient designs.

> > Or you could simply replace the directional marks with a string on
> > the preceding non-mark character containing the mark characters that
> > were present in the source.
>
> And then move that string when text is inserted after the preceding
> non-mark character, or that character is deleted, yes?  Sounds like
> fun.

Put that way, not at all.  But you know what?  Emacs has long ago
solved such problems, at least most of them.  IIUC, in XEmacs, this
could easily be implemented with a zero-length extent with appropriate
stickiness attributes.  If Emacs doesn't already have such a device,
it would be easy enough to add a marker-with-direction extension by
maintaining a hashtable with keys markers and values mark characters.
Not terribly efficient, of course, but proof of concept.

> Using explicit marks does have its drawbacks, but they are minor and
> mostly just need to get used to.

There is way too much about Emacs that users (and developers!) "mostly
just need to get used to." :-(  Whether we can do much about it, I
don't know.  But I'm not going to give up yet. :-)

Thank you very much for taking the time out to explain your reasons
for your design choices.  I have a much better grasp of the practical
issues involved in implementing bidi in Emacsen now.
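
For concreteness, the "virtual stream" idea discussed above can be
sketched roughly as follows.  This is a toy illustration in Python,
not Emacs code and not anything from the actual bidi implementation.
The names split_marks, virtual_stream, and insert_text are
hypothetical, and the position-keyed dict merely stands in for
whatever out-of-band device (markers, zero-length extents, text
properties) a real design would use.  The stickiness rule in
insert_text is an arbitrary assumption made for the example.

```python
LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK
MARKS = {LRM, RLM}


def split_marks(source):
    """Separate direction marks from the visible text.

    Returns (text, marks), where marks maps a position in text to the
    string of mark characters that preceded that position in source.
    """
    text = []
    marks = {}
    for ch in source:
        if ch in MARKS:
            pos = len(text)
            marks[pos] = marks.get(pos, "") + ch
        else:
            text.append(ch)
    return "".join(text), marks


def virtual_stream(text, marks):
    """Yield the characters a reordering pass would see: the buffer
    text with the out-of-band marks spliced back in."""
    for pos, ch in enumerate(text):
        yield from marks.get(pos, "")
        yield ch
    # Marks recorded after the last character of the text.
    yield from marks.get(len(text), "")


def insert_text(text, marks, pos, new):
    """Insert new at pos.  Marks strictly after pos move with the text
    that follows them; marks exactly at pos stay in front of the
    inserted text (an assumed stickiness rule)."""
    shifted = {(p + len(new) if p > pos else p): m
               for p, m in marks.items()}
    return text[:pos] + new + text[pos:], shifted


if __name__ == "__main__":
    source = "abc" + RLM + "def" + LRM + LRM
    text, marks = split_marks(source)
    # Character-level round trip: the marks are recoverable on save.
    assert "".join(virtual_stream(text, marks)) == source
    text, marks = insert_text(text, marks, 1, "XY")
    print(repr("".join(virtual_stream(text, marks))))
```

The buffer text itself never contains a mark character, yet the
sequence handed to the reordering pass, and the one written back to
the file, is character-for-character what was read in.  Redundant
marks (the doubled LRM above) survive as well.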