From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: bidi-display-reordering is now non-nil by default
Date: Thu, 18 Aug 2011 11:21:03 +0300
Message-ID: <83liurruz4.fsf@gnu.org>
In-reply-to: <87ippvwtwx.fsf@stupidchicken.com>
To: Chong Yidong
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

> From: Chong Yidong
> Date: Wed, 17 Aug 2011 18:32:46 -0400
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
>
> Eli Zaretskii writes:
>
> > I'm afraid making the reordering engine aware of all text properties
> > will considerably slow down redisplay, due to the need to check
> > character properties very frequently. It also runs a high risk of
> > completely blending the reordering code with the display engine, which
> > will make them both very hard to maintain; currently, they are clearly
> > separated.
>
> No, the lookup would be done at the redisplay engine level, not the
> reordering engine level: add a new entry in it_props[] for handling a
> (say) `bidi-override' text property. Emacs would process this during
> the step in redisplay where it handles other properties (like faces and
> invisibility), and record the information into the iterator. The bidi
> code would take it from there.

This won't work, not with the way the reordering engine is currently
integrated with redisplay. The reason is that above the reordering
level, the iteration through buffer text is non-linear. Your
suggestion assumes that the redisplay iterator will bump into this new
text property _before_ it processes the text which follows it. But
this assumption is false because of the non-linear scanning of the
buffer text.

Let me show an example to illustrate how the bidirectional display
handles text properties. Suppose you have the following buffer text
(as usual, capital letters mean R2L characters):

  000000111110000
  abcde ABCDE xyz
        ^    ^

The number above each character shows the text properties of the
characters; 0 means no properties, 1 means some specific property.
This example shows only one property, spanning only the R2L
characters; the real-life examples can be much more complex. The '^'
characters below show the "stop positions" computed by the iterator --
those are the buffer positions where the display engine should process
the text property by calling one or more handlers in the it_props[]
array, filling the iterator with attributes necessary for displaying
the text until the next "stop position".

To move from the blank character between `e' and `A' to the next
character in visual order, the display iterator calls the reordering
engine. When it does that, the first (leftmost) "stop position" was
not yet acted upon, because the current iterator position is smaller
than that stop. When the call to the reordering engine returns, it
sets the iterator position at `E', since the ABCDE part should be
displayed as EDCBA on the screen. Oops! we just missed the "stop
position".

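For those who prefer to see this spelled out, here is a toy Python model
of the example above. It is only a sketch and not Emacs code: the
function names are made up for illustration, positions are 0-based
rather than Emacs buffer positions, and "reordering" is reduced to
reversing each run of capital letters, which is far simpler than the
real bidi algorithm. It computes the stop positions from the property
digits, walks the text in visual order, and reports every forward jump
that passes over a stop position without landing on it.

```python
# Toy model only; uppercase letters stand for R2L characters.
TEXT = "abcde ABCDE xyz"
PROPS = "000000111110000"  # one digit per character; 1 = has the property


def stop_positions(props):
    """Return the 0-based positions where the property value changes."""
    return [i for i in range(1, len(props)) if props[i] != props[i - 1]]


def visual_order(text):
    """Return buffer positions in left-to-right screen order.

    Assumption for illustration: each maximal run of uppercase characters
    is reversed, and everything else keeps its logical order.
    """
    order, i = [], 0
    while i < len(text):
        if text[i].isupper():
            j = i
            while j < len(text) and text[j].isupper():
                j += 1
            order.extend(range(j - 1, i - 1, -1))  # emit the run back to front
            i = j
        else:
            order.append(i)
            i += 1
    return order


def jumped_stops(text, props):
    """List (from_pos, to_pos, stops) for each forward jump over a stop."""
    stops = stop_positions(props)
    order = visual_order(text)
    result = []
    for prev, cur in zip(order, order[1:]):
        # A stop strictly between the two positions was passed over unhandled.
        skipped = [s for s in stops if prev < s < cur]
        if skipped:
            result.append((prev, cur, skipped))
    return result


if __name__ == "__main__":
    print("".join(TEXT[p] for p in visual_order(TEXT)))
    print(jumped_stops(TEXT, PROPS))
```

In this model the only jump that passes over a stop is the one from the
blank before `A' (position 5) to `E' (position 10), which skips the stop
at `A' (position 6). The stop after `E' is reached by landing exactly on
it, so it is handled in the normal way.
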
What happens next is that the redisplay engine realizes that the stop
position was missed, so it scans back to find the last "stop position"
preceding `E' (since there could be other text properties or overlays
in-between), and then handles it using the handlers in it_props[]; see
handle_stop_backwards for how this is done. Then it can deliver `E'
with the right attributes, and continue delivering all the successive
characters, until it crosses some "stop position" again, either going
forward or backward.

This is why it won't work to control reordering with text properties:
by the time the redisplay engine realizes that there's another text
property to apply, a crucial part of reordering has already happened.
The bidi_it structure that is part of the iterator already has all the
information about reordering of "ABCDE", having scanned it all inside
a single call to bidi_move_to_visually_next. That scan entirely
ignores all text properties except one: the `display' property, and
then only if its value will cause the covered text to be replaced by
something else, like an image or a string.

It would be possible, of course, to have the handler of the
`bidi-override' property toss all the reordering information,
reposition to before `A' and start anew. But that's a terrible waste
of cycles, especially if the text covered by that property is not so
short. The waste is not only in that we will have to throw away
information we already gathered at some cost, but also because
repositioning the iterator to an arbitrary place means we need to
restart the bidi iteration from the beginning of the line in order to
have the correct state of the bidi iterator needed to continue from
that place; see get_visually_first_element for the details.

> >> Then it should be easy to exploit font-lock to give reasonably correct
> >> bidi segmentation, e.g. by treating font-lock-comment-face and
> >> font-lock-string-face boundaries as bidi segmentation boundaries.
> >
> > We should be very careful with reusing font-lock as basis for
> > reordering, because the user has too much knobs to control font-lock.
> > For example, few of the font-lock features speed up redisplay by
> > deferring fontification to a later time. With font-lock, this just
> > displays text in the default face; with reordering, it will flush
> > incorrectly rendered text for a perceptible amount of time. I'm not
> > sure it's a good idea.
>
> The fundamental issue is that correctly segmenting source code requires
> knowledge of the underlying syntax. Sure, it's possible to come up with
> some hacks that "mostly" work, but font lock is already there, so we
> ought to try to use it first.

Font-lock just uses regexps and syntax tables. Everything else in
font-lock is meant to avoid the annoyingly long delay it takes to
fully fontify a large buffer. What I'm saying is that, apart from
using regexps and syntax tables, the considerations and trade-offs
that are valid for font-lock are not necessarily valid for
bidirectional display.

> For this reason, I'm not about concerned about the deferred
> fontification issue: if you want Emacs to segment properly, you'd want
> it to do an amount of work equivalent to font-lock anyway.

Amount of work is the least of my concerns in this regard. I'm
worried about the effect the temporarily incorrect display will have
on users.