From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Compositions and bidi display
Date: Mon, 03 May 2010 11:39:24 +0900
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <83tyqtwh7z.fsf@gnu.org> (message from Eli Zaretskii on Fri, 30 Apr 2010 10:08:00 +0300)

In article <83tyqtwh7z.fsf@gnu.org>, Eli Zaretskii writes:

> > From: Kenichi Handa
> > Cc: emacs-devel@gnu.org
> > Date: Fri, 30 Apr 2010 15:06:11 +0900
> >
> > In the case of "english HEBREW TEXT text" (lowercase letters are
> > l2r characters, uppercase letters are r2l characters),
> > get_next_display_element starts from the first "e" and
> > proceeds to the first " " (stage 1), then jumps to the last
> > "T" and proceeds back to the first "H" (stage 2), then jumps
> > to the last " " and proceeds to the last "t" (stage 3).

> This is only the simplest case, with just 2 embedding levels: the base
> level of the paragraph, and the (higher) level of the embedded R2L
> text.  The general case is much more complex: there could be up to 60
> nested levels, and some of them could begin or end at the same buffer
> position.
> bidi.c handles all this complexity by means of a very
> simple algorithm, but that algorithm needs to know a lot about the
> characters traversed so far.  I don't think exposing all these
> internals to xdisp.c is a good idea.

Just exposing (or creating) one function that tells where
the current bidi-run ends is enough.  Is it that difficult?

> > Note that composition_compute_stop_pos just finds a stop
> > position to check, and the actual checking and composing is
> > done by composition_reseat_it which is called by
> > CHAR_COMPOSED_P.

> Right, but the same is true for the bidi iteration: I need only to
> know when to check for composition; the actual composing will be still
> done by composition_reseat_it.  I just cannot assume that I always
> move linearly forward in the buffer.  Therefore, it is not enough to
> have only the next stop position recorded in the iterator.  I need
> more information recorded.  What I'm trying to determine in this
> thread is what needs to be recorded and how to compute what's needed.
> Thanks for helping me.

I don't understand the logic of "Therefore" in the above
paragraph.

> > Isn't it possible to record where the current bidi-run
> > started while you scan a buffer in
> > bidi_get_next_char_visually?

> See above: it's tricky.  The function in bidi.c that looks for the
> beginning and end of a level run relies on almost all the other
> functions in bidi.c, and it does that on the fly.  The level edges are
> not recorded anywhere, except in an internal cache used to speed up
> moving back in the buffer.

Then what we need is a function that returns the value of
that cache.

> > > If MAX_AUTO_COMPOSITION_LOOKBACK is not the right number, then how
> > > long can a composition sequence be?
> >
> > It is MAX_COMPOSITION_COMPONENTS (16), but here it's not
> > relevant.

> Why not?
> Isn't it true that if none of the 16 characters preceding
> the current position can start a composition sequence, then the
> current position is not inside a composition sequence?

It's true, but how does that help to find where to check for
a composition next time?

> > > Another idea would be to call composition_compute_stop_pos repeatedly,
> > > starting from the last cmp_it->stop_pos, until we find the last
> > > stop_pos before the current iterator position, then compute the
> > > beginning and end of the composable sequence at that position, and
> > > record it in the iterator.  Then we handle the composition when we
> > > enter the sequence from either end.
> >
> > To move from one composition position to the next, we must
> > actually call autocmp_chars and find where the current
> > composition ends, then start searching for the next
> > composition.  As autocmp_chars calls Lisp and all functions
> > to compose characters, it's so inefficient to call it
> > repeatedly just to find the last one.

> If the buffer or string is full of composed characters, then yes, it
> would be a slowdown.  Especially if the number of ``suspect'' stop
> positions is much larger than the number of actual composition
> sequences.  But what else can be done, given the design of the
> compositions that doesn't let us know the sequence length without
> actually composing the character?

Isn't it faster to call bidi_get_next_char_visually
repeatedly?  At least it doesn't call Lisp.  And isn't there
any possibility in the current bidi code to provide a
function that gives the information I'm asking for?

---
Kenichi Handa
handa@m17n.org
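
For illustration only, the three-stage traversal of "english HEBREW TEXT text" and the kind of "where does the current bidi-run end" query discussed in this message can be sketched as a toy C program. Everything here is an assumption made for the sketch: the names toy_run_end, toy_visual_order and toy_visual_string are hypothetical and are not functions in bidi.c or xdisp.c, the paragraph base direction is taken to be l2r, and only the simplest two-level case is handled (uppercase = r2l, as in the example). It does not implement the general algorithm with nested embedding levels.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy convention from the example: uppercase letters are r2l,
   other letters are l2r, everything else is neutral.  */
static int is_r2l(char c)    { return isupper((unsigned char) c) != 0; }
static int is_strong(char c) { return isalpha((unsigned char) c) != 0; }

/* Hypothetical helper: return the index one past the end of the run
   that starts at START, and store its direction in *R2L.  Assumes an
   l2r paragraph and at most two levels.  */
static size_t toy_run_end(const char *s, size_t len, size_t start, int *r2l)
{
  size_t i, last;

  assert(start < len);
  *r2l = is_r2l(s[start]);
  if (!*r2l) {
    /* Base-level run: continues until the next r2l character.  */
    for (i = start; i < len && !is_r2l(s[i]); i++)
      ;
    return i;
  }
  /* r2l run: neutrals belong to it only when enclosed by r2l
     characters, so it ends after the last r2l character seen before
     the next l2r letter (or the end of the text).  */
  last = start + 1;
  for (i = start + 1; i < len; i++) {
    if (is_r2l(s[i]))
      last = i + 1;
    else if (is_strong(s[i]))
      break;
  }
  return last;
}

/* Fill ORDER with buffer indices in visual (left-to-right) order.
   ORDER must have room for strlen(S) entries.  */
static void toy_visual_order(const char *s, size_t *order)
{
  size_t len = strlen(s), pos = 0, out = 0;

  while (pos < len) {
    int r2l;
    size_t end = toy_run_end(s, len, pos, &r2l);

    if (r2l)                      /* walk the run backwards */
      for (size_t i = end; i > pos; i--)
        order[out++] = i - 1;
    else                          /* walk the run forwards */
      for (size_t i = pos; i < end; i++)
        order[out++] = i;
    pos = end;
  }
}

/* Convenience wrapper: write the visually ordered text to OUT, which
   must have room for strlen(S) + 1 bytes (at most 256 here).  */
static void toy_visual_string(const char *s, char *out)
{
  size_t order[256], len = strlen(s);

  assert(len < 256);
  toy_visual_order(s, order);
  for (size_t i = 0; i < len; i++)
    out[i] = s[order[i]];
  out[len] = '\0';
}
```

Calling toy_visual_string on "english HEBREW TEXT text" walks positions 0..7 forwards, then 18 down to 8, then 19..23 forwards, which corresponds to stages 1, 2 and 3 above. In this toy setting, toy_run_end called at position 8 is the sort of single query a caller could use to learn where the current run ends without seeing any other internal state.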