From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Compositions and bidi display
Date: Fri, 30 Apr 2010 15:06:11 +0900
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <83d3xjxys1.fsf@gnu.org> (message from Eli Zaretskii on Wed, 28 Apr 2010 20:38:54 +0300)

In article <83d3xjxys1.fsf@gnu.org>, Eli Zaretskii writes:

> > The condition should be "until it reaches a character that
> > should never be composed with the currently looking
> > character".

> That is the condition I'm looking for.  But how to code it?  Is the
> code in find_automatic_composition a good starting point?

No.  Checking whether characters can be composed at a specific
position is done within composition_compute_stop_pos.  What we need
now is to decide where the search in composition_compute_stop_pos
should stop.

In the case of "english HEBREW TEXT text" (lowercase letters are l2r
characters, uppercase letters are r2l characters),
get_next_display_element starts from the first "e" and proceeds to the
first " " (stage 1), then jumps to the last "T" and proceeds back to
the first "H" (stage 2), then jumps to the last " " and proceeds to
the last "t" (stage 3).

When composition_compute_stop_pos is called in stage 1, ENDPOS should
be the first " " because searching farther is useless (we may have to
compose some of "TEXT" before composing some of "HEBREW").  When
composition_compute_stop_pos is called in stage 2, ENDPOS should be
the first "H" because searching farther back is useless, and so on.

Note that composition_compute_stop_pos just finds a stop position to
check; the actual checking and composing is done by
composition_reseat_it, which is called by CHAR_COMPOSED_P.  But
composition_reseat_it also needs ENDPOS, because when that function
finds that no composition is needed at the stop position, it calls
composition_compute_stop_pos to update the next stop position.

> > We may be able to simplify that condition to
> > "until it reaches a character in the different bidi level
> > (or chunk)".

> But that could be very far back.

Isn't it possible to record where the current bidi run started while
you scan a buffer in bidi_get_next_char_visually?

> I would really like to avoid going too far back, just to
> find out whether we reached a composition sequence,

We don't have to recalculate ENDPOS each time.  It must be updated
only when we pass over a bidi boundary.  Consider the example above
("english ...").

> because (again AFAIU) the length of most such sequences is
> just a few characters.  Is it correct that searching back
> MAX_AUTO_COMPOSITION_LOOKBACK characters is enough?

No.

> If MAX_AUTO_COMPOSITION_LOOKBACK is not the right number, then how
> long can a composition sequence be?

It is MAX_COMPOSITION_COMPONENTS (16), but that is not relevant here.
What we need is to find where in the buffer (before the scan reaches
ENDPOS) the next composition will happen.  And to do that efficiently,
we need to give the search a proper ENDPOS.

> Another idea would be to call composition_compute_stop_pos repeatedly,
> starting from the last cmp_it->stop_pos, until we find the last
> stop_pos before the current iterator position, then compute the
> beginning and end of the composable sequence at that position, and
> record it in the iterator.  Then we handle the composition when we
> enter the sequence from either end.

To move from one composition position to the next, we must actually
call autocmp_chars and find where the current composition ends, then
start searching for the next composition.  As autocmp_chars calls Lisp
and all the functions that compose characters, it is too inefficient
to call it repeatedly just to find the last one.

---
Kenichi Handa
handa@m17n.org
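
For illustration only, the three stages of the "english HEBREW TEXT
text" example can be sketched as a toy model in C.  This is not Emacs
code: toy_level, toy_stages and struct toy_stage are hypothetical
names, and the level rule (uppercase is r2l, a space is r2l only when
the nearest letters on both sides are uppercase) is an assumption that
merely reproduces this one example, not the real bidi algorithm.  For
each run it reports where the scan would enter and which position
would be a sensible ENDPOS.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy assumption: uppercase = r2l (level 1), lowercase = l2r (level 0).
   A space gets level 1 only if the nearest non-space characters on both
   sides are uppercase; otherwise it gets level 0.  */
int
toy_level (const char *s, size_t len, size_t i)
{
  size_t l = i, r = i;

  if (isupper ((unsigned char) s[i]))
    return 1;
  if (s[i] != ' ')
    return 0;
  while (l > 0 && s[l] == ' ')
    l--;
  while (r + 1 < len && s[r] == ' ')
    r++;
  return isupper ((unsigned char) s[l]) && isupper ((unsigned char) s[r]);
}

/* One stage of the visual scan (hypothetical structure).  */
struct toy_stage
{
  size_t start;   /* position where the scan enters the run */
  size_t endpos;  /* last position worth searching in this stage */
};

/* Split S into runs of equal toy level and fill OUT with one stage per
   run, in scan order for a left-to-right paragraph.  An l2r run is
   scanned forward, an r2l run backward.  Returns the number of stages
   stored (at most MAX).  */
size_t
toy_stages (const char *s, struct toy_stage *out, size_t max)
{
  size_t len = strlen (s), n = 0, first = 0;

  assert (out != NULL);
  for (size_t i = 0; i < len && n < max; i++)
    {
      int level = toy_level (s, len, i);

      if (i + 1 < len && toy_level (s, len, i + 1) == level)
        continue;               /* the current run continues */
      out[n].start = level ? i : first;
      out[n].endpos = level ? first : i;
      n++;
      first = i + 1;            /* next run begins here */
    }
  return n;
}
```

Under these assumptions, ENDPOS changes only when the scan crosses a
run boundary, which is the point made above about not recalculating
it for every character.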