From: Eli Zaretskii
To: Dmitry Antipov
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Long lines and bidi [Was: Re: bug#13623: ...]
Date: Fri, 08 Feb 2013 16:07:23 +0200
Message-ID: <838v6y99wk.fsf@gnu.org>
In-reply-to: <5114FEBB.8020201@yandex.ru>

> Date: Fri, 08 Feb 2013 17:33:47 +0400
> From: Dmitry Antipov
> CC: Emacs development discussions
>
> On 02/06/2013 10:23 PM, Eli Zaretskii wrote:
>
> > Another area of redisplay optimizations would be the infamous
> > very-long-lines use case.  (Personally, I think this one is the single
> > most important deficiency in the current display engine, by far more
> > important than any other display problem.)
>
> I tried to scroll (down from the beginning and then up from the end) the
> very pathological file (~150M with just ~500 lines) and got the following
> profile:

Profile alone is not enough.  Please tell exactly how you "scrolled"
(which commands you used), and please also show the absolute time it
took to perform each command.

>      8.59%  emacs  emacs            [.] bidi_resolve_weak

What was in the file?  bidi_resolve_weak high on the profile hints
that it was full of punctuation or digits or blanks, which is not
really an interesting case.

>      7.92%  emacs  emacs            [.] bidi_level_of_next_char
>      7.81%  emacs  emacs            [.] get_next_display_element
>      7.12%  emacs  emacs            [.] move_it_in_display_line_to
>      6.96%  emacs  emacs            [.] x_produce_glyphs
>      5.06%  emacs  libc-2.16.so     [.] __memcpy_ssse3_back
>      4.56%  emacs  emacs            [.] next_element_from_buffer
>      4.38%  emacs  emacs            [.] bidi_move_to_visually_next
>      4.26%  emacs  emacs            [.] scan_buffer
>      3.04%  emacs  libXft.so.2.3.1  [.] XftCharIndex
>      2.93%  emacs  emacs            [.] bidi_fetch_char
>      2.67%  emacs  emacs            [.] bidi_cache_iterator_state
>      2.61%  emacs  emacs            [.] lookup_glyphless_char_display
>      2.47%  emacs  libXft.so.2.3.1  [.] XftGlyphExtents
>      2.35%  emacs  emacs            [.] bidi_resolve_neutral
>      1.95%  emacs  emacs            [.] bidi_get_type
>      1.86%  emacs  emacs            [.] detect_coding
>      1.70%  emacs  emacs            [.] produce_chars
>      1.50%  emacs  emacs            [.] bidi_resolve_explicit_1
>      1.18%  emacs  emacs            [.] get_per_char_metric
>      1.13%  emacs  emacs            [.] bidi_cache_search.constprop.4
>      1.01%  emacs  emacs            [.] xftfont_text_extents
>      0.90%  emacs  emacs            [.] bidi_explicit_dir_char
>      0.88%  emacs  emacs            [.] bidi_resolve_explicit
>      ...
>
> So the first question is: is it feasible/possible/desirable to detect that
> the buffer has no R2L text at all and automatically force bidi-paragraph-direction
> to left-to-right and bidi-display-reordering to nil?

Ah, _that_ red herring...

Why is that the first question?  What were the times with and without
bidi-display-reordering in this file?  In my testing, the display
engine performs awfully slowly in both cases, so even though turning
off reordering makes it faster, it is still so terribly slow that the
problem is not going to be solved by that.

As to your question: how can we know what characters are or aren't in
the buffer without scanning it?  And scanning the buffer is exactly
what bidi.c does.

As to bidi-paragraph-direction, the detection of the paragraph
direction is turned off for long paragraphs anyway.  Again, does
setting bidi-paragraph-direction to left-to-right give you reasonable
performance in that file?  If not, this is just another red herring.

Anyway, I think this is the wrong way to try to find the solution.
The problem is not that scanning is slower with the bidi display.  (If
it were, we would see terribly slow performance with "normal" files as
well.)  The problem is that _we_scan_too_many_characters_.  See this
part of the profile:

>      7.12%  emacs  emacs            [.] move_it_in_display_line_to

The display routines of the move_it_* family, which are heavily used
in scrolling, cursor movement, and just about any display operation,
_always_ scan each line from the beginning to the end, before they get
to the next line.  When each line is very long, those scans are very
expensive.

The way to make display significantly faster for long lines is to
avoid scanning entire lines.  The problem is how to do that without
losing accuracy, e.g., without missing characters that affect the line
metrics.

IOW, our problem is to find clever algorithms and provide supporting
data structures for those algorithms, so that we could avoid scanning
very long lines in their entirety each time we need to move the
cursor.  When we find these algorithms and code them, the bidi
"problem" will disappear without a trace.
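
To make the "supporting data structures" idea concrete, here is a
minimal sketch in Python (not Emacs code, and not something proposed in
this thread) of one possible approach: caching the cumulative pixel
offset at regular checkpoints along a long line, so that a later move
scans at most one stride of characters instead of the whole line.  The
class LineCheckpoints, its methods, and the stride parameter are all
hypothetical names, and the sketch assumes per-character widths are
already known, non-negative, and independent of context, which is
exactly the accuracy question raised above and is not true in general.

```python
from bisect import bisect_right


class LineCheckpoints:
    """Hypothetical helper: cumulative widths sampled every `stride` chars."""

    def __init__(self, widths, stride=1024):
        self.widths = widths      # per-character advance widths (assumed known)
        self.stride = stride
        self.x_at = [0]           # x_at[k] = x offset of character k * stride
        self.scanned = 0          # characters visited by queries, for illustration
        x = 0
        # One full pass to build the checkpoints; queries are cheap afterwards.
        for i, w in enumerate(widths, start=1):
            x += w
            if i % stride == 0:
                self.x_at.append(x)

    def x_of(self, pos):
        """Return the x offset of character `pos`, scanning < `stride` chars."""
        k = pos // self.stride
        x = self.x_at[k]
        for i in range(k * self.stride, pos):
            x += self.widths[i]
            self.scanned += 1
        return x

    def pos_at_x(self, target_x):
        """Return the last position whose x offset is <= target_x."""
        # Largest checkpoint not beyond target_x (widths are non-negative,
        # so x_at is sorted and binary search applies).
        k = bisect_right(self.x_at, target_x) - 1
        x = self.x_at[k]
        pos = k * self.stride
        while pos < len(self.widths) and x + self.widths[pos] <= target_x:
            x += self.widths[pos]
            pos += 1
            self.scanned += 1
        return pos
```

The construction pass still scans the whole line once, and any edit to
the line would invalidate every checkpoint after the edit position, so
this only helps when the same long line is traversed repeatedly.  It
also says nothing about bidi reordering or about characters whose
display depends on their neighbours; handling those without losing
accuracy is the hard part.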