From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Long lines and bidi
Date: Sat, 09 Feb 2013 12:01:46 +0200
Message-ID: <83mwvd7qlx.fsf@gnu.org>
In-reply-to: <5116113D.5070707@cs.ucla.edu>
To: Paul Eggert
Cc: dmantipov@yandex.ru, emacs-devel@gnu.org

> Date: Sat, 09 Feb 2013 01:05:01 -0800
> From: Paul Eggert
> CC: dmantipov@yandex.ru, emacs-devel@gnu.org
>
> On 02/09/2013 12:46 AM, Eli Zaretskii wrote:
>
> > 25% faster is still terribly slow for redisplay.
>
> Yes, as I said, it doesn't solve the performance problem.
> Still, it doesn't complicate the code, and it significantly
> improves speed in code likely to be executed often, so it
> seems worth doing in its own right.

I suspect that the use case that makes scan_buffer so high on the
profile is very much skewed.  My crystal ball says that the file in
question was one very long paragraph, or at least had many-many
_thousands_ of lines between empty lines that delimit paragraphs.
scan_buffer is high on the profile because the bidi.c code tries to
find the beginning of a paragraph, which determines the base direction
of the paragraph, which in turn determines how the text should be
reordered for display.
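To make the cost of that paragraph search concrete, here is a toy model in Python.  It is not the bidi.c code: the function name find_paragraph_start is made up for illustration, the "buffer" is a plain string, and an empty line is assumed to be the only paragraph delimiter.  It only shows that the backward scan has to walk all the text between the current position and the previous empty line.

```python
def find_paragraph_start(text, pos):
    """Toy model of a backward search for the start of a paragraph.

    Hypothetical helper, not the Emacs implementation.  Assumes that
    paragraphs are delimited only by an empty line, i.e. by two
    consecutive newlines.  Returns (start, scanned): the index where the
    paragraph containing POS begins, and how many characters the scan
    had to step over to find it.
    """
    scanned = 0
    i = pos
    while i > 0:
        # Two newlines just before i mean the previous line is empty,
        # so the paragraph starts at i.
        if i >= 2 and text[i - 1] == "\n" and text[i - 2] == "\n":
            break
        i -= 1
        scanned += 1
    return i, scanned


if __name__ == "__main__":
    # One paragraph made of 1000 lines of "normal" width.
    text = ("x" * 99 + "\n") * 1000
    print(find_paragraph_start(text, len(text)))
```

In this model the number of characters scanned equals the distance back to the previous empty line, so a buffer that is one huge paragraph pays for the whole paragraph on each search, while a buffer with frequent empty lines pays very little.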
By contrast, most real-life files have much less text between empty
lines, so scan_buffer will not be at any prominent place in the
profile.  But redisplay of a buffer with very long lines will still be
awfully slow, even if there's an empty line between every 2 long
lines, although scan_buffer will no longer be a factor.  OTOH, if you
create a file with a single long paragraph, but whose lines have
"normal" width, like 100 characters, redisplay will perform
adequately, even though scan_buffer will be heavily used.  (It would
be interesting to see a profile for that, btw.)

IOW, the solution in bidi.c for extremely long paragraphs is optimized
for the 99% of use cases, where lines are not too long, i.e. for those
cases where the old unidirectional display engine gave reasonable
performance.

Dmitry's use case, OTOH, is skewed on several counts:

  . it uses extremely long lines
  . it uses too many neutral/weak characters
  . it uses extremely long paragraphs

This simultaneously hits several unrelated weaknesses of the current
display engine, with the result that the profile is a combination of
at least 3 different reasons for slow-down, which makes it very hard
to analyze the results and look for solutions.

That is why I think we should attack this problem one reason at a
time.  The most important reason is the first one: long lines cause
the display code to traverse too much of the buffer text.  This is why
you see x_produce_glyphs so high on the profile in the unidirectional
case: it examines too many characters, many more than will actually be
displayed on the screen.  Solve this problem, and the 2nd one will
simply disappear without a trace, because its cost is at least linear
in the number of scanned characters.  If the 3rd problem is still a
factor after the 1st one is gone, we can tune the current optimization
at that time.
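As a back-of-the-envelope illustration of the first reason, here is a small Python sketch.  It is not a description of how the display engine is written: the function examined_vs_displayed and the assumption that every visible line is walked from its first to its last character are mine, chosen only to show how far apart "characters examined" and "characters displayed" can get when lines are much wider than the window.

```python
def examined_vs_displayed(lines, width, height):
    """Toy cost model for redisplay of a window with truncated lines.

    Hypothetical helper.  Assumes the display code walks each visible
    line in full, while the window can show at most WIDTH characters of
    each of its first HEIGHT lines.  Returns (examined, displayed).
    """
    visible = lines[:height]
    examined = sum(len(line) for line in visible)
    displayed = sum(min(len(line), width) for line in visible)
    return examined, displayed


if __name__ == "__main__":
    # 24 lines of ordinary width versus 24 very long lines, 80x24 window.
    print(examined_vs_displayed(["x" * 100] * 24, 80, 24))
    print(examined_vs_displayed(["x" * 1_000_000] * 24, 80, 24))
```

Under these assumptions the displayed count is capped by the window size, while the examined count grows with the line length, and so does any per-character work that rides on the scan.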