From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Long lines and bidi [Was: Re: bug#13623: ...]
Date: Fri, 08 Feb 2013 19:04:13 +0200
Message-ID: <83wqui7n5e.fsf@gnu.org>
References: <877gmp5a04.fsf@ed.ac.uk> <83vca89izh.fsf@gnu.org>
 <5110906D.7020406@yandex.ru> <83fw1aac3d.fsf@gnu.org>
 <51120360.4060104@yandex.ru> <51127363.5030203@yandex.ru>
 <834nhp9u9j.fsf@gnu.org> <5114FEBB.8020201@yandex.ru>
 <838v6y99wk.fsf@gnu.org> <51152625.9070301@yandex.ru>
In-reply-to: <51152625.9070301@yandex.ru>
Reply-To: Eli Zaretskii
To: Dmitry Antipov
Cc: emacs-devel@gnu.org

> Date: Fri, 08 Feb 2013 20:21:57 +0400
> From: Dmitry Antipov
> CC: emacs-devel@gnu.org
>
> On 02/08/2013 06:07 PM, Eli Zaretskii wrote:
>
> > Profile alone is not enough. Please tell me exactly how you
> > "scrolled" (which commands you used), and please also show the
> > absolute time it took to perform each command.
>
> (defun scroll-both ()
>   (interactive)
>   (let ((start (float-time)))
>     (progn
>       (dotimes (n 100) (progn (scroll-up) (redisplay)))
>       (goto-char (point-max))
>       (dotimes (n 100) (progn (scroll-down) (redisplay)))
>       (message "Elapsed %f seconds" (- (float-time) start)))))
>
> With bidi, ~600 seconds elapsed, and:
>
>     25.18%  emacs  emacs            [.] scan_buffer
>      7.04%  emacs  emacs            [.] bidi_resolve_weak
>      6.47%  emacs  emacs            [.] get_next_display_element
>      6.37%  emacs  emacs            [.] bidi_level_of_next_char
>      5.14%  emacs  libc-2.16.so     [.] __memcpy_ssse3_back
>      5.05%  emacs  emacs            [.] move_it_in_display_line_to
>      4.94%  emacs  emacs            [.] x_produce_glyphs
>      4.84%  emacs  libXft.so.2.3.1  [.] XftCharIndex
>      3.72%  emacs  emacs            [.] bidi_move_to_visually_next
>      3.70%  emacs  emacs            [.] next_element_from_buffer
>      2.90%  emacs  libXft.so.2.3.1  [.] XftGlyphExtents
>      2.05%  emacs  emacs            [.] bidi_fetch_char
>      2.02%  emacs  emacs            [.] lookup_glyphless_char_display
>      2.01%  emacs  emacs            [.] bidi_resolve_neutral
>      1.76%  emacs  emacs            [.] bidi_cache_iterator_state
>      1.70%  emacs  emacs            [.] bidi_get_type
>      1.51%  emacs  emacs            [.] bidi_resolve_explicit_1
>      1.18%  emacs  libXft.so.2.3.1  [.] XftFontCheckGlyph
>      1.12%  emacs  emacs            [.] xftfont_encode_char
>      1.01%  emacs  emacs            [.] xftfont_text_extents
>
> Without bidi, ~230 seconds elapsed, and:

This is consistent with my past measurements: (a) disabling bidi makes
redisplay faster, but it is still awfully slow (2.3 sec per scroll);
(b) bidi iteration is about 2 times slower than the unidirectional one
(you get 3 times slower because your buffer is full of weak
characters, which make the bidi iterator work harder due to the
requirements of the Unicode Bidirectional Algorithm).

> I suspect that scroll should be direction-agnostic in theory

That theory is wrong. The reason is that functions that move by
display lines can only move forward. So moving backward is coded
very differently (a.k.a. "slower").

> but both profiled runs show that scroll-down is much, much slower
> than scroll-up (that's why elapsed time is so huge in both cases).

That's expected; see also my explanation in a previous mail, which
describes what move_it_vertically_backward does. That function is
used a lot by scroll-down.

> > What was in the file? bidi_resolve_weak high on the profile hints
> > that it was full of punctuation or digits or blanks, which is not
> > really an interesting case.
>
> Your guess is correct; but I suspect that an average text in human language
> contains fewer punctuation characters, digits and blanks than C source code
> of the same size :-).

Average C code still has only a small fraction of punctuation. Just
look at any C file.
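The question of how much of a given buffer consists of weak and neutral
characters can be checked directly rather than guessed. What follows is
only a rough sketch and is not taken from bidi.c: the function name
my-bidi-class-histogram is made up for illustration, and the only
facility it relies on is get-char-code-property with the bidi-class
property.

```elisp
(defun my-bidi-class-histogram (&optional beg end)
  "Return an alist of (CLASS . COUNT) for characters between BEG and END.
Hypothetical helper, for illustration only.  BEG and END default to
the accessible portion of the current buffer."
  (let ((table (make-hash-table :test 'eq))
        (end (or end (point-max)))
        result)
    (save-excursion
      (goto-char (or beg (point-min)))
      ;; Tally the Unicode bidi class of every character in the region.
      (while (< (point) end)
        (let ((class (get-char-code-property (char-after) 'bidi-class)))
          (puthash class (1+ (gethash class table 0)) table))
        (forward-char 1)))
    (maphash (lambda (class count) (push (cons class count) result))
             table)
    ;; Most frequent class first.
    (sort result (lambda (a b) (> (cdr a) (cdr b))))))
```

Evaluating (my-bidi-class-histogram) in the buffer that was profiled
shows how the strong classes (L, R, AL) compare with the weak ones
(EN, ES, ET, AN, CS, NSM, BN) and the neutral ones (B, S, WS, ON).
Note that the function itself scans every character, so it is a
diagnostic aid and not a way to avoid the scan.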
> > As to your question: how can we know what characters are or aren't in
> > the buffer without scanning it? And scanning the buffer is exactly
> > what bidi.c does.
>
> Hm... insert-file-contents tries to detect encoding by looking at the first 1K
> and last 3K of the file. Why isn't a similar approach applicable to bidi?

No. Detecting encoding from a small portion is a heuristic that works
only because almost every file is encoded consistently. When a file
is encoded inconsistently, the result of the above decoding heuristic
is horribly wrong, and the consequences for the user are grave. As a
recent example, see bug #13505.

By contrast, scripts used in a text file do not have to be consistent
or uniformly distributed over the file at all. So the probability of
getting this wrong will be much higher.
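The failure mode described above can be sketched in a few lines. This
is an illustration under stated assumptions, not an Emacs feature: both
function names are hypothetical, the sample sizes are counted in
characters where the encoding detection discussed above counts bytes,
and "right-to-left" here means only the strong bidi classes R and AL.

```elisp
(defun my-region-has-rtl-p (beg end)
  "Return non-nil if a strong right-to-left character occurs in BEG..END.
Hypothetical helper: it scans every character in the region."
  (save-excursion
    (goto-char beg)
    (let (found)
      (while (and (not found) (< (point) end))
        (when (memq (get-char-code-property (char-after) 'bidi-class)
                    '(R AL))
          (setq found t))
        (forward-char 1))
      found)))

(defun my-sampled-has-rtl-p (head tail)
  "Guess from the first HEAD and last TAIL characters only.
This mimics the sampling idea; it is not how Emacs decides anything."
  (or (my-region-has-rtl-p (point-min)
                           (min (point-max) (+ (point-min) head)))
      (my-region-has-rtl-p (max (point-min) (- (point-max) tail))
                           (point-max))))

;; Four Hebrew letters (U+05D0) buried in the middle of 10000 ASCII
;; characters: the sample misses them, the full scan finds them.
(with-temp-buffer
  (insert (make-string 5000 ?a) (make-string 4 #x5d0) (make-string 5000 ?a))
  (list (my-sampled-has-rtl-p 1024 3072)
        (my-region-has-rtl-p (point-min) (point-max))))
;; => (nil t)
```

A wrong guess here would mean displaying right-to-left text without
reordering it, and nothing forces such text to appear near either end
of a file.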