From: Eli Zaretskii
To: Robert Pluim
Cc: psainty@orcon.net.nz, pieter@vanoostrum.org, 38407@debbugs.gnu.org
Newsgroups: gmane.emacs.bugs
Subject: bug#38407: 27.0.50; infinite loop with display of large file without newlines
Date: Wed, 04 Dec 2019 17:45:40 +0200
Message-ID: <83zhg8hz6j.fsf@gnu.org>
In-reply-to: (message from Robert Pluim on Wed, 04 Dec 2019 10:15:55 +0100)

> From: Robert Pluim
> Cc: psainty@orcon.net.nz, pieter@vanoostrum.org, 38407@debbugs.gnu.org
> Date: Wed, 04 Dec 2019 10:15:55 +0100
>
>     Eli> First, which part of SAVE_IT causes this? I'm guessing it's this
>     Eli> part:
>
>     Eli>   #define SAVE_IT(ITCOPY, ITORIG, CACHE)    \
>     Eli>     do {                                    \
>     Eli>       if (CACHE)                            \
>     Eli>         bidi_unshelve_cache (CACHE, true);  \
>     Eli>       ITCOPY = ITORIG;                      \
>     Eli>       CACHE = bidi_shelve_cache ();         \  <<<<<<<<<<<<
>     Eli>     } while (false)
>
> Yes, itʼs bidi_shelve_cache

Thanks for verifying.

>     Eli> And if this guess is also true, then I think the problem is that
>     Eli> databuf + sizeof (bidi_cache_idx) is unaligned on 64-bit systems,
>     Eli> since bidi_cache_idx is an int.
>
> The '_unaligned_' bit of that memmove function name does not mean
> thatʼs itʼs doing unoptimized unaligned copies: it means it accepts
> unaligned pointers, and aligns them as necessary to enable fast
> copying. Anyway, I made bidi_cache_idx an intptr_t, and it made no
> difference.

OK, thanks.

> Thread 1 "emacs" hit Breakpoint 3, bidi_shelve_cache () at bidi.c:981
> 981       alloc = (bidi_shelve_header_size
> $25 = 30860
> $26 = 71842080
>
> which means Emacs is copying 70MB of data every time bidi_shelve_cache
> is called, and itʼs called *a lot* in this scenario. Could we not do
> this shelving by pointer-swapping or similar rather than copying?

Not sure I understand what kind of pointer-swapping you had in mind.
We don't swap between 2 buffers here, we save away a snapshot of the
iterator state each time we see a character where a line break can be
made, so that we could restore that state when we exhaust the window's
width. We must restore the iterator state to continue to the next
visual line, and the bidi cache is an integral part of that state.

We could perhaps lower the cache size limit (see
BIDI_CACHE_MAX_ELTS_PER_SLOT in bidi.c), which would then
proportionally decrease the time for making a copy of the cache. Or
we could make some non-trivial changes in the logic of
move_it_in_display_line_to (and similar changes in display_line) to
detect when the cache becomes too large, and use a backup algorithm
that doesn't copy it.

But I question the utility of such changes: they will never get us a
speedup like bidi-inhibit-bpa does, and for the relatively rare use
case like this one (extremely long lines in a JSON file, with some
bracketed parts containing R2L text, and the user activating
visual-line-mode on top of that) inhibiting the BPA, whether via
so-long or by the user or some other Lisp, sounds like an okay
solution to me. If the JSON file has long lines, but no R2L text, we
already have an optimization in bidi.c to avoid having a large cache;
and if visual-line-mode is off, the cache doesn't need to be copied so
frequently. So only the combination of the two causes this tremendous
slowdown, and bidi-inhibit-bpa solves it better than any alternative.

WDYT?
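To illustrate why the cost scales with the cache size, here is a toy
model of snapshot-by-copy. It is only a sketch, not the code in
bidi.c: struct toy_state, struct toy_cache, toy_shelve, toy_unshelve
and toy_layout_cost are all hypothetical names, and the real cache
elements are much larger than the two fields used here. The point is
just that every candidate break point pays for a copy of all the
cached states, so the total grows as (break points) x (cache size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one cached bidi state; NOT the real
   struct bidi_it, which is far larger.  */
struct toy_state { long charpos; int level; };

/* Hypothetical cache: a flat array plus the number of used slots.  */
struct toy_cache { struct toy_state *elts; size_t used; };

/* Save a snapshot by copying: a count header followed by the used
   slots.  The work is proportional to C->used.  Adds the number of
   bytes copied to *BYTES_COPIED.  Returns NULL if malloc fails.  */
void *
toy_shelve (const struct toy_cache *c, size_t *bytes_copied)
{
  size_t payload = c->used * sizeof *c->elts;
  unsigned char *buf = malloc (sizeof c->used + payload);
  if (!buf)
    return NULL;
  memcpy (buf, &c->used, sizeof c->used);
  if (payload)
    memcpy (buf + sizeof c->used, c->elts, payload);
  *bytes_copied += sizeof c->used + payload;
  return buf;
}

/* Restore the cache from SNAPSHOT and free it.  The caller must make
   sure C->elts has room for the saved number of slots.  */
void
toy_unshelve (struct toy_cache *c, void *snapshot)
{
  unsigned char *buf = snapshot;
  assert (buf != NULL);
  memcpy (&c->used, buf, sizeof c->used);
  if (c->used)
    memcpy (c->elts, buf + sizeof c->used, c->used * sizeof *c->elts);
  free (buf);
}

/* Simulate laying out one long line: at each of BREAKS candidate
   break points, discard the previous snapshot and take a fresh one.
   Returns the total bytes copied, or 0 on allocation failure.  */
size_t
toy_layout_cost (size_t cache_elts, size_t breaks)
{
  struct toy_cache c;
  size_t copied = 0;
  void *saved = NULL;

  c.elts = calloc (cache_elts ? cache_elts : 1, sizeof *c.elts);
  c.used = cache_elts;
  if (!c.elts)
    return 0;
  for (size_t i = 0; i < breaks; i++)
    {
      free (saved);		/* drop the older snapshot */
      saved = toy_shelve (&c, &copied);
      if (!saved)
	{
	  free (c.elts);
	  return 0;
	}
    }
  if (saved)
    toy_unshelve (&c, saved);	/* "window width exhausted": restore */
  free (c.elts);
  return copied;
}
```

In this model, toy_layout_cost (2000, 10) copies roughly twice as
many bytes as toy_layout_cost (1000, 10), which is the same
proportionality that lowering the cache size limit would exploit.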