From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#56682: Fix the long lines font locking related slowdowns
Date: Fri, 05 Aug 2022 10:38:57 +0300
Message-ID: <83k07n16ni.fsf@gnu.org>
References: <837d46mjen.fsf@gnu.org> <8335esjppt.fsf@gnu.org>
 <837d43j198.fsf@gnu.org> <83y1wjhkkh.fsf@gnu.org> <83wnc3hkdg.fsf@gnu.org>
 <49df74e5-e16a-a532-98d1-66c6ff1eb6c6@yandex.ru> <83pmhuft5a.fsf@gnu.org>
 <05388e8d8836c2e7ef3e@heytings.org> <136c4fe0fcb9ce5181cb@heytings.org>
 <3d639ea12689d767ba2a@heytings.org>
 <03dce9be-5c51-94d7-a32a-52ab7f57dde2@yandex.ru> <83r11w2m0i.fsf@gnu.org>
In-Reply-To: (message from Dmitry Gutov on Fri, 5 Aug 2022 04:39:46 +0300)
To: Dmitry Gutov
Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, gregory@heytings.org,
 monnier@iro.umontreal.ca
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:238835

> Date: Fri, 5 Aug 2022 04:39:46 +0300
> Cc: gregory@heytings.org, gerd.moellmann@gmail.com, 56682@debbugs.gnu.org,
>  monnier@iro.umontreal.ca
> From: Dmitry Gutov
>
> >> So there is a particular performance problem with the display of
> >> fontified buffers which I'd really like your help in fixing.
> >
> > Maybe it is in display, and maybe it isn't. Do you have any evidence
> > that the sluggish response is due to redisplay? C-n, for example, is
> > mostly not redisplay, but Lisp code in simple.el and occasional calls
> > to vertical-motion.
>
> I come to that conclusion by observing said sluggish movement in a
> buffer that is fully fontified. And yet the delays after pressing 'C-n'
> or 'C-p' or 'C-n' seemed noticeable and very similar to delays on
> PgUp/PgDown.

There's much more to C-n/C-p/C-v/M-v than just fontifications and
redisplay. If fontification-functions are out of the picture, it
doesn't yet follow that the only factor that is left is redisplay.

In general, redisplay _is_ slower when there are non-default faces in
the buffer, but (a) that is inevitable due to additional processing,
and (b) I hope you are not arguing against having font-lock faces, are
you?
The question, therefore, is whether the additional processing due to
font-lock faces, by itself, is or isn't a significant factor in the
slowdown you observe. I don't see how that question could be answered
except by profiling.

> Are these delays in fontification-functions? That seems unlikely given
> the buffer is fontified already in full

When the buffer is fully fontified, Emacs will not call
fontification-functions at all, unless the buffer becomes modified.

> If the delays are in font-lock anyway, that might be in the code which
> checks that the area between window-start and window-end is fontified.

Unlikely. That test uses the interval tree, and is reasonably fast.
But again, only profiling will tell the truth.

> I
> don't see why that code has to be slow (long lines or not), so it should
> be fixable easily enough. Unless 'get-text-property' or
> 'next-single-property-change' can exhibit pathologic performance in the
> presence of long lines, of course. But that doesn't show up in my testing.

get-text-property and its ilk don't treat newlines as special
characters, and don't even access buffer text at all. So no, that is
most probably not a significant factor here.

> > But even if the slow response is due to redisplay, we just have
> > another cause that we need to investigate and try fixing.
>
> It seems to me that that cause actually has larger impact than
> font-lock, because it does show itself in a moderately-sized (88K)
> buffer, where font-lock doesn't feel like a problem.

"Doesn't feel" like a problem? On what is that feeling based? We
need profiling and other hard data. It is impossible to argue based
on feelings.

> It stands to reason
> that the same "cause" might have a proportionally bigger impact in large
> buffers as well, and only after we remove it (alone), then we can
> evaluate how font-lock itself affects user experience, and how much of
> its correctness (and for buffers of which size) we want to sacrifice.

The buffer size shouldn't matter if lines are not too long, because
redisplay always examines a small portion of the buffer that fits in
the window, and sometimes also a couple of lines above and below. If
lines _are_ long, then yes, their length (not the buffer size) is a
significant factor, because the redisplay algorithms often need to
start from the beginning of a previous line to anchor their layout
calculations (because only there is the X coordinate known in
advance). That is the root cause that the changes we are discussing
are trying to eliminate or at least alleviate.

There are no other causes we know about. If you claim that there are
such causes, you need to show that, not just by reasoning, but by
measurements, after you eliminate the already-known effects, like the
length of a physical line, character composition (when the script of
the characters requires it), etc.

And please keep in mind that without long lines, any slowdown factors
that we could have are unlikely to cause unreasonably slow responses,
because otherwise we'd have had complaints about that long ago. (In
fact we did have such complaints, and the problems they reported were
fixed in past releases.) The only known situation with large buffers
that is still unsolved in core is when the buffer is larger than the
available memory, so it causes paging when buffer text is accessed for
display or navigation.

The mental model you are building, and on whose basis you are trying
to reason about the ways to solve this problem, should take all of the
above into account. Only then might you have a chance of identifying
some hidden factor that eluded us, whose elimination could then allow
solving these problems without the narrowing we are now using.

> > It says
> > nothing about the measures we've already taken on master. They
> > definitely make even this case faster, and with an unoptimized build I
> > can now reasonably edit this file, something I couldn't do before.
>
> If my guess is right, the fix on master whammied all over the redisplay
> with narrowing, both fixing the "cause" and restricting font-lock to the
> same narrowed region. The latter part might be unnecessary in the usual
> case (we might still decide to do that later for much larger buffers,
> but that should be decided by a separate threshold variable).

We need hard data, not guesses. It is impractical to argue about
guesses, especially when they are based on incomplete or inaccurate
understanding of how the relevant code really behaves. If you produce
measurements or other facts that contradict our understanding, then we
will be forced to reconsider and adapt to those facts. So far, you
haven't said or shown anything that amounts to such a contradiction.

> >> Fixing in a way that doesn't add narrowing around
> >> fontification-functions, because as we can see it's not necessary in
> >> examples like this.
> >
> > If that is possible, sure. No one said that from now on every problem
> > in Emacs that causes slow responses will be handled by narrowing. But
> > if, for example, it turns out that the slow responses is due to time
> > it takes some code to traverse a long stretch of fontified buffer,
> > what other solution would you suggest except making the portion to be
> > traversed shorter?
>
> For all I know, the most optimal fix might still be implemented through
> narrowing, but it would be temporarily widened while
> fontification-functions are run.

The first step of these changes didn't narrow when
fontification-functions were run. It was still insufficient, because
font-lock still made Emacs extremely slow with long lines, whereas
disabling font-lock removed that slowness. The next step then applied
narrowing to fontification-functions as well, and that solved the slow
cases.

This is how this activity proceeded, and this is why we are reasonably
sure at this point that fontification-functions _are_ indeed a
significant slowdown factor when very long lines are involved.

> >> I think we could have both speed and correctness, at least for files of
> >> this size.
> >
> > That is not a given, and the experience till now suggests otherwise.
>
> I have commented out the code which applies the narrowing in
> 'handle_fontified_prop' and recompiled.
>
> The result:
>
> - My 88K file is fontified correctly now. The redisplay and scrolling
> performance seem unaffected (meaning still fast).
> - dictionary.json (18M) seems to be fontified correctly as well now
> (it's a mess by default on master), its scrolling performance is
> unaffected too. The difference: I have to wait ~2 seconds the first time
> I press 'M->'.

We consider a 2-second wait in this case to be "too slow".

But if all you are saying is that the value of long-line-threshold
should be changed, or that perhaps the portion of the buffer around
the window used for narrowing should be enlarged and/or exposed to
control of Lisp programs, we can discuss that. My impression, though,
was that your arguments are much more basic: that they argue against
the very methods of solving this problem that we currently have on
master. And that is an entirely different discussion than the one
about the default values of these thresholds.

> BTW, 'M-> M-<' triggers some puzzling long wait (~3 seconds) both on
> master and with my change, every time I issue this sequence of commands.

It would be useful to look into the reasons for that, thanks.
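
To illustrate the point made earlier in this message, that line length
rather than buffer size determines the cost, here is a toy model in C.
It is not the Emacs redisplay code: column_at, make_buffer and
steps_near_end are hypothetical names, and the 8-column tab stops are
an assumption. It only mimics the idea that the column of a position
is known in advance only at the beginning of its line, so finding it
anywhere else means walking forward from there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model, not Emacs code: return the display column of POS in BUF.
   The column is known to be 0 only at the beginning of a line, so we
   must walk forward from there.  *STEPS receives the number of
   characters examined during that forward walk.  */
size_t
column_at (const char *buf, size_t pos, size_t *steps)
{
  size_t bol = pos;

  /* Find the beginning of the line that contains POS.  */
  while (bol > 0 && buf[bol - 1] != '\n')
    bol--;

  /* Walk forward; tabs (assumed 8-column stops) make widths variable,
     so the column cannot be computed without examining the text.  */
  size_t col = 0;
  for (size_t i = bol; i < pos; i++)
    col += (buf[i] == '\t') ? 8 - col % 8 : 1;

  *steps = pos - bol;
  return col;
}

/* Build a TOTAL-byte buffer of 'x' characters in which every line is
   LINE_LEN characters long and ends in a newline.  */
char *
make_buffer (size_t total, size_t line_len)
{
  char *buf = malloc (total + 1);
  assert (buf != NULL);
  for (size_t i = 0; i < total; i++)
    buf[i] = ((i + 1) % (line_len + 1) == 0) ? '\n' : 'x';
  buf[total] = '\0';
  return buf;
}

/* Number of characters examined to find the column of the last
   position in a TOTAL-byte buffer with LINE_LEN-character lines.  */
size_t
steps_near_end (size_t total, size_t line_len)
{
  char *buf = make_buffer (total, line_len);
  size_t steps = 0;
  column_at (buf, total - 1, &steps);
  free (buf);
  return steps;
}
```

In this model, steps_near_end (1000000, 80) and
steps_near_end (10000000, 80) both stay at or below 80 steps, while a
buffer that is a single 1000000-character line costs 999999 steps for
the same query. Whether a real slowdown follows this pattern can still
only be established by profiling.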