From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#56682: Fix the long lines font locking related slowdowns
Date: Sun, 14 Aug 2022 20:59:34 +0300
Message-ID: <838rnqk8op.fsf@gnu.org>
To: Dmitry Gutov
Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca

> Date: Sun, 14 Aug 2022 20:47:40 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov
>
> > The better way is to acknowledge that some inaccuracies are acceptable
> > in those cases.  With that in mind, one can design a syntax analyzer
> > that looks back only a short ways, until it finds some place that
> > could reasonably serve as an anchor point for heuristic decisions
> > about whether we are inside or outside a string or comment, and then
> > verifying that guess with some telltale syntactic elements that follow
> > (like semi-colons or comment-end delimiters in C).  While this kind of
> > heuristics can sometimes fail, if they only fail rarely, the result is
> > a huge win.
>
> You cannot design a language-agnostic syntax analyzer like that.

_I_ cannot, but hopefully someone else will.

> > In any case, the way to speed up these cases is to look at the profile
> > and identify the code that is slowing us down; then attempt to make it
> > faster.  (20 sec is actually long enough for us to interrupt Emacs
> > under a debugger and look at the backtrace to find the culprit.)
>
> I've profiled and benchmarked this scenario already: all of the delay
> (17 seconds, to be precise) comes from parse-partial-sexp. 1 GB is a lot.

Before we get to 1 GB files, there are 20 MB files and 250 MB files.  I
found quite a few low-hanging fruit in those that are worth plucking,
while we wait for parse-partial-sexp to get its act together.

> > If that solves the problems in a reasonable way for very long lines,
> > maybe we will eventually have such an option.
>
> Can I merge the branch, then?

Please wait until I have time to review it.

> I was hoping for a stylistic review, perhaps. Like, whether you like the
> name of the variable, and should it be split in two.
>
> A change of the default value(s) is on the table too.

Will definitely do; I'm just busy with "other things" right now, most
of them related to other aspects of long lines.

> > One such major mode and one such file were presented long ago: a
> > single-line XML file.
>
> XML is indeed slower. It takes almost 3 seconds for me to scroll to the
> end of a 20 MB XML file.
>
> Most of it comes from sgml--syntax-propertize-ppss, which is probably
> justified: XML is a more complex language.

Did you wait till nxml-mode did its initial scan and displayed "Valid"
in the mode line?  The performance is quite different before and after
that.

> But other than the initial delay, scrolling, and isearch, and local
> editing, all work fast, unlike the original situation with JSON.

With which branch?