From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#60691: 29.0.60; Slow tree-sitter font-lock in ruby-ts-mode
Date: Mon, 30 Jan 2023 02:15:44 +0200
Message-ID: <152c2d15-ab5c-ff35-f79b-71691fc223f8@yandex.ru>
References: <867cxv3dnn.fsf@mail.linkov.net> <90D8581D-C543-43EE-8BEB-51B5AFBCBAEE@gmail.com> <0e552ada-081f-ad90-19c2-645a64ef50ac@yandex.ru>
To: Yuan Fu
Cc: juri@linkov.net, 60691-done@debbugs.gnu.org

On 30/01/2023 01:23, Yuan Fu wrote:
>
>> On Jan 29, 2023, at 3:07 PM, Dmitry Gutov wrote:
>>
>> Hi Yuan,
>>
>> On 29/01/2023 10:25, Yuan Fu wrote:
>>
>>>>>> So if previously it happened once somehow during a certain scenario, now I have to repeat the same scenario 4 times, and the condition is met.
>>>>> I was hoping that the scenario only happens once, oh well 😄 I’ll
>>>>> change the decision based on analyzing the tree’s dimension: too
>>>>> deep or too wide activates the fast mode. Let’s see how it works.
>>>> Thank you, let me know when it's time to test again.
>>> Sorry for the delay. Now treesit-font-lock-fontify-region uses
>>> treesit-subtree-stat to determine whether to enable the "fast mode". Now
>>> it should be impossible to activate the fast mode on moderately sized
>>> buffers.
>> Thank you, it seems to work just fine in my scenario. And treesit-subtree-stat makes sense.
>>
>> I have a few more questions about the current strategy, though.
>>
>> IIUC, we only do the treesit--font-lock-fast-mode test once in treesit-font-lock-fontify-region, and then use the detected value for the whole later life of the buffer. Is that right?
>>
>> What if the buffer didn't originally have the problematic error nodes we are guarding from, and then later the user wrote enough code to have at least one of them? If they didn't close Emacs, or revert the buffer, our logic still wouldn't use the "fast mode", would it?
>>
>> Or vice versa: if the buffer started out with error nodes, and consequently, "fast mode", but then the user has edited it so that those error nodes disappeared, shouldn't the buffer stop using the "fast mode"?
>>
>> From my measurements, in ruby-mode, at least, treesit-subtree-stat is 20-40x faster than refontifying the whole buffer. So one possible strategy would be to repeat the test every time. I'm not sure it's fast enough in the "problem" buffers, though, and I don't have any to test.
>>
>> In those I did test, though, it takes ~1 ms.
>>
>> But we could repeat the test only once every couple of seconds and/or after the buffer has changed again. That would hopefully make it a non-bottleneck in all cases.
> I should mention this in the comments, but the fast mode is only for very rare cases, where the file is mechanically generated and has some peculiarities that cause tree-sitter to work poorly. If the file is hand-written and “normal”, even huge files like xdisp.c are well below the bar. Therefore I don’t think “crossing the line” will realistically happen when editing source files.
>
> Here are the stats of two “problematic files”, named packet and dec_mask, compared to xdisp.c:
>
> ;;           max-depth  max-width  count
> ;; cut-off   100        4000
> ;; packet    (98159     46581      1895137)
> ;; dec_mask  (3         64301      283995)
> ;; xdisp.c   (29        985        218971)
>
> I’d say that any regular source file, even mechanically generated, wouldn’t go beyond ~50 levels in depth, and hand-written files should never have a node that has 4000+ direct children in the parse tree.

Oh, thanks for the explanation. Then the current strategy makes sense.

Is xdisp.c absolutely the largest C file in your experience? According to the above numbers, a file that's only 4x as large could hit our current cutoff (its max-width of 985 is roughly a quarter of the 4000 limit).

Though, TBH, maybe some extreme files do, and they have font-lock performance reduced somewhat. That's not the end of the world, and it shouldn't make a difference for the original scenario (diff-syntax fontification).

Either way, I'm closing this report. Thank you for your help.
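
For reference, the cut-off test discussed above can be sketched roughly as follows. This is an illustration in Python, not the Emacs implementation: the `Node` class, the function names, the convention that the root counts as depth 1, and the strict `>` comparison are all assumptions made for the example. Only the two cut-off values (100 and 4000) and the (max-depth max-width count) triple come from the thread.

```python
from dataclasses import dataclass, field

# Cut-off values quoted in the thread.
MAX_DEPTH_CUTOFF = 100
MAX_WIDTH_CUTOFF = 4000


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node."""
    children: list = field(default_factory=list)


def subtree_stat(root):
    """Return (max_depth, max_width, count) for the tree under ROOT.

    Iterative traversal, so very deep trees (such as the "packet"
    file above) do not hit Python's recursion limit.
    Assumption: the root itself counts as depth 1.
    """
    max_depth = max_width = count = 0
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        count += 1
        max_depth = max(max_depth, depth)
        max_width = max(max_width, len(node.children))
        stack.extend((child, depth + 1) for child in node.children)
    return max_depth, max_width, count


def needs_fast_mode(stat):
    """True if the tree is too deep or too wide (assumed strict '>')."""
    max_depth, max_width, _count = stat
    return max_depth > MAX_DEPTH_CUTOFF or max_width > MAX_WIDTH_CUTOFF
```

Applied to the triples from the table, `needs_fast_mode((98159, 46581, 1895137))` and `needs_fast_mode((3, 64301, 283995))` are true, while `needs_fast_mode((29, 985, 218971))` for xdisp.c is false.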