From: Eli Zaretskii
To: Yoav Marco
Cc: casouri@gmail.com, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Thu, 12 May 2022 20:18:38 +0300
Message-ID: <83wneqoej5.fsf@gnu.org>
In-Reply-To: <87pmkik7x6.fsf@gmail.com> (message from Yoav Marco on Thu, 12 May 2022 19:26:50 +0300)

> From: Yoav Marco
> Cc: casouri@gmail.com, emacs-devel@gnu.org
> Date: Thu, 12 May 2022 19:26:50 +0300
>
> How I understand it, if it takes 23.474s to fontify 2332 times without
> query caching and 0.037s with, then 99.7% of the time is spent in
> recompiling the same query, or (23.474 - 0.037)/2332 = 10ms per
> fontification.

Yes, and 10 ms is negligibly short.  So, while the relative speedup is
very significant, I still don't see any reason for caching the
queries.

But maybe we should make this discussion more concrete.
Can you show the queries and explain how they are produced from the
font-lock rules (or whatever else they are produced from)?  How many
different queries do we expect to have in a garden-variety major mode
for a PL, and what do they depend on?

> Explanation for the whole table:
>
> |   |                     | font-lock | TS sexp | TS     | TS query reuse |
> | 1 | xdisp.c all at once | 12.886    | 0.031   | 0.016  | 0.017          |
> | 2 | 20 × 512c           | 0.273     | 0.214   | 0.209  | 0.000          |
> | 3 | 512c to end         | 4m+       | 24.177  | 23.474 | 0.037          |
>
> Rows:
> - Benchmark 1 xdisp.c all at once: run font-lock-fontify-region
>   on the entire buffer once
> - Benchmark 2 20 × 512c: fontify the next 512 characters 20 times
> - Benchmark 3 512c to end: fontify the next 512 characters until the
>   buffer ends

Thanks.  I think these benchmarks are not very useful.  Representative
benchmarks I can think of are:

  . the time it takes to visit xdisp.c and display the first
    window-full
  . visit xdisp.c, then immediately go to its end
  . C-v in xdisp.c (repeat many times to see how much a single C-v
    takes)

> >> >> If we expose “compiled query” we don’t need to cache them either.
> >> >
> >> > Then the Lisp program will have to do that, which is even worse,
> >> > because the problems I described will now have to be solved by Lisp
> >> > application programmers, each time anew.
> >>
> >> Will they? They'd just need to compile their queries once, when defining
> >> them or when setting treesit-font-lock-defaults.
> >
> > And decide when to discard them.
>
> I thought garbage collection could take care of that. Is that
> problematic?

GC can take care of queries that the Lisp program no longer needs, but
the Lisp program should first decide that it no longer needs them.
For example, it should stop referencing them in any data structure.
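
The point about references can be illustrated with a small sketch.  It
is written in Python rather than Emacs Lisp, purely as an analogy: the
CompiledQuery class, the query_cache dictionary and get_query are
hypothetical stand-ins, not part of tree-sitter or of the branch under
discussion.  It only shows that a cached object stays alive until the
program itself drops the reference, after which the collector can
reclaim it.

```python
import gc
import weakref


class CompiledQuery:
    """Hypothetical stand-in for a compiled query object."""

    def __init__(self, source):
        self.source = source


# Hypothetical cache owned by the program: query source -> compiled query.
query_cache = {}


def get_query(source):
    """Compile SOURCE on first use and return the cached object afterwards."""
    if source not in query_cache:
        query_cache[source] = CompiledQuery(source)
    return query_cache[source]


query = get_query("(comment) @comment")
probe = weakref.ref(query)  # observes the object without keeping it alive

del query                   # the local name is gone; the cache still refers to it
gc.collect()
alive_while_cached = probe() is not None

query_cache.clear()         # the program decides it no longer needs the query
gc.collect()
alive_after_drop = probe() is not None

print(alive_while_cached, alive_after_drop)
```

In this sketch the collector never reclaims the object on its own
while the cache holds it; deciding when to clear the cache is exactly
the decision that remains with the program.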