From: Eli Zaretskii
To: Yuan Fu
Cc: yoavm448@gmail.com, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Thu, 12 May 2022 08:17:15 +0300
Message-ID: <838rr7qqhw.fsf@gnu.org>
In-Reply-To: <8F6A43D1-D1EA-4602-A245-627DB7960FC2@gmail.com> (message from Yuan Fu on Wed, 11 May 2022 13:14:33 -0700)

> From: Yuan Fu
> Date: Wed, 11 May 2022 13:14:33 -0700
> Cc: Yoav Marco,
>  emacs-devel@gnu.org
>
> > |   |                                      | no reuse (now) | reuse |
> > | 1 | Fontify xdisp.c all at once          | 0.01s          | 0.01s |
> > | 2 | Fontify 60 next lines of xdisp.c ×10 | 0.10s          | 0.00s |
> > | 3 | Fontify 60 next lines till the end   | 6.06s          | 0.01s |
> >
> > If so, what is the significance of the last line in practical use
> > cases?  JIT font-lock never fontifies such large chunks of source
> > code, it does that in 512-character chunks, which is less than 60
> > lines in most cases, and definitely not "till the end".
>
> I think that’s just a way to run font-lock enough times without repeatedly fontifying the same region?

Then I'm not sure the result is very interesting by itself, unless we
can find a way to use that result for estimating how long it will take
to perform fontifications in some practical use cases that we care
about, and compare that to what we have now in those use cases.

> I redid the benchmark, but without his reuse patch, just to see how much time is spent on creating query objects. So fontifying 40 lines for 463 times takes 6.92s (according to Emacs, 7.30s according to the profiler). That comes to 0.0158s per call to font-lock-region, of which 0.0104s is spent on creating the query object. That seems to tell me if we optimize away the query object creation we can make font-locking very very fast?

According to your benchmarks, it is already very fast: 16 msec is a
negligible time interval.

Of course, 40 is a somewhat arbitrary number, but to get a less
arbitrary one, we should determine it from some concrete scenarios,
such as the 512-character chunk JIT font-lock uses during redisplay,
or the number of lines on a typical window that's important when one
scrolls with C-v/M-v, etc.

> If we expose “compiled query” we don’t need to cache them either.

Then the Lisp program will have to do that, which is even worse,
because the problems I described will now have to be solved by Lisp
application programmers, each time anew.

> Benchmark 3: fontify all of xdisp.c, 40 lines at a time.
> took 88.28, of which 5.00 is GC (4 gc runs), loop count: 905
>
> font-lock: 88.28s -> 0.1997285067873303 / loop

So we already have an order-of-magnitude speed-up with tree-sitter: we
go from 200 msec down to 16 msec.  Also, 200 msec is above the
threshold of human perception of a response delay, whereas 16 msec is
way below that threshold.

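
As a rough sanity check of the per-call figures quoted above, here is
a minimal Python sketch.  It is only arithmetic on the numbers posted
in this thread, not a new measurement; the variable and function names
are made up for illustration.

```python
# Per-call timings in seconds, copied from the benchmarks quoted in this
# thread.  All names here are illustrative, not part of Emacs.
OLD_FONT_LOCK_PER_CALL = 0.1997  # current font-lock, 40 lines per call
TS_PER_CALL = 0.0158             # tree-sitter font-lock, 40 lines per call
TS_QUERY_CREATION = 0.0104       # share of TS_PER_CALL spent creating the query


def speedup(before: float, after: float) -> float:
    """Return how many times faster `after` is compared to `before`."""
    return before / after


# Time per call that would remain if query creation cost nothing at all.
ts_without_query = TS_PER_CALL - TS_QUERY_CREATION

ratio = speedup(OLD_FONT_LOCK_PER_CALL, TS_PER_CALL)

print(f"tree-sitter vs. current font-lock: {ratio:.1f}x faster")
print(f"tree-sitter per call:              {TS_PER_CALL * 1000:.1f} msec")
print(f"same, minus query creation:        {ts_without_query * 1000:.1f} msec")
```

Both tree-sitter figures are small in absolute terms, which is what
matters for perceived response time, as opposed to the relative gain.
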
With such significantly faster font-lock, I wouldn't bother caching
anything, at least not yet, not unless someone comes up with a
practical use case where the query-compilation part really makes a
significant practical difference in terms of absolute response times.

Bottom line: I think the roughly 10-msec speedup (from 16 to about 5)
in the scenario that was used in these benchmarks doesn't justify the
complexities of caching the queries, given the overall excellent
performance we get with tree-sitter.  Caching is an optimization, and
in this case it sounds like doing that now would be a premature
optimization.

Thanks.