From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii <eliz@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Thu, 12 May 2022 08:17:15 +0300
Message-ID: <838rr7qqhw.fsf@gnu.org>
References: <87y1zabmbt.fsf@gmail.com>
 <5F186EBD-CD21-422B-8B4F-0D5424173334@gmail.com> <875ymdwf76.fsf@gmail.com>
 <011DA1A3-0FA8-4449-878A-FD6B336B0F1B@gmail.com> <8735hhw75p.fsf@gmail.com>
 <83czgks4ss.fsf@gnu.org> <87wnesuw63.fsf@gmail.com> <83pmkkqhft.fsf@gnu.org>
 <87tu9wukbt.fsf@gmail.com> <83ee10qbk7.fsf@gnu.org>
 <8F6A43D1-D1EA-4602-A245-627DB7960FC2@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="35386"; mail-complaints-to="usenet@ciao.gmane.io"
Cc: yoavm448@gmail.com, emacs-devel@gnu.org
To: Yuan Fu <casouri@gmail.com>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Thu May 12 07:19:03 2022
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1np1E6-00094i-Ey
	for ged-emacs-devel@m.gmane-mx.org; Thu, 12 May 2022 07:19:02 +0200
Original-Received: from localhost ([::1]:40386 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1np1E5-0007lw-6c
	for ged-emacs-devel@m.gmane-mx.org; Thu, 12 May 2022 01:19:01 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:52912)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@gnu.org>) id 1np1CM-0006wl-Fy
 for emacs-devel@gnu.org; Thu, 12 May 2022 01:17:14 -0400
Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:57054)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@gnu.org>)
 id 1np1CM-0002j9-4F; Thu, 12 May 2022 01:17:14 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=pqVPQsyVx91koeD/D/t5NbDxUUSD8K6ABbSvdJoS1iw=; b=OmaX84Dr5iM+SbN/SY6Q
 gfKPbxz3WaDoUlqdJoYmzrGLhIivc0hNe+DXXQjvTgj3IQsWrivwGVOK7vclTxnN7Thu0mV9UfAqA
 mNa0QoidW9Iekbngib/RENrUaKQS4c0rFl10HmT4fSLU712bjAKCPK6u7+e+UhgSgJ8z7vZx4W1Ho
 xyvDXpnkrpPry9qh/JKOkJs1YuvC1suNeDFicNH9H14PcQCv/ThQ7BxmmeJ71+h8oAGod81II6SWv
 NfE+9MSX15KPYjqFVp50/HTNpYOOKiraB394kiPnTH9HZ6gnnQcVj7aQ3QUW9UTTCDGxVa0Wil10U
 042fzH7FjTkvWA==;
Original-Received: from [87.69.77.57] (port=3906 helo=home-c4e4a596f7)
 by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@gnu.org>)
 id 1np1CL-00021Y-8n; Thu, 12 May 2022 01:17:13 -0400
In-Reply-To: <8F6A43D1-D1EA-4602-A245-627DB7960FC2@gmail.com> (message from
 Yuan Fu on Wed, 11 May 2022 13:14:33 -0700)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Original-Sender: "Emacs-devel"
 <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Xref: news.gmane.io gmane.emacs.devel:289661
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/289661>

> From: Yuan Fu <casouri@gmail.com>
> Date: Wed, 11 May 2022 13:14:33 -0700
> Cc: Yoav Marco <yoavm448@gmail.com>,
>  emacs-devel@gnu.org
> 
> >  |   |                                      | no reuse (now) | reuse |
> >  | 1 | Fontify xdisp.c all at once          |          0.01s | 0.01s |
> >  | 2 | Fontify 60 next lines of xdisp.c ×10 |          0.10s | 0.00s |
> >  | 3 | Fontify 60 next lines till the end   |          6.06s | 0.01s |
> > 
> > If so, what is the significance of the last line in practical use
> > cases?  JIT font-lock never fontifies such large chunks of source
> > code, it does that in 512-character chunks, which is less than 60
> > lines in most cases, and definitely not "till the end".
> 
> I think that’s just a way to run font-lock enough times without repeatedly fontifying the same region?

Then I'm not sure the result is very interesting by itself, unless we
can find a way to use that result for estimating how long it will take
to perform fontifications in some practical use cases that we care
about, and compare that to what we have now in those use cases.

> I redid the benchmark, but without his reuse patch, just to see how much time is spent on creating query objects. So fontifying 40 lines 463 times takes 6.92s (according to Emacs, 7.30s according to the profiler). That comes to 0.0158s per call to font-lock-region, of which 0.0104s is spent on creating the query object. That seems to tell me that if we optimize away the query object creation we can make font-locking very, very fast?

According to your benchmarks, it is already very fast: 16 msec is a
negligible time interval.  Of course, 40 is a somewhat arbitrary
number, but to get a less arbitrary one, we should determine it from
some concrete scenarios, such as the 512-character chunk JIT font-lock
uses during redisplay, or the number of lines in a typical window,
which matters when one scrolls with C-v/M-v, etc.
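
To make the arithmetic behind these numbers explicit, here is a rough
sketch in Python.  The totals are the ones quoted above; the function
names, and the model that treats query creation as a fixed per-call
cost with the remainder scaling linearly with the number of lines, are
assumptions made only for illustration, not measurements.

```python
# Figures quoted from the benchmark above.
TOTAL_SECONDS_PROFILER = 7.30    # total run time according to the profiler
CALLS = 463                      # number of 40-line fontification calls
QUERY_SECONDS_PER_CALL = 0.0104  # time per call spent creating the query object
BENCHMARK_LINES = 40             # lines fontified per call in the benchmark


def per_call_seconds(total_seconds, calls):
    """Average wall-clock time of one fontification call."""
    return total_seconds / calls


def estimate_call_seconds(lines, per_call, query_cost,
                          benchmark_lines=BENCHMARK_LINES):
    """Estimate the cost of fontifying LINES lines in one call.

    ASSUMPTION (hypothetical model, not measured): query creation is a
    fixed cost per call, and the rest scales linearly with line count.
    """
    per_line = (per_call - query_cost) / benchmark_lines
    return query_cost + per_line * lines


per_call = per_call_seconds(TOTAL_SECONDS_PROFILER, CALLS)
query_share = QUERY_SECONDS_PER_CALL / per_call

print(f"per call: {per_call * 1000:.1f} ms")
print(f"share spent creating the query: {query_share:.0%}")
```

Under that assumed model, estimate_call_seconds can be fed a line
count taken from a concrete scenario (a JIT chunk, a window's worth of
lines) instead of the arbitrary 40; the output is only as good as the
linearity assumption.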

> If we expose "compiled query" we don’t need to cache them either.

Then the Lisp program will have to do that, which is even worse,
because the problems I described will now have to be solved by Lisp
application programmers, each time anew.

> Benchmark 3: fontify all of xdisp.c, 40 lines at a time.
> took 88.28, of which 5.00 is GC (4 gc runs), loop count: 905
> 
> font-lock: 88.28s -> 0.1997285067873303 / loop

So we already have an order-of-magnitude speed-up with tree-sitter: we
go from 200 msec down to 16 msec.  Also, 200 msec is above the
threshold of human perception of a response delay, whereas 16 msec is
way below that threshold.  With such significantly faster font-lock, I
wouldn't bother caching anything, at least not yet, not unless someone
comes up with a practical use case where the query-compilation part
really makes a significant practical difference in terms of absolute
response times.
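
As a rough illustration of that comparison, the following Python
sketch computes the speed-up ratio from the two per-call figures quoted
in this thread.  The 100-msec perception threshold is an assumed,
commonly cited ballpark and not a number from these benchmarks; the
function names are likewise hypothetical.

```python
# Per-call times quoted in this thread.
OLD_SECONDS_PER_CALL = 0.1997285067873303  # "font-lock: ... / loop" figure
TS_SECONDS_PER_CALL = 0.0158               # tree-sitter, 40 lines per call

# ASSUMPTION: a ballpark threshold for a noticeable response delay.
PERCEPTION_THRESHOLD_SECONDS = 0.1


def speedup(old_seconds, new_seconds):
    """How many times faster the new per-call time is."""
    return old_seconds / new_seconds


def is_perceptible(seconds, threshold=PERCEPTION_THRESHOLD_SECONDS):
    """True if a delay of SECONDS reaches the assumed threshold."""
    return seconds >= threshold


ratio = speedup(OLD_SECONDS_PER_CALL, TS_SECONDS_PER_CALL)
print(f"speed-up: {ratio:.1f}x")
print(f"old delay perceptible: {is_perceptible(OLD_SECONDS_PER_CALL)}")
print(f"tree-sitter delay perceptible: {is_perceptible(TS_SECONDS_PER_CALL)}")
```

The ratio comes out above 10, and only the old per-call time reaches
the assumed threshold.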

Bottom line: I think the roughly 10-msec speedup (from 16 to about 6)
in the scenario that was used in these benchmarks doesn't justify the
complexities of caching the queries, given the overall excellent
performance we get with tree-sitter.  Caching is an optimization, and
in this case it sounds like doing that now would be a premature
optimization.

Thanks.