From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Wed, 11 May 2022 23:07:35 -0700
To: Eli Zaretskii
Cc: Yoav Marco, emacs-devel@gnu.org
In-Reply-To: <838rr7qqhw.fsf@gnu.org>

> On May 11, 2022, at 10:17 PM, Eli Zaretskii wrote:
>
>> From: Yuan Fu
>> Date: Wed, 11 May 2022 13:14:33 -0700
>> Cc: Yoav Marco,
>>  emacs-devel@gnu.org
>>
>>> |   |                                      | no reuse (now) | reuse |
>>> | 1 | Fontify xdisp.c all at once          | 0.01s          | 0.01s |
>>> | 2 | Fontify 60 next lines of xdisp.c ×10 | 0.10s          | 0.00s |
>>> | 3 | Fontify 60 next lines till the end   | 6.06s          | 0.01s |
>>>
>>> If so, what is the significance of the last line in practical use
>>> cases?  JIT font-lock never fontifies such large chunks of source
>>> code, it does that in 512-character chunks, which is less than 60
>>> lines in most cases, and definitely not "till the end".
>>
>> I think that’s just a way to run font-lock enough times without repeatedly fontifying the same region?
>
> Then I'm not sure the result is very interesting by itself, unless we
> can find a way to use that result for estimating how long will it take
> to perform fontifications in some practical use cases that we care
> about, and compare that to what we have now in those use cases.
>
>> I redid the benchmark, but without his reuse patch, just to see how much time is spent on creating query objects. So fontifying 40 lines for 463 times takes 6.92s (according to Emacs, 7.30s according to the profiler). That counts to 0.0158s per call to font-lock-region, of which 0.0104s is spent on creating the query object. That seems to tell me if we optimize away the query object creation we can make font-locking very very fast?
>
> According to your benchmarks, it is already very fast: 16 msec is a
> negligible time interval.  Of course, 40 is a somewhat arbitrary
> number, but to get a less arbitrary one, we should determine it from
> some concrete scenarios, such as the 512-character chunk JIT font-lock
> uses during redisplay, or the number of lines on a typical window
> that's important when one scrolls with C-v/M-v, etc.
>
>> If we expose "compiled query" we don’t need to cache them either.
>
> Then the Lisp program will have to do that, which is even worse,
> because the problems I described will now have to be solved by Lisp
> application programmers, each time anew.
>
>> Benchmark 3: fontify all of xdisp.c, 40 lines at a time.
>> took 88.28, of which 5.00 is GC (4 gc runs), loop count: 905
>>
>> font-lock: 88.28s -> 0.1997285067873303 / loop
>
> So we already have an order-of-magnitude speed-up with tree-sitter: we
> go from 200 msec down to 16 msec.  Also, 200 msec is above the
> threshold of human perception of a response delay, whereas 16 msec is
> way below that threshold.
> With such significantly faster font-lock, I
> wouldn't bother caching anything, at least not yet, not unless someone
> comes up with a practical use case where the query-compilation part
> really makes a significant practical difference in terms of absolute
> response times.
>
> Bottom line: I think the 6-msec speedup (from 16 to 10) in the
> scenario that was used in these benchmarks doesn't justify the
> complexities of caching the queries, given the overall excellent
> performance we get with tree-sitter.  Caching is an optimization, and
> in this case it sounds like doing that now would be a premature
> optimization.

Sure, that makes sense, and it saves me writing code ;-) If we want it later we can easily add that without breaking any API.

Yuan
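
For anyone who wants to re-check the per-call figures quoted above, here is a rough sketch of the arithmetic in Python. The numbers are copied from the quoted benchmark (463 calls, 6.92s and 7.30s totals, 0.0104s of query creation per call); the variable names are made up for illustration and are not part of any Emacs API.

```python
# Figures taken from the quoted benchmark; all names here are illustrative only.
calls = 463                # calls to font-lock-region, 40 lines each
total_emacs_s = 6.92       # total time as reported by Emacs
total_profiler_s = 7.30    # total time as reported by the profiler
query_creation_s = 0.0104  # per-call time spent creating the query object

# Average cost of one call under each measurement.
per_call_emacs = total_emacs_s / calls
per_call_profiler = total_profiler_s / calls

# What would remain per call if query creation cost nothing,
# and how large a share of each call query creation accounts for.
remaining = per_call_profiler - query_creation_s
query_share = query_creation_s / per_call_profiler

print(f"per call (Emacs):    {per_call_emacs * 1000:.1f} ms")
print(f"per call (profiler): {per_call_profiler * 1000:.1f} ms")
print(f"without query setup: {remaining * 1000:.1f} ms")
print(f"query setup share:   {query_share:.0%}")
```

The 0.0158s per call quoted above matches the profiler total, not the Emacs one, so the sketch uses the profiler total for the remainder and the share.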