From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Tue, 31 Mar 2020 16:14:16 +0300
Message-ID: <83369o3bvb.fsf@gnu.org>
References: <83o8sf3r7i.fsf@gnu.org> <2E218879-0F24-4A20-B210-263C8D0BEEA4@gmail.com> <838sjh2red.fsf@gnu.org>
To: Stefan Monnier
Cc: casouri@gmail.com, akrl@sdf.org, emacs-devel@gnu.org
In-Reply-To: (message from Stefan Monnier on Mon, 30 Mar 2020 23:10:57 -0400)

> From: Stefan Monnier
> Cc: Yuan Fu, emacs-devel@gnu.org, akrl@sdf.org
> Date: Mon, 30 Mar 2020 23:10:57 -0400
>
> > IOW, our goal is not to build the syntax tree, it's to give
> > tree-sitter enough information to allow us to fontify the part that's
> > about to be displayed. We need to have tree-sitter play by Emacs
> > rules, not teach Emacs to play by tree-sitter rules.
>
> IIUC, tree-sitter starts by parsing the whole buffer anyway, and then
> keeps the parse tree up-to-date in response to buffer changes.

Why does it need the entire buffer up front? That sounds like a
potential performance killer. Fontifying a small part of a buffer
doesn't need its entire text.

In any case, I hope that passing the buffer to tree-sitter doesn't
involve marshalling the entire buffer text via a function call as a
huge string, or some such. We should instead request that tree-sitter
exposes an API through which we could give it direct access to buffer
text as 2 parts, before and after the gap, like we do with regex code.
Otherwise this will be a bottleneck in the long run, not unlike the
problem we have with LSP.

> Its algorithm is tuned so that the time needed to update the tree is
> more or less proportional to the size of the change.
>
> So jit-lock/font-lock doesn't need to pass any part of the buffer to
> tree-sitter: tree-sitter already has the buffer's content and we can
> assume its already parsed. What emacs-tree-sitter's proposed
> tree-sitter-highlight does is provide a function which takes
> a START..END, then finds which part of the existing parse tree cover
> that region and "reads the tree" to fontify the corresponding text.

I still don't see why it would need the entire buffer for this class
of applications. Did anyone try the alternatives, in particular on
very large buffers?
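
To make the "2 parts, before and after the gap" idea concrete, here is a minimal sketch in C of how a consumer could read gap-buffer text in place, one contiguous chunk at a time, without the text ever being flattened into a single string. Everything in it is an assumption made for illustration: struct gap_buffer, gb_length, gb_read_chunk and gb_count_byte are hypothetical names, not Emacs internals and not part of any tree-sitter API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical gap buffer (not the Emacs struct): the text occupies
   buf[0..gap_start) and buf[gap_end..size); the bytes in between are
   the unused gap. */
struct gap_buffer {
  const char *buf;
  size_t gap_start;
  size_t gap_end;
  size_t size;
};

/* Logical length of the text, i.e. the storage size minus the gap. */
static size_t gb_length(const struct gap_buffer *gb)
{
  return gb->size - (gb->gap_end - gb->gap_start);
}

/* Return a pointer to the contiguous run of text that starts at the
   logical byte offset POS and store its length in *LEN.  No bytes are
   copied.  At or past the end of the text, *LEN is set to 0. */
static const char *gb_read_chunk(const struct gap_buffer *gb,
                                 size_t pos, size_t *len)
{
  assert(gb->gap_start <= gb->gap_end && gb->gap_end <= gb->size);

  if (pos < gb->gap_start) {
    /* Part 1: everything from POS up to the gap. */
    *len = gb->gap_start - pos;
    return gb->buf + pos;
  }

  size_t total = gb_length(gb);
  if (pos >= total) {
    *len = 0;
    return "";
  }

  /* Part 2: skip over the gap to reach the physical position. */
  *len = total - pos;
  return gb->buf + pos + (gb->gap_end - gb->gap_start);
}

/* Stand-in for a parser: walks the whole text chunk by chunk and
   counts occurrences of byte C.  The gap contents are never seen. */
static size_t gb_count_byte(const struct gap_buffer *gb, char c)
{
  size_t pos = 0, len = 0, count = 0;

  for (;;) {
    const char *chunk = gb_read_chunk(gb, pos, &len);
    if (len == 0)
      break;
    for (size_t i = 0; i < len; i++)
      if (chunk[i] == c)
        count++;
    pos += len;
  }
  return count;
}
```

A consumer written this way makes at most two chunk requests to cover the whole text, however large the buffer is, and the only cost on the buffer side is pointer arithmetic. Whether a given parser library can take its input through a callback of this shape is something to check against that library's documentation.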