From: Ergus
Newsgroups: gmane.emacs.devel
Subject: Re: cc-mode fontification feels random
Date: Sat, 12 Jun 2021 13:01:03 +0200
Message-ID: <20210612110103.u6kuh3d5vahxmxlt@Ergus>
References: <83k0n09tkp.fsf@gnu.org> <837dj09p0e.fsf@gnu.org> <20210611232535.b4dyu3a2yxvdixys@Ergus> <87a6nw6jtf.fsf@telefonica.net> <20210612010844.45noqsg7wveeo3yw@Ergus> <83sg1n8t71.fsf@gnu.org>
To: Eli Zaretskii
Cc: ofv@wanadoo.es, emacs-devel@gnu.org
In-Reply-To: <83sg1n8t71.fsf@gnu.org>

On Sat, Jun 12, 2021 at 09:58:58AM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 03:08:44 +0200
>> From: Ergus
>> Cc: emacs-devel@gnu.org
>>
>> BTW: Eli was concerned about the extra copy of the buffer text to send
>> it to tree-sitter. In this case the time to memcopy an array with all
>> xdisp text is ~0.00085 seconds.
>
>If the intent is to use buffer-(sub)string, then you forget the price
>of consing. That would trigger frequent GC cycles, which will all but
>kill the otherwise fast performance.
>
>> Any way if we don't want the copy we can use
>> ts_parser_set_included_ranges to exclude the gap and pass the text
>> pointer directly without any copy.
>
>I hope someone will try that and report the results.
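To make the "exclude the gap" idea concrete, here is a minimal sketch in Python of how the two text ranges around a buffer gap could be computed. It does not call tree-sitter at all: `ranges_excluding_gap` and `text_without_gap` are hypothetical helper names, and the ranges are plain byte offsets. As far as I know, the real ranges passed to ts_parser_set_included_ranges also carry row/column points, which this sketch leaves out.

```python
from typing import List, Tuple


def ranges_excluding_gap(buffer_size: int, gap_start: int,
                         gap_end: int) -> List[Tuple[int, int]]:
    """Return the half-open byte ranges of real text in a gap buffer.

    The storage is laid out as [text before gap][gap][text after gap].
    Empty ranges are dropped, so the result has 0, 1 or 2 entries.
    """
    if not 0 <= gap_start <= gap_end <= buffer_size:
        raise ValueError("gap must lie inside the buffer storage")
    ranges = []
    if gap_start > 0:
        ranges.append((0, gap_start))
    if gap_end < buffer_size:
        ranges.append((gap_end, buffer_size))
    return ranges


def text_without_gap(storage: bytes, gap_start: int, gap_end: int) -> bytes:
    """Join the text on both sides of the gap (this is the copy we want to avoid)."""
    return b"".join(storage[start:end]
                    for start, end in ranges_excluding_gap(len(storage),
                                                           gap_start, gap_end))


# Hypothetical storage: "int x", a 4-byte gap, then " = 1;".
storage = b"int x" + b"\0" * 4 + b" = 1;"
print(ranges_excluding_gap(len(storage), 5, 9))
print(text_without_gap(storage, 5, 9))
```

The point of the first function is that a parser given only those ranges never needs the joined copy that the second function builds.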
>
>The other design issue with TS integration is that I'd like it to plug
>into the JIT font-lock interface of the display engine, so that we
>don't unnecessarily fontify parts of the buffer that won't be
>displayed, and always do fontify the parts that will be.

If I understand our cc-mode functionality correctly (and there is much of it we don't want to lose, like indentation and code navigation), the "right" way to use tree-sitter is probably not only to fontify, but to use the tree information to add contextual information to the text (something that I think cc-mode does), and then let font-lock do the magic. Maybe Alan wants to give a more precise technical description.

The tree-sitter tree is basically contextual information. If, for example, we have processed the whole buffer and already have the tree, then scrolling won't need to parse anything, and adding or removing text is a localized modification, so with the previous tree we can re-parse only the modified region.

The choice is then whether we propertize the text of the whole buffer, only the visible region, or "on demand". This would save us from the hard parsing that cc-mode does to fontify "on the fly".

> I don't
>really care if TS actually processes a much larger chunk of text, if
>it does that quickly enough, but processing the resulting faces will
>take time on the Emacs side, and that is better avoided.

But then we won't get all the contextual information we need for indentation, code navigation or code folding, right? So we would still be under-using the tree-sitter features, which could provide functionality we already have in cc-mode and that we may also like to have in other, more "limited" modes.
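A minimal sketch of the "propertize on demand" option, assuming the parse of the whole buffer has already been reduced to a flat list of face spans. The `Span` type, the face names and `spans_for_window` are all hypothetical stand-ins for illustration, not tree-sitter or Emacs APIs. The parse covers everything, but only the spans that intersect the visible window are turned into faces.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    """A half-open region [start, end) with the face it should get."""
    start: int
    end: int
    face: str


def spans_for_window(spans: List[Span], window_start: int,
                     window_end: int) -> List[Span]:
    """Clip the spans to the visible window and drop the ones outside it."""
    visible = []
    for span in spans:
        start = max(span.start, window_start)
        end = min(span.end, window_end)
        if start < end:  # keep only non-empty intersections
            visible.append(Span(start, end, span.face))
    return visible


# Hypothetical spans derived once from a full parse of the buffer.
all_spans = [Span(0, 3, "type"), Span(4, 8, "function-name"),
             Span(20, 26, "keyword")]
print(spans_for_window(all_spans, 2, 6))
```

With this split the tree still holds the context for indentation or navigation over the whole buffer, while the face processing on the Emacs side stays proportional to what is displayed.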
> More
>importantly, integration into JIT font-lock machinery means we don't
>need to use other hooks, which is a step back, since using such hooks
>for fontification was already shown to have serious problems in pre-21
>Emacs: they don't always catch all the changes which require
>re-fontification.
>

I see two approaches here:

1) add the tree-sitter properties/faces to the buffer text (fully, or partially on the visible regions)

2) use the tree-sitter information directly from the tree and add the visible properties from there.

The second one will require a more complete API of tree-sitter functions exposed to elisp, but in my opinion it is worth it for accuracy, speed and simplicity (a single API to rule them all), and to support many languages we don't currently support, like Rust or the fancy C++ > 11.

+ Remember that TS has the partial parsing options (specifying the regions to parse), the re-parsing option (using a previous tree for the same buffer as a hint, which reduces the times drastically), and even a tree comparison function that reports what differs from the "hint" tree, so we know what needs to be updated. Plus all the navigation functions, like finding parent or child nodes, parse error handling, iterating over nodes and so on.
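For the re-parsing option, the parser has to be told what changed before it can reuse the previous tree. A minimal sketch, assuming we only have the old and new text and want the three byte offsets of a single edit. The field names are modeled on the byte fields of tree-sitter's TSInputEdit, but `Edit` and `describe_edit` are hypothetical, and in Emacs the change hooks would normally supply these positions directly instead of diffing.

```python
from typing import NamedTuple


class Edit(NamedTuple):
    start_byte: int    # first byte that differs
    old_end_byte: int  # end of the replaced region in the old text
    new_end_byte: int  # end of the replacement in the new text


def describe_edit(old: bytes, new: bytes) -> Edit:
    """Describe the change from old to new as one edit (prefix/suffix diff)."""
    limit = min(len(old), len(new))

    # Length of the common prefix.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    # Length of the common suffix, not allowed to overlap the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1

    return Edit(prefix, len(old) - suffix, len(new) - suffix)


print(describe_edit(b"int x = 1;", b"int xy = 1;"))
print(describe_edit(b"abcdef", b"abef"))
```

Everything outside `[start_byte, old_end_byte)` is untouched text, which is the part of the previous tree a re-parse can keep.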