From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: How to add pseudo vector types
Date: Thu, 15 Jul 2021 19:48:26 +0300
Message-ID: <83r1fz5xr9.fsf@gnu.org>
References: <83h7gw6pyj.fsf@gnu.org> <45EBF16A-C953-42C7-97D1-3A2BFEF7DD01@gmail.com> <83y2a764oy.fsf@gnu.org> <83v95b60fn.fsf@gnu.org> <00DD5BFE-D14E-449A-9319-E7B725DEBFB3@gmail.com>
In-Reply-To: <00DD5BFE-D14E-449A-9319-E7B725DEBFB3@gmail.com> (message from Yuan Fu on Thu, 15 Jul 2021 12:19:31 -0400)
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
To: Yuan Fu
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

> From: Yuan Fu
> Date: Thu, 15 Jul 2021 12:19:31 -0400
> Cc: monnier@iro.umontreal.ca,
>  emacs-devel@gnu.org
>
> > Why do you need to do this when a buffer is updated? Why not use
> > display as the trigger? Large portions of a buffer will never be
> > displayed, and some buffers will not be displayed at all. Why waste
> > cycles on them? Redisplay is perfectly equipped to tell you when some
> > chunk of buffer text is going to be redrawn, and it already knows to
> > do nothing if the buffer hasn't changed.
>
> Tree-sitter expects you to tell it every single change to the parsed text.

That cannot be true, because the parsed text could be in a state where
parsing it will fail. When you are in the middle of writing the code,
this is what will happen many times, even if you pass the whole buffer
to the parser. And since tree-sitter _must_ be able to deal with this
problem, it also must be able to receive incomplete parts of the
buffer text, and do the best it can with it.

> Say you have a buffer with some content and scrolled through it, so tree-sitter has parsed the whole buffer. Then some elisp edited some text outside the visible portion. Redisplay doesn’t happen, we don’t tell this edit to tree-sitter. Then I scroll to the place that has been edited. What now?

Now you call tree-sitter passing it the part of the buffer that needs
to be parsed (e.g., the chunk that is about to be displayed). If
tree-sitter needs to look back, it will.

> I’ve lost the change information, and tree-sitter’s tree is out-dated.

No information is lost because the updated buffer text is available.

> We can fontify on-demand, but we can’t parse on-demand.

Sorry, I don't believe this is true. tree-sitter _must_ be able to
deal with these situations, because it must be able to deal with
incomplete text that cannot be parsed without parse errors.

In addition, Emacs records (for redisplay purposes) two places in each
buffer related to changes: the minimum buffer position before which no
changes were done since last redisplay, and the maximum buffer
position beyond which there were no changes. This can also be used to
pass only a small part of the buffer to the parser, because the rest
didn't change.

> What we can do is to only parse the portion from BOB to the visible portion. So we won’t parse the whole buffer unless you scroll to the bottom.

My primary worry is the fact that you want to use buffer-change hooks
(and will soon enough want to use post-command-hook as well). They
slow down editing, sometimes tremendously, so I'd very much prefer not
to use those hooks for fontification/parsing. The original font-lock
mechanism in Emacs 19 used these hooks; we switched to jit-lock and
its redisplay-triggered fontifications because the original design had
problems which couldn't be solved reliably and with reasonable
performance. I hope we will not make the mistake of going back to
that sub-optimal design.

> >> And, for tree-sitter to take the buffer’s content directly, we need to tell it to skip the gap.
> >
> > AFAIR, tree-sitter allows the calling package to provide a function to
> > access the text, isn't that so? If so, you could write a function
> > that accesses buffer text via BYTE_POS_ADDR etc., and that knows how
> > to skip the gap already.
>
> Yes, that function returns a char*. But what if the gap is in the middle of the portion that tree-sitter wants to read?

If you provide the function that returns text one character at a time,
as AFAIR tree-sitter allows, you will be able to skip the gap
automagically by using BYTE_POS_ADDR. If that's not possible for some
reason, or not performant enough, we could ask tree-sitter developers
to add an API that accesses buffer text in two chunks, in which case
it will be called first with text before the gap, and then with text
after the gap. Like we do when we call regex search functions.

> Alternatively, we can copy the text out and pass it to tree-sitter, but you don’t like that, IIRC.

Yes, because it means memory allocation, which could be slow,
especially for large buffers. It could even fail if the buffer is
large enough and the system is under memory pressure.

> >> I only need to modify gap_left, gap_right, make_gap_smaller and make_gap_larger, right?
> >
> > Why would you need to _modify_ any of these?
>
> Because I want to let tree-sitter know where the gap is so it can avoid it when reading text.

Knowing where the gap is doesn't need any changes to these functions.
See GPT_BYTE, GAP_SIZE, BUF_GPT_BYTE, and BUF_GAP_SIZE. And the gap
cannot move while tree-sitter accesses the buffer, because no other
part of the Lisp machine can run at that time.
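
To make the chunk-wise access concrete, here is a minimal sketch, not
actual Emacs or tree-sitter code. It assumes a reader callback that
returns a pointer plus a byte count; if I recall correctly,
tree-sitter's read callback has a similar shape (with an extra
position argument, omitted here). The names struct gap_text,
read_chunk and copy_all are hypothetical, and the gap position and
size are plain struct fields standing in for what Emacs already
records about the gap. The point is only that a reader which never
returns a run crossing the gap needs no changes to the gap-moving
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a buffer's text.  Storage holds the
   first GPT_BYTE bytes of text, then GAP_SIZE unused bytes, then
   the remaining text.  */
struct gap_text
{
  const char *text;   /* start of storage, gap included */
  uint32_t gpt_byte;  /* 0-based byte offset where the gap starts */
  uint32_t gap_size;  /* size of the gap in bytes */
  uint32_t z_byte;    /* number of text bytes, gap excluded */
};

/* Return the longest contiguous run of text starting at BYTE_INDEX
   and store its length in *BYTES_READ.  A run never crosses the
   gap: a request that starts before the gap stops at the gap, and
   the caller's next request lands after it.  */
static const char *
read_chunk (void *payload, uint32_t byte_index, uint32_t *bytes_read)
{
  struct gap_text *b = payload;

  assert (bytes_read != NULL);
  if (byte_index >= b->z_byte)
    {
      /* Past the end of the text: report end of input.  */
      *bytes_read = 0;
      return "";
    }
  if (byte_index < b->gpt_byte)
    {
      /* Before the gap: everything up to the gap is contiguous.  */
      *bytes_read = b->gpt_byte - byte_index;
      return b->text + byte_index;
    }
  /* At or after the gap: skip over it in the storage.  */
  *bytes_read = b->z_byte - byte_index;
  return b->text + b->gap_size + byte_index;
}

/* Drain the reader the way a parser might: keep asking for the
   next chunk until none is left.  Returns the number of bytes
   copied to OUT.  */
static size_t
copy_all (struct gap_text *b, char *out, size_t out_size)
{
  size_t n = 0;

  for (;;)
    {
      uint32_t len;
      const char *p = read_chunk (b, (uint32_t) n, &len);

      if (len == 0 || n + len > out_size)
        break;
      memcpy (out + n, p, len);
      n += len;
    }
  return n;
}
```

With the gap in the middle of the requested range, the consumer simply
sees two chunks, one ending at the gap and one starting after it.
copy_all is there only to exercise the reader; a real parser would
consume the chunks in place, so no copy of the buffer text is made.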