From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: emacs-tree-sitter and Emacs
Date: Thu, 02 Apr 2020 17:03:43 +0300
Message-ID: <83v9mix9vk.fsf@gnu.org>
References: <83eeta3sa0.fsf@gnu.org> <86369ojbig.fsf@stephe-leake.org> <83lfnfz6jr.fsf@gnu.org> <864ku3htmb.fsf@stephe-leake.org>
In-Reply-To: <864ku3htmb.fsf@stephe-leake.org> (message from Stephen Leake on Wed, 01 Apr 2020 11:51:40 -0800)
To: Stephen Leake
Cc: emacs-devel@gnu.org

> From: Stephen Leake
> Date: Wed, 01 Apr 2020 11:51:40 -0800
>
> Eli Zaretskii writes:
>
> > Can you tell in more detail why you need to rely on these hooks? They
> > shouldn't be necessary, AFAIU.
>
> It is an optimization choice.
>
> In an unmodified buffer, that is smaller than 100,000 characters
> (default setting of wisi-partial-parse-threshold), the entire buffer is
> parsed once; that applies faces to all the Ada identifiers that need
> faces (standard font-lock regexp handles the reserved words). Then when
> font-lock fontifies a region, no parsing is needed.

But why do you need that initial full parse in the first place? Is
parsing parts of the buffer so much harder?

> Indent is similar; the parse sets text properties holding the indent for
> each line; indent-region then applies them.

Indent is a different use case: it happens by user command, and thus
has different time restrictions than redisplay.

> If the default setting of jit-lock-defer-time (ie nil) is used, then
> font-lock runs immediately after each change, and the after-change hooks
> are not needed. But as I have mentioned, I always run with
> jit-lock-defer-time set to 1.0 (because parsing is not fast enough in
> some cases), so the change hooks are needed.

AFAIU, tree-sitter and similar parsers are supposed to be much faster,
so the problem with slow parsing, and all the solutions to alleviate
that problem, may not be necessary, if they are the only reason for
using the hooks.

> The alternative to not requiring after-change hooks is to always do a full
> parse, for every call of fontify-region or indent-region. That is far too
> slow.

Even for indentation, a full parse should not be needed. You need to
only parse the outermost enclosing function/procedure, right? That's
rarely the full buffer, except when the buffer is small.

> Note that Tree-Sitter requires one full parse of the buffer to generate
> the parse tree that is later updated incrementally; in an unmodified
> buffer, only that one parse is needed.

Tree-sitter cannot know what the full buffer holds, so nothing
prevents us from passing it just part of the buffer. After all,
tree-sitter should be able to do a decent job when the part we pass to
it actually _is_ all we have in the buffer, right?

> > And they cannot pick up every relevant change; for example, what
> > happens if some face used for font-lock is modified?
>
> Yes, that is a flaw. Not likely to occur in everyday use

Redisplay cannot rely on something being "unlikely", because it's
expected to produce correct results in all situations. Incorrect
display is one of the worst bugs that can happen in an editor. In a
modified buffer that is not yet syntactically correct we can get away
with slightly incorrect fontifications, but missing face changes will
produce horribly incorrect results even if nothing has changed
syntactically.

That is why I think we should try to avoid using hooks for
fontification as much as we can. I can understand why fontification
methods that are too slow want to get some help from hooks, but when
we design and implement novel fontification methods using fast
parsers, we should first try doing that without any hooks, because we
already know, from the bitter experience of Emacs 19, that using hooks
is a dead end. We developed jit-lock in Emacs 21 precisely to avoid
using such hooks, because we realized that those old methods won't
work well enough.

> >> By default font-lock runs after every character typed
> >
> > No, it only runs when redisplay kicks in. If you type very quickly,
> > it won't run for every character. At least AFAIR.
>
> What triggers redisplay?

When Emacs is about to read input, if no input is available, it
performs redisplay. IOW, Emacs enters redisplay when it's about to
become idle.

> In practice, I and other ada-mode users notice font-lock running after
> each character, with the default setting of jit-lock-defer-time. There
> is a comment in jit-lock.el indicating that the default value may have
> been 0.25 at one point (I did not check the git history); perhaps you
> are remembering that behavior?

The 0.25 value is just a reminder of the default timing of a similar
feature in lazy-lock (RIP), used in Emacs 19. AFAIR, we never had
jit-lock-defer-time non-nil by default in Emacs, because during
development of Emacs 21 the consensus was that its effect is too
surprising, and because (at least in those days) the default jit-lock
was fast enough for us to be able to leave the deferred fontification
disabled.

> For example, in Ada the comment-start is "--". No matter how fast I type
> the two chars, ada-mode reports a syntax error after the first one.

That means you don't type fast enough, at least relative to your CPU
speed.

> I don't think there's anything in ada-mode that forces a redisplay
> (except explicitly calling wisi-parse-buffer; that calls
> font-lock-ensure). But I'd be happy to investigate further if you are
> sure it should not work this way.

In other similar situations (e.g., in Flyspell mode) we wait for some
non-zero idle time before actually running the code which could react
to slow typing with annoying messages.

> The elisp manual section "Forcing redisplay" says "Emacs normally tries
> to redisplay the screen whenever it waits for input." After I type the
> first character, it is no longer waiting for input, it is processing
> that character. I assume here "process that char code" includes running
> after-change-functions, which is (small) elisp code. But I guess after
> processing that char, before calling redisplay, it checks if there is
> more input, which should be true if I type fast enough. Perhaps "process
> that char code" is faster than the combination of my fingers and the
> keyboard char send rate?

Yes, most probably.

> Hmm. M-x (execute-kbd-macro "--") does not show a syntax-error fringe
> blink. I'm not sure if that is relevant here.

I think it is, because it injects the characters through the same
input queue as when you type. It just does that much faster.

> I mentioned above that the parser is only too slow when there is a bad
> syntax error, and recover is slow. However, that is the typical case
> while editing code.

AFAIU, producing reasonably good results in this case is one of the
explicit design goals of tree-sitter. So it might be much better in
these situations. But I have no first-hand experience to tell if
that's indeed so.
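
To illustrate the input-queue behavior discussed above (redisplay runs
only when Emacs is about to wait for input and none is available),
here is a toy model in Python. It is only a sketch and not Emacs code:
the function name, the abstract time units and the process_cost
parameter are all made up for the illustration.

```python
from collections import deque


def run_command_loop(arrivals, process_cost=1):
    """Toy model of a command loop that redisplays only when idle.

    arrivals: list of (time, char) pairs, sorted by time, giving the
        moment each character reaches the input queue.
    process_cost: hypothetical time units spent processing one character.
    Returns the buffer contents as seen by each redisplay.
    """
    pending = deque(arrivals)   # characters not yet "typed"
    queue = deque()             # characters waiting in the input queue
    buffer = []
    seen_by_redisplay = []
    now = 0

    def collect_arrived():
        # Move every character that has arrived by `now` into the queue.
        while pending and pending[0][0] <= now:
            queue.append(pending.popleft()[1])

    while pending or queue:
        collect_arrived()
        if not queue:
            # Nothing to do: jump ahead to the next arrival.
            now = pending[0][0]
            continue
        # Process one character (insertion, change hooks, and so on).
        buffer.append(queue.popleft())
        now += process_cost
        collect_arrived()
        if not queue:
            # About to wait for input and none is available: redisplay.
            seen_by_redisplay.append("".join(buffer))
    return seen_by_redisplay


# Slow typing: the second "-" arrives long after the first is processed,
# so redisplay (and hence fontification) sees the intermediate "-".
print(run_command_loop([(0, "-"), (5, "-")]))

# Keyboard-macro-like injection: both characters are queued at once,
# so redisplay only ever sees the complete "--".
print(run_command_loop([(0, "-"), (0, "-")]))
```

In this model, slow typing yields two redisplays, the first of which
sees the lone "-", while characters injected together yield a single
redisplay of "--". That matches the difference between typing and
execute-kbd-macro described in the message.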