From: Stephen Leake
Newsgroups: gmane.emacs.devel
Subject: Re: How to add pseudo vector types
Date: Wed, 21 Jul 2021 08:49:15 -0700
Message-ID: <86wnpj4qh0.fsf@stephe-leake.org>
References: <83h7gw6pyj.fsf@gnu.org> <45EBF16A-C953-42C7-97D1-3A2BFEF7DD01@gmail.com> <83y2a764oy.fsf@gnu.org> <83v95b60fn.fsf@gnu.org> <00DD5BFE-D14E-449A-9319-E7B725DEBFB3@gmail.com> <83r1fz5xr9.fsf@gnu.org> <86mtqh54wo.fsf@stephe-leake.org> <83czrd53z2.fsf@gnu.org>
To: Eli Zaretskii
Cc: casouri@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org

Eli Zaretskii writes:

>> From: Stephen Leake
>> Cc: Yuan Fu , monnier@iro.umontreal.ca,
>>  emacs-devel@gnu.org
>> Date: Tue, 20 Jul 2021 09:25:11 -0700
>>
>> > In addition, Emacs records (for redisplay purposes) two places in each
>> > buffer related to changes: the minimum buffer position before which no
>> > changes were done since last redisplay, and the maximum buffer
>> > position beyond which there were no changes.  This can also be used to
>> > pass only a small part of the buffer to the parser, because the rest
>> > didn't change.
>>
>> Again, the input to tree-sitter is a list of changes, not a block of
>> text containing changes.
>
> I fail to see the significance of the difference.  Surely, you could
> hand it a block of text with changes to mean that this block replaces
> the previous version of that block.  It might take the parser more
> work to update the parse tree in this case, but if it's fast enough,
> that won't be the problem.  Right?

tree-sitter doesn't store the previous text, so there's nothing to
compare it to.
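
To make "a list of changes" concrete, here is a rough sketch (in Python,
for brevity) of turning the three arguments an Emacs after-change
function receives (BEG, END, and the length of the text before the
change) into an edit record shaped like tree-sitter's TSInputEdit.
The names InputEdit and edit_from_hook are hypothetical; the sketch
assumes ASCII-only text so that character and byte offsets coincide,
and it omits the row/column "point" fields that TSInputEdit also has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputEdit:
    """Hypothetical mirror of the byte fields of tree-sitter's TSInputEdit."""
    start_byte: int
    old_end_byte: int
    new_end_byte: int


def edit_from_hook(beg: int, end: int, old_len: int) -> InputEdit:
    """Build an edit from after-change style arguments.

    beg, end: 1-based buffer positions bounding the new text.
    old_len:  length of the text that was there before the change.
    Assumes ASCII-only text (one byte per character).
    """
    start = beg - 1                      # convert to a 0-based offset
    return InputEdit(
        start_byte=start,
        old_end_byte=start + old_len,    # where the replaced text used to end
        new_end_byte=start + (end - beg) # where the new text ends now
    )
```

Note that the record holds only positions; no old or new text is
needed, which is the point of the distinction above.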

Alternately, this would require the parser to store the previous text
so it can compute the diff; that could be added in a wrapper around
tree-sitter.

wisi does store the previous text, so it could compute the diff. But
because of memory pressure, we want a design that does not require a
copy of the buffer text; when wisi is turned into an Emacs module, it
will not store a copy of the text.

>> If the parser is in an Emacs module, so it has direct access to the
>> buffer, then the hooks only need to record the buffer positions of the
>> insertions and deletions, not the new text. That should be very fast.
>
> (You are talking about the undo-list.)

Almost; the undo-list can get reset before the parser needs it. And
sometimes it is disabled.

But it might make sense to try to use that instead of maintaining a
separate list of changes. It might make sense to delete the matching
change from the parser change list when undo is invoked, rather than
adding another change.

> But even this is wasteful: it is quite customary to delete, then
> re-insert, then re-delete again, etc. several times.  So collecting
> these operations will produce much more "changes" than strictly
> needed.

Yes. The wisi parser Ada code includes a step that combines all the
changes (in arbitrary buffer-pos order) into a minimal list of changes
in buffer-pos order; that simplifies applying multiple changes to the
parse tree.

We could move that to elisp, if that would help (it's in Ada because I
much prefer debugging Ada to debugging elisp). That could be done in
the buffer-change hook; if the current change can be combined with the
previous one, do that instead of adding a new one.

> That's why I'm trying to find a simpler, less wasteful strategies.
> Since TS is very fast, we can trade some of the speed for simpler,
> more scalable design of tracking changes.

I don't see how optimizing the change list makes it more "scalable";
the worst case is that the optimal list is the complete list of actions
the user takes, and that will happen often enough to be an important
case.

In practice, font-lock is triggered on every character typed by the
user (Emacs is faster than people can type), so there will typically be
only one change; nothing to optimize.

In the case where some elisp is changing the buffer in several places
(i.e. indent-region, or some other re-format), optimizing the change
list might make sense, if the elisp code is not already optimized for
that.

--
-- Stephe
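
For illustration, combining the current change with the previous one
in the change hook could look roughly like the sketch below. It is
written in Python rather than elisp or Ada, and it is not the actual
wisi algorithm: it only merges a new change into the most recent one
when the two touch or overlap, and it does not reorder the list. The
names Change and record_change are hypothetical. Each record stores
positions only (BEG, END, and the length of the replaced text, as
passed to an after-change function), never the text itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Change:
    beg: int      # start of the changed region
    end: int      # end of the new text, in coordinates after the change
    old_len: int  # length of the text that the new text replaced


def record_change(changes: List[Change], beg: int, end: int, old_len: int) -> None:
    """Append a change, or fold it into the previous one if they touch."""
    new = Change(beg, end, old_len)
    if changes:
        prev = changes[-1]
        replaced_end = new.beg + new.old_len  # region the new change replaced
        if new.beg <= prev.end and replaced_end >= prev.beg:
            start = min(prev.beg, new.beg)
            union_end = max(prev.end, replaced_end)
            delta_prev = (prev.end - prev.beg) - prev.old_len
            delta_new = (new.end - new.beg) - new.old_len
            merged = Change(
                beg=start,
                end=union_end + delta_new,
                old_len=(union_end - start) - delta_prev,
            )
            if merged.end == merged.beg and merged.old_len == 0:
                changes.pop()          # the two changes cancelled out
            else:
                changes[-1] = merged
            return
    changes.append(new)


changes: List[Change] = []
record_change(changes, 5, 6, 0)    # type one character at 5
record_change(changes, 6, 7, 0)    # type another right after it
record_change(changes, 20, 21, 0)  # unrelated insertion elsewhere
print(changes)
```

With the three calls above, the first two insertions collapse into a
single record and the third stays separate. Because no text is stored,
a delete followed by re-inserting the same characters is recorded as a
replacement rather than recognized as a no-op.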