From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Tue, 31 Mar 2020 15:35:41 -0400
References: <83o8sf3r7i.fsf@gnu.org> <2E218879-0F24-4A20-B210-263C8D0BEEA4@gmail.com> <838sjh2red.fsf@gnu.org> <83369o3bvb.fsf@gnu.org> <816186eb-baac-f5c7-04df-a3f30780d91d@yandex.ru> <83k1301qq4.fsf@gnu.org> <834ku41km9.fsf@gnu.org>
In-Reply-To: <834ku41km9.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 31 Mar 2020 20:48:14 +0300")
To: Eli Zaretskii
Cc: casouri@gmail.com, akrl@sdf.org, emacs-devel@gnu.org, dgutov@yandex.ru

>> > It should be obvious that sending a buffer as a single string is less
>> > efficient than letting tree-sitter access buffer text directly. We
>> > just need an appropriate API for that (maybe there is one already, I
>> > didn't take a look at their sources since January).
>> My benchmark say that `buffer-string` takes about 1/3 the time of
>> `parse-partial-sexp`, so letting tree-sitter access our buffer text
>> directly is unlikely to give more than a 30% speed up.
> Sure, but we never call parse-partial-sexp on the entire buffer, do we?

Not sure how that's relevant. I only used `parse-partial-sexp` as
a lower bound on the time tree-sitter is likely to take to do its
own parsing.

>> It doesn't mean it wouldn't be a desirable optimization, but it does
>> mean that it likely won't make a large difference as to whether it's
>> "fast enough".
> I disagree.

Your disagreement doesn't seem to be with what I said: I didn't argue
about the elegance or efficiency, only about the fact that the
performance impact is likely to be small enough that it's not going to
affect the viability of the approach.

> Communicating with a C library by making a string out of buffer text
> is extremely inelegant and inefficient. We shouldn't do that except
> when the strings are very short.

FWIW, elegant/efficient or not, that's the standard way to do it, AFAICT.
E.g. that's what we do in `secure-hash`, that's what we do when parsing
JSON, ... You basically always need to en/decode the content (even if
it is into utf-8, we still need to handle the potential raw-bytes), so
a copy is hard to avoid.

Note that for regexp-matching the problem is slightly different because
we don't know beforehand which part of the buffer will be consulted, so
doing a "copy and then regmatch" would be too inefficient (we'd always
need to copy everything til point-max).


        Stefan
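
Editorial sketch (not part of the original message): the comparison
between `buffer-string` and `parse-partial-sexp` discussed above could
be reproduced roughly as follows. The helper name
`my/copy-vs-parse-ratio` and the default repetition count are made up
for illustration, and the numbers will vary with the buffer contents,
its syntax table, and the Emacs build, so treat this as one possible
way to measure, not as the benchmark actually used in the thread.

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)

;; Hypothetical helper, named here only for illustration.
(defun my/copy-vs-parse-ratio (&optional repetitions)
  "Return (COPY-TIME PARSE-TIME RATIO) for the current buffer.
COPY-TIME and PARSE-TIME are elapsed seconds over REPETITIONS runs
\(default 10) of `buffer-string' and of a full `parse-partial-sexp'
scan.  RATIO is COPY-TIME / PARSE-TIME, or nil if PARSE-TIME is zero."
  (let* ((n (or repetitions 10))
         ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME).
         (copy (car (benchmark-run n (buffer-string))))
         (parse (car (benchmark-run n
                       ;; `parse-partial-sexp' moves point, so keep
                       ;; the caller's position intact.
                       (save-excursion
                         (parse-partial-sexp (point-min) (point-max)))))))
    (list copy parse (and (> parse 0) (/ copy parse)))))
```

Evaluating `(my/copy-vs-parse-ratio 100)` with `M-:` in a large source
buffer gives the two timings and their ratio. If that ratio is r, and
the external parser takes at least as long as `parse-partial-sexp`,
then the copy accounts for at most r/(1+r) of the combined time.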