From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Wed, 01 Apr 2020 22:46:18 -0400
References: <83369o1khx.fsf@gnu.org> <83imijz68s.fsf@gnu.org> <831rp7ypam.fsf@gnu.org> <86wo6yhj4d.fsf@stephe-leake.org>
In-Reply-To: <86wo6yhj4d.fsf@stephe-leake.org> (Stephen Leake's message of "Wed, 01 Apr 2020 15:38:26 -0800")
To: Stephen Leake
Cc: emacs-devel
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
List-Id: "Emacs development discussions."

> In C or C++ body files, "a complete parse" is typically one variable or
> function declaration. So if Emacs can reliably find the beginning and
> end of those declarations,

IIUC, a large part of CC-mode's trouble is exactly the need to find
somewhat reliably a position vaguely like "the beginning of
a declaration".
It's very much a non-trivial problem (and in the general case, to
properly handle all possible comments, you need to start parsing from
point-min).

>> And yes, doing this by consing strings is not a good idea, it will
>> slow things down and cause a lot of GC. It is best avoided. Thus my
>> questions above.
> I'm not sure how "convert syntax tree to elisp" compares to "consing
> strings". I would certainly expect it to cause a lot of GC.

If the GC is the worry, we can use a function which encodes the buffer
using a given coding-system and returns a malloc'd array of bytes.

>>> It's stored in a buffer-local variable. I haven't measured the memory
>>> they take. Memory is released when the tree object is garbage-collected
>>> (it's a `user-ptr').
> Is it an elisp structure (or accessible from elisp)? Have you written
> code that traverses it to provide faces and indentation?

According to https://github.com/tree-sitter/tree-sitter/issues/222 the
parse tree takes around 10 times the size of the source text. At least
that's for tree-sitter's own parse-tree; not sure how that relates to
emacs-tree-sitter's yet.


        Stefan
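
As a rough illustration of the point-min remark above: whether a
position is inside a comment depends on everything before it, because a
comment opener may itself sit inside a string literal. The sketch below
is a deliberately simplified C scanner, not CC-mode's or Emacs's actual
code; the names state_at, in_comment_at and check_example are
hypothetical, and trigraphs, raw strings and line continuations are
ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Lexical states of a deliberately simplified C scanner: no trigraphs,
   raw strings, or backslash-newline handling.  Illustration only. */
enum lex_state { CODE, LINE_COMMENT, BLOCK_COMMENT, STRING, CHAR_LIT };

/* Return the lexical state after consuming bytes [0, POS) of TEXT.
   Hypothetical helper, not part of Emacs or CC-mode. */
enum lex_state
state_at (const char *text, size_t pos)
{
  enum lex_state st = CODE;
  size_t len = strlen (text);
  if (pos > len)
    pos = len;

  for (size_t i = 0; i < pos; i++)
    {
      char c = text[i];
      /* A two-byte delimiter only counts if both bytes lie before POS. */
      char next = (i + 1 < pos) ? text[i + 1] : '\0';

      switch (st)
        {
        case CODE:
          if (c == '/' && next == '*')
            { st = BLOCK_COMMENT; i++; }
          else if (c == '/' && next == '/')
            { st = LINE_COMMENT; i++; }
          else if (c == '"')
            st = STRING;
          else if (c == '\'')
            st = CHAR_LIT;
          break;
        case LINE_COMMENT:
          if (c == '\n')
            st = CODE;
          break;
        case BLOCK_COMMENT:
          if (c == '*' && next == '/')
            { st = CODE; i++; }
          break;
        case STRING:
          if (c == '\\')
            i++;                /* skip the escaped byte */
          else if (c == '"')
            st = CODE;
          break;
        case CHAR_LIT:
          if (c == '\\')
            i++;                /* skip the escaped byte */
          else if (c == '\'')
            st = CODE;
          break;
        }
    }
  return st;
}

/* True if offset POS of TEXT lies inside a line or block comment. */
bool
in_comment_at (const char *text, size_t pos)
{
  enum lex_state st = state_at (text, pos);
  return st == LINE_COMMENT || st == BLOCK_COMMENT;
}

/* Usage: the comment opener inside the string literal must not count. */
void
check_example (void)
{
  const char *src = "char *s = \"/*\"; int x; /* real */ int y;";
  assert (!in_comment_at (src, 20));  /* at `x`: plain code */
  assert (in_comment_at (src, 27));   /* inside "real" */
  assert (!in_comment_at (src, 35));  /* after the comment closes */
}
```

A backward search from offset 20 would meet an unclosed comment opener
first and get the wrong answer; only the scan from the start knows that
opener was inside a string. The cost is linear in the distance from the
start of the text, which is why this is expensive without some form of
caching.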