From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: How to add pseudo vector types
Date: Thu, 22 Jul 2021 22:05:29 +0300
Message-ID: <83eebq2mpy.fsf@gnu.org>
References: <83h7gw6pyj.fsf@gnu.org>
 <45EBF16A-C953-42C7-97D1-3A2BFEF7DD01@gmail.com>
 <83y2a764oy.fsf@gnu.org>
 <83v95b60fn.fsf@gnu.org>
 <00DD5BFE-D14E-449A-9319-E7B725DEBFB3@gmail.com>
 <83r1fz5xr9.fsf@gnu.org>
 <1AAB1BCC-362B-4249-B785-4E0530E15C60@gmail.com>
 <83czri67h0.fsf@gnu.org>
 <46BBFF88-76C3-4818-8805-5437409BEA93@gmail.com>
 <83wnpq46uk.fsf@gnu.org>
 <533BD53B-4E85-4E9E-B46A-346A5BBAD0F5@gmail.com>
 <258CB68D-1CC1-42C8-BDCD-2A8A8099B783@gmail.com>
 <1a776770-50b7-93cd-6591-c9a5b3a56eb8@gmail.com>
 <8335s64v10.fsf@gnu.org>
 <5380C92B-6C15-4490-A1E0-1C3132DBB16A@gmail.com>
 <83k0li2shw.fsf@gnu.org>
To: Yuan Fu
Cc: cpitclaudel@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org
In-Reply-To: (message from Yuan Fu on Thu, 22 Jul 2021 13:47:20 -0400)

> From: Yuan Fu
> Date: Thu, 22 Jul 2021 13:47:20 -0400
> Cc: Stefan Monnier,
>  Clément Pit-Claudel,
>  emacs-devel@gnu.org
>
> > More generally: is the problem real? If you make a file that is 1000
> > copies of xdisp.c, and then submit it to TS, do you really get 10GB of
> > memory consumption? This is something that is good to know up front,
> > so we'd know what to expect down the road.
>
> Yes. I concatenated 100 xdisp.c together, and parsed them with my simple C program. It used 1.8 G. I didn’t test for 1000 together, but I think the trend is linear.

That's good to know, thanks. So what does TS do if it attempts to
allocate more memory and that fails?

Regardless, we'd need some fallback strategy, because AFAIU many
people run with VM overcommit enabled, so the OOM killer will just
kill the Emacs process when it asks for too much memory.

> >>>> +DEFUN ("tree-sitter-node-type",
> >>>> +       Ftree_sitter_node_type, Stree_sitter_node_type, 1, 1, 0,
> >>>> +       doc: /* Return the NODE's type as a symbol. */)
> >>>> +  (Lisp_Object node)
> >>>> +{
> >>>> +  CHECK_TS_NODE (node);
> >>>> +  TSNode ts_node = XTS_NODE (node)->node;
> >>>> +  const char *type = ts_node_type(ts_node);
> >>>> +  return intern_c_string (type);
> >>>
> >>> Why do we need to intern the string each time? can't we store the
> >>> interned symbol there, instead of a C string, in the first place?
> >>
> >> I’m not sure what do you mean by “store the interned symbol there”, where do I store the interned symbol?
> >
> > In the struct that ts_node_type accesses, instead of the 'char *'
> > string you store there now.
>
> The struct that ts_node_type accesses is a TSNode, which is defined by tree-sitter. ts_node_type is an API provided by tree-sitter, I’m just exposing it to lisp. I could return strings instead of symbols, but I thought symbols might be more appropriate and more convenient for users of this function.

Maybe there's a better way of exposing that to Lisp. But that's a
minor point, it can be left for later.

> Is below the correct way to set a buffer-local variable? (I’m setting tree-sitter-parser-list.)
>
>   struct buffer *old_buffer = current_buffer;
>   set_buffer_internal (XBUFFER (buffer));
>
>   Fset (Qtree_sitter_parser_list,
>         Fcons (lisp_parser, Fsymbol_value (Qtree_sitter_parser_list)));
>
>   set_buffer_internal (old_buffer);

Yes, but it would be better to use DEFVAR_LISP and then you could
assign directly to Vtree_sitter_parser_list, instead of using Fset.

> Also, we don’t call change hooks in replace_range_2, why?

Because it is called in a loop, one character at a time. The caller
of replace_range_2 calls these hooks for the entire region, once.

> Should I update tree-sitter trees in that function, or should I not?

The only caller is casify_region, so you could update there.
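
The last two answers describe a general pattern: a low-level primitive
that is called once per character stays silent, and its caller
notifies observers once for the whole region. Below is a minimal
sketch of that pattern in plain, self-contained C. It is not Emacs
code: run_before_change, run_after_change, update_parse_tree,
replace_one_char and upcase_region are hypothetical names invented
for illustration, standing in roughly for the change hooks, a
tree-sitter update, replace_range_2 and casify_region.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins for the change hooks and for a parse-tree
   update.  They only count how often they are called.  */
int before_hook_calls;
int after_hook_calls;
int tree_updates;

void
run_before_change (size_t start, size_t end)
{
  (void) start;
  (void) end;
  before_hook_calls++;
}

void
run_after_change (size_t start, size_t end)
{
  (void) start;
  (void) end;
  after_hook_calls++;
}

void
update_parse_tree (size_t start, size_t end)
{
  (void) start;
  (void) end;
  tree_updates++;
}

/* Low-level primitive, in the role of replace_range_2: it changes a
   single character and deliberately notifies nobody.  */
void
replace_one_char (char *text, size_t pos, char c)
{
  text[pos] = c;
}

/* Caller, in the role of casify_region: it loops over the region one
   character at a time, but runs the hooks and the parse-tree update
   only once, for the half-open range [START, END).  */
void
upcase_region (char *text, size_t start, size_t end)
{
  assert (start <= end);
  run_before_change (start, end);
  for (size_t i = start; i < end; i++)
    replace_one_char (text, i, (char) toupper ((unsigned char) text[i]));
  run_after_change (start, end);
  update_parse_tree (start, end);
}
```

Calling upcase_region on the first five characters of "hello world"
changes five characters but bumps each counter only once. Under the
assumptions above, that is the reason for doing the tree-sitter
update in the caller rather than in the per-character primitive.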