From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: cc-mode fontification feels random
Date: Fri, 04 Jun 2021 22:51:01 +0300
Message-ID: <83fsxxl87u.fsf@gnu.org>
In-Reply-To: <179d8841388.2816.cc5b3318d7e9908e2c46732289705cb0@dancol.org> (message from Daniel Colascione on Fri, 04 Jun 2021 12:33:25 -0700)
To: Daniel Colascione
Cc: emacs-devel@gnu.org, monnier@iro.umontreal.ca, joaotavora@gmail.com

> From: Daniel Colascione
> Date: Fri, 04 Jun 2021 12:33:25 -0700
>
> > What do you think tree-sitter does with the fast copy you hand to it?
> > doesn't it walk it one character at a time?
> >
> > And if you studied the tree-sitter's internals, and it uses
> > get_buffer_char as a means of copying text into its own buffer, then
> > perhaps we could ask tree-sitter developers to avoid the copy and use
> > the text directly.
>
> Teaching TS to use a generic cursor interface would be great.

I don't remember if I looked at how it does it now, but are you sure
it doesn't already know how to do that?  Sounds like a natural thing
to me, but maybe I'm missing something.

> > buffer-substring is not just a copy of a chunk of text, it's much
> > more.
>
> The variant without text properties doesn't do much.

It allocates memory!  For a large buffer (think xdisp.c) that is best
avoided.  I hope if we need to memcpy, we could at least use a pointer
to a buffer allocated by the parser library, so we won't need to
allocate.

> > Even if eventually we need to use a memory copy, that'll run
> > circles around buffer-substring, and will avoid triggering GC.
>
> Sure. I'm not opposed to adding an API that's basically a more efficient
> buffer substring for C callers. I'm just pointing out that the idea of
> giving TS "direct access" to a buffer without any copy at all doesn't make
> a lot of sense.

If it can use that wisely, I don't see why it wouldn't make sense.  If
it cannot, then I agree.  But still, I'd rather not give up from the
get-go and use buffer-substring just because it's there; I'd try
looking for something more scalable and less Lisp-consing.

Also, I hope we could arrange the copying to be driven by the display
engine through the JIT font-lock machinery, rather than sending the
entire buffer or its large parts.

> >> Because any kind of "access" to the buffer that doesn't expose the gap is
> >> going to be a copy anyway.
> >
> > The regexp routines aren't.
>
> The regexp routines have Emacs specific knowledge.

I mean the way regexp routines use the buffer text as a C string (as 2
C strings, actually).  That doesn't use any Emacs-specific knowledge
except the gap, and even the latter is largely solved by the caller.
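
To illustrate the "2 C strings" point, here is a minimal sketch of a
gap buffer that hands out its text as at most two contiguous chunks,
without copying anything.  It is not Emacs code: struct gap_buffer,
gb_length, gb_read and count_char are all hypothetical names made up
for this example, and the layout is much simpler than a real Emacs
buffer (no multibyte handling, no markers).  As far as I know,
tree-sitter's TSInput read callback has a similar shape (it returns a
pointer to a chunk plus a byte count), so a reader like gb_read could
in principle sit behind it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified gap buffer: the text lives in
   data[0 .. gap_start) and data[gap_end .. size); the bytes in
   between are the gap and are not part of the text.  */
struct gap_buffer
{
  const char *data;
  size_t gap_start;
  size_t gap_end;
  size_t size;
};

/* Length of the text, i.e. the storage size minus the gap.  */
static size_t
gb_length (const struct gap_buffer *b)
{
  assert (b->gap_start <= b->gap_end && b->gap_end <= b->size);
  return b->size - (b->gap_end - b->gap_start);
}

/* Return a pointer to the contiguous run of text that starts at
   logical byte offset POS and store its length in *LEN.  Nothing is
   copied or allocated.  At or past the end of the text, return NULL
   and set *LEN to 0.  */
static const char *
gb_read (const struct gap_buffer *b, size_t pos, size_t *len)
{
  size_t total = gb_length (b);

  if (pos < b->gap_start)
    {
      /* Before the gap: the chunk runs up to the gap.  */
      *len = b->gap_start - pos;
      return b->data + pos;
    }
  if (pos < total)
    {
      /* After the gap: skip over it, the chunk runs to the end.  */
      *len = total - pos;
      return b->data + pos + (b->gap_end - b->gap_start);
    }
  *len = 0;
  return NULL;
}

/* Stand-in for a consumer such as a lexer: walk the text chunk by
   chunk and count the occurrences of CH.  */
static size_t
count_char (const struct gap_buffer *b, char ch)
{
  size_t pos = 0, n = 0, len;
  const char *p;

  while ((p = gb_read (b, pos, &len)) != NULL)
    {
      for (size_t i = 0; i < len; i++)
        if (p[i] == ch)
          n++;
      pos += len;
    }
  return n;
}
```

The consumer only ever sees pointer-plus-length chunks, so the one
thing it has to tolerate is that the text arrives in two pieces.  The
pointers are of course valid only as long as the buffer is not
modified and the gap is not moved, which is the part the caller has
to guarantee.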