From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Update on tree-sitter structure navigation
Date: Thu, 7 Sep 2023 18:06:49 -0700
Message-ID: <8A2B8A2E-FC24-401B-ACF8-688F2B157FB6@gmail.com>
References: <5E7F2A94-4377-45C0-8541-7F59F3B54BA1@gmail.com> <87h6odhxs6.fsf@localhost> <87msxzsee1.fsf@localhost>
To: Ihor Radchenko
Cc: emacs-devel, Danny Freeman, Theodor Thornhill, Jostein Kjønigsen, Randy Taylor, Wilhelm Kirschbaum, Perry Smith, Dmitry Gutov
In-Reply-To: <87msxzsee1.fsf@localhost>

> On Sep 6, 2023, at 4:57 AM, Ihor Radchenko wrote:
>
> Yuan Fu writes:
>
>> I think that both NODE types and attributes can be standardized.
>>
>> If we come up with a thing-at-point interface that provides more
>> information than the current (BEG . END), tree-sitter surely can
>> support it as a backend. Just need SomeOne to come up with it :-)
>> But I don’t see how this interface can support semantic information
>> like arglist of a defun, or type of a declaration; these things are
>> not universal to all “nodes”.
>
> For example, consider something like
>
> (thing-slot 'arglist (thing-at-point 'defun)) ; => (ARGLIST_BEG . ARGLIST_END)
> (thing-slot 'arglist (thing-at-point 'variable)) ; => nil
>

Yeah, that makes sense.
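To make the idea above concrete, here is a rough sketch of what such a slot lookup might look like. Everything prefixed with `my-` is hypothetical and not an existing Emacs API. Unlike the proposal, this sketch takes the THING symbol rather than the result of `thing-at-point`, and it only knows how to find the arglist of an Emacs Lisp defun by plain sexp motion. A tree-sitter backend could presumably register its own function in the same table.

```elisp
(require 'thingatpt)

;; Hypothetical registry, not part of Emacs: THING -> ((SLOT . FUNCTION) ...).
;; Each FUNCTION receives the (BEG . END) bounds of the thing and
;; returns (SLOT-BEG . SLOT-END) or nil.
(defvar my-thing-slot-functions
  '((defun . ((arglist . my-thing--elisp-defun-arglist))))
  "Alist mapping a thing symbol to an alist of slot functions.")

(defun my-thing--elisp-defun-arglist (bounds)
  "Return (BEG . END) of the arglist of the Emacs Lisp defun at BOUNDS.
Return nil if no parenthesized arglist follows the function name."
  (save-excursion
    (ignore-errors
      (goto-char (car bounds))
      (down-list)                      ; step inside \"(defun ...\"
      (forward-sexp 2)                 ; skip `defun' and the name
      (skip-chars-forward " \t\n")
      (when (eq (char-after) ?\()
        (cons (point) (scan-sexps (point) 1))))))

(defun my-thing-slot (slot thing)
  "Return the bounds of SLOT within THING at point, or nil.
Things that have no function registered for SLOT simply yield nil."
  (let* ((bounds (bounds-of-thing-at-point thing))
         (fn (alist-get slot (alist-get thing my-thing-slot-functions))))
    (and bounds fn (funcall fn bounds))))
```

With point inside a defun in an `emacs-lisp-mode` buffer, `(my-thing-slot 'arglist 'defun)` returns the bounds of the argument list, while a thing with no registered `arglist` slot, such as `symbol`, returns nil. That is the "not universal to all nodes" case handled by simply answering nil.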
>>>> - I can’t think of a good way to integrate tree-sitter queries with
>>>> the navigation functions we have right now. Most importantly,
>>>> tree-sitter queries always search top-down, and you can’t limit the
>>>> depth they search. OTOH, our navigation functions work by traversing
>>>> the tree node-to-node.
>>>
>>> May you elaborate about the difficulties you encountered?
>>
>> Ideally I’d like to pass a query and a node to treesit-node-match-p,
>> which returns t if the query matches the node. But queries don’t work
>> like that. They search the node and return all the matches within
>> that node, which could be potentially wasteful.
>
> Isn't ts_query_cursor_next_match only searching a single match?

Seems so, that’s good. But there’s no guarantee that the first match will be the top node, even though implementation-wise I think that’s probably the case. Maybe we can ask the tree-sitter developers to add such a promise.

>>>> - Isolated ranges. For many embedded languages, each block should
>>>> be independent from another, but currently all the embedded blocks
>>>> are connected together and parsed by a single parser. We probably
>>>> need to spawn a parser for each block. I’ll probably work on this
>>>> one next.
>>>
>>> Do you mean that a single parser sees subsequent block as a continuation
>>> of the previous?
>>
>> Exactly.
>
> Then, I can see cases when we do and also when we do _not_ want separate
> parsers for different blocks. For example, literate programming often
> uses other language blocks that are intended to be continuous.

Surprise, I added support for local parsers. Major mode authors can choose between global and local parsers.

Yuan
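For reference, the "does this query match this node" check discussed above can be approximated today roughly as follows. This is only a sketch of the wasteful workaround, not a proposed implementation: `my-treesit-node-matches-query-p` is a hypothetical name, and it assumes Emacs 29 or later built with tree-sitter. It runs the query over the whole node, collects every capture inside it, and only then checks whether one of them is the node itself.

```elisp
(require 'seq)

(defun my-treesit-node-matches-query-p (node query capture)
  "Return non-nil if QUERY captures NODE itself under the name CAPTURE.
QUERY is a tree-sitter query (sexp, string, or compiled) and CAPTURE
is a symbol such as `top' for a capture written as @top.
This collects all captures within NODE first, so the cost grows with
the size of the subtree even though only NODE itself is of interest."
  (let ((captures (treesit-query-capture
                   node query
                   (treesit-node-start node)
                   (treesit-node-end node))))
    (seq-some (lambda (cap)
                (and (eq (car cap) capture)
                     (treesit-node-eq (cdr cap) node)))
              captures)))
```

A call would look something like `(my-treesit-node-matches-query-p node '((function_definition) @top) 'top)`, where `function_definition` is just an example node type from the C grammar. If tree-sitter guaranteed that the first match is the top node, the function could stop after a single match instead of collecting all of them.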