From: Yoav Marco
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter api
Date: Fri, 24 Dec 2021 12:04:13 +0200
Message-ID: <87a6gq5mxl.fsf@gmail.com>
In-reply-to: <9C5A86D6-0E7D-4DDF-B211-278EF9AC7E01@gmail.com>
To: casouri@gmail.com
Cc: cpitclaudel@gmail.com, theo@thornhill.no, ubolonton@gmail.com, emacs-devel@gnu.org, p.stephani2@gmail.com, monnier@iro.umontreal.ca, eliz@gnu.org, stephen_leake@stephe-leake.org, john@yates-sheets.org
User-Agent: mu4e 1.6.3; emacs 28.0.60
Xref: news.gmane.io gmane.emacs.devel:283137

Hi,

Yuan and I had a discussion on GitHub
https://github.com/casouri/emacs/issues/5 and he suggested we move here.
I'm quoting our comments for convenience.

Yoav Marco writes:

> Hi! My question is about the lines:
> https://github.com/casouri/emacs/blob/a4f90c5f95476914fb8789c67652af1025644af8/src/tree-sitter.c#L1375-L1380
>
>     /* TODO: We could cache the query object, so that repeatedly
>        querying with the same query can reuse the query object. It also
>        saves us from expanding the sexp query into a string. I don't
>        know how much time that could save though. */
>     TSQuery *ts_query = ts_query_new (lang, source, strlen (source),
>                                       &error_offset, &error_type);
>
> Regarding error handling mostly.
>
> In this branch queries are saved as *strings* and compiled in the internals on
> each use. In elisp-tree-sitter, you call `tsc-make-query` and use the object
> it returns for calls to tsc-query-captures, which is the analog of
> tree-sitter-query-capture.
>
> What happens if your query is deformed, or simply has a typo in a node name?
> We call `tree-sitter-query-capture` on each keystroke in
> `tree-sitter-font-lock-fontify-region`.
> With the compilation occurring ahead-of-time it would fail once, but here
> wouldn't it barrage you with errors?
>
> Especially with patterns that aren't set in stone and can be modified like
> font-lock keywords, I think compiling the query when the pattern is added is
> better than on each execution.
>
> One nice thing though about compiling queries only when queried is that you
> can call `ts_query_delete` straight away. With users compiling queries it
> would need to be up to garbage collection, I think.

Yuan Fu writes:

>> What happens if your query is deformed, or simply has a typo in a node name?
>> We call tree-sitter-query-capture on each keystroke in
>> tree-sitter-font-lock-fontify-region. With the compilation occurring
>> ahead-of-time it would fail once, but here wouldn't it barrage you with
>> errors?
>
> Not quite barraging, jit-lock will just silently fail and leave a bunch of
> logs in Messages. I don't think erroring out when calling
> tree-sitter-query-capture is a grave problem, since 1) it doesn't barrage as
> you worried and 2) I don't expect queries in major modes to ship wrong code:
> it's not like a bug that could go undiscovered, if the query has a typo, the
> major mode writer will certainly find out when he/she tries to fontify a
> buffer.
>
> I can see some advantages to compiling the query ahead of time. 1) It would be
> helpful to know there is an error before calling
> tree-sitter-font-lock-fontify-region and seeing an unfontified buffer, not
> knowing what went wrong. I can add a function, say, tree-sitter-compile-query
> that checks a query (as in query pattern) and passes it on if it's correct. 2)
> It could potentially save recompilation of the query. But compiling the query
> most probably takes negligible time.
>
> On the other hand, compiling the query has downsides: I don't know what
> tsc-make-query returns, I assume an internal object? I try to minimize the
> number of new object types I introduce to Emacs, for hygiene.
> So far I've managed to add only a parser object and a node object. If there
> aren't good reasons I'm inclined to not add a query object. So far the
> advantages that I see aren't very convincing.
>
> If you want to continue the discussion, I suggest we continue at emacs-devel,
> that way others who are more knowledgeable than I can join and offer their
> opinion.
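
For reference, the caching idea from the TODO comment quoted above could
look roughly like the sketch below. It is not code from the branch, and
every name in it is hypothetical: `compiled_query` stands in for
`TSQuery`, and `compile_query` stands in for `ts_query_new`. The
stand-in only checks that parentheses balance, so the example builds
without tree-sitter. The cache is a small fixed-size table keyed by the
query string.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for TSQuery (the real type lives in
   tree_sitter/api.h).  It only remembers its source string.  */
typedef struct { char *source; } compiled_query;

/* Number of times the stand-in compiler has run.  */
static int compile_count = 0;

/* Hypothetical stand-in for ts_query_new: returns NULL when the
   parentheses in SOURCE do not balance, a new object otherwise.  */
static compiled_query *
compile_query (const char *source)
{
  int depth = 0;
  compile_count++;
  for (const char *p = source; *p != '\0'; p++)
    {
      if (*p == '(')
        depth++;
      else if (*p == ')' && --depth < 0)
        return NULL;
    }
  if (depth != 0)
    return NULL;

  compiled_query *q = malloc (sizeof *q);
  if (q == NULL)
    return NULL;
  q->source = malloc (strlen (source) + 1);
  if (q->source == NULL)
    {
      free (q);
      return NULL;
    }
  strcpy (q->source, source);
  return q;
}

/* Stand-in for ts_query_delete.  */
static void
free_query (compiled_query *q)
{
  free (q->source);
  free (q);
}

#define CACHE_SIZE 8

static compiled_query *cache[CACHE_SIZE];
static size_t cache_used = 0;
static size_t next_victim = 0;

/* Return a compiled query for SOURCE, compiling it only when no entry
   with the same string is cached.  Failed compilations are not cached.
   The cache owns every object it returns; a full cache evicts entries
   round-robin and frees them.  */
static compiled_query *
cached_query (const char *source)
{
  for (size_t i = 0; i < cache_used; i++)
    if (strcmp (cache[i]->source, source) == 0)
      return cache[i];

  compiled_query *q = compile_query (source);
  if (q == NULL)
    return NULL;

  if (cache_used < CACHE_SIZE)
    cache[cache_used++] = q;
  else
    {
      free_query (cache[next_victim]);
      cache[next_victim] = q;
      next_victim = (next_victim + 1) % CACHE_SIZE;
    }
  return q;
}
```

With this shape a valid pattern is compiled once and reused on later
calls, while a malformed pattern still fails on every call because
failures are not cached. Pointers returned by `cached_query` belong to
the cache and may be freed on eviction, so callers should not hold on
to them across calls.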