From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Yoav Marco Newsgroups: gmane.emacs.devel Subject: Re: Tree-sitter integration on feature/tree-sitter Date: Mon, 09 May 2022 20:50:32 +0300 Message-ID: <87y1zabmbt.fsf@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="37450"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: mu4e 1.6.3; emacs 29.0.50 Cc: emacs-devel@gnu.org To: casouri@gmail.com Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Mon May 09 20:32:11 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1no8B1-0009Z7-Ee for ged-emacs-devel@m.gmane-mx.org; Mon, 09 May 2022 20:32:11 +0200 Original-Received: from localhost ([::1]:36556 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1no8Az-0004F4-Vt for ged-emacs-devel@m.gmane-mx.org; Mon, 09 May 2022 14:32:10 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:54300) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1no7yw-000704-M8 for emacs-devel@gnu.org; Mon, 09 May 2022 14:19:42 -0400 Original-Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:38776) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1no7yt-0007m7-Ue for emacs-devel@gnu.org; Mon, 09 May 2022 14:19:41 -0400 Original-Received: by mail-wr1-x42f.google.com with SMTP id k2so20596040wrd.5 for ; Mon, 09 May 2022 11:19:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=user-agent:from:to:cc:subject:in-reply-to:date:message-id :mime-version:content-transfer-encoding; bh=YLDkXU9vTu4PAGE0uPG7LnLjR4qLHNC3SauBPyQC0VE=; b=U5GPROBStrqCmDBPi8W5b+28p9kqF1j/bMP9UA4xevXAVBdA1hObD81Ph/2M2V8ag3 vghKbpj2rO8QkFhUTqw7a1ibtmnERmqeSy8Q7TiDk0ayyK/+k1QqBxA23Ub9oIlUxadP JDeZR//fmFogqN89VWBeLblAUDiIosrzkTeroZn7T5IfxxXdyZa6oUaqHMLGOtUL1tYR WSKYksEFb/XCtSmiQNQcofTgiRki7AROVTkKgsGYfSm+PGZEF97xORa0hwDWUbv1cIEA 5JCgzeRmQU6FG/11iqFRQcAY7OdsHmDyNnJD4W9GADrKS1Yp7j06YFPxwXfm8DQkp/DO O6jg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:user-agent:from:to:cc:subject:in-reply-to:date :message-id:mime-version:content-transfer-encoding; bh=YLDkXU9vTu4PAGE0uPG7LnLjR4qLHNC3SauBPyQC0VE=; b=gpfeYfrAeSQ8ICw7xbwTgMNaXEbUlkYbi0zP2cWF8zn5hAlWke618JyZ6SAfRkCiM2 xLu6XzhxMBVGtb7/AdmYRoSnIpXQCsPwIISfrkkO3gqCzZPdBSmgush5vQA1nKD61AUf hlEcMVdTp3np5oC62Rv+1MwA3YqKb89fmqGyRAwutdCqtq3J8CnscEf+1nISu19u/NYX s5W4XZvarIuXuVWpZcSCCgbTNbz0ad2wuUiigJ0vuo+T3Fv+dxLykGoI2od1i26GLQHN MTES7AE4tF8TY40lXbhTFRI2pqNOOi5qF+e/oCRWPcB/v1hUPPzxCsB3nM7f3sMZoJi8 bjTQ== X-Gm-Message-State: AOAM532KZxN9S+GHv7vQJ9cCif688qj9rWwE1aSuydj7iAgdW+OXHiPW JvME2od+5HuO6txKed2k0qs= X-Google-Smtp-Source: ABdhPJxSZ+6ZchkMe7fLJ93ItJWgWAWVnuDx1vSvpkrLoYyX2XLRJNTrMFw1e7wQjmOV/JABJ3XACQ== X-Received: by 2002:a5d:4407:0:b0:20c:55c4:540e with SMTP id z7-20020a5d4407000000b0020c55c4540emr15198894wrq.92.1652120377379; Mon, 09 May 2022 11:19:37 -0700 (PDT) Original-Received: from localhost ([77.126.101.171]) by smtp.gmail.com with ESMTPSA id v16-20020a5d6790000000b0020c5253d8f7sm11850660wru.67.2022.05.09.11.19.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 May 2022 11:19:36 -0700 (PDT) In-Reply-To: 9E6D13F6-7E50-44EE-A357-C971A11A3636@gmail.com Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=yoavm448@gmail.com; helo=mail-wr1-x42f.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Mailman-Approved-At: Mon, 09 May 2022 14:30:24 -0400 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:289546 Archived-At: Looking at the code, isn't the while loop in treesit-query-capture O(n=C2=B2)? It essentially amounts to result =3D nil while (next capture group is avaliable) { captures =3D nil for (capture in capture group) { captures =3D cons(capture, captures) } captures =3D nreverse(captures) if (captures pass all predicates in their query) result =3D nconc(result, captures) // <----- THE OFFENDER } A better way to do this would be to call nconc(captures, result) and nreverse it all at the end instead of at the end of the for loop. An even faster way would be to add unconditionally to result and roll it back in case predicates fail. This doesn't use nconc at all: result =3D nil while (next capture group is avaliable) { prev_result =3D result; for (capture in capture group) { result =3D cons(capture, result) } if (captures *fail* at a predicate) result =3D prev_result } result =3D nreverse(result) Context: I'm still working on profiling query compilation, and from what I understand of gprof's output (not much) nconc indeed is very slow here. Seeing nconc in the report is what made me look for nconc usage in treesit-query-capture. index % time self children called name [1] 98.4 1.76 0.13 169+9031282 [1] 1.59 0.00 41385 Fnconc [2] 0.06 0.05 2432616+3714 process_mark_stack [24] 0.08 0.00 147643 re_match_2_internal [25] 0.02 0.00 1707+56214 mark_char_table = [32] 0.00 0.02 18 garbage_collect = [33] ... The thing profiled is calling treesit-font-lock-fontify-region with c queries accidentally on a go file with 8k lines. Still shouldn't take the 2.2 seconds that it did, though. > BTW, I would appreciate for someone to look at the manual and maybe > touch up a bit, as I=E2=80=99m not a native speaker and might write somet= hing > not very idiomatic/fluent. One thing I've noticed - the manual node starts by introducing treesit-available-p, a function that doesn't seem to exist anymore?