From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter introduction documentation
Date: Fri, 30 Dec 2022 16:03:42 -0800
To: Eli Zaretskii
Cc: Philip Kaludercic, monnier@iro.umontreal.ca, dgutov@yandex.ru,
 theophilusx@gmail.com, emacs-devel@gnu.org
In-Reply-To: <83y1qo6h7a.fsf@gnu.org>

> On Dec 30, 2022, at 7:31 AM, Eli Zaretskii wrote:
>
>> From: Yuan Fu
>> Date: Fri, 30 Dec 2022 03:06:37 -0800
>> Cc: Stefan Monnier,
>>  Eli Zaretskii,
>>  Dmitry Gutov,
>>  theophilusx@gmail.com,
>>  emacs-devel@gnu.org
>>
>>> I have asked the question before, but freedom or not, the above is a
>>> nuisance to run for every language.  If the process is as automatic as
>>> the above example demonstrates, shouldn't Emacs have a command to take a
>>> grammar and compile+install it?
>>> I guess this could be more complicated
>>> if the grammar is generated using a custom tool-chain for each language
>>> (or is it always Javascript?), but nothing impossible.
>>
>> Through the magic of programming, such a command now exists:
>> treesit-install-language-grammar. It needs recipes to work, though.
>> The recipe would involve https://github.com, which I guess is
>> probably too heretical to include in Emacs source, so I left the
>> recipes empty. I tested the install command with these recipes:
>>
>> (setq treesit-language-source-alist
>>       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>>         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>>                     "typescript/src" "typescript")))
>
> Thanks.  I did some minor fixes to the doc strings, but this command
> still "needs work"(TM).  See my comments below:
>
>    This command requires Git, a C compiler and (sometimes) a C++ compiler,
>    and the linker to be installed and on PATH.  It also requires that the
>    recipe for LANG exists in `treesit-language-source-alist'.
>
> I don't think treesit-language-source-alist is a good idea, especially
> if we don't intend populating it, at least not as a user-facing
> feature.  Instead, the command should ask the user for the relevant
> values, and offer recording the values on some file that would be read
> next time the user wants to install an updated library.

I consider this a fallback method for installing language grammars.
Distros might not end up bundling language grammars for us, and even if
they do, they can’t cover every grammar, so some users would end up
needing to install some grammars themselves. If we don’t include this
feature, someone will definitely write something like this and make it
a third-party package (indeed, someone already has). So we might as
well have it in Emacs and do it right.
This is the use case that I had in mind when writing this function: some
major mode xxx-mode requires the language grammar for xxx, so it has the
following instructions in its readme:

    Add the installation recipe of tree-sitter-xxx to your config, and
    run treesit-install-language-grammar:

    (add-to-list 'treesit-language-source-alist
                 '(xxx "https://github.com/xxx/tree-sitter-xxx.git"))

>
>    OUT-DIR is the directory to put the compiled library file, it
>    defaults to ~/.emacs.d/tree-sitter.
>
> I don't understand what "defaults" means here, since OUT-DIR is not an
> optional argument of treesit--install-language-grammar-1.

Ah yes, fixed.

>
>  (let* ((lang (symbol-name lang))
>         (default-directory "/tmp")
>
> A literal "/tmp" is not portable and un-Emacsy; please use
> temporary-file-directory instead.
>
>         (soext (pcase system-type
>                  ('darwin "dylib")
>                  ((or 'ms-dos 'cywin 'windows-nt) "dll")
>
> MS-DOS doesn't use DLL files.  Please use dynamic-library-suffixes
> instead, it's already set up correctly.  And the code should be ready
> for that variable having a nil value.

Fixed those, thanks.

>
>      (message "Cloning repository")
>      ;; git clone xxx --depth 1 --quiet workdir
>      (treesit--call-process-signal
>       "git" nil t nil "clone" url "--depth" "1" "--quiet"
>       workdir)
>
> Why "--depth 1"?  This should be a defcustom, and the default should
> be to clone the full repository, IMO.  Also, what about updating the
> library when it is already installed, and the Git repository already
> exists for it?  Or are we going to clone anew each time and then
> remove the repository?  That could make its cloning be slow in some
> cases.

Since the purpose of this command is to install the grammar, why would
we want a full clone? For an “average user”, all they need is the
library. If they want to hack on the grammar, it makes more sense to
install the toolchain and clone the repository themselves.
And yes, this command clones anew each time and removes the repository.

>
>        ;; cp "${grammardir}"/grammar.js "${sourcedir}"
>        (copy-file (concat grammar-dir "/grammar.js")
>                   (concat source-dir "/grammar.js"))
>
> Why is this part needed?  In any case, please don't use concat to
> produce file names, use expand-file-name instead.  Also, we should
> call copy-file with 4th argument non-nil, I think.

To be honest I don’t remember; it is in build.sh, so I copied it
verbatim. I’ll see what it’s for. (But I kept it for now.)

>
>        (treesit--call-process-signal
>         cc nil t nil "-fPIC" "-c" "-I." "parser.c")
>
> I wonder why we don't use 'compile' here.  That would show the
> compiler output without any extra efforts.

I wanted to keep it simple, synchronous, and quiet, and didn’t think
much about it.

>
>      ;; Copy out.
>      (copy-file lib-name (concat out-dir "/") t)
>
> See above: don't use concat here.
>
> This command should also be mentioned in NEWS, where we describe how
> to install the grammar libraries.

I’ll do that if we decide this function is desirable and good.

> Bottom line: I think we need first to discuss how we want such a
> facility to work, and only then implement it.

I agree. I was worried about the feature freeze thing :-)

Yuan
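
The portability points raised in the review above (temporary-file-directory
instead of a literal "/tmp", expand-file-name instead of concat for file
names, dynamic-library-suffixes instead of a pcase on system-type, and a
non-nil 4th argument to copy-file) could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code that
was actually committed: the my/ helper names and the "tree-sitter-workdir"
directory name are hypothetical, and the sketch assumes that entries in
dynamic-library-suffixes carry their leading dot (as in ".so").

```elisp
;; Hypothetical helpers (the my/ prefix is illustrative, not part of Emacs).

(defun my/grammar-library-name (lang)
  "Return the shared-library file name for the symbol LANG.
Signal an error when `dynamic-library-suffixes' is nil, since the
suffix cannot be determined on such a system."
  (let ((soext (car dynamic-library-suffixes)))
    (unless soext
      (error "Cannot determine the dynamic library suffix on this system"))
    ;; Assumption: the suffix already includes the leading dot.
    (concat "libtree-sitter-" (symbol-name lang) soext)))

(defun my/grammar-workdir ()
  "Return a scratch directory name under `temporary-file-directory'.
The directory name itself is a made-up example."
  (expand-file-name "tree-sitter-workdir" temporary-file-directory))

(defun my/copy-into (file dir)
  "Copy FILE into directory DIR, overwriting any existing copy.
The file name is built with `expand-file-name' rather than `concat'."
  (copy-file file
             (expand-file-name (file-name-nondirectory file) dir)
             t    ; OK-IF-ALREADY-EXISTS
             t))  ; KEEP-TIME, the 4th argument
```

Using concat only to append the suffix is deliberate here: it joins parts
of a single file name, whereas directory and file components go through
expand-file-name.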