From mboxrd@z Thu Jan  1 00:00:00 1970
From: Philip Kaludercic
Newsgroups: gmane.emacs.devel
Subject: Re: Tree-sitter introduction documentation
Date: Fri, 30 Dec 2022 11:25:25 +0000
Message-ID: <87v8lt9lqy.fsf@posteo.net>
References: <83edszjslp.fsf@gnu.org> <87tu1vxs3a.fsf@ledu-giraud.fr>
 <831qozjob7.fsf@gnu.org> <87cz8jxoat.fsf@ledu-giraud.fr>
 <83wn6ri7pn.fsf@gnu.org> <5e0a3185-de82-b339-0fa2-956779e63d6f@cornell.edu>
 <868rj6vfep.fsf@gmail.com> <4895891b-e5ea-9c37-f51b-df2e479ee758@yandex.ru>
 <83y1qt11xq.fsf@gnu.org> <9eb013da-d0fc-8e17-c6e3-1e8f913aebfa@yandex.ru>
 <83pmc50xxc.fsf@gnu.org> <71cfe4e8-3bb8-b0a6-9be5-8c0a6d92cfab@yandex.ru>
 <83h6xg29z3.fsf@gnu.org> <87wn6cyey5.fsf@posteo.net>
 <787B1EB4-1925-4679-8747-449DCD685432@gmail.com>
In-Reply-To: <787B1EB4-1925-4679-8747-449DCD685432@gmail.com> (Yuan Fu's
 message of "Fri, 30 Dec 2022 03:06:37 -0800")
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: Stefan Monnier, Eli Zaretskii, Dmitry Gutov, theophilusx@gmail.com,
 emacs-devel@gnu.org
To: Yuan Fu
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:302094

--=-=-=
Content-Type: text/plain

Yuan Fu writes:

>> On Dec 27, 2022, at 8:44 AM, Philip Kaludercic wrote:
>>
>> Stefan Monnier writes:
>>
>>>> It doesn't need any project, it is literally two command lines.
>>>> Here's an example:
>>>>
>>>>  gcc -O2 -I. -c -o parser.o parser.c
>>>>  gcc -shared parser.o scanner.o -ltree-sitter -o libtree-sitter-c-sharp.dll
>>>
>>> AFAIK `parser.c` is a file generated from the actual grammar's source,
>>> itself written in Javascript.
>>>
>>> So the above instructions are akin to downloading a precompiled binary
>>> and installing it.  While it is the most convenient path for the
>>> end-users, it's important w.r.t Freedom to make sure that grammars can
>>> also be regenerated from source by the end users.
>>
>> I have asked the question before, but freedom or not, the above is a
>> nuisance to run for every language.  If the process is as automatic as
>> the above example demonstrates, shouldn't Emacs have a command to take a
>> grammar and compile+install it?  I guess this could be more complicated
>> if the grammar is generated using a custom tool-chain for each language
>> (or is it always Javascript?), but nothing impossible.
>
> Through the magic of programming, such a command now exists: treesit-install-language-grammar. It needs recipes to work, though. The recipe would involve https://github.com, which I guess is probably too heretical to include in Emacs source, so I left the recipes empty.
> I tested the install command with these recipes:
>
> (setq treesit-language-source-alist
>       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>                     "typescript/src" "typescript")))
>
> Yuan

If acceptable, it looks good.  I could imagine that it should be OK if we
point to GitHub, since we are just using it as a Git host.  Here are a
few suggestions

--=-=-=
Content-Type: text/plain
Content-Disposition: inline

diff --git a/lisp/treesit.el b/lisp/treesit.el
index b120ca68c5..651898e948 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -99,6 +99,15 @@ treesit
   :group 'tools
   :version "29.1")
 
+(defcustom treesit-enabled-modes nil
+  "List of modes to enable tree-sitter support if available.
+When initialising a major mode with potential tree-sitter
+support, this variable is consulted.  The special value t will
+enable tree-sitter support whenever possible."
+  :type '(choice (const :tag "Whenever possible" t)
+                 (repeat :tag "Specific modes" function))
+  :version "29.1")
+
 (defcustom treesit-max-buffer-size
   (let ((mb (* 1024 1024)))
     ;; 40MB for 64-bit systems, 15 for 32-bit.
@@ -2690,20 +2699,19 @@ treesit--install-language-grammar-1
 For LANG, URL, SOURCE-DIR, GRAMMAR-DIR, CC, C++, see
 `treesit-language-source-alist'.  If anything goes wrong, this
 function signals an error."
-  (let* ((lang (symbol-name lang))
-         (default-directory "/tmp")
-         (workdir (expand-file-name "treesit-workdir-00893133134"))
+  (let* ((default-directory (make-temp-file "treesit-workdir" t))
+         (workdir (expand-file-name "repo"))
          (source-dir (expand-file-name (or source-dir "src") workdir))
          (grammar-dir (expand-file-name (or grammar-dir "") workdir))
-         (cc (or cc "cc"))
-         (c++ (or c++ "c++"))
+         (cc (or cc (seq-find #'executable-find '("cc" "gcc" "c99"))
+                 (error "No C compiler found")))
+         (c++ (or c++ (seq-find #'executable-find '("c++" "g++"))))
          (soext (pcase system-type
                   ('darwin "dylib")
                   ((or 'ms-dos 'cywin 'windows-nt) "dll")
                   (_ "so")))
          (out-dir (or (and out-dir (expand-file-name out-dir))
-                      (expand-file-name
-                       "tree-sitter" user-emacs-directory)))
+                      (locate-user-emacs-file "tree-sitter")))
          (lib-name (format "libtree-sitter-%s.%s" lang soext)))
     (unwind-protect
         (with-temp-buffer
@@ -2713,8 +2721,8 @@ treesit--install-language-grammar-1
            "git" nil t nil "clone" url "--depth" "1" "--quiet"
            workdir)
           ;; cp "${grammardir}"/grammar.js "${sourcedir}"
-          (copy-file (concat grammar-dir "/grammar.js")
-                     (concat source-dir "/grammar.js"))
+          (copy-file (file-name-concat grammar-dir "grammar.js")
+                     (file-name-concat source-dir "grammar.js"))
           ;; cd "${sourcedir}"
           (setq default-directory source-dir)
           (message "Compiling library")
@@ -2723,6 +2731,7 @@ treesit--install-language-grammar-1
            cc nil t nil "-fPIC" "-c" "-I." "parser.c")
           ;; cc -fPIC -c -I. scanner.c
           (when (file-exists-p "scanner.c")
+            (unless c++ (error "No C++ compiler found"))
             (treesit--call-process-signal
              cc nil t nil "-fPIC" "-c" "-I." "scanner.c"))
           ;; c++ -fPIC -I. -c scanner.cc
@@ -2739,7 +2748,7 @@ treesit--install-language-grammar-1
                      (rx bos (+ anychar) ".o" eos))
                   "-o" ,lib-name))
           ;; Copy out.
-          (copy-file lib-name (concat out-dir "/") t)
+          (copy-file lib-name (file-name-as-directory out-dir) t)
           (message "Library installed to %s/%s" out-dir lib-name))
       (when (file-exists-p workdir)
         (delete-directory workdir t)))))
--=-=-=--
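
For anyone who wants to try the install command discussed above, here is a minimal init-file sketch. It assumes an Emacs 29 build with tree-sitter support, and git plus a C compiler on PATH. It uses only the URL-only Python recipe quoted in this thread. `my/treesit-ensure-grammar' is a hypothetical helper name, not something in treesit.el.

```elisp
;; Sketch only, not part of the patch.  Assumes Emacs 29+ built with
;; tree-sitter, and that git and a C compiler are available.
(require 'treesit)

;; Recipe taken from the thread: language symbol followed by the Git URL.
(setq treesit-language-source-alist
      '((python "https://github.com/tree-sitter/tree-sitter-python.git")))

(defun my/treesit-ensure-grammar (lang)
  "Install the grammar for LANG unless it can already be loaded.
Hypothetical helper: does nothing when this Emacs lacks tree-sitter."
  (when (and (fboundp 'treesit-available-p)
             (treesit-available-p)
             (not (treesit-language-available-p lang)))
    ;; Clones the recipe's repository and compiles the shared library.
    (treesit-install-language-grammar lang)))

;; Needs network access the first time it runs.
(my/treesit-ensure-grammar 'python)
```

Checking `treesit-language-available-p' first avoids cloning and recompiling the grammar on every startup.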