From: Jostein Kjønigsen
Newsgroups: gmane.emacs.devel
Subject: Re: Call for volunteers: add tree-sitter support to major modes
Date: Wed, 19 Oct 2022 10:03:33 +0200
Message-ID: <213a26fd-1043-07e5-2a78-310b7f71bfdf@secure.kjonigsen.net>
References: <83czb1jrm3.fsf@gnu.org> <878rlo7on0.fsf@thornhill.no> <83k04y1gd2.fsf@gnu.org> <87wn8xbyr2.fsf@yahoo.com>
To: Po Lu, Alan Mackenzie
Cc: Eli Zaretskii, casouri@gmail.com, theo@thornhill.no, emacs-devel@gnu.org
In-Reply-To: <87wn8xbyr2.fsf@yahoo.com>

> Could someone please tell me how well tree-sitter supports pre-standard
> C with liberal (and sometimes non-standard) use of the C language?
> I'm talking about code that looks like this:
>
>   MACRO_USED_TO_DEFINE_SPECIAL_FUNCTIONS (function_name, cells, transform)
>     some_kind_of_ptr cells;
>     another_kind_of_ptr *transform;
>   {
>     extern maybe_tls (int) errno;
>     extern caddr_t bar (_P (another_kind_of_ptr, ...));
>     int rc;
>
>     BEGIN_A_KIND_OF_SECTION ({
>       ENTRY (dx, dy, shx, shy)
>         float dx, dy, shx, shy;
>
>       if (!bar (other_function (dx, dy, shx, shy),
>                 etc, etc, etc))
>         die ("bar", sys_errlist[errno]);
>     }, register float, section_name);
>
>     rc = more_code_here (&section_name_desc, etc);
>     return rc;
>   }
>
> I don't doubt that tree-sitter is good at parsing newer languages like
> Typescript, but does it support C all that well?

If you want to get an idea of which syntaxes tree-sitter supports, every
language grammar comes with its own test cases.

Some of the test cases for "vanilla" C can be found here:
https://github.com/tree-sitter/tree-sitter-c/tree/master/test/corpus

You may find the ambiguities file particularly interesting:
https://github.com/tree-sitter/tree-sitter-c/blob/master/test/corpus/ambiguities.txt

For C++ there is a separate grammar altogether:
https://github.com/tree-sitter/tree-sitter-cpp/tree/master/test/corpus

It similarly defines test cases for how ambiguous statements should be
parsed:
https://github.com/tree-sitter/tree-sitter-cpp/blob/master/test/corpus/ambiguities.txt

--
Jostein
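
For a sense of what an ambiguities corpus has to pin down, here is a small C sketch. It is not taken from the tree-sitter-c corpus, and the names T, as_cast and as_subtraction are made up for illustration. The same tokens, (T)-x, are a cast when T names a type and a subtraction when T names a variable, so a parser that keeps no symbol table has to commit to one reading, and the corpus records which one.

```c
/* Illustrative only: T, as_cast and as_subtraction are hypothetical names. */
typedef int T;

/* Here T is the file-scope typedef, so (T)-x casts the value -x to T. */
int as_cast(void)
{
    int x = 5;
    return (T)-x;
}

/* Here a local variable named T shadows the typedef, so (T)-x is T minus x. */
int as_subtraction(void)
{
    int T = 7;
    int x = 5;
    return (T)-x;
}
```

A C compiler resolves the two cases differently because it tracks declarations: as_cast() yields -5 and as_subtraction() yields 2.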