From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: tree-sitter: conceptional problem solvable at Emacs' level?
Date: Thu, 09 Feb 2023 10:50:59 +0200
Message-ID: <83h6vvmdcc.fsf@gnu.org>
References: <87zg9n45ig.fsf@yahoo.com>
In-Reply-To: <87zg9n45ig.fsf@yahoo.com> (message from Po Lu on Thu, 09 Feb 2023 16:17:27 +0800)
To: Po Lu
Cc: holgerschurig@gmail.com, Emacs-devel@gnu.org

> From: Po Lu
> Cc: Emacs-devel@gnu.org
> Date: Thu, 09 Feb 2023 16:17:27 +0800
>
> Holger Schurig writes:
>
> > Hi, I run branch emacs-29 since some time with great success. And now I
> > wanted to test out tree-sitter and c++-test-mode. Unfortunately, I
> > stumbled into some conceptional problems and wonder if this is actually
> > solvable by Emacs, or if some would need a completely new grammar.
> >
> > The issue is: tree-sitter doesn't work well with C macros.
> >
> > I program a lot in C++/Qt. So let's look at this (valid) C++ program:
> >
> > -----------------------------------------------------------------------------
> > #include
> >
> > class Test : public QObject
> > {
> >     Q_OBJECT
> > public:
> >     Test() : QObject() {};
> > public slots:
> >     void someSlot() {};
> > };
> > -----------------------------------------------------------------------------
> >
> > If have the libraries installed (e.g. qtbase5-dev on Debian), you can
> > compile this perfectly.
> >
> > However, tree-sitter produces a garbage syntax tree:
> >
> > - contain some bitfield node (which isn't really there)
> > - contains an error node (despite the code being compilable)
> >
> > And as a result, BOTH the indentation and the font-locking is wrong.
> >
> >
> > Would I need to create a tree-sitter grammar in JavaScript that
> > understands this macro-enhanced C++? That would be quite difficult.
> > Or will there be a method to add some kind of tiny-preprocessor to
> > c++-ts-mode, so that it can substitute "Q_OBJECT", "signals" and "slots"
> > with nothing before handing things over to tree-sitter?
> >
> >
> > In comparison, I could teach the old cc-mode about this macro-enriched
> > C++ just with
> >
> > (c-add-style "qt-gnu"
> >   '("gnu" (c-access-key .
> >     "\\<\\(signals\\|public\\|protected\\|private\\|public slots\\|protected slots\\|private slots\\):")))
> >
> >
> > I guess that a lot of C and C++ programs use macros. And if there is no
> > simple way to aid tree-sitter in understanding this, then I fear
> > tree-sitter enhanced modes will often be unusable on them.
>
> My suggestion is simply to stay with CC Mode.

Suggestions for what to do for now aside, I would still want us to try
to figure out the possibilities for better handling of C/C++ macros in
tree-sitter supported modes.  I don't want to give up yet, because the
kludges similar to c-add-style used by CC mode might be possible with
tree-sitter modes as well.  Or maybe some other solution could work,
including the idea of letting tree-sitter see preprocessed source code
(although this is probably harder to implement, and must be done on
the C level).

We just started using these modes in Emacs, so it is small wonder that
issues like this are popping up, and will probably keep popping up for
some time to come.  I see no reason whatsoever to give up on
tree-sitter just because these minor problems in marginal cases are
brought up; we should instead solve them one by one.  Being minor
problems, they in no way invalidate the basic decision to try using
tree-sitter in Emacs, not from where I stand.
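
To make the "tiny-preprocessor" idea from the quoted message concrete,
here is a rough sketch of what such a masking pass might look like.
It is only an illustration under several assumptions: the function
name mask_qt_macros and the token table are hypothetical, nothing here
is part of Emacs or tree-sitter, and the pass does not skip string
literals or comments.  It overwrites each token with text of the same
length, so that every position in the masked copy still corresponds to
the same position in the original text.  Instead of replacing
"signals" with nothing (which would leave a bare colon), the sketch
assumes it can be treated as an access specifier and writes "public "
in its place.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: true for characters that can appear in a C++ identifier.
static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical masking pass: returns a copy of SRC in which selected
// Qt-specific tokens are overwritten by same-length text, so that byte
// offsets in the result match byte offsets in SRC.
std::string mask_qt_macros(std::string_view src) {
    // Token -> replacement; the replacement is padded with spaces to
    // the token's length.  The table itself is an assumption.
    static const std::vector<std::pair<std::string_view, std::string_view>> rules = {
        {"Q_OBJECT", ""},        // blank out entirely
        {"signals",  "public"},  // assumed to act as an access specifier
        {"slots",    ""},        // "public slots:" becomes "public      :"
    };

    std::string out(src);
    std::size_t i = 0;
    while (i < src.size()) {
        if (!is_ident_char(src[i])) {
            ++i;
            continue;
        }
        // Scan one whole identifier-like word, so that names such as
        // "my_slots" are never partially matched.
        std::size_t j = i;
        while (j < src.size() && is_ident_char(src[j])) {
            ++j;
        }
        const std::string_view word = src.substr(i, j - i);
        for (const auto& [token, replacement] : rules) {
            if (word == token) {
                std::string padded(replacement);
                padded.resize(token.size(), ' ');  // keep the length unchanged
                out.replace(i, token.size(), padded);
                break;
            }
        }
        i = j;
    }
    return out;
}
```

The parser would then be handed the masked copy while the buffer keeps
the original text.  Whether anything like this can be hooked into the
tree-sitter integration, and at which level, is exactly the open
question above.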