From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
Date: Sun, 1 Dec 2024 01:32:20 -0800
To: Filippo Argiolas
Cc: Eli Zaretskii, Björn Lindqvist, emacs-devel@gnu.org

> On Dec 1, 2024, at 12:36 AM, Filippo Argiolas wrote:
>
> Yuan Fu writes:
>
>>> On Nov 28, 2024, at 10:30 AM, Filippo Argiolas wrote:
>>>
>>> Eli Zaretskii writes:
>>>
>>>>> From: Björn Lindqvist
>>>>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>>>>>
>>>>> I've been trying to get c-ts-mode to indent like I want, but I'm
>>>>> running into problems related to preprocessor directives.
>>>>
>>>> Preprocessor directives are difficult because the tree-sitter C/C++
>>>> grammars include only partial support for them.
>>>>
>>>>> For
>>>>> example, consider a type definition nested in two #ifdefs:
>>>>>
>>>>> #ifdef X
>>>>> #ifdef Y
>>>>> typedef int foo;
>>>>> #endif
>>>>> #endif
>>>>>
>>>>> Since both the parent and grand parent of the type_definition is a
>>>>> preproc_ifdef no rule matches.
>>>>
>>>> But if you go back (up) the parent-child hierarchy, you will
>>>> eventually find a node which is not a preproc_SOMETHING, and can go
>>>> from there, no?
>>>>
>>>
>>> I believe we might have a bug here, as far as I can tell it does not
>>> match
>>>
>>> ((n-p-gp nil "preproc" "translation_unit") column-0 0)
>>>
>>> Because both parent and grand parent are preproc. So it matches one of
>>> the `c-ts-mode--standalone-parent-skip-preproc' rules right after.
>>>
>>> After skipping preproc nodes parent is translation_unit and indents an offset
>>> from there. Guess this step could be made smarter to check for
>>> translation_unit and the rule above could be removed?
>>>
>>>>> Another issue is that I want my
>>>>> preprocessor directives kept at column 0, which unfortunately screws
>>>>> up all rules that refer to the parent. E.g.:
>>>>>
>>>>> ((parent-is "if_statement") standalone-parent 4)
>>>>>
>>>>> Doesn't work for
>>>>>
>>>>> int main() {
>>>>>     if (true)
>>>>> #ifdef A
>>>>>         prutt();
>>>>> #else
>>>>>         fis();
>>>>> #endif
>>>>> }
>>>>>
>>>>> The rule I'd like to express is "take the indent of the closest
>>>>> *indenting* parent and add one indent". That rule would match whether
>>>>> that parent is a "while_statement", "if_statement", "for_statement",
>>>>> etc. You can't express such rules with tree-sitter, can you?
>>>>
>>>> Not sure, but Yuan will know.
>>>
>>> This can be worked around as Yuan showed, but isn't it a grammar bug?
>>> problem is with the #ifdef function and if statement become siblings, without
>>> preproc they have a child-parent relation.
>>>
>>> In my experience c-ts-mode is a bit fragile with preprocessor
>>> statements, probably because the grammar itself is fragile (see
>>> e.g. [1]) and the problem is an hard one.
>>
>> Right.
>>
>>> Yuan, do you think c-ts-mode could some way benefit from LSP knowledge
>>> about inactive preprocessor branches? Idea is that we would at least
>>> have a good syntax tree in the active branches while allowing some
>>> errors in the inactive ones.
>>
>> Maybe. Technically you can create a parser and sets its range to only
>> included the active branches. But for it to work end-to-end would
>> require some major effort. I’m not sure if it’s worth it (in terms of
>> code complexity and maintenance cost).
>
> Interesting, maybe I'll experiment a bit with it and see where it
> goes. Agree that it already sounds overkill for little gain.
>
> My major annoyance more than indent is when the preprocessor statements
> break function detection and imenu/breadcrumb. I have one offending file
> of this kind at work which unfortunately I cannot share. Will try to
> extract a test case that reproduce the issue and open a bug. May be it
> can be worked around some way from c-ts-mode.

I share the frustration. Tree-sitter for C could’ve been so much better
if it weren’t for the preprocessor and macros.

IME, whether it can be worked around depends on the specific code. Some
code just generates a parse tree that’s hard to recover.

Yuan