From: Filippo Argiolas
Newsgroups: gmane.emacs.devel
Subject: Re: How does c-ts-mode, tree-sitter indentation, and preprocessor directives work?
Date: Sun, 01 Dec 2024 09:36:34 +0100
References: <86plmferwu.fsf@gnu.org> <52D99EBA-1DCB-4559-A645-A53E7CF82FED@gmail.com>
To: Yuan Fu
Cc: Eli Zaretskii, Björn Lindqvist, emacs-devel@gnu.org
In-Reply-To: <52D99EBA-1DCB-4559-A645-A53E7CF82FED@gmail.com>

Yuan Fu writes:

>> On Nov 28, 2024, at 10:30 AM, Filippo Argiolas wrote:
>>
>> Eli Zaretskii writes:
>>
>>>> From: Björn Lindqvist
>>>> Date: Thu, 28 Nov 2024 00:27:17 +0100
>>>>
>>>> I've been trying to get c-ts-mode to indent like I want, but I'm
>>>> running into problems related to preprocessor directives.
>>>
>>> Preprocessor directives are difficult because the tree-sitter C/C++
>>> grammars include only partial support for them.
>>>
>>>> For
>>>> example, consider a type definition nested in two #ifdefs:
>>>>
>>>> #ifdef X
>>>> #ifdef Y
>>>> typedef int foo;
>>>> #endif
>>>> #endif
>>>>
>>>> Since both the parent and grand parent of the type_definition is a
>>>> preproc_ifdef no rule matches.
>>>
>>> But if you go back (up) the parent-child hierarchy, you will
>>> eventually find a node which is not a preproc_SOMETHING, and can go
>>> from there, no?
>>>
>>
>> I believe we might have a bug here, as far as I can tell it does not
>> match
>>
>> ((n-p-gp nil "preproc" "translation_unit") column-0 0)
>>
>> Because both parent and grand parent are preproc. So it matches one of
>> the `c-ts-mode--standalone-parent-skip-preproc' rules right after.
>>
>> After skipping preproc nodes parent is translation_unit and indents an offset
>> from there. Guess this step could be made smarter to check for
>> translation_unit and the rule above could be removed?
>>
>>>> Another issue is that I want my
>>>> preprocessor directives kept at column 0, which unfortunately screws
>>>> up all rules that refer to the parent. E.g.:
>>>>
>>>> ((parent-is "if_statement") standalone-parent 4)
>>>>
>>>> Doesn't work for
>>>>
>>>> int main() {
>>>>     if (true)
>>>> #ifdef A
>>>>         prutt();
>>>> #else
>>>>         fis();
>>>> #endif
>>>> }
>>>>
>>>> The rule I'd like to express is "take the indent of the closest
>>>> *indenting* parent and add one indent". That rule would match whether
>>>> that parent is a "while_statement", "if_statement", "for_statement",
>>>> etc. You can't express such rules with tree-sitter, can you?
>>>
>>> Not sure, but Yuan will know.
>>
>> This can be worked around as Yuan showed, but isn't it a grammar bug?
>> problem is with the #ifdef function and if statement become siblings, without
>> preproc they have a child-parent relation.
>>
>> In my experience c-ts-mode is a bit fragile with preprocessor
>> statements, probably because the grammar itself is fragile (see
>> e.g. [1]) and the problem is an hard one.
>
> Right.
>
>> Yuan, do you think c-ts-mode could some way benefit from LSP knowledge
>> about inactive preprocessor branches?
>> Idea is that we would at least
>> have a good syntax tree in the active branches while allowing some
>> errors in the inactive ones.
>
> Maybe. Technically you can create a parser and sets its range to only included the active branches. But for it to work end-to-end would require some major effort. I’m not sure if it’s worth it (in terms of code complexity and maintenance cost).

Interesting, maybe I'll experiment a bit with it and see where it goes.
I agree that it already sounds like overkill for little gain.

My major annoyance, more than indentation, is when preprocessor
statements break function detection and imenu/breadcrumb. I have one
offending file of this kind at work which unfortunately I cannot
share. I will try to extract a test case that reproduces the issue and
open a bug. Maybe it can be worked around in some way from c-ts-mode.

Filippo