From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: cc-mode fontification feels random
Date: Wed, 09 Jun 2021 16:36:35 -0400
To: Alan Mackenzie
Cc: Daniel Colascione, Eli Zaretskii, rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
In-Reply-To: (Alan Mackenzie's message of "Wed, 9 Jun 2021 20:20:26 +0000")

> That's a rather negative way of putting things, which is a bit indefinite
> and wishy-washy.
> You could instead try to specify which tokens should get
> font-lock-type-face and which shouldn't, thus giving something concrete
> to discuss.  I think this will be difficult to do well, and may lead to
> the result which I alluded to above.

It has to be said also that C/C++ is quite unusual in that knowing which
identifier is a type is necessary for correct parsing.  If it weren't so,
we could reliably highlight types not based on their name but based on
their location in the syntax.

I think an approach like that of tree-sitter should be able (at least in
theory) to give reasonably good highlighting of types based on their
position (tho sadly not in those cases where the syntax is ambiguous).

I don't have a good intuition of how often ambiguities come into play in
real code, nor how much work would be needed to disambiguate most cases
(without relying on discovery of the corresponding type declarations).

If ambiguities are rare enough and/or easy enough to disambiguate via
some simple/local heuristic, then maybe CC-mode could try to highlight
types based on their location rather than based on their identifiers.
This would make it more stable (not dependent on the order in which
chunks are highlighted) and maybe more reliable.

But I suspect that it's not easy to do that kind of parsing, short of
doing a full parse like tree-sitter does.


        Stefan
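
A minimal sketch of the ambiguity discussed above, in Python purely for
illustration.  It shows how the C statement `T * p;` reads as a declaration
when `T` is known to be a type and as a multiplication otherwise, and how a
position-only guess might look.  The function names and the heuristic are
assumptions made for this example; they are not taken from CC-mode or
tree-sitter.

```python
# Illustrative sketch only: the function names and the heuristic below are
# assumptions made for this example, not code from CC-mode or tree-sitter.

def parse_star_statement(tokens, type_names):
    """Interpret the C statement `X * Y ;` using a table of known type names."""
    if len(tokens) != 4 or tokens[1] != "*" or tokens[3] != ";":
        raise ValueError("expected a statement of the form `X * Y ;`")
    x, _, y, _ = tokens
    if x in type_names:
        # X is a type: the statement declares Y as a pointer to X.
        return ("declaration", x, y)
    # X is not a known type: the statement multiplies X by Y.
    return ("expression", x, y)


def guess_by_position(tokens):
    """Classify `X * Y ;` with no type table, from its shape alone.

    Assumed heuristic: a product whose value is discarded is rarely written
    on purpose, so a whole statement of this shape is guessed to be a
    declaration.
    """
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        return "declaration"
    return "unknown"


tokens = "T * p ;".split()
print(parse_star_statement(tokens, {"T"}))   # T has been declared as a type
print(parse_star_statement(tokens, set()))   # T is an ordinary variable
print(guess_by_position(tokens))             # no declarations consulted
```

The first function needs the type declarations to have been seen already,
which is what makes the result depend on what has been scanned so far; the
second gives the same answer regardless, but can be wrong when `T` really is
a variable.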