From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#25987: 25.2; support gcc fixit notes
Date: Thu, 12 Nov 2020 15:54:31 +0200
Message-ID: <83mtzmznmw.fsf@gnu.org>
References: <87lgsj1jle.fsf@tromey.com>
 <1521218887.2913.237.camel@redhat.com> <83muz7pyde.fsf@gnu.org>
 <83o8lf9p68.fsf@gnu.org>
 <26f277bb345f10efe6340ac4074960905064fc97.camel@redhat.com>
 <83362i2nul.fsf@gnu.org>
 <8666386379d22239075d9237f00f40469c5be454.camel@redhat.com>
 <837drkopuf.fsf@gnu.org>
To: David Malcolm
Cc: 25987@debbugs.gnu.org
In-Reply-To: (message from David Malcolm on Wed, 11 Nov 2020 14:36:49 -0500)

> From: David Malcolm
> Cc: 25987@debbugs.gnu.org
> Date: Wed, 11 Nov 2020 14:36:49 -0500
>
> On Tue, 2020-10-20 at 18:54 +0300, Eli Zaretskii wrote:
> > > From: David Malcolm
> > > Cc: 25987@debbugs.gnu.org
> > > Date: Tue, 20 Oct 2020 10:52:05 -0400
> > >
> > > One possible issue: in the final diagnostic, there's a fix-it
> > > hint with non-ASCII replacement text, replacing "two_pi" with
> > > "two_π" (where the final char in the latter is GREEK SMALL
> > > LETTER PI, U+03C0)
> > >
> > > This replacement currently expressed as encoded bytes i.e:
> > >
> > > fix-it:"demo.c":{51:10-51:16}:"two_\317\200"
> > >
> > > where \317\200 is the octal-escaped representation of the two
> > > bytes of the UTF-8 encoding of the character.
> > >
> > > Is this going to work for Emacs?
> >
> > You mean, GCC doesn't actually emit the UTF-8 encoding of π, it emits
> > its ASCII-fied representation? We'd need to decode that, but is that
> > really justified? Why not emit UTF-8?
>
> I have an implementation that simply emits UTF-8 in quotes, escaping
> backslash, tab, newline, and doublequotes as before. (we have to
> escape at least newline, given that fix-it hint replacement text can
> contain them, and we're using newline to terminate the parseable hint).

Sorry, I've lost the context: where did those non-ASCII names come
from? Are they names of variables in the user's program? If so, in
what encoding does GCC quote portions of the source code in its
warning/error messages? Does it use the exact byte stream it found in
the source, or does it perform any conversions of the encoding?

> However, the filename also needs to be escaped. Currently I'm applying
> the same escaping rules to both filename and replacement text.
> What is the encoding of the filename? What if the bytes in a filename
> aren't UTF-8 encoded? How does emacs handle this case?

Emacs has a separate variable for the encoding of file names, which
gets set from the locale settings.
But this is not necessarily relevant to the issue at hand, because we
are talking about processing output from a sub-process (GCC) which
includes both file names and other stuff, such as fragments of the
source code. When Emacs processes sub-process output, it generally
assumes all of it is in the same encoding. So if, for example, you
encode non-ASCII variable names in UTF-8 while the file names are
emitted in some other encoding (perhaps because the locale's codeset
is not UTF-8), then there will be complications: we will have to read
the output from GCC in its raw form, and then decode "by hand" (in
Lisp) each part of it as appropriate (which means we will need to be
able to identify each such part).

So it's important to understand the situation and its limitations in
order to propose the best solution.

> I tried creating file with the name "byte 0xff" .txt, and with valid
> UTF-8 non-ascii names and emacs reported them as \377.txt and with
> the UTF-8 names respectively, so perhaps I should simply emit the
> bytes and pretend they are UTF-8?

What do you mean by "pretend" in this context?
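
To make the "decode by hand" case above concrete, here is a minimal
sketch of what a consumer of the parseable fix-it line would have to
do if the file name and the replacement text could be in different
encodings. It is written in Python purely for illustration (Emacs
would do this in Lisp), and it is not what GCC or Emacs actually
implements. Assumptions: the line format is the one quoted earlier in
this thread; the escapes are backslash, tab, newline, doublequote and
three-digit octal; the names parse_fixit and unescape and the two
encoding parameters are hypothetical.

```python
import re

# Line shape taken from the example quoted in this thread:
#   fix-it:"FILE":{L1:C1-L2:C2}:"REPLACEMENT"
FIXIT_RE = re.compile(
    rb'^fix-it:"((?:[^"\\]|\\.)*)":'
    rb'\{(\d+):(\d+)-(\d+):(\d+)\}:'
    rb'"((?:[^"\\]|\\.)*)"$'
)

SIMPLE_ESCAPES = {b"n": b"\n", b"t": b"\t", b"\\": b"\\", b'"': b'"'}


def unescape(raw: bytes) -> bytes:
    """Undo the assumed escapes and return the raw, still undecoded bytes."""
    def repl(match):
        body = match.group(1)
        if body in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[body]
        return bytes([int(body, 8)])  # three-digit octal, at most \377
    return re.sub(rb'\\([0-3][0-7]{2}|[nt\\"])', repl, raw)


def parse_fixit(line: bytes, filename_encoding="utf-8", text_encoding="utf-8"):
    """Split one fix-it line and decode each part with its own encoding."""
    m = FIXIT_RE.match(line.rstrip(b"\n"))
    if m is None:
        return None
    raw_name, l1, c1, l2, c2, raw_text = m.groups()
    # File names may hold bytes that are invalid in the chosen encoding;
    # surrogateescape keeps them recoverable instead of raising an error.
    filename = unescape(raw_name).decode(filename_encoding, "surrogateescape")
    text = unescape(raw_text).decode(text_encoding)
    return filename, (int(l1), int(c1)), (int(l2), int(c2)), text
```

The point of the sketch is only that the consumer must first find the
boundaries of each part in the raw bytes, and only then decode: the
file name with one encoding, the replacement text with another. If
everything GCC emits were in a single encoding, the whole line could
be decoded in one step before parsing.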