unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: David Malcolm <dmalcolm@redhat.com>
Cc: 25987@debbugs.gnu.org
Subject: bug#25987: 25.2; support gcc fixit notes
Date: Thu, 12 Nov 2020 15:54:31 +0200
Message-ID: <83mtzmznmw.fsf@gnu.org>
In-Reply-To: <a5181d7c54cec863cc1c25d39154b5d1a2c15741.camel@redhat.com> (message from David Malcolm on Wed, 11 Nov 2020 14:36:49 -0500)

> From: David Malcolm <dmalcolm@redhat.com>
> Cc: 25987@debbugs.gnu.org
> Date: Wed, 11 Nov 2020 14:36:49 -0500
> 
> On Tue, 2020-10-20 at 18:54 +0300, Eli Zaretskii wrote:
> > > From: David Malcolm <dmalcolm@redhat.com>
> > > Cc: 25987@debbugs.gnu.org
> > > Date: Tue, 20 Oct 2020 10:52:05 -0400
> > > 
> > > One possible issue: in the final diagnostic, there's a fix-it hint
> > > with
> > > non-ASCII replacement text, replacing "two_pi" with "two_π" (where
> > > the
> > > final char in the latter is GREEK SMALL LETTER PI, U+03C0)
> > > 
> > > This replacement currently expressed as encoded bytes i.e:
> > > 
> > > fix-it:"demo.c":{51:10-51:16}:"two_\317\200"
> > > 
> > > where \317\200 is the octal-escaped representation of the two bytes
> > > of
> > > the UTF-8 encoding of the character.
> > > 
> > > Is this going to work for Emacs?
> > 
> > You mean, GCC doesn't actually emit the UTF-8 encoding of π, it emits
> > its ASCII-fied representation?  We'd need to decode that, but is that
> > really justified?  Why not emit UTF-8?
> 
> I have an implementation that simply emits UTF-8 in quotes, escaping
> backslash, tab, newline, and doublequotes as before.  (we have to
> escape at least newline, given that fix-it hint replacement text can
> contain them, and we're using newline to terminate the parseable hint).

Sorry, I've lost the context: where did those non-ASCII names come
from?  Are they names of variables in the user's program?  If so, in
what encoding does GCC quote portions of the source code in its
warning/error messages?  Does it use the exact byte stream it found in
the source, or does it perform any conversions of the encoding?

> However, the filename also needs to be escaped.  Currently I'm applying
> the same escaping rules to both filename and replacement text.
> What is the encoding of the filename?  What if the bytes in a filename
> aren't UTF-8 encoded?  How does emacs handle this case?

Emacs has a separate variable for the encoding of file names, which
gets set from the locale settings.  But this is not necessarily
relevant to the issue at hand, because we are talking about processing
output from a sub-process (GCC) which includes both file names and
other stuff, such as fragments of the source code.  When Emacs
processes sub-process output, it generally assumes all of it is
encoded in the same encoding.  So if, for example, you encode
non-ASCII variables in UTF-8 while the file names are emitted in some
other encoding (perhaps because the locale's codeset is not UTF-8),
then there will be complications: we will have to read the output from
GCC in its raw form, and then decode "by hand" (in Lisp) each part of
it as appropriate (which means we will need to be able to identify
each such part).

So it's important to understand the situation and its limitations for
proposing the best solution.
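
To make the "decode by hand" scenario above concrete, here is a minimal
sketch in Python (not Emacs Lisp, and not anything Emacs or GCC actually
ships).  It assumes the parseable fix-it line has the shape quoted
earlier in this thread, with octal-escaped bytes, and that the file name
and the replacement text may be in different encodings.  The names
FIXIT_RE, unescape, parse_fixit and the filename_encoding parameter are
hypothetical and exist only for this illustration.

```python
import re

# Assumed line shape, taken from the example quoted in this thread:
#   fix-it:"FILE":{L1:C1-L2:C2}:"REPLACEMENT"
# Everything is handled as raw bytes until each part is identified.
FIXIT_RE = re.compile(
    rb'^fix-it:"((?:[^"\\]|\\.)*)":'
    rb'\{(\d+):(\d+)-(\d+):(\d+)\}:'
    rb'"((?:[^"\\]|\\.)*)"$',
    re.DOTALL,
)

_SIMPLE = {b"n": b"\n", b"t": b"\t", b"\\": b"\\", b'"': b'"'}


def unescape(raw: bytes) -> bytes:
    """Undo backslash escapes: \\n, \\t, \\\\, \\" and 3-digit octal bytes."""
    def repl(m):
        body = m.group(1)
        if body in _SIMPLE:
            return _SIMPLE[body]
        return bytes([int(body, 8)])  # e.g. b"317" -> byte 0xCF
    return re.sub(rb'\\([0-3][0-7]{2}|[nt\\"])', repl, raw)


def parse_fixit(line: bytes, filename_encoding: str = "utf-8"):
    """Split one fix-it line and decode each part with its own encoding.

    filename_encoding is an assumption standing in for whatever the
    locale dictates; the replacement text is assumed to be UTF-8.
    """
    m = FIXIT_RE.match(line.rstrip(b"\n"))
    if m is None:
        return None
    raw_name, l1, c1, l2, c2, raw_text = m.groups()
    # surrogateescape keeps undecodable file-name bytes recoverable.
    filename = unescape(raw_name).decode(filename_encoding,
                                         errors="surrogateescape")
    replacement = unescape(raw_text).decode("utf-8")
    return filename, (int(l1), int(c1)), (int(l2), int(c2)), replacement


if __name__ == "__main__":
    print(parse_fixit(rb'fix-it:"demo.c":{51:10-51:16}:"two_\317\200"'))
```

The point of the sketch is the last few lines of parse_fixit: once the
parts have been identified, each one can be decoded separately, which is
exactly the extra work that a single encoding for the whole output would
avoid.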

> I tried creating file with the name "byte 0xff" .txt, and with valid
> UTF-8 non-ASCII names and emacs reported them as \377.txt and with
> the UTF-8 names respectively, so perhaps I should simply emit the
> bytes and pretend they are UTF-8?

What do you mean by "pretend" in this context?
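
For reference, the following Python sketch illustrates why the
distinction matters; it is only an illustration of the byte-level
behavior, not a description of what GCC or Emacs does internally.  The
byte 0xff can never occur in valid UTF-8, so a strict decoder rejects
it, while a lenient one has to carry the raw byte along somehow.  The
helper name show_like_octal is hypothetical.

```python
name = b"\xff.txt"  # the "byte 0xff" file name from the quoted test

# A strict UTF-8 decoder refuses this name outright.
try:
    name.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False


def show_like_octal(raw: bytes) -> str:
    """Decode as UTF-8, rendering undecodable bytes as octal escapes."""
    # surrogateescape maps each bad byte 0xNN to the code point U+DCNN.
    text = raw.decode("utf-8", errors="surrogateescape")
    return "".join(
        "\\%03o" % (ord(ch) - 0xDC00) if 0xDC80 <= ord(ch) <= 0xDCFF else ch
        for ch in text
    )


if __name__ == "__main__":
    print(is_valid_utf8)
    print(show_like_octal(name))
    print(show_like_octal("two_\u03c0.c".encode("utf-8")))
```

A consumer that treats the bytes as UTF-8 therefore still needs a rule
for sequences that are not valid UTF-8.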




