all messages for Emacs-related lists mirrored at yhetil.org
From: Robert Pluim <rpluim@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: master d57bb0c: Treat passed strings as raw-text when percent-escaping in epg
Date: Thu, 12 Dec 2019 16:19:46 +0100
Message-ID: <m2o8wdwoz1.fsf@gmail.com>
In-Reply-To: <jwv4ky5vfls.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Thu, 12 Dec 2019 08:58:33 -0500")

>>>>> On Thu, 12 Dec 2019 08:58:33 -0500, Stefan Monnier <monnier@iro.umontreal.ca> said:

    Stefan> Hi Robert,
    >> The strings contained in gpg keys can contain UTF-8 data, but can also
    >> use percent-escapes to encode non-ASCII chars.  When converting those
    >> escapes, use 'raw-text' coding system rather than 'string-to-unibyte',
    >> since the latter signals an error for non-ASCII characters.

    Stefan> I don't quite understand: "can contain UTF-8 data" seems odd here since
    Stefan> you're calling `encode-coding-string` whose input argument is a sequence
    Stefan> of characters whereas "UTF-8 data" can only be found in sequences of bytes.

    Stefan> Did you mean "can contain non-ASCII characters"?

"can contain non-ASCII characters encoded using UTF-8", which means
they end up in a multibyte string in Emacs.

    Stefan> The other problem with the above description is the "raw-text" since
    Stefan> it's far from clear what it means (personally I really have no idea
    Stefan> what is "raw text" and the way Emacs understands "raw text" is more or
    Stefan> less "EOL-separated lines of bytes" which does not seem to match your
    Stefan> description since string-to-unibyte doesn't signal errors when
    Stefan> encountering bytes).

Itʼs replacing the use of string-to-unibyte on a multibyte string
containing non-ASCII characters, which signals an error, with
encode-coding-string using 'raw-text, which produces a unibyte
string instead. My other choices were 'binary or 'no-conversion, which
do the same, but have even less meaningful names.

    Stefan> Looking at the code, I see that the only caller of
    Stefan> `epg--decode-percent-escape` seems to be
    Stefan> `epg--decode-percent-escape-utf-8` which decodes the bytes returned by
    Stefan> `epg--decode-percent-escape` using `utf-8` so I think it makes more
    Stefan> sense to encode using `utf-8` than `raw-text`, WDYT?

No. The string that is passed to epg--decode-percent-escape can
contain non-ASCII characters encoded as UTF-8, plus percent-escaped
representations of non-ASCII characters. In order to convert those
percent-escaped characters correctly, the string has to be treated as
a unibyte sequence of bytes, then converted back to multibyte by
decoding with utf-8 afterwards.
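
A minimal sketch of that order of operations, written in Python rather
than Emacs Lisp purely for illustration. It is not the epg.el code: the
function name and the sample string are hypothetical, and it handles
only %XX escapes. It shows why the escapes have to be resolved at the
byte level before any UTF-8 decoding happens.

```python
import re


def decode_percent_escape_utf8(text):
    """Hypothetical analogue of the decoding order described above."""
    # Step 1: turn the whole string into bytes, so that literal
    # non-ASCII characters become their UTF-8 byte sequences.
    raw = text.encode("utf-8")

    # Step 2: replace every %XX escape with the single byte it names.
    # One escape is one byte, not necessarily one whole character.
    raw = re.sub(
        rb"%([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        raw,
    )

    # Step 3: only now interpret the combined byte sequence as UTF-8.
    return raw.decode("utf-8")


# Literal "é" and percent-escaped "é" (%C3%A9) mixed in one string.
print(decode_percent_escape_utf8("Ren\u00e9 %C3%A9t%C3%A9"))  # René été
```

Decoding each escape on its own would not work in this sketch, because
a lone byte such as 0xC3 is not valid UTF-8 by itself; it only becomes
a character once the following 0xA9 byte sits next to it.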

Robert


