unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#39689: 26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
@ 2020-02-20 13:48 Vegard Vesterheim via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-02-27 10:51 ` Robert Pluim
  0 siblings, 1 reply; 6+ messages in thread
From: Vegard Vesterheim via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-02-20 13:48 UTC
  To: 39689

Emacs does not seem to correctly handle UTF-8-Based Percent-Encoding as
illustrated in Chapter 6.2 from RFC6068.

The command 
   emacs -Q -l browse-url -eval '(browse-url-mail "mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9")'

should result in a message buffer with the string "café" inserted into the
body part of the message. Instead the string "cafÃ©" is inserted.
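
To make the difference concrete, here is a small sketch of the two
ways the octets behind %C3%A9 can be interpreted. It uses only
built-in functions; the my/ names are made up for illustration and
are not part of Emacs.

```elisp
;; The escapes %C3%A9 name two octets (hypothetical constant).
(defconst my/octets '(#xC3 #xA9))

(defun my/one-char-per-octet (octets)
  "Treat each element of OCTETS as a character code of its own."
  (apply #'string octets))

(defun my/octets-as-utf-8 (octets)
  "Treat OCTETS as one UTF-8 byte sequence and decode it."
  (decode-coding-string (apply #'unibyte-string octets) 'utf-8))

(my/one-char-per-octet my/octets)  ; two characters: U+00C3 U+00A9
(my/octets-as-utf-8 my/octets)     ; one character:  U+00E9
```

The first form gives the two-character result reported here, the
second gives the single "é" that RFC6068 calls for.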

I am running Ubuntu 18.04.3 LTS.

M-x emacs-version returns:
  "GNU Emacs 26.3 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.30) of 2019-09-16" 

$ locale -a | grep -i utf
C.UTF-8
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IL.utf8
en_IN.utf8
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM.utf8
en_ZW.utf8
nb_NO.utf8

$ env | grep LC
LC_MEASUREMENT=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_TIME=nb_NO.utf8

$ env | grep LANG
LANG=en_US.UTF-8
GDM_LANG=en
NLS_LANG=NORWEGIAN_NORWAY.WE8ISO8859P1
LANGUAGE=en


--

Vennlig hilsen/Best regards
Vegard Vesterheim
Senior Software engineer
+47 48 11 98 98
vegard.vesterheim@uninett.no






* bug#39689: 26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
  2020-02-20 13:48 bug#39689: 26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding) Vegard Vesterheim via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-02-27 10:51 ` Robert Pluim
  2020-02-27 15:08   ` Robert Pluim
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Pluim @ 2020-02-27 10:51 UTC
  To: Vegard Vesterheim; +Cc: 39689, larsi

>>>>> On Thu, 20 Feb 2020 14:48:54 +0100, Vegard Vesterheim via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org> said:

    Vegard> Emacs does not seem to correctly handle UTF-8-Based Percent-Encoding as
    Vegard> illustrated in Chapter 6.2 from RFC6068.

    Vegard> The command 
    Vegard>    emacs -Q -l browse-url -eval '(browse-url-mail "mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9")'

    Vegard> should result in a message buffer with the string "café" inserted into the
    Vegard> body part of the message. Instead the string "cafÃ©" is inserted.

Yes, the assumption in rfc2368-unhexify-string is that percent
escaping is being done of ASCII characters.

epg--decode-percent-escape-as-utf-8 in epg.el does the
right thing, it could be renamed and moved. I think rfc2047 decoding
needs doing on the result as well. Lars, should I just stick these in
rfc2368.el but named something like rfc6068-unhexify-string and
rfc6068-decode-2047-string or something?
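
A rough sketch of those two steps, assuming url-unhex-string from
url-util.el is used to get at the raw octets and
rfc2047-decode-string from rfc2047.el handles any encoded-words.
The name my/mailto-decode-field is hypothetical; this is not the
function proposed above, only an illustration of the idea.

```elisp
(require 'url-util)   ; provides url-unhex-string
(require 'rfc2047)    ; provides rfc2047-decode-string

(defun my/mailto-decode-field (escaped)
  "Decode ESCAPED, a percent-encoded mailto header value.
Hypothetical helper.  ESCAPED is assumed to be pure ASCII, as a
well-formed URL should be."
  (let* (;; Step 1a: undo the percent escapes, keeping raw octets.
         ;; The second argument keeps encoded newlines as they are.
         (octets (url-unhex-string escaped t))
         ;; Step 1b: interpret those octets as UTF-8.
         (text (decode-coding-string octets 'utf-8)))
    ;; Step 2: expand RFC 2047 encoded-words, if there are any.
    (rfc2047-decode-string text)))

(my/mailto-decode-field "caf%C3%A9")
(my/mailto-decode-field "%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D")
```

Both calls should come out as "café": the first through the UTF-8
step alone, the second because the percent escapes hide the
encoded-word =?utf-8?Q?caf=C3=A9?=.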

Robert






* bug#39689: 26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
  2020-02-27 10:51 ` Robert Pluim
@ 2020-02-27 15:08   ` Robert Pluim
  2020-03-14 12:25     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Pluim @ 2020-02-27 15:08 UTC
  To: Vegard Vesterheim; +Cc: 39689, larsi

>>>>> On Thu, 27 Feb 2020 11:51:35 +0100, Robert Pluim <rpluim@gmail.com> said:

    Robert> >>>>> On Thu, 20 Feb 2020 14:48:54 +0100, Vegard Vesterheim via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org> said:

    Vegard> Emacs does not seem to correctly handle UTF-8-Based Percent-Encoding as
    Vegard> illustrated in Chapter 6.2 from RFC6068.

    Vegard> The command 
    Vegard> emacs -Q -l browse-url -eval '(browse-url-mail "mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9")'

    Vegard> should result in a message buffer with the string "café" inserted into the
    Vegard> body part of the message. Instead the string "cafÃ©" is inserted.

    Robert> Yes, the assumption in rfc2368-unhexify-string is that percent
    Robert> escaping is being done of ASCII characters.

    Robert> epg--decode-percent-escape-as-utf-8 in epg.el does the
    Robert> right thing, it could be renamed and moved. I think rfc2047 decoding
    Robert> needs doing on the result as well. Lars, should I just stick these in
    Robert> rfc2368.el but named something like rfc6068-unhexify-string and
    Robert> rfc6068-decode-2047-string or something?

Oh, and thereʼs another version in gnus-util, and one in url, and an
almost compatible one in org [1]. The gnus and url ones suffer from
this same issue, although they both return different wrong results :-)

At least the epg, rfc2368, gnus, and the url versions look like they
can be unified. Not sure where to put them though.

Footnotes:
[1]  It supports % representation of the UTF-8 encoding of chars, but
     also of the unicode code point of chars, so eg %E1 gets turned
     into á. Iʼm sure thereʼs some historical reason for that.







* bug#39689: 26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
  2020-02-27 15:08   ` Robert Pluim
@ 2020-03-14 12:25     ` Lars Ingebrigtsen
  2020-10-25 11:41       ` Vegard Vesterheim via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-08-30  0:05       ` Lars Ingebrigtsen
  0 siblings, 2 replies; 6+ messages in thread
From: Lars Ingebrigtsen @ 2020-03-14 12:25 UTC
  To: Robert Pluim; +Cc: 39689, Vegard Vesterheim

Robert Pluim <rpluim@gmail.com> writes:

> Oh, and thereʼs another version in gnus-util, and one in url, and an
> almost compatible one in org [1]. The gnus and url ones suffer from
> this same issue, although they both return different wrong results :-)

:-)

> At least the epg, rfc2368, gnus, and the url versions look like they
> can be unified. Not sure where to put them though.

Putting them in either rfc2368 or url.el would make sense.  Hm...
perhaps url-util.el?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#39689: 26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
  2020-03-14 12:25     ` Lars Ingebrigtsen
@ 2020-10-25 11:41       ` Vegard Vesterheim via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-08-30  0:05       ` Lars Ingebrigtsen
  1 sibling, 0 replies; 6+ messages in thread
From: Vegard Vesterheim via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-10-25 11:41 UTC
  To: Lars Ingebrigtsen; +Cc: 39689, Robert Pluim

On Sat, 14 Mar 2020 13:25:49 +0100 Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Robert Pluim <rpluim@gmail.com> writes:
>
>> Oh, and thereʼs another version in gnus-util, and one in url, and an
>> almost compatible one in org [1]. The gnus and url ones suffer from
>> this same issue, although they both return different wrong results :-)
>
> :-)
>
>> At least the epg, rfc2368, gnus, and the url versions look like they
>> can be unified. Not sure where to put them though.
>
> Putting them in either rfc2368 or url.el would make sense.  Hm...
> perhaps url-util.el?

I am assuming this bug is not yet fixed. Can anyone advise on a
workaround I can apply for this bug? I am using Emacs 26.1 (as packaged
in Debian buster).
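
One possible stopgap, untested against browse-url-mail itself, would
be to override the function named earlier in this thread with a
UTF-8-aware version in the init file. This is a sketch only: it
assumes rfc2368-unhexify-string is the only place that needs to
change and that the URL is pure ASCII. The name
my/rfc2368-unhexify-utf-8 is made up.

```elisp
(require 'rfc2368)    ; defines rfc2368-unhexify-string
(require 'url-util)   ; provides url-unhex-string

(defun my/rfc2368-unhexify-utf-8 (string)
  "Percent-decode STRING and interpret the octets as UTF-8.
Hypothetical replacement for `rfc2368-unhexify-string'."
  ;; url-unhex-string returns the raw octets; the second argument
  ;; keeps encoded newlines instead of turning them into spaces.
  (decode-coding-string (url-unhex-string string t) 'utf-8))

;; Route every call to the original function through the replacement.
(advice-add 'rfc2368-unhexify-string
            :override #'my/rfc2368-unhexify-utf-8)
```

The override can be taken out again with
(advice-remove 'rfc2368-unhexify-string #'my/rfc2368-unhexify-utf-8).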

-- 
Vennlig hilsen/Best regards
Vegard Vesterheim
Senior Software engineer
+47 48 11 98 98
vegard.vesterheim@uninett.no






* bug#39689: 26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
  2020-03-14 12:25     ` Lars Ingebrigtsen
  2020-10-25 11:41       ` Vegard Vesterheim via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-08-30  0:05       ` Lars Ingebrigtsen
  1 sibling, 0 replies; 6+ messages in thread
From: Lars Ingebrigtsen @ 2021-08-30  0:05 UTC
  To: Robert Pluim; +Cc: 39689, Vegard Vesterheim

Lars Ingebrigtsen <larsi@gnus.org> writes:

>> At least the epg, rfc2368, gnus, and the url versions look like they
>> can be unified. Not sure where to put them though.
>
> Putting them in either rfc2368 or url.el would make sense.  Hm...
> perhaps url-util.el?

I made a new file, obsoleted rfc2368, and adjusted browse-url and epg.
So the original reported problem should now be fixed in Emacs 28.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no







Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
