unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#36852: 27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly
From: Štěpán Němec @ 2019-07-30  9:16 UTC
  To: 36852


ietf-drums-parse-address (AKA mail-header-parse-address) uses
ietf-drums-atext-token to parse display-name, but the regexp range only
contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
the following happens:

  (mail-header-parse-address
   (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8))

  ;;=> ("aaa@example.net" . "aááá")

It actually only cares about the first char of a word:

  (let ((ietf-drums-atext-token "-ÁŮ^a-zA-Z0-9!#$%&'*+/=?_`{|}~"))
    (mail-header-parse-address
     (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8)))

  ;;=> ("aaa@example.net" . "Áaááá Ůůůůů")
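
The two results above are consistent with a parser that tests only the
character at point against the token class and then takes a whole word.
The following is a minimal sketch of that behaviour, not the actual
ietf-drums.el code; `my-sketch-collect-words' is a hypothetical name,
and the sketch uses the default syntax table of a temporary buffer
rather than whatever the real parser sets up.

```elisp
;; Hypothetical simplification, NOT the real ietf-drums.el code: test only
;; the character at point against the ATEXT class, then take a whole word.
(defun my-sketch-collect-words (string atext)
  "Return the words of STRING whose first character is in ATEXT.
ATEXT is the inside of a regexp character class."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((case-fold-search nil)
          (class (concat "[" atext "]"))
          words)
      (while (not (eobp))
        (if (looking-at class)
            ;; The first char is acceptable: grab the whole word,
            ;; whatever characters follow it.
            (push (buffer-substring (point)
                                    (progn (forward-sexp 1) (point)))
                  words)
          ;; The first char is not in the class: drop it and move on.
          (forward-char 1)))
      (nreverse words))))

(my-sketch-collect-words "Áaááá Ůůůůů" "-^a-zA-Z0-9!#$%&'*+/=?_`{|}~")
;; => ("aááá")
(my-sketch-collect-words "Áaááá Ůůůůů" "-ÁŮ^a-zA-Z0-9!#$%&'*+/=?_`{|}~")
;; => ("Áaááá" "Ůůůůů")
```

In the first call "Á" is dropped and "aááá" is taken as a word, while
every character of "Ůůůůů" is dropped one by one.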

I'm not quite sure what the proper fix is, as the ASCII-only thing seems
to be intentional. Maybe it's just not supposed to be used the way it is
used in debbugs-gnu.el?






* bug#36852: 27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly
From: Robert Pluim @ 2019-07-30  9:53 UTC
  To: Štěpán Němec; +Cc: 36852

>>>>> On Tue, 30 Jul 2019 11:16:53 +0200, Štěpán Němec <stepnem@gmail.com> said:

    Štěpán> ietf-drums-parse-address (AKA mail-header-parse-address) uses
    Štěpán> ietf-drums-atext-token to parse display-name, but the regexp range only
    Štěpán> contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
    Štěpán> the following happens:

    Štěpán>   (mail-header-parse-address
    Štěpán>    (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8))

    Štěpán>   ;;=> ("aaa@example.net" . "aááá")

    Štěpán> It actually only cares about the first char of a word:

    Štěpán>   (let ((ietf-drums-atext-token "-ÁŮ^a-zA-Z0-9!#$%&'*+/=?_`{|}~"))
    Štěpán>     (mail-header-parse-address
    Štěpán>      (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8)))

    Štěpán>   ;;=> ("aaa@example.net" . "Áaááá Ůůůůů")

    Štěpán> I'm not quite sure what the proper fix is, as the ASCII-only thing seems
    Štěpán> to be intentional. Maybe it's just not supposed to be used the way it is
    Štěpán> used in debbugs-gnu.el?

Mail headers are defined to be ASCII-only, although as Iʼve just
discovered, Gmail undoes Gnus' perfectly formatted RFC 2047 encoding
and replaces it with UTF-8 characters. Bad Google, bad.

Perhaps mail-header-parse-address could just discard the complete
display string if it finds a non-ascii char? That would at least
prevent it from propagating.
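
A rough sketch of that idea, written as a wrapper instead of a change
to the parser itself. `my-sketch-parse-address-ascii-only' is a
hypothetical name, and checking the whole input string (not just the
display-name part) is a simplifying assumption.

```elisp
(require 'mail-parse)

;; Hypothetical wrapper, not part of Emacs: keep the parsed address but
;; throw away the display name when the input has any non-ASCII char.
(defun my-sketch-parse-address-ascii-only (string)
  "Parse STRING like `mail-header-parse-address', ASCII names only.
If STRING contains a non-ASCII character, the name part of the
returned (ADDRESS . NAME) pair is nil."
  (let ((parsed (mail-header-parse-address string)))
    (if (string-match-p "[^[:ascii:]]" string)
        (cons (car parsed) nil)
      parsed)))

(my-sketch-parse-address-ascii-only "Áaááá Ůůůůů <aaa@example.net>")
;; => ("aaa@example.net")
```

A pure-ASCII header goes through unchanged, so callers that already
receive properly encoded headers would see no difference.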

Robert






* bug#36852: 27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly
From: Lars Ingebrigtsen @ 2019-09-15 12:00 UTC
  To: Štěpán Němec; +Cc: 36852

Štěpán Němec <stepnem@gmail.com> writes:

> ietf-drums-parse-address (AKA mail-header-parse-address) uses
> ietf-drums-atext-token to parse display-name, but the regexp range only
> contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
> the following happens:
>
>   (mail-header-parse-address
>    (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8))
>
>   ;;=> ("aaa@example.net" . "aááá")

That's not a valid email address, so perhaps `ietf-drums-parse-address'
should return a blank string as the name here...  On the other hand,
when called on something that's not an email address (which
debbugs-gnu does here), that function should probably be free to
return whatever.

> I'm not quite sure what the proper fix is, as the ASCII-only thing seems
> to be intentional. Maybe it's just not supposed to be used the way it is
> used in debbugs-gnu.el?

Indeed.  I've now changed debbugs-gnu to split the "OCTETS
<MORE-OCTETS>" string returned by the debbugs web server correctly.
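
For illustration only, splitting such a string without going through
the RFC address parser could look like the sketch below. This is not
the actual debbugs-gnu change; `my-sketch-split-name-and-address' is a
hypothetical name, and the regexp assumes a single trailing
"<...>" part.

```elisp
;; Hypothetical helper, NOT the actual debbugs-gnu code: split a
;; "NAME <ADDRESS>" string without applying any RFC atext rules.
(defun my-sketch-split-name-and-address (string)
  "Split STRING of the form \"NAME <ADDRESS>\" into (ADDRESS . NAME).
NAME is nil when empty.  If STRING has no trailing <...> part,
return (STRING)."
  (if (string-match
       "\\`[ \t]*\\(.*?\\)[ \t]*<\\([^<>]*\\)>[ \t]*\\'" string)
      (let ((name (match-string 1 string))
            (address (match-string 2 string)))
        (cons address (if (string= name "") nil name)))
    (cons string nil)))

(my-sketch-split-name-and-address "Áaááá Ůůůůů <aaa@example.net>")
;; => ("aaa@example.net" . "Áaááá Ůůůůů")
```

Because the name part is matched with "." and not with an ASCII-only
class, non-ASCII characters in it are kept as they are.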

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





