* bug#36852: 27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly
@ 2019-07-30 9:16 Štěpán Němec
2019-07-30 9:53 ` Robert Pluim
2019-09-15 12:00 ` Lars Ingebrigtsen
0 siblings, 2 replies; 3+ messages in thread
From: Štěpán Němec @ 2019-07-30 9:16 UTC (permalink / raw)
To: 36852
ietf-drums-parse-address (AKA mail-header-parse-address) uses
ietf-drums-atext-token to parse display-name, but the regexp range only
contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
the following happens:
  (mail-header-parse-address
   (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8))
  ;;=> ("aaa@example.net" . "aááá")
It actually only cares about the first char of a word:
  (let ((ietf-drums-atext-token "-ÁŮ^a-zA-Z0-9!#$%&'*+/=?_`{|}~"))
    (mail-header-parse-address
     (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8)))
  ;;=> ("aaa@example.net" . "Áaááá Ůůůůů")
I'm not quite sure what the proper fix is, as the ASCII-only thing seems
to be intentional. Maybe it's just not supposed to be used the way it is
used in debbugs-gnu.el?
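The two results above are consistent with a scanner that tests only the character at point against the token set and then consumes a whole word. The toy model below illustrates that reading. `my-toy-display-name` and its ATEXT argument are hypothetical names, and this is a sketch under that assumption, not the actual ietf-drums code.

```elisp
;; Toy model only: `my-toy-display-name' is hypothetical and does not
;; reproduce the real ietf-drums parser.
(defun my-toy-display-name (string atext)
  "Collect words from STRING whose first character is in the list ATEXT.
Scanning stops at the first `<'.  A character not in ATEXT is skipped;
a character in ATEXT starts a word consumed with `forward-word'."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let (words)
      (while (and (not (eobp)) (not (eq (char-after) ?<)))
        (if (memq (char-after) atext)
            ;; Only this first character is checked; the rest of the
            ;; word is taken regardless of what it contains.
            (push (buffer-substring (point)
                                    (progn (forward-word 1) (point)))
                  words)
          (forward-char 1)))
      (mapconcat #'identity (nreverse words) " "))))

(let ((ascii (string-to-list "abcdefghijklmnopqrstuvwxyz"))
      (input "Áaááá Ůůůůů <aaa@example.net>"))
  (list (my-toy-display-name input ascii)
        (my-toy-display-name input (append '(?Á ?Ů) ascii))))
;; With the default syntax table: ("aááá" "Áaááá Ůůůůů")
```

In the first call the leading "Á" is skipped and the word is picked up from "a"; every character of the second word is skipped because none of them is in the set.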
From: Robert Pluim @ 2019-07-30 9:53 UTC (permalink / raw)
To: Štěpán Němec; +Cc: 36852
>>>>> On Tue, 30 Jul 2019 11:16:53 +0200, Štěpán Němec <stepnem@gmail.com> said:
Štěpán> ietf-drums-parse-address (AKA mail-header-parse-address) uses
Štěpán> ietf-drums-atext-token to parse display-name, but the regexp range only
Štěpán> contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
Štěpán> the following happens:
Štěpán> (mail-header-parse-address
Štěpán>  (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8))
Štěpán> ;;=> ("aaa@example.net" . "aááá")
Štěpán> It actually only cares about the first char of a word:
Štěpán> (let ((ietf-drums-atext-token "-ÁŮ^a-zA-Z0-9!#$%&'*+/=?_`{|}~"))
Štěpán>   (mail-header-parse-address
Štěpán>    (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8)))
Štěpán> ;;=> ("aaa@example.net" . "Áaááá Ůůůůů")
Štěpán> I'm not quite sure what the proper fix is, as the ASCII-only thing seems
Štěpán> to be intentional. Maybe it's just not supposed to be used the way it is
Štěpán> used in debbugs-gnu.el?
Mail headers are defined to be ASCII-only, although, as Iʼve just
discovered, Gmail undoes Gnus' perfectly formatted RFC 2047 encoding
and replaces it with UTF-8 characters. Bad Google, bad.
Perhaps mail-header-parse-address could just discard the complete
display string if it finds a non-ascii char? That would at least
prevent it from propagating.
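A minimal sketch of that idea, written as a wrapper rather than as a change to the parser itself. `my-parse-address-ascii-only` is a hypothetical name, and checking the whole input string for non-ASCII characters is one assumed reading of the suggestion:

```elisp
(require 'mail-parse)

;; Hypothetical wrapper, not part of mail-parse.el or ietf-drums.el.
(defun my-parse-address-ascii-only (string)
  "Parse STRING with `mail-header-parse-address'.
If STRING contains any non-ASCII character, drop the display name
and return (ADDRESS . nil) instead of a partial name."
  (let ((parsed (mail-header-parse-address string)))
    (if (and parsed (string-match-p "[^[:ascii:]]" string))
        (cons (car parsed) nil)
      parsed)))

(my-parse-address-ascii-only "Áaááá Ůůůůů <aaa@example.net>")
```

For the example from the report, the intent is that the caller gets the address with no name at all rather than the fragment "aááá".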
Robert
From: Lars Ingebrigtsen @ 2019-09-15 12:00 UTC (permalink / raw)
To: Štěpán Němec; +Cc: 36852
Štěpán Němec <stepnem@gmail.com> writes:
> ietf-drums-parse-address (AKA mail-header-parse-address) uses
> ietf-drums-atext-token to parse display-name, but the regexp range only
> contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
> the following happens:
>
> (mail-header-parse-address
>  (decode-coding-string "Áaááá Ůůůůů <aaa@example.net>" 'utf-8))
>
> ;;=> ("aaa@example.net" . "aááá")
That's not a valid email address, so perhaps `ietf-drums-parse-address'
should return a blank string as the name here... On the other hand,
when that function is called on something that's not an email address
(which is what debbugs-gnu does here), it should probably be free to
return whatever.
> I'm not quite sure what the proper fix is, as the ASCII-only thing seems
> to be intentional. Maybe it's just not supposed to be used the way it is
> used in debbugs-gnu.el?
Indeed. I've now changed debbugs-gnu to split the "OCTETS
<MORE-OCTETS>" string returned by the debbugs web server correctly.
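For illustration, splitting such a string without going through the RFC 2822 parser could look roughly like the sketch below. `my-split-name-and-address` is a hypothetical helper and is not the code that went into debbugs-gnu; it returns the same (ADDRESS . NAME) shape as the examples above.

```elisp
;; Hypothetical helper, not the actual debbugs-gnu change.
(defun my-split-name-and-address (string)
  "Split STRING of the form \"NAME <ADDRESS>\" into (ADDRESS . NAME).
NAME is nil when the part before the angle brackets is empty.
Return (STRING . nil) if STRING has no trailing <...> part."
  (if (string-match "\\`\\(.*?\\)[ \t]*<\\([^<>]*\\)>[ \t]*\\'" string)
      (let ((name (match-string 1 string)))
        (cons (match-string 2 string)
              (if (string= name "") nil name)))
    (cons string nil)))

(my-split-name-and-address "Áaááá Ůůůůů <aaa@example.net>")
;; => ("aaa@example.net" . "Áaááá Ůůůůů")
```

Because the name part is never matched against a character set, non-ASCII names pass through unchanged.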
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no