all messages for Emacs-related lists mirrored at yhetil.org
* bug#19431: 24.4; Bad handling of RFC2047 encoded headers by 'mail-extract-address-components'
@ 2014-12-22 15:10 Enrico Scholz
  2018-04-15 17:51 ` Lars Ingebrigtsen
From: Enrico Scholz @ 2014-12-22 15:10 UTC
  To: 19431

Hi,

the Emacs mail framework fails on email addresses containing umlauts,
e.g. in the following example:

--- {{{ snip ---
; set a (nearly) real-world To: address; the umlaut '=C3=A4' encoding
; was replaced by '=61=65'
(let* ((address "=?utf-8?Q?B=61=65r=2C_Klaus?= <test@example.com>")
       (decoded (rfc2047-decode-string address)))
  ; show output with encoded umlauts and non-RFC2047 header
  (print (mail-extract-address-components "\"Baer, Klaus\" <test@example.com>"))
  (print address t)
  (print decoded t)
  ; previous prints were just for debugging purposes; now, the real
  ; functions will be called...
  (print (mail-extract-address-components address))
  (print (mail-extract-address-components decoded)))
--- }}} snip ---

Neither of the last two outputs shows the expected split.

| ("Klaus Baer" "test@example.com")            <--- this is expected
| 
| "=?utf-8?Q?B=61=65r=2C_Klaus?= <test@example.com>"
| 
| "Baer, Klaus <test@example.com>"
| 
| ("utf" "test@example.com")   <-- BAD (working on undecoded string)
| 
| (nil "Baer")                 <-- BAD (working on decoded string)
| (nil "Baer")


Unfortunately, such RFC2047 encoded addresses are very common in Germany,
so e.g. BBDB (which works on the 'decoded' string) fails in very many
cases.



Enrico






* bug#19431: 24.4; Bad handling of RFC2047 encoded headers by 'mail-extract-address-components'
From: Lars Ingebrigtsen @ 2018-04-15 17:51 UTC
  To: Enrico Scholz; +Cc: 19431

Enrico Scholz <enrico.scholz@sigma-chemnitz.de> writes:

> the emacs email framework fails on email addresses containing umlauts.
> E.g. in the following example
>
> --- {{{ snip ---
> ; set a (nearly) real-world To: address; the umlaut '=C3=A4' encoding
> ; was replaced by '=61=65'
> (let* ((address "=?utf-8?Q?B=61=65r=2C_Klaus?= <test@example.com>")
>        (decoded (rfc2047-decode-string address)))
>   ; show output with encoded umlauts and non-RFC2047 header
>   (print (mail-extract-address-components "\"Baer, Klaus\" <test@example.com>"))
>   (print address t)
>   (print decoded t)
>   ; previous prints were just for debugging purposes; now, the real
>   ; functions will be called...
>   (print (mail-extract-address-components address))
>   (print (mail-extract-address-components decoded)))

Yes, that's a very confusing and not very useful function.  I've now
updated the doc string to point to `mail-header-parse-address', which is
the function that should be used to parse address headers and which
also does the right thing on German addresses.

I don't think it's worth trying to fix the mess that is
`mail-extract-address-components'.
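
For reference, a minimal sketch of what the suggested call could look
like.  It assumes that `mail-header-parse-address' (an alias set up by
mail-parse.el) returns a (MAILBOX . DISPLAY-NAME) pair or nil, and that
the raw, still-encoded header is parsed first and only the display name
is decoded afterwards.  The helper name `my/split-address-header' is
hypothetical and not part of Emacs.

```elisp
(require 'mail-parse)   ; sets up the `mail-header-parse-address' alias
(require 'rfc2047)      ; provides `rfc2047-decode-string'

;; Hypothetical helper, for illustration only.
(defun my/split-address-header (raw)
  "Split the raw address header RAW into (DISPLAY-NAME MAILBOX).
RAW should still be RFC2047 encoded.  Return nil if the parser
does not yield a (MAILBOX . DISPLAY-NAME) pair."
  (let ((parsed (mail-header-parse-address raw)))
    (when (consp parsed)
      (list
       ;; Decode the display name only after the address was split off.
       (and (cdr parsed) (rfc2047-decode-string (cdr parsed)))
       (car parsed)))))
```

Evaluating
(my/split-address-header "=?utf-8?Q?B=61=65r=2C_Klaus?= <test@example.com>")
shows what the parser makes of the header from the report.  Parsing
before decoding is meant to keep a decoded comma from being read as an
address separator.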

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





