* Bug in mail-extract-address-components (mail-extr.el)?
From: Reiner Steib @ 2002-09-23 12:25 UTC
In Gnus, I use a `message-citation-line-function' that extracts the
full name of the previous poster with the function
`mail-extract-address-components' [1]:
,----[ C-h f mail-extract-address-components RET ]
| mail-extract-address-components is a compiled Lisp function in `mail-extr'.
| (mail-extract-address-components ADDRESS &optional ALL)
|
| Given an RFC-822 address ADDRESS, extract full name and canonical address.
| Returns a list of the form (FULL-NAME CANONICAL-ADDRESS).
| If no name can be extracted, FULL-NAME will be nil.
| [...]
`----
Recently I noticed that the function fails for the following From:
line (which seems to be correct according to RFC-822):
| From: "Harald H.-J. Bongartz" <bongie@gmx.net>
Instead of "Harald H.-J. Bongartz" I get "Harald H.":
ELISP> (require 'mail-extr)
mail-extr
ELISP> (setq email "\"Harald H.-J. Bongartz\" <bongie@gmx.net>")
"\"Harald H.-J. Bongartz\" <bongie@gmx.net>"
ELISP> (setq data (mail-extract-address-components email))
("Harald H." "bongie@gmx.net")
ELISP> (car data)
"Harald H."
The error is reproducible with Emacs 21.1 and Emacs from CVS (last
week). The problem seems to be the "-":
ELISP> (car (mail-extract-address-components
             "\"Harald H. J. Bongartz\" <bongie@gmx.net>"))
"Harald H. J. Bongartz"
Is this a bug in `mail-extract-address-components' or should I use a
different function to get the full name?
Bye, Reiner.
[1] My function is based on a suggestion of François Fleuret in
news:<s02pu3weovk.fsf@wasabi.inria.fr>
--
,,,
(o o)
---ooO-(_)-Ooo--- PGP key available via WWW http://rsteib.home.pages.de/
* Re: Bug in mail-extract-address-components (mail-extr.el)?
From: Jesper Harder @ 2002-09-23 12:57 UTC
Reiner Steib <4uce.02.r.steib@gmx.net> writes:
> Recently I noticed that the function fails for the following From:
> line (which seems to be correct according to RFC-822):
>
> | From: "Harald H.-J. Bongartz" <bongie@gmx.net>
>
> Instead of "Harald H.-J. Bongartz" I get "Harald H.":
>
> Is this a bug in `mail-extract-address-components' or should I use a
> different function to get the full name?
In this particular case `gnus-extract-address-components' works better:
(gnus-extract-address-components "\"Harald H.-J. Bongartz\" <bongie@gmx.net>")
==> ("Harald H.-J. Bongartz" "bongie@gmx.net")
But usually `mail-extract-address-components' is more reliable (but also
really complicated).
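
A minimal sketch of how this could be plugged into a citation line,
assuming `message-reply-headers' is bound as usual while replying. The
function name `my-message-insert-citation-line' is hypothetical, and
this is not the function from the original post, only an illustration:

```elisp
(require 'gnus-util)  ; gnus-extract-address-components
(require 'nnheader)   ; mail-header-from
(require 'message)    ; message-reply-headers, message-citation-line-function

;; Hypothetical helper, named here for illustration only.
(defun my-message-insert-citation-line ()
  "Insert a citation line that prefers the poster's full name."
  (when message-reply-headers
    (let* ((from (mail-header-from message-reply-headers))
           (parts (gnus-extract-address-components from))
           ;; Fall back to the bare address, then to the raw From
           ;; header, when no full name could be extracted.
           (name (or (car parts) (nth 1 parts) from)))
      (insert name " writes:\n\n"))))

(setq message-citation-line-function 'my-message-insert-citation-line)
```

The fallback chain is there because the full name can be nil for
addresses that carry no name part.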
* Re: Bug in mail-extract-address-components (mail-extr.el)?
From: Reiner Steib @ 2002-09-23 14:29 UTC
On Mon, Sep 23 2002, Jesper Harder wrote:
> Reiner Steib <4uce.02.r.steib@gmx.net> writes:
[...]
>> Instead of "Harald H.-J. Bongartz" I get "Harald H.":
>>
>> Is this a bug in `mail-extract-address-components' or should I use a
>> different function to get the full name?
>
> In this particular case `gnus-extract-address-components' works better:
[...]
> ==> ("Harald H.-J. Bongartz" "bongie@gmx.net")
Thanks for the hint!
> But usually `mail-extract-address-components' is more reliable (but also
> really complicated).
The code of mail-e-a-c spans more than 700 lines, whereas gnus-e-a-c
has only 27 lines. Therefore it's even more surprising that mail-e-a-c
fails for the given example (assuming it's a valid RFC-822 address),
which probably occurs quite often in real life [1]. mail-e-a-c also
fails for this:
(car (mail-extract-address-components "\"K.-H. Foo\" <foo@bar.invalid>"))
==> nil
Bye, Reiner.
[1] At least in Germany such names are not so rare: Abbreviated forms
of Karl-Heinz, ...
--
,,,
(o o)
---ooO-(_)-Ooo--- PGP key available via WWW http://rsteib.home.pages.de/
* Re: Bug in mail-extract-address-components (mail-extr.el)?
From: lawrence mitchell @ 2002-09-23 14:48 UTC
[...] mail-e-a-c vs gnus-e-a-c.
Jesper Harder commented:
>> But usually `mail-extract-address-components' is more reliable (but also
>> really complicated).
To which Reiner Steib responded:
> The code of mail-e-a-c spans more than 700 lines, whereas gnus-e-a-c
> has only 27 lines. Therefore it's even more surprising that mail-e-a-c
> fails for the given example (assuming it's a valid RFC-822 address),
> which probably occurs quite often in real life [1]. mail-e-a-c also
> fails for this:
> (car (mail-extract-address-components "\"K.-H. Foo\" <foo@bar.invalid>"))
> ==> nil
mail-e-a-c also fails when the name/comment part of the email
address is a single word:
(mail-extract-address-components "lawrence <foo@bar.com>")
=> (nil "foo@bar.com")
Which, by my reading of RFC2822, is a valid address form (ICBW).
Time for a bug report I wonder?
--
lawrence mitchell <wence@gmx.li>
* Re: Bug in mail-extract-address-components (mail-extr.el)?
From: Simon Josefsson @ 2002-09-24 21:30 UTC
lawrence mitchell <wence@gmx.li> writes:
> [...] mail-e-a-c vs gnus-e-a-c.
>
> Jesper Harder commented:
>>> But usually `mail-extract-address-components' is more reliable (but also
>>> really complicated).
>
> To which Reiner Steib responded:
>> The code of mail-e-a-c spans more than 700 lines, whereas gnus-e-a-c
>> has only 27 lines. Therefore it's even more surprising that mail-e-a-c
>> fails for the given example (assuming it's a valid RFC-822 address),
>> which probably occurs quite often in real life [1]. mail-e-a-c also
>> fails for this:
>
>> (car (mail-extract-address-components "\"K.-H. Foo\" <foo@bar.invalid>"))
>> ==> nil
>
> mail-e-a-c also fails when the name/comment part of the email
> address is a single word:
>
> (mail-extract-address-components "lawrence <foo@bar.com>")
> => (nil "foo@bar.com")
>
> Which, by my reading of RFC2822, is a valid address form (ICBW).
This is a feature, see `mail-extr-ignore-single-names'. I think the
default value is a bad choice though.
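
For illustration, a small sketch of turning that behaviour off for a
single call by let-binding the variable; whether the returned name is
re-capitalized is left open here, so no result is shown:

```elisp
(require 'mail-extr)

;; With `mail-extr-ignore-single-names' bound to nil, a one-word
;; name part is no longer discarded by the heuristics.
(let ((mail-extr-ignore-single-names nil))
  (mail-extract-address-components "lawrence <foo@bar.com>"))
```

To make this permanent, (setq mail-extr-ignore-single-names nil) in
the init file should have the same effect.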
* Re: Bug in mail-extract-address-components (mail-extr.el)?
From: Simon Josefsson @ 2002-09-24 22:06 UTC
Reiner Steib <4uce.02.r.steib@gmx.net> writes:
> Recently I noticed that the function fails for the following From:
> line (which seems to be correct according to RFC-822):
>
> | From: "Harald H.-J. Bongartz" <bongie@gmx.net>
>
> Instead of "Harald H.-J. Bongartz" I get "Harald H.":
Yes, mail-extr.el does (too) many things. The code that fails in this
example is:
    ;; Fixup initials
    ((looking-at mail-extr-initial-pattern)
     (or (eq (following-char) (upcase (following-char)))
         (setq lower-case-flag t))
     (forward-char 1)
     (if (eq ?. (following-char))
         (forward-char 1)
       (insert ?.))
     (or (eq ?\ (following-char))
         (insert ?\ ))
     (setq word-found-flag t))
> Is this a bug in `mail-extract-address-components' or should I use a
> different function to get the full name?
mail-extr is not a clean RFC 2822 parser, it is a heuristic parser.
There is no complete RFC 2822 parser in Emacs AFAIK, only several
heuristic ones.
A real RFC 2822 parser would be good to have; it would improve Gnus'
header encoding, which sometimes generates bad QP that causes mail to be
bounced...