* bug#56422: 29.0.50; mail-extract-address-components poorly handles " via " addresses
From: Sam Steingold @ 2022-07-06 15:22 UTC
To: 56422
Often "From" email addresses in mailing lists look like this:
--8<---------------cut here---------------start------------->8---
From: Po Lu via "Emacs development discussions." <emacs-devel@gnu.org>
From: carlmarcos--- via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>
From: Stefan Monnier via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>
--8<---------------cut here---------------end--------------->8---
`mail-extract-address-components' handles them poorly:
--8<---------------cut here---------------start------------->8---
(mail-extract-address-components "Po Lu via \"Emacs development discussions.\" <emacs-devel@gnu.org>")
=> ("Po Lu via" "emacs-devel@gnu.org")
(mail-extract-address-components "carlmarcos--- via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>")
=> ("carlmarcos" "help-gnu-emacs@gnu.org")
(mail-extract-address-components "Stefan Monnier via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>")
=> ("Stefan Monnier via Users list for the" "help-gnu-emacs@gnu.org")
--8<---------------cut here---------------end--------------->8---
The correct handling would be
--8<---------------cut here---------------start------------->8---
(mail-extract-address-components "Po Lu via \"Emacs development discussions.\" <emacs-devel@gnu.org>")
=> ("Emacs development discussions." "emacs-devel@gnu.org")
(mail-extract-address-components "carlmarcos--- via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>")
=> ("Users list for the GNU Emacs text editor" "help-gnu-emacs@gnu.org")
(mail-extract-address-components "Stefan Monnier via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>")
=> ("Users list for the GNU Emacs text editor" "help-gnu-emacs@gnu.org")
--8<---------------cut here---------------end--------------->8---
or, at least,
--8<---------------cut here---------------start------------->8---
(mail-extract-address-components "Po Lu via \"Emacs development discussions.\" <emacs-devel@gnu.org>")
=> ("Po Lu via \"Emacs development discussions.\"" "emacs-devel@gnu.org")
(mail-extract-address-components "carlmarcos--- via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>")
=> ("carlmarcos--- via Users list for the GNU Emacs text editor" "help-gnu-emacs@gnu.org")
(mail-extract-address-components "Stefan Monnier via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>")
=> ("Stefan Monnier via Users list for the GNU Emacs text editor" "help-gnu-emacs@gnu.org")
--8<---------------cut here---------------end--------------->8---
Please see the relevant discussion on the BBDB user list:
https://lists.nongnu.org/archive/html/bbdb-user/2022-06/msg00000.html
https://lists.nongnu.org/archive/html/bbdb-user/2022-07/msg00000.html
In https://lists.nongnu.org/archive/html/bbdb-user/2022-07/msg00006.html
I propose a workaround for a _single_ address (i.e., when the second
argument to `mail-extract-address-components' is nil):
--8<---------------cut here---------------start------------->8---
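(defun mail-extract-handle-via (args)
  "Strip a leading \"NAME via \" from a single address in ARGS.
ARGS is the argument list of `mail-extract-address-components'."
  (let ((address (car args))
        (all (cadr args)))
    ;; Only rewrite when ALL is nil, i.e. ADDRESS holds a single address.
    (if (and (null all)
             (string-match " via \\(.*\\)$" address))
        (list (match-string 1 address) nil)
      (list address all))))
(advice-add 'mail-extract-address-components :filter-args #'mail-extract-handle-via)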
--8<---------------cut here---------------end--------------->8---
This is clearly suboptimal, especially when the "name" part of the
address contains many words:
--8<---------------cut here---------------start------------->8---
(mail-extract-address-components "Stefan Monnier via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>")
=> ("Users list for the" "help-gnu-emacs@gnu.org")
--8<---------------cut here---------------end--------------->8---
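To see what the advice hands to `mail-extract-address-components', the same rewrite can be sketched as a standalone function and called directly. The name `my-strip-via-prefix' is hypothetical (it is not part of Emacs or BBDB), and this sketch anchors the match at the end of the string with \' rather than $:

```elisp
;; Hypothetical helper for illustration only; not part of Emacs or BBDB.
(defun my-strip-via-prefix (address)
  "Return ADDRESS without a leading \"NAME via \" prefix.
If ADDRESS contains no \" via \", return it unchanged."
  (if (string-match " via \\(.*\\)\\'" address)
      ;; Keep only what follows the first \" via \".
      (match-string 1 address)
    address))

(my-strip-via-prefix
 "Po Lu via \"Emacs development discussions.\" <emacs-devel@gnu.org>")
;; => "\"Emacs development discussions.\" <emacs-devel@gnu.org>"

(my-strip-via-prefix "Sam Steingold <sds@gnu.org>")
;; => "Sam Steingold <sds@gnu.org>"
```

Calling such a helper at the call site, rather than advising globally, leaves other callers of `mail-extract-address-components' unaffected. The truncation of long names shown above still applies to whatever string is passed on.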
It appears that `mail-header-parse-address' is a better choice (as per
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=10406) but you might still
consider addressing this issue...
Thank you.
In GNU Emacs 29.0.50 (build 2, x86_64-apple-darwin21.5.0, NS appkit-2113.50 Version 12.4 (Build 21F79))
of 2022-07-05 built on 3c22fb11fdab.ant.amazon.com
Repository revision: 59276ff81d1ab391f4e3cd91f3070a12c51a3507
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description: macOS 12.4
--
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.2113
http://childpsy.net http://calmchildstories.com http://steingoldpsychology.com
https://www.peaceandtolerance.org/ https://honestreporting.com
The past is gone, the present is ephemeral, the future is a guess.
From: Lars Ingebrigtsen @ 2022-07-07 9:10 UTC
To: Sam Steingold; +Cc: 56422
Sam Steingold <sds@gnu.org> writes:
> The correct handling would be
>
> (mail-extract-address-components "Po Lu via \"Emacs development discussions.\" <emacs-devel@gnu.org>")
> => ("Emacs development discussions." "emacs-devel@gnu.org")
mail-extract-address-components is a DWIM-ish thing that doesn't have
much documented behaviour -- it just tries to make things "pretty" by
applying lots of (mostly misguided) heuristics. So talking about
"correct" here isn't, er, correct.
If you have an RFC822bis From header, you should use
`mail-header-parse-address'. If you have something that's vaguely like
a mail header and want to split it, use `mail-header-parse-address-lax'.
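A minimal sketch of that suggestion, assuming the header value is available as a plain string. `mail-header-parse-address' is provided by mail-parse.el and is expected to return a cons of the form (ADDRESS . NAME). `mail-header-parse-address-lax' appears to exist only in recent Emacs versions, so the call is guarded with `fboundp'. No output is shown because the results were not verified here:

```elisp
(require 'mail-parse)

(let ((from "Stefan Monnier via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>"))
  (list
   ;; Strict parse of a single address in RFC syntax.
   (mail-header-parse-address from)
   ;; Heuristic parse; guarded because older Emacs versions lack it.
   (when (fboundp 'mail-header-parse-address-lax)
     (mail-header-parse-address-lax from))))
```

Neither call separates the person from the list name around " via "; they only split the address from the display name.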
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
From: Lars Ingebrigtsen @ 2022-07-11 10:55 UTC
To: Sam Steingold; +Cc: 56422
Lars Ingebrigtsen <larsi@gnus.org> writes:
> If you have an RFC822bis From header, you should use
> `mail-header-parse-address'. If you have something that's vaguely like
> a mail header and want to split it, use `mail-header-parse-address-lax'.
So I don't think there's anything to fix here, and I'm closing this bug
report.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no