From: Simon Josefsson <jas@extundo.com>
Cc: emacs-devel@gnu.org
Subject: Re: mail-extract-address-components extract modified full name
Date: Tue, 27 Jul 2004 18:28:09 +0200
Message-ID: <ilubri1ifdy.fsf@latte.josefsson.org>
In-Reply-To: <jwvy8l55ykz.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "27 Jul 2004 10:19:49 -0400")

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> like the approach you propose.  XEmacs users have reported even
>> Latin-2 problems with the current implementation (Emacs do not have
>> those problems, though, but it suggest the implementation could be
>> improved).
>
> [ The below is all "IIRC". ]
>
> The function is supposed to receive ASCII input, so it's no wonder it might
> break in other circumstances.  Why ASCII input?
>
> Because the way things are defined in the RFCs, you should split the address
> before doing the un-quoting of base64 and QP thingies.
> I.e. after unquoting, the string might not be parsable any more (because
> one of the QP chars could be a ", a \, a <, or something like that).
>
> So the usual answer is that if you call the function with non-ASCII input,
> you're not using it properly.  But of course, it's not that simple since you
> might want to call that function e.g. on an email message that is being
> written and that hasn't been QP-encoded yet.

I agree, and have been arguing the same thing when people complain
that mail-extr* cannot handle their weird input.
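
To make the ordering problem concrete, here is a minimal sketch in
Python rather than Emacs Lisp.  It is an illustration under stated
assumptions, not how mail-extr* works: `decode_q_words` and
`naive_split` are hypothetical helpers, only the "Q" encoding is
handled, and the comma split merely stands in for a real address-list
parser.

```python
import binascii
import re

# Matches an RFC 2047 encoded-word that uses the "Q" encoding.
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[Qq]\?([^?]*)\?=")


def decode_q_words(text):
    """Decode Q-encoded words only (B encoding omitted for brevity)."""
    def repl(match):
        charset, payload = match.group(1), match.group(2)
        # header=True makes "_" decode to a space, as RFC 2047 requires.
        raw = binascii.a2b_qp(payload.encode("ascii"), header=True)
        return raw.decode(charset)
    return ENCODED_WORD.sub(repl, text)


def naive_split(header):
    """Hypothetical stand-in for an address-list parser: split on commas."""
    return [part.strip() for part in header.split(",")]


# The display name contains an encoded comma (=2C).
raw = "=?ISO-8859-1?Q?Doe=2C_J=F6rg?= <jd@example.org>, alice@example.org"

# Correct order: split the ASCII form first, then decode each piece.
split_then_decode = [decode_q_words(p) for p in naive_split(raw)]

# Wrong order: decoding first exposes the comma to the splitter.
decode_then_split = naive_split(decode_q_words(raw))

print(split_then_decode)
print(decode_then_split)
```

Splitting first yields two addresses; decoding first yields three
fragments, because the decoded comma is indistinguishable from a
separator.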

Unfortunately, it is a losing argument: I can't claim that
mail-extr* is only intended for use with all-ASCII valid RFC 822
input, since that isn't what it implements.  It is just a big hack, and
could be massaged into behaving (badly) for any purpose.

One example is that BBDB reportedly uses mail-extr* to split the
e-mail addresses it stores locally, in ~/.bbdb, which naturally aren't
QP encoded.  This probably illustrates a class of applications that
deal with mail addresses but aren't proper mail readers or writers, so
it wouldn't make sense for them to use QP.

IMHO, there should be two packages:

1) Proper RFC (2)822 parser.  There is rfc822.el but it is
   insufficient, and I'm not sure it is correct -- it uses regexps a
   lot, but I recall that the "correct" 2822 grammar, expressed as
   regexps, is much more complex than what rfc822.el does.
   Naturally, it should only accept valid RFC 822 input, which is
   ASCII only.

   (Incidentally, the QP encoder/decoder needs to use this package,
   since QP must only be applied to certain RFC 2822 grammatical
   terminals, not all text, and I believe the current QP
   encoder/decoder doesn't do this properly.)

2) Ad-hoc approach that splits real-world textual e-mail addresses,
   including non-ASCII ones, into their components.  Might use the proper
   parser, at least partially.  Perhaps similar to what Katsumi
   Yamaoka proposed.

When these two packages exist, each current use of mail-extr* should
be investigated to find out what is really intended there.

At some point in time, I counted the number of functions in Emacs that
implement something similar to what the mail-extr* functions do
(e.g. take a textual e-mail address and split it up) and found ~5-10
versions, all with their own problems.

Sadly, I keep writing rants about the situation instead of working on
solving it...  Perhaps that is partly because it is not
straightforward to solve this; you will probably have to implement one API
first, tinker with it to get experience with it, and then rewrite it
slightly, and so on.  Sounds like real work.  Perhaps someone else has
a clearer vision on how to implement it, and time to try it out.

Thanks.
