unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#7760: mail-mbox-from produces invalid mbox From lines when From lines are multiline
@ 2010-12-30  3:06 Mark Lillibridge
  2011-01-02  2:36 ` Glenn Morris
  0 siblings, 1 reply; 2+ messages in thread
From: Mark Lillibridge @ 2010-12-30  3:06 UTC
  To: 7760


[Affects at least version 23.1 onwards]

    Mbox "From " lines must be a single line starting with "From ";
see mail/rmail.el:720 for evidence of this:

(defvar rmail-unix-mail-delimiter
  (let ((time-zone-regexp
	 (concat "\\([A-Z]?[A-Z]?[A-Z][A-Z]\\( DST\\)?"
		 "\\|[-+]?[0-9][0-9][0-9][0-9]"
		 "\\|"
		 "\\) *")))
    (concat
     "From "

     ;; Many things can happen to an RFC 822 mailbox before it is put into
     ;; a `From' line.  The leading phrase can be stripped, e.g.
     ;; `Joe <@w.x:joe@y.z>' -> `<@w.x:joe@y.z>'.  The <> can be stripped, e.g.
     ;; `<@x.y:joe@y.z>' -> `@x.y:joe@y.z'.  Everything starting with a CRLF
     ;; can be removed, e.g.
     ;;		From: joe@y.z (Joe	K
     ;;			User)
     ;; can yield `From joe@y.z (Joe 	K Fri Mar 22 08:11:15 1996', and
     ;;		From: Joe User
     ;;			<joe@y.z>
     ;; can yield `From Joe User Fri Mar 22 08:11:15 1996'.
     ;; The mailbox can be removed or be replaced by white space, e.g.
     ;;		From: "Joe User"{space}{tab}
     ;;			<joe@y.z>
     ;; can yield `From {space}{tab} Fri Mar 22 08:11:15 1996',
     ;; where {space} and {tab} represent the Ascii space and tab characters.
     ;; We want to match the results of any of these manglings.
     ;; The following regexp rejects names whose first characters are
     ;; obviously bogus, but after that anything goes.
     "\\([^\0-\b\n-\r\^?].*\\)? "

     ;; The time the message was sent.
     "\\([^\0-\r \^?]+\\) +"				; day of the week
     "\\([^\0-\r \^?]+\\) +"				; month
     "\\([0-3]?[0-9]\\) +"				; day of month
     "\\([0-2][0-9]:[0-5][0-9]\\(:[0-6][0-9]\\)?\\) *"	; time of day

     ;; Perhaps a time zone, specified by an abbreviation, or by a
     ;; numeric offset.
     time-zone-regexp

     ;; The year.
     " \\([0-9][0-9]+\\) *"

     ;; On some systems the time zone can appear after the year, too.
     time-zone-regexp

     ;; Old uucp cruft.
     "\\(remote from .*\\)?"

     "\n"))
  "Regexp matching the delimiter of messages in UNIX mail format
\(UNIX From lines), minus the initial ^.  Note that if you change
this expression, you must change the code in `rmail-nuke-pinhead-header'
that knows the exact ordering of the \\( \\) subexpressions.")


However, mail-mbox-from in mail/mail-utils.el:387:
(defun mail-mbox-from ()
  "Return an mbox \"From \" line for the current message.
The buffer should be narrowed to just the header."
  (let ((from (or (mail-fetch-field "from")
		  (mail-fetch-field "really-from")
		  (mail-fetch-field "sender")
		  "unknown"))
	(date (mail-fetch-field "date")))
    (format "From %s %s\n" (mail-strip-quoted-names from)
	    (or (and date
		     (ignore-errors
		      (current-time-string (date-to-time date))))
		(current-time-string)))))


produces multi-line results for messages whose From: header spans
multiple lines; for example, consider the following headers from a
real message:

Return-Path: <palsberg@cs.purdue.edu>
Resent-Date: Fri, 25 Oct 1996 18:01:41 -0500 (EST)
Resent-To: objecttypes-redistribution@daimi.aau.dk
Date: Fri, 25 Oct 1996 18:36:19 -0400
From: Andrew Myers <andru@lcs.mit.edu>,
        Joseph Bank <jbank@martigny.ai.mit.edu>,
        Barbara Liskov <liskov@lcs.mit.edu>
To: objecttypes@daimi.aau.dk
Subject: Parameterized Types for Java

(This message may or may not meet various e-mail standards; that is
irrelevant -- mail-mbox-from must give valid results for all real
messages.)

On this message, mail-mbox-from produces the following invalid result:

"From andru@lcs.mit.edu,
        jbank@martigny.ai.mit.edu,
        liskov@lcs.mit.edu Fri Oct 25 15:36:19 1996
"

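To make the problem concrete, here is a minimal sketch of a validity
check.  The helper name my-mbox-from-line-valid-p is hypothetical (it is
not part of Emacs), and the test is far looser than
rmail-unix-mail-delimiter: it only checks the "From " prefix and that the
sole newline is the final character.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs.
(defun my-mbox-from-line-valid-p (line)
  "Return non-nil if LINE looks like a single-line mbox \"From \" line.
Only the \"From \" prefix and the position of the first newline are
checked; the date format is not validated."
  (and (string-prefix-p "From " line)
       ;; The first newline must also be the last character.
       (eq (string-match "\n" line) (1- (length line)))))

;; The result shown above fails the check, because its first newline
;; falls in the middle of the address list.
(my-mbox-from-line-valid-p
 (concat "From andru@lcs.mit.edu,\n"
         "        jbank@martigny.ai.mit.edu,\n"
         "        liskov@lcs.mit.edu Fri Oct 25 15:36:19 1996\n"))
;; => nil
```

A line with a single address and a trailing newline passes the same
check.
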
I attach a fairly simple fix that discards everything from the first
comma or newline onward in the From (or Really-From/Sender) field.  With
the fix, the message above instead yields:

"From andru@lcs.mit.edu Fri Oct 25 15:36:19 1996
"

Truncating at the first newline is justified by the comments quoted above:

     ;; `<@x.y:joe@y.z>' -> `@x.y:joe@y.z'.  Everything starting with a CRLF
     ;; can be removed, e.g.
     ;;		From: joe@y.z (Joe	K
     ;;			User)
     ;; can yield `From joe@y.z (Joe 	K Fri Mar 22 08:11:15 1996', and
     ;;		From: Joe User
     ;;			<joe@y.z>
     ;; can yield `From Joe User Fri Mar 22 08:11:15 1996'.

Truncating at the first comma is an attempt to preserve the spirit of
the "From " line (naming a single sending mailbox), but is not strictly
necessary.

- Mark


New version is:
(defun mail-mbox-from ()
  "Return an mbox \"From \" line for the current message.
The buffer should be narrowed to just the header."
  (let* ((from (or (mail-fetch-field "from")
		   (mail-fetch-field "really-from")
		   (mail-fetch-field "sender")
		   "unknown"))
	 (stripped-from (mail-strip-quoted-names from))
	 (final-from (substring stripped-from 0 
				(string-match "[,\n]" stripped-from)))
	 (date (mail-fetch-field "date")))
    (format "From %s %s\n" final-from
	    (or (and date
		     (ignore-errors
		      (current-time-string (date-to-time date))))
		(current-time-string)))))
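
The truncation step can also be looked at in isolation.  This is a
minimal sketch; my-first-mailbox is a hypothetical name used only for
illustration, and the input strings are assumed to be what
mail-strip-quoted-names returns for the headers in question.

```elisp
;; Hypothetical helper isolating the truncation step of the fix.
(defun my-first-mailbox (stripped-from)
  "Return STRIPPED-FROM up to, but not including, its first comma or newline."
  ;; `string-match' returns nil when nothing matches, and `substring'
  ;; treats a nil end index as end of string, so a plain single
  ;; address is returned unchanged.
  (substring stripped-from 0 (string-match "[,\n]" stripped-from)))

(my-first-mailbox
 (concat "andru@lcs.mit.edu,\n"
         "        jbank@martigny.ai.mit.edu,\n"
         "        liskov@lcs.mit.edu"))
;; => "andru@lcs.mit.edu"

(my-first-mailbox "joe@y.z")
;; => "joe@y.z"

(my-first-mailbox "Joe User\n\t<joe@y.z>")
;; => "Joe User"
```

The last case corresponds to the `From Joe User ...' mangling described
in the rmail.el comments: the address on the continuation line is lost,
but the result stays on one line.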

ts-rhel5 [158]% diff new-mail-utils.el new-mail-utils2.el
390,395c390,398
<   (let ((from (or (mail-fetch-field "from")
<                 (mail-fetch-field "really-from")
<                 (mail-fetch-field "sender")
<                 "unknown"))
<       (date (mail-fetch-field "date")))
<     (format "From %s %s\n" (mail-strip-quoted-names from)
---
>   (let* ((from (or (mail-fetch-field "from")
>                  (mail-fetch-field "really-from")
>                  (mail-fetch-field "sender")
>                  "unknown"))
>        (stripped-from (mail-strip-quoted-names from))
>        (final-from (substring stripped-from 0 
>                               (string-match "[,\n]" stripped-from)))
>        (date (mail-fetch-field "date")))
>     (format "From %s %s\n" final-from






* bug#7760: mail-mbox-from produces invalid mbox From lines when From lines are multiline
  2010-12-30  3:06 bug#7760: mail-mbox-from produces invalid mbox From lines when From lines are multiline Mark Lillibridge
@ 2011-01-02  2:36 ` Glenn Morris
  0 siblings, 0 replies; 2+ messages in thread
From: Glenn Morris @ 2011-01-02  2:36 UTC
  To: 7760-done

Version: 23.3

Thanks; I applied something similar.




