unofficial mirror of notmuch@notmuchmail.org
* [PATCH] Clean up author display for some "Last, First" cases
@ 2010-04-22  5:04 Dirk Hohndel
  2010-04-24 15:30 ` Carl Worth
  0 siblings, 1 reply; 3+ messages in thread
From: Dirk Hohndel @ 2010-04-22  5:04 UTC (permalink / raw)
  To: notmuch


We specifically check if this is one of these two patterns:
 "Last, First" <first.last@company.com>
 "Last, First MI" <first.mi.last@company.com>
If so, we rewrite the author name in a more reader-friendly
"First Last" form.

Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
---
 lib/thread.cc |   51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/lib/thread.cc b/lib/thread.cc
index baa0d7f..7e72114 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -144,6 +144,51 @@ _thread_move_matched_author (notmuch_thread_t *thread,
     return;
 }
 
+/* clean up the uggly "Lastname, Firstname" format that some mail systems
+ * (most notably, Exchange) are creating to be "Firstname Lastname" 
+ * To make sure that we don't change other potential situations where a 
+ * comma is in the name, we check that we match one of these patterns
+ * "Last, First" <first.last@company.com>
+ * "Last, First MI" <first.mi.last@company.com>
+ */
+char *
+_thread_cleanup_author (notmuch_thread_t *thread,
+			const char *author, const char *from)
+{
+    char *cleanauthor,*testauthor;
+    const char *comma;
+    char *blank;
+    int fname,lname;
+
+    cleanauthor = talloc_strdup(thread, author);
+    if (cleanauthor == NULL)
+	return NULL;
+    comma = strchr(author,',');
+    if (comma) {
+	/* let's assemble what we think is the correct name */
+	lname = comma - author;
+	fname = strlen(author) - lname - 2;
+	strncpy(cleanauthor, comma + 2, fname);
+	*(cleanauthor+fname) = ' ';
+	strncpy(cleanauthor + fname + 1, author, lname);
+	*(cleanauthor+fname+1+lname) = '\0';
+	/* make a temporary copy and see if it matches the email */
+	testauthor = xstrdup(cleanauthor);
+	
+	blank=strchr(testauthor,' ');
+	while (blank != NULL) {
+	    *blank = '.';
+	    blank=strchr(testauthor,' ');
+	}
+	if (strcasestr(from, testauthor) == NULL)
+	    /* we didn't identify this as part of the email address 
+	    * so let's punt and return the original author */
+	    strcpy (cleanauthor, author);
+	       
+    }
+    return cleanauthor;
+}
+
 /* Add 'message' as a message that belongs to 'thread'.
  *
  * The 'thread' will talloc_steal the 'message' and hold onto a
@@ -158,6 +203,7 @@ _thread_add_message (notmuch_thread_t *thread,
     InternetAddressList *list;
     InternetAddress *address;
     const char *from, *author;
+    char *cleanauthor;
 
     _notmuch_message_list_add_message (thread->message_list,
 				       talloc_steal (thread, message));
@@ -178,8 +224,9 @@ _thread_add_message (notmuch_thread_t *thread,
 		mailbox = INTERNET_ADDRESS_MAILBOX (address);
 		author = internet_address_mailbox_get_addr (mailbox);
 	    }
-	    _thread_add_author (thread, author);
-	    notmuch_message_set_author (message, author);
+	    cleanauthor = _thread_cleanup_author (thread, author, from);
+	    _thread_add_author (thread, cleanauthor);
+	    notmuch_message_set_author (message, cleanauthor);
 	}
 	g_object_unref (G_OBJECT (list));
     }
-- 
1.6.6.1


-- 
Dirk Hohndel
Intel Open Source Technology Center


* Re: [PATCH] Clean up author display for some "Last, First" cases
  2010-04-22  5:04 [PATCH] Clean up author display for some "Last, First" cases Dirk Hohndel
@ 2010-04-24 15:30 ` Carl Worth
  2010-04-24 16:57   ` Dirk Hohndel
  0 siblings, 1 reply; 3+ messages in thread
From: Carl Worth @ 2010-04-24 15:30 UTC (permalink / raw)
  To: Dirk Hohndel, notmuch


On Wed, 21 Apr 2010 22:04:39 -0700, Dirk Hohndel <hohndel@infradead.org> wrote:
> +/* clean up the uggly "Lastname, Firstname" format that some mail systems
> + * (most notably, Exchange) are creating to be "Firstname Lastname" 
> + * To make sure that we don't change other potential situations where a 
> + * comma is in the name, we check that we match one of these patterns
> + * "Last, First" <first.last@company.com>
> + * "Last, First MI" <first.mi.last@company.com>

This is an interesting idea. We could make it a little more flexible by
doing a regexp comparison of "first.*last" against the email address,
(perhaps people have email addresses like carl_worth@example.com?)
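
A minimal sketch of that looser check, assuming POSIX <regex.h> is
available and that the names contain no regex metacharacters. The
helper name name_matches_email is hypothetical and is not part of the
patch or of notmuch:

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if `first`, followed anywhere later by
 * `last`, occurs in `email` (case-insensitively), and 0 otherwise.
 * Assumption: neither name contains regex metacharacters. */
static int
name_matches_email (const char *first, const char *last, const char *email)
{
    char pattern[256];
    regex_t re;
    int n, matched;

    /* Build "first.*last"; give up if the pattern would be truncated. */
    n = snprintf (pattern, sizeof (pattern), "%s.*%s", first, last);
    if (n < 0 || (size_t) n >= sizeof (pattern))
	return 0;

    if (regcomp (&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
	return 0;

    matched = (regexec (&re, email, 0, NULL, 0) == 0);
    regfree (&re);
    return matched;
}
```

With this, "carl" and "worth" would match carl_worth@example.com as
well as carl.worth@example.com, but not cworth@cworth.org.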

> +    char *cleanauthor,*testauthor;

I'd much rather see an underscore separating two words in a single
identifier, (so clean_author, test_author).

> +	/* let's assemble what we think is the correct name */
> +	lname = comma - author;
> +	fname = strlen(author) - lname - 2;
> +	strncpy(cleanauthor, comma + 2, fname);
> +	*(cleanauthor+fname) = ' ';
> +	strncpy(cleanauthor + fname + 1, author, lname);
> +	*(cleanauthor+fname+1+lname) = '\0';

The comment above, ("what we think is the correct name"), didn't help me
understand what the code is doing. And the code is hard enough to follow
that I could really use some help. Something like:

/* Break at comma and reverse: "Last, First etc." -> "First Last etc." */

Lots of little additions here and there so plenty of chance for an
off-by-one. Do we have a test case for this yet?
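
As a starting point for such a test, here is a standalone sketch of the
reversal step using the same arithmetic as the patch. It uses plain
malloc rather than talloc so that it compiles on its own, and the name
reverse_last_first is hypothetical. Unlike the patch, it checks that a
space really follows the comma, since the "comma + 2" and "- 2" in the
patch appear to assume one (an input like "Worth,Carl" would otherwise
lose the first letter of the first name):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: returns a malloc'ed
 * "First MI Last" for an input of the form "Last, First MI", or a plain
 * copy of the input when it has no ", " separator. Caller frees. */
static char *
reverse_last_first (const char *author)
{
    const char *comma = strchr (author, ',');
    size_t total = strlen (author);
    size_t last_len, first_len;
    char *out;

    /* The result is never longer than the input. */
    out = malloc (total + 1);
    if (out == NULL)
	return NULL;

    /* Require ", " followed by at least one character. */
    if (comma == NULL || comma[1] != ' ' || comma[2] == '\0') {
	memcpy (out, author, total + 1);
	return out;
    }

    last_len = (size_t) (comma - author);
    first_len = total - last_len - 2;	/* skip the ", " */

    memcpy (out, comma + 2, first_len);
    out[first_len] = ' ';
    memcpy (out + first_len + 1, author, last_len);
    out[first_len + 1 + last_len] = '\0';
    return out;
}
```

Inputs worth covering in a test would include "Worth, Carl",
"Last, First MI", a name with no comma, and a comma with no space
after it.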

> +	/* make a temporary copy and see if it matches the email */
> +	testauthor = xstrdup(cleanauthor);

It would be preferable to use talloc functions consistently. (Existing
occurrences of xstrdup in the code base are for the sake of
talloc-unfriendly glib data structures like GHashTable.)

As is, testauthor is leaking.

-Carl



* Re: [PATCH] Clean up author display for some "Last, First" cases
  2010-04-24 15:30 ` Carl Worth
@ 2010-04-24 16:57   ` Dirk Hohndel
  0 siblings, 0 replies; 3+ messages in thread
From: Dirk Hohndel @ 2010-04-24 16:57 UTC (permalink / raw)
  To: Carl Worth, notmuch

On Sat, 24 Apr 2010 08:30:22 -0700, Carl Worth <cworth@cworth.org> wrote:
> On Wed, 21 Apr 2010 22:04:39 -0700, Dirk Hohndel <hohndel@infradead.org> wrote:
> > +/* clean up the uggly "Lastname, Firstname" format that some mail systems
> > + * (most notably, Exchange) are creating to be "Firstname Lastname" 
> > + * To make sure that we don't change other potential situations where a 
> > + * comma is in the name, we check that we match one of these patterns
> > + * "Last, First" <first.last@company.com>
> > + * "Last, First MI" <first.mi.last@company.com>
> 
> This is an interesting idea. We could make it a little more flexible by
> doing a regexp comparison of "first.*last" against the email address,
> (perhaps people have email addresses like carl_worth@example.com?)

I'll look into that. We actually had some discussion about this on IRC
and I was thinking of taking this feature to a new level... something
like: 
- by default we show names as they come in (least surprise)
- we offer to reverse Last, First
- we offer to shorten to FirstL
- we offer an alias map
So I could define that mail from "cworth@cworth.org" gets the author
listed as "cworth", or as "CarlW".
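
A rough sketch of the last two options, just to make the idea
concrete. Everything here is hypothetical (the struct, the table, and
both function names); nothing like it exists in notmuch today, and a
real alias map would presumably come from the configuration rather
than a static table:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: exact email address -> display name. */
struct author_alias {
    const char *address;
    const char *display;
};

static const struct author_alias aliases[] = {
    { "cworth@cworth.org", "cworth" },
};

/* Returns the alias for `address`, or NULL if none is defined.
 * Assumption: addresses are compared exactly (case-sensitively). */
static const char *
lookup_alias (const char *address)
{
    size_t i;

    for (i = 0; i < sizeof (aliases) / sizeof (aliases[0]); i++) {
	if (strcmp (aliases[i].address, address) == 0)
	    return aliases[i].display;
    }
    return NULL;
}

/* Shorten "First ... Last" to "FirstL". Returns 0 on success, or -1 if
 * the name has no space or the result does not fit in `out`. */
static int
shorten_first_l (const char *name, char *out, size_t out_size)
{
    const char *space = strrchr (name, ' ');
    size_t first_len;

    if (space == NULL || space[1] == '\0')
	return -1;

    first_len = strcspn (name, " ");
    if (first_len == 0 || first_len + 2 > out_size)
	return -1;

    memcpy (out, name, first_len);
    out[first_len] = space[1];	/* initial of the last word */
    out[first_len + 1] = '\0';
    return 0;
}
```

The alias lookup would run first and the shortening would apply only
when no alias is defined.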

> > +    char *cleanauthor,*testauthor;
> 
> I'd much rather see an underscore separating two words in a single
> identifier, (so clean_author, test_author).

Happy to comply with your preferences in the future.

> > +	/* let's assemble what we think is the correct name */
> > +	lname = comma - author;
> > +	fname = strlen(author) - lname - 2;
> > +	strncpy(cleanauthor, comma + 2, fname);
> > +	*(cleanauthor+fname) = ' ';
> > +	strncpy(cleanauthor + fname + 1, author, lname);
> > +	*(cleanauthor+fname+1+lname) = '\0';
> 
> The comment above, ("what we think is the correct name"), didn't help me
> understand what the code is doing. And the code is hard enough to follow
> that I could really use some help. Something like:
> 
> /* Break at comma and reverse: "Last, First etc." -> "First Last etc." */

OK, I'll try to be more explicit in documenting algorithms.

> Lots of little additions here and there so plenty of chance for an
> off-by-one. Do we have a test case for this yet?

Nope. Will do.

> > +	/* make a temporary copy and see if it matches the email */
> > +	testauthor = xstrdup(cleanauthor);
> 
> It would be preferable to use talloc functions consistently. (Existing
> occurrences of xstrdup in the code base are for the sake of
> talloc-unfriendly glib data structures like GHashTable.)
> 
> As is, testauthor is leaking.

Oops.

/D

