unofficial mirror of notmuch@notmuchmail.org
* [PATCH] Make the date parser nicer
@ 2010-01-22 15:26 Sebastian Spaeth
  2010-01-22 15:33 ` Sebastian Spaeth
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-22 15:26 UTC (permalink / raw)
  To: notmuch

Currently we have to enter mail dates as timestamps. This patch does 2 things: it requires the prefix 'date:' and it allows dates to be specified as YYYY, YYYYMM or YYYYMMDD. So a 'notmuch show date:2005..20060512' will find all mails from 2005-01-01 until 2006-05-12. The code is probably not in the proper location yet and needs to be moved by someone more knowledgeable than me. My C++ skills are somewhat... lacking.

Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
---
 lib/database.cc |   94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 93 insertions(+), 1 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 5b12320..102a6ff 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -494,6 +494,97 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch)
     return NOTMUCH_STATUS_SUCCESS;
 }
 
+struct MaildateValueRangeProcessor : public Xapian::ValueRangeProcessor {
+    MaildateValueRangeProcessor() {}
+
+    Xapian::valueno operator()(std::string &begin, std::string &end) {
+        if (begin.substr(0, 5) != "date:")
+            return Xapian::BAD_VALUENO;
+        begin.erase(0, 5);
+
+	// Parse the begin date to time_t
+	struct tm *timeinfo;
+	time_t begintime, endtime;
+	//const char * startptr;
+	int year, month, day;
+
+	if (begin.size() == 8) {
+	  int no_items;
+	  no_items = sscanf(begin.c_str(), "%4i%2i%2i", &year, &month, &day);
+	  if (no_items != 3)
+	    return Xapian::BAD_VALUENO;
+	} else if (begin.size() == 6) {
+	  int no_items;
+	  day = 1;
+	  no_items = sscanf(begin.c_str(), "%4i%2i", &year, &month);
+	  if (no_items != 2)
+	    return Xapian::BAD_VALUENO;
+	} else if (begin.size() == 4) {
+	  int no_items;
+	  day = 1;
+	  month = 1;
+	  no_items = sscanf(begin.c_str(), "%4i", &year);
+	  if (no_items != 1)
+	    return Xapian::BAD_VALUENO;
+	} else {
+	  // no expected time format
+	  return Xapian::BAD_VALUENO;
+	}
+
+	begintime = time(NULL);
+	timeinfo = localtime( &begintime );
+	timeinfo -> tm_year = year - 1900;
+	timeinfo -> tm_mon = month - 1;
+	fprintf (stderr, "Startdate %d %d %d\n",year,month,day);
+	timeinfo -> tm_mday = day;
+	begintime = mktime ( timeinfo );
+
+	if (begintime == -1)
+	  // no valid time format
+	  return Xapian::BAD_VALUENO;
+
+	if (end.size() == 8) {
+	  int no_items;
+	  no_items = sscanf(end.c_str(), "%4i%2i%2i", &year, &month, &day);
+	  if (no_items != 3)
+	    return Xapian::BAD_VALUENO;
+	} else if (end.size() == 6) {
+	  int no_items;
+	  day = 31;
+	  no_items = sscanf(end.c_str(), "%4i%2i", &year, &month);
+	  if (no_items != 2)
+	    return Xapian::BAD_VALUENO;
+	} else if (end.size() == 4) {
+	  int no_items;
+	  day = 31;
+	  month = 12;
+	  no_items = sscanf(end.c_str(), "%4i", &year);
+	  if (no_items != 1)
+	    return Xapian::BAD_VALUENO;
+	} else {
+	  // no expected time format
+	  return Xapian::BAD_VALUENO;
+	}
+
+	timeinfo = localtime( &begintime );
+	timeinfo -> tm_year = year - 1900;
+	timeinfo -> tm_mon = month - 1;
+	fprintf (stderr, "Enddate %d %d %d\n",year,month,day);
+	timeinfo -> tm_mday = day;
+	endtime = mktime ( timeinfo );
+	//XXX: plus 1 day to make the last day inclusive??
+
+	if (endtime == -1)
+	  // no valid time format
+	  return Xapian::BAD_VALUENO;
+	
+	begin.assign(Xapian::sortable_serialise(begintime));
+	end.assign(Xapian::sortable_serialise(endtime));
+
+        return NOTMUCH_VALUE_TIMESTAMP;
+    }
+};
+
 notmuch_database_t *
 notmuch_database_open (const char *path,
 		       notmuch_database_mode_t mode)
@@ -570,7 +661,8 @@ notmuch_database_open (const char *path,
 	notmuch->query_parser = new Xapian::QueryParser;
 	notmuch->term_gen = new Xapian::TermGenerator;
 	notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
-	notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_TIMESTAMP, "date:", true);
+	notmuch->value_range_processor = new MaildateValueRangeProcessor();
+	  // (NOTMUCH_VALUE_TIMESTAMP);
 
 	notmuch->query_parser->set_default_op (Xapian::Query::OP_AND);
 	notmuch->query_parser->set_database (*notmuch->xapian_db);
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] Make the date parser nicer
  2010-01-22 15:26 [PATCH] Make the date parser nicer Sebastian Spaeth
@ 2010-01-22 15:33 ` Sebastian Spaeth
  2010-01-22 16:04 ` Sebastian Spaeth
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-22 15:33 UTC (permalink / raw)
  To: notmuch


The previous mail contains my patch (against my current
all-feature branch, but it should pretty much apply to current master too).
It is a proof of concept to make the date parser nicer. The following
searches work with this code:

notmuch show...
... date:2001..2010 (from beginning of 2001 until end of 2010)
... date:20011201..200506 (from 2001-12-01 until 2005-06-31)

The code will pretty surely need some cleaning up, as I can hardly code
C, not to speak of C++. But at least it works and it is not very
intrusive.

(PS: now that I think of it, I always use day 31 as the last day, which
is surely wrong. It also still accepts some obviously wrong dates like
20040231.)
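
One way to address both points in the postscript, sketched under the
assumption that only the Gregorian calendar matters: compute the real last
day of the month instead of hard-coding 31, and reject impossible dates such
as 20040231. The helper names below are hypothetical and not part of the
patch.

```cpp
// Hypothetical helpers, not part of the patch.

// True for Gregorian leap years.
static bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Last day of the given month (1..12), or 0 if the month is out of range.
static int days_in_month(int year, int month) {
    static const int days[12] = {31, 28, 31, 30, 31, 30,
                                 31, 31, 30, 31, 30, 31};
    if (month < 1 || month > 12)
        return 0;
    if (month == 2 && is_leap_year(year))
        return 29;
    return days[month - 1];
}

// Rejects dates such as 2004-02-31 and any month outside 1..12.
static bool is_valid_date(int year, int month, int day) {
    return day >= 1 && day <= days_in_month(year, month);
}
```

With something like this, the end of a YYYYMM range could use
days_in_month(year, month) as its day instead of 31.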

We could also think about using Xapian's DateValueRangeProcessor to do
that, but that would require saving the timestamp as YYYYMMDD in the
database (which we probably do not want).

Feedback welcome,
Sebastian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Make the date parser nicer
  2010-01-22 15:26 [PATCH] Make the date parser nicer Sebastian Spaeth
  2010-01-22 15:33 ` Sebastian Spaeth
@ 2010-01-22 16:04 ` Sebastian Spaeth
  2010-01-24 14:13 ` Sebastian Spaeth
  2010-01-26  6:36 ` [PATCH] Make the date parser nicer Keith Packard
  3 siblings, 0 replies; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-22 16:04 UTC (permalink / raw)
  To: notmuch

On Fri, 22 Jan 2010 16:26:11 +0100, Sebastian Spaeth <Sebastian@SSpaeth.de> wrote:
> +	if (begin.size() == 8) {
> +	  int no_items;
> +	  no_items = sscanf(begin.c_str(), "%4i%2i%2i", &year, &month, &day);
> +	  if (no_items != 3)
> +	    return Xapian::BAD_VALUENO;

Also I have found that my sscanf skills are sorely lacking:
a date of 20060108 will lead to 2006 1 0 for (year, month, day).

I bet someone knows a nice way to parse those values.
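
The likely cause is the %i conversion: it auto-detects the base, so a field
with a leading zero is read as octal and "08" stops after the "0". A minimal
sketch using %d, which always reads decimal, follows. The function name
parse_yyyymmdd is hypothetical.

```cpp
#include <cstdio>

// Hypothetical helper: parse "YYYYMMDD" into its decimal components.
// Returns true only if all three fields were converted.
static bool parse_yyyymmdd(const char *text, int *year, int *month, int *day) {
    // %d always reads base 10, so "08" yields 8 instead of stopping at the 0.
    return std::sscanf(text, "%4d%2d%2d", year, month, day) == 3;
}
```

This only fixes the conversion; range checks on month and day would still be
needed on top of it.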

Sebastian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] Make the date parser nicer
  2010-01-22 15:26 [PATCH] Make the date parser nicer Sebastian Spaeth
  2010-01-22 15:33 ` Sebastian Spaeth
  2010-01-22 16:04 ` Sebastian Spaeth
@ 2010-01-24 14:13 ` Sebastian Spaeth
  2010-01-25 10:50   ` [PATCH] Make the date parser nicer. This is v3 and considered to be final (but the documentation) Sebastian Spaeth
  2010-01-26  6:36 ` [PATCH] Make the date parser nicer Keith Packard
  3 siblings, 1 reply; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-24 14:13 UTC (permalink / raw)
  To: notmuch

Currently we have to enter mail dates as timestamps. This patch does 2 things:
1) it requires the prefix 'date:'
2) it allows dates to be specified in several formats. So a notmuch show date:2005..2006-05-12 will find all mails from 2005-01-01 until 2006-05-12.
The code is probably not in the proper location yet and needs to be moved by someone more knowledgeable than me.
My C++ skills are somewhat... lacking.

Possible time formats: YYYY-MM-DD, YYYY-MM (in that month), YYYY (in that year),
MM-DD (month-day in current year), DD (day in current month)

Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
---
 lib/database.cc |   90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 89 insertions(+), 1 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 5b12320..9c2842d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -494,6 +494,94 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch)
     return NOTMUCH_STATUS_SUCCESS;
 }
 
+struct MaildateValueRangeProcessor : public Xapian::ValueRangeProcessor {
+    MaildateValueRangeProcessor() {}
+
+  time_t
+  parsedate(std::string &str, bool early) {
+	// Parse the begin date to time_t
+	// possible time formats:
+	// YYYY-MM-DD (size 10)
+	// YYYY-MM    (size 7)
+	// YYYY       (size 4)
+	//      MM-DD (size 5)
+	//         DD (size 2)
+        // begin of time unit when 'early', end of when not
+	struct tm *timeinfo;
+	time_t timet;
+	//const char * startptr;
+	int year = 0, month = 0, day = 0;
+
+	if (str.size() == 2) {
+	  // parse day, then remove it from the string
+	  day = atoi(str.c_str());
+	  str.erase(0,2);
+	}
+
+	if (str.size() == 4 or str.size() == 7 or str.size() == 10) {
+	  // parse year, then remove it from the string
+	  year = atoi(str.c_str());
+	  str.erase(0,5);
+	}
+	
+	month = atoi(str.c_str());
+	str.erase(0,3);
+
+	// Do we have a day component left?
+	if (str.size())
+		   day = atoi(str.c_str());	
+
+	if (year == 0 && month == 0 && day == 0)
+	  // no expected time format
+	  return -1;
+
+	timet = time(NULL);
+	timeinfo = gmtime( &timet );
+	timeinfo -> tm_isdst = 0;
+	if (!early && year && !month) ++year;
+	if (year)  timeinfo -> tm_year = year - 1900;
+
+	if (month) timeinfo -> tm_mon = month - 1;
+	//else if (year) timeinfo -> tm_mon = (early ? 0: 12);
+
+	if (day) timeinfo -> tm_mday = (early ? day : ++day);
+	else timeinfo -> tm_mday = 1;
+
+	timeinfo -> tm_hour = 0;
+	timeinfo -> tm_min  = 0;
+	timeinfo -> tm_sec  = 0;
+	timet = mktime ( timeinfo );
+
+	if (timet == -1)
+	  return -1;
+	if (!early) --timet;
+	return timet;
+  }
+
+    Xapian::valueno operator()(std::string &begin, std::string &end) {
+      time_t begintime, endtime;
+
+      if (begin.substr(0, 5) != "date:")
+	 return Xapian::BAD_VALUENO;
+      begin.erase(0, 5);
+
+      begintime = parsedate ( begin, true);
+      if (begintime == -1)
+	// no valid time format
+	return Xapian::BAD_VALUENO;
+
+      endtime = parsedate ( end, false);
+      if (endtime == -1)
+	// no valid time format
+	return Xapian::BAD_VALUENO;
+
+      begin.assign(Xapian::sortable_serialise(begintime));
+      end.assign(Xapian::sortable_serialise(endtime));
+
+      return NOTMUCH_VALUE_TIMESTAMP;
+    }
+};
+
 notmuch_database_t *
 notmuch_database_open (const char *path,
 		       notmuch_database_mode_t mode)
@@ -570,7 +658,7 @@ notmuch_database_open (const char *path,
 	notmuch->query_parser = new Xapian::QueryParser;
 	notmuch->term_gen = new Xapian::TermGenerator;
 	notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
-	notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_TIMESTAMP, "date:", true);
+	notmuch->value_range_processor = new MaildateValueRangeProcessor();
 
 	notmuch->query_parser->set_default_op (Xapian::Query::OP_AND);
 	notmuch->query_parser->set_database (*notmuch->xapian_db);
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH] Make the date parser nicer. This is v3 and considered to be final (but the documentation).
  2010-01-24 14:13 ` Sebastian Spaeth
@ 2010-01-25 10:50   ` Sebastian Spaeth
  2010-01-25 12:22     ` [PATCH] Make the date parser nicer (v3 + 'now' keyword) Sebastian Spaeth
  0 siblings, 1 reply; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-25 10:50 UTC (permalink / raw)
  To: notmuch

Currently we have to enter mail dates as timestamps. This approach does 2 things:
1) it requires the prefix 'date:'
2) it allows dates to be specified in a flexible way. So a notmuch show date:2005..2006-05-12 will find all mails from 2005-01-01 until 2006-05-12.

Possible time formats: YYYY-MM-DD, YYYY-MM (from/through that month) , YYYY (from/through that year), MM-DD (month-day in current year), DD (day in current month).

Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
---
 lib/database.cc |   80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 79 insertions(+), 1 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index 5b12320..da2fda8 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -494,6 +494,84 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch)
     return NOTMUCH_STATUS_SUCCESS;
 }
 
+struct MaildateValueRangeProcessor : public Xapian::ValueRangeProcessor {
+    MaildateValueRangeProcessor() {}
+
+  time_t
+  parsedate(std::string &str, bool early) {
+    /* Parse the date to a 'time_t', return -1 on error              */
+    /* possible time formats: YYYY-MM-DD, YYYY-MM, YYYY,             */
+    /* MM-DD (current year), DD (day in current month).              */
+    /* Uses start of time unit when 'early', end otherwise, e.g.     */
+    /* 2001:=2001-01-01:00:00:00 when 'early' or 2001-12-31:23:59:59 */
+    struct tm *timeinfo;
+    time_t timet;
+    int year = 0, month = 0, day = 0;
+
+    if (str.size() == 2) {
+      /* We got just current day in month, parse & remove it */
+      day = atoi(str.c_str());
+      str.erase(0,2);
+    }
+    
+    if (str.size() == 4 or str.size() > 5) {
+      /* expect a year, parse & remove it */
+      year = atoi(str.c_str());
+      str.erase(0,5);
+    }
+
+    /* parse & remove month if there is sth left in the string */
+    month = atoi(str.c_str());
+    str.erase(0,3);
+
+    /* Parse day if we have one left */
+    if (str.size())
+      day = atoi(str.c_str());	
+
+    if (year == 0 && month == 0 && day == 0)
+      // no expected time format
+      return -1 ;
+
+    timet = time(NULL);                /* init timeinfo with current time */
+    timeinfo = gmtime(&timet);
+    /* add timeunit if !early (1 second too much, which we deduct later)  */
+    if (!early) {
+      if (year && !month)        ++year;  /* only year given              */
+      if (year && month && !day) ++month; /* year & month given           */
+    }
+    if (year)  timeinfo -> tm_year = year - 1900;
+    if (month) timeinfo -> tm_mon = month - 1;
+    if (day)   timeinfo -> tm_mday = (early ? day : ++day);
+    else       timeinfo -> tm_mday = 1;
+
+    timeinfo -> tm_hour = 0;
+    timeinfo -> tm_min  = 0;
+    timeinfo -> tm_sec  = (early ? 0 : -1); /* -1 sec if !early */
+    timet = mktime(timeinfo);
+
+    return timet;
+  }
+
+    Xapian::valueno operator()(std::string &begin, std::string &end) {
+      time_t begintime, endtime;
+
+      if (begin.substr(0, 5) != "date:")
+	 return Xapian::BAD_VALUENO;
+      begin.erase(0, 5);
+
+      begintime = parsedate(begin, true);
+      endtime   = parsedate(end, false);
+      if ((begintime == -1) || (endtime == -1))
+	// parsedate failed, no valid time format
+	return Xapian::BAD_VALUENO;
+
+      begin.assign(Xapian::sortable_serialise(begintime));
+      end.assign(Xapian::sortable_serialise(endtime));
+
+      return NOTMUCH_VALUE_TIMESTAMP;
+    }
+};
+
 notmuch_database_t *
 notmuch_database_open (const char *path,
 		       notmuch_database_mode_t mode)
@@ -570,8 +648,7 @@ notmuch_database_open (const char *path,
 	notmuch->query_parser = new Xapian::QueryParser;
 	notmuch->term_gen = new Xapian::TermGenerator;
 	notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
-	notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_TIMESTAMP);
+	notmuch->value_range_processor = new MaildateValueRangeProcessor();
 
 	notmuch->query_parser->set_default_op (Xapian::Query::OP_AND);
 	notmuch->query_parser->set_database (*notmuch->xapian_db);
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] Make the date parser nicer (v3 + 'now' keyword)
  2010-01-25 10:50   ` [PATCH] Make the date parser nicer. This is v3 and considered to be final (but the documentation) Sebastian Spaeth
@ 2010-01-25 12:22     ` Sebastian Spaeth
  2010-01-25 13:14       ` [PATCH] Make the date parser nicer (v3 + 'now' keyword) (final mail) Sebastian Spaeth
  0 siblings, 1 reply; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-25 12:22 UTC (permalink / raw)
  To: notmuch


The patch previously sent can be considered my final attempt at a nicer
date parser. In addition to some testing, I have updated the
documentation to reflect the new syntax, and I have also added the
keyword 'now' as a possibility. So 'date:2002..now' (all mails from 2002
until now) or 'date:now..31' (all mails that McFly sent to you between
now and the 31st of this month) should all work. Note that this is not a
sincere 'since:' replacement, as 2001..now won't find mails with
timestamps in the future.
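
As a rough illustration of the semantics described above (not the code in the
dateparser branch), one side of a range could be resolved as sketched below.
The function resolve_bound is hypothetical and only understands 'now' and
four-digit years. The end of a year is taken as the start of the next year
minus one second, which is the approach the v3 patch uses.

```cpp
#include <ctime>
#include <string>

// Hypothetical sketch: resolve one side of a date range to a timestamp.
// `text` is "now" or a four-digit year. `early` selects the first second of
// the year, otherwise the last second. `now` is passed in so the function is
// easy to test. Returns -1 if the text is not understood (as in the patch,
// -1 doubles as the error value).
static std::time_t resolve_bound(const std::string &text, bool early,
                                 std::time_t now) {
    if (text == "now")
        return now;
    if (text.size() != 4 ||
        text.find_first_not_of("0123456789") != std::string::npos)
        return (std::time_t) -1;

    std::tm tm = {};
    // For the late bound, aim at 1 January of the following year.
    tm.tm_year = std::stoi(text) - 1900 + (early ? 0 : 1);
    tm.tm_mon = 0;
    tm.tm_mday = 1;
    tm.tm_isdst = -1;               // let mktime decide whether DST applies
    std::time_t t = std::mktime(&tm);   // local midnight on 1 January
    if (t == (std::time_t) -1)
        return t;
    return early ? t : t - 1;       // late bound: last second of the year
}
```

Because 'now' resolves to the current time, a range like 2001..now has a
fixed upper bound and cannot match mails dated in the future.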

The relevant 4 patches are in my git tree, git://github.com/spaetz/notmuch-all-feature.git then switch to branch 'dateparser':

59e9b56 remove superfluous debug statements from date parsing
4bdb0b0 allow 'now' as keyword for now
cf6c500 Adapt documentation to new date: syntax
d8d3d0b Make the date parser nicer. This is v3 and considered to be final (but the documentation).


Thanks,
Sebastian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Make the date parser nicer (v3 + 'now' keyword) (final mail)
  2010-01-25 12:22     ` [PATCH] Make the date parser nicer (v3 + 'now' keyword) Sebastian Spaeth
@ 2010-01-25 13:14       ` Sebastian Spaeth
  0 siblings, 0 replies; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-25 13:14 UTC (permalink / raw)
  To: notmuch

Sorry, very last mail from me on this issue. I squashed the patches into
3 distinct ones, and based the branch on cworth's master tree for
easier pulling (rather than basing it off my all-features branch). This
branch is here:

http://github.com/spaetz/notmuch-all-feature/commits/dateparser2
(git://github.com/spaetz/notmuch-all-feature.git then switch to branch 
'dateparser2')

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Make the date parser nicer
  2010-01-22 15:26 [PATCH] Make the date parser nicer Sebastian Spaeth
                   ` (2 preceding siblings ...)
  2010-01-24 14:13 ` Sebastian Spaeth
@ 2010-01-26  6:36 ` Keith Packard
  2010-01-26  9:12   ` Sebastian Spaeth
  2010-01-26 11:50   ` Sebastian Spaeth
  3 siblings, 2 replies; 12+ messages in thread
From: Keith Packard @ 2010-01-26  6:36 UTC (permalink / raw)
  To: Sebastian Spaeth, notmuch

[-- Attachment #1: Type: text/plain, Size: 11553 bytes --]

On Fri, 22 Jan 2010 16:26:11 +0100, Sebastian Spaeth <Sebastian@SSpaeth.de> wrote:
> Currently we have to enter mail dates as timestamps. This approach
> does 2 things: it requires the prefix 'date:' and it allows timestamps
> to be specified as YYYY, YYYYMM or YYYYMMDD. So a notmuch show
> date:2005..20060512 will find all mails from 2005-01-01 until
> 2006-05-12. The code is probably not in a proper location yet and
> needs to be shoved around by someone more knowledgeable than me. My C++
> skills are somewhat,... lacking...

Here's some code which further improves date parsing by allowing lots of
date formats, including things like "today", "thisweek", ISO and US date
formats and month names. You can separate two dates with .. to make a
range, or you can just use the default range ("lastmonth" is everything
from the 1st of the previous month to the 1st of the current month).

I think this fits nicely with your code.
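
For readers who want the "lastmonth" default range in isolation, here is a
minimal sketch. It is not taken from date.c: the function name
last_month_range is hypothetical, and it relies on mktime() normalising
out-of-range fields rather than wrapping the month by hand.

```cpp
#include <ctime>

// Hypothetical sketch: compute the range for "lastmonth" relative to `now`,
// from local midnight on the 1st of the previous month to local midnight on
// the 1st of the current month.
static void last_month_range(std::time_t now, std::time_t *first,
                             std::time_t *last) {
    std::tm start = *std::localtime(&now);
    start.tm_sec = start.tm_min = start.tm_hour = 0;
    start.tm_mday = 1;
    start.tm_isdst = -1;            // let mktime decide whether DST applies

    std::tm end = start;            // 1st of the current month
    start.tm_mon -= 1;              // may become -1, which mktime normalises
                                    // to December of the previous year

    *first = std::mktime(&start);
    *last = std::mktime(&end);
}
```

Letting mktime() do the normalisation avoids a separate branch for January.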

From 432b210a6218a809ebcddbb0e5b94a1ddbd34207 Mon Sep 17 00:00:00 2001
From: Keith Packard <keithp@keithp.com>
Date: Mon, 25 Jan 2010 22:35:30 -0800
Subject: [PATCH] Add date.c

---
 lib/date.c |  455 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 455 insertions(+), 0 deletions(-)
 create mode 100644 lib/date.c

diff --git a/lib/date.c b/lib/date.c
new file mode 100644
index 0000000..611ae15
--- /dev/null
+++ b/lib/date.c
@@ -0,0 +1,455 @@
+/*
+ * Copyright © 2009 Keith Packard <keithp@keithp.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include "notmuch.h"
+#include <time.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+
+#define DAY	(24 * 60 * 60)
+
+static void
+today(struct tm *result, time_t after) {
+    time_t	t;
+
+    if (after)
+	t = after;
+    else
+	time(&t);
+    localtime_r(&t, result);
+    result->tm_sec = result->tm_min = result->tm_hour = 0;
+}
+
+static int parse_today(const char *text, time_t *first, time_t *last, time_t after) {
+    if (strcasecmp(text, "today") == 0) {
+	struct tm n;
+	today(&n, 0);
+	*first = mktime(&n);
+	*last = *first + DAY;
+	return 0;
+    }
+    return 1;
+}
+
+static int parse_yesterday(const char *text, time_t *first, time_t *last, time_t after) {
+    if (strcasecmp(text, "yesterday") == 0) {
+	struct tm n;
+	today(&n, 0);
+	*last = mktime(&n);
+	*first = *last - DAY;
+	return 0;
+    }
+    return 1;
+}
+
+static int parse_thisweek(const char *text, time_t *first, time_t *last, time_t after) {
+    if (strcasecmp(text, "thisweek") == 0) {
+	struct tm n;
+	today(&n, 0);
+	*first = mktime(&n) - (n.tm_wday * DAY);
+	*last = *first + DAY * 7;
+	return 0;
+    }
+    return 1;
+}
+
+static int parse_lastweek(const char *text, time_t *first, time_t *last, time_t after) {
+    if (strcasecmp(text, "lastweek") == 0) {
+	struct tm n;
+	today(&n, 0);
+	*last = mktime(&n) - (n.tm_wday * DAY);
+	*first = *last - DAY * 7;
+	return 0;
+    }
+    return 1;
+}
+
+static int parse_thismonth(const char *text, time_t *first, time_t *last, time_t after) {
+    if (strcasecmp(text, "thismonth") == 0) {
+	struct tm n;
+	today(&n, 0);
+	n.tm_mday = 1;
+	*first = mktime(&n);
+	if (++n.tm_mon > 11) {
+	    n.tm_mon = 0;
+	    n.tm_year++;
+	}
+	*last = mktime(&n);
+	return 0;
+    }
+    return 1;
+}
+
+static int parse_lastmonth(const char *text, time_t *first, time_t *last, time_t after) {
+    if (strcasecmp(text, "lastmonth") == 0) {
+	struct tm n;
+	today(&n, 0);
+	n.tm_mday = 1;
+	if (n.tm_mon == 0) {
+	    n.tm_year--;
+	    n.tm_mon = 11;
+	} else
+	    n.tm_mon--;
+	*first = mktime(&n);
+	if (++n.tm_mon > 11) {
+	    n.tm_mon = 0;
+	    n.tm_year++;
+	}
+	*last = mktime(&n);
+	return 0;
+    }
+    return 1;
+}
+
+static const char *months[12][2] = {
+    { "January", "Jan" },
+    { "February", "Feb" },
+    { "March", "Mar" },
+    { "April", "Apr" },
+    { "May", "May" },
+    { "June", "Jun" },
+    { "July", "Jul" },
+    { "August", "Aug" },
+    { "September", "Sep" },
+    { "October", "Oct" },
+    { "November", "Nov" },
+    { "December", "Dec" },
+};
+
+static int year(const char *text, int *y) {
+    char *end;
+    *y = strtol(text, &end, 10);
+    if (end == text)
+	return 1;
+    if (*end != '\0')
+	return 1;
+    if (*y < 1970 || *y > 2038)
+	return 1;
+    *y -= 1900;
+    return 0;
+}
+
+static int month(const char *text, int *m) {
+    char *end;
+    int i;
+    for (i = 0; i < 12; i++) {
+	if (strcasecmp(text, months[i][0]) == 0 ||
+	    strcasecmp(text, months[i][1]) == 0)
+	{
+	    *m = i;
+	    return 0;
+	}
+    }
+    *m = strtol(text, &end, 10);
+    if (end == text)
+	return 1;
+    if (*end != '\0')
+	return 1;
+    if (*m < 1 || *m > 12)
+	return 1;
+    *m -= 1;
+    return 0;
+}
+
+static int day(const char *text, int *d) {
+    char *end;
+    *d = strtol(text, &end, 10);
+    if (end == text)
+	return 1;
+    if (*end != '\0')
+	return 1;
+    if (*d < 1 || *d > 31)
+	return 1;
+    return 0;
+}
+
+/* month[-day] */
+static int parse_month(const char *text, time_t *first, time_t *last, time_t after) {
+    int		m = 0, d = 0;
+    int		i;
+    struct tm	n;
+    char	tmp[80];
+    char	*t;
+    char	*save;
+    char	*token;
+
+    if(strlen (text) >= sizeof (tmp))
+	return 1;
+    strcpy(tmp, text);
+    
+    t = tmp;
+    save = NULL;
+    i = 0;
+    while ((token = strtok_r(t, "-", &save)) != NULL) {
+	i++;
+	switch(i) {
+	case 1:
+	    if (month(token, &m) != 0)
+		return 1;
+	    break;
+	case 2:
+	    if (day(token, &d) != 0)
+		return 1;
+	    break;
+	default:
+	    return 1;
+	}
+	t = NULL;
+    }
+    today(&n, after);
+    if (after) {
+	if (m < n.tm_mon)
+	    n.tm_year++;
+    } else {
+	if (m > n.tm_mon)
+	    n.tm_year--;
+    }
+    switch (i) {
+    case 1:
+	n.tm_mday = 1;
+	n.tm_mon = m;
+	*first = mktime(&n);
+	if (++n.tm_mon > 11) {
+	    n.tm_mon = 0;
+	    n.tm_year++;
+	}
+	*last = mktime(&n);
+	return 0;
+    case 2:
+	n.tm_mday = d;
+	n.tm_mon = m;
+	*first = mktime(&n);
+	*last = *first + DAY;
+	return 0;
+    }
+    return 1;
+}
+
+/* year[-month[-day]] */
+static int parse_iso(const char *text, time_t *first, time_t *last, time_t after) {
+    int		y = 0, m = 0, d = 0;
+    int		i;
+    struct tm	n;
+    char	tmp[80];
+    char	*t;
+    char	*save;
+    char	*token;
+
+    if(strlen (text) >= sizeof (tmp))
+	return 1;
+    strcpy(tmp, text);
+    
+    t = tmp;
+    save = NULL;
+    i = 0;
+    while ((token = strtok_r(t, "-", &save)) != NULL) {
+	i++;
+	switch(i) {
+	case 1:
+	    if (year(token, &y) != 0)
+		return 1;
+	    break;
+	case 2:
+	    if (month(token, &m) != 0)
+		return 1;
+	    break;
+	case 3:
+	    if (day(token, &d) != 0)
+		return 1;
+	    break;
+	default:
+	    return 1;
+	}
+	t = NULL;
+    }
+    today(&n, 0);
+    switch (i) {
+    case 1:
+	n.tm_mday = 1;
+	n.tm_mon = 0;
+	n.tm_year = y;
+	*first = mktime(&n);
+	n.tm_year = y + 1;
+	*last = mktime(&n);
+	return 0;
+    case 2:
+	n.tm_mday = 1;
+	n.tm_mon = m;
+	n.tm_year = y;
+	*first = mktime(&n);
+	if (++n.tm_mon > 11) {
+	    n.tm_mon = 0;
+	    n.tm_year++;
+	}
+	*last = mktime(&n);
+	return 0;
+    case 3:
+	n.tm_mday = d;
+	n.tm_mon = m;
+	n.tm_year = y;
+	*first = mktime(&n);
+	*last = *first + DAY;
+	return 0;
+    }
+    return 1;
+}
+
+/* month[/day[/year]] */
+static int parse_us(const char *text, time_t *first, time_t *last, time_t after) {
+    int		y = 0, m = 0, d = 0;
+    int		i;
+    struct tm	n;
+    char	tmp[80];
+    char	*t;
+    char	*save;
+    char	*token;
+
+    if(strlen (text) >= sizeof (tmp))
+	return 1;
+    strcpy(tmp, text);
+    
+    t = tmp;
+    save = NULL;
+    i = 0;
+    while ((token = strtok_r(t, "/", &save)) != NULL) {
+	i++;
+	switch(i) {
+	case 1:
+	    if (month(token, &m) != 0)
+		return 1;
+	    break;
+	case 2:
+	    if (day(token, &d) != 0)
+		return 1;
+	    break;
+	case 3:
+	    if (year(token, &y) != 0)
+		return 1;
+	    break;
+	default:
+	    return 1;
+	}
+	t = NULL;
+    }
+    today(&n, after);
+    if (after) {
+	if (m < n.tm_mon)
+	    n.tm_year++;
+    } else {
+	if (m > n.tm_mon)
+	    n.tm_year--;
+    }
+    switch (i) {
+    case 1:
+	n.tm_mday = 1;
+	n.tm_mon = m;
+	*first = mktime(&n);
+	if (++n.tm_mon > 11) {
+	    n.tm_mon = 0;
+	    n.tm_year++;
+	}
+	*last = mktime(&n);
+	return 0;
+    case 2:
+	n.tm_mday = d;
+	n.tm_mon = m;
+	*first = mktime(&n);
+	*last = *first + DAY;
+	return 0;
+    case 3:
+	n.tm_mday = d;
+	n.tm_mon = m;
+	n.tm_year = y;
+	*first = mktime(&n);
+	*last = *first + DAY;
+	return 0;
+    }
+    return 1;
+}
+
+static int (*parsers[])(const char *text, time_t *first, time_t *last, time_t after) = {
+    parse_today,
+    parse_yesterday,
+    parse_thisweek,
+    parse_lastweek,
+    parse_thismonth,
+    parse_lastmonth,
+    parse_month,
+    parse_iso,
+    parse_us,
+    0,
+};
+
+static notmuch_status_t
+notmuch_one_date(const char *text, time_t *first, time_t *last, time_t after)
+{
+    int		i;
+    for (i = 0; parsers[i]; i++)
+	if (parsers[i](text, first, last, after) == 0)
+	    return NOTMUCH_STATUS_SUCCESS;
+    return NOTMUCH_STATUS_INVALID_DATE;
+}
+
+notmuch_status_t
+notmuch_date(const char *text, time_t *first, time_t *last)
+{
+    char	*dots;
+    char	first_text[80], last_text[80];
+    notmuch_status_t	status;
+    time_t	first_first, first_last, last_first, last_last;
+
+    if (strlen(text) > sizeof (first_text))
+	return NOTMUCH_STATUS_INVALID_DATE;
+    dots = strstr(text, "..");
+    if (dots) {
+	strncpy(first_text, text, dots - text);
+	first_text[dots-text] = '\0';
+	status = notmuch_one_date(first_text, &first_first, &first_last, 0);
+	if (status)
+	    return status;
+	status = notmuch_one_date(dots + 2, &last_first, &last_last, first_first);
+	if (status)
+	    return status;
+	*first = first_first;
+	*last = last_last;
+	return 0;
+    }
+    return notmuch_one_date(text, first, last, 0);
+}
+
+#if 1
+int
+main (int argc, char **argv)
+{
+    int	i;
+    for (i = 1; i < argc; i++) {
+	time_t	first, last;
+
+	if (notmuch_date(argv[i], &first, &last) == 0) {
+	    char	first_string[80], last_string[80];
+
+	    ctime_r(&first, first_string);
+	    first_string[strlen(first_string)-1] = '\0';
+	    ctime_r(&last, last_string);
+	    last_string[strlen(last_string)-1] = '\0';
+	    printf ("%s: %s - %s\n", argv[i], first_string, last_string);
+	}
+    }
+}
+#endif
-- 
1.6.6


-- 
keith.packard@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] Make the date parser nicer
  2010-01-26  6:36 ` [PATCH] Make the date parser nicer Keith Packard
@ 2010-01-26  9:12   ` Sebastian Spaeth
  2010-01-26 11:50   ` Sebastian Spaeth
  1 sibling, 0 replies; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-26  9:12 UTC (permalink / raw)
  To: notmuch

On Mon, 25 Jan 2010 22:36:35 -0800, Keith Packard <keithp@keithp.com> wrote:
> Here's some code which further improves date parsing by allowing lots of
> date formats, including things like "today", "thisweek", ISO and US date
> formats and month names. You can separate two dates with .. to make a
> range, or you can just use the default range ("lastmonth" is everything
> from the 1st of the previous month to the 1st of the current month).
> 
> I think this fits nicely with your code.

Hey, cool. I tried to keep my patch as small as possible, thanks for a
full-fledged date string parser :). I'll see how I can integrate that
into my dateparser branch.

Thanks
Sebastian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Make the date parser nicer
  2010-01-26  6:36 ` [PATCH] Make the date parser nicer Keith Packard
  2010-01-26  9:12   ` Sebastian Spaeth
@ 2010-01-26 11:50   ` Sebastian Spaeth
  2010-01-26 17:55     ` Keith Packard
  1 sibling, 1 reply; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-26 11:50 UTC (permalink / raw)
  To: Keith Packard, notmuch

On Mon, 25 Jan 2010 22:36:35 -0800, Keith Packard <keithp@keithp.com> wrote:
> Here's some code which further improves date parsing by allowing lots of
> date formats, including things like "today", "thisweek", ISO and US date
> formats and month names. You can separate two dates with .. to make a
> range, or you can just use the default range ("lastmonth" is everything
> from the 1st of the previous month to the 1st of the current month).
> 
> I think this fits nicely with your code.

It fit nicely indeed. I have just integrated your date parser into my
code and sent it as a series of 4 patches based on current cworth
master. (commits 2565fc6 and 96e11c3 will not compile on their own)

ec3c79a integrate keithp's date.c into the notmuch date parser and delete my previous own 
2565fc6 compile date.c as well
96e11c3 add date parser file from Keith
6ed2569 Make the date parser nicer.

The topic branch is here for those who don't want to apply mail patches:
http://github.com/spaetz/notmuch-all-feature/commits/dateparser3

Documentation of the new notmuch_parse_date function:
/* Parse a string into the first and last possible timestamps.
 * It parses the possible formats and stops if one pattern matches.
 * Keywords: 'today','yesterday','thisweek','lastweek','thismonth',
 *           'lastmonth'
 * Month-day : month[-day] (month: January, Jan, or 1)
 * ISO format: year[-month[-day]]
 * US format : month[/day[/year]]
 *
 * 'after' is used to fill in bits from context if left out, e.g. a
 * 'date:2004..01' will find from 2004-01-01 through 2004-01-31
 *
 * Return values:
 * NOTMUCH_STATUS_SUCCESS
 * NOTMUCH_STATUS_INVALID_DATE: Error parsing the date string
 */
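
To make the "first and last possible timestamps" idea concrete, here is a rough sketch of expanding a partial date into the first and last day it can denote, including the context fill-in from the 'date:2004..01' example. Everything named here (Day, DayRange, expand_date, the context_year parameter, and the rule that a first field above 12 is a year) is an assumption for illustration only; the real notmuch_parse_date works on timestamps and also handles keywords, month names and the US format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical types; the real parser in date.c is organised differently.
struct Day { int year, month, day; };
struct DayRange { Day first, last; };

static bool is_leap(int y) {
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

static int days_in_month(int year, int month) {
    assert(month >= 1 && month <= 12);
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return (month == 2 && is_leap(year)) ? 29 : days[month - 1];
}

// Split on '-' into at most three all-digit fields.
// Returns the number of fields, or 0 if the string is malformed.
static int split_fields(const std::string &s, int fields[3]) {
    int count = 0;
    std::size_t pos = 0;
    while (count < 3) {
        std::size_t end = s.find('-', pos);
        std::string part = s.substr(
            pos, end == std::string::npos ? std::string::npos : end - pos);
        if (part.empty() || part.size() > 4 ||
            part.find_first_not_of("0123456789") != std::string::npos)
            return 0;
        fields[count++] = std::stoi(part);
        if (end == std::string::npos)
            return count;
        pos = end + 1;
    }
    return 0;  // more than three fields
}

// Expand "year[-month[-day]]" or a bare numeric "month[-day]" into the first
// and last day it can mean. A bare month takes its year from context_year.
bool expand_date(const std::string &s, int context_year, DayRange &out) {
    int f[3] = {0, 0, 0};
    int n = split_fields(s, f);
    if (n == 0)
        return false;

    int year, month, day;
    bool has_month, has_day;
    if (f[0] > 12) {              // assumed rule: a first field above 12 is a year
        year = f[0];
        has_month = n >= 2;  month = f[1];
        has_day = n == 3;    day = f[2];
    } else {                      // month[-day], year filled in from context
        if (n == 3)
            return false;
        year = context_year;
        has_month = true;    month = f[0];
        has_day = n == 2;    day = f[1];
    }
    if (has_month && (month < 1 || month > 12))
        return false;
    if (has_day && (day < 1 || day > days_in_month(year, month)))
        return false;

    int last_month = has_month ? month : 12;
    out.first = Day{year, has_month ? month : 1, has_day ? day : 1};
    out.last = Day{year, last_month,
                   has_day ? day : days_in_month(year, last_month)};
    return true;
}
```

For 'date:2004..01' a caller would expand "2004" first, keep the .first of the result, then expand "01" with context_year set to 2004 and keep the .last, giving 2004-01-01 through 2004-01-31.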

Please pull :-).

Sebastian

* Re: [PATCH] Make the date parser nicer
  2010-01-26 11:50   ` Sebastian Spaeth
@ 2010-01-26 17:55     ` Keith Packard
  2010-01-27  9:15       ` Sebastian Spaeth
  0 siblings, 1 reply; 12+ messages in thread
From: Keith Packard @ 2010-01-26 17:55 UTC (permalink / raw)
  To: Sebastian Spaeth, notmuch

On Tue, 26 Jan 2010 12:50:41 +0100, "Sebastian Spaeth" <Sebastian@SSpaeth.de> wrote:
> On Mon, 25 Jan 2010 22:36:35 -0800, Keith Packard <keithp@keithp.com> wrote:
> > Here's some code which further improves date parsing by allowing lots of
> > date formats, including things like "today", "thisweek", ISO and US date
> > formats and month names. You can separate two dates with .. to make a
> > range, or you can just use the default range ("lastmonth" is everything
> > from the 1st of the previous month to the 1st of the current month).
> > 
> > I think this fits nicely with your code.
> 
> It fit nicely indeed. I have just integrated your date parser into my
> code and sent it as a series of 4 patches based on current cworth
> master. (commits 2565fc6 and 96e11c3 will not compile on their own)

Very cool. Oh, if you've got commits that don't compile on their own,
you should squash them together (or fix it in some other way). Makes
bisecting easier in the future.

Also, cworth is on vacation this week, so we won't be seeing any
merging to master...

-- 
keith.packard@intel.com

* Re: [PATCH] Make the date parser nicer
  2010-01-26 17:55     ` Keith Packard
@ 2010-01-27  9:15       ` Sebastian Spaeth
  0 siblings, 0 replies; 12+ messages in thread
From: Sebastian Spaeth @ 2010-01-27  9:15 UTC (permalink / raw)
  To: notmuch

On Tue, 26 Jan 2010 09:55:00 -0800, Keith Packard <keithp@keithp.com> wrote:
> Very cool. Oh, if you've got commits that don't compile on their own,
> you should squash them together (or fix it in some other way). Makes
> bisecting easier in the future.

Makes sense. I am still quite new to git, so excuse those beginner's
lapses. Perhaps Carl could squash commits ec3c79a and 2565fc6 when (if?)
pulling; I think that would make every step compile.

> Also, cworth is on vacation this week, so we won't be seeing any
> merging to master...

No hurry :-). cworth will have to do quite some catching up when he
returns.

The one "disadvantage" my integration has over your original approach
is that we now always require "date:XXX..YYY". A "date:lastmonth" won't
work; it needs to be "date:lastmonth..today". The reason is that
xapian only seems to invoke the ValueRangeProcessor when something of the
form 'A..B' is passed as a parameter. So while we could get "date:..2005"
to work, "date:2005.." is not passed to the ValueRangeProcessor, it seems.
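
One conceivable workaround, sketched under assumptions: rewrite a bare "date:X" term into "date:X..X" before the query string reaches xapian, so that the range handler sees the 'A..B' form it expects. The widen_bare_date helper is hypothetical and not part of the patch series, and whether "X..X" yields the intended span depends on the parser expanding the end of a range to the last possible timestamp, which is assumed here.

```cpp
#include <string>

// Hypothetical pre-processing helper (not in the patch series):
// turn "date:X" into "date:X..X". Terms without the "date:" prefix,
// an empty "date:", and terms that already contain ".." are returned unchanged.
std::string widen_bare_date(const std::string &term) {
    const std::string prefix = "date:";
    if (term.compare(0, prefix.size(), prefix) != 0)
        return term;                          // not a date term
    if (term.size() == prefix.size())
        return term;                          // nothing after the prefix
    if (term.find("..", prefix.size()) != std::string::npos)
        return term;                          // already a range
    const std::string value = term.substr(prefix.size());
    return prefix + value + ".." + value;
}
```

This would have to be applied to each whitespace-separated term of the query, and it does nothing for open-ended forms such as "date:2005..".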
 
We could ditch the "date:" prefix, but imho it is more consistent with
the other keywords to use it. I have no strong feelings about this.

It also still has the same limitation, in that it will not find emails with
a future timestamp (I use date:lastweek..5000 to get all mails with a
future stamp).

Sebastian

end of thread, other threads:[~2010-01-27  9:15 UTC | newest]

Thread overview: 12+ messages
2010-01-22 15:26 [PATCH] Make the date parser nicer Sebastian Spaeth
2010-01-22 15:33 ` Sebastian Spaeth
2010-01-22 16:04 ` Sebastian Spaeth
2010-01-24 14:13 ` Sebastian Spaeth
2010-01-25 10:50   ` [PATCH] Make the date parser nicer. This is v3 and considered to be final (but the documentation) Sebastian Spaeth
2010-01-25 12:22     ` [PATCH] Make the date parser nicer (v3 + 'now' keyword) Sebastian Spaeth
2010-01-25 13:14       ` [PATCH] Make the date parser nicer (v3 + 'now' keyword) (final mail) Sebastian Spaeth
2010-01-26  6:36 ` [PATCH] Make the date parser nicer Keith Packard
2010-01-26  9:12   ` Sebastian Spaeth
2010-01-26 11:50   ` Sebastian Spaeth
2010-01-26 17:55     ` Keith Packard
2010-01-27  9:15       ` Sebastian Spaeth
