From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id DF0EE431FB6 for ; Wed, 3 Oct 2012 13:32:23 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vBCmkDSndEtk for ; Wed, 3 Oct 2012 13:32:22 -0700 (PDT) Received: from mail-la0-f53.google.com (mail-la0-f53.google.com [209.85.215.53]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by olra.theworths.org (Postfix) with ESMTPS id 34DC5431FAE for ; Wed, 3 Oct 2012 13:32:22 -0700 (PDT) Received: by lahl5 with SMTP id l5so3745573lah.26 for ; Wed, 03 Oct 2012 13:32:20 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:cc:subject:in-reply-to:references:user-agent:date :message-id:mime-version:content-type:x-gm-message-state; bh=7kuxDjMOydSS+WAnh7FkBvjru2hZcUGcz73BHGWdpn4=; b=JVT8v4KJFDYYbssh/DsJjqQdpcD8b/ucyLlC8vLH264unZk4+nALpCoJ3HBz3x0qtI aD2es4LfTmZP/ZPEoz35X2P5M0ZClOpiR2c0H+ZGnHUf6lwD7cY/twumuDNHOHZ99aFX qhyprID2zDq1WcPREPtsn3/HVixxvXT3AXJXIgvLtFtjy3gyEigxnPZBgPuD7jTUW4k1 j2BB1SIbwq8KiMg08OF2xpG3r5KnrbJOsQQq9KbE1ndbFQZZUwmiY462qQafn0q3ILp/ Wk2cgZfbD2YsjQw6NRNphe9h2hWNhh2eA0sDg19U0IbsAIf2WCh9SJ+tPvjfgq54jVH6 +RWQ== Received: by 10.152.144.2 with SMTP id si2mr2572296lab.26.1349296340620; Wed, 03 Oct 2012 13:32:20 -0700 (PDT) Received: from localhost (dsl-hkibrasgw4-fe51df00-27.dhcp.inet.fi. [80.223.81.27]) by mx.google.com with ESMTPS id xw14sm1649379lab.15.2012.10.03.13.32.18 (version=SSLv3 cipher=OTHER); Wed, 03 Oct 2012 13:32:19 -0700 (PDT) From: Jani Nikula To: Michal Sojka , notmuch@notmuchmail.org, David Bremner Subject: Re: [PATCH] test: Improve tests for the date/time parser module In-Reply-To: <87zk4e1f5k.fsf@steelpick.2x.cz> References: <24186aafbdcb967b8f66c2390c928f3788ab6cbf.1347484177.git.jani@nikula.org> <87zk4e1f5k.fsf@steelpick.2x.cz> User-Agent: Notmuch/0.14+34~g2c0277c (http://notmuchmail.org) Emacs/23.3.1 (i686-pc-linux-gnu) Date: Wed, 03 Oct 2012 23:32:16 +0300 Message-ID: <87lifn47qn.fsf@nikula.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Gm-Message-State: ALoCoQl0iVlBwmLJfZf2vY7ACP1yrUNqNXjUyqlgf5peAuEPlVouO/EAaa1bRS75nZdz7AaNM9jz X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2012 20:32:24 -0000 On Tue, 25 Sep 2012, Michal Sojka wrote: > This patch reworks date/time parser library test program to make it > easier to to write the actual tests. It also modifies the notmuch test > script and adds several new tests to it. Cool! > The INPUT file for the test contains both the dates to be parsed as well > as the "expected" results. The test program outputs the results in the > same format and replaces expected results with real results. Currently, > the "expected" results in the INPUT file correspond to the real results, > so the test passes. Some results are, however, different from what I > would expect - this is mentioned in the comments after '#'. Please see comments inline next to tests. > > This patch applies on top of Jani's patchset. > --- > It can be seen that there are several errors and unexpected results. > As I've already written, I'm not sure that the approach taken by this > library is the right one. I tend to agree with mina86, that using a > more systematic approach (such as bison) would be beneficial. Then we just have to agree to disagree here. :) > This is however not to say to throw this patchset away. Either Jani > will be able to fix all the corner cases. Or we can work together to > develop a better solution - add support for ranges to the bison > parser. I think I've got the cases pretty much covered, and they're mostly not about syntax and parsing, but rather semantics; what to do with the parsed data. > -Michal > > diff --git a/test/Makefile.local b/test/Makefile.local > index 9ae130a..b9105c7 100644 > --- a/test/Makefile.local > +++ b/test/Makefile.local > @@ -20,7 +20,7 @@ $(dir)/symbol-test: $(dir)/symbol-test.o > $(call quiet,CXX) $^ -o $@ -Llib -lnotmuch $(XAPIAN_LDFLAGS) > > $(dir)/parse-time: $(dir)/parse-time.o parse-time-string/parse-time-string.o > - $(call quiet,CC) $^ -o $@ > + $(call quiet,CC) $^ -o $@ -lrt > > .PHONY: test check > > diff --git a/test/parse-time-string b/test/parse-time-string > index 34b80d7..265437c 100755 > --- a/test/parse-time-string > +++ b/test/parse-time-string > @@ -14,13 +14,48 @@ _parse_time () > ${TEST_DIRECTORY}/parse-time --format=%s "$*" > } > > -test_begin_subtest "date(1) default format without TZ code" > -test_expect_equal "$(_parse_time Fri Aug 3 23:06:06 2012)" "$(_date Fri Aug 3 23:06:06 2012)" > +test_begin_subtest "Date parser tests" > +cat < INPUT > +now -> Tue Jan 11 11:11:00 +0000 2011 > +2010-1-1 -> parse_time_string() error: 5 I think that's invalid per ISO 8601. > +Jan 2 -> Sat Jan 02 11:11:00 +0000 2010 # Why 2010? This is an interesting bug. The idea was that specifying a month without year would always refer to past. So when you give Jan 2011 in the reference time, Jan refers to the previous year. Seems simple, but looking closer, also "Jan 2 this year" would end up in 2010. Not good. It would probably be possible to fix this, but per the simplicity of implementation and unambiguity goals, I think I'll just make them refer to current year, at least for now. This may mean having to look into supporting "last {monthname,weekday}" for better usability, but I'll leave that as a future improvement. > +Mon -> Mon Jan 10 11:11:00 +0000 2011 > +last Friday -> parse_time_string() error: 4 "last " is not supported. > +2 hours ago -> parse_time_string() error: 1 "ago" is not supported. > +last month -> Sat Dec 11 11:11:00 +0000 2010 > +month ago -> parse_time_string() error: 1 Ditto. > +8am -> Tue Jan 11 08:00:00 +0000 2011 > +9:15 -> Tue Jan 11 09:15:00 +0000 2011 > +12:34 -> Tue Jan 11 12:34:00 +0000 2011 > +monday -> Mon Jan 10 11:11:00 +0000 2011 > +yesterday -> Mon Jan 10 11:11:00 +0000 2011 > +tomorrow -> parse_time_string() error: 1 "tomorrow" is not supported (do you get a lot of mail from the future? ;) > + -> Tue Jan 11 11:11:00 +0000 2011 # Shouldn't empty string return an error??? *shrug* It just starts with the reference time, and finds nothing to add or remove. Let's call it a feature. > > -test_begin_subtest "date(1) --rfc-2822 format" > -test_expect_equal "$(_parse_time Fri, 03 Aug 2012 23:07:46 +0100)" "$(_date Fri, 03 Aug 2012 23:07:46 +0100)" > +Aug 3 23:06:06 2012 -> Fri Aug 03 23:06:06 +0000 2012 # date(1) default format without TZ code > +Fri, 03 Aug 2012 23:07:46 +0100 -> Fri Aug 03 22:07:46 +0000 2012 # rfc-2822 > +2012-08-03 23:09:37+03:00 -> Fri Aug 03 20:09:37 +0000 2012 # rfc-3339 seconds > > -test_begin_subtest "date(1) --rfc=3339=seconds format" > -test_expect_equal "$(_parse_time 2012-08-03 23:09:37+03:00)" "$(_date 2012-08-03 23:09:37+03:00)" > +10s -> Tue Jan 11 11:10:50 +0000 2011 > +19701223s -> Wed Dec 23 11:10:59 +0000 1970 # Surprising - number is parsed as date and 's' as '1 second' Will be fixed. > +19701223 -> Wed Dec 23 11:11:00 +0000 1970 > + > +19701223 +0100 -> Wed Dec 23 11:11:00 +0000 1970 # Timezone is ignored without an error It's not ignored. Date is specified, but the time comes from the reference time. It's the same absolute time regardless of the timezone. > + > +today ^-> Wed Jan 12 00:00:00 +0000 2011 # This should be 11 23:59:59 See my previous mail. Can be fixed. > +today v-> Tue Jan 11 00:00:00 +0000 2011 > + > +thisweek ^-> Sun Jan 16 00:00:00 +0000 2011 # This should be Sunday 23:59:59 > +thisweek v-> Sun Jan 09 00:00:00 +0000 2011 # This should be Monday 00:00:00 Implementation simplicity and dodging a localization issue. Start of the week is based on the tm_mday field of struct tm, where 0 == Sunday. Even if I personally agree Monday is the 1st day of the week. > + > +two months ago-> parse_time_string() error: 1 # Comments in the code suggest that this is supported When in doubt, trust code over comments. ;) > +two months -> Thu Nov 11 11:11:00 +0000 2010 > + > +1348569850 -> parse_time_string() error: 4 # Seconds since epoch not yet supported? Backward compatibility in notmuch??? > +10 -> parse_time_string() error: 4 # Seconds since epoch? Indeed, seconds since epoch not yet supported. There is no backwards compatibility issue, as you can still use the .. format as described in notmuch-search-terms(7) man page. The new date:.. just doesn't support seconds since epoch yet. And I think I'll make the syntax "@" to let you have " seconds" without surprises. > +EOF > + > +${TEST_DIRECTORY}/parse-time --now="Tue Jan 11 11:11:00 +0000 2011" < INPUT > OUTPUT > +test_expect_equal_file INPUT OUTPUT > > test_done > diff --git a/test/parse-time.c b/test/parse-time.c > index b4de76b..0415f49 100644 > --- a/test/parse-time.c > +++ b/test/parse-time.c > @@ -18,59 +18,47 @@ > * Author: Jani Nikula > */ > > + > +#define _XOPEN_SOURCE 500 /* for strptime() and snprintf() */ > #include > #include > #include > #include > +#include > > #include "parse-time-string.h" > > -/* > - * concat argv[start]...argv[end - 1], separating them by a single > - * space, to a malloced string > - */ > -static char * > -concat_args (int start, int end, char *argv[]) > -{ > - int i; > - size_t len = 1; > - char *p; > - > - for (i = start; i < end; i++) > - len += strlen (argv[i]) + 1; > - > - p = malloc (len); > - if (!p) > - return NULL; > - > - *p = 0; > - > - for (i = start; i < end; i++) { > - if (i != start) > - strcat (p, " "); > - strcat (p, argv[i]); > - } > - > - return p; > -} > - > #define DEFAULT_FORMAT "%a %b %d %T %z %Y" > > static void > usage (const char *name) > { > - printf ("Usage: %s [options ...] \n\n", name); > + printf ("Usage: %s [options ...]\n\n", name); > printf ( > - "Parse and display it in given format.\n\n" > - " -f, --format=FMT output format, FMT according to strftime(3)\n" > - " (default: \"%s\")\n" > - " -n, --now=N use N seconds since epoch as now (default: now)\n" > - " -u, --up round result up (default: no rounding)\n" > - " -d, --down round result down (default: no rounding)\n" > - " -h, --help print this help\n", > + "Parse date/time read from stdin and display it in given format.\n\n" > + " -f, --format=FMT output format for dates and input format for --now,\n" > + " FMT according to strftime(3) (default: \"%s\")\n" > + " -n, --now=N reference date in FMT (default: now)\n" > + " -h, --help print this help\n" > + "\n" > + "stdin should contain one date/time per line in the following format:\n" > + " [ [ comment ] ]\n" > + "where determines the operation performed on the .\n" > + "It can be one of '->', '^->', 'v->' meaning convert, convert and round\n" > + "up, convert and round down, respectively.\n", > DEFAULT_FORMAT); > } > > +static const char * > +get_round_str (int round) > +{ > + switch (round) { > + case PARSE_TIME_ROUND_UP: return "^"; > + case PARSE_TIME_ROUND_DOWN: return "v"; > + default: return ""; > + } > +} > + > int > main (int argc, char *argv[]) > { > @@ -79,14 +67,10 @@ main (int argc, char *argv[]) > time_t result; > time_t now; > time_t *nowp = NULL; > - char *argstr; > int round = PARSE_TIME_NO_ROUND; > - char buf[1024]; > const char *format = DEFAULT_FORMAT; > struct option options[] = { > { "help", no_argument, NULL, 'h' }, > - { "up", no_argument, NULL, 'u' }, > - { "down", no_argument, NULL, 'd' }, > { "format", required_argument, NULL, 'f' }, > { "now", required_argument, NULL, 'n' }, > { NULL, 0, NULL, 0 }, > @@ -111,8 +95,13 @@ main (int argc, char *argv[]) > round = PARSE_TIME_ROUND_DOWN; > break; > case 'n': > - /* specify now in seconds since epoch */ > - now = (time_t) strtol (optarg, NULL, 10); > + memset (&tm, 0, sizeof (tm)); > + char *parsed = strptime (optarg, format, &tm); One of the problems with strptime is that it doesn't support time zones, which is why I chose not to use it here. (You can specify %z in the format to ignore it, but it looks like it's ignored no matter what. *shrug*) Combined with mktime below, you introduce possible TZ and DST variations in the tests, which can be problematic. So perhaps we should keep the reference time as a timestamp here. I didn't look at this test tool patch very closely yet, but in general I like the very much increased clarity in the tests. I'll look into this more when I have a moment. Thanks for the tests, comments, and corner cases. They've been helpful. BR, Jani. > + if (!parsed) { > + fprintf (stderr, "Cannot parse reference date: %s\n", optarg); > + return 1; > + } > + now = mktime (&tm); > if (now >= (time_t) 0) > nowp = &now; > break; > @@ -124,22 +113,47 @@ main (int argc, char *argv[]) > } > } > > - argstr = concat_args (optind, argc, argv); > - if (!argstr) > - return 1; > - > - r = parse_time_string (argstr, &result, nowp, round); > - > - free (argstr); > - > - if (r) > - return 1; > - > - if (!localtime_r (&result, &tm)) > - return 1; > - > - strftime (buf, sizeof (buf), format, &tm); > - printf ("%s\n", buf); > + char input[BUFSIZ]; > + while (fgets (input, BUFSIZ, stdin) && input[0]) { > + if (input[0] == '\n') { > + printf ("\n"); > + continue; > + } > + char *arrow; > + char *comment = strrchr (input, '#'); > + arrow = strstr (input, "->"); > + round = PARSE_TIME_NO_ROUND; > + if (arrow > input) { > + switch (arrow[-1]) { > + case '^': round = PARSE_TIME_ROUND_UP; arrow--; break; > + case 'v': round = PARSE_TIME_ROUND_DOWN; arrow--; break; > + default: break; > + } > + } > + if (arrow) > + *arrow = 0; > + else > + arrow = input + strlen (input); /* XXX: comment is not handled */ > + while (arrow > input && arrow[-1] == '\n') > + arrow--; > + *arrow-- = 0; > + > + r = parse_time_string (input, &result, nowp, round); > + char resstr[BUFSIZ]; > + if (r) > + snprintf (resstr, sizeof(resstr), "parse_time_string() error: %d", r); > + else if (!localtime_r (&result, &tm)) > + snprintf (resstr, sizeof(resstr), "localtime(result) error"); > + else > + strftime (resstr, sizeof (resstr), format, &tm); > + > + char buf[BUFSIZ]; > + snprintf (buf, sizeof(buf), "%s%s-> %s", input, get_round_str (round), resstr); > + if (!comment) > + printf ("%s\n", buf); > + else > + printf ("%-*s%s", (int)(comment - input), buf, comment); > + } > > return 0; > }