unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [RFC PATCH 00/14] modular mail stores based on URIs
@ 2012-06-25 20:41 Ethan Glasser-Camp
  2012-06-25 20:41 ` [RFC PATCH 01/14] All access to mail files goes through the mailstore module Ethan Glasser-Camp
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-06-25 20:41 UTC (permalink / raw)
  To: notmuch

Hi guys,

Sorry for dropping off the mailing list after I sent my last patch series (http://notmuchmail.org/pipermail/notmuch/2012/009470.html). I haven't had the time or a stable enough email address to really follow notmuch development :)

I signed onto #notmuch a week or two ago and asked what I would need to do to get a feature like this one into mainline. j4ni told me that he agreed with the feedback to my original patch series, and suggested that I follow mjw1009's advice of having filenames encode all information about mail storage transparently, and that this would solve the problem with the original patch series of sprinkling mail storage parameters all over the place. bremner suggested that he had been thinking about how to support mbox or other multiple-message archives, and also commented that he wasn't crazy about so much of the API being in strings.

Based on this advice, I decided to revise my approach to this patchset, one that is based around the stated desire to work with mbox formats. This approach, in contrast to the mailstore approach that Michal Sojka proposed and I revised, encodes all mail access information as URIs. These URIs are stored in Xapian the way that relative paths are right now. Examples might be:

    maildir:///home/ethan/Mail/folder/cur/filename:2,S
    mbox:///home/ethan/Mail/folder/file.mbox#byte-offset+lenght
    couchdb://ethan:password@localhost:8080/some-doc-id

Personally, this isn't my favorite approach, for the following reasons:

1. Notmuch, at some point in its history, chose to store file paths relative to a "mail database", with the intent that if this mail database was moved, filenames would not change and everything would Just Work (tm). The above scheme completely reverses this design decision, and in general completely breaks this relocatability. I don't see any easy way to handle this problem. This isn't just a wishlist feature; at least two things in the test suite (caching of corpus.mail, and the atomicity tests) rely on this behavior.

2. Mail access information, i.e. open connections, etc. can only be stored in variables global to the mailstore code, and cannot be stored as private members of a mailstore object. This is more an aesthetic concern than a functional one.

Anyhow, the following (enormous) patch series implement this design. I used uriparser as an external library to parse URIs. The API for this library is a little idiosyncratic. uriparser supports parsing Unicode URIs (strings of wchar_t), but I just used ASCII filenames because I think that's what comes out of Xapian.

Patch 11 is borrowed directly from the last patch series.

The last four or five patches add mbox support, including a few tests. That part of the series is still very first-draft: I added a new config option to specify URIs to scan, and ">From " lines still need to be unescaped. However, we support scanning mbox files whether messages have content-length or not.

I will try to receive feedback on this series more gratefully than the last one. :)

Thanks again for your time,

Ethan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC PATCH 01/14] All access to mail files goes through the mailstore module
  2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
@ 2012-06-25 20:41 ` Ethan Glasser-Camp
  2012-06-28 20:48   ` Mark Walters
  2012-06-25 20:41 ` [RFC PATCH 02/14] Introduce uriparser Ethan Glasser-Camp
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-06-25 20:41 UTC (permalink / raw)
  To: notmuch

This commit introduces the mailstore module which provides two
functions, notmuch_mailstore_open and notmuch_mailstore_close. These
functions are currently just stub calls to fopen and fclose, but later
can be made more complex in order to support mail storage systems
where one message might not be one file.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 lib/Makefile.local    |    1 +
 lib/database.cc       |    2 +-
 lib/index.cc          |    2 +-
 lib/mailstore.c       |   34 ++++++++++++++++++++++++
 lib/message-file.c    |    6 ++---
 lib/notmuch-private.h |    3 +++
 lib/notmuch.h         |   16 +++++++++++
 lib/sha1.c            |   70 +++++++++++++++++++++++++++++++++++++------------
 mime-node.c           |    4 +--
 notmuch-show.c        |   12 ++++-----
 10 files changed, 120 insertions(+), 30 deletions(-)
 create mode 100644 lib/mailstore.c

diff --git a/lib/Makefile.local b/lib/Makefile.local
index 8a9aa28..cfc77bb 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -51,6 +51,7 @@ libnotmuch_c_srcs =		\
 	$(dir)/filenames.c	\
 	$(dir)/string-list.c	\
 	$(dir)/libsha1.c	\
+	$(dir)/mailstore.c	\
 	$(dir)/message-file.c	\
 	$(dir)/messages.c	\
 	$(dir)/sha1.c		\
diff --git a/lib/database.cc b/lib/database.cc
index 761dc1a..c035edc 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1773,7 +1773,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
 	if (message_id == NULL ) {
 	    /* No message-id at all, let's generate one by taking a
 	     * hash over the file's contents. */
-	    char *sha1 = notmuch_sha1_of_file (filename);
+	    char *sha1 = notmuch_sha1_of_message (filename);
 
 	    /* If that failed too, something is really wrong. Give up. */
 	    if (sha1 == NULL) {
diff --git a/lib/index.cc b/lib/index.cc
index e377732..b607e82 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -441,7 +441,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
 	initialized = 1;
     }
 
-    file = fopen (filename, "r");
+    file = notmuch_mailstore_open (filename);
     if (! file) {
 	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
 	ret = NOTMUCH_STATUS_FILE_ERROR;
diff --git a/lib/mailstore.c b/lib/mailstore.c
new file mode 100644
index 0000000..48acd47
--- /dev/null
+++ b/lib/mailstore.c
@@ -0,0 +1,34 @@
+/* mailstore.c - code to access individual messages
+ *
+ * Copyright © 2009 Carl Worth
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see http://www.gnu.org/licenses/ .
+ *
+ * Author: Carl Worth <cworth@cworth.org>
+ */
+#include <stdio.h>
+
+#include "notmuch-private.h"
+
+FILE *
+notmuch_mailstore_open (const char *filename)
+{
+    return fopen (filename, "r");
+}
+
+int
+notmuch_mailstore_close (FILE *file)
+{
+    return fclose (file);
+}
diff --git a/lib/message-file.c b/lib/message-file.c
index 915aba8..271389c 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -86,7 +86,7 @@ _notmuch_message_file_destructor (notmuch_message_file_t *message)
 	g_hash_table_destroy (message->headers);
 
     if (message->file)
-	fclose (message->file);
+	notmuch_mailstore_close (message->file);
 
     return 0;
 }
@@ -104,7 +104,7 @@ _notmuch_message_file_open_ctx (void *ctx, const char *filename)
 
     talloc_set_destructor (message, _notmuch_message_file_destructor);
 
-    message->file = fopen (filename, "r");
+    message->file = notmuch_mailstore_open (filename);
     if (message->file == NULL)
 	goto FAIL;
 
@@ -361,7 +361,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
     }
 
     if (message->parsing_finished) {
-        fclose (message->file);
+        notmuch_mailstore_close (message->file);
         message->file = NULL;
     }
 
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index bfb4111..5dbe821 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -468,6 +468,9 @@ notmuch_sha1_of_string (const char *str);
 char *
 notmuch_sha1_of_file (const char *filename);
 
+char *
+notmuch_sha1_of_message (const char *filename);
+
 /* string-list.c */
 
 typedef struct _notmuch_string_node {
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 3633bed..0ca367b 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -1233,6 +1233,22 @@ notmuch_message_thaw (notmuch_message_t *message);
 void
 notmuch_message_destroy (notmuch_message_t *message);
 
+/* Get access to the underlying data of a message.
+ *
+ * Right now, all messages are simple files in maildirs, but this is
+ * hopefully subject to change.
+ *
+ * If the returned FILE* is not NULL, be sure to call
+ * notmuch_mailstore_close when you're done with it.
+ */
+FILE *
+notmuch_mailstore_open (const char *filename);
+
+/* Release any resources associated with this file.
+ */
+int
+notmuch_mailstore_close (FILE *file);
+
 /* Is the given 'tags' iterator pointing at a valid tag.
  *
  * When this function returns TRUE, notmuch_tags_get will return a
diff --git a/lib/sha1.c b/lib/sha1.c
index cc48108..462c07e 100644
--- a/lib/sha1.c
+++ b/lib/sha1.c
@@ -64,19 +64,11 @@ notmuch_sha1_of_string (const char *str)
     return _hex_of_sha1_digest (digest);
 }
 
-/* Create a hexadecimal string version of the SHA-1 digest of the
- * contents of the named file.
- *
- * This function returns a newly allocated string which the caller
- * should free() when finished.
- *
- * If any error occurs while reading the file, (permission denied,
- * file not found, etc.), this function returns NULL.
+/* Internal function to feed the contents of a FILE * to the sha1 functions.
  */
-char *
-notmuch_sha1_of_file (const char *filename)
-{
-    FILE *file;
+
+static char *
+_notmuch_sha1_of_filep (FILE *file){
 #define BLOCK_SIZE 4096
     unsigned char block[BLOCK_SIZE];
     size_t bytes_read;
@@ -84,14 +76,10 @@ notmuch_sha1_of_file (const char *filename)
     unsigned char digest[SHA1_DIGEST_SIZE];
     char *result;
 
-    file = fopen (filename, "r");
-    if (file == NULL)
-	return NULL;
-
     sha1_begin (&sha1);
 
     while (1) {
-	bytes_read = fread (block, 1, 4096, file);
+	bytes_read = fread (block, 1, BLOCK_SIZE, file);
 	if (bytes_read == 0) {
 	    if (feof (file)) {
 		break;
@@ -108,8 +96,56 @@ notmuch_sha1_of_file (const char *filename)
 
     result = _hex_of_sha1_digest (digest);
 
+    return result;
+}
+
+/* Create a hexadecimal string version of the SHA-1 digest of the
+ * contents of the named file.
+ *
+ * This function returns a newly allocated string which the caller
+ * should free() when finished.
+ *
+ * If any error occurs while reading the file, (permission denied,
+ * file not found, etc.), this function returns NULL.
+ */
+char *
+notmuch_sha1_of_file (const char *filename)
+{
+    FILE *file;
+    char *result;
+    file = fopen (filename, "r");
+    if (file == NULL)
+	return NULL;
+
+    result = _notmuch_sha1_of_filep (file);
+
     fclose (file);
 
     return result;
 }
 
+/* Create a hexadecimal string version of the SHA-1 digest of the
+ * contents of the named message, which is accessed via a mailstore.
+ *
+ * This function returns a newly allocated string which the caller
+ * should free() when finished.
+ *
+ * If any error occurs while reading the file, (permission denied,
+ * file not found, etc.), this function returns NULL.
+ */
+char *
+notmuch_sha1_of_message (const char *filename)
+{
+    FILE *file;
+    char *result;
+
+    file = notmuch_mailstore_open (filename);
+    if (file == NULL)
+	return NULL;
+
+    result = _notmuch_sha1_of_filep (file);
+
+    notmuch_mailstore_close (file);
+
+    return result;
+}
diff --git a/mime-node.c b/mime-node.c
index 97e8b48..a5c60d0 100644
--- a/mime-node.c
+++ b/mime-node.c
@@ -49,7 +49,7 @@ _mime_node_context_free (mime_node_context_t *res)
 	g_object_unref (res->stream);
 
     if (res->file)
-	fclose (res->file);
+	notmuch_mailstore_close (res->file);
 
     return 0;
 }
@@ -79,7 +79,7 @@ mime_node_open (const void *ctx, notmuch_message_t *message,
     }
     talloc_set_destructor (mctx, _mime_node_context_free);
 
-    mctx->file = fopen (filename, "r");
+    mctx->file = notmuch_mailstore_open (filename);
     if (! mctx->file) {
 	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
 	status = NOTMUCH_STATUS_FILE_ERROR;
diff --git a/notmuch-show.c b/notmuch-show.c
index 8247f1d..2847d25 100644
--- a/notmuch-show.c
+++ b/notmuch-show.c
@@ -692,7 +692,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
 	INTERNAL_ERROR ("format_part_mbox requires a root part");
 
     filename = notmuch_message_get_filename (message);
-    file = fopen (filename, "r");
+    file = notmuch_mailstore_open (filename);
     if (file == NULL) {
 	fprintf (stderr, "Failed to open %s: %s\n",
 		 filename, strerror (errno));
@@ -716,7 +716,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
 
     printf ("\n");
 
-    fclose (file);
+    notmuch_mailstore_close (file);
 
     return NOTMUCH_STATUS_SUCCESS;
 }
@@ -739,7 +739,7 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
 	    return NOTMUCH_STATUS_FILE_ERROR;
 	}
 
-	file = fopen (filename, "r");
+	file = notmuch_mailstore_open (filename);
 	if (file == NULL) {
 	    fprintf (stderr, "Error: Cannot open file %s: %s\n", filename, strerror (errno));
 	    return NOTMUCH_STATUS_FILE_ERROR;
@@ -749,18 +749,18 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
 	    size = fread (buf, 1, sizeof (buf), file);
 	    if (ferror (file)) {
 		fprintf (stderr, "Error: Read failed from %s\n", filename);
-		fclose (file);
+		notmuch_mailstore_close (file);
 		return NOTMUCH_STATUS_FILE_ERROR;
 	    }
 
 	    if (fwrite (buf, size, 1, stdout) != 1) {
 		fprintf (stderr, "Error: Write failed\n");
-		fclose (file);
+		notmuch_mailstore_close (file);
 		return NOTMUCH_STATUS_FILE_ERROR;
 	    }
 	}
 
-	fclose (file);
+	notmuch_mailstore_close (file);
 	return NOTMUCH_STATUS_SUCCESS;
     }
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 02/14] Introduce uriparser
  2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
  2012-06-25 20:41 ` [RFC PATCH 01/14] All access to mail files goes through the mailstore module Ethan Glasser-Camp
@ 2012-06-25 20:41 ` Ethan Glasser-Camp
  2012-06-25 20:41 ` [RFC PATCH 03/14] mailstore can read from maildir: URLs Ethan Glasser-Camp
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-06-25 20:41 UTC (permalink / raw)
  To: notmuch

Seeing as there is no glib-standard way to parse URIs, an external
library is needed. This commit introduces another program in compat/
and a stanza in ./configure to test if uriparser is there.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 Makefile.local          |    2 +-
 compat/have_uriparser.c |   17 +++++++++++++++++
 configure               |   23 ++++++++++++++++++++---
 3 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100644 compat/have_uriparser.c

diff --git a/Makefile.local b/Makefile.local
index a890df2..084f44e 100644
--- a/Makefile.local
+++ b/Makefile.local
@@ -41,7 +41,7 @@ PV_FILE=bindings/python/notmuch/version.py
 # Smash together user's values with our extra values
 FINAL_CFLAGS = -DNOTMUCH_VERSION=$(VERSION) $(CFLAGS) $(WARN_CFLAGS) $(CONFIGURE_CFLAGS) $(extra_cflags)
 FINAL_CXXFLAGS = $(CXXFLAGS) $(WARN_CXXFLAGS) $(CONFIGURE_CXXFLAGS) $(extra_cflags) $(extra_cxxflags)
-FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS)
+FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS) $(URIPARSER_LDFLAGS)
 FINAL_NOTMUCH_LINKER = CC
 ifneq ($(LINKER_RESOLVES_LIBRARY_DEPENDENCIES),1)
 FINAL_NOTMUCH_LDFLAGS += $(CONFIGURE_LDFLAGS)
diff --git a/compat/have_uriparser.c b/compat/have_uriparser.c
new file mode 100644
index 0000000..d79e51d
--- /dev/null
+++ b/compat/have_uriparser.c
@@ -0,0 +1,17 @@
+#include <uriparser/Uri.h>
+
+int
+main (int argc, char *argv[])
+{
+    UriParserStateA state;
+    UriUriA uri;
+    char *uriS = NULL;
+
+    state.uri = &uri;
+    if (uriParseUriA (&state, uriS) != URI_SUCCESS) {
+        /* Failure */
+        uriFreeUriMembersA (&uri);
+    }
+
+    return 0;
+}
diff --git a/configure b/configure
index 3fad424..80aa13c 100755
--- a/configure
+++ b/configure
@@ -313,6 +313,19 @@ else
     errors=$((errors + 1))
 fi
 
+printf "Checking for uriparser... "
+if ${CC} -o compat/have_uriparser "$srcdir"/compat/have_uriparser.c -luriparser > /dev/null 2>&1
+then
+    printf "Yes.\n"
+    uriparser_ldflags="-luriparser"
+    have_uriparser=1
+else
+    printf "No.\n"
+    have_uriparser=0
+    errors=$((errors + 1))
+fi
+rm -f compat/have_uriparser
+
 printf "Checking for valgrind development files... "
 if pkg-config --exists valgrind; then
     printf "Yes.\n"
@@ -431,11 +444,11 @@ case a simple command will install everything you need. For example:
 
 On Debian and similar systems:
 
-	sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev
+	sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev liburiparser-dev
 
 Or on Fedora and similar systems:
 
-	sudo yum install xapian-core-devel gmime-devel libtalloc-devel
+	sudo yum install xapian-core-devel gmime-devel libtalloc-devel uriparser-devel
 
 On other systems, similar commands can be used, but the details of the
 package names may be different.
@@ -669,6 +682,9 @@ GMIME_LDFLAGS = ${gmime_ldflags}
 TALLOC_CFLAGS = ${talloc_cflags}
 TALLOC_LDFLAGS = ${talloc_ldflags}
 
+# Flags needed to link against uriparser
+URIPARSER_LDFLAGS = ${uriparser_ldflags}
+
 # Flags needed to have linker set rpath attribute
 RPATH_LDFLAGS = ${rpath_ldflags}
 
@@ -698,5 +714,6 @@ CONFIGURE_CXXFLAGS = -DHAVE_GETLINE=\$(HAVE_GETLINE) \$(GMIME_CFLAGS)    \\
 		     \$(TALLOC_CFLAGS) -DHAVE_VALGRIND=\$(HAVE_VALGRIND) \\
 		     \$(VALGRIND_CFLAGS) \$(XAPIAN_CXXFLAGS)             \\
                      -DHAVE_STRCASESTR=\$(HAVE_STRCASESTR)
-CONFIGURE_LDFLAGS =  \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(XAPIAN_LDFLAGS)
+CONFIGURE_LDFLAGS =  \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(XAPIAN_LDFLAGS) \\
+                     \$(URIPARSER_LDFLAGS)
 EOF
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 03/14] mailstore can read from maildir: URLs
  2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
  2012-06-25 20:41 ` [RFC PATCH 01/14] All access to mail files goes through the mailstore module Ethan Glasser-Camp
  2012-06-25 20:41 ` [RFC PATCH 02/14] Introduce uriparser Ethan Glasser-Camp
@ 2012-06-25 20:41 ` Ethan Glasser-Camp
  2012-06-25 20:41 ` [RFC PATCH 04/14] Not all filenames need to be converted to absolute paths Ethan Glasser-Camp
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-06-25 20:41 UTC (permalink / raw)
  To: notmuch

No code uses this yet.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 lib/mailstore.c |   37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/lib/mailstore.c b/lib/mailstore.c
index 48acd47..ae02c12 100644
--- a/lib/mailstore.c
+++ b/lib/mailstore.c
@@ -17,14 +17,49 @@
  *
  * Author: Carl Worth <cworth@cworth.org>
  */
+#include <uriparser/Uri.h>
 #include <stdio.h>
 
 #include "notmuch-private.h"
 
+static FILE *
+notmuch_mailstore_basic_open (const char *filename)
+{
+    return fopen (filename, "r");
+}
+
 FILE *
 notmuch_mailstore_open (const char *filename)
 {
-    return fopen (filename, "r");
+    FILE *ret = NULL;
+    UriUriA parsed;
+    UriParserStateA state;
+    state.uri = &parsed;
+    if (uriParseUriA (&state, filename) != URI_SUCCESS) {
+        /* Failure. Fall back to fopen and hope for the best. */
+        ret = notmuch_mailstore_basic_open (filename);
+        goto DONE;
+    }
+
+    if (parsed.scheme.first == NULL) {
+        /* No scheme. Probably not really a URL but just an ordinary filename.
+         * Fall back to fopen for backwards compatibility. */
+        ret = notmuch_mailstore_basic_open (filename);
+        goto DONE;
+    }
+
+    if (0 == strncmp (parsed.scheme.first, "maildir",
+                      parsed.scheme.afterLast-parsed.scheme.first)) {
+        /* Maildir URI of the form maildir:///path/to/file.
+         * We want to fopen("/path/to/file").
+         * pathHead starts at "path/to/file". */
+        ret = notmuch_mailstore_basic_open (parsed.pathHead->text.first - 1);
+        goto DONE;
+    }
+
+DONE:
+    uriFreeUriMembersA (&parsed);
+    return ret;
 }
 
 int
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 04/14] Not all filenames need to be converted to absolute paths
  2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
                   ` (2 preceding siblings ...)
  2012-06-25 20:41 ` [RFC PATCH 03/14] mailstore can read from maildir: URLs Ethan Glasser-Camp
@ 2012-06-25 20:41 ` Ethan Glasser-Camp
  2012-06-27  9:17 ` [RFC PATCH 00/14] modular mail stores based on URIs Mark Walters
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-06-25 20:41 UTC (permalink / raw)
  To: notmuch

_notmuch_message_ensure_filename_list converts "relative" paths, such
as those stored in Xapian until now, to "absolute" paths. However,
URLs are already absolute, and prepending the database path will just
confuse matters.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 lib/message.cc |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 978de06..c9857f5 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -700,9 +700,17 @@ _notmuch_message_ensure_filename_list (notmuch_message_t *message)
 							  message->notmuch,
 							  directory_id);
 
-	if (strlen (directory))
-	    filename = talloc_asprintf (message, "%s/%s/%s",
-					db_path, directory, basename);
+	if (strlen (directory)) {
+	    /* If directory is a URI, we don't need to append the db_path;
+	     * it is already an absolute path. */
+	    /* This is just a quick hack instead of actually parsing the URL. */
+	    if (strstr (directory, "://") == NULL)
+		filename = talloc_asprintf (message, "%s/%s/%s",
+					    db_path, directory, basename);
+	    else
+		filename = talloc_asprintf (message, "%s/%s",
+					    directory, basename);
+	}
 	else
 	    filename = talloc_asprintf (message, "%s/%s",
 					db_path, basename);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
                   ` (3 preceding siblings ...)
  2012-06-25 20:41 ` [RFC PATCH 04/14] Not all filenames need to be converted to absolute paths Ethan Glasser-Camp
@ 2012-06-27  9:17 ` Mark Walters
  2012-06-28  7:39   ` Ethan
  2012-06-28 17:36 ` Jameson Graef Rollins
  2012-06-28 22:00 ` Mark Walters
  6 siblings, 1 reply; 31+ messages in thread
From: Mark Walters @ 2012-06-27  9:17 UTC (permalink / raw)
  To: Ethan Glasser-Camp, notmuch


Hi

On Mon, 25 Jun 2012, Ethan Glasser-Camp <ethan.glasser.camp@gmail.com> wrote:

> Hi guys,
>
> Sorry for dropping off the mailing list after I sent my last patch
> series (http://notmuchmail.org/pipermail/notmuch/2012/009470.html). I
> haven't had the time or a stable enough email address to really follow
> notmuch development :)
>
> I signed onto #notmuch a week or two ago and asked what I would need
> to do to get a feature like this one into mainline. j4ni told me that
> he agreed with the feedback to my original patch series, and suggested
> that I follow mjw1009's advice of having filenames encode all
> information about mail storage transparently, and that this would
> solve the problem with the original patch series of sprinkling mail
> storage parameters all over the place. bremner suggested that he had
> been thinking about how to support mbox or other multiple-message
> archives, and also commented that he wasn't crazy about so much of the
> API being in strings.
>
> Based on this advice, I decided to revise my approach to this
> patchset, one that is based around the stated desire to work with mbox
> formats. This approach, in contrast to the mailstore approach that
> Michal Sojka proposed and I revised, encodes all mail access
> information as URIs. These URIs are stored in Xapian the way that
> relative paths are right now. Examples might be:
>
>     maildir:///home/ethan/Mail/folder/cur/filename:2,S
>     mbox:///home/ethan/Mail/folder/file.mbox#byte-offset+lenght
>     couchdb://ethan:password@localhost:8080/some-doc-id

First, thank you for resubmitting this: it is definitely a feature I
would like to see in notmuch. And at a first glance I like the series
(and I will try and review it over the next few days).

> Personally, this isn't my favorite approach, for the following reasons:
>
> 1. Notmuch, at some point in its history, chose to store file paths
> relative to a "mail database", with the intent that if this mail
> database was moved, filenames would not change and everything would
> Just Work (tm). The above scheme completely reverses this design
> decision, and in general completely breaks this relocatability. I
> don't see any easy way to handle this problem. This isn't just a
> wishlist feature; at least two things in the test suite (caching of
> corpus.mail, and the atomicity tests) rely on this behavior.

Why can't the URI just store a relative path, at least for maildir://
and mbox:// ? It is purely internal to notmuch so it doesn't need to be
very standard.

> 2. Mail access information, i.e. open connections, etc. can only be
> stored in variables global to the mailstore code, and cannot be stored
> as private members of a mailstore object. This is more an aesthetic
> concern than a functional one.
>
> Anyhow, the following (enormous) patch series implement this design. I
> used uriparser as an external library to parse URIs. The API for this
> library is a little idiosyncratic. uriparser supports parsing Unicode
> URIs (strings of wchar_t), but I just used ASCII filenames because I
> think that's what comes out of Xapian.

Why use a library? Isn't it just a question of does the string contain
// and, if so, splitting it? I guess that // is a nice separator as I
think we can assume that a true path does not contain it (since a
filename cannot contain /).

> Patch 11 is borrowed directly from the last patch series.
>
> The last four or five patches add mbox support, including a few
> tests. That part of the series is still very first-draft: I added a
> new config option to specify URIs to scan, and ">From " lines still
> need to be unescaped. However, we support scanning mbox files whether
> messages have content-length or not.

I have an idea that mbox byte-locations change when messages are marked
as read (amongst other things). It might be worth saying that this
initial implementation only works for unchanging mboxs (rather than the
append only condition that you currently say). But I have not got as far
as applying/testing the series yet.

> I will try to receive feedback on this series more gratefully than the
> last one. :)

Just to say all of the above are genuine questions (not requests for you
to rewrite stuff!) to try and understand the series.

Best wishes

Mark

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-27  9:17 ` [RFC PATCH 00/14] modular mail stores based on URIs Mark Walters
@ 2012-06-28  7:39   ` Ethan
  2012-06-28 15:13     ` David Bremner
  2012-06-28 20:45     ` Mark Walters
  0 siblings, 2 replies; 31+ messages in thread
From: Ethan @ 2012-06-28  7:39 UTC (permalink / raw)
  To: Mark Walters; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 3992 bytes --]

I sent this at first as a reply-only-to-sender. Oops! Sorry Mark for the
double send.

On Wed, Jun 27, 2012 at 5:17 AM, Mark Walters <markwalters1009@gmail.com>wrote:

> > Personally, this isn't my favorite approach, for the following reasons:
> >
> > 1. Notmuch, at some point in its history, chose to store file paths
> > relative to a "mail database", with the intent that if this mail
> > database was moved, filenames would not change and everything would
> > Just Work (tm). The above scheme completely reverses this design
> > decision, and in general completely breaks this relocatability. I
> > don't see any easy way to handle this problem. This isn't just a
> > wishlist feature; at least two things in the test suite (caching of
> > corpus.mail, and the atomicity tests) rely on this behavior.
>
> Why can't the URI just store a relative path, at least for maildir://
> and mbox:// ? It is purely internal to notmuch so it doesn't need to be
> very standard.
>

Well, relative to where? This is especially relevant now that we can have
multiple mail stores. It sounds like you are suggesting that all mbox://
URIs are relative to an "mbox root", but the fundamental question is how to
pass that information from the configuration into the library.

Even using configuration itself may be problematic, because only the CLI
uses the configuration, and language bindings like Python and Ruby might
get out of sync! (But note also that the Python bindings currently use
.notmuch-config to find the database path, so maybe it's not a big deal.)

If I could do whatever I wanted, every mailstore would get registered
somehow and the URIs could use those registered names to specify what
they're relative to: maybe using hostname, such as
maildir://university-mail/some-mail-file, mbox://old-unix-system/some.mbox.
Then changing these names in .notmuch-config would be fine. I just don't
know how to pass that configuration information without an approach like in
the past patch series.

 > 2. Mail access information, i.e. open connections, etc. can only be
> > stored in variables global to the mailstore code, and cannot be stored
> > as private members of a mailstore object. This is more an aesthetic
> > concern than a functional one.
> >
> > Anyhow, the following (enormous) patch series implement this design. I
> > used uriparser as an external library to parse URIs. The API for this
> > library is a little idiosyncratic. uriparser supports parsing Unicode
> > URIs (strings of wchar_t), but I just used ASCII filenames because I
> > think that's what comes out of Xapian.
>
> Why use a library? Isn't it just a question of does the string contain
> // and, if so, splitting it? I guess that // is a nice separator as I
> think we can assume that a true path does not contain it (since a
> filename cannot contain /).
>

The URIs are true URIs. Filenames are provided by the "path" segment of the
uri -- everything from the first slash after the hostname up to a ? for
query arguments. My concern was that filenames could (in theory) contain #
or ?, and in practice they contain : (maildir flags). I figured it was
better to do it right.

 > Patch 11 is borrowed directly from the last patch series.
> >
> > The last four or five patches add mbox support, including a few
> > tests. That part of the series is still very first-draft: I added a
> > new config option to specify URIs to scan, and ">From " lines still
> > need to be unescaped. However, we support scanning mbox files whether
> > messages have content-length or not.
>
> I have an idea that mbox byte-locations change when messages are marked
> as read (amongst other things). It might be worth saying that this
> initial implementation only works for unchanging mboxs (rather than the
> append only condition that you currently say). But I have not got as far
> as applying/testing the series yet.
>

Yeah, I don't even know how an mbox message gets flagged read and I don't
know how I would support it.

Ethan

[-- Attachment #2: Type: text/html, Size: 5020 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-28  7:39   ` Ethan
@ 2012-06-28 15:13     ` David Bremner
  2012-06-28 20:41       ` Robert Horn
  2012-06-28 20:45     ` Mark Walters
  1 sibling, 1 reply; 31+ messages in thread
From: David Bremner @ 2012-06-28 15:13 UTC (permalink / raw)
  To: Ethan, Mark Walters; +Cc: notmuch

Ethan <ethan.glasser.camp@gmail.com> writes:
>
> Yeah, I don't even know how an mbox message gets flagged read and I don't
> know how I would support it.
>

I think read only access to mboxes is fine. Yes, somebody will be
unhappy, but the only convincing argument I have heard for mboxes or
similar formats is archival use.

d

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
                   ` (4 preceding siblings ...)
  2012-06-27  9:17 ` [RFC PATCH 00/14] modular mail stores based on URIs Mark Walters
@ 2012-06-28 17:36 ` Jameson Graef Rollins
  2012-06-28 18:39   ` Ethan
  2012-06-28 22:00 ` Mark Walters
  6 siblings, 1 reply; 31+ messages in thread
From: Jameson Graef Rollins @ 2012-06-28 17:36 UTC (permalink / raw)
  To: Ethan Glasser-Camp, notmuch

[-- Attachment #1: Type: text/plain, Size: 285 bytes --]

Hi, Ethan.  I haven't really looked at this patch set at all yet, but I
did notice that it's huge, and includes multiple big changes.  Is there
any way you can break these up into separate smaller and more digestable
series?  It would certainly help relieve the review burden.

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-28 17:36 ` Jameson Graef Rollins
@ 2012-06-28 18:39   ` Ethan
  0 siblings, 0 replies; 31+ messages in thread
From: Ethan @ 2012-06-28 18:39 UTC (permalink / raw)
  To: Jameson Graef Rollins; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

It is pretty big and there are a couple places where the series could be
simplified, the first patch in particular. I will break it out and resubmit
piecewise but I'd like to know how to address these particular issues:

1. Are URIs the way to specify individual messages, despite bremner's
concerns about too much of the API being strings? Is adding another library
is the easiest way to parse URIs?

2. Is it OK to break maildir relocatability, or is it worth it to pass more
config to the library (perhaps by adding it to the notmuch_database_t
object)?

3. Is a global variable in the library acceptable? (I don't see any
others.) If not, how to store mailstore state?

The patch series is really more like a very rough draft to try to give
these concerns a context (specifically, mbox support).

Ethan

[-- Attachment #2: Type: text/html, Size: 871 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-28 15:13     ` David Bremner
@ 2012-06-28 20:41       ` Robert Horn
  0 siblings, 0 replies; 31+ messages in thread
From: Robert Horn @ 2012-06-28 20:41 UTC (permalink / raw)
  To: David Bremner, Ethan, Mark Walters; +Cc: notmuch

David Bremner <david@tethera.net> writes:

> Ethan <ethan.glasser.camp@gmail.com> writes:
>>
>> Yeah, I don't even know how an mbox message gets flagged read and I don't
>> know how I would support it.
>>
>
> I think read only access to mboxes is fine. Yes, somebody will be
> unhappy, but the only convincing argument I have heard for mboxes or
> similar formats is archival use.
>

That would also deal with the potential that edits shuffle the mbox
structure, breaking all references except those that use the message-ID
as a search string.  Alternatively, the URI form will need to use
message-ID to avoid breakage when an mbox change includes garbage
collection or relocation of email messages within the mbox file.

R Horn
rjhorn@alum.mit.edu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-28  7:39   ` Ethan
  2012-06-28 15:13     ` David Bremner
@ 2012-06-28 20:45     ` Mark Walters
  2012-07-01 16:02       ` Ethan
  1 sibling, 1 reply; 31+ messages in thread
From: Mark Walters @ 2012-06-28 20:45 UTC (permalink / raw)
  To: Ethan; +Cc: notmuch

On Thu, 28 Jun 2012, Ethan <ethan.glasser.camp@gmail.com> wrote:
> I sent this at first as a reply-only-to-sender. Oops! Sorry Mark for the
> double send.
>
> On Wed, Jun 27, 2012 at 5:17 AM, Mark Walters <markwalters1009@gmail.com>wrote:
>
>> > Personally, this isn't my favorite approach, for the following reasons:
>> >
>> > 1. Notmuch, at some point in its history, chose to store file paths
>> > relative to a "mail database", with the intent that if this mail
>> > database was moved, filenames would not change and everything would
>> > Just Work (tm). The above scheme completely reverses this design
>> > decision, and in general completely breaks this relocatability. I
>> > don't see any easy way to handle this problem. This isn't just a
>> > wishlist feature; at least two things in the test suite (caching of
>> > corpus.mail, and the atomicity tests) rely on this behavior.
>>
>> Why can't the URI just store a relative path, at least for maildir://
>> and mbox:// ? It is purely internal to notmuch so it doesn't need to be
>> very standard.
>>
>
> Well, relative to where? This is especially relevant now that we can have
> multiple mail stores. It sounds like you are suggesting that all mbox://
> URIs are relative to an "mbox root", but the fundamental question is how to
> pass that information from the configuration into the library.

I was thinking of just having one mail root and inside that there could
be maildirs and mboxes. Everything would still be relative to the root.

> Even using configuration itself may be problematic, because only the CLI
> uses the configuration, and language bindings like Python and Ruby might
> get out of sync! (But note also that the Python bindings currently use
> .notmuch-config to find the database path, so maybe it's not a big deal.)
>
> If I could do whatever I wanted, every mailstore would get registered
> somehow and the URIs could use those registered names to specify what
> they're relative to: maybe using hostname, such as
> maildir://university-mail/some-mail-file, mbox://old-unix-system/some.mbox.
> Then changing these names in .notmuch-config would be fine. I just don't
> know how to pass that configuration information without an approach like in
> the past patch series.
>
>  > 2. Mail access information, i.e. open connections, etc. can only be
>> > stored in variables global to the mailstore code, and cannot be stored
>> > as private members of a mailstore object. This is more an aesthetic
>> > concern than a functional one.
>> >
>> > Anyhow, the following (enormous) patch series implement this design. I
>> > used uriparser as an external library to parse URIs. The API for this
>> > library is a little idiosyncratic. uriparser supports parsing Unicode
>> > URIs (strings of wchar_t), but I just used ASCII filenames because I
>> > think that's what comes out of Xapian.
>>
>> Why use a library? Isn't it just a question of does the string contain
>> // and, if so, splitting it? I guess that // is a nice separator as I
>> think we can assume that a true path does not contain it (since a
>> filename cannot contain /).
>>
>
> The URIs are true URIs. Filenames are provided by the "path" segment of the
> uri -- everything from the first slash after the hostname up to a ? for
> query arguments. My concern was that filenames could (in theory) contain #
> or ?, and in practice they contain : (maildir flags). I figured it was
> better to do it right.

This is similar to your question to Jamie:

>  1. Are URIs the way to specify individual messages, despite bremner's
>  concerns about too much of the API being strings? Is adding another library
>  is the easiest way to parse URIs?

In my opinion  the nice thing about using strings is that it does not require
any changes to the Xapian database to store them. I think using URIs may
not be best though as they seem to be annoying to parse (as filenames
can contain the same characters) and you seem to need to work around the
parser in some cases.

I wonder if the following would be practical: use // as the field
separator:

e.g. mbox://filename//start_of_message+length

I think 2 consecutive slashes // is about the only thing we can assume
is not in the path or filename. Since it is not in the filename I think
parsing should be trivial (thus avoiding the extra library).
 
Secondly, I would prefer to keep maildirs as just the bare file name: so
the existence of // can be the signal that there is some other
scheme. This is asymmetric, but is rather more backwardly compatible. 

I have read most of the patches and will send a couple of specific
comments but I completely agree with you that the first thing is to
decide the above.

Finally do note these are just my views and others may have very
different ideas!

Best wishes

Mark

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 01/14] All access to mail files goes through the mailstore module
  2012-06-25 20:41 ` [RFC PATCH 01/14] All access to mail files goes through the mailstore module Ethan Glasser-Camp
@ 2012-06-28 20:48   ` Mark Walters
  0 siblings, 0 replies; 31+ messages in thread
From: Mark Walters @ 2012-06-28 20:48 UTC (permalink / raw)
  To: Ethan Glasser-Camp, notmuch


On Mon, 25 Jun 2012, Ethan Glasser-Camp <ethan.glasser.camp@gmail.com> wrote:
> This commit introduces the mailstore module which provides two
> functions, notmuch_mailstore_open and notmuch_mailstore_close. These
> functions are currently just stub calls to fopen and fclose, but later
> can be made more complex in order to support mail storage systems
> where one message might not be one file.

I would prefer this patch split into two: one doing the stuff mentioned
in the commit message and one the notmuch_sha1_of_message/
notmuch_sha1_of_file change (or at least mention this latter change in
the commit message). 

Best wishes

Mark 

> Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
> ---
>  lib/Makefile.local    |    1 +
>  lib/database.cc       |    2 +-
>  lib/index.cc          |    2 +-
>  lib/mailstore.c       |   34 ++++++++++++++++++++++++
>  lib/message-file.c    |    6 ++---
>  lib/notmuch-private.h |    3 +++
>  lib/notmuch.h         |   16 +++++++++++
>  lib/sha1.c            |   70 +++++++++++++++++++++++++++++++++++++------------
>  mime-node.c           |    4 +--
>  notmuch-show.c        |   12 ++++-----
>  10 files changed, 120 insertions(+), 30 deletions(-)
>  create mode 100644 lib/mailstore.c
>
> diff --git a/lib/Makefile.local b/lib/Makefile.local
> index 8a9aa28..cfc77bb 100644
> --- a/lib/Makefile.local
> +++ b/lib/Makefile.local
> @@ -51,6 +51,7 @@ libnotmuch_c_srcs =		\
>  	$(dir)/filenames.c	\
>  	$(dir)/string-list.c	\
>  	$(dir)/libsha1.c	\
> +	$(dir)/mailstore.c	\
>  	$(dir)/message-file.c	\
>  	$(dir)/messages.c	\
>  	$(dir)/sha1.c		\
> diff --git a/lib/database.cc b/lib/database.cc
> index 761dc1a..c035edc 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -1773,7 +1773,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
>  	if (message_id == NULL ) {
>  	    /* No message-id at all, let's generate one by taking a
>  	     * hash over the file's contents. */
> -	    char *sha1 = notmuch_sha1_of_file (filename);
> +	    char *sha1 = notmuch_sha1_of_message (filename);
>  
>  	    /* If that failed too, something is really wrong. Give up. */
>  	    if (sha1 == NULL) {
> diff --git a/lib/index.cc b/lib/index.cc
> index e377732..b607e82 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -441,7 +441,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
>  	initialized = 1;
>      }
>  
> -    file = fopen (filename, "r");
> +    file = notmuch_mailstore_open (filename);
>      if (! file) {
>  	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
>  	ret = NOTMUCH_STATUS_FILE_ERROR;
> diff --git a/lib/mailstore.c b/lib/mailstore.c
> new file mode 100644
> index 0000000..48acd47
> --- /dev/null
> +++ b/lib/mailstore.c
> @@ -0,0 +1,34 @@
> +/* mailstore.c - code to access individual messages
> + *
> + * Copyright © 2009 Carl Worth
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see http://www.gnu.org/licenses/ .
> + *
> + * Author: Carl Worth <cworth@cworth.org>
> + */
> +#include <stdio.h>
> +
> +#include "notmuch-private.h"
> +
> +FILE *
> +notmuch_mailstore_open (const char *filename)
> +{
> +    return fopen (filename, "r");
> +}
> +
> +int
> +notmuch_mailstore_close (FILE *file)
> +{
> +    return fclose (file);
> +}
> diff --git a/lib/message-file.c b/lib/message-file.c
> index 915aba8..271389c 100644
> --- a/lib/message-file.c
> +++ b/lib/message-file.c
> @@ -86,7 +86,7 @@ _notmuch_message_file_destructor (notmuch_message_file_t *message)
>  	g_hash_table_destroy (message->headers);
>  
>      if (message->file)
> -	fclose (message->file);
> +	notmuch_mailstore_close (message->file);
>  
>      return 0;
>  }
> @@ -104,7 +104,7 @@ _notmuch_message_file_open_ctx (void *ctx, const char *filename)
>  
>      talloc_set_destructor (message, _notmuch_message_file_destructor);
>  
> -    message->file = fopen (filename, "r");
> +    message->file = notmuch_mailstore_open (filename);
>      if (message->file == NULL)
>  	goto FAIL;
>  
> @@ -361,7 +361,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
>      }
>  
>      if (message->parsing_finished) {
> -        fclose (message->file);
> +        notmuch_mailstore_close (message->file);
>          message->file = NULL;
>      }
>  
> diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> index bfb4111..5dbe821 100644
> --- a/lib/notmuch-private.h
> +++ b/lib/notmuch-private.h
> @@ -468,6 +468,9 @@ notmuch_sha1_of_string (const char *str);
>  char *
>  notmuch_sha1_of_file (const char *filename);
>  
> +char *
> +notmuch_sha1_of_message (const char *filename);
> +
>  /* string-list.c */
>  
>  typedef struct _notmuch_string_node {
> diff --git a/lib/notmuch.h b/lib/notmuch.h
> index 3633bed..0ca367b 100644
> --- a/lib/notmuch.h
> +++ b/lib/notmuch.h
> @@ -1233,6 +1233,22 @@ notmuch_message_thaw (notmuch_message_t *message);
>  void
>  notmuch_message_destroy (notmuch_message_t *message);
>  
> +/* Get access to the underlying data of a message.
> + *
> + * Right now, all messages are simple files in maildirs, but this is
> + * hopefully subject to change.
> + *
> + * If the returned FILE* is not NULL, be sure to call
> + * notmuch_mailstore_close when you're done with it.
> + */
> +FILE *
> +notmuch_mailstore_open (const char *filename);
> +
> +/* Release any resources associated with this file.
> + */
> +int
> +notmuch_mailstore_close (FILE *file);
> +
>  /* Is the given 'tags' iterator pointing at a valid tag.
>   *
>   * When this function returns TRUE, notmuch_tags_get will return a
> diff --git a/lib/sha1.c b/lib/sha1.c
> index cc48108..462c07e 100644
> --- a/lib/sha1.c
> +++ b/lib/sha1.c
> @@ -64,19 +64,11 @@ notmuch_sha1_of_string (const char *str)
>      return _hex_of_sha1_digest (digest);
>  }
>  
> -/* Create a hexadecimal string version of the SHA-1 digest of the
> - * contents of the named file.
> - *
> - * This function returns a newly allocated string which the caller
> - * should free() when finished.
> - *
> - * If any error occurs while reading the file, (permission denied,
> - * file not found, etc.), this function returns NULL.
> +/* Internal function to feed the contents of a FILE * to the sha1 functions.
>   */
> -char *
> -notmuch_sha1_of_file (const char *filename)
> -{
> -    FILE *file;
> +
> +static char *
> +_notmuch_sha1_of_filep (FILE *file){
>  #define BLOCK_SIZE 4096
>      unsigned char block[BLOCK_SIZE];
>      size_t bytes_read;
> @@ -84,14 +76,10 @@ notmuch_sha1_of_file (const char *filename)
>      unsigned char digest[SHA1_DIGEST_SIZE];
>      char *result;
>  
> -    file = fopen (filename, "r");
> -    if (file == NULL)
> -	return NULL;
> -
>      sha1_begin (&sha1);
>  
>      while (1) {
> -	bytes_read = fread (block, 1, 4096, file);
> +	bytes_read = fread (block, 1, BLOCK_SIZE, file);
>  	if (bytes_read == 0) {
>  	    if (feof (file)) {
>  		break;
> @@ -108,8 +96,56 @@ notmuch_sha1_of_file (const char *filename)
>  
>      result = _hex_of_sha1_digest (digest);
>  
> +    return result;
> +}
> +
> +/* Create a hexadecimal string version of the SHA-1 digest of the
> + * contents of the named file.
> + *
> + * This function returns a newly allocated string which the caller
> + * should free() when finished.
> + *
> + * If any error occurs while reading the file, (permission denied,
> + * file not found, etc.), this function returns NULL.
> + */
> +char *
> +notmuch_sha1_of_file (const char *filename)
> +{
> +    FILE *file;
> +    char *result;
> +    file = fopen (filename, "r");
> +    if (file == NULL)
> +	return NULL;
> +
> +    result = _notmuch_sha1_of_filep (file);
> +
>      fclose (file);
>  
>      return result;
>  }
>  
> +/* Create a hexadecimal string version of the SHA-1 digest of the
> + * contents of the named message, which is accessed via a mailstore.
> + *
> + * This function returns a newly allocated string which the caller
> + * should free() when finished.
> + *
> + * If any error occurs while reading the file, (permission denied,
> + * file not found, etc.), this function returns NULL.
> + */
> +char *
> +notmuch_sha1_of_message (const char *filename)
> +{
> +    FILE *file;
> +    char *result;
> +
> +    file = notmuch_mailstore_open (filename);
> +    if (file == NULL)
> +	return NULL;
> +
> +    result = _notmuch_sha1_of_filep (file);
> +
> +    notmuch_mailstore_close (file);
> +
> +    return result;
> +}
> diff --git a/mime-node.c b/mime-node.c
> index 97e8b48..a5c60d0 100644
> --- a/mime-node.c
> +++ b/mime-node.c
> @@ -49,7 +49,7 @@ _mime_node_context_free (mime_node_context_t *res)
>  	g_object_unref (res->stream);
>  
>      if (res->file)
> -	fclose (res->file);
> +	notmuch_mailstore_close (res->file);
>  
>      return 0;
>  }
> @@ -79,7 +79,7 @@ mime_node_open (const void *ctx, notmuch_message_t *message,
>      }
>      talloc_set_destructor (mctx, _mime_node_context_free);
>  
> -    mctx->file = fopen (filename, "r");
> +    mctx->file = notmuch_mailstore_open (filename);
>      if (! mctx->file) {
>  	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
>  	status = NOTMUCH_STATUS_FILE_ERROR;
> diff --git a/notmuch-show.c b/notmuch-show.c
> index 8247f1d..2847d25 100644
> --- a/notmuch-show.c
> +++ b/notmuch-show.c
> @@ -692,7 +692,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
>  	INTERNAL_ERROR ("format_part_mbox requires a root part");
>  
>      filename = notmuch_message_get_filename (message);
> -    file = fopen (filename, "r");
> +    file = notmuch_mailstore_open (filename);
>      if (file == NULL) {
>  	fprintf (stderr, "Failed to open %s: %s\n",
>  		 filename, strerror (errno));
> @@ -716,7 +716,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
>  
>      printf ("\n");
>  
> -    fclose (file);
> +    notmuch_mailstore_close (file);
>  
>      return NOTMUCH_STATUS_SUCCESS;
>  }
> @@ -739,7 +739,7 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
>  	    return NOTMUCH_STATUS_FILE_ERROR;
>  	}
>  
> -	file = fopen (filename, "r");
> +	file = notmuch_mailstore_open (filename);
>  	if (file == NULL) {
>  	    fprintf (stderr, "Error: Cannot open file %s: %s\n", filename, strerror (errno));
>  	    return NOTMUCH_STATUS_FILE_ERROR;
> @@ -749,18 +749,18 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
>  	    size = fread (buf, 1, sizeof (buf), file);
>  	    if (ferror (file)) {
>  		fprintf (stderr, "Error: Read failed from %s\n", filename);
> -		fclose (file);
> +		notmuch_mailstore_close (file);
>  		return NOTMUCH_STATUS_FILE_ERROR;
>  	    }
>  
>  	    if (fwrite (buf, size, 1, stdout) != 1) {
>  		fprintf (stderr, "Error: Write failed\n");
> -		fclose (file);
> +		notmuch_mailstore_close (file);
>  		return NOTMUCH_STATUS_FILE_ERROR;
>  	    }
>  	}
>  
> -	fclose (file);
> +	notmuch_mailstore_close (file);
>  	return NOTMUCH_STATUS_SUCCESS;
>      }
>  
> -- 
> 1.7.9.5
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
                   ` (5 preceding siblings ...)
  2012-06-28 17:36 ` Jameson Graef Rollins
@ 2012-06-28 22:00 ` Mark Walters
  2012-06-29  6:43   ` Ethan
  6 siblings, 1 reply; 31+ messages in thread
From: Mark Walters @ 2012-06-28 22:00 UTC (permalink / raw)
  To: Ethan Glasser-Camp, notmuch


Hi

Just a quick question: does this update the database with
maildir://files URIs instead of just filenames? In other words is it
safe to try out on actual mailstores?

(I used it on a trial system but when I reverted to master some things
seemed to stop working)

Best wishes

Mark

On Mon, 25 Jun 2012, Ethan Glasser-Camp <ethan.glasser.camp@gmail.com> wrote:
> Hi guys,
>
> Sorry for dropping off the mailing list after I sent my last patch series (http://notmuchmail.org/pipermail/notmuch/2012/009470.html). I haven't had the time or a stable enough email address to really follow notmuch development :)
>
> I signed onto #notmuch a week or two ago and asked what I would need to do to get a feature like this one into mainline. j4ni told me that he agreed with the feedback to my original patch series, and suggested that I follow mjw1009's advice of having filenames encode all information about mail storage transparently, and that this would solve the problem with the original patch series of sprinkling mail storage parameters all over the place. bremner suggested that he had been thinking about how to support mbox or other multiple-message archives, and also commented that he wasn't crazy about so much of the API being in strings.
>
> Based on this advice, I decided to revise my approach to this patchset, one that is based around the stated desire to work with mbox formats. This approach, in contrast to the mailstore approach that Michal Sojka proposed and I revised, encodes all mail access information as URIs. These URIs are stored in Xapian the way that relative paths are right now. Examples might be:
>
>     maildir:///home/ethan/Mail/folder/cur/filename:2,S
>     mbox:///home/ethan/Mail/folder/file.mbox#byte-offset+lenght
>     couchdb://ethan:password@localhost:8080/some-doc-id
>
> Personally, this isn't my favorite approach, for the following reasons:
>
> 1. Notmuch, at some point in its history, chose to store file paths relative to a "mail database", with the intent that if this mail database was moved, filenames would not change and everything would Just Work (tm). The above scheme completely reverses this design decision, and in general completely breaks this relocatability. I don't see any easy way to handle this problem. This isn't just a wishlist feature; at least two things in the test suite (caching of corpus.mail, and the atomicity tests) rely on this behavior.
>
> 2. Mail access information, i.e. open connections, etc. can only be stored in variables global to the mailstore code, and cannot be stored as private members of a mailstore object. This is more an aesthetic concern than a functional one.
>
> Anyhow, the following (enormous) patch series implement this design. I used uriparser as an external library to parse URIs. The API for this library is a little idiosyncratic. uriparser supports parsing Unicode URIs (strings of wchar_t), but I just used ASCII filenames because I think that's what comes out of Xapian.
>
> Patch 11 is borrowed directly from the last patch series.
>
> The last four or five patches add mbox support, including a few tests. That part of the series is still very first-draft: I added a new config option to specify URIs to scan, and ">From " lines still need to be unescaped. However, we support scanning mbox files whether messages have content-length or not.
>
> I will try to receive feedback on this series more gratefully than the last one. :)
>
> Thanks again for your time,
>
> Ethan
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-28 22:00 ` Mark Walters
@ 2012-06-29  6:43   ` Ethan
  2012-06-29  7:00     ` Mark Walters
  2012-06-29  7:43     ` Jani Nikula
  0 siblings, 2 replies; 31+ messages in thread
From: Ethan @ 2012-06-29  6:43 UTC (permalink / raw)
  To: Mark Walters; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 707 bytes --]

On Thu, Jun 28, 2012 at 6:00 PM, Mark Walters <markwalters1009@gmail.com>wrote:

>
> Just a quick question: does this update the database with
> maildir://files URIs instead of just filenames? In other words is it
> safe to try out on actual mailstores?
>

It doesn't change any of the existing filenames or do anything like a
database "upgrade". If you run notmuch new, it should add a URI-based
filename to each message, but I don't think it should remove old filenames,
so I would expect it to be fine. But I didn't expect anyone to try it on
their actual mail database.

(I used it on a trial system but when I reverted to master some things
> seemed to stop working)
>

Sorry for breaking it :)

Ethan

[-- Attachment #2: Type: text/html, Size: 1174 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-29  6:43   ` Ethan
@ 2012-06-29  7:00     ` Mark Walters
  2012-06-29  7:43     ` Jani Nikula
  1 sibling, 0 replies; 31+ messages in thread
From: Mark Walters @ 2012-06-29  7:00 UTC (permalink / raw)
  To: Ethan; +Cc: notmuch


Hi

On Fri, 29 Jun 2012, Ethan <ethan.glasser.camp@gmail.com> wrote:
> On Thu, Jun 28, 2012 at 6:00 PM, Mark Walters <markwalters1009@gmail.com>wrote:
>
>>
>> Just a quick question: does this update the database with
>> maildir://files URIs instead of just filenames? In other words is it
>> safe to try out on actual mailstores?
>>
>
> It doesn't change any of the existing filenames or do anything like a
> database "upgrade". If you run notmuch new, it should add a URI-based
> filename to each message, but I don't think it should remove old filenames,
> so I would expect it to be fine. But I didn't expect anyone to try it on
> their actual mail database.

The breakage I observed was when trying to reply to a message after
reverting to master: on the command line I got

Error opening /database_path/maildir:///full_path_of_file: No such file or directory

When doing notmuch search output=files I see both possibilities.

Surprisingly notmuch new does not seem to remove these entries.

>> (I used it on a trial system but when I reverted to master some things
>> seemed to stop working)
>
> Sorry for breaking it :)

It was no problem for me: it was just a test corpus of 10,000 message
that I use for testing.

In case anyone else has problems the following solved it for me:

1) dump the tags
2) delete the .notmuch subdirectory of the mail store 
3) run notmuch new
4) restore the tags

Best wishes

Mark

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-29  6:43   ` Ethan
  2012-06-29  7:00     ` Mark Walters
@ 2012-06-29  7:43     ` Jani Nikula
  1 sibling, 0 replies; 31+ messages in thread
From: Jani Nikula @ 2012-06-29  7:43 UTC (permalink / raw)
  To: Ethan; +Cc: Notmuch Mail

[-- Attachment #1: Type: text/plain, Size: 1218 bytes --]

On Jun 29, 2012 9:43 AM, "Ethan" <ethan.glasser.camp@gmail.com> wrote:
>
> On Thu, Jun 28, 2012 at 6:00 PM, Mark Walters <markwalters1009@gmail.com>
wrote:
>>
>>
>> Just a quick question: does this update the database with
>> maildir://files URIs instead of just filenames? In other words is it
>> safe to try out on actual mailstores?
>
>
> It doesn't change any of the existing filenames or do anything like a
database "upgrade". If you run notmuch new, it should add a URI-based
filename to each message, but I don't think it should remove old filenames,
so I would expect it to be fine. But I didn't expect anyone to try it on
their actual mail database.

I don't have the time to look into the details, but I'd like to point out
that one of the powerful features of notmuch is the cli and ability to pipe
the commands. I'd hate to lose the ability to pipe 'notmuch search
--output=files' to tools that don't understand uris. I suppose that means
I'd like the most common case - files in maildir, the status quo - to use
plain filenames. Either that, or notmuch search needs a new --output switch
to output uris. A user that only has files in maildir should not experience
any breakage because of this.

BR,
Jani.

[-- Attachment #2: Type: text/html, Size: 1525 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-06-28 20:45     ` Mark Walters
@ 2012-07-01 16:02       ` Ethan
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
                           ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Ethan @ 2012-07-01 16:02 UTC (permalink / raw)
  To: Mark Walters; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 2442 bytes --]

Thanks for going through it, I know there's a lot to go through..

On Thu, Jun 28, 2012 at 4:45 PM, Mark Walters <markwalters1009@gmail.com>wrote:

> I was thinking of just having one mail root and inside that there could
> be maildirs and mboxes. Everything would still be relative to the root.
>

I'm hesitant to have directories that contain maildirs and mboxes. It
should be possible to unambiguously distinguish between a maildir file and
an mbox file (mboxes always start with "From ", no colon) but it sounds
kind of fragile.

>  1. Are URIs the way to specify individual messages, despite bremner's
> >  concerns about too much of the API being strings? Is adding another
> library
> >  is the easiest way to parse URIs?
>
> In my opinion  the nice thing about using strings is that it does not
> require
> any changes to the Xapian database to store them. I think using URIs may
> not be best though as they seem to be annoying to parse (as filenames
> can contain the same characters) and you seem to need to work around the
> parser in some cases.
>

I think that's more the fault of the parser than of the URIs. If glib came
with a parser, that would be great. There aren't a lot of options for
pure-C URI parsing. Besides uriparser, there's also some code in the W3C
sample code library, but it looked like integrating it would be a pain so I
let it go.

I wonder if the following would be practical: use // as the field
> separator:
>
> e.g. mbox://filename//start_of_message+length
>
> I think 2 consecutive slashes // is about the only thing we can assume
> is not in the path or filename. Since it is not in the filename I think
> parsing should be trivial (thus avoiding the extra library).
>

Can you explain what you mean when you say that two consecutive slashes
can't appear in a URL? Ordinary filesystem paths can contain them, and so
can file: URLs. (I just looked up file:///home/ethan///////tmp and Firefox
handled that OK.) I've sometimes seen machine-generated filenames with
double slashes because that way you don't have to make sure the incoming
filename was correctly terminated before adding another level.


> Secondly, I would prefer to keep maildirs as just the bare file name: so
> the existence of // can be the signal that there is some other
> scheme. This is asymmetric, but is rather more backwardly compatible.
>

Based on your and Jani's reasoning, I did this. Revised patch series
follows.

Ethan

[-- Attachment #2: Type: text/html, Size: 3341 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 0/8] URI-based modular mail stores
  2012-07-01 16:02       ` Ethan
@ 2012-07-01 16:39         ` Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 1/8] All access to mail files goes through the mailstore module Ethan Glasser-Camp
                             ` (7 more replies)
  2012-07-01 16:48         ` [RFC PATCH 00/14] modular mail stores based on URIs Mark Walters
  2012-07-03  8:40         ` Jameson Graef Rollins
  2 siblings, 8 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch

Hi guys,

Based on the feedback from the last patch series (thanks a lot to Mark
Walters and Jani Nikula for taking the time to look at it!) I revised
this patch series to not change the handling of maildir filenames at
all. This meant not having to give up relocatability, and also
simplified the patch series a great deal (down to 8 patches instead of
14). I also noticed that there is only one use of notmuch_sha1_of_file
and it is only used on messages, so instead of splitting the function
as in the last patch series, I just renamed it.

As a reminder, this patch series proposes a mailstore.c module, which
is responsible for accessing "filenames" which, in addition to
ordinary filenames, can also be URIs.

- Patches 1-5 add the mailstore module and complementary interfaces,
  such as a new.scan option that tell notmuch-new where to look for
  other mail.

- Patches 6-8 prove the validity of this interface by using it to add
  mbox support.

I know this is still pretty big and does two different things, but
mbox support relies on URI support, and URI support is meaningless
without some new kind of mail store.

There are a few missing features in this patch series, but I was
hesitant to add them because I didn't want to make the series any more
complex. To wit, future series could include:

- notmuch-new.c: some code is shared between add_files_recursive and
  add_files_mbox{,_file}, specifically the code that scans directories
  for subdirectories and then recurses, and the code that scans
  directories for regular files and then does something else. Some
  guidance on how best to refactor that, or whether to even bother,
  would be appreciated.

- notmuch-config.c: I think instead of one option for new.scan, it
  would be good to have a section called [urls], for which each entry
  is name=url. This would also allow relocatability for mail not
  located in the central maildir (store the name in Xapian, and
  translate it to the URL on the way in/out).

- notmuch-new.c: code could be added to notice when we "walked past" a
  message in an mbox file and mark it as deleted, as is currently done
  with maildir.

- notmuch-new.c: no support for count_files kinds of operations
  exists.

Unlike the previous patch series, I would appreciate it if you
reviewed this one :)

Thanks for your time!

Ethan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 1/8] All access to mail files goes through the mailstore module
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  2012-07-01 19:48             ` Mark Walters
  2012-07-01 16:39           ` [PATCH v2 2/8] Introduce uriparser Ethan Glasser-Camp
                             ` (6 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

This commit introduces the mailstore module which provides two
functions, notmuch_mailstore_open and notmuch_mailstore_close. These
functions are currently just stub calls to fopen and fclose, but later
can be made more complex in order to support mail storage systems
where one message might not be one file.

No functional changes are made, but calls to fopen and fclose are
replaced with notmuch_mailstore_open and notmuch_mailstore_close
wherever they are being used to access messages, but not when talking
about real files like in notmuch-dump or notmuch-restore.

Since the sha1 function notmuch_sha1_of_file is only used on messages,
we convert it to using notmuch_mailstore_open and
notmuch_mailstore_close, and rename it notmuch_sha1_of_message. While
we are there, we also replace a numeric constant with its symbolic
name BLOCK_SIZE.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
This patch does not split notmuch_sha1_of_file the way the previous
patch series did, because there are no other callers of
notmuch_sha1_of_file. Instead it renames the function to
notmuch_sha1_of_message.

 lib/Makefile.local    |    1 +
 lib/database.cc       |    2 +-
 lib/index.cc          |    2 +-
 lib/mailstore.c       |   34 ++++++++++++++++++++++++++++++++++
 lib/message-file.c    |    6 +++---
 lib/notmuch-private.h |    2 +-
 lib/notmuch.h         |   16 ++++++++++++++++
 lib/sha1.c            |   11 +++++------
 mime-node.c           |    4 ++--
 notmuch-show.c        |   12 ++++++------
 10 files changed, 70 insertions(+), 20 deletions(-)
 create mode 100644 lib/mailstore.c

diff --git a/lib/Makefile.local b/lib/Makefile.local
index 8a9aa28..cfc77bb 100644
--- a/lib/Makefile.local
+++ b/lib/Makefile.local
@@ -51,6 +51,7 @@ libnotmuch_c_srcs =		\
 	$(dir)/filenames.c	\
 	$(dir)/string-list.c	\
 	$(dir)/libsha1.c	\
+	$(dir)/mailstore.c	\
 	$(dir)/message-file.c	\
 	$(dir)/messages.c	\
 	$(dir)/sha1.c		\
diff --git a/lib/database.cc b/lib/database.cc
index 761dc1a..c035edc 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1773,7 +1773,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
 	if (message_id == NULL ) {
 	    /* No message-id at all, let's generate one by taking a
 	     * hash over the file's contents. */
-	    char *sha1 = notmuch_sha1_of_file (filename);
+	    char *sha1 = notmuch_sha1_of_message (filename);
 
 	    /* If that failed too, something is really wrong. Give up. */
 	    if (sha1 == NULL) {
diff --git a/lib/index.cc b/lib/index.cc
index e377732..b607e82 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -441,7 +441,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
 	initialized = 1;
     }
 
-    file = fopen (filename, "r");
+    file = notmuch_mailstore_open (filename);
     if (! file) {
 	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
 	ret = NOTMUCH_STATUS_FILE_ERROR;
diff --git a/lib/mailstore.c b/lib/mailstore.c
new file mode 100644
index 0000000..48acd47
--- /dev/null
+++ b/lib/mailstore.c
@@ -0,0 +1,34 @@
+/* mailstore.c - code to access individual messages
+ *
+ * Copyright © 2009 Carl Worth
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see http://www.gnu.org/licenses/ .
+ *
+ * Author: Carl Worth <cworth@cworth.org>
+ */
+#include <stdio.h>
+
+#include "notmuch-private.h"
+
+FILE *
+notmuch_mailstore_open (const char *filename)
+{
+    return fopen (filename, "r");
+}
+
+int
+notmuch_mailstore_close (FILE *file)
+{
+    return fclose (file);
+}
diff --git a/lib/message-file.c b/lib/message-file.c
index 915aba8..271389c 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -86,7 +86,7 @@ _notmuch_message_file_destructor (notmuch_message_file_t *message)
 	g_hash_table_destroy (message->headers);
 
     if (message->file)
-	fclose (message->file);
+	notmuch_mailstore_close (message->file);
 
     return 0;
 }
@@ -104,7 +104,7 @@ _notmuch_message_file_open_ctx (void *ctx, const char *filename)
 
     talloc_set_destructor (message, _notmuch_message_file_destructor);
 
-    message->file = fopen (filename, "r");
+    message->file = notmuch_mailstore_open (filename);
     if (message->file == NULL)
 	goto FAIL;
 
@@ -361,7 +361,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
     }
 
     if (message->parsing_finished) {
-        fclose (message->file);
+        notmuch_mailstore_close (message->file);
         message->file = NULL;
     }
 
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index bfb4111..ef1f994 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -466,7 +466,7 @@ char *
 notmuch_sha1_of_string (const char *str);
 
 char *
-notmuch_sha1_of_file (const char *filename);
+notmuch_sha1_of_message (const char *filename);
 
 /* string-list.c */
 
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 3633bed..0ca367b 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -1233,6 +1233,22 @@ notmuch_message_thaw (notmuch_message_t *message);
 void
 notmuch_message_destroy (notmuch_message_t *message);
 
+/* Get access to the underlying data of a message.
+ *
+ * Right now, all messages are simple files in maildirs, but this is
+ * hopefully subject to change.
+ *
+ * If the returned FILE* is not NULL, be sure to call
+ * notmuch_mailstore_close when you're done with it.
+ */
+FILE *
+notmuch_mailstore_open (const char *filename);
+
+/* Release any resources associated with this file.
+ */
+int
+notmuch_mailstore_close (FILE *file);
+
 /* Is the given 'tags' iterator pointing at a valid tag.
  *
  * When this function returns TRUE, notmuch_tags_get will return a
diff --git a/lib/sha1.c b/lib/sha1.c
index cc48108..f4d213a 100644
--- a/lib/sha1.c
+++ b/lib/sha1.c
@@ -65,7 +65,7 @@ notmuch_sha1_of_string (const char *str)
 }
 
 /* Create a hexadecimal string version of the SHA-1 digest of the
- * contents of the named file.
+ * contents of the named message.
  *
  * This function returns a newly allocated string which the caller
  * should free() when finished.
@@ -74,7 +74,7 @@ notmuch_sha1_of_string (const char *str)
  * file not found, etc.), this function returns NULL.
  */
 char *
-notmuch_sha1_of_file (const char *filename)
+notmuch_sha1_of_message (const char *filename)
 {
     FILE *file;
 #define BLOCK_SIZE 4096
@@ -84,14 +84,14 @@ notmuch_sha1_of_file (const char *filename)
     unsigned char digest[SHA1_DIGEST_SIZE];
     char *result;
 
-    file = fopen (filename, "r");
+    file = notmuch_mailstore_open (filename);
     if (file == NULL)
 	return NULL;
 
     sha1_begin (&sha1);
 
     while (1) {
-	bytes_read = fread (block, 1, 4096, file);
+	bytes_read = fread (block, 1, BLOCK_SIZE, file);
 	if (bytes_read == 0) {
 	    if (feof (file)) {
 		break;
@@ -108,8 +108,7 @@ notmuch_sha1_of_file (const char *filename)
 
     result = _hex_of_sha1_digest (digest);
 
-    fclose (file);
+    notmuch_mailstore_close (file);
 
     return result;
 }
-
diff --git a/mime-node.c b/mime-node.c
index 97e8b48..a5c60d0 100644
--- a/mime-node.c
+++ b/mime-node.c
@@ -49,7 +49,7 @@ _mime_node_context_free (mime_node_context_t *res)
 	g_object_unref (res->stream);
 
     if (res->file)
-	fclose (res->file);
+	notmuch_mailstore_close (res->file);
 
     return 0;
 }
@@ -79,7 +79,7 @@ mime_node_open (const void *ctx, notmuch_message_t *message,
     }
     talloc_set_destructor (mctx, _mime_node_context_free);
 
-    mctx->file = fopen (filename, "r");
+    mctx->file = notmuch_mailstore_open (filename);
     if (! mctx->file) {
 	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
 	status = NOTMUCH_STATUS_FILE_ERROR;
diff --git a/notmuch-show.c b/notmuch-show.c
index 8f3c60e..daffb59 100644
--- a/notmuch-show.c
+++ b/notmuch-show.c
@@ -693,7 +693,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
 	INTERNAL_ERROR ("format_part_mbox requires a root part");
 
     filename = notmuch_message_get_filename (message);
-    file = fopen (filename, "r");
+    file = notmuch_mailstore_open (filename);
     if (file == NULL) {
 	fprintf (stderr, "Failed to open %s: %s\n",
 		 filename, strerror (errno));
@@ -717,7 +717,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
 
     printf ("\n");
 
-    fclose (file);
+    notmuch_mailstore_close (file);
 
     return NOTMUCH_STATUS_SUCCESS;
 }
@@ -740,7 +740,7 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
 	    return NOTMUCH_STATUS_FILE_ERROR;
 	}
 
-	file = fopen (filename, "r");
+	file = notmuch_mailstore_open (filename);
 	if (file == NULL) {
 	    fprintf (stderr, "Error: Cannot open file %s: %s\n", filename, strerror (errno));
 	    return NOTMUCH_STATUS_FILE_ERROR;
@@ -750,18 +750,18 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
 	    size = fread (buf, 1, sizeof (buf), file);
 	    if (ferror (file)) {
 		fprintf (stderr, "Error: Read failed from %s\n", filename);
-		fclose (file);
+		notmuch_mailstore_close (file);
 		return NOTMUCH_STATUS_FILE_ERROR;
 	    }
 
 	    if (fwrite (buf, size, 1, stdout) != 1) {
 		fprintf (stderr, "Error: Write failed\n");
-		fclose (file);
+		notmuch_mailstore_close (file);
 		return NOTMUCH_STATUS_FILE_ERROR;
 	    }
 	}
 
-	fclose (file);
+	notmuch_mailstore_close (file);
 	return NOTMUCH_STATUS_SUCCESS;
     }
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 2/8] Introduce uriparser
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 1/8] All access to mail files goes through the mailstore module Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 3/8] Not all filenames need to be converted to absolute paths Ethan Glasser-Camp
                             ` (5 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

Seeing as there is no glib-standard way to parse URIs, an external
library is needed. This commit introduces another program in compat/
and a stanza in ./configure to test if uriparser is there.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 Makefile.local          |    2 +-
 compat/have_uriparser.c |   17 +++++++++++++++++
 configure               |   23 ++++++++++++++++++++---
 3 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100644 compat/have_uriparser.c

diff --git a/Makefile.local b/Makefile.local
index a890df2..084f44e 100644
--- a/Makefile.local
+++ b/Makefile.local
@@ -41,7 +41,7 @@ PV_FILE=bindings/python/notmuch/version.py
 # Smash together user's values with our extra values
 FINAL_CFLAGS = -DNOTMUCH_VERSION=$(VERSION) $(CFLAGS) $(WARN_CFLAGS) $(CONFIGURE_CFLAGS) $(extra_cflags)
 FINAL_CXXFLAGS = $(CXXFLAGS) $(WARN_CXXFLAGS) $(CONFIGURE_CXXFLAGS) $(extra_cflags) $(extra_cxxflags)
-FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS)
+FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS) $(URIPARSER_LDFLAGS)
 FINAL_NOTMUCH_LINKER = CC
 ifneq ($(LINKER_RESOLVES_LIBRARY_DEPENDENCIES),1)
 FINAL_NOTMUCH_LDFLAGS += $(CONFIGURE_LDFLAGS)
diff --git a/compat/have_uriparser.c b/compat/have_uriparser.c
new file mode 100644
index 0000000..d79e51d
--- /dev/null
+++ b/compat/have_uriparser.c
@@ -0,0 +1,17 @@
+#include <uriparser/Uri.h>
+
+int
+main (int argc, char *argv[])
+{
+    UriParserStateA state;
+    UriUriA uri;
+    char *uriS = NULL;
+
+    state.uri = &uri;
+    if (uriParseUriA (&state, uriS) != URI_SUCCESS) {
+        /* Failure */
+        uriFreeUriMembersA (&uri);
+    }
+
+    return 0;
+}
diff --git a/configure b/configure
index 3fad424..80aa13c 100755
--- a/configure
+++ b/configure
@@ -313,6 +313,19 @@ else
     errors=$((errors + 1))
 fi
 
+printf "Checking for uriparser... "
+if ${CC} -o compat/have_uriparser "$srcdir"/compat/have_uriparser.c -luriparser > /dev/null 2>&1
+then
+    printf "Yes.\n"
+    uriparser_ldflags="-luriparser"
+    have_uriparser=1
+else
+    printf "No.\n"
+    have_uriparser=0
+    errors=$((errors + 1))
+fi
+rm -f compat/have_uriparser
+
 printf "Checking for valgrind development files... "
 if pkg-config --exists valgrind; then
     printf "Yes.\n"
@@ -431,11 +444,11 @@ case a simple command will install everything you need. For example:
 
 On Debian and similar systems:
 
-	sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev
+	sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev liburiparser-dev
 
 Or on Fedora and similar systems:
 
-	sudo yum install xapian-core-devel gmime-devel libtalloc-devel
+	sudo yum install xapian-core-devel gmime-devel libtalloc-devel uriparser-devel
 
 On other systems, similar commands can be used, but the details of the
 package names may be different.
@@ -669,6 +682,9 @@ GMIME_LDFLAGS = ${gmime_ldflags}
 TALLOC_CFLAGS = ${talloc_cflags}
 TALLOC_LDFLAGS = ${talloc_ldflags}
 
+# Flags needed to link against uriparser
+URIPARSER_LDFLAGS = ${uriparser_ldflags}
+
 # Flags needed to have linker set rpath attribute
 RPATH_LDFLAGS = ${rpath_ldflags}
 
@@ -698,5 +714,6 @@ CONFIGURE_CXXFLAGS = -DHAVE_GETLINE=\$(HAVE_GETLINE) \$(GMIME_CFLAGS)    \\
 		     \$(TALLOC_CFLAGS) -DHAVE_VALGRIND=\$(HAVE_VALGRIND) \\
 		     \$(VALGRIND_CFLAGS) \$(XAPIAN_CXXFLAGS)             \\
                      -DHAVE_STRCASESTR=\$(HAVE_STRCASESTR)
-CONFIGURE_LDFLAGS =  \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(XAPIAN_LDFLAGS)
+CONFIGURE_LDFLAGS =  \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(XAPIAN_LDFLAGS) \\
+                     \$(URIPARSER_LDFLAGS)
 EOF
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 3/8] Not all filenames need to be converted to absolute paths
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 1/8] All access to mail files goes through the mailstore module Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 2/8] Introduce uriparser Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 4/8] new: add "scan" option Ethan Glasser-Camp
                             ` (4 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

_notmuch_message_ensure_filename_list converts "relative" paths, such
as those stored in Xapian until now, to "absolute" paths. However,
URLs are already absolute, and prepending the database path will just
confuse matters.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 lib/message.cc |   20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 978de06..235185c 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -21,6 +21,7 @@
 #include "notmuch-private.h"
 #include "database-private.h"
 
+#include <uriparser/Uri.h>
 #include <stdint.h>
 
 #include <gmime/gmime.h>
@@ -682,6 +683,8 @@ _notmuch_message_ensure_filename_list (notmuch_message_t *message)
 	const char *db_path, *directory, *basename, *filename;
 	char *colon, *direntry = NULL;
 	unsigned int directory_id;
+	UriUriA parsed;
+	UriParserStateA parser;
 
 	direntry = node->string;
 
@@ -700,9 +703,20 @@ _notmuch_message_ensure_filename_list (notmuch_message_t *message)
 							  message->notmuch,
 							  directory_id);
 
-	if (strlen (directory))
-	    filename = talloc_asprintf (message, "%s/%s/%s",
-					db_path, directory, basename);
+	parser.uri = &parsed;
+
+	if (strlen (directory)) {
+	    /* If directory is a URI, we don't need to append the db_path;
+	     * it is already an absolute path. */
+	    if (uriParseUriA (&parser, directory) != URI_SUCCESS ||
+		parsed.scheme.first == NULL)
+		filename = talloc_asprintf (message, "%s/%s/%s",
+					    db_path, directory, basename);
+	    else
+		filename = talloc_asprintf (message, "%s/%s",
+					    directory, basename);
+	    uriFreeUriMembersA (&parsed);
+	}
 	else
 	    filename = talloc_asprintf (message, "%s/%s",
 					db_path, basename);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 4/8] new: add "scan" option
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
                             ` (2 preceding siblings ...)
  2012-07-01 16:39           ` [PATCH v2 3/8] Not all filenames need to be converted to absolute paths Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 5/8] notmuch-new: pull out useful bits of add_files_recursive Ethan Glasser-Camp
                             ` (3 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

The new.scan option is a list of URLs that can be "scanned".

The fact that the database maildir is scanned "automagically" is a
little weird, but it doesn't do any harm unless you decide to put mail
there that you really don't want indexed.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
I added text here to respond to Mark Walters's confusion regarding how
I implemented scanning.

 notmuch-client.h |    9 +++++++++
 notmuch-config.c |   33 ++++++++++++++++++++++++++++++++-
 notmuch-new.c    |   18 ++++++++++++++++++
 test/config      |    1 +
 4 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/notmuch-client.h b/notmuch-client.h
index 0c17b79..20349b2 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -257,6 +257,15 @@ notmuch_config_set_new_ignore (notmuch_config_t *config,
 			       const char *new_ignore[],
 			       size_t length);
 
+const char **
+notmuch_config_get_new_scan (notmuch_config_t *config,
+			       size_t *length);
+
+void
+notmuch_config_set_new_scan (notmuch_config_t *config,
+			       const char *new_scan[],
+			       size_t length);
+
 notmuch_bool_t
 notmuch_config_get_maildir_synchronize_flags (notmuch_config_t *config);
 
diff --git a/notmuch-config.c b/notmuch-config.c
index 3e37a2d..387f855 100644
--- a/notmuch-config.c
+++ b/notmuch-config.c
@@ -50,7 +50,13 @@ static const char new_config_comment[] =
     "\t	that will not be searched for messages by \"notmuch new\".\n"
     "\n"
     "\t	NOTE: *Every* file/directory that goes by one of those names will\n"
-    "\t	be ignored, independent of its depth/location in the mail store.\n";
+    "\t	be ignored, independent of its depth/location in the mail store.\n"
+    "\n"
+    "\tscan	A list (separated by ';') of mail URLs to scan.\n"
+    "\t	Each URL denotes a \"root\" which will be searched for mail files.\n"
+    "\t	How this search is performed depends on the scheme of the URL (the\n"
+    "\t	part before the first colon).\n"
+    "\t	The maildir located at database.path, above, will automatically be added.\n";
 
 static const char user_config_comment[] =
     " User configuration\n"
@@ -113,6 +119,8 @@ struct _notmuch_config {
     size_t new_tags_length;
     const char **new_ignore;
     size_t new_ignore_length;
+    const char **new_scan;
+    size_t new_scan_length;
     notmuch_bool_t maildir_synchronize_flags;
     const char **search_exclude_tags;
     size_t search_exclude_tags_length;
@@ -274,6 +282,8 @@ notmuch_config_open (void *ctx,
     config->new_tags_length = 0;
     config->new_ignore = NULL;
     config->new_ignore_length = 0;
+    config->new_scan = NULL;
+    config->new_scan_length = 0;
     config->maildir_synchronize_flags = TRUE;
     config->search_exclude_tags = NULL;
     config->search_exclude_tags_length = 0;
@@ -375,6 +385,10 @@ notmuch_config_open (void *ctx,
 	notmuch_config_set_new_ignore (config, NULL, 0);
     }
 
+    if (notmuch_config_get_new_scan (config, &tmp) == NULL) {
+	notmuch_config_set_new_scan (config, NULL, 0);
+    }
+
     if (notmuch_config_get_search_exclude_tags (config, &tmp) == NULL) {
 	if (is_new) {
 	    const char *tags[] = { "deleted", "spam" };
@@ -631,6 +645,14 @@ notmuch_config_get_new_ignore (notmuch_config_t *config, size_t *length)
 			     &(config->new_ignore_length), length);
 }
 
+const char **
+notmuch_config_get_new_scan (notmuch_config_t *config, size_t *length)
+{
+    return _config_get_list (config, "new", "scan",
+			     &(config->new_scan),
+			     &(config->new_scan_length), length);
+}
+
 void
 notmuch_config_set_user_other_email (notmuch_config_t *config,
 				     const char *list[],
@@ -658,6 +680,15 @@ notmuch_config_set_new_ignore (notmuch_config_t *config,
 		     &(config->new_ignore));
 }
 
+void
+notmuch_config_set_new_scan (notmuch_config_t *config,
+			     const char *list[],
+			     size_t length)
+{
+    _config_set_list (config, "new", "scan", list, length,
+		     &(config->new_scan));
+}
+
 const char **
 notmuch_config_get_search_exclude_tags (notmuch_config_t *config, size_t *length)
 {
diff --git a/notmuch-new.c b/notmuch-new.c
index 938ae29..8377750 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -607,6 +607,16 @@ add_files (notmuch_database_t *notmuch,
     return ret;
 }
 
+/* Call out to the appropriate add_files function, based on the URI. */
+static notmuch_status_t
+add_files_uri (unused(notmuch_database_t *notmuch),
+	       unused(const char *uri),
+	       unused(add_files_state_t *state))
+{
+    /* Stub for now */
+    return NOTMUCH_STATUS_SUCCESS;
+}
+
 static void
 setup_progress_printing_timer (void)
 {
@@ -832,6 +842,8 @@ notmuch_new_command (void *ctx, int argc, char *argv[])
     int ret = 0;
     struct stat st;
     const char *db_path;
+    const char **new_scan;
+    size_t new_scan_length, new_scan_i;
     char *dot_notmuch_path;
     struct sigaction action;
     _filename_node_t *f;
@@ -930,6 +942,12 @@ notmuch_new_command (void *ctx, int argc, char *argv[])
 	timer_is_active = TRUE;
     }
 
+    new_scan = notmuch_config_get_new_scan (config, &new_scan_length);
+
+    for (new_scan_i = 0; new_scan_i < new_scan_length; new_scan_i++) {
+	add_files_uri (notmuch, new_scan[new_scan_i], &add_files_state);
+    }
+
     ret = add_files (notmuch, db_path, &add_files_state);
     if (ret)
 	goto DONE;
diff --git a/test/config b/test/config
index 93ecb13..b0ad0c1 100755
--- a/test/config
+++ b/test/config
@@ -52,6 +52,7 @@ user.primary_email=test_suite@notmuchmail.org
 user.other_email=test_suite_other@notmuchmail.org;test_suite@otherdomain.org
 new.tags=unread;inbox;
 new.ignore=
+new.scan=
 search.exclude_tags=
 maildir.synchronize_flags=true
 foo.string=this is another string value
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 5/8] notmuch-new: pull out useful bits of add_files_recursive
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
                             ` (3 preceding siblings ...)
  2012-07-01 16:39           ` [PATCH v2 4/8] new: add "scan" option Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  2012-07-01 19:55             ` Mark Walters
  2012-07-01 16:39           ` [PATCH v2 6/8] mailstore: support for mbox:// URIs Ethan Glasser-Camp
                             ` (2 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

This patch pulls some bits out of add_files_recursive which will be
useful for other backends: two reporting functions
_report_before_adding_file and _report_added_file, as well as
_add_message, which actually does the message adding.

No functional changes.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 notmuch-new.c |  192 +++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 119 insertions(+), 73 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 8377750..5250562 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -239,6 +239,122 @@ _entry_in_ignore_list (const char *entry, add_files_state_t *state)
     return FALSE;
 }
 
+/* Progress-reporting function.
+ *
+ * Can be used by any mailstore-crawling function that wants to alert
+ * users what message it's about to add. Subsequent errors will be due
+ * to this message ;)
+ */
+static void
+_report_before_adding_file (add_files_state_t *state, const char *filename)
+{
+    state->processed_files++;
+
+    if (state->verbose) {
+	if (state->output_is_a_tty)
+	    printf("\r\033[K");
+
+	printf ("%i/%i: %s",
+		state->processed_files,
+		state->total_files,
+		filename);
+
+	putchar((state->output_is_a_tty) ? '\r' : '\n');
+	fflush (stdout);
+    }
+}
+
+/* Progress-reporting function.
+ *
+ * Call this to respond to the signal handler for SIGALRM.
+ */
+static void
+_report_added_file (add_files_state_t *state)
+{
+    if (do_print_progress) {
+	do_print_progress = 0;
+	generic_print_progress ("Processed", "files", state->tv_start,
+				state->processed_files, state->total_files);
+    }
+}
+
+
+/* Atomically handles adding a message to the database.
+ *
+ * Should be used by any mailstore-crawling function that finds a new
+ * message to add.
+ */
+static notmuch_status_t
+_add_message (add_files_state_t *state, notmuch_database_t *notmuch,
+	      const char *filename)
+{
+    notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
+    notmuch_message_t *message;
+    const char **tag;
+
+    status = notmuch_database_begin_atomic (notmuch);
+    if (status) {
+	ret = status;
+	goto DONE;
+    }
+
+    status = notmuch_database_add_message (notmuch, filename, &message);
+
+    switch (status) {
+    /* success */
+    case NOTMUCH_STATUS_SUCCESS:
+	state->added_messages++;
+	notmuch_message_freeze (message);
+	for (tag=state->new_tags; *tag != NULL; tag++)
+	    notmuch_message_add_tag (message, *tag);
+	if (state->synchronize_flags == TRUE)
+	    notmuch_message_maildir_flags_to_tags (message);
+	notmuch_message_thaw (message);
+	break;
+    /* Non-fatal issues (go on to next file) */
+    case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:
+	if (state->synchronize_flags == TRUE)
+	    notmuch_message_maildir_flags_to_tags (message);
+	break;
+    case NOTMUCH_STATUS_FILE_NOT_EMAIL:
+	fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
+		 filename);
+	break;
+    /* Fatal issues. Don't process anymore. */
+    case NOTMUCH_STATUS_READ_ONLY_DATABASE:
+    case NOTMUCH_STATUS_XAPIAN_EXCEPTION:
+    case NOTMUCH_STATUS_OUT_OF_MEMORY:
+	fprintf (stderr, "Error: %s. Halting processing.\n",
+		 notmuch_status_to_string (status));
+	ret = status;
+	goto DONE;
+    default:
+    case NOTMUCH_STATUS_FILE_ERROR:
+    case NOTMUCH_STATUS_NULL_POINTER:
+    case NOTMUCH_STATUS_TAG_TOO_LONG:
+    case NOTMUCH_STATUS_UNBALANCED_FREEZE_THAW:
+    case NOTMUCH_STATUS_UNBALANCED_ATOMIC:
+    case NOTMUCH_STATUS_LAST_STATUS:
+	INTERNAL_ERROR ("add_message returned unexpected value: %d",  status);
+	ret = status;
+	goto DONE;
+    }
+
+    status = notmuch_database_end_atomic (notmuch);
+    if (status) {
+	ret = status;
+	goto DONE;
+    }
+
+  DONE:
+    if (message) {
+	notmuch_message_destroy (message);
+	message = NULL;
+    }
+
+    return ret;
+}
+
 /* Examine 'path' recursively as follows:
  *
  *   o Ask the filesystem for the mtime of 'path' (fs_mtime)
@@ -290,7 +406,6 @@ add_files (notmuch_database_t *notmuch,
     char *next = NULL;
     time_t fs_mtime, db_mtime;
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
-    notmuch_message_t *message = NULL;
     struct dirent **fs_entries = NULL;
     int i, num_fs_entries = 0, entry_type;
     notmuch_directory_t *directory;
@@ -299,7 +414,6 @@ add_files (notmuch_database_t *notmuch,
     time_t stat_time;
     struct stat st;
     notmuch_bool_t is_maildir;
-    const char **tag;
 
     if (stat (path, &st)) {
 	fprintf (stderr, "Error reading directory %s: %s\n",
@@ -469,83 +583,15 @@ add_files (notmuch_database_t *notmuch,
 	 * in the database, so add it. */
 	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
 
-	state->processed_files++;
-
-	if (state->verbose) {
-	    if (state->output_is_a_tty)
-		printf("\r\033[K");
-
-	    printf ("%i/%i: %s",
-		    state->processed_files,
-		    state->total_files,
-		    next);
-
-	    putchar((state->output_is_a_tty) ? '\r' : '\n');
-	    fflush (stdout);
-	}
+	_report_before_adding_file (state, next);
 
-	status = notmuch_database_begin_atomic (notmuch);
+	status = _add_message (state, notmuch, next);
 	if (status) {
 	    ret = status;
 	    goto DONE;
 	}
 
-	status = notmuch_database_add_message (notmuch, next, &message);
-	switch (status) {
-	/* success */
-	case NOTMUCH_STATUS_SUCCESS:
-	    state->added_messages++;
-	    notmuch_message_freeze (message);
-	    for (tag=state->new_tags; *tag != NULL; tag++)
-	        notmuch_message_add_tag (message, *tag);
-	    if (state->synchronize_flags == TRUE)
-		notmuch_message_maildir_flags_to_tags (message);
-	    notmuch_message_thaw (message);
-	    break;
-	/* Non-fatal issues (go on to next file) */
-	case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:
-	    if (state->synchronize_flags == TRUE)
-		notmuch_message_maildir_flags_to_tags (message);
-	    break;
-	case NOTMUCH_STATUS_FILE_NOT_EMAIL:
-	    fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
-		     next);
-	    break;
-	/* Fatal issues. Don't process anymore. */
-	case NOTMUCH_STATUS_READ_ONLY_DATABASE:
-	case NOTMUCH_STATUS_XAPIAN_EXCEPTION:
-	case NOTMUCH_STATUS_OUT_OF_MEMORY:
-	    fprintf (stderr, "Error: %s. Halting processing.\n",
-		     notmuch_status_to_string (status));
-	    ret = status;
-	    goto DONE;
-	default:
-	case NOTMUCH_STATUS_FILE_ERROR:
-	case NOTMUCH_STATUS_NULL_POINTER:
-	case NOTMUCH_STATUS_TAG_TOO_LONG:
-	case NOTMUCH_STATUS_UNBALANCED_FREEZE_THAW:
-	case NOTMUCH_STATUS_UNBALANCED_ATOMIC:
-	case NOTMUCH_STATUS_LAST_STATUS:
-	    INTERNAL_ERROR ("add_message returned unexpected value: %d",  status);
-	    goto DONE;
-	}
-
-	status = notmuch_database_end_atomic (notmuch);
-	if (status) {
-	    ret = status;
-	    goto DONE;
-	}
-
-	if (message) {
-	    notmuch_message_destroy (message);
-	    message = NULL;
-	}
-
-	if (do_print_progress) {
-	    do_print_progress = 0;
-	    generic_print_progress ("Processed", "files", state->tv_start,
-				    state->processed_files, state->total_files);
-	}
+	_report_added_file (state);
 
 	talloc_free (next);
 	next = NULL;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 6/8] mailstore: support for mbox:// URIs
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
                             ` (4 preceding siblings ...)
  2012-07-01 16:39           ` [PATCH v2 5/8] notmuch-new: pull out useful bits of add_files_recursive Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 7/8] Tests for mbox support Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 8/8] new: Add scan support for mbox:// URIs Ethan Glasser-Camp
  7 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

No code uses this yet.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
 lib/mailstore.c |  117 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 116 insertions(+), 1 deletion(-)

diff --git a/lib/mailstore.c b/lib/mailstore.c
index 48acd47..a29d734 100644
--- a/lib/mailstore.c
+++ b/lib/mailstore.c
@@ -17,18 +17,133 @@
  *
  * Author: Carl Worth <cworth@cworth.org>
  */
+#include <uriparser/Uri.h>
 #include <stdio.h>
+#include <glib.h>
 
 #include "notmuch-private.h"
 
+static FILE *
+notmuch_mailstore_basic_open (const char *filename)
+{
+    return fopen (filename, "r");
+}
+
+/* Since we have to return a FILE*, we use fmemopen to turn buffers
+ * into FILE* streams. But when we close these streams, we have to
+ * free() the buffers. Use a hash to associate the two.
+ */
+static GHashTable *_mbox_files_to_strings = NULL;
+
+static void
+_ensure_mbox_files_to_strings () {
+    if (_mbox_files_to_strings == NULL)
+        _mbox_files_to_strings = g_hash_table_new (NULL, NULL);
+}
+
+static FILE *
+notmuch_mailstore_mbox_open (UriUriA *uri)
+{
+    FILE *ret = NULL, *mbox = NULL;
+    char *filename, *message, *length_s;
+    const char *error;
+    long int offset, length, this_read;
+    _ensure_mbox_files_to_strings ();
+
+    offset = strtol (uri->fragment.first, &length_s, 10);
+    length = strtol (length_s+1, NULL, 10);
+
+    filename = talloc_strndup (NULL, uri->pathHead->text.first-1,
+                               uri->pathTail->text.afterLast-uri->pathHead->text.first+1);
+
+    if (filename == NULL)
+        goto DONE;
+
+    mbox = fopen (filename, "r");
+    if (mbox == NULL) {
+        fprintf (stderr, "Couldn't open message %s: %s.\n", uri->scheme.first,
+                 strerror (errno));
+        goto DONE;
+    }
+
+    message = talloc_array (NULL, char, length);
+    fseek (mbox, offset, SEEK_SET);
+
+    this_read = fread (message, sizeof(char), length, mbox);
+    if (this_read != length) {
+        if (feof (mbox))
+            error = "end of file reached";
+        if (ferror (mbox))
+            error = strerror (ferror (mbox));
+
+        fprintf (stderr, "Couldn't read message %s: %s.\n", uri->scheme.first, error);
+        goto DONE;
+    }
+
+    ret = fmemopen (message, length, "r");
+    if (ret == NULL) {
+        /* No fclose will ever be called, so let's free message now */
+        talloc_free (message);
+        goto DONE;
+    }
+
+    g_hash_table_insert (_mbox_files_to_strings, ret, message);
+DONE:
+    if (filename)
+        talloc_free (filename);
+    if (mbox)
+        fclose (mbox);
+
+    return ret;
+}
+
 FILE *
 notmuch_mailstore_open (const char *filename)
 {
-    return fopen (filename, "r");
+    FILE *ret = NULL;
+    UriUriA parsed;
+    UriParserStateA state;
+    state.uri = &parsed;
+    if (uriParseUriA (&state, filename) != URI_SUCCESS) {
+        /* Failure. Fall back to fopen and hope for the best. */
+        ret = notmuch_mailstore_basic_open (filename);
+        goto DONE;
+    }
+
+    if (parsed.scheme.first == NULL) {
+        /* No scheme. Probably not really a URL but just an ordinary
+         * filename, such as a Maildir message. */
+        ret = notmuch_mailstore_basic_open (filename);
+        goto DONE;
+    }
+
+    if (0 == strncmp (parsed.scheme.first, "mbox",
+                      parsed.scheme.afterLast-parsed.scheme.first)) {
+        /* mbox URI of the form mbox:///path/to/file#offset+length.
+         * Just pass the parsed URI. */
+        ret = notmuch_mailstore_mbox_open (&parsed);
+        goto DONE;
+    }
+
+    /* Default case */
+    fprintf (stderr, "Error: Could not open URI %s: unknown scheme.\n",
+             filename);
+
+DONE:
+    uriFreeUriMembersA (&parsed);
+    return ret;
 }
 
 int
 notmuch_mailstore_close (FILE *file)
 {
+    char *file_buffer;
+    if (_mbox_files_to_strings != NULL) {
+        file_buffer = g_hash_table_lookup (_mbox_files_to_strings, file);
+        if (file_buffer != NULL)
+            talloc_free (file_buffer);
+
+        g_hash_table_remove (_mbox_files_to_strings, file);
+    }
     return fclose (file);
 }
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 7/8] Tests for mbox support
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
                             ` (5 preceding siblings ...)
  2012-07-01 16:39           ` [PATCH v2 6/8] mailstore: support for mbox:// URIs Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  2012-07-01 16:39           ` [PATCH v2 8/8] new: Add scan support for mbox:// URIs Ethan Glasser-Camp
  7 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

These are simple tests of one single mbox.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
The test uses bash arrays, which have a slightly odd syntax for
appending.

 test/mbox         |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 test/notmuch-test |    1 +
 2 files changed, 64 insertions(+)
 create mode 100755 test/mbox

diff --git a/test/mbox b/test/mbox
new file mode 100755
index 0000000..a6dab99
--- /dev/null
+++ b/test/mbox
@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='basic mbox support'
+. ./test-lib.sh
+
+mkdir -p $MAIL_DIR/some-mboxes/subdir $MAIL_DIR/database $MAIL_DIR/corpus
+
+# The Content-Length headers here include the final newline (added later).
+generate_message '[body]="Mbox message 1."' '[header]="Content-Length: 16"' "[dir]=corpus"
+generate_message '[body]="Mbox message 2. Longer."' '[header]="Content-Length: 24"' "[dir]=corpus"
+generate_message '[body]="Mbox message 3."' "[dir]=corpus"
+generate_message '[body]="Mbox message 4."' "[dir]=corpus"
+generate_message '[body]="Mbox message 5. Last message."' '[header]="Content-Length: 30"' "[dir]=corpus"
+
+MBOX1=$MAIL_DIR/some-mboxes/first.mbox
+declare -a starts lengths
+
+for x in $MAIL_DIR/corpus/*; do
+    echo "From MAILER-DAEMON Sat Jan  3 01:05:34 1996" >> $MBOX1
+    starts+=(`wc -c $MBOX1 | cut -f 1 -d ' '`)
+    cat $x >> $MBOX1
+    lengths+=(`wc -c $x | cut -f 1 -d ' '`)
+    # Final newline
+    echo >> $MBOX1
+done
+
+notmuch config set database.path $MAIL_DIR/database
+notmuch config set new.scan mbox://$MAIL_DIR/some-mboxes
+
+test_begin_subtest "read a small mbox (5 messages)"
+output=$(NOTMUCH_NEW)
+test_expect_equal "$output" "Added 5 new messages to the database."
+
+test_begin_subtest "search"
+output=$(notmuch search '*' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Test message #1 (inbox unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Test message #2 (inbox unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Test message #3 (inbox unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Test message #4 (inbox unread)
+thread:XXX   2001-01-05 [1/1] Notmuch Test Suite; Test message #5 (inbox unread)"
+
+test_begin_subtest "show (mboxcl)"
+output=$(notmuch show "Test message #1" | grep -o "filename:[^ ]*")
+test_expect_equal "$output" "filename:mbox://$MAIL_DIR/some-mboxes/first.mbox#${starts[0]}+${lengths[0]}"
+
+test_begin_subtest "show doesn't append an extra space at the end (mboxcl)"
+output=$(notmuch show --format=raw "Test message #1" )
+original=$(cat $MAIL_DIR/corpus/msg-001)
+test_expect_equal "$output" "$original"
+
+test_begin_subtest "show (non-cl)"
+output=$(notmuch show "Test message #3" | grep -o "filename:[^ ]*")
+test_expect_equal "$output" "filename:mbox://$MAIL_DIR/some-mboxes/first.mbox#${starts[2]}+${lengths[2]}"
+
+test_begin_subtest "show doesn't append an extra space at the end (non-cl)"
+output=$(notmuch show --format=raw "Test message #3" )
+original=$(cat $MAIL_DIR/corpus/msg-003)
+test_expect_equal "$output" "$original"
+
+test_done
diff --git a/test/notmuch-test b/test/notmuch-test
index bfad5d3..8cbb2cd 100755
--- a/test/notmuch-test
+++ b/test/notmuch-test
@@ -47,6 +47,7 @@ TESTS="
   emacs-large-search-buffer
   emacs-subject-to-filename
   maildir-sync
+  mbox
   crypto
   symbol-hiding
   search-folder-coherence
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 8/8] new: Add scan support for mbox:// URIs
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
                             ` (6 preceding siblings ...)
  2012-07-01 16:39           ` [PATCH v2 7/8] Tests for mbox support Ethan Glasser-Camp
@ 2012-07-01 16:39           ` Ethan Glasser-Camp
  7 siblings, 0 replies; 31+ messages in thread
From: Ethan Glasser-Camp @ 2012-07-01 16:39 UTC (permalink / raw)
  To: notmuch; +Cc: Ethan Glasser-Camp

This fixes the broken tests introduced by the last commit.

Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
---
More text was added to clarify how mbox scanning works.

 notmuch-config.c |    4 +
 notmuch-new.c    |  304 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 303 insertions(+), 5 deletions(-)

diff --git a/notmuch-config.c b/notmuch-config.c
index 387f855..e02b6a9 100644
--- a/notmuch-config.c
+++ b/notmuch-config.c
@@ -56,6 +56,10 @@ static const char new_config_comment[] =
     "\t	Each URL denotes a \"root\" which will be searched for mail files.\n"
     "\t	How this search is performed depends on the scheme of the URL (the\n"
     "\t	part before the first colon).\n"
+    "\n"
+    "\t\tmbox:///path	scans all subdirectories starting at path for mbox\n"
+    "\t\t		files, and scans all mbox files for all messages.\n"
+    "\n"
     "\t	The maildir located at database.path, above, will automatically be added.\n";
 
 static const char user_config_comment[] =
diff --git a/notmuch-new.c b/notmuch-new.c
index 5250562..061a1a8 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -19,6 +19,7 @@
  */
 
 #include "notmuch-client.h"
+#include <uriparser/Uri.h>
 
 #include <unistd.h>
 
@@ -653,14 +654,307 @@ add_files (notmuch_database_t *notmuch,
     return ret;
 }
 
+/* Scan an mbox file for messages.
+ *
+ * We assume that mboxes are append only -- this function does not
+ * check to see if messages have gone missing.
+ *
+ * The mtime of the mbox file is stored in a "directory" document in
+ * Xapian.
+ */
+/* FIXME: a certain amount of this code appears in add_files_recursive,
+ * and could be refactored
+ */
+static notmuch_status_t
+add_messages_mbox_file (notmuch_database_t *notmuch,
+			const char *path,
+			add_files_state_t *state)
+{
+    notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS, status;
+    struct stat st;
+    time_t fs_mtime, db_mtime, stat_time;
+    FILE *mbox;
+    char *line, *path_uri = NULL, *message_uri = NULL;
+    int line_len;
+    size_t offset, end_offset, line_size = 0;
+    notmuch_directory_t *directory;
+    int content_length = -1, is_headers;
+
+    if (stat (path, &st)) {
+	fprintf (stderr, "Error reading mbox file %s: %s\n",
+		 path, strerror (errno));
+	return NOTMUCH_STATUS_FILE_ERROR;
+    }
+
+    stat_time = time (NULL);
+    if (! S_ISREG (st.st_mode)) {
+	fprintf (stderr, "Error: %s is not a file.\n", path);
+	return NOTMUCH_STATUS_FILE_ERROR;
+    }
+
+    fs_mtime = st.st_mtime;
+
+    path_uri = talloc_asprintf (notmuch, "mbox://%s", path);
+    status = notmuch_database_get_directory (notmuch, path_uri, &directory);
+    if (status) {
+	ret = status;
+	goto DONE;
+    }
+    db_mtime = directory ? notmuch_directory_get_mtime (directory) : 0;
+
+    if (directory && db_mtime == fs_mtime) {
+	goto DONE;
+    }
+
+    mbox = fopen (path, "r");
+    if (mbox == NULL) {
+	fprintf (stderr, "Error: couldn't open %s for reading.\n",
+		 path);
+	ret = NOTMUCH_STATUS_FILE_ERROR;
+	goto DONE;
+    }
+
+    line_len = getline (&line, &line_size, mbox);
+
+    if (line_len == -1) {
+	fprintf (stderr, "Error: reading from %s failed: %s\n",
+		 path, strerror (errno));
+	ret = NOTMUCH_STATUS_FILE_ERROR;
+	goto DONE;
+    }
+
+    if (strncmp (line, "From ", 5) != 0) {
+	fprintf (stderr, "Note: Ignoring non-mbox file: %s\n",
+		 path);
+	ret = NOTMUCH_STATUS_FILE_ERROR;
+	goto DONE;
+    }
+
+    /* Loop invariant: At the beginning of the loop, we have just read
+     * a From_ line, but haven't yet read any of the headers.
+     */
+    while (! feof (mbox)) {
+	is_headers = 1;
+	offset = ftell (mbox);
+	content_length = -1;
+
+	/* Read lines until we either get to the next From_ header, or
+	 * we find a Content-Length header (mboxcl) and we run out of headers.
+	 */
+	do {
+	    /* Get the offset before we read, in case we got another From_ header. */
+	    end_offset = ftell (mbox);
+
+	    line_len = getline (&line, &line_size, mbox);
+
+	    /* Check to see if this line is a content-length header,
+	     * or the end of the headers. */
+	    if (is_headers && strncasecmp (line, "Content-Length: ",
+					   strlen ("Content-Length: ")) == 0)
+		content_length = strtol (line + strlen ("Content-Length: "),
+					 NULL, 10);
+
+	    if (is_headers && strlen (line) == 1 && *line == '\n') {
+		is_headers = 0;
+		/* If we got a content_length, skip the message body. */
+		if (content_length != -1) {
+		    fseek (mbox, content_length, SEEK_CUR);
+		    line_len = getline (&line, &line_size, mbox);
+
+		    /* We should be at the end of the message.  Sanity
+		     * check: there should be a blank line, and then
+		     * another From_ header. */
+		    if (strlen (line) != 1 || *line != '\n') {
+			fprintf (stderr, "Warning: message with Content-Length not "
+				 "immediately followed by blank line (%d)\n", offset);
+		    }
+
+		    end_offset = ftell (mbox);
+		    line_len = getline (&line, &line_size, mbox);
+
+		    if (line_len != -1 && strncmp (line, "From ", 5) != 0) {
+			fprintf (stderr, "Warning: message with Content-Length not "
+				 "immediately followed by another message (%d)\n", offset);
+			fprintf (stderr, "Line was: %s", line);
+		    }
+		}
+	    }
+
+	} while (! feof (mbox) && strncmp (line, "From ", 5) != 0);
+
+	/* end_offset is now after the \n but before the From_. */
+	message_uri = talloc_asprintf (notmuch, "mbox://%s#%d+%d",
+				       path, offset, (end_offset - 1) - offset);
+
+	_report_before_adding_file (state, message_uri);
+
+	status = _add_message (state, notmuch, message_uri);
+	if (status) {
+	    ret = status;
+	    goto DONE;
+	}
+
+	_report_added_file (state);
+
+	talloc_free (message_uri);
+	message_uri = NULL;
+    }
+
+    /* This is the same precaution we take in maildir. */
+    if (fs_mtime != stat_time)
+	_filename_list_add (state->directory_mtimes, path_uri)->mtime = fs_mtime;
+
+DONE:
+    if (line)
+	free (line);
+    if (path_uri)
+	talloc_free (path_uri);
+    if (message_uri)
+	talloc_free (message_uri);
+    if (directory)
+	notmuch_directory_destroy (directory);
+
+    return ret;
+}
+
+/*
+ * Examine path recursively as follows:
+ *
+ * - Recurse on each subdirectory, as in add_files.
+ *
+ * - Call add_messages_mbox_file on every non-directory.
+ */
+/* FIXME: this is almost entirely bits-and-pieces from
+ * add_files_recursive and could do with a refactor */
+static notmuch_status_t
+add_files_mbox (notmuch_database_t *notmuch,
+		const char *path,
+		add_files_state_t *state)
+{
+    struct dirent **fs_entries = NULL;
+    struct dirent *entry = NULL;
+    char *next = NULL;
+    int num_fs_entries = 0, i, entry_type;
+    struct stat st;
+    notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS, status;
+
+    if (stat (path, &st)) {
+	fprintf (stderr, "Error reading directory %s: %s\n",
+		 path, strerror (errno));
+	return NOTMUCH_STATUS_FILE_ERROR;
+    }
+
+    num_fs_entries = scandir (path, &fs_entries, 0,
+			      dirent_sort_inode);
+
+    if (num_fs_entries == -1) {
+	fprintf (stderr, "Error opening directory %s: %s\n",
+		 path, strerror (errno));
+	ret = NOTMUCH_STATUS_FILE_ERROR;
+	goto DONE;
+    }
+
+    for (i = 0; i < num_fs_entries; i++) {
+	if (interrupted)
+	    break;
+
+	entry = fs_entries[i];
+
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    /* Be pessimistic, e.g. so we don't lose lots of mail just
+	     * because a user broke a symlink. */
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFDIR) {
+	    continue;
+	}
+
+	/* Ignore special directories to avoid infinite recursion.
+	 * Also ignore the .notmuch directory, any "tmp" directory
+	 * that appears within a maildir and files/directories
+	 * the user has configured to be ignored.
+	 */
+	if (strcmp (entry->d_name, ".") == 0 ||
+	    strcmp (entry->d_name, "..") == 0 ||
+	    strcmp (entry->d_name, ".notmuch") == 0 ||
+	    _entry_in_ignore_list (entry->d_name, state))
+	{
+	    continue;
+	}
+
+	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
+	status = add_files_mbox (notmuch, next, state);
+	if (status) {
+	    ret = status;
+	    goto DONE;
+	}
+	talloc_free (next);
+	next = NULL;
+    }
+
+    /* Pass 2: Scan for new files, removed files, and removed directories. */
+    for (i = 0; i < num_fs_entries; i++)
+    {
+	if (interrupted)
+	    break;
+
+        entry = fs_entries[i];
+
+	/* Ignore files & directories user has configured to be ignored */
+	if (_entry_in_ignore_list (entry->d_name, state))
+	    continue;
+
+	/* Only add regular files (and symlinks to regular files). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFREG) {
+	    continue;
+	}
+
+	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
+	status = add_messages_mbox_file (notmuch, next, state);
+	talloc_free (next);
+	next = NULL;
+
+	if (status) {
+	    ret = status;
+	    goto DONE;
+	}
+    }
+
+DONE:
+    if (next)
+	talloc_free (next);
+    return ret;
+}
+
 /* Call out to the appropriate add_files function, based on the URI. */
 static notmuch_status_t
-add_files_uri (unused(notmuch_database_t *notmuch),
-	       unused(const char *uri),
-	       unused(add_files_state_t *state))
+add_files_uri (notmuch_database_t *notmuch,
+	       const char *uri,
+	       add_files_state_t *state)
 {
-    /* Stub for now */
-    return NOTMUCH_STATUS_SUCCESS;
+    UriUriA parsed;
+    UriParserStateA parser;
+    notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
+    parser.uri = &parsed;
+    if (uriParseUriA (&parser, uri) != URI_SUCCESS)
+	goto DONE;
+
+    if (strncmp (parsed.scheme.first, "mbox",
+		 parsed.scheme.afterLast - parsed.scheme.first) == 0) {
+	ret = add_files_mbox (notmuch, parsed.pathHead->text.first - 1, state);
+	goto DONE;
+    }
+
+DONE:
+    uriFreeUriMembersA (&parsed);
+    return ret;
 }
 
 static void
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-07-01 16:02       ` Ethan
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
@ 2012-07-01 16:48         ` Mark Walters
  2012-07-03  8:40         ` Jameson Graef Rollins
  2 siblings, 0 replies; 31+ messages in thread
From: Mark Walters @ 2012-07-01 16:48 UTC (permalink / raw)
  To: Ethan; +Cc: notmuch

On Sun, 01 Jul 2012, Ethan <ethan.glasser.camp@gmail.com> wrote:
> Thanks for going through it, I know there's a lot to go through..
>
> On Thu, Jun 28, 2012 at 4:45 PM, Mark Walters <markwalters1009@gmail.com>wrote:
>
>> I was thinking of just having one mail root and inside that there could
>> be maildirs and mboxes. Everything would still be relative to the root.
>>
>
> I'm hesitant to have directories that contain maildirs and mboxes. It
> should be possible to unambiguously distinguish between a maildir file and
> an mbox file (mboxes always start with "From ", no colon) but it sounds
> kind of fragile.

Well I was thinking you would still need to add specific sub-directories
of db_path that might contain mboxes. 

>>  1. Are URIs the way to specify individual messages, despite bremner's
>> >  concerns about too much of the API being strings? Is adding another
>> library
>> >  is the easiest way to parse URIs?
>>
>> In my opinion  the nice thing about using strings is that it does not
>> require
>> any changes to the Xapian database to store them. I think using URIs may
>> not be best though as they seem to be annoying to parse (as filenames
>> can contain the same characters) and you seem to need to work around the
>> parser in some cases.
>>
>
> I think that's more the fault of the parser than of the URIs. If glib came
> with a parser, that would be great. There aren't a lot of options for
> pure-C URI parsing. Besides uriparser, there's also some code in the W3C
> sample code library, but it looked like integrating it would be a pain so I
> let it go.
>
> I wonder if the following would be practical: use // as the field
>> separator:
>>
>> e.g. mbox://filename//start_of_message+length
>>
>> I think 2 consecutive slashes // is about the only thing we can assume
>> is not in the path or filename. Since it is not in the filename I think
>> parsing should be trivial (thus avoiding the extra library).
>>
>
> Can you explain what you mean when you say that two consecutive slashes
> can't appear in a URL? Ordinary filesystem paths can contain them, and so
> can file: URLs. (I just looked up file:///home/ethan///////tmp and Firefox
> handled that OK.) I've sometimes seen machine-generated filenames with
> double slashes because that way you don't have to make sure the incoming
> filename was correctly terminated before adding another level.

Nothing outside notmuch (i.e. other applications creating arbitrary
filenames etc) can make notmuch store a // as part of a path so if we
ever do store them in the database it's our own fault. In particular
notmuch can avoid them easily in that they cannot occur in a filename.


>> Secondly, I would prefer to keep maildirs as just the bare file name: so
>> the existence of // can be the signal that there is some other
>> scheme. This is asymmetric, but is rather more backwardly compatible.
>>
>
> Based on your and Jani's reasoning, I did this. Revised patch series
> follows.

I will try and look at that now.

Best wishes

Mark

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 1/8] All access to mail files goes through the mailstore module
  2012-07-01 16:39           ` [PATCH v2 1/8] All access to mail files goes through the mailstore module Ethan Glasser-Camp
@ 2012-07-01 19:48             ` Mark Walters
  0 siblings, 0 replies; 31+ messages in thread
From: Mark Walters @ 2012-07-01 19:48 UTC (permalink / raw)
  To: Ethan Glasser-Camp, notmuch; +Cc: Ethan Glasser-Camp

On Sun, 01 Jul 2012, Ethan Glasser-Camp <ethan.glasser.camp@gmail.com> wrote:
> This commit introduces the mailstore module which provides two
> functions, notmuch_mailstore_open and notmuch_mailstore_close. These
> functions are currently just stub calls to fopen and fclose, but later
> can be made more complex in order to support mail storage systems
> where one message might not be one file.
>
> No functional changes are made, but calls to fopen and fclose are
> replaced with notmuch_mailstore_open and notmuch_mailstore_close
> wherever they are being used to access messages, but not when talking
> about real files like in notmuch-dump or notmuch-restore.
>
> Since the sha1 function notmuch_sha1_of_file is only used on messages,
> we convert it to using notmuch_mailstore_open and
> notmuch_mailstore_close, and rename it notmuch_sha1_of_message. While
> we are there, we also replace a numeric constant with its symbolic
> name BLOCK_SIZE.
>
> Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
> ---
> This patch does not split notmuch_sha1_of_file the way the previous
> patch series did, because there are no other callers of
> notmuch_sha1_of_file. Instead it renames the function to
> notmuch_sha1_of_message.
>
>  lib/Makefile.local    |    1 +
>  lib/database.cc       |    2 +-
>  lib/index.cc          |    2 +-
>  lib/mailstore.c       |   34 ++++++++++++++++++++++++++++++++++
>  lib/message-file.c    |    6 +++---
>  lib/notmuch-private.h |    2 +-
>  lib/notmuch.h         |   16 ++++++++++++++++
>  lib/sha1.c            |   11 +++++------
>  mime-node.c           |    4 ++--
>  notmuch-show.c        |   12 ++++++------
>  10 files changed, 70 insertions(+), 20 deletions(-)
>  create mode 100644 lib/mailstore.c

This patch, though large, is simple and looks obviously correct. Indeed,
given that it is really just abstracting out fopen/fclose into notmuch
specific versions it would be a good first step for any more general
mail store so it might be reasonable to accept it independently of the
rest of the series.

Best wishes

Mark

>
> diff --git a/lib/Makefile.local b/lib/Makefile.local
> index 8a9aa28..cfc77bb 100644
> --- a/lib/Makefile.local
> +++ b/lib/Makefile.local
> @@ -51,6 +51,7 @@ libnotmuch_c_srcs =		\
>  	$(dir)/filenames.c	\
>  	$(dir)/string-list.c	\
>  	$(dir)/libsha1.c	\
> +	$(dir)/mailstore.c	\
>  	$(dir)/message-file.c	\
>  	$(dir)/messages.c	\
>  	$(dir)/sha1.c		\
> diff --git a/lib/database.cc b/lib/database.cc
> index 761dc1a..c035edc 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -1773,7 +1773,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
>  	if (message_id == NULL ) {
>  	    /* No message-id at all, let's generate one by taking a
>  	     * hash over the file's contents. */
> -	    char *sha1 = notmuch_sha1_of_file (filename);
> +	    char *sha1 = notmuch_sha1_of_message (filename);
>  
>  	    /* If that failed too, something is really wrong. Give up. */
>  	    if (sha1 == NULL) {
> diff --git a/lib/index.cc b/lib/index.cc
> index e377732..b607e82 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -441,7 +441,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
>  	initialized = 1;
>      }
>  
> -    file = fopen (filename, "r");
> +    file = notmuch_mailstore_open (filename);
>      if (! file) {
>  	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
>  	ret = NOTMUCH_STATUS_FILE_ERROR;
> diff --git a/lib/mailstore.c b/lib/mailstore.c
> new file mode 100644
> index 0000000..48acd47
> --- /dev/null
> +++ b/lib/mailstore.c
> @@ -0,0 +1,34 @@
> +/* mailstore.c - code to access individual messages
> + *
> + * Copyright © 2009 Carl Worth
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see http://www.gnu.org/licenses/ .
> + *
> + * Author: Carl Worth <cworth@cworth.org>
> + */
> +#include <stdio.h>
> +
> +#include "notmuch-private.h"
> +
> +FILE *
> +notmuch_mailstore_open (const char *filename)
> +{
> +    return fopen (filename, "r");
> +}
> +
> +int
> +notmuch_mailstore_close (FILE *file)
> +{
> +    return fclose (file);
> +}
> diff --git a/lib/message-file.c b/lib/message-file.c
> index 915aba8..271389c 100644
> --- a/lib/message-file.c
> +++ b/lib/message-file.c
> @@ -86,7 +86,7 @@ _notmuch_message_file_destructor (notmuch_message_file_t *message)
>  	g_hash_table_destroy (message->headers);
>  
>      if (message->file)
> -	fclose (message->file);
> +	notmuch_mailstore_close (message->file);
>  
>      return 0;
>  }
> @@ -104,7 +104,7 @@ _notmuch_message_file_open_ctx (void *ctx, const char *filename)
>  
>      talloc_set_destructor (message, _notmuch_message_file_destructor);
>  
> -    message->file = fopen (filename, "r");
> +    message->file = notmuch_mailstore_open (filename);
>      if (message->file == NULL)
>  	goto FAIL;
>  
> @@ -361,7 +361,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
>      }
>  
>      if (message->parsing_finished) {
> -        fclose (message->file);
> +        notmuch_mailstore_close (message->file);
>          message->file = NULL;
>      }
>  
> diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> index bfb4111..ef1f994 100644
> --- a/lib/notmuch-private.h
> +++ b/lib/notmuch-private.h
> @@ -466,7 +466,7 @@ char *
>  notmuch_sha1_of_string (const char *str);
>  
>  char *
> -notmuch_sha1_of_file (const char *filename);
> +notmuch_sha1_of_message (const char *filename);
>  
>  /* string-list.c */
>  
> diff --git a/lib/notmuch.h b/lib/notmuch.h
> index 3633bed..0ca367b 100644
> --- a/lib/notmuch.h
> +++ b/lib/notmuch.h
> @@ -1233,6 +1233,22 @@ notmuch_message_thaw (notmuch_message_t *message);
>  void
>  notmuch_message_destroy (notmuch_message_t *message);
>  
> +/* Get access to the underlying data of a message.
> + *
> + * Right now, all messages are simple files in maildirs, but this is
> + * hopefully subject to change.
> + *
> + * If the returned FILE* is not NULL, be sure to call
> + * notmuch_mailstore_close when you're done with it.
> + */
> +FILE *
> +notmuch_mailstore_open (const char *filename);
> +
> +/* Release any resources associated with this file.
> + */
> +int
> +notmuch_mailstore_close (FILE *file);
> +
>  /* Is the given 'tags' iterator pointing at a valid tag.
>   *
>   * When this function returns TRUE, notmuch_tags_get will return a
> diff --git a/lib/sha1.c b/lib/sha1.c
> index cc48108..f4d213a 100644
> --- a/lib/sha1.c
> +++ b/lib/sha1.c
> @@ -65,7 +65,7 @@ notmuch_sha1_of_string (const char *str)
>  }
>  
>  /* Create a hexadecimal string version of the SHA-1 digest of the
> - * contents of the named file.
> + * contents of the named message.
>   *
>   * This function returns a newly allocated string which the caller
>   * should free() when finished.
> @@ -74,7 +74,7 @@ notmuch_sha1_of_string (const char *str)
>   * file not found, etc.), this function returns NULL.
>   */
>  char *
> -notmuch_sha1_of_file (const char *filename)
> +notmuch_sha1_of_message (const char *filename)
>  {
>      FILE *file;
>  #define BLOCK_SIZE 4096
> @@ -84,14 +84,14 @@ notmuch_sha1_of_file (const char *filename)
>      unsigned char digest[SHA1_DIGEST_SIZE];
>      char *result;
>  
> -    file = fopen (filename, "r");
> +    file = notmuch_mailstore_open (filename);
>      if (file == NULL)
>  	return NULL;
>  
>      sha1_begin (&sha1);
>  
>      while (1) {
> -	bytes_read = fread (block, 1, 4096, file);
> +	bytes_read = fread (block, 1, BLOCK_SIZE, file);
>  	if (bytes_read == 0) {
>  	    if (feof (file)) {
>  		break;
> @@ -108,8 +108,7 @@ notmuch_sha1_of_file (const char *filename)
>  
>      result = _hex_of_sha1_digest (digest);
>  
> -    fclose (file);
> +    notmuch_mailstore_close (file);
>  
>      return result;
>  }
> -
> diff --git a/mime-node.c b/mime-node.c
> index 97e8b48..a5c60d0 100644
> --- a/mime-node.c
> +++ b/mime-node.c
> @@ -49,7 +49,7 @@ _mime_node_context_free (mime_node_context_t *res)
>  	g_object_unref (res->stream);
>  
>      if (res->file)
> -	fclose (res->file);
> +	notmuch_mailstore_close (res->file);
>  
>      return 0;
>  }
> @@ -79,7 +79,7 @@ mime_node_open (const void *ctx, notmuch_message_t *message,
>      }
>      talloc_set_destructor (mctx, _mime_node_context_free);
>  
> -    mctx->file = fopen (filename, "r");
> +    mctx->file = notmuch_mailstore_open (filename);
>      if (! mctx->file) {
>  	fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
>  	status = NOTMUCH_STATUS_FILE_ERROR;
> diff --git a/notmuch-show.c b/notmuch-show.c
> index 8f3c60e..daffb59 100644
> --- a/notmuch-show.c
> +++ b/notmuch-show.c
> @@ -693,7 +693,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
>  	INTERNAL_ERROR ("format_part_mbox requires a root part");
>  
>      filename = notmuch_message_get_filename (message);
> -    file = fopen (filename, "r");
> +    file = notmuch_mailstore_open (filename);
>      if (file == NULL) {
>  	fprintf (stderr, "Failed to open %s: %s\n",
>  		 filename, strerror (errno));
> @@ -717,7 +717,7 @@ format_part_mbox (const void *ctx, mime_node_t *node, unused (int indent),
>  
>      printf ("\n");
>  
> -    fclose (file);
> +    notmuch_mailstore_close (file);
>  
>      return NOTMUCH_STATUS_SUCCESS;
>  }
> @@ -740,7 +740,7 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
>  	    return NOTMUCH_STATUS_FILE_ERROR;
>  	}
>  
> -	file = fopen (filename, "r");
> +	file = notmuch_mailstore_open (filename);
>  	if (file == NULL) {
>  	    fprintf (stderr, "Error: Cannot open file %s: %s\n", filename, strerror (errno));
>  	    return NOTMUCH_STATUS_FILE_ERROR;
> @@ -750,18 +750,18 @@ format_part_raw (unused (const void *ctx), mime_node_t *node,
>  	    size = fread (buf, 1, sizeof (buf), file);
>  	    if (ferror (file)) {
>  		fprintf (stderr, "Error: Read failed from %s\n", filename);
> -		fclose (file);
> +		notmuch_mailstore_close (file);
>  		return NOTMUCH_STATUS_FILE_ERROR;
>  	    }
>  
>  	    if (fwrite (buf, size, 1, stdout) != 1) {
>  		fprintf (stderr, "Error: Write failed\n");
> -		fclose (file);
> +		notmuch_mailstore_close (file);
>  		return NOTMUCH_STATUS_FILE_ERROR;
>  	    }
>  	}
>  
> -	fclose (file);
> +	notmuch_mailstore_close (file);
>  	return NOTMUCH_STATUS_SUCCESS;
>      }
>  
> -- 
> 1.7.9.5
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 5/8] notmuch-new: pull out useful bits of add_files_recursive
  2012-07-01 16:39           ` [PATCH v2 5/8] notmuch-new: pull out useful bits of add_files_recursive Ethan Glasser-Camp
@ 2012-07-01 19:55             ` Mark Walters
  0 siblings, 0 replies; 31+ messages in thread
From: Mark Walters @ 2012-07-01 19:55 UTC (permalink / raw)
  To: Ethan Glasser-Camp, notmuch; +Cc: Ethan Glasser-Camp

On Sun, 01 Jul 2012, Ethan Glasser-Camp <ethan.glasser.camp@gmail.com> wrote:
> This patch pulls some bits out of add_files_recursive which will be
> useful for other backends: two reporting functions
> _report_before_adding_file and _report_added_file, as well as
> _add_message, which actually does the message adding.
>
> No functional changes.

This looks fine. Personally, I would split it into two or three bits so
it is easier to see that it is just code movement.

Best wishes

Mark

>
> Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
> ---
>  notmuch-new.c |  192 +++++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 119 insertions(+), 73 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index 8377750..5250562 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -239,6 +239,122 @@ _entry_in_ignore_list (const char *entry, add_files_state_t *state)
>      return FALSE;
>  }
>  
> +/* Progress-reporting function.
> + *
> + * Can be used by any mailstore-crawling function that wants to alert
> + * users what message it's about to add. Subsequent errors will be due
> + * to this message ;)
> + */
> +static void
> +_report_before_adding_file (add_files_state_t *state, const char *filename)
> +{
> +    state->processed_files++;
> +
> +    if (state->verbose) {
> +	if (state->output_is_a_tty)
> +	    printf("\r\033[K");
> +
> +	printf ("%i/%i: %s",
> +		state->processed_files,
> +		state->total_files,
> +		filename);
> +
> +	putchar((state->output_is_a_tty) ? '\r' : '\n');
> +	fflush (stdout);
> +    }
> +}
> +
> +/* Progress-reporting function.
> + *
> + * Call this to respond to the signal handler for SIGALRM.
> + */
> +static void
> +_report_added_file (add_files_state_t *state)
> +{
> +    if (do_print_progress) {
> +	do_print_progress = 0;
> +	generic_print_progress ("Processed", "files", state->tv_start,
> +				state->processed_files, state->total_files);
> +    }
> +}
> +
> +
> +/* Atomically handles adding a message to the database.
> + *
> + * Should be used by any mailstore-crawling function that finds a new
> + * message to add.
> + */
> +static notmuch_status_t
> +_add_message (add_files_state_t *state, notmuch_database_t *notmuch,
> +	      const char *filename)
> +{
> +    notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
> +    notmuch_message_t *message;
> +    const char **tag;
> +
> +    status = notmuch_database_begin_atomic (notmuch);
> +    if (status) {
> +	ret = status;
> +	goto DONE;
> +    }
> +
> +    status = notmuch_database_add_message (notmuch, filename, &message);
> +
> +    switch (status) {
> +    /* success */
> +    case NOTMUCH_STATUS_SUCCESS:
> +	state->added_messages++;
> +	notmuch_message_freeze (message);
> +	for (tag=state->new_tags; *tag != NULL; tag++)
> +	    notmuch_message_add_tag (message, *tag);
> +	if (state->synchronize_flags == TRUE)
> +	    notmuch_message_maildir_flags_to_tags (message);
> +	notmuch_message_thaw (message);
> +	break;
> +    /* Non-fatal issues (go on to next file) */
> +    case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:
> +	if (state->synchronize_flags == TRUE)
> +	    notmuch_message_maildir_flags_to_tags (message);
> +	break;
> +    case NOTMUCH_STATUS_FILE_NOT_EMAIL:
> +	fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
> +		 filename);
> +	break;
> +    /* Fatal issues. Don't process anymore. */
> +    case NOTMUCH_STATUS_READ_ONLY_DATABASE:
> +    case NOTMUCH_STATUS_XAPIAN_EXCEPTION:
> +    case NOTMUCH_STATUS_OUT_OF_MEMORY:
> +	fprintf (stderr, "Error: %s. Halting processing.\n",
> +		 notmuch_status_to_string (status));
> +	ret = status;
> +	goto DONE;
> +    default:
> +    case NOTMUCH_STATUS_FILE_ERROR:
> +    case NOTMUCH_STATUS_NULL_POINTER:
> +    case NOTMUCH_STATUS_TAG_TOO_LONG:
> +    case NOTMUCH_STATUS_UNBALANCED_FREEZE_THAW:
> +    case NOTMUCH_STATUS_UNBALANCED_ATOMIC:
> +    case NOTMUCH_STATUS_LAST_STATUS:
> +	INTERNAL_ERROR ("add_message returned unexpected value: %d",  status);
> +	ret = status;
> +	goto DONE;
> +    }
> +
> +    status = notmuch_database_end_atomic (notmuch);
> +    if (status) {
> +	ret = status;
> +	goto DONE;
> +    }
> +
> +  DONE:
> +    if (message) {
> +	notmuch_message_destroy (message);
> +	message = NULL;
> +    }
> +
> +    return ret;
> +}
> +
>  /* Examine 'path' recursively as follows:
>   *
>   *   o Ask the filesystem for the mtime of 'path' (fs_mtime)
> @@ -290,7 +406,6 @@ add_files (notmuch_database_t *notmuch,
>      char *next = NULL;
>      time_t fs_mtime, db_mtime;
>      notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
> -    notmuch_message_t *message = NULL;
>      struct dirent **fs_entries = NULL;
>      int i, num_fs_entries = 0, entry_type;
>      notmuch_directory_t *directory;
> @@ -299,7 +414,6 @@ add_files (notmuch_database_t *notmuch,
>      time_t stat_time;
>      struct stat st;
>      notmuch_bool_t is_maildir;
> -    const char **tag;
>  
>      if (stat (path, &st)) {
>  	fprintf (stderr, "Error reading directory %s: %s\n",
> @@ -469,83 +583,15 @@ add_files (notmuch_database_t *notmuch,
>  	 * in the database, so add it. */
>  	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
>  
> -	state->processed_files++;
> -
> -	if (state->verbose) {
> -	    if (state->output_is_a_tty)
> -		printf("\r\033[K");
> -
> -	    printf ("%i/%i: %s",
> -		    state->processed_files,
> -		    state->total_files,
> -		    next);
> -
> -	    putchar((state->output_is_a_tty) ? '\r' : '\n');
> -	    fflush (stdout);
> -	}
> +	_report_before_adding_file (state, next);
>  
> -	status = notmuch_database_begin_atomic (notmuch);
> +	status = _add_message (state, notmuch, next);
>  	if (status) {
>  	    ret = status;
>  	    goto DONE;
>  	}
>  
> -	status = notmuch_database_add_message (notmuch, next, &message);
> -	switch (status) {
> -	/* success */
> -	case NOTMUCH_STATUS_SUCCESS:
> -	    state->added_messages++;
> -	    notmuch_message_freeze (message);
> -	    for (tag=state->new_tags; *tag != NULL; tag++)
> -	        notmuch_message_add_tag (message, *tag);
> -	    if (state->synchronize_flags == TRUE)
> -		notmuch_message_maildir_flags_to_tags (message);
> -	    notmuch_message_thaw (message);
> -	    break;
> -	/* Non-fatal issues (go on to next file) */
> -	case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:
> -	    if (state->synchronize_flags == TRUE)
> -		notmuch_message_maildir_flags_to_tags (message);
> -	    break;
> -	case NOTMUCH_STATUS_FILE_NOT_EMAIL:
> -	    fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
> -		     next);
> -	    break;
> -	/* Fatal issues. Don't process anymore. */
> -	case NOTMUCH_STATUS_READ_ONLY_DATABASE:
> -	case NOTMUCH_STATUS_XAPIAN_EXCEPTION:
> -	case NOTMUCH_STATUS_OUT_OF_MEMORY:
> -	    fprintf (stderr, "Error: %s. Halting processing.\n",
> -		     notmuch_status_to_string (status));
> -	    ret = status;
> -	    goto DONE;
> -	default:
> -	case NOTMUCH_STATUS_FILE_ERROR:
> -	case NOTMUCH_STATUS_NULL_POINTER:
> -	case NOTMUCH_STATUS_TAG_TOO_LONG:
> -	case NOTMUCH_STATUS_UNBALANCED_FREEZE_THAW:
> -	case NOTMUCH_STATUS_UNBALANCED_ATOMIC:
> -	case NOTMUCH_STATUS_LAST_STATUS:
> -	    INTERNAL_ERROR ("add_message returned unexpected value: %d",  status);
> -	    goto DONE;
> -	}
> -
> -	status = notmuch_database_end_atomic (notmuch);
> -	if (status) {
> -	    ret = status;
> -	    goto DONE;
> -	}
> -
> -	if (message) {
> -	    notmuch_message_destroy (message);
> -	    message = NULL;
> -	}
> -
> -	if (do_print_progress) {
> -	    do_print_progress = 0;
> -	    generic_print_progress ("Processed", "files", state->tv_start,
> -				    state->processed_files, state->total_files);
> -	}
> +	_report_added_file (state);
>  
>  	talloc_free (next);
>  	next = NULL;
> -- 
> 1.7.9.5
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 00/14] modular mail stores based on URIs
  2012-07-01 16:02       ` Ethan
  2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
  2012-07-01 16:48         ` [RFC PATCH 00/14] modular mail stores based on URIs Mark Walters
@ 2012-07-03  8:40         ` Jameson Graef Rollins
  2 siblings, 0 replies; 31+ messages in thread
From: Jameson Graef Rollins @ 2012-07-03  8:40 UTC (permalink / raw)
  To: Ethan, Mark Walters; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 868 bytes --]

On Sun, Jul 01 2012, Ethan <ethan.glasser.camp@gmail.com> wrote:
>> I wonder if the following would be practical: use // as the field
>> separator:
>>
>> e.g. mbox://filename//start_of_message+length
>>
>> I think 2 consecutive slashes // is about the only thing we can assume
>> is not in the path or filename. Since it is not in the filename I think
>> parsing should be trivial (thus avoiding the extra library).
>>
>
> Can you explain what you mean when you say that two consecutive slashes
> can't appear in a URL? Ordinary filesystem paths can contain them, and so
> can file: URLs. (I just looked up file:///home/ethan///////tmp and Firefox
> handled that OK.)

Does firefox resolve this as /home/ethan/tmp, or /home/ethan///////tmp?
For me iceweasel resolves it as /home/jrollins/tmp.  I think this is
might be the behavior to which Mark is referring.

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2012-07-03  8:40 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-25 20:41 [RFC PATCH 00/14] modular mail stores based on URIs Ethan Glasser-Camp
2012-06-25 20:41 ` [RFC PATCH 01/14] All access to mail files goes through the mailstore module Ethan Glasser-Camp
2012-06-28 20:48   ` Mark Walters
2012-06-25 20:41 ` [RFC PATCH 02/14] Introduce uriparser Ethan Glasser-Camp
2012-06-25 20:41 ` [RFC PATCH 03/14] mailstore can read from maildir: URLs Ethan Glasser-Camp
2012-06-25 20:41 ` [RFC PATCH 04/14] Not all filenames need to be converted to absolute paths Ethan Glasser-Camp
2012-06-27  9:17 ` [RFC PATCH 00/14] modular mail stores based on URIs Mark Walters
2012-06-28  7:39   ` Ethan
2012-06-28 15:13     ` David Bremner
2012-06-28 20:41       ` Robert Horn
2012-06-28 20:45     ` Mark Walters
2012-07-01 16:02       ` Ethan
2012-07-01 16:39         ` [PATCH v2 0/8] URI-based modular mail stores Ethan Glasser-Camp
2012-07-01 16:39           ` [PATCH v2 1/8] All access to mail files goes through the mailstore module Ethan Glasser-Camp
2012-07-01 19:48             ` Mark Walters
2012-07-01 16:39           ` [PATCH v2 2/8] Introduce uriparser Ethan Glasser-Camp
2012-07-01 16:39           ` [PATCH v2 3/8] Not all filenames need to be converted to absolute paths Ethan Glasser-Camp
2012-07-01 16:39           ` [PATCH v2 4/8] new: add "scan" option Ethan Glasser-Camp
2012-07-01 16:39           ` [PATCH v2 5/8] notmuch-new: pull out useful bits of add_files_recursive Ethan Glasser-Camp
2012-07-01 19:55             ` Mark Walters
2012-07-01 16:39           ` [PATCH v2 6/8] mailstore: support for mbox:// URIs Ethan Glasser-Camp
2012-07-01 16:39           ` [PATCH v2 7/8] Tests for mbox support Ethan Glasser-Camp
2012-07-01 16:39           ` [PATCH v2 8/8] new: Add scan support for mbox:// URIs Ethan Glasser-Camp
2012-07-01 16:48         ` [RFC PATCH 00/14] modular mail stores based on URIs Mark Walters
2012-07-03  8:40         ` Jameson Graef Rollins
2012-06-28 17:36 ` Jameson Graef Rollins
2012-06-28 18:39   ` Ethan
2012-06-28 22:00 ` Mark Walters
2012-06-29  6:43   ` Ethan
2012-06-29  7:00     ` Mark Walters
2012-06-29  7:43     ` Jani Nikula

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).