From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
 by olra.theworths.org (Postfix) with ESMTP id F405E429E25
 for ; Tue, 20 Dec 2011 17:00:56 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: -0.1
X-Spam-Level:
X-Spam-Status: No, score=-0.1 tagged_above=-999 required=5
 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1]
 autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id pQGlbAVjCLTH for ; Tue, 20 Dec 2011 17:00:55 -0800 (PST)
Received: from ks3536.kimsufi.com (schnouki.net [87.98.217.222])
 by olra.theworths.org (Postfix) with ESMTP id B3647431FB6
 for ; Tue, 20 Dec 2011 17:00:55 -0800 (PST)
Received: from odin.local (nancy.schnouki.net [78.238.0.45])
 by ks3536.kimsufi.com (Postfix) with ESMTPSA id 650566A0026;
 Wed, 21 Dec 2011 02:00:54 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=schnouki.net;
 s=key-schnouki; t=1324429254;
 bh=ehG+Gitxjap0IVnzIv/mX02SMfPgOZYy+gPVqA3o9IU=;
 h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID:
 MIME-Version:Content-Type;
 b=eUHEzu+kZbqwwtZqrgQfP7XyaUf8Tg+oEkW9Awij7E7x3DWw+UNUPKruayxb5u+7g
 hfwjNBtbzFv0BhgGnJG5N73PDCPClUfURYQHErfq8lrDtFCQz949KEz50TH7rFuL78
 RN8WRpYMXt0XW7x3vfxodiu09+7BSwzzxMKNz9PQ=
From: Thomas Jost
To: Austin Clements
Subject: Re: [PATCH 2/5] lib: Add a MTIME value to every mail document
In-Reply-To: <20111215004507.GF2760@mit.edu>
References: <1323796305-28789-1-git-send-email-schnouki@schnouki.net>
 <1323796305-28789-3-git-send-email-schnouki@schnouki.net>
 <20111215004507.GF2760@mit.edu>
User-Agent: Notmuch/0.10.2+122~g1d81c5e (http://notmuchmail.org)
 Emacs/24.0.92.1 (x86_64-unknown-linux-gnu)
Date: Wed, 21 Dec 2011 02:00:53 +0100
Message-ID: <878vm6agca.fsf@schnouki.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1;
 protocol="application/pgp-signature"
Cc: notmuch@notmuchmail.org
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 21 Dec 2011 01:00:57 -0000

--=-=-=
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

On Wed, 14 Dec 2011 19:45:07 -0500, Austin Clements wrote:
> A few minor comments below.
>
> At a higher level, I'm curious what the tag synchronization protocol
> you're building on top of this is. I can't think of one that doesn't
> have race conditions, but maybe I'm not thinking about it right.

The approach I've used is quite different from what you described in
id:"20111219194821.GA10376@mit.edu". I don't sync host A directly to
host B; I use a server in the middle. (A is my laptop --> not always
on, B is my work PC --> turned off when I'm out of office, so a direct
sync would be harder to do.)

My nm-sync script is written in Python 2 (2.7, may work with 2.6) and is
present on both my PCs and on my server. It can operate in two modes:
client (when run from one of my PCs) or server (called *from the client*
through ssh, running on my server).

When running in server mode, the script manipulates a small DB stored as
a Python dictionary (and saved to disk with the pickle module). It does
not even need notmuch to be installed on the server. Here is what this
DB looks like:

  {
    "lastseen": {
      "pc_A": 1324428029,
      "pc_B": 1323952028
    },
    "messages": {
      "msgid_001": (mtime, tag1, tag2, ..., tagN),
      "msgid_002": (mtime, tag1, tag2, ..., tagM),
      ...
    }
  }

So when running the client, here is what happens:

 1. client starts a subprocess: "ssh myserver ~/nm-sync server"
 2. client and server check that their sha1sums match (to avoid version
    mismatch)
 3. client identifies itself with its hostname ("pc_A" in the example
    above), server replies with its "lastseen" value and updates it in
    the DB
 4. server sends to client messages with mtime > lastseen (msgid + mtime
    + tags), client updates the notmuch DB with these values
 5. client queries the notmuch DB for messages with mtime > lastseen and
    sends them (msgid + mtime + tags) to the server, which stores them
    in the DB
 6. cleanup: server removes messages with mtime < min(lastseen) from its
    DB

So basically this approach assumes that all clocks are synchronized
(everyone uses ntp, right?...) and does not even try to detect
conflicts: if a message has been modified both locally and remotely,
then the local version will be overwritten by the remote one, period. It
should also work with more than 2 hosts (not tested yet). No sync data
is kept in the notmuch DB.

Right now all of this fits in about 250 lines of Python (could be made
shorter) and works quite well for me. I'll put it online after doing
some cleanup.

> Quoth Thomas Jost on Dec 13 at 6:11 pm:
> > This is a time_t value, similar to the message date (TIMESTAMP). It is first set
> > when the message is added to the database, and is then updated every time a tag
> > is added or removed. It can thus be used for doing incremental dumps of the
> > database or for synchronizing it between several computers.
> >
> > This value can be read freely (with notmuch_message_get_mtime()) but for now it
> > can't be set to an arbitrary value: it can only be set to "now" when updated.
> > There's no specific reason for this except that I don't really see a real use
> > case for setting it to an arbitrary value.
> > ---
> >  lib/database.cc       |    7 ++++++-
> >  lib/message.cc        |   32 ++++++++++++++++++++++++++++++++
> >  lib/notmuch-private.h |    6 +++++-
> >  lib/notmuch.h         |    4 ++++
> >  4 files changed, 47 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index 2025189..6dc6f73 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -81,7 +81,7 @@ typedef struct {
> >   *      STRING is the name of a file within that
> >   *      directory for this mail message.
> >   *
> > - * A mail document also has four values:
> > + * A mail document also has five values:
> >   *
> >   *    TIMESTAMP:  The time_t value corresponding to the message's
> >   *                Date header.
> > @@ -92,6 +92,9 @@ typedef struct {
> >   *
> >   *    SUBJECT:    The value of the "Subject" header
> >   *
> > + *    MTIME:      The time_t value corresponding to the last time
> > + *                a tag was added or removed on the message.
> > + *
> >   * In addition, terms from the content of the message are added with
> >   * "from", "to", "attachment", and "subject" prefixes for use by the
> >   * user in searching. Similarly, terms from the path of the mail
> > @@ -1735,6 +1738,8 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
> >          date = notmuch_message_file_get_header (message_file, "date");
> >          _notmuch_message_set_header_values (message, date, from, subject);
> >
> > +        _notmuch_message_update_mtime (message);
>
> Indentation.

Fixed, thanks.

>
> > +
> >          _notmuch_message_index_file (message, filename);
> >      } else {
> >          ret = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID;
> > diff --git a/lib/message.cc b/lib/message.cc
> > index 0075425..0c98589 100644
> > --- a/lib/message.cc
> > +++ b/lib/message.cc
> > @@ -830,6 +830,34 @@ _notmuch_message_set_header_values (notmuch_message_t *message,
> >      message->doc.add_value (NOTMUCH_VALUE_SUBJECT, subject);
> >  }
> >
> > +/* Get the message mtime, i.e. when it was added or the last time a tag was
> > + * added/removed. */
> > +time_t
> > +notmuch_message_get_mtime (notmuch_message_t *message)
> > +{
> > +    std::string value;
> > +
> > +    try {
> > +        value = message->doc.get_value (NOTMUCH_VALUE_MTIME);
> > +    } catch (Xapian::Error &error) {
> > +        INTERNAL_ERROR ("Failed to read mtime value from document.");
> > +        return 0;
> > +    }
>
> For compatibility, this should handle the case when
> NOTMUCH_VALUE_MTIME is missing, probably by just returning 0. As it
> is, value will be an empty string and sortable_unserialise is
> undefined on strings that weren't produced by sortable_serialise.

Right. I think I rebuilt my DB just after implementing this, which
explains why I did not notice that myself. Thanks!

> > +
> > +    return Xapian::sortable_unserialise (value);
> > +}
> > +
> > +/* Set the message mtime to "now". */
> > +void
> > +_notmuch_message_update_mtime (notmuch_message_t *message)
> > +{
> > +    time_t time_value;
> > +
> > +    time_value = time (NULL);
> > +    message->doc.add_value (NOTMUCH_VALUE_MTIME,
> > +                            Xapian::sortable_serialise (time_value));
>
> Indentation.

Noted too. It's really time I start using dtrt-indent.

>
> > +}
> > +
> >  /* Synchronize changes made to message->doc out into the database. */
> >  void
> >  _notmuch_message_sync (notmuch_message_t *message)
> > @@ -994,6 +1022,8 @@ notmuch_message_add_tag (notmuch_message_t *message, const char *tag)
> >                          private_status);
> >      }
> >
> > +    _notmuch_message_update_mtime (message);
> > +
> >      if (! message->frozen)
> >          _notmuch_message_sync (message);
> >
> > @@ -1022,6 +1052,8 @@ notmuch_message_remove_tag (notmuch_message_t *message, const char *tag)
> >                          private_status);
> >      }
> >
> > +    _notmuch_message_update_mtime (message);
> > +
> >      if (! message->frozen)
> >          _notmuch_message_sync (message);
> >
> > diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> > index 60a932f..9859872 100644
> > --- a/lib/notmuch-private.h
> > +++ b/lib/notmuch-private.h
> > @@ -95,7 +95,8 @@ typedef enum {
> >      NOTMUCH_VALUE_TIMESTAMP = 0,
> >      NOTMUCH_VALUE_MESSAGE_ID,
> >      NOTMUCH_VALUE_FROM,
> > -    NOTMUCH_VALUE_SUBJECT
> > +    NOTMUCH_VALUE_SUBJECT,
> > +    NOTMUCH_VALUE_MTIME
> >  } notmuch_value_t;
> >
> >  /* Xapian (with flint backend) complains if we provide a term longer
> > @@ -276,6 +277,9 @@ _notmuch_message_set_header_values (notmuch_message_t *message,
> >                                      const char *from,
> >                                      const char *subject);
> >  void
> > +_notmuch_message_update_mtime (notmuch_message_t *message);
> > +
> > +void
> >  _notmuch_message_sync (notmuch_message_t *message);
> >
> >  notmuch_status_t
> > diff --git a/lib/notmuch.h b/lib/notmuch.h
> > index 9f23a10..643ebce 100644
> > --- a/lib/notmuch.h
> > +++ b/lib/notmuch.h
> > @@ -910,6 +910,10 @@ notmuch_message_set_flag (notmuch_message_t *message,
> >  time_t
> >  notmuch_message_get_date (notmuch_message_t *message);
> >
> > +/* Get the mtime of 'message' as a time_t value. */
> > +time_t
> > +notmuch_message_get_mtime (notmuch_message_t *message);
> > +
> >  /* Get the value of the specified header from 'message'.
> >   *
> >   * The value will be read from the actual message file, not from the

-- 
Thomas/Schnouki

--=-=-=
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJO8S/FAAoJEMPdciX+bh5ILyoH/0H/94+LglcGxWTx8YN7uRfw
6IeK8fPHm+ykKVSTEYrvsREi+FvkjOeLAuoMp1kOH2g1Z7O7dCTLrsYgjJcMrlMW
b1lbZam6Vo5bd5455p9DY+SzSq2J29hRpyxp0yqM3u7Awbqi/eGWsfQXW1FUNSmg
fQ2ra3wsP6d1o3oG09sAft9nkBpajcWimfG5bMaXy32srlSQXZU1uUjxAh7NyI+a
tGiKYKE5Q9tCNmhv0EuToh7PoBNLH18RWYxtTicfp6Kcz2dtE+U3xfrSiieMu6PD
+YSyTyEqu755yqZuZPVXmajsbvsgvGWHus5Bm1KMS2882MFJ2q6PjRFcgllI1YI=
=w5vf
-----END PGP SIGNATURE-----
--=-=-=--
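
The server-side bookkeeping described in the message (steps 3 to 6 of
the sync) can be sketched roughly as below. This is only an
illustration under stated assumptions, not the nm-sync script itself,
which the message says was not yet published: the function name
sync_round, its parameters, the explicit "now" argument, and the
default of 0 for a host that has never synced are all hypothetical, and
the sketch is written for Python 3 rather than the Python 2 mentioned
above. As in the description, no conflict detection is attempted.

```python
def sync_round(db, host, client_changes, now):
    """Run one sync round for `host` against the server-side DB.

    db             -- {"lastseen": {host: time},
                       "messages": {msgid: (mtime, tag1, ...)}}
    client_changes -- messages the client modified since its last sync,
                      in the same {msgid: (mtime, tag1, ...)} shape
    now            -- current time as an integer timestamp
                      (hypothetical parameter, passed in for testability)

    Returns the messages the client should apply to its notmuch DB.
    """
    # Step 3: look up when this host last synced, then record the new time.
    # A host that has never synced is assumed to have lastseen == 0.
    lastseen = db["lastseen"].get(host, 0)
    db["lastseen"][host] = now

    # Step 4: everything changed on the server since then goes to the client.
    to_client = {
        msgid: entry
        for msgid, entry in db["messages"].items()
        if entry[0] > lastseen
    }

    # Step 5: store the client's changes; they replace existing entries.
    db["messages"].update(client_changes)

    # Step 6: drop entries older than the least recently seen host,
    # since every known host has already received them.
    cutoff = min(db["lastseen"].values())
    db["messages"] = {
        msgid: entry
        for msgid, entry in db["messages"].items()
        if entry[0] >= cutoff
    }
    return to_client
```

Because the cleanup cutoff is the minimum over all "lastseen" values, an
entry stays on the server until every known host has synced past it,
which is what lets the scheme extend to more than two hosts.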