unofficial mirror of notmuch@notmuchmail.org
* More ideas about logging.
@ 2011-12-16  2:09 David Bremner
  2011-12-16  4:07 ` Austin Clements
  2011-12-16  7:16 ` Michael Hudson-Doyle
  0 siblings, 2 replies; 10+ messages in thread
From: David Bremner @ 2011-12-16  2:09 UTC (permalink / raw)
  To: Notmuch Mail; +Cc: Olly Betts


Various discussions (mostly on IRC) about my jlog proposal, and about
Thomas's mtime
(id:"1323796305-28789-1-git-send-email-schnouki@schnouki.net") proposal,
got me thinking.  So let me know what you think about the following.

The goal here is to log tag adds and deletes (including those implicit
in message deletion) to facilitate tag synchronization.

If we use Xapian to store transaction numbers (much as the
last_thread_id is stored now), then we don't need an external logging
library. We can rely on Xapian to keep other clients from writing at
the same time.

Assume we have routines read_metadata and write_metadata that read and
write to the xapian database metadata (in real life, I think we might
need to decide in advance exactly what will be written there).

When we create a database:

write_metadata('log_write',0)
write_metadata('log_read',0) // more about this later
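
A minimal sketch of the two helpers, assuming an in-memory table stands
in for the Xapian metadata. Xapian stores metadata values as strings (in
the C++ API, WritableDatabase::set_metadata and Database::get_metadata),
so the counters are kept as decimal text here. The meta_store type,
create_database, and the helper signatures are hypothetical, not notmuch
code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the Xapian metadata table. */
#define MAX_KEYS 8
#define MAX_LEN  32

struct meta_store {
    char keys[MAX_KEYS][MAX_LEN];
    char vals[MAX_KEYS][MAX_LEN];
    int  count;
};

/* Store a counter as decimal text, replacing any existing value. */
static void write_metadata(struct meta_store *m, const char *key, long value)
{
    int i;
    for (i = 0; i < m->count; i++)
        if (strcmp(m->keys[i], key) == 0)
            break;
    if (i == m->count) {
        assert(m->count < MAX_KEYS);
        snprintf(m->keys[i], MAX_LEN, "%s", key);
        m->count++;
    }
    snprintf(m->vals[i], MAX_LEN, "%ld", value);
}

/* Read a counter back; a key that was never written reads as 0. */
static long read_metadata(const struct meta_store *m, const char *key)
{
    for (int i = 0; i < m->count; i++)
        if (strcmp(m->keys[i], key) == 0)
            return strtol(m->vals[i], NULL, 10);
    return 0;
}

/* What database creation would do under this scheme. */
static void create_database(struct meta_store *m)
{
    m->count = 0;
    write_metadata(m, "log_write", 0);
    write_metadata(m, "log_read", 0);
}
```

Deciding the key names in advance, as suggested above, keeps the table
small enough that a fixed-size store like this one is sufficient.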

To carry out database operation X with logging, we do the following

begin_atomic

    txn=read_metadata('log_write')

    X

    // begin dangerzone
    fprintf(logfile,"%d %s",txn+1,stuff) // or whatever.

    write_metadata('log_write', txn+1)

end_atomic
//end dangerzone
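
The sequence above could be sketched in C roughly as follows. This is
only an illustration under stated assumptions: struct db is a
hypothetical stand-in for the database (the write counter, called
log_write here, would really live in the Xapian metadata), and
begin_atomic, end_atomic, cancel_atomic, logged_op, and count_op are
made-up names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in: the real counter would live in Xapian metadata. */
struct db {
    long log_write;   /* number of the last transaction logged */
    int  in_txn;      /* set between begin_atomic and end_atomic */
};

static void begin_atomic(struct db *d)  { assert(!d->in_txn); d->in_txn = 1; }
static void end_atomic(struct db *d)    { assert(d->in_txn);  d->in_txn = 0; }
/* A real database would also roll back operation X here. */
static void cancel_atomic(struct db *d) { assert(d->in_txn);  d->in_txn = 0; }

/* Run one operation and log it under a fresh transaction number.
 * Returns the number used, or -1 if the log write failed. */
static long logged_op(struct db *d, FILE *logfile,
                      void (*op)(void *), void *arg, const char *stuff)
{
    begin_atomic(d);
    long txn = d->log_write;            /* read the write counter */

    op(arg);                            /* the database operation X */

    /* begin dangerzone: log and database can disagree until end_atomic */
    if (fprintf(logfile, "%ld %s\n", txn + 1, stuff) < 0 ||
        fflush(logfile) != 0) {
        cancel_atomic(d);
        return -1;
    }
    d->log_write = txn + 1;             /* store the new write counter */
    end_atomic(d);
    /* end dangerzone */
    return txn + 1;
}

/* Example operation: counts how many times it has been applied. */
static void count_op(void *arg)
{
    (*(int *)arg)++;
}
```

A caller would pass its tagging operation as op, for example
logged_op(&d, logfile, count_op, &applied, "+inbox msg1").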

If I understand correctly, then the only way the database and the log
can get out of sync is if this is interrupted in the "dangerzone"
between the start of the log write and the end of the xapian atomic
transaction. But then since we can consider the database authoritative
(since our goal is synchronization rather than recovery), we can discard
those portions of the log. We have to be a bit careful to discard
incomplete log items at the end of the log (maybe a checksum?).

So how do we discard? Two places. At the opening of the database for
writing, we truncate the log file (if we are very lazy, we can use seek
offsets as transaction indices to facilitate this).
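
One way the truncation point could be computed at open time is sketched
below. It assumes a hypothetical line format of "<txn> <payload>\n" and
uses the trailing newline as the completeness marker in place of a
checksum, which would be the stronger check. The function name
valid_log_length is made up for this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return how many leading bytes of the log are worth keeping: complete
 * lines of the form "<txn> <payload>\n" with txn <= last_written.
 * Anything after that prefix was never committed to the database. */
static size_t valid_log_length(const char *log, size_t len, long last_written)
{
    size_t keep = 0;
    assert(log != NULL || len == 0);
    while (keep < len) {
        const char *line = log + keep;
        const char *nl = memchr(line, '\n', len - keep);
        if (nl == NULL)
            break;                      /* incomplete final item */
        if (!isdigit((unsigned char)*line))
            break;                      /* not a well-formed entry */

        char *end;
        long txn = strtol(line, &end, 10);
        if (*end != ' ')
            break;                      /* number not followed by payload */
        if (txn > last_written)
            break;                      /* logged, but never committed */

        keep += (size_t)(nl - line) + 1;
    }
    return keep;
}
```

The caller would then cut the file to the returned length, for example
with POSIX ftruncate, before appending new entries.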

In order to guarantee that each log item is output exactly once, it seems
like we need another counter (or maybe I'm overthinking this):

     read_ptr = read_metadata('log_read')

     write_ptr = read_metadata('log_write')

     while (read_ptr < write_ptr) {
         begin_atomic
            s = read(read_ptr)
            do_stuff(s)
            read_ptr++
            write_metadata('log_read', read_ptr);
         end_atomic
     }

     write_metadata('log_write',0) // The log file will be truncated
                                   // on db open
     write_metadata('log_read',0)

I think we can double check if write_ptr <= read_ptr on next db open,
and truncate then if needed.
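
The reader side could look something like the sketch below, again with
hypothetical stand-ins: struct counters replaces the two metadata
values, the log is an array of strings, and drain_log and count_entry
are made-up names. The max_items parameter is not part of the scheme; it
exists only so a caller can simulate an interrupted run and check that
the next run resumes without repeating entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two counters kept in Xapian metadata. */
struct counters {
    long log_read;    /* entries already handed to do_stuff */
    long log_write;   /* entries committed to the log */
};

/* Hand each unread entry to do_stuff exactly once, advancing log_read
 * after every entry so an interrupted run resumes where it stopped.
 * A negative max_items means no limit. Returns the entries processed. */
static long drain_log(struct counters *c, const char *const *entries,
                      void (*do_stuff)(const char *, void *), void *ctx,
                      long max_items)
{
    long done = 0;
    while (c->log_read < c->log_write &&
           (max_items < 0 || done < max_items)) {
        /* begin_atomic */
        do_stuff(entries[c->log_read], ctx);
        c->log_read++;                  /* persist the new read counter */
        /* end_atomic */
        done++;
    }
    if (c->log_read >= c->log_write) {
        /* Fully drained: reset so the log is truncated on next open. */
        c->log_write = 0;
        c->log_read = 0;
    }
    return done;
}

/* Example consumer: counts the entries it sees. */
static void count_entry(const char *entry, void *ctx)
{
    assert(entry != NULL);
    (*(int *)ctx)++;
}
```

The exactly-once property here depends on do_stuff and the counter
update succeeding or failing together, which the sketch takes for
granted.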

I think we need to assume that do_stuff is atomic here; I'm not sure how
reasonable or unreasonable that is in practice.

I also don't know about the performance implications of reading and
writing the Xapian metadata like a maniac. Of course if this whole
scheme is fatally flawed, no need to worry about performance.

I don't think the actual amount of code involved would be too bad. Of
course, I thought this was going to be a short message too.

d



Thread overview: 10+ messages
2011-12-16  2:09 More ideas about logging David Bremner
2011-12-16  4:07 ` Austin Clements
2011-12-16 11:56   ` David Bremner
2011-12-18 18:34   ` David Bremner
2011-12-18 20:22     ` Tom Prince
2011-12-20 20:25       ` David Bremner
2012-10-12 16:28   ` Ethan Glasser-Camp
2011-12-16  7:16 ` Michael Hudson-Doyle
2011-12-16 12:02   ` David Bremner
2011-12-18 21:53     ` Michael Hudson-Doyle
