unofficial mirror of notmuch@notmuchmail.org
From: David Mazieres <dm-list-email-notmuch@scs.stanford.edu>
To: "Amadeusz Żołnowski" <aidecoe@aidecoe.name>, notmuch@notmuchmail.org
Subject: Re: muchsync files renames
Date: Sat, 22 Aug 2015 22:41:59 -0700
Message-ID: <876146o920.fsf@ta.scs.stanford.edu>
In-Reply-To: <878u93ujdo.fsf@freja.aidecoe.name>

Amadeusz Żołnowski <aidecoe@aidecoe.name> writes:

> Hi,
>
> I am testing muchsync-2 and it looks to me that files names across
> machines are different.  Moreover when syncing again after
> initialization it seems muchsync is working on something.  I have
> canceled this and rerun muchsync.  notmuch reported lots of files
> renames on server.  What and why it happens?

What muchsync specifically synchronizes for messages is the mapping:

    (directory, SHA-1-hash, link-count)

So if a directory contains two copies of a file on one machine, it will
end up with two copies on the other machine.  However, the file names
themselves are not the same, but rather are created in accordance with
the maildir spec.  (Note SHA-1 wouldn't be my first choice of hash
function, but notmuch already uses this for messages with long message
IDs, so I figured I'd just be consistent with existing practice.)
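
To make the mapping concrete, here is a rough Python sketch that computes
it for a maildir tree.  This is not muchsync's actual code: the function
name maildir_mapping is made up for illustration, and hashing the raw
file contents is an assumption about the details.

```python
import hashlib
import os
from collections import Counter


def maildir_mapping(root):
    """Return a Counter mapping (directory, sha1-hex) -> number of copies."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as f:
                digest = hashlib.sha1(f.read()).hexdigest()
            # The file name is deliberately ignored: only the directory,
            # the content hash, and the number of copies matter.
            counts[(rel_dir, digest)] += 1
    return counts
```

Under this model, two replicas are in sync when their mappings compare
equal, even if every file name differs between them.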

In terms of what muchsync is working on, you can run it with "-vvvv" on
both sides to get an idea, as in "muchsync -vvvv server -vvvv".  Better
yet, you can just run it on one side with "muchsync -vvvv".  You'll get
a lot of output, so maybe run it inside the script command to save the
output.  If
you have enabled maildir.synchronize_flags, it could be that notmuch is
initially renaming all of your files, in which case muchsync needs to
re-hash them to make sure they haven't changed.

How did you cancel muchsync?  If you send it a single SIGINT or SIGTERM,
it attempts to clean up after itself.  However, upon multiple signals or
other signals, it immediately exits.  Muchsync is conservative about
updating the database, to avoid missing tags or files that have been
changed.  It always updates the notmuch database first, then its own
sqlite database with a version number.  That means if you kill muchsync,
some number of files may get picked up as changed again even though
really they were just copied from a peer.
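
As a toy illustration of that ordering, the sketch below uses plain
Python dicts in place of the notmuch database and muchsync's sqlite
state.  All names here (apply_peer_updates, locally_changed, kill_after)
are hypothetical, and the real bookkeeping uses version numbers rather
than a copy of the tags; the point is only to show why a kill between
the two steps produces phantom changes.

```python
def apply_peer_updates(notmuch_db, sync_state, updates, kill_after=None):
    """Apply updates received from a peer, in the conservative order.

    notmuch_db maps message -> tags; sync_state holds what the sync
    tool last recorded.  Both are plain dicts standing in for the real
    databases.  kill_after simulates being killed after that many
    updates reached notmuch_db but before sync_state was recorded.
    """
    for applied, (msg, tags) in enumerate(updates.items(), start=1):
        notmuch_db[msg] = tags          # step 1: notmuch database first
        if kill_after is not None and applied >= kill_after:
            return False                # killed: step 2 never happens
    sync_state.update(updates)          # step 2: record what was synced
    return True


def locally_changed(notmuch_db, sync_state):
    """Messages whose current tags differ from what was last recorded."""
    return sorted(m for m, t in notmuch_db.items() if sync_state.get(m) != t)
```

If the run is killed after step 1, locally_changed reports the message
on the next run even though it only arrived from the peer.  That is
redundant work, but no change is lost.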

To mitigate this problem, the muchsync client syncs the database every
10 seconds, so that in theory you should only get 10 seconds of extra
work from killing the client.  However, the server does not sync
periodically, on the assumption that it is more likely to read an EOF
than get killed, although currently it doesn't appear to commit any
pending transactions to the sqlite database upon EOF, which may be an
oversight.
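
The periodic commit, plus a final commit on EOF, could look roughly like
the following Python sketch.  It is not taken from muchsync: the class
PeriodicCommitter, its flush method, and the injectable clock are all
assumptions made for illustration, with only the 10-second interval
coming from the description above.

```python
import sqlite3
import time


class PeriodicCommitter:
    """Commit a sqlite connection at most once per `interval` seconds."""

    def __init__(self, conn, interval=10.0, clock=time.monotonic):
        self.conn = conn
        self.interval = interval
        self.clock = clock
        self.last_commit = clock()
        self.commits = 0

    def maybe_commit(self):
        """Commit if the interval has elapsed; return True if it did."""
        now = self.clock()
        if now - self.last_commit >= self.interval:
            self.conn.commit()
            self.last_commit = now
            self.commits += 1
            return True
        return False

    def flush(self):
        """Commit whatever is pending, e.g. on EOF from the peer."""
        self.conn.commit()
        self.last_commit = self.clock()
        self.commits += 1
```

A client loop would call maybe_commit() after each batch of work; a
server that sees EOF would call flush() once before exiting, so that
nothing is left in an open transaction.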

So to summarize:

  * File names are not the same across machines, only file contents and
    directory structure.

  * Give muchsync lots of "-v" options to see what it is doing.

  * Try to avoid killing muchsync.  Doing so is safe, but likely to
    generate extra work in the form of phantom renames or tag changes
    that get synchronized even though they don't need to be.

  * Possibly the server should handle EOF more gracefully and commit any
    pending transactions, or the client should periodically send a
    commit command to the server.

If you think something is wrong, I can help you figure it out, but I
need to know what maildir.synchronize_flags is set to on each replica,
what you mean by "canceled", and roughly what was happening when you
canceled (uploading or downloading).

David
