From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 7045D431FAF for ; Sun, 2 Dec 2012 06:42:23 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CIJgjHjkayEV for ; Sun, 2 Dec 2012 06:42:19 -0800 (PST) Received: from mail-lb0-f181.google.com (mail-lb0-f181.google.com [209.85.217.181]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by olra.theworths.org (Postfix) with ESMTPS id B2ADA431FAE for ; Sun, 2 Dec 2012 06:42:18 -0800 (PST) Received: by mail-lb0-f181.google.com with SMTP id ge1so1790567lbb.26 for ; Sun, 02 Dec 2012 06:42:17 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:cc:subject:in-reply-to:references:user-agent:date :message-id:mime-version:content-type:x-gm-message-state; bh=XxZDRzfk1zO5LuVqfSjhThU6FJaR8zl08ZgYXq3yTVk=; b=gdbKe4ojBA4kpYhNG84JmRyfoqhZgIjwPbxpGxQ4ftrS0GwXL8s5wsoyqk3h/z6B/F md5GDJVQ+TdeY2z/jU0PB/4oUuNwM0FI6rb5MG2P9zMYE663+bqCmclImGYJweAe3m4G YtaAZjLvo1wFIRakXaldFIFfNX6LgJsdiUgXrnmxbjXgtuE1RI3WjPvCg+mbjquPhBAv SFi5+F6PsdOoeE7+9J8cSgk8JO40WieF75MJZyNUOnqS+NueFojnNjPk1yjDglbPThaw PW5gh9m6uGLCKqYo/V3gdcFbAKVPIW/xIqA+rT8NUYyconiFOePnBYCfV9b0rGlH3U6g NADg== Received: by 10.112.49.167 with SMTP id v7mr3165890lbn.122.1354459335873; Sun, 02 Dec 2012 06:42:15 -0800 (PST) Received: from localhost (dsl-hkibrasgw4-fe51df00-27.dhcp.inet.fi. [80.223.81.27]) by mx.google.com with ESMTPS id v6sm4197127lbf.11.2012.12.02.06.42.13 (version=SSLv3 cipher=OTHER); Sun, 02 Dec 2012 06:42:14 -0800 (PST) From: Jani Nikula To: david@tethera.net, notmuch@notmuchmail.org Subject: Re: [Patch v2 17/17] tag-util: optimization of tag application In-Reply-To: <1353792017-31459-18-git-send-email-david@tethera.net> References: <1353792017-31459-1-git-send-email-david@tethera.net> <1353792017-31459-18-git-send-email-david@tethera.net> User-Agent: Notmuch/0.14+124~g3b17402 (http://notmuchmail.org) Emacs/23.4.1 (i686-pc-linux-gnu) Date: Sun, 02 Dec 2012 16:42:12 +0200 Message-ID: <87txs4cy7v.fsf@nikula.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Gm-Message-State: ALoCoQkWGwgv9AMGeWAoWMdfxZHI42CEc1twFZ3so6RaVfwuYvj/X2nIFeLg94O1+7DB9VI+X16k Cc: David Bremner X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 02 Dec 2012 14:42:23 -0000 On Sat, 24 Nov 2012, david@tethera.net wrote: > From: David Bremner > > The idea is not to bother with restore operations if they don't change > the set of tags. This is actually a relatively common case. > > In order to avoid fancy datastructures, this method is quadratic in > the number of tags; at least on my mail database this doesn't seem to > be a big problem. It's bound to be slower than the original optimization using strcmp, but that was based on a number of assumptions we can no longer make. In any case, not doing message sync is the key optimization. One side effect of any optimization (also the existing one) is not doing maildir sync from tags to flags when there are no tag changes. This just emphasizes that one should never do dump or restore with the mail store and database out of sync. I don't think we emphasize the importance of running notmuch new before dump and restore enough. A few more comments below. BR, Jani. > --- > notmuch-tag.c | 2 +- > tag-util.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 60 insertions(+), 1 deletion(-) > > diff --git a/notmuch-tag.c b/notmuch-tag.c > index 8a8af0b..e4fca67 100644 > --- a/notmuch-tag.c > +++ b/notmuch-tag.c > @@ -140,7 +140,7 @@ tag_query (void *ctx, notmuch_database_t *notmuch, const char *query_string, > notmuch_messages_valid (messages) && ! interrupted; > notmuch_messages_move_to_next (messages)) { > message = notmuch_messages_get (messages); > - tag_op_list_apply (message, tag_ops, flags); > + tag_op_list_apply (message, tag_ops, flags | TAG_FLAG_PRE_OPTIMIZED); > notmuch_message_destroy (message); > } > > diff --git a/tag-util.c b/tag-util.c > index 287cc67..2bb8355 100644 > --- a/tag-util.c > +++ b/tag-util.c > @@ -111,6 +111,62 @@ message_error (notmuch_message_t *message, > fprintf (stderr, "Status: %s\n", notmuch_status_to_string (status)); > } > > +static int > +makes_changes (notmuch_message_t *message, > + tag_op_list_t *list, > + tag_op_flag_t flags) > +{ > + > + int i; > + > + notmuch_tags_t *tags; > + notmuch_bool_t changes = FALSE; > + > + /* First, do we delete an existing tag? */ > + changes = FALSE; > + for (tags = notmuch_message_get_tags (message); > + ! changes && notmuch_tags_valid (tags); > + notmuch_tags_move_to_next (tags)) { > + const char *cur_tag = notmuch_tags_get (tags); > + int last_op = (flags & TAG_FLAG_REMOVE_ALL) ? -1 : 0; > + > + for (i = 0; i < list->count; i++) { > + if (strcmp (cur_tag, list->ops[i].tag) == 0) { > + last_op = list->ops[i].remove ? -1 : 1; > + } > + } If you looped from end to beginning, you could break on first match. > + > + changes = (last_op == -1); > + } > + notmuch_tags_destroy (tags); > + > + if (changes) > + return TRUE; > + > + /* Now check for adding new tags */ > + for (i = 0; i < list->count; i++) { > + notmuch_bool_t exists = FALSE; > + I think you can do: if (list->ops[i].remove) continue; here and remove the check on .remove below. > + for (tags = notmuch_message_get_tags (message); > + notmuch_tags_valid (tags); > + notmuch_tags_move_to_next (tags)) { > + const char *cur_tag = notmuch_tags_get (tags); > + if (strcmp (cur_tag, list->ops[i].tag) == 0) { > + exists = TRUE; > + break; > + } > + } > + notmuch_tags_destroy (tags); > + > + /* the following test is conservative, it's ok to think we > + * make changes when we don't */ I think it's "too" conservative only when the input has e.g. "+foo -foo", but I wouldn't worry about optimizing that. > + if ( ! exists && ! list->ops[i].remove ) > + return TRUE; > + } > + return FALSE; > + > +} > + > notmuch_status_t > tag_op_list_apply (notmuch_message_t *message, > tag_op_list_t *list, > @@ -121,6 +177,9 @@ tag_op_list_apply (notmuch_message_t *message, > notmuch_status_t status = 0; > tag_operation_t *tag_ops = list->ops; > > + if (! (flags & TAG_FLAG_PRE_OPTIMIZED) && ! makes_changes (message, list, flags)) > + return NOTMUCH_STATUS_SUCCESS; > + > status = notmuch_message_freeze (message); > if (status) { > message_error (message, status, "freezing message"); > -- > 1.7.10.4 > > _______________________________________________ > notmuch mailing list > notmuch@notmuchmail.org > http://notmuchmail.org/mailman/listinfo/notmuch