From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1]) by arlo.cworth.org (Postfix) with ESMTP id C8E636DE1075 for ; Wed, 3 Apr 2019 13:05:47 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at cworth.org
X-Spam-Flag: NO
X-Spam-Score: 0.451
X-Spam-Level:
X-Spam-Status: No, score=0.451 tagged_above=-999 required=5 tests=[AWL=-0.201, SPF_NEUTRAL=0.652] autolearn=disabled
Received: from arlo.cworth.org ([127.0.0.1]) by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rodl-EtKfCjV for ; Wed, 3 Apr 2019 13:05:46 -0700 (PDT)
Received: from guru.guru-group.fi (guru.guru-group.fi [46.183.73.34]) by arlo.cworth.org (Postfix) with ESMTP id 8A9DE6DE0ED6 for ; Wed, 3 Apr 2019 13:05:46 -0700 (PDT)
Received: from guru.guru-group.fi (localhost [IPv6:::1]) by guru.guru-group.fi (Postfix) with ESMTP id 2A24910019F; Wed, 3 Apr 2019 23:05:42 +0300 (EEST)
From: Tomi Ollila
To: David Bremner, Michael J Gruber, notmuch@notmuchmail.org
Subject: Re: [PATCH] performance-tests: tests for renamed/copied files in notmuch new
In-Reply-To: <20190402124011.16642-1-david@tethera.net>
References: <587fa8b9dbaa8b8583e83eaa3825e74a24b5ba20.1537284357.git.git@grubix.eu> <20190402124011.16642-1-david@tethera.net>
User-Agent: Notmuch/0.28.3+42~g7b16377 (https://notmuchmail.org) Emacs/25.2.1 (x86_64-unknown-linux-gnu)
X-Face: HhBM'cA~
MIME-Version: 1.0
Content-Type: text/plain
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
List-Subscribe:
X-List-Received-Date: Wed, 03 Apr 2019 20:05:47 -0000

On Tue, Apr 02 2019, David Bremner wrote:

> Several people have observed that this is surprisingly slow, and we
> have a proposal to add tagging into this code path, so we want to make
> sure it doesn't imply too much of a performance hit.
> ---
>  performance-test/T00-new.sh | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> I added these tests to help evaluate Michael's proposed patch. I'll send the results in a separate email.
>
> diff --git a/performance-test/T00-new.sh b/performance-test/T00-new.sh
> index 68750129..cec28d58 100755
> --- a/performance-test/T00-new.sh
> +++ b/performance-test/T00-new.sh
> @@ -12,4 +12,34 @@ for i in $(seq 2 6); do
>      time_run "notmuch new #$i" 'notmuch new'
>  done
>
> +manifest=$(mktemp manifestXXXXXX)
> +
> +count=0
> +total=0
> +while read -r name ; do
> +    if [ $((total % 4 )) -eq 0 ]; then
> +        echo $name >> $manifest
> +        count=$((count + 1))
> +    fi
> +    total=$((total + 1))
> +done < <(find mail -type f ! -path 'mail/.notmuch/*' )

// this comment was written last in this email, just for fun >;)
//
find mail -type f ! -path 'mail/.notmuch/*' | sed -n '1~4 p' > $manifest
count=`wc -l < $manifest`

(I'd be interested in which of the above is faster: my suggestion does
a few more forks and execve's, but the read loop above does some 200 000
read(2)s and [lf]seek(2)s (and then 50 000 opens). Well, probably no one
would notice the difference...)

> +
> +while read -r name ; do
> +    mv $name ${name}.renamed
> +done < $manifest
--------'12' -- 2 spaces above (and below...)

Luckily the bash read builtin does not read its input a byte at a time
(IIRC it reads 128 bytes, scans for a newline and then seeks back; in
this case it can, since the file was redirected, so the fd is seekable).

50 000 mv(1) executions definitely take time.

perl -nle 'rename $_, "$_.renamed"' $manifest

would be significantly faster.

> +
> +time_run "new ($count mv)" 'notmuch new'
> +
> +while read -r name ; do
> +    mv ${name}.renamed $name
> +done < $manifest
> +
> +time_run "new ($count mv back)" 'notmuch new'
> +
> +while read -r name ; do
> +    cp ${name} $name.copy
> +done < $manifest

perl -nle 'link $_, "$_.copy"' $manifest ?
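
The two perl one-liners could be wrapped up roughly as below. This is
only a sketch: the function names rename_all and link_all and the SUFFIX
environment variable are made up for illustration and are not part of
the notmuch performance-test suite. It also assumes that a hard link is
an acceptable stand-in for cp in this test, which the patch does not
establish. Each function starts one perl process for the whole manifest
instead of one mv or cp per file.

```shell
# Hypothetical helpers, not part of the notmuch test suite.
# Both take a manifest (one path per line) and a suffix to append.

# Rename every path listed in the manifest to "<path><suffix>".
rename_all() {
    # SUFFIX is passed through the environment so the perl code can stay
    # inside single quotes; -n loops over input lines, -l strips newlines.
    SUFFIX=$2 perl -nle 'rename($_, $_ . $ENV{SUFFIX}) or warn "rename $_: $!\n"' "$1"
}

# Create "<path><suffix>" as a hard link to every path in the manifest.
# A hard link is a second name for the same file, not a separate copy.
link_all() {
    SUFFIX=$2 perl -nle 'link($_, $_ . $ENV{SUFFIX}) or warn "link $_: $!\n"' "$1"
}
```

With the manifest from the patch these would be called as
rename_all $manifest .renamed and link_all $manifest .copy in place of
the first and last while loops.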
> +
> +time_run "new ($count cp)" 'notmuch new'
> +
>  time_done
> --
> 2.20.1
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> https://notmuchmail.org/mailman/listinfo/notmuch
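
To make the pipeline alternative to the first read loop concrete, here
is a rough sketch of the sampling step as one function. The name
sample_manifest is made up for illustration. It uses awk's NR % 4 == 1
in place of sed -n '1~4 p', on the assumption that portability matters,
since the first~step address is a GNU sed extension; both keep lines
1, 5, 9, ... of the find output, which matches what the total % 4 test
in the loop selects. The count is taken with wc -l reading from a
redirect, so only the number is printed.

```shell
# Hypothetical helper, not part of the notmuch test suite.
# Usage: sample_manifest MAILDIR MANIFEST
# Writes every fourth file found under MAILDIR (skipping its .notmuch
# directory) to MANIFEST and prints the number of paths written.
sample_manifest() {
    find "$1" -type f ! -path "$1/.notmuch/*" | awk 'NR % 4 == 1' > "$2"
    # Reading from a redirect makes wc print the line count only,
    # without the file name.
    wc -l < "$2"
}

# Example use, mirroring the variable names in the patch:
#   manifest=$(mktemp manifestXXXXXX)
#   count=$(sample_manifest mail "$manifest")
```

This runs find, awk and wc once each, so the cost no longer grows with
one shell-level read and one append per file name.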