From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Bremner
To: Matthew Schauer, notmuch@notmuchmail.org
Subject: Re: Xapian commits unexpectedly slow
In-Reply-To: <4b3b642b-8f5b-4e8c-9f29-76d393d45fd6@e10x.net>
References: <4b3b642b-8f5b-4e8c-9f29-76d393d45fd6@e10x.net>
Date: Sun, 29 Dec 2019 09:21:05 +0900
Message-ID: <87tv5kvv66.fsf@tethera.net>
List-Id: "Use and development of the notmuch mail system."

Matthew Schauer writes:

> Greetings,
>
> I've been trying to migrate about 25K e-mails to Notmuch, and I'm seeing
> some frustrating performance characteristics that don't seem to match
> with the experience others report.

25,000 messages should really not be a strain, spinning rust or no.

> I'm dumping messages from
> Thunderbird in batches and then running `notmuch new` to add each batch
> to the database. The indexing performance remains okay, at more than
> 200 per second, but after Notmuch has reported it's finished indexing,
> it hangs for as much as several minutes before exiting. A stack trace
> confirms that this is Xapian committing the database, with most of the
> time seemingly spent in `fdatasync`. The time spent grows with the size
> of the database, not the number of e-mails being imported, which means
> this will remain a problem during day-to-day usage.

It would be interesting if you could report the results of running the
notmuch performance test suite (under performance-test/ in the source).

The other thing I'm curious about is the actual size of the database.
This varies a lot, but in the past pathological performance has
sometimes been linked to indexing things that should not be, bloating
the database.

d
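
For reference, one way to answer the database-size question is to total
the files under the Xapian directory. The following is only a minimal
sketch, not part of notmuch: the helper names are hypothetical, and the
default path (~/mail/.notmuch/xapian) is an assumption that depends on
where your mail store lives, so pass the real directory as an argument.

```python
import os
import sys


def dir_size_bytes(root):
    """Sum the sizes of regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human(n):
    """Format a byte count using binary units, capped at GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n < 1024 or unit == "GiB":
            return f"{n:.1f} {unit}"
        n /= 1024


if __name__ == "__main__":
    # Assumed default location; adjust to your own mail store.
    default = os.path.expanduser("~/mail/.notmuch/xapian")
    root = sys.argv[1] if len(sys.argv) > 1 else default
    print(f"{root}: {human(dir_size_bytes(root))}")
```

Dividing the total by the number of indexed messages gives a rough
per-message figure, which makes it easier to compare against other
people's reports.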