From: Austin Clements
To: Petter Reinholdtsen
Date: Tue, 22 Nov 2011 21:48:18 -0500
Subject: Re: 'notmuch new' leaking memory and getting slower over time?
Message-ID: <20111123024818.GI9351@mit.edu>
In-Reply-To: <2flfwhht87d.fsf@diskless.uio.no>
Cc: notmuch@notmuchmail.org

Quoth Petter Reinholdtsen on Nov 21 at 11:35 pm:
> The indexing took 36 hours. At the start it claimed it would take 10
> hours, and it continued to underestimate the amount of time left until
> the very end. It claimed to have 1 hour left when I checked before I
> went to bed, and claimed to have 15 minutes left when I woke up 6-7
> hours later.

notmuch new does a simple linear extrapolation based on how many files
it's examined and how many there are total.
This is doomed to undershoot at least because indexing becomes slower
as the database grows (B-tree insertion is O(log N), fragmentation
will increase over time, posting lists will get longer...). I'm not
sure much can be done about the estimate at the beginning, short of
throwing in some fudge factor, but the estimates later in the process
would be much more accurate if it used a sliding window, rather than
measuring from the beginning.

> Shortly before the indexing finished, the notmuch process was using 1.2
> GiB of resident memory according to top. Is the process leaking memory?

It's possible this is just memory fragmentation, but it definitely
sounds like a leak. talloc has some tools for tracking down leaks and
it would be good to heap profile notmuch new, but to my knowledge
nobody's applied these tools.
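
To make the sliding-window idea for the time estimate concrete, here is
a rough sketch in Python. It is not notmuch's code (notmuch is written
in C); the names SlidingWindowEta and linear_eta, the window size, and
the sampling scheme are all made up for illustration. The only point is
that the rate is measured over the most recent samples instead of from
the start of the run.

```python
from collections import deque


class SlidingWindowEta:
    """Hypothetical estimator: rate is taken over the last `window` samples."""

    def __init__(self, total_files, window=100):
        self.total_files = total_files
        # Each sample is (timestamp_seconds, files_processed_so_far);
        # the deque silently drops the oldest sample once it is full.
        self.samples = deque(maxlen=window)

    def update(self, now, processed):
        self.samples.append((now, processed))

    def remaining_seconds(self):
        # Need at least two samples to measure a rate.
        if len(self.samples) < 2:
            return None
        (t0, n0), (t1, n1) = self.samples[0], self.samples[-1]
        if n1 <= n0 or t1 <= t0:
            return None
        rate = (n1 - n0) / (t1 - t0)  # files per second, recent only
        return (self.total_files - n1) / rate


def linear_eta(start_time, now, processed, total_files):
    """Baseline: extrapolate from the very beginning of the run."""
    if processed == 0 or now <= start_time:
        return None
    rate = processed / (now - start_time)  # files per second, whole run
    return (total_files - processed) / rate
```

If indexing slows down as the database grows, the whole-run rate stays
inflated by the fast early phase, while the windowed rate tracks the
current speed. The window size trades responsiveness against jitter,
and neither approach can fix the estimate at the very beginning.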