From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tom Bulli
To: Felipe Contreras
Cc: "notmuch@notmuchmail.org"
Subject: Re: Notmuch indexing 21 million emails
Date: Wed, 23 Nov 2011 09:20:33 -0800 (PST)
Message-ID: <1322068833.15983.YahooMailNeo@web36504.mail.mud.yahoo.com>
References: <1321930927.73603.YahooMailNeo@web36506.mail.mud.yahoo.com>
List-Id: "Use and development of the notmuch mail system."

I have been able to speed that up with the code below. Basically, it raises "XAPIAN_FLUSH_THRESHOLD" to the total virtual memory (RAM plus swap, as reported by "free -t") divided by twice the average size of an email (the factor of 2 is just to be safe). It seems to be faster since it does fewer Xapian updates. However, I have a nagging feeling that "XAPIAN_FLUSH_THRESHOLD" could be even higher, since I don't see any increase in used memory (via "top -d 1"). The server in question has eight CPU cores and 8GB RAM, running Debian squeeze on a 32bit architecture (I know - but it is what it is :) ).


# Assume an average size of 120KB per email
#  and use at most half the virtual memory
#  (total KB / (120 * 2) = total KB / 240)
XFT=$(($(free -otk | awk '/^Total/ {print $2}') / 240))
# Keep more index info in memory before flushing to disk,
#  but never drop below 10000
[ "$XFT" -lt 10000 ] && XFT=10000
su - archive -c "export XAPIAN_FLUSH_THRESHOLD=$XFT; notmuch new --verbose"


----- Original Message -----
> From: Felipe Contreras
> To: Tom Bulli
> Cc: "notmuch@notmuchmail.org"
> Sent: Wednesday, November 23, 2011 10:40 AM
> Subject: Re: Notmuch indexing 21 million emails
>
> On Tue, Nov 22, 2011 at 5:02 AM, Tom Bulli wrote:
>> I have a project where I need to search about 21 million emails - and
>> decided to use "notmuch" for it. The system is Debian Squeeze, the
>> notmuch version is "0.8-1~bpo60+1" from "kyria's" private repository.
>>
>> I have been running "notmuch new" for approx. 4 days now - and
>> according to "notmuch count" it has indexed about 4.5 million emails.
>>
>> Is this expected performance? Is there any way to speed that up?
>
> It would be nice to run something like this with OProfile (or perf)
> and see if there are some obvious fixes.
>
> --
> Felipe Contreras
>
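
The threshold arithmetic in the script above can also be written as a small standalone sketch, which makes it easier to try other average mail sizes. The function name flush_threshold is hypothetical (it is not part of notmuch or Xapian), the 120 KB average is the same assumption as in the script, and the memory total is read from /proc/meminfo rather than from "free", so this only applies to Linux.

```shell
#!/bin/sh
# Hypothetical helper, not part of notmuch or Xapian.
# $1 = total virtual memory in KB, $2 = assumed average email size in KB.
# Prints the number of documents to buffer before a flush.
flush_threshold() {
    # Doubling the divisor uses at most half of the memory.
    xft=$(( $1 / ($2 * 2) ))
    # Same floor as in the script above.
    [ "$xft" -lt 10000 ] && xft=10000
    echo "$xft"
}

# RAM plus swap in KB, summed from /proc/meminfo (Linux only).
total_kb=$(awk '/^(MemTotal|SwapTotal):/ {sum += $2} END {print sum + 0}' /proc/meminfo)

# 120 KB per email is an assumption; adjust it to the actual mail store.
echo "XAPIAN_FLUSH_THRESHOLD=$(flush_threshold "$total_kb" 120)"
```

For exactly 8 GB (8388608 KB) and no swap, flush_threshold 8388608 120 gives 34952; anything that would come out below 10000 is raised to 10000.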