From: Lennart Borgman
Newsgroups: gmane.emacs.devel
Subject: Re: Indexed search with grep-like output
Date: Mon, 3 Jan 2011 04:38:41 +0100
References: <831v4wpcue.fsf@gnu.org> <83wrmone2h.fsf@gnu.org>
To: Eli Zaretskii
Cc: emacs-devel@gnu.org

On Sun, Jan 2, 2011 at 4:51 PM, Eli Zaretskii wrote:
>> From: Lennart Borgman
>> Date: Sun, 2 Jan 2011 15:15:39 +0100
>> Cc: emacs-devel@gnu.org
>>
>> On Sun, Jan 2, 2011 at 2:53 PM, Eli Zaretskii wrote:
>> >> From: Lennart Borgman
>> >> Date: Sun, 2 Jan 2011 14:46:45 +0100
>> >> Cc: emacs-devel@gnu.org
>> >>
>> >> > Yes, I use one of the tools that builds on Lucene (see
>> >> > http://www.methods.co.nz/docindexer/) to index MS Office documents
>> >> > (some 17,000 of them) I have on my office machine.  It is also very
>> >> > fast: just a few seconds to return a query.
>> >>
>> >> Is that from within Emacs or?
>> >
>> > It doesn't matter.
>> > Most of the time is to output a long list of
>> > documents, i.e. I/O.
>>
>> Does that mean you actually see the same time inside of Emacs as
>> outside of Emacs?
>
> Yes, the same (2.5 seconds for a query that returns 951 documents).

I think my estimate of the time required was a bit too high (though it
depends on whether it is a "cold" or a "warm" test).

I have now added support for Google Desktop Search too (the grepping is
not added there yet, though). It feels snappier, but the timing tells me
it should be about the same as Windows Desktop Search. Wonder if I did
something wrong... ;-)

The Google version is a bit more cumbersome to set up, but on the other
hand you do not need to add Ruby for it. And it is cross-platform.

Both the Google and the Windows versions seem fast enough, at least for
the data set I have. On my old PC it takes no more than 1-2 seconds in
most cases to do the search.

It would be nice to have support for other search engines too, but I
am not going to add that myself. However, if they are SQL-based, then
adding them should be a simple matter of restructuring the Ruby file a
little and generalizing the code a bit more.

Note that my part of the coding is not quite ready yet (some parts are
missing and there are some bugs), but the structure is there, so it
should be OK to add new search engines to what there is now.
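To give a rough idea of what "generalizing the code" for SQL-based engines could look like, here is a minimal sketch. It is not the actual Ruby file: every name in it (`Backend`, `BACKENDS`, `sql_quote`, `grep_like`, `search`) is hypothetical, and the "example" engine with its `documents` table and `path`/`body` columns is made up. The idea is only that each engine supplies its own query builder, while running the query and producing grep-like output stay shared.

```ruby
# Hypothetical sketch; none of these names come from the actual script.

# A backend only knows how to turn search words into its own SQL.
Backend = Struct.new(:name, :build_query)

# Escape single quotes for use inside an SQL string literal.
def sql_quote(word)
  word.gsub("'", "''")
end

BACKENDS = {
  # Made-up engine with a made-up table and columns, for illustration only.
  "example" => Backend.new("Example engine", lambda { |words|
    conds = words.map { |w| "body LIKE '%#{sql_quote(w)}%'" }
    "SELECT path FROM documents WHERE " + conds.join(" AND ")
  })
}

# Shared part: format matching lines the way grep -n does (file:line:text).
# Matching is case-insensitive; a line matches if it contains any word.
def grep_like(path, lines, words)
  hits = []
  lines.each_with_index do |line, i|
    next unless words.any? { |w| line.downcase.include?(w.downcase) }
    hits << "#{path}:#{i + 1}:#{line.chomp}"
  end
  hits
end

# Shared driver. run_query is whatever executes SQL for the chosen engine
# and returns a list of paths; read_lines returns the lines of one file.
def search(backend_key, words, run_query, read_lines)
  backend = BACKENDS.fetch(backend_key)
  paths = run_query.call(backend.build_query.call(words))
  paths.flat_map { |path| grep_like(path, read_lines.call(path), words) }
end
```

Under these assumptions, supporting another SQL-based engine would mean adding one more entry to `BACKENDS` and passing a `run_query` callable that talks to that engine; the grep-like formatting would not need to change.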