From: Kai Großjohann <Kai.Grossjohann@CS.Uni-Dortmund.DE>
Newsgroups: gmane.emacs.devel
Subject: Re: Apropos commands and regexps
Date: Fri, 17 May 2002 14:01:49 +0200
To: Miles Bader
Cc: storm@cua.dk (Kim F. Storm), rms@gnu.org, eliz@is.elta.co.il, emacs-devel@gnu.org
In-Reply-To: <87sn4rdabb.fsf@tc-1-100.kawasaki.gol.ne.jp> (Miles Bader's message of "17 May 2002 06:58:32 +0900")

Miles Bader writes:

> `or' is clearly wrong; even in emacs' `limited' universe, it generates
> way too many hits.
>
> E.g., (apropos "\\(find.*file\\|file.*find\\)") gets about 50 hits,
> whereas (apropos "\\(find\\|file\\)") gets over 700!
>
> Maybe your idea of `at least N matches' is a good compromise.

Information Retrieval research has shown that weighting and ranking is
what's needed.  Just list the "good" matches first.

With Boolean searches, people need to issue a lot of queries to select
an appropriate answer set.  If you have ranking, fewer queries will be
sufficient.

But it might be useful to somehow indicate to the user the nature of
each match so that the user can decide what they want.  For example:
this group of matches contains all words, the following group of
matches misses the word foo, ...

kai
-- 
Silence is foo!
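
The grouping idea in the message above could be sketched roughly as
follows.  This is only an illustration in Python, not Emacs code and not
how apropos works: it assumes plain substring matching instead of
regexps, and group_matches, report, and the small list of names are
made up for the example.

```python
from collections import defaultdict


def group_matches(words, names):
    """Group names by the query words they miss, best groups first."""
    groups = defaultdict(list)
    for name in names:
        # Assumption: a word "matches" if it occurs as a substring.
        missing = tuple(w for w in words if w not in name)
        if len(missing) < len(words):  # keep names matching at least one word
            groups[missing].append(name)
    # Fewer missing words ranks higher; ties are ordered by the missing words.
    return sorted(groups.items(), key=lambda item: (len(item[0]), item[0]))


def report(words, names):
    """Render the groups as text, labelling each with what it misses."""
    lines = []
    for missing, members in group_matches(words, names):
        if missing:
            lines.append("missing " + ", ".join(missing) + ":")
        else:
            lines.append("contains all words:")
        lines.extend("  " + m for m in sorted(members))
    return "\n".join(lines)


# Hypothetical sample of symbol names, chosen only for illustration.
names = ["find-file", "file-name-directory", "find-tag", "forward-char"]
print(report(["find", "file"], names))
```

Names that contain every word come first, then the groups that miss one
word, and so on, so the reader sees the "good" matches at the top and can
tell from the group label why the weaker ones were included.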