From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: Adding a few more finder keywords
Date: Tue, 09 Jun 2015 13:39:51 +0900
Message-ID: <873821xzon.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <87sia2l04r.fsf@gmail.com>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
X-Trace: ger.gmane.org 1433824832 13820 80.91.229.3 (9 Jun 2015 04:40:32 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 9 Jun 2015 04:40:32 +0000 (UTC)
Cc: Oleh Krehel , Artur Malabarba , emacs-devel
To: Stefan Monnier
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Jun 09 06:40:23 2015
Return-path:
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Z2BKo-0005SC-NT for ged-emacs-devel@m.gmane.org; Tue, 09 Jun 2015 06:40:22 +0200
Original-Received: from localhost ([::1]:33051 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z2BKn-0004kV-SD for ged-emacs-devel@m.gmane.org; Tue, 09 Jun 2015 00:40:21 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:49092) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z2BKX-0004iP-4w for emacs-devel@gnu.org; Tue, 09 Jun 2015 00:40:06 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z2BKR-0005tP-JA for emacs-devel@gnu.org; Tue, 09 Jun 2015 00:40:05 -0400
Original-Received: from shako.sk.tsukuba.ac.jp ([130.158.97.161]:33436) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z2BKR-0005t1-9e for emacs-devel@gnu.org; Tue, 09 Jun 2015 00:39:59 -0400
Original-Received: from uwakimon.sk.tsukuba.ac.jp (uwakimon.sk.tsukuba.ac.jp [130.158.99.156]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by shako.sk.tsukuba.ac.jp (Postfix) with ESMTPS id C574D1C386E; Tue, 9 Jun 2015 13:39:53 +0900 (JST)
Original-Received: by uwakimon.sk.tsukuba.ac.jp (Postfix, from userid 1000) id D4F301A28D3; Tue, 9 Jun 2015 13:39:52 +0900 (JST)
In-Reply-To:
X-Mailer: VM undefined under 21.5 (beta34) "kale" 83e5c3cd6be6 XEmacs Lucid (x86_64-unknown-linux)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 130.158.97.161
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:187116
Archived-At:

Stefan Monnier writes:

 > We could decide that the specific keywords are unwanted, tho.

An "unwanted" keyword doesn't exist though. Somebody wanted it or it
wouldn't be in Keywords: in the first place. And although every human
is unique, very few humans are so unique that they'll choose a keyword
that nobody else would use to look up packages. So I think what you
mean by "unwanted" is mostly "redundant (because a synonym)".

It seems to me that

1. There *should* be a list of "recommended keywords" which package
   maintainers can easily access for reference when choosing keywords
   to specify for their packages, and which users can refer to for an
   idea of the keywords maintainers are likely to use.

2. There *should* be a database of synonyms of recommended keywords
   for use by maintainers to discover recommended keywords, and for
   *finder* to use in user searches for keywords. Finder should
   probably divide its report into exact matches for the user's
   keyword and matches discovered via synonyms.

   The schema for this database is unclear to me. Should there be a
   "similarity" measure to indicate how synonymous two keywords are?
   (Probably a YAGNI.)
   Should the primary key of the database be restricted to recommended
   keywords, or perhaps just be the most frequently used of a synonym
   group? (See point 3 below.)

3. There should be a tool to walk the libraries producing a Pareto
   distribution of keywords. Those at the top of the distribution
   would be excellent candidates for the "recommended" list (but
   beware, it's quite possible that two popular keywords could be
   synonyms!). Those at the bottom would be candidates for addition
   to the database of synonyms and replacement with a recommended
   keyword. Probably this tool only needs to be run at release time,
   and the distribution database could be included in etc/.

There's no need to be fascist about keyword maintenance and pruning
low-frequency keywords that have synonyms, either. There is quite
some incentive for maintainers to use user-discoverable (i.e.,
recommended) keywords, if you provide the tools so that they can find
them easily.
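
To make points 2 and 3 a bit more concrete, here is a rough sketch in
Python (not Emacs Lisp, and not anything that exists in Emacs today).
It assumes keywords come from the conventional ";; Keywords:" library
header line, separated by commas or whitespace. The function names, the
in-memory index, and the synonym table keyed by synonym with the
recommended keyword as value are all hypothetical; that schema is only
one of the options raised above, and the sample keywords in the usage
note are illustrative.

```python
import re
from collections import Counter
from pathlib import Path

# Matches a library header line such as ";; Keywords: mail, tools".
KEYWORDS_RE = re.compile(r"^;+[ \t]*Keywords:[ \t]*(.*)$",
                         re.IGNORECASE | re.MULTILINE)


def keywords_in(source: str) -> list[str]:
    """Return the lowercased keywords from the first Keywords: header."""
    match = KEYWORDS_RE.search(source)
    if not match:
        return []
    return [k.lower() for k in re.split(r"[,\s]+", match.group(1)) if k]


def walk_libraries(root: str) -> dict[str, list[str]]:
    """Map each *.el file name under ROOT to its declared keywords."""
    index = {}
    for path in sorted(Path(root).rglob("*.el")):
        text = path.read_text(encoding="utf-8", errors="replace")
        index[path.name] = keywords_in(text)
    return index


def keyword_distribution(index: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Rank keywords by how many libraries use them, most used first."""
    counts = Counter(k for kws in index.values() for k in set(kws))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def find_libraries(query: str,
                   index: dict[str, list[str]],
                   synonyms: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split hits into exact matches and matches found via synonyms.

    SYNONYMS maps a keyword to its recommended equivalent (hypothetical
    schema); keywords absent from the table stand for themselves.
    """
    query = query.lower()
    target = synonyms.get(query, query)
    exact, via_synonym = [], []
    for name, kws in sorted(index.items()):
        if query in kws:
            exact.append(name)
        elif any(synonyms.get(k, k) == target for k in kws):
            via_synonym.append(name)
    return exact, via_synonym
```

Running keyword_distribution(walk_libraries("lisp")) at release time
would give the ranked list from point 3: the head suggests candidates
for the recommended list, the tail suggests candidates for the synonym
table. find_libraries("email", index, {"email": "mail"}) would then
report libraries tagged "email" as exact matches and those tagged
"mail" as synonym matches, which is the two-part report suggested in
point 2.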