From mboxrd@z Thu Jan 1 00:00:00 1970
From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: RE: char equivalence classes in search - why not symmetric?
Date: Tue, 1 Sep 2015 12:09:54 -0700 (PDT)
References: <2a7b9134-af2a-462d-af6c-d02bad60bbe8@default> <55E5C9AC.3010007@lanl.gov> <55E5F112.3090908@lanl.gov>
In-Reply-To: <55E5F112.3090908@lanl.gov>
To: Davis Herring
Cc: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:189417

> >> Because having both input characters mean the same thing
> >> uselessly deprives the user of expressive power.
> >
> > Examples/arguments/reasons, please. IOW, prove it.
>
> I'm sorry: I thought it was obvious. For case folding, there are three
> sets of characters that might be considered a match: [a], [A], and [aA].
> The default Emacs behavior is to make "a" mean [aA] and "A" mean [A].
> For the (relatively rare) case in which [a] is desired, one can turn
> case-fold-search off (e.g., with M-c). Then you gain [a] and lose [aA]
> as a choice (you can't have all three from just two characters!).

You are just echoing what the implementation does, not giving any
supporting reasons for it.

"You can't have all three from just two characters" sounds important -
except that it doesn't mean anything. It is quite possible for the
behavior to be any of these:

  a matches a only
  a matches a and A
  A matches A only
  A matches a and A

The current implementation does not provide for the last possibility.
In that, it can be argued that it "deprives the user of expressive
power". But I won't bother making that argument for case folding.

I am not arguing for a change now in the longstanding case-fold
behavior. I am arguing that we get this right for char folding.

> With your suggestion (which addresses only case-fold-search, of course),
> we would have only [aA] available whether you typed "a" or "A". That is
> the less expressive power: the semantically distinct options available
> have been reduced.

That's your suggestion perhaps. It's certainly not mine.

I suggest letting the user match a to a, a to [aA], A to A, and A to
[aA]. That is more expressive power, not less. With it, the
"semantically distinct options available" have been increased.

> Of course, with more than one character there are yet other
> possibilities: for two characters there are 9, of which "ab" gives you
> [aA][bB] and each of the other three permutations give one
> (case-sensitive) match each. 4/9 isn't great, but it's better than 1/9!

See above. You are reducing possibilities, not expanding them.

> > IMO, more users have been tripped up than helped by the rule
> > that "An upper-case letter anywhere in the incremental search
> > string makes the search case-sensitive." (emacs) Search Case.
>
> How did that upper-case letter get there? Commands like C-w are careful
> not to add uppercase letters if there aren't already some. So the user
> must have typed it explicitly, and so they were paying attention to case
> and have no need for a case-insensitive search.
> The only harm is if
> they are inconsistent in their typing -- during something as brief as
> isearch.

A char in a search string can "get there" because a user typed it, and
that can be because for that user it is easy to type. Or it can get
there from a previous search (same Isearch invocation or not). Or it
can "get there" by yanking copied text.

Try typing or pasting "réduction" to Google, and see if it ignores
hits such as "reduction". Good luck with that. Silly Google, missing
the "obvious".

It should be obvious that it can be useful to match the pattern
"réduction" against "reduction", just as it can be useful to match the
pattern "reduction" against "réduction" (and "réduction" against
"réduction" and "reduction" against "reduction").

To remove this possibility, thus reducing user expressiveness, you
really should come up with a reason.
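
To make the difference concrete, here is a rough sketch of the three
behaviors being discussed. It is Python rather than Emacs Lisp, it is
not how Isearch is implemented, and every name in it (fold,
exact_match, symmetric_match, asymmetric_match) is made up for
illustration. It also assumes that "folding" means nothing more than
dropping combining marks and ignoring case, and it applies the
asymmetric rule per character, whereas the Emacs case rule quoted above
applies to the whole search string.

```python
import unicodedata


def fold(text: str) -> str:
    """Hypothetical folding: drop combining marks, then ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def exact_match(pattern: str, text: str) -> bool:
    """No folding at all: "a" matches only "a", "é" matches only "é"."""
    return unicodedata.normalize("NFC", pattern) == unicodedata.normalize("NFC", text)


def symmetric_match(pattern: str, text: str) -> bool:
    """Folding applied to both sides: "é" finds "e" and "e" finds "é"."""
    return fold(pattern) == fold(text)


def asymmetric_match(pattern: str, text: str) -> bool:
    """Per-character simplification of the one-way rule: a base char in the
    pattern matches its whole class, any other char matches only itself."""
    p = unicodedata.normalize("NFC", pattern)
    t = unicodedata.normalize("NFC", text)
    if len(p) != len(t):
        return False
    for pc, tc in zip(p, t):
        if fold(pc) == pc:
            # Base character (e.g. "a" or "e"): accept anything in its class.
            if fold(tc) != pc:
                return False
        elif pc != tc:
            # Accented or upper-case character: accept only itself.
            return False
    return True


if __name__ == "__main__":
    for pattern, text in [("reduction", "réduction"), ("réduction", "reduction")]:
        print(pattern, text,
              exact_match(pattern, text),
              symmetric_match(pattern, text),
              asymmetric_match(pattern, text))
```

Under those assumptions, asymmetric_match finds "réduction" from the
pattern "reduction" but not the reverse, while symmetric_match finds
both. If exact_match stays available as a toggle next to
symmetric_match, all four pairings listed above are reachable.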