From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: char equivalence classes in search - why not symmetric?
Date: Tue, 1 Sep 2015 08:46:26 -0700 (PDT)
Message-ID: <2a7b9134-af2a-462d-af6c-d02bad60bbe8@default>
To: emacs-devel@gnu.org

When character folding is turned on, shouldn't you be able to search
for á and find (match) a, à, ã, ª, â, å, and ä?

I think so. Currently you cannot - you can only do the reverse:
search for a and find any of the above. a is treated specially. Why?

I suppose that the logic behind the current implementation is to
mirror what we do with case-fold searching. But is that the right
thing in this case?

For case-fold searching, it was thought that if you bother to hold the
Shift key and thus use an uppercase letter then you want to match
case, and otherwise you do not (case-insensitive). This was
essentially, I think, a shortcut for programmers, and it was
introduced at a time when much of the code being searched was
case-ambivalent. (UNIX was still pretty much an exception at that
point, in distinguishing lowercase letters.)
Whether or not this behavior for case-fold is still a good thing is
questionable now, I think. I don't think it is necessary now or
particularly useful. And I think it can be confusing to newbies. Why
should searching for A be different from searching for a, wrt case
matching?

But I'm not really questioning the behavior of case-fold searching
now. I am questioning applying this same behavior to char folding.

To me, folding a group of chars together for search purposes should be
symmetric - go both ways. It should, in effect, treat the given group
of chars as equivalent - as an equivalence class wrt searching.

Why not? Why, when char folding, treat plain a specially for
searching? Why not treat á, a, à, ã, ª, â, å, and ä the same? Isn't
that the point here? We are telling Isearch that they are equivalent.
Why pick one of them as the canonical search-pattern to use for
finding any of them? Why privilege a over á, à, ã, ª, â, å, and ä?

Now most of the time I, like most people, will be typing a instead of
á into a search string. But that's not really the point. I think
users should be able to use any members of an equivalence class of
chars indifferently.

And when it comes to chars other than letters, it might well be that
some users, with some keyboards, will find some chars in an
equivalence class easier to type than others. Let them use/type
whichever they like, no?

This feature, welcome as it is, seems only half-baked, so far. How
about equality for char-folding equivalence?
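
To make the difference concrete, here is a rough sketch in Python (not Emacs Lisp, and not how Isearch is implemented). It assumes an equivalence class can be approximated by Unicode compatibility decomposition with combining marks stripped. The names fold, one_way_match, same_class, and search are hypothetical, chosen only for this illustration.

```python
import unicodedata


def fold(text: str) -> str:
    """Reduce text to base characters (assumed stand-in for a folding table).

    NFKD splits e.g. U+00E1 into 'a' plus a combining acute accent and maps
    U+00AA (feminine ordinal) to 'a'; the combining marks are then dropped.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def one_way_match(p: str, t: str) -> bool:
    """Model of the behavior described above: only a plain base character
    in the pattern stands for its whole class; anything else is literal."""
    return p == t or (fold(p) == p and fold(t) == p)


def same_class(p: str, t: str) -> bool:
    """Symmetric variant: two characters match when they fold to the same base."""
    return fold(p) == fold(t)


def search(pattern: str, text: str, char_match) -> int:
    """Return the index of the first match of pattern in text, or -1.

    Compares character by character, so it assumes both strings use
    precomposed (NFC) characters.
    """
    n = len(pattern)
    for i in range(len(text) - n + 1):
        if all(char_match(p, t) for p, t in zip(pattern, text[i:i + n])):
            return i
    return -1
```

With one_way_match, searching "un chat" for á finds nothing, while searching "déjà" for a finds the à. With same_class, both searches succeed, which is the symmetry asked for here.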