From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Drew Adams <drew.adams@oracle.com>
Newsgroups: gmane.emacs.devel
Subject: char equivalence classes in search - why not symmetric?
Date: Tue, 1 Sep 2015 08:46:26 -0700 (PDT)
Message-ID: <2a7b9134-af2a-462d-af6c-d02bad60bbe8@default>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: ger.gmane.org 1441122417 15701 80.91.229.3 (1 Sep 2015 15:46:57 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 1 Sep 2015 15:46:57 +0000 (UTC)
To: emacs-devel@gnu.org
X-Priority: 3
X-Mailer: Oracle Beehive Extensions for Outlook 2.0.1.9  (901082) [OL
	12.0.6691.5000 (x86)]
X-Source-IP: userv0022.oracle.com [156.151.31.74]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic]
X-Received-From: 141.146.126.69
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:189389
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/189389>

When character folding is turned on, shouldn't you be able to
search for á and find (match) a, à, ã, ª, â, å, and ä?

I think so.  Currently you cannot - you can only do the reverse:
search for a and find any of the above.  a is treated specially.
Why?

I suppose that the logic behind the current implementation is
to mirror what we do with case-fold searching.  But is that the
right thing in this case?

For case-fold searching, it was thought that if you bother to
hold the Shift key and thus use an uppercase letter then you
want to match case, and otherwise you do not (case-insensitive).
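
As a rough illustration of that rule (a Python sketch, not the Emacs
code; the name `smart_case_search` is made up for this example): an
uppercase letter in the search string makes the match case-sensitive,
and an all-lowercase string matches either case.

```python
import re

def smart_case_search(pattern, text):
    """Search TEXT for the literal PATTERN using the Shift-key rule."""
    # Any uppercase letter in the pattern turns case folding off;
    # otherwise the search ignores case.
    flags = 0 if any(c.isupper() for c in pattern) else re.IGNORECASE
    return re.search(re.escape(pattern), text, flags)
```

So searching for a finds A, but searching for A does not find a.
That asymmetry is the one being compared with char folding here.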

This was essentially, I think, a shortcut for programmers, and
it was introduced at a time when much of the code being searched
was case-ambivalent.  (UNIX was still pretty much an exception
at that point, in distinguishing lowercase letters.)

Whether or not this behavior for case-fold is still a good thing
is questionable now, I think.  I don't think it is necessary now
or particularly useful.  And I think it can be confusing to
newbies.  Why should searching for A be different from searching
for a, wrt case matching?

But I'm not really questioning the behavior of case-fold
searching now.  I am questioning applying this same behavior
to char folding.

To me, folding a group of chars together for search purposes
should be symmetric - go both ways.  It should, in effect,
treat the given group of chars as equivalent - as an
equivalence class wrt searching.
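
To make "symmetric" concrete, here is a minimal sketch in Python (not
Emacs Lisp, and not how Isearch is implemented).  It assumes that two
chars are equivalent when they share the same base after Unicode NFKD
decomposition with combining marks dropped; the names `fold_key`,
`build_classes`, and `symmetric_pattern` are hypothetical.

```python
import re
import unicodedata

def fold_key(ch):
    """Return the base form of CH: NFKD-decompose, drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def build_classes(alphabet):
    """Group the chars of ALPHABET into equivalence classes by fold key."""
    classes = {}
    for ch in alphabet:
        classes.setdefault(fold_key(ch), set()).add(ch)
    return classes

def symmetric_pattern(query, classes):
    """Compile a regexp where each char of QUERY matches its whole class."""
    parts = []
    for ch in query:
        # Look the class up by the char's key, so it makes no difference
        # which member of the class the user typed.
        members = classes.get(fold_key(ch), set()) | {ch}
        parts.append("[" + "".join(re.escape(m) for m in sorted(members)) + "]")
    return re.compile("".join(parts))

CLASS = "aáàãªâåä"
classes = build_classes(CLASS)
```

With this, `symmetric_pattern("á", classes)` matches a, and
`symmetric_pattern("a", classes)` matches á: every member of the
class finds every other member.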

Why not?  Why, when char folding, treat plain a specially for
searching?  Why not treat á, a, à, ã, ª, â, å, and ä the same?
Isn't that the point here?  We are telling Isearch that they
are equivalent.  Why pick one of them as the canonical
search-pattern to use for finding any of them?  Why privilege
a over á, à, ã, ª, â, å, and ä?

Now most of the time I, like most people, will be typing a
instead of á into a search string.  But that's not really the
point.  I think users should be able to use any members of an
equivalence class of chars indifferently.

And when it comes to chars other than letters, it might well
be that some users, with some keyboards, will find some chars
in an equivalence class easier to type than others.  Let them
use/type whichever they like, no?

This feature, welcome as it is, seems only half-baked, so far.
How about equality for char-folding equivalence?