From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: RE: char equivalence classes in search - why not symmetric?
Date: Thu, 10 Sep 2015 08:02:43 -0700 (PDT)
Message-ID: <921a4633-63f0-490c-b9da-5bd3641baad3@default>
In-Reply-To: <87zj0u4v96.fsf@fencepost.gnu.org>
To: David Kastrup
Cc: rms@gnu.org, ulm@gentoo.org, bruce.connor.am@gmail.com, juri@linkov.net, eliz@gnu.org, emacs-devel@gnu.org

> >>> They are equivalence classes.  The chars are equivalent when searched
> >>> for (with char folding turned on).
> >>
> >> No, they aren't.  For instance, A and Á are not equivalent in search.
> >> Searching for A will match Á, but searching for Á will not match A.
> >
> > Please read what I said: "The chars are equivalent when searched for."
> >                                                    ^^^^^^^^^^^^^^^^^
(with char-fold search, i.e., ignoring diacritics - that's the context)

> They aren't.  Searching with the search string "Á" will find "Á" but
> not "A".

For anyone who really still does not understand, and anyone who
might be pretending not to understand ;-):

When search is case-insensitive, occurrences of a and A in the
searched text are found equivalently.  As search targets, a and A
are equivalent for case-insensitive search.

If you ask to find an occurrence of the first letter of the English
alphabet, and you say that you don't care about case, you find, as
you expect, either a or A, indifferently.  a and A in the searched
text are treated the same by case folding.  They form an equivalence
class in this context.

But in Emacs, if you put A in the search string then you inhibit,
turn OFF, blow away case-insensitive search - case is no longer
folded.  So of course any statement about the behavior of case-fold
search is irrelevant then.

Likewise, for char folding.  When char folding is on, A and Á in the
searched text are found equivalently.  As search targets, A and Á
are equivalent for char-fold search.

If you don't care about diacritics, you can expect to find either
A or Á, indifferently, and you do, when char folding is in effect.
A and Á in the searched text are treated the same by char folding.
They form an equivalence class in this context.

But in Emacs, currently, if you put Á in the search string then you
inhibit, turn OFF, blow away char-fold search.  So of course any
statement about the behavior of char-fold search is irrelevant then.

a and A for case folding, and A and Á for char folding, form
equivalence classes wrt being found in searched text.  Case folding
does NOT apply if you put A in the search string.  Char folding does
NOT apply if you put Á in the search string.
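
To make the rule above concrete, here is a toy model in Python.  It
is only a sketch of the behavior as I describe it, not Emacs's
implementation: the names `strip_diacritics' and `current_search'
are made up for illustration, and "folding" is approximated by
Unicode decomposition plus lowercasing.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks after canonical decomposition (e.g. Á -> A)."""
    decomposed = unicodedata.normalize("NFD", text)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose so characters without diacritics round-trip unchanged.
    return unicodedata.normalize("NFC", bare)


def current_search(pattern, text):
    """Toy model of the current rule (an assumption, not Emacs code).

    Case folding applies only if the pattern has no uppercase letter.
    Char folding applies only if the pattern has no diacritics.
    Returns True if the pattern is found somewhere in the text.
    """
    pattern = unicodedata.normalize("NFC", pattern)
    text = unicodedata.normalize("NFC", text)
    fold_case = pattern == pattern.lower()
    fold_chars = pattern == strip_diacritics(pattern)
    if fold_chars:
        text = strip_diacritics(text)
    if fold_case:
        text = text.lower()
    return pattern in text
```

With this model, current_search("a", "Á") and current_search("A", "Á")
both succeed, because the searched text is folded.  But
current_search("Á", "A") fails, because putting Á in the search
string switches char folding off, just as putting A there switches
case folding off.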

Ulrich Müller CANNOT search for his last name using Müller in the
search string and have search ignore diacritics, so that it matches
indifferently Müller and Muller.  That is, char folding simply DOES
NOT WORK here - verboten.  (He can of course use regexp search to
work around the limitation.)

> > I did *not* say, as you say, that they are "equivalent in search."
> > I tried to carefully distinguish the two uses of the chars: when used
> > as search targets (they are currently equivalent) vs when used in the
> > search string (they are not equivalent, currently).
>
> Yes, there is a distinction between search targets and search spec.  But
> they are different in either category.

Indeed, sigh.

The point of the proposal of this thread is to _allow_ users to
search _using char folding_ regardless of whether there are
diacritics in the search string.  They would still be able to use
search without char folding, e.g., to search for Á and find only Á,
not also A.
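
As a rough illustration of what is being proposed, here is a small
Python sketch.  It is not a patch and not how Emacs would implement
it: the function names and the `char_fold' flag are hypothetical,
standing in for whatever toggle users would actually get, and
folding is approximated by stripping Unicode combining marks.

```python
import unicodedata


def fold(text):
    """Remove diacritics: decompose, drop combining marks, recompose."""
    decomposed = unicodedata.normalize("NFD", text)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", bare)


def proposed_search(pattern, text, char_fold=True):
    """Symmetric char folding: fold BOTH the search string and the text.

    With char_fold=False (hypothetical toggle) the match is exact,
    so a search for Á finds only Á.
    """
    pattern = unicodedata.normalize("NFC", pattern)
    text = unicodedata.normalize("NFC", text)
    if char_fold:
        pattern, text = fold(pattern), fold(text)
    return pattern in text
```

Here proposed_search("Müller", "Muller") and
proposed_search("Muller", "Müller") both succeed, so the equivalence
holds in both directions.  Passing char_fold=False restores exact
matching, so a search for Á then finds Á but not A.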