From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: RE: char equivalence classes in search - why not symmetric?
Date: Tue, 8 Sep 2015 07:24:09 -0700 (PDT)
Message-ID: <8cf269bc-69d8-4752-8506-de8d992512e1@default>
In-Reply-To: <87oahd11i9.fsf@uwakimon.sk.tsukuba.ac.jp>
To: "Stephen J. Turnbull", Jean-Christophe Helary
Cc: emacs-devel@gnu.org

> The discussion here is entirely about the DWIM
> UI of isearch that allows requesting strict matching by having at
> least one uppercase or accented character, even though lax mode is
> enabled.

The proposal is explicitly *not* for the former, now. The weird
exception of an uppercase letter making the current search be
case-sensitive, even though you have toggled case sensitivity OFF,
is not under attack now.
Personally, yes, I would get rid of that anomaly too at some point,
but I'm not proposing that now.

Likewise, for the anomaly that whitespace folding is switched off by
SPC SPC. That too, I would like to see removed eventually, but I'm
not proposing that now either.

The point now is to DTRT wrt char folding - the new feature.

> Drew prefers a UI that enables/disables strict mode using a
> special isearch command bound to a key.

We already have that. What I'm proposing in this thread is that
when char folding is on, it work symmetrically: Folding should let
you use `é' in the search string to match any of the accented or
unaccented variants, just as it does for `e' in the search string.

Nothing more. What's good for `e' should be good for `é' and all
the rest. It's about equivalence classes. There is no reason to
limit search strings to one privileged member of an equivalence
class when trying to match any members of the class. That's all.

> That would be plausible, if the DWIM
> UI for case fold search in isearch weren't 3 decades old.

See above. I am *not* now proposing a change to case-fold behavior.
I've made that clear from the beginning, and repeated it several
times now.

But it seems that it is easier, for those not favorable to what I
(and Juri, apparently) propose, to harp on the age-old anomaly of
uppercase case-fold annulment as, somehow (?), an argument against
clean, symmetric char folding.

Please argue about the topic at hand (see Subject line), not whether
the 1980s decision to make an exception for an uppercase letter in
the search string was or is a good idea.

> But the DWIM UI *is* 3 decades old, and successful. Drew
> disputes that,

No, Drew does not. You cannot show one place where anything Drew
has written suggests that he disputes that.

> but in the 25 years I've followed Emacs development this is
> the first time I've seen anybody complain about the DWIM-ish
> case folding feature.

Live and learn.
;-) That is not the topic of this thread, in any case.

> Note that incremental case-folded search (usually with no escape for
> strict matching!) has been widely adopted in web and file browsers.

Uh, no. Case folding, yes. But not case folding that switches off
(becoming case-sensitive) just because you include an uppercase
letter in the search string. Not in any browser I have, at least.
Nor in Notepad or TextPad or other simple editors that newbies or
non-programmers might be used to.

But again, that is *not* the topic of this thread.

> I'm +1 on generalizing this UI to "diacritic folding" in isearch.

By "this UI", I guess you mean that if there is a char with a
diacritic in the search string then that should turn off char
folding, preventing you from matching text ignoring diacritics.

That would be unfortunate - a strict loss (inability to match `é'
against `e'; only ability to match `e' against `é'), and with no
gain.

> The other question is that of Ulrich Müller, who points out that it's
> natural for him to type his name correctly, but he'd like to laxly
> match Mueller and Muller, too.[1]

Same as my resumé example, yes. And the use case includes various
quotation marks (e.g. curly) in the search string and wanting to
match various others in the text. E.g., you copy some text from a
web page, which includes some curly quote marks, and you want to
match text in your buffer but ignoring the difference in quote-mark
type.

Likewise, for any of the other equivalence classes. No reason to
privilege any particular member of a class, making it so that only
that member can be used in a search string to match the other
members.

We've seen no argument supporting such asymmetry. (I can imagine an
argument in terms of implementation, but we have not heard that yet.
And *no* argument has been given in user terms - UI. Why should
users be limited wrt which class member they can use to match a
class?)
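
To make the difference concrete, here is a minimal sketch in Python.
It is not Emacs code and makes no claim about how Isearch implements
char folding. The names `fold', `search_symmetric', `search_one_way'
and the small `QUOTE_CLASSES' table are all hypothetical, and the
sketch assumes precomposed input (one code point per accented char).

```python
import unicodedata

# Hypothetical extra equivalences: Unicode decomposition does not map
# curly quotes to straight ones, so they are listed by hand here.
QUOTE_CLASSES = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}


def fold(s):
    """Map each char to a representative of its equivalence class."""
    s = "".join(QUOTE_CLASSES.get(ch, ch) for ch in s)
    decomposed = unicodedata.normalize("NFKD", s)
    # Drop combining marks (accents), keeping only the base chars.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_symmetric(pattern, text):
    """Fold both sides: any class member matches any other member."""
    return fold(pattern) in fold(text)


def search_one_way(pattern, text):
    """Only a base char in the pattern matches its whole class.

    Any other class member in the pattern matches only itself.
    """
    n = len(pattern)
    for i in range(len(text) - n + 1):
        window = text[i:i + n]
        if all(p == t or (fold(p) == p and fold(t) == p)
               for p, t in zip(pattern, window)):
            return True
    return False


print(search_one_way("resume", "my r\u00e9sum\u00e9"))    # True
print(search_one_way("r\u00e9sum\u00e9", "my resume"))    # False
print(search_symmetric("r\u00e9sum\u00e9", "my resume"))  # True
print(search_symmetric("\u201cfoo\u201d", 'say "foo"'))   # True
```

The one-way version loses exactly the matches described above: `é'
in the search string no longer finds `e' in the text. Note that this
sketch folds `ü' to `u' only, so it does not cover "Mueller".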
> It's a valid use case, obviously,
> but based on an analogy to experience with DWIMish case-folding in
> Emacs, I believe most users will quickly adjust to typing "muller"
> when they want a poor man's version of full "orthographic
> equivalence". Individuals may not, but I believe the great majority
> will, since I'm sure it's anatomically easier to type "muller" than
> "Müller", even on a German keyboard.

It's not only about typing. That seems to be the main point that
those who repeat this mantra forget. Text can be pasted into an
Isearch string, including text copied from outside Emacs. Text using
any Unicode chars, from any languages.

> Footnotes:
> [1] Drew also argues this point, but from an abstract insistence on
> "symmetry", which doesn't really exist here for representational,
> anatomical, psychological reasons, and let's not forget personal
> historical reasons like "Müller is my name".

Nonsense. I gave concrete examples. It's not an academic argument.
It's about really having character folding, not just a one-way
character folding that requires you to type (or edit a pasted
string) _only_ the "canonical" chars that are folded.

It's a practical argument, not an abstract insistence on symmetry.
Being _able_ to fold `é' to `e' or `è', and to fold one kind of
quote mark to others, is, yes, a normal use case. Nothing odd,
abstract, or academic about it. Herr Müller confirms this with his
own example.

This should be a no-brainer, IMO.