From mboxrd@z Thu Jan 1 00:00:00 1970
From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: RE: char equivalence classes in search - why not symmetric?
Date: Thu, 3 Sep 2015 10:15:01 -0700 (PDT)
Message-ID: <66ae6614-75bf-4b1b-9c16-5e7755a824f8@default>
References: <2a7b9134-af2a-462d-af6c-d02bad60bbe8@default>
 <834mjecdy7.fsf@gnu.org> <834mjcbydi.fsf@gnu.org> <83vbbs9qd1.fsf@gnu.org>
To: bruce.connor.am@gmail.com, Jean-Christophe Helary
Cc: emacs-devel
Xref: news.gmane.org gmane.emacs.devel:189524

> Yes, and you can count me among those objections.
> When I first started with emacs, case folding by default was something
> I liked a lot, before I ever knew how to configure this stuff.
> I also only learned about lax whitespace when it became the default (IIRC).
> It was a feature that already existed and yet I had no idea because
> it wasn't default.

Emacs _should_ work on improving discoverability, IMO, but that is
a separate discussion.

IMO and FWIW, it is misguided to provide confusing, dwim behavior by
default. Hard for a newbie to guess what the behavior really is,
because it is too complex, conditional, contextual, whatever.
The argument that we have this nifty feature and newbies won't
discover it on their own easily, so let's foist it upon them from
the outset, as the default behavior, is quite misguided.

What should be done is to have simple, obvious default behavior,
easy to fathom. AND to have easy ways to discover alternate,
optional, fancy behavior that some of us might be convinced is
handier, more powerful, more elegant, or more clever.

Discoverability is not an argument for choosing any default
behavior. Poor discoverability is an argument for improving
discoverability. Nothing more.

That should be a no-brainer, IMO, but we hear this over and over
again. Developers like to show off the clever things they come up
with. That's human and normal. Add such things, sure, but don't
make them the default behavior. Especially when they are brand new.

That a somewhat dwimish default was chosen for case folding 40 years
ago, back when I was programming FORTRAN and most editing and
programming involved case-insensitive contexts, should not be an
argument for using it today - and certainly not for doubling down
on it for new developments (e.g. char folding).

It should instead be a reason to revisit whether we, in 2015, should
continue to have search be case-insensitive by default.

There is only one reasonable argument I can see in favor of keeping
case insensitivity the default, and it does not at all apply to the
other kinds of folding we are talking about now (char folding,
whitespace folding). This is why I said:

But I won't bother making that argument for case folding. I am not
arguing for a change now in the longstanding case-fold behavior.
I am arguing that we get this right for char folding.

What is that somewhat reasonable argument for turning on case
insensitivity by default? Habit. I see no other good argument
for it "nowadays". Forty years ago, yes; today, no.
Today, most contexts involve both uppercase and lowercase letters,
and they are distinguished semantically (case-sensitive).

It's perhaps a bit odd that some of those who are so quick to argue
for "modernizing" Emacs might also argue to keep their case
insensitivity by default. Old-fartness is relative?

The rule about least surprise for newbies I expressed above applies
even more to the dwim rule that an uppercase letter in the search
string magically flips search to case sensitivity. Handy as you
might find that dwim, it is hardly immediately clear to a newbie
what is going on. Other editors that are case-insensitive by
default do not throw such a gotcha at new users.

(Emacs is not your average editor, and it is great that Emacs does
fancier things than most do, but we're talking about default
behavior here.)

I mention this to try to put a stop to the application of an old
rationale for case folding to char folding etc., not to argue
that we should (now) consider changing the default behavior for
case folding.

To be clear, and to try to forestall the usual whining from some:
I don't care much what the _default_ behavior is for char folding.
That's not what this thread is about.

I, like Jean-Christophe apparently, think that it helps newbies
more to have Isearch, by default, search for just what you type
(imagine!). But I don't feel strongly about that.

What is more important is to be able to (a) customize the default
behavior and (b) toggle it anytime during Isearch.

Also important, to me, is to be able, as I proposed and as Juri
apparently seconded, to have `á' match any of the `a' variants,
just like `a' can do. That is, be able to toggle whether `á'
(or `a') matches only itself or all `a' variants - e.g., as Juri
proposed, using `M-''.

And that, BTW, is the topic of this thread (see Subject line).
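
For concreteness, the symmetric matching proposed above can be
sketched in a few lines of Python. This is only an illustration
under stated assumptions, not how Isearch implements char folding:
the names `fold' and `search' and the `char_fold' flag are
hypothetical, and "variant" is approximated here as "same base
character once combining marks are stripped".

```python
import unicodedata


def fold(s):
    """Map each character to its base form by dropping combining marks.

    Assumption: two characters are "variants" of each other when they
    share a base character after canonical decomposition (NFD).
    """
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search(needle, text, char_fold=False):
    """Return the substrings of text that match needle.

    With char_fold off, only exact matches count. With it on, BOTH the
    needle and the text are folded, so the relation is symmetric: `a'
    finds `á' and `á' finds `a'.
    """
    # Compose both sides first so that one user-visible character is
    # one code point where a precomposed form exists.
    needle = unicodedata.normalize("NFC", needle)
    text = unicodedata.normalize("NFC", text)
    if not needle:
        return []
    key = fold(needle) if char_fold else needle
    n = len(needle)
    hits = []
    for i in range(len(text) - n + 1):
        window = text[i:i + n]
        candidate = fold(window) if char_fold else window
        if candidate == key:
            hits.append(window)
    return hits


if __name__ == "__main__":
    sample = "a á à b"
    print(search("á", sample))                  # exact: only `á'
    print(search("á", sample, char_fold=True))  # folded: all `a' variants
    print(search("a", sample, char_fold=True))  # same set, other direction
```

The point of the sketch is that folding is applied to both sides of
the comparison. Folding only the text (or only the needle) is what
produces the asymmetry this thread is about.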
What goes for `a' should also go for `á': either of them should be
able to match, au choix, either itself alone or any of its
char-folding variants (and yes, they _are_ equivalences).

I also support Juri's mention of doing the same for whitespace
folding: letting `M-s SPC' toggle whitespace dwimming (option
`search-whitespace-regexp'). But we can also separate out that
discussion from the current topic, which is about char folding.

The general argument about the default behavior is that what a
user puts in the search string is what should be looked for.
If s?he inserts a SPC char then only a SPC char should be sought.
If s?he inserts two consecutive SPC chars then only two consecutive
SPC chars should be sought. You want cleverer, handier behavior?
Customize the option.

Attempts to finesse the confusion and the possible useful dwim
behaviors tend to end with even more complex dwim behavior: rules
upon rules. See recent discussions about whitespace, where we hear
things like SPC should (by default) match any amount of any
whitespace, but SPC SPC should match only SPC SPC. Unless the moon
is full or it is Tuesday before noon... Epicycles upon epicycles.

Far better to keep the default behavior simple and immediately
understandable - no need to look up the doc and study a dwim
flowchart.

On top of that we can add any fancy alternative behaviors we think
are handier or more clever. But let's not impose those on newbies
as default behavior, no matter how helpful and ingenious we are
convinced they might be. And certainly not with the excuse that it
makes the fancy feature more discoverable.

[The last (so far) of the folding things is what `M-s i' does: it
toggles search behavior for invisible text. I'm OK with the
default value in this case, but it too could be open for discussion
in the general context of folding. That too is best left for a
separate discussion.]
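
The difference between literal and lax whitespace matching described
above can likewise be sketched in Python. Again this is an
illustration under assumptions, not Emacs code: `build_pattern' and
its parameters are hypothetical names, the `whitespace_regexp'
parameter is only modelled on the idea behind the option
`search-whitespace-regexp', and the value `\s+' is an assumption,
not the Emacs default.

```python
import re


def build_pattern(query, lax_whitespace=False, whitespace_regexp=r"\s+"):
    """Compile a search pattern for query.

    Literal mode (the default here): what you type is what is sought,
    so one space matches exactly one space and two match exactly two.

    Lax mode: each run of spaces in the query is replaced by
    whitespace_regexp, so it can match any whitespace that regexp allows.
    """
    if not lax_whitespace:
        return re.compile(re.escape(query))
    # Split on runs of spaces, escape the literal pieces, then rejoin
    # them with the (unescaped) whitespace regexp.
    pieces = [re.escape(piece) for piece in re.split(" +", query)]
    return re.compile(whitespace_regexp.join(pieces))


if __name__ == "__main__":
    text = "foo bar, foo  bar, foo\tbar"
    print(build_pattern("foo bar").findall(text))
    print(build_pattern("foo  bar").findall(text))
    print(build_pattern("foo bar", lax_whitespace=True).findall(text))
```

In this sketch the literal behavior needs no explanation at all,
while the lax behavior is one explicit switch plus one customizable
regexp, with no special case for a doubled space.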