From: "Drew Adams"
Newsgroups: gmane.emacs.help,gmane.emacs.devel
Subject: RE: emacs 22 - regular-expression isearch on spaces extremely lenient
Date: Sat, 29 Apr 2006 07:41:08 -0700
References: <2cd46e7f0604281356i582388e2kef07922b6b6a9a3a@mail.gmail.com>
Cc: Emacs-Devel

    i recently started noticing that emacs 22 regular expression
    isearches do not treat spaces exactly - any number of spaces in
    your search will map to any number, same or different, of spaces
    in the target.  can anyone tell me whether or not it's deliberate,
    and what the rationale is?

(setq search-whitespace-regexp nil) will turn this off. When this is
nil, each space you type matches literally, against one space.

`search-whitespace-regexp' is, by default, "\\s-+", which searches for
any amount of whitespace when you type a space. This was introduced for
regexp search in Emacs 21, I believe. There is no such "magic-space"
searching in Emacs 20. Doc:

    If non-nil, regular expression to match a sequence of whitespace chars.
    This applies to regular expression incremental search.
    When you put a space or spaces in the incremental regexp, it stands for
    this, unless it is inside of a regexp construct such as [...] or *, + or ?.
    You might want to use something like "[ \t\r\n]+" instead.
    In the Customization buffer, that is `[' followed by a space,
    a tab, a carriage return (control-M), a newline, and `]+'.

The rationale was, I believe, that some users might want that: type
space to find any amount of whitespace, in particular, to find two
words that are separated by a newline.

There was talk of using this "magic-space" searching also for plain
incremental search in Emacs 22, but I don't think that was done.

FWIW, I agree with Miles on this - this is a misfeature, if turned on
by default. It should be off by default, and you should be able to turn
it on via a simple toggle during incremental search (regexp or plain).
Here is what I wrote 2005/02/06 to emacs-devel on this:

> > sometimes the actual whitespace matters.
> Right: in *regexp* search.

    while people generally expect regexp searches to be a bit fuzzy,
    they might expect a non-regexp search to be exact. Since the fuzzy
    whitespace matching often "looks" like normal matching (because
    the majority of whitespace is in fact a single space), it might
    take some time to see what's going on, resulting in some subtle
    errors. This is particularly true if one embeds a search inside a
    keyboard macro [which I often do].

Plain (incremental) search should be a literal search. Regexp search
should rigorously respect the regexp. People don't expect either to be
fuzzy.

The question is "Under what circumstances should typing a space be
interpreted as wanting to search for any amount of whitespace?" This is
unrelated to both plain search and regexp search. You might or might
not want this _input effect_ with either plain or regexp search.

This is akin to word search (as I think someone mentioned).
Ultimately, a word search or a space-means-whitespace search is
implemented with a regexp search - but the point in both cases is to
provide a user-friendly way to do it, instead of requiring users to
know about regexps.

By default, neither `C-M-s' nor `C-s' should respect the user-friendly
space-input feature. Or, rather, the default behavior of each should be
determined by a user option - a la case-fold-search. And, regardless of
the value of this option, you should be able to toggle
space-means-whitespace searching from both `C-M-s' and `C-s', via a key
sequence.

The question then becomes how to toggle this space-means-whitespace
searching?
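
To make the two behaviors concrete, here is a minimal sketch in Emacs
Lisp. It relies only on the `search-whitespace-regexp' variable
discussed above. The names `my-search-whitespace-default' and
`my-toggle-search-whitespace' are hypothetical and not part of Emacs,
and the toggle is meant to be run with M-x between searches; it does
not settle which key sequence should do this during a search.

```elisp
;; Sketch only.  The `my-' names are hypothetical, not part of Emacs.

;; In your init file: make each space typed in regexp isearch match
;; exactly one space.
(setq search-whitespace-regexp nil)

;; What the two settings amount to, shown with `string-match'
;; (evaluate each form with C-x C-e):
(string-match "foo\\s-+bar" "foo   bar")  ; magic space: matches
(string-match "foo bar" "foo   bar")      ; literal space: no match

(defvar my-search-whitespace-default "\\s-+"
  "Regexp to restore when magic-space matching is turned back on.")

(defun my-toggle-search-whitespace ()
  "Toggle whether a space typed in regexp isearch matches any whitespace."
  (interactive)
  ;; nil means literal spaces; a regexp means magic-space matching.
  (setq search-whitespace-regexp
        (if search-whitespace-regexp
            nil
          my-search-whitespace-default))
  (message "A typed space now matches %s"
           (if search-whitespace-regexp
               "any run of whitespace"
             "exactly one space")))
```

The value you set with `setq' acts as the default, in the spirit of
case-fold-search, and the command flips it for later searches.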