From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Whitespace search and regex.c
Date: Thu, 25 Nov 2004 10:20:50 -0500
Message-ID: <87y8gq9da4.fsf-monnier+emacs@gnu.org>
To: rms@gnu.org
Cc: emacs-devel@gnu.org
In-Reply-To: (Richard Stallman's message of "Wed, 24 Nov 2004 21:21:39 -0500")

>     What is this recent change to regex.c w.r.t whitespace search all about?
>     This is really ugly.
>     As best as I can tell, this is to avoid the problem where
>     (replace-regexp-in-string " " "\\(?:\s-+\\)" ...) does not give the right
>     result because the " " could be inside brackets.

> It has nothing to do with replace-regexp-in-string, which doesn't use
> this feature.

I didn't mean to say that it was used by replace-regexp-in-string but that
it was used for those cases where you want a regex generated by
systematically replacing " " with something else (such as "\\(?:\s-+\\)").
In those cases, the obvious way to do it (with replace-regexp-in-string or
some piece of code that ends up doing something similar) suffers from the
fact that it will replace " " even if it appears inside brackets delimiting
a char-range.

> It is for the sake of user-level search features that
> want a series of SPCs to stand for some broader kind of whitespace.

I'm afraid that doesn't tell me really what it is for.  I.e. why is it
implemented this way rather than some other way?  What was the
precise goal?

E.g. if you told me to implement a "user-level search features that want
a series of SPCs to stand for some broader kind of whitespace" it'd never
occur to me to fiddle with regex.[ch].
Instead, I'd add a piece of elisp code which replaces every SPC (or
sequence of SPC) in a regex with some other regex.  E.g. I'd use something
like (replace-regexp-in-string " " "\\(?:\s-+\\)" ...).  Now maybe in order
to correctly do the replacement in the presence of brackets, I'd probably
add a function like (parse-partial-regex REGEXP POS), potentially (tho
probably not at first) implemented in regex.[ch].

>     After all this problem manifests itself at a few other places (such as
>     regexp-opt-depth) as well.

> I don't follow how this relates to regexp-opt-depth.
> Would you please spell that out?

Regexp-opt-depth has to count the number of occurrences of "\\(" in
a regexp, but it should be careful not to count those occurrences that
appear within brackets.

>     E.g. a function (parse-partial-regex REGEXP POS)
>     which would return a value indicating whether POS is within brackets or not.

> That would be helpful for making C-q SPC in I-search DTRT both
> inside and outside brackets.

Yes, and it would also help the previous code (before your changes) make SPC
DTRT both inside and outside brackets.  I thought your change was trying to
solve exactly this problem, and that it ends up just pushing it from SPC to
C-q SPC.


        Stefan
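
To make the bracket problem concrete, here is a rough Emacs Lisp sketch of
the kind of helper discussed above.  All three function names are made up
for illustration (none of them exist in Emacs, and this is not the proposed
parse-partial-regex itself).  The scan is simplified: it assumes a
backslash quotes the next character outside brackets, that a `]' right
after `[' or `[^' is literal, and that classes such as [:space:] are
well-formed.  The replacement is written as "\\s-+", since "\s" inside an
Emacs Lisp string literal is read as a plain space.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-regexp-bracket-spans (regexp)
  "Return a list of (OPEN . CLOSE) index pairs for bracket expressions in REGEXP.
OPEN is the index of `[' and CLOSE the index of the matching `]'
\(or the length of REGEXP if the bracket is unterminated).
Hypothetical helper; a simplified scan, not the real regex.c parser."
  (save-match-data
    (let ((i 0) (len (length regexp)) (spans nil))
      (while (< i len)
        (let ((c (aref regexp i)))
          (cond
           ;; Outside brackets, a backslash quotes the next character.
           ((eq c ?\\) (setq i (+ i 2)))
           ((eq c ?\[)
            (let ((open i))
              (setq i (1+ i))
              ;; Skip a negating `^'; a `]' in first position is literal.
              (when (and (< i len) (eq (aref regexp i) ?^)) (setq i (1+ i)))
              (when (and (< i len) (eq (aref regexp i) ?\])) (setq i (1+ i)))
              ;; Advance to the closing `]', stepping over [:class:] names.
              (while (and (< i len) (not (eq (aref regexp i) ?\])))
                (let ((class-end (and (eq (aref regexp i) ?\[)
                                      (< (1+ i) len)
                                      (eq (aref regexp (1+ i)) ?:)
                                      (string-match ":\\]" regexp (+ i 2)))))
                  (setq i (if class-end (+ class-end 2) (1+ i)))))
              (push (cons open i) spans)
              (setq i (1+ i))))
           (t (setq i (1+ i))))))
      (nreverse spans))))

(defun my-regexp-in-brackets-p (regexp pos)
  "Return non-nil if index POS of REGEXP lies strictly inside brackets."
  (let ((inside nil))
    (dolist (span (my-regexp-bracket-spans regexp) inside)
      (when (and (< (car span) pos) (< pos (cdr span)))
        (setq inside t)))))

(defun my-replace-spaces-outside-brackets (regexp replacement)
  "Return REGEXP with each run of unquoted spaces outside brackets
replaced by REPLACEMENT.  Bracket expressions are copied verbatim."
  (let ((spans (my-regexp-bracket-spans regexp))
        (i 0) (len (length regexp)) (parts nil))
    (while (< i len)
      (cond
       ;; At the start of a bracket expression: copy it untouched.
       ((and spans (= i (car (car spans))))
        (let ((end (min len (1+ (cdr (car spans))))))
          (push (substring regexp i end) parts)
          (setq i end
                spans (cdr spans))))
       ;; A quoted character (including a quoted space) is copied as is.
       ((eq (aref regexp i) ?\\)
        (let ((end (min len (+ i 2))))
          (push (substring regexp i end) parts)
          (setq i end)))
       ;; A run of spaces outside brackets becomes REPLACEMENT.
       ((eq (aref regexp i) ?\s)
        (while (and (< i len) (eq (aref regexp i) ?\s))
          (setq i (1+ i)))
        (push replacement parts))
       (t
        (push (string (aref regexp i)) parts)
        (setq i (1+ i)))))
    (apply #'concat (nreverse parts))))
```

With these definitions, (my-replace-spaces-outside-brackets "foo  bar[ x]" "\\s-+")
rewrites the two spaces after "foo" but leaves the space inside "[ x]"
alone, which is exactly the case where a plain replace-regexp-in-string
on " " goes wrong.  The same span list could be used to skip bracketed
"\\(" when counting groups, as regexp-opt-depth needs to.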