From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jean Louis
Newsgroups: gmane.emacs.help
Subject: Re: How to grep for a string spanning multiple lines?
Date: Sat, 26 Nov 2022 11:43:47 +0300
References: <8735a6cj2k.fsf@mbork.pl>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
To: tomas@tuxteam.de
Cc: help-gnu-emacs@gnu.org
List-Id: Users list for the GNU Emacs text editor
Xref: news.gmane.io gmane.emacs.help:141164

* tomas@tuxteam.de [2022-11-26 11:37]:
> Note that, at least, in Emacs, the POSIX character class [:space:]
> also matches line breaks. So if you always use [[:space:]]+ to
> separate your words, you might find what you are looking for.

Except that, for some reason, it does not work as expected here:

(string-match "[[:space:]]+" "Hello\nthere") ➜ nil

(string-match "[[:space:]]+" "Hello
there") ➜ nil

(xr "[[:space:]]+") ➜ (one-or-more space)

(rx (one-or-more space)) ➜ "[[:space:]]+"

That is why I had to write this:

;; `string-trim' comes from subr-x on Emacs versions where it is not preloaded.
(require 'subr-x)

(defun rcd-string-clean-whitespace (s)
  "Return trimmed string S with each run of whitespace replaced by one space."
  (replace-regexp-in-string
   ;; Match newlines explicitly, in addition to the whitespace class.
   (rx (one-or-more (or "\n" (any whitespace)))) " "
   (string-trim s)))

With that, it works as expected:

(rcd-string-clean-whitespace "Hello\nthere") ➜ "Hello there"

As far as I can tell, "[[:space:]]+" does not include "\n":

(replace-regexp-in-string "[[:space:]]+" " " "Hello\nthere") ➜ "Hello
there"

-- 
Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/
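
A possible explanation for the nil results above, offered as a sketch and not as a diagnosis: the Emacs Lisp manual defines [:space:] as matching any character that has whitespace syntax, so the outcome depends on the syntax table in effect when the match runs. In the standard syntax table a newline has whitespace syntax, while in emacs-lisp-mode-syntax-table it has comment-end syntax. The helper name below is hypothetical and exists only for this illustration.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-space-class-matches-newline-p (table)
  "Return t if \"[[:space:]]+\" matches the newline in \"Hello\\nthere\".
The match runs with syntax TABLE temporarily in effect."
  (with-syntax-table table
    (and (string-match "[[:space:]]+" "Hello\nthere") t)))

;; Standard syntax table: newline has whitespace syntax.
(my-space-class-matches-newline-p (standard-syntax-table))      ; ➜ t

;; Emacs Lisp syntax table: newline has comment-end syntax.
(my-space-class-matches-newline-p emacs-lisp-mode-syntax-table) ; ➜ nil
```

If this is what is happening, evaluating the original forms in a buffer that uses the Emacs Lisp syntax table (for example *scratch*) would give nil, while the same forms in a text-mode buffer would match. Spelling out "\n" in the regexp, as rcd-string-clean-whitespace does, avoids depending on the syntax table for newlines.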