From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Drew Adams" <drew.adams@oracle.com>
Newsgroups: gmane.emacs.help
Subject: RE: search across linebreaks
Date: Sun, 17 Feb 2013 07:52:35 -0800
Message-ID: <D2FA74E3555F429E9E79990ED7891A16@us.oracle.com>
References: <878v6nbd1i.fsf@ericabrahamsen.net>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Trace: ger.gmane.org 1361116376 28065 80.91.229.3 (17 Feb 2013 15:52:56 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Sun, 17 Feb 2013 15:52:56 +0000 (UTC)
To: "'Eric Abrahamsen'" <eric@ericabrahamsen.net>, <help-gnu-emacs@gnu.org>
Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Sun Feb 17 16:53:19 2013
Return-path: <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>
Envelope-to: geh-help-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>)
	id 1U76YH-0003YJ-Fe
	for geh-help-gnu-emacs@m.gmane.org; Sun, 17 Feb 2013 16:53:17 +0100
Original-Received: from localhost ([::1]:42906 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>)
	id 1U76Xx-0004oz-Ng
	for geh-help-gnu-emacs@m.gmane.org; Sun, 17 Feb 2013 10:52:57 -0500
Original-Received: from eggs.gnu.org ([208.118.235.92]:45077)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <drew.adams@oracle.com>) id 1U76Xr-0004oo-GX
	for help-gnu-emacs@gnu.org; Sun, 17 Feb 2013 10:52:52 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <drew.adams@oracle.com>) id 1U76Xq-000580-Cs
	for help-gnu-emacs@gnu.org; Sun, 17 Feb 2013 10:52:51 -0500
Original-Received: from aserp1040.oracle.com ([141.146.126.69]:48453)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <drew.adams@oracle.com>) id 1U76Xq-00057n-74
	for help-gnu-emacs@gnu.org; Sun, 17 Feb 2013 10:52:50 -0500
Original-Received: from ucsinet21.oracle.com (ucsinet21.oracle.com [156.151.31.93])
	by aserp1040.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with
	ESMTP id r1HFqlsK002137
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Sun, 17 Feb 2013 15:52:47 GMT
Original-Received: from acsmt356.oracle.com (acsmt356.oracle.com [141.146.40.156])
	by ucsinet21.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id
	r1HFqkJ5011009
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 17 Feb 2013 15:52:46 GMT
Original-Received: from abhmt105.oracle.com (abhmt105.oracle.com [141.146.116.57])
	by acsmt356.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id
	r1HFqjSD027629; Sun, 17 Feb 2013 09:52:45 -0600
Original-Received: from dradamslap1 (/71.202.147.44)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Sun, 17 Feb 2013 07:52:45 -0800
X-Mailer: Microsoft Office Outlook 11
In-Reply-To: <878v6nbd1i.fsf@ericabrahamsen.net>
Thread-Index: Ac4M+5eYsEFT5m51TIWz3ioFwFXLtgAKfRnw
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
X-Source-IP: ucsinet21.oracle.com [156.151.31.93]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic]
X-Received-From: 141.146.126.69
X-BeenThere: help-gnu-emacs@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Users list for the GNU Emacs text editor <help-gnu-emacs.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/help-gnu-emacs>,
	<mailto:help-gnu-emacs-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/help-gnu-emacs>
List-Post: <mailto:help-gnu-emacs@gnu.org>
List-Help: <mailto:help-gnu-emacs-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/help-gnu-emacs>,
	<mailto:help-gnu-emacs-request@gnu.org?subject=subscribe>
Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Original-Sender: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.help:89131
Archived-At: <http://permalink.gmane.org/gmane.emacs.help/89131>

> I'm going to need to do a large scale search-and-replace on a 
> series of text files, using a sort of dictionary or hash-table of 
> search terms and their replacement. The text files are filled
> to the usual fill column.  The search terms may be broken across
> linebreaks, and I'm not sure of the best way to handle this.
> If it was regular English words I could probably manage a
> programmatic version of `isearch-toggle-word', but in
> this case these are solid strings, and might be broken anywhere.
> 
> The two solutions I can think of are: 1) break up the characters
> in the search string and insert "\n?" between each one to create
> regexps to search on, and 2) unfill the whole file at the start
> of the procedure and then refill it afterwards. Neither of these
> seems like a great idea -- does anyone have any brighter ideas?

What's not clear is whether any of the newline chars are significant.  From what
you wrote I'm guessing not: they can all be ignored or just removed.  But in that
case, refilling afterward would mean filling each file as one big paragraph.

Or perhaps consecutive newlines (\n\n) are significant, separating paragraphs?
In that case, you could remove all newlines except one for each consecutive
group (i.e., paragraph separation).
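
A minimal sketch of that newline collapsing, written in Python purely for
illustration (the function name `unfill` is hypothetical).  It assumes a lone
newline is only a fill break and that broken strings should be rejoined with
nothing between the halves; if your line breaks stand in for spaces, join with
a space instead.

```python
import re


def unfill(text: str) -> str:
    """Delete lone newlines; keep one newline per run of two or more.

    Assumption: a run of 2+ newlines marks a paragraph boundary, while a
    single newline is only a fill break inside a paragraph.
    """
    def collapse(match: re.Match) -> str:
        # Paragraph separator: keep exactly one newline.
        # Fill break: drop the newline entirely (no space is inserted).
        return "\n" if len(match.group()) > 1 else ""

    return re.sub(r"\n+", collapse, text)
```

After the replacements are done, each remaining newline marks a paragraph
boundary, so the text can be refilled paragraph by paragraph.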

Assuming no newlines are significant (or only one of each consecutive group is),
the two solutions you propose sound reasonable to me.  Which of them to use might
depend on file size and so on: the time to remove newlines and later refill versus
the time to match the \n? regexps.
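
For the first approach, a rough sketch of building the regexps, again in Python
only for illustration.  The names `newline_tolerant_pattern` and
`replace_terms` are hypothetical, and the sketch assumes each search term is a
non-empty literal string that contains no newline itself.

```python
import re


def newline_tolerant_pattern(term: str) -> re.Pattern:
    """Compile a regexp that matches `term` with an optional newline
    between any two adjacent characters."""
    # Escape each character so regexp-special characters match literally,
    # then join the pieces with an optional newline.
    return re.compile(r"\n?".join(re.escape(ch) for ch in term))


def replace_terms(text: str, table: dict) -> str:
    """Apply every term -> replacement pair in `table` to `text`."""
    for term, replacement in table.items():
        pattern = newline_tolerant_pattern(term)
        # A function replacement avoids backslash processing in `replacement`.
        text = pattern.sub(lambda m: replacement, text)
    return text
```

Note that a match that spans a line break swallows the newline, so the affected
lines may still need refilling.  The terms are also applied one after another,
so a later term can match text produced by an earlier replacement.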