From: "Drew Adams" <drew.adams@oracle.com>
To: "'Eric Abrahamsen'" <eric@ericabrahamsen.net>, <help-gnu-emacs@gnu.org>
Subject: RE: search across linebreaks
Date: Sun, 17 Feb 2013 07:52:35 -0800
Message-ID: <D2FA74E3555F429E9E79990ED7891A16@us.oracle.com>
In-Reply-To: <878v6nbd1i.fsf@ericabrahamsen.net>

> I'm going to need to do a large scale search-and-replace on a 
> series of text files, using a sort of dictionary or hash-table of 
> search terms and their replacement. The text files are filled
> to the usual fill column.  The search terms may be broken across
> linebreaks, and I'm not sure of the best way to handle this.
> If it was regular English words I could probably manage a
> programmatic version of `isearch-toggle-word', but in
> this case these are solid strings, and might be broken anywhere.
> 
> The two solutions I can think of are: 1) break up the characters
> in the search string and insert "\n?" between each one to create
> regexps to search on, and 2) unfill the whole file at the start
> of the procedure and then refill it afterwards. Neither of these
> seems like a great idea -- does anyone have any brighter ideas?

What's not clear is whether any of the newline chars are significant.  From what
you wrote I'm guessing no: they can all be ignored or just removed.  But in that
case, filling would mean filling one big paragraph.

Or perhaps consecutive newlines (\n\n) are significant, separating paragraphs?
In that case, you could remove all newlines except one for each consecutive
group (i.e., paragraph separation).
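
To make the "one newline per consecutive group" idea concrete, here is a
minimal sketch.  It is in Python rather than Emacs Lisp, purely as an
illustration, and the name `collapse_newlines' is hypothetical.  It assumes
that a lone newline is only a fill artifact inside a solid string (so it is
deleted outright, with no space put in its place) and that a run of two or more
newlines marks a paragraph break.

```python
import re


def collapse_newlines(text: str) -> str:
    """Drop lone newlines; reduce each run of 2+ newlines to a single one."""

    def fix(match: re.Match) -> str:
        # A run longer than one newline is treated as a paragraph separator
        # (an assumption); keep exactly one newline to mark it.
        return "\n" if len(match.group()) > 1 else ""

    return re.sub(r"\n+", fix, text)


if __name__ == "__main__":
    sample = "abc\ndef\n\nghi\njkl"
    print(repr(collapse_newlines(sample)))
```

After the replacements are done, each remaining newline marks a paragraph
boundary, so the paragraphs can be refilled one by one.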

Assuming no newlines are significant (or only one of each consecutive group is),
the two solutions you propose sound reasonable to me.  Which one to use might
depend on the size of the files: the time to remove newlines and later refill
versus the time to match the \n? regexps.
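
For the \n? approach, a rough sketch of how the regexps could be built from a
table of terms follows.  Again this is Python used only for illustration, not
Emacs Lisp, and `linebreak_pattern' and `replace_terms' are made-up names.  It
assumes that a term can be interrupted by at most one newline between any two
adjacent characters, and that the replacement text should be inserted without
the newline.

```python
import re


def linebreak_pattern(term: str) -> re.Pattern:
    """Compile a regexp matching TERM with an optional newline between chars."""
    # Escape each character so regexp-special characters match literally,
    # then allow one optional newline between adjacent characters.
    return re.compile(r"\n?".join(re.escape(ch) for ch in term))


def replace_terms(text: str, table: dict) -> str:
    """Apply every term -> replacement pair in TABLE to TEXT, in table order."""
    for term, replacement in table.items():
        # A function replacement keeps backslashes in the replacement literal.
        text = linebreak_pattern(term).sub(lambda m: replacement, text)
    return text


if __name__ == "__main__":
    print(repr(replace_terms("foo ab\ncd bar", {"abcd": "XY"})))
```

A replacement swallows any newline that fell inside the matched term, so the
affected lines can end up longer than the fill column and may need refilling
afterwards.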



