unofficial mirror of help-gnu-emacs@gnu.org
From: Kevin Rodgers <kevin.d.rodgers@gmail.com>
To: help-gnu-emacs@gnu.org
Subject: Re: file filtering
Date: Wed, 31 Jan 2007 22:55:44 -0700
Message-ID: <eprvd0$cs5$1@sea.gmane.org>
In-Reply-To: <utzy8plu9.fsf@gmail.com>

Peter Tury wrote:
> I would like to write an Emacs Lisp script that (filters +) modifies a
> file logically in the following way:
> 
> * processes the file content line by line
> 
> * if the line matches a given regexp, then replaces the line with
>   something built up from the matched regexp groups (\1...)
> 
> * otherwise deletes the line
> 
> I would like to use this script similarly to grep: so Emacs would run
> in the background (using the --script option at Emacs invocation).

(find-file FILENAME)
(shell-command-on-region (point-min) (point-max)
			 (format "sed -n s/%s/%s/p"
				 (shell-quote-argument REGULAR_EXPRESSION)
				 (shell-quote-argument REPLACEMENT))
			 nil t)
(save-buffer) ; or (write-file NEW_FILENAME)

> For this I am looking for some functions that I don't
> know:
> 
> * how to read a file without loading the whole file into memory
>   (i.e. e.g. without loading it into a buffer)
> 
> E.g. I thought of a solution where I would read from the file only
> strings that match a given regexp. Something like
> (insert-file-contents filename regexp). (In the simplest case
> regexp would be "^.*$".) Is this possible?

(shell-command (format "grep %s %s"
			(shell-quote-argument REGULAR_EXPRESSION)
			(shell-quote-argument FILENAME))
		t)
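
If you would rather not go through a shell at all, call-process passes
each argument to grep verbatim, so no quoting is needed.  This is only
a sketch: it assumes a grep program on your PATH, the function name is
made up for illustration, and the regexp is in grep's syntax, not
Emacs's.

```elisp
(defun my-insert-matching-lines (regexp filename)
  "Insert the lines of FILENAME that match REGEXP at point.
REGEXP uses grep's syntax.  Return grep's exit status."
  ;; DESTINATION t sends grep's output to the current buffer.
  ;; \"-e\" protects a REGEXP that starts with a dash, and \"--\"
  ;; protects a FILENAME that starts with one.
  (call-process "grep" nil t nil "-e" regexp "--" filename))
```

Called inside with-temp-buffer, this leaves only the selected lines in
the buffer, which is as close to (insert-file-contents filename regexp)
as I know how to get.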

> Then, the second step would be to replace the just inserted text, so
> something like the following would be even better
> (insert-file-contents filename regexp replace-match-first-arg): this
> would find the regexp in filename, replace the found string according
> to replace-match (in memory) and insert only the result into the buffer. 

(replace-regexp REGEXP TO-STRING nil (point-min) (point-max))
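
replace-regexp is really meant for interactive use; in a Lisp program
the usual idiom is an explicit search-and-replace loop.  A minimal
sketch (the function name is made up, and it assumes REGEXP never
matches the empty string, which would make the loop spin forever):

```elisp
(defun my-replace-all (regexp to-string)
  "Replace every match of REGEXP in the current buffer with TO-STRING.
TO-STRING is interpreted as in `replace-match'.
Return the number of replacements made."
  (let ((case-fold-search nil)          ; match case-sensitively, like sed
        (count 0))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        ;; FIXEDCASE t: insert TO-STRING as is, without case conversion.
        (replace-match to-string t)
        (setq count (1+ count))))
    count))
```

For example, (my-replace-all "^\\([a-z]+\\) \\([0-9]+\\)$" "\\2:\\1")
swaps the two fields of each matching line.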

> Then (after a while loop that processes the whole file), the third
> step would be to write the result into a new file, so the best would
> be something like this :-) (append-to-file to-filename from-filename
> regexp-to-read replace-match-first-arg-to-append)

(write-file NEW_FILENAME)

> I think I could create these functions if I knew how to read a
> portion (not fixed number of chars!) of a file...

Use an external command like grep to select the desired lines.  But
since you need to do that, you may as well use an external command like
sed to do the whole replacement -- otherwise, you're matching the
regular expression twice, once outside emacs to select the lines to
insert into the buffer and once inside emacs to find the text to
replace.
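
If you want to stay inside Emacs anyway, the same match-only-once idea
can be done in Lisp: search each line once, then either rewrite it or
delete it.  This is a rough sketch, not a drop-in tool: the function
name is made up, the regexp is in Emacs syntax, and it does read the
whole file into a (temporary) buffer.

```elisp
(defun my-filter-file (filename new-filename regexp to-string)
  "Write to NEW-FILENAME the lines of FILENAME that match REGEXP.
In each kept line the first match is replaced with TO-STRING, roughly
like sed -n s/REGEXP/TO-STRING/p but with Emacs regexp syntax."
  (with-temp-file new-filename
    (insert-file-contents filename)
    (let ((case-fold-search nil))       ; match case-sensitively
      (goto-char (point-min))
      ;; Invariant: point is at the beginning of a line.
      (while (not (eobp))
        (if (re-search-forward regexp (line-end-position) t)
            (progn
              (replace-match to-string t) ; t: no case conversion
              (forward-line 1))
          ;; No match: delete the line together with its newline.
          (delete-region (line-beginning-position)
                         (line-beginning-position 2)))))))
```

To replace the whole line rather than just the matched part, anchor the
regexp with ^ and $ so that the match covers the entire line.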

> My problem is this: if I work on buffers (instead of files), I have to
> create two buffers: one that corresponds to the original file and one
> that corresponds to the result file -- or otherwise I have to delete
> those portions of the first buffer that were not matched by the regexp
> searches -- and I don't know how to do it simply :-( Or isn't using two
> buffers (strings??) (and storing the two files in them) for such a
> task an ugly solution?
> 
> How to solve this task in the simplest way?

(with-temp-file NEW_FILENAME
   (shell-command (format "sed -n s/%s/%s/p %s"
			 (shell-quote-argument REGULAR_EXPRESSION)
			 (shell-quote-argument REPLACEMENT)
			 (shell-quote-argument FILENAME))
		 t		       ; output-buffer: (current-buffer)
		 nil))

-- 
Kevin Rodgers
Denver, Colorado, USA
