* Re: bulk replacement on region, buffer, file?
[not found] <mailman.1767.1449714096.31583.help-gnu-emacs@gnu.org>
@ 2015-12-10 3:13 ` Pascal J. Bourguignon
2015-12-15 4:16 ` Tom Roche
0 siblings, 1 reply; 5+ messages in thread
From: Pascal J. Bourguignon @ 2015-12-10 3:13 UTC
To: help-gnu-emacs
Tom Roche <Tom_Roche@pobox.com> writes:
> I would appreciate pointers to code that enables "bulk replacement" of
> numerous string tuples ({to-replace, replace-with}) in a single
> call. What I mean, why I ask:
>
> I frequently scrape blocks of text from PDFs into Emacs text
> buffers. After I do so, I usually want to replace lots of strings in
> the buffer. E.g. (using '|' to delimit the strings),
>
> |CO 2| -> |CO2|
> |- | -> ||
> |“| -> |"|
> |”| -> |"|
> |[weird unicodes used for bulleting]| -> |*|
>
> which I do manually by calling `M-x replace-string` or similar
> interactive or regexp function. I'd prefer instead to call something
> that
>
> 1. could be called on a region (if selected) or buffer (if not)
You can use functions that are not designed to work on a region by
restricting them to a narrowed region with narrow-to-region. (This is
why it is important to always use point-min and point-max, and not e.g. 1
and (1+ (buffer-size)), because point-min and point-max take the
narrowing into account.)
(save-excursion
  (save-restriction                ; restores the previous narrowing on exit
    (narrow-to-region start end)
    ...))
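As a small illustration of the narrowing idiom, here is a sketch that counts occurrences of a string between two positions. The function name `my-count-matches-in` is made up for this example. It relies only on point-min, so the same loop works on a region or on the whole buffer, and save-restriction puts the previous narrowing back afterwards.

```elisp
;; Hypothetical helper, not part of Emacs or pjb-emacs.el.
(defun my-count-matches-in (start end string)
  "Return how many times STRING occurs between START and END."
  (save-excursion                      ; restore point afterwards
    (save-restriction                  ; restore the narrowing afterwards
      (narrow-to-region start end)
      (goto-char (point-min))          ; beginning of the narrowed part
      (let ((count 0))
        ;; A nil BOUND means "search up to point-max", which is the
        ;; end of the narrowed part here.
        (while (search-forward string nil t)
          (setq count (1+ count)))
        count))))
```

Called with (point-min) and (point-max) it covers the whole accessible buffer; called with region bounds it sees only the region.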
> 2. could read from a user-editable property file of replacement tuples
> (like those above), similar to `abbrev_defs` but without some
> constraints of the latter that annoy in this usecase. E.g. (unless I'm
> missing something), I cannot use `abbrev` to replace the
> space-delimited 'CO 2' with 'CO2'.
You can read lisp sexps from files with:
(with-file "~/.your-replacements.sexp"
  (goto-char (point-min)) ; in case the file is already open.
  (read (current-buffer)))
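Since with-file is not built into Emacs (it comes from pjb-emacs.el, see below), here is a sketch of the same idea using only built-in functions. The name `my-read-replacements` is an assumption for this example, and the file is assumed to contain a single list such as (("CO 2" . "CO2") ("- " . "")).

```elisp
;; Hypothetical helper built from standard Emacs functions only.
(defun my-read-replacements (file)
  "Return the first Lisp object read from FILE."
  (with-temp-buffer
    (insert-file-contents file)        ; does not visit the file
    (goto-char (point-min))
    (read (current-buffer))))
```

Because the file is read into a temporary buffer, this does not disturb a buffer that may already be visiting the file.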
> 3. would, for every {to-replace, replace-with} tuple in the file,
>
> * if `to-replace` found, replace every instance with `replace-with`
> * if `to-replace` not found, goto next tuple
>
> Is there elisp to do this?
Yes.
I use:
(progn (goto-char (point-min))
       (replace-multiple-strings
        '(("CO 2" . "CO2")
          ("- " . "")
          ("“" . "\"")
          ("”" . "\"")
          ("[weird unicodes used for bulleting]" . "*"))))
So wrapping all together:
(save-excursion
  (save-restriction                ; restores the previous narrowing on exit
    (narrow-to-region start end)
    (goto-char (point-min))
    (replace-multiple-strings
     (with-file "~/.your-replacements.sexp"
       (goto-char (point-min)) ; in case the file is already open.
       (read (current-buffer))))))
with-file and replace-multiple-strings are found in pjb-emacs.el
https://github.com/informatimago/emacs/blob/master/pjb-emacs.el
--
__Pascal Bourguignon__ http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk
* Re: bulk replacement on region, buffer, file?
From: Tom Roche @ 2015-12-15 4:16 UTC
To: help-gnu-emacs
summary: elisp newbie needs help fixing code @ https://bitbucket.org/tlroche/elisp_bulk_replacement
details:
Apologies for the delay in replying:
Tom Roche[1]
>>> I would appreciate pointers to code that enables "bulk replacement" of numerous string tuples ({to-replace, replace-with}) in a single call[, such that it]
>>> 1. could be called on a region (if selected) or buffer (if not)
>>> 2. could read from a user-editable property file of replacement tuples [...]
>>> 3. would, for every {to-replace, replace-with} tuple in the file,
>>> * if `to-replace` found, replace every instance with `replace-with`
>>> * if `to-replace` not found, goto next tuple
Pascal J. Bourguignon[2]
>> wrapping all together:
>> (save-excursion
>>   (narrow-to-region start end)
>>   (goto-char (point-min))
>>   (replace-multiple-strings
>>    (with-file "~/.your-replacements.sexp"
>>      (goto-char (point-min)) ; in case the file is already open.
>>      (read (current-buffer)))))
>> with-file and replace-multiple-strings are found in pjb-emacs.el[6]
I've got 3 buffers open (among many others :-), with
1. one buffer on file[3] containing some {to-replace, replace-with} tuples as sexp's, open locally @ filepath=`$HOME/.emacs.d/tlr_bulk_replacements.sexp`
2. another buffer containing text to be bulk-replaced (interspersed with other text). A sample from that buffer ("suitable for testing") is @ [4]
3. yet another buffer[5] containing
* the relevant bits of pjb-emacs.el[6]
* the path to the sexp's file as `BULK-REPLACE-TUPLES-FILEPATH`
* my attempt to transcribe the desired 'wrapping all together' function
However, when I run `M-x bulk-replace-current-buffer-with-tuples-from-file` (defined in the code) in the sample-text buffer[4], I get the error
*Messages*
> save-excursion: Symbol's value as variable is void: start
So how can I make the code[5] set `start` and `end` appropriately? That is:
* if the function is called with a region set, `start`==region start && `end`==region end
* if the function is called without a region set, `start`==buffer start && `end`==buffer end
Your assistance is appreciated, Tom Roche <Tom_Roche@pobox.com>
[1]: http://lists.gnu.org/archive/html/help-gnu-emacs/2015-12/msg00077.html
[2]: http://lists.gnu.org/archive/html/help-gnu-emacs/2015-12/msg00079.html
[3]: https://bitbucket.org/tlroche/elisp_bulk_replacement/src/HEAD/sample_replacements.sexp
[4]: https://bitbucket.org/tlroche/elisp_bulk_replacement/src/HEAD/sample_input.txt
[5]: https://bitbucket.org/tlroche/elisp_bulk_replacement/src/HEAD/test_code.el
[6]: https://github.com/informatimago/emacs/blob/master/pjb-emacs.el
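One common way to get `start` and `end` set as asked above is to make them arguments of the command and compute them in its `interactive` form. The following is only a sketch under assumptions: the names `my-bulk-replace` and `my-bulk-replace-pairs` are invented here, the pairs are held in a variable rather than read from a file, and the replacement loop is a plain stand-in, not the replace-multiple-strings from pjb-emacs.el.

```elisp
;; Hypothetical pair list; "\u201C" and "\u201D" are the curly double quotes.
(defvar my-bulk-replace-pairs
  '(("CO 2" . "CO2")
    ("\u201C" . "\"")
    ("\u201D" . "\""))
  "Alist of (FROM . TO) strings, applied in order.")

(defun my-bulk-replace (start end)
  "Apply `my-bulk-replace-pairs' to the text between START and END.
Interactively, use the active region if there is one, else the
whole accessible buffer."
  (interactive
   (if (use-region-p)
       (list (region-beginning) (region-end))
     (list (point-min) (point-max))))
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (let ((case-fold-search nil))    ; match FROM exactly, case included
        (dolist (pair my-bulk-replace-pairs)
          (goto-char (point-min))
          (while (search-forward (car pair) nil t)
            ;; FIXEDCASE and LITERAL are both t: insert TO verbatim.
            (replace-match (cdr pair) t t)))))))
```

A FROM string that never occurs simply makes the inner loop exit at once, so the next pair is tried. Narrowing also takes care of the region end moving as replacements change the text length.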
* bulk replacement on region, buffer, file?
From: Tom Roche @ 2015-12-10 2:21 UTC
To: help-gnu-emacs
I would appreciate pointers to code that enables "bulk replacement" of numerous string tuples ({to-replace, replace-with}) in a single call. What I mean, why I ask:
I frequently scrape blocks of text from PDFs into Emacs text buffers. After I do so, I usually want to replace lots of strings in the buffer. E.g. (using '|' to delimit the strings),
|CO 2| -> |CO2|
|- | -> ||
|“| -> |"|
|”| -> |"|
|[weird unicodes used for bulleting]| -> |*|
which I do manually by calling `M-x replace-string` or similar interactive or regexp function. I'd prefer instead to call something that
1. could be called on a region (if selected) or buffer (if not)
2. could read from a user-editable property file of replacement tuples (like those above), similar to `abbrev_defs` but without some constraints of the latter that annoy in this usecase. E.g. (unless I'm missing something), I cannot use `abbrev` to replace the space-delimited 'CO 2' with 'CO2'.
3. would, for every {to-replace, replace-with} tuple in the file,
* if `to-replace` found, replace every instance with `replace-with`
* if `to-replace` not found, goto next tuple
Is there elisp to do this? Alternatively, pointers to non-elisp (that I could invoke on a buffer's file and then `revert-buffer`) would also be appreciated. (And, yes, I know this sounds easy to write, but I have other priorities at present and no wish to reinvent any well-working wheels.)
Apologies if this is a FAQ, but a brief websearch found nothing that looked useful.
TIA, Tom Roche <Tom_Roche@pobox.com>
* Re: bulk replacement on region, buffer, file?
From: Emanuel Berg @ 2015-12-10 3:21 UTC
To: help-gnu-emacs
Tom Roche <Tom_Roche@pobox.com> writes:
> I would appreciate pointers to code that enables
> "bulk replacement" of numerous string tuples
> ({to-replace, replace-with}) in a single call.
> What I mean, why I ask:
>
> I frequently scrape blocks of text from PDFs into
> Emacs text buffers. After I do so, I usually want to
> replace lots of strings in the buffer. E.g. (using
> '|' to delimit the strings),
>
> |CO 2| -> |CO2|
> |- | -> ||
> |“| -> |"|
> |”| -> |"|
> |[weird unicodes used for bulleting]| -> |*|
I hear you - everything is fair in the struggle against
those goofy chars! Down with unicode!
(Except: putting them as a quote when they aren't!)
Aaanyway...
Probably the best way is to use set functions - another
good way, though, is recursion. And I'm not just saying
that...
(defun replace-strings (tuple-list)
  (when tuple-list
    (let* ((tuple          (car tuple-list))
           (rest           (cdr tuple-list))
           (replace-match  (car tuple))
           (replace-string (cadr tuple)))
      (goto-char (point-min))
      (while (re-search-forward replace-match (point-max) t) ; NOERROR
        (replace-match replace-string))
      (replace-strings rest))))

;; Eval this to fix the below typos:
(replace-strings '(("Robb Hall" "Rob Hall")
                   ("Scott Ficsher" "Scott Fischer")))
;; Robb Hall
;;
;; Scott Ficsher
;;
;; Robb Hall
;;
;; Scott Ficsher
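Note that re-search-forward treats the first element of each tuple as a regexp, so a string like "[x]" or "a.c" would match more (or less) than its literal text. A possible variant, sketched here under the assumption that the tuples are meant as plain strings, quotes each string with regexp-quote and loops with dolist instead of recursing. The name `my-replace-strings-literally` is made up for this example.

```elisp
;; Hypothetical literal-string variant; takes the same (FROM TO) tuples.
(defun my-replace-strings-literally (tuple-list)
  "For each (FROM TO) in TUPLE-LIST, replace the literal text FROM with TO."
  (save-excursion
    (dolist (tuple tuple-list)
      ;; regexp-quote escapes characters that are special in regexps.
      (let ((regexp (regexp-quote (car tuple))))
        (goto-char (point-min))
        (while (re-search-forward regexp nil t)
          ;; LITERAL t: backslashes in TO are not interpreted either.
          (replace-match (cadr tuple) t t))))))
```

Using dolist also avoids one level of recursion per tuple, which may matter for a long replacement list.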
--
underground experts united
http://user.it.uu.se/~embe8573
* Re: bulk replacement on region, buffer, file?
From: Bob Proulx @ 2015-12-10 16:40 UTC
To: help-gnu-emacs
Tom Roche wrote:
> I would appreciate pointers to code that enables "bulk replacement"
> of numerous string tuples ({to-replace, replace-with}) in a single
> call. What I mean, why I ask:
To handle the UTF-8 translations I like 'iconv'. It handles many
different types of transliterations.
$ echo '“foo”' | iconv -f UTF-8 -t ASCII//TRANSLIT
"foo"
If it were me I would do a first pass using iconv to transliterate
characters, and then perform the other replacements you want in a
second pass.
Bob
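The iconv pass can also be run from inside Emacs on a region, without going through the file. This is a sketch only: the command name `my-iconv-translit-region` is invented, it assumes an `iconv` program that accepts //TRANSLIT is on the PATH, and the exact transliterations depend on the iconv implementation and locale.

```elisp
;; Hypothetical wrapper around the iconv command line shown above.
(defun my-iconv-translit-region (start end)
  "Replace the text between START and END with its iconv transliteration.
Return the exit status of the shell command."
  (interactive "r")
  (let ((coding-system-for-write 'utf-8)   ; send the region as UTF-8
        (coding-system-for-read 'utf-8))   ; decode iconv's output as UTF-8
    ;; OUTPUT-BUFFER is nil and REPLACE is t: the output overwrites
    ;; the region in the current buffer.
    (shell-command-on-region
     start end "iconv -f UTF-8 -t ASCII//TRANSLIT" nil t)))
```

The remaining string replacements can then be applied to the same region as a second pass.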