* How to circumvent warning in batch mode
From: Decebal @ 2009-10-08 23:44 UTC (permalink / raw)
To: help-gnu-emacs
I have the following code:
emacs -batch -nw --eval='
(let ((match-length)
      (reg-exp "^ +")
      (substitute-str "@"))
  (find-file "input")
  (goto-char (point-min))
  (while (re-search-forward "^ +" nil t)
    (setq match-length (- (point) (match-beginning 0)))
    (while (> match-length (length substitute-str))
      (setq substitute-str (concat substitute-str substitute-str)))
    (replace-match (substring substitute-str 0 match-length)))
  (write-file "outputEmacs"))
'
I have several questions about it.
The input file is quite big and I get:
File input is large (31MB), really open? (y or n)
Is there a way to circumvent this?
Is there a way to do this more efficiently? This script needs about 20
seconds. When doing it with a Perl script, it takes about 6 seconds.
Instead of '@' (chr$(64)) I would like to use an NBSP (chr$(160)).
But then the script needs almost 3 minutes. Also, every space
is replaced by two characters: chr$(194) + chr$(160).
What is going wrong here?
* Re: How to circumvent warning in batch mode
From: Kevin Rodgers @ 2009-10-09 13:43 UTC (permalink / raw)
To: help-gnu-emacs
Decebal wrote:
> I have the following code:
> emacs -batch -nw --eval='
> (let (
> (match-length)
> (reg-exp "^ +")
> (substitute-str "@")
> )
> (find-file "input")
> (goto-char (point-min))
> (while (re-search-forward "^ +" nil t)
> (setq match-length (- (point) (match-beginning 0)))
> (while (> match-length (length substitute-str))
> (setq substitute-str (concat substitute-str substitute-str)))
> (replace-match (substring substitute-str 0 match-length))
> )
> (write-file "outputEmacs")
> )
> '
> I have several questions about it.
> The input file is quite big and I get:
> File input is large (31MB), really open? (y or n)
> Is there a way to circumvent this?
let-bind large-file-warning-threshold to nil around the call to find-file.
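A minimal sketch of that let-binding, assuming the file is opened from Lisp rather than interactively. The function name my-open-without-size-prompt is hypothetical, chosen only for this example:

```elisp
(defun my-open-without-size-prompt (file)
  "Visit FILE and return its buffer, skipping the \"really open?\" prompt.
`large-file-warning-threshold' is bound to nil only for the duration
of the call, so the user's global setting is left alone."
  (let ((large-file-warning-threshold nil))
    (find-file-noselect file)))
```

In the original script this would wrap the (find-file "input") call; the rest of the code stays the same.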
> Is there a way to do this more efficiently? This script needs about 20
> seconds. When doing it with a Perl script, it takes about 6 seconds.
1. Put the code in a file (FILE.el) and byte-compile it. Then instead of
--eval 'CODE' on the command line, use --load FILE.elc
2. It looks like you are doing a lot of unnecessary string allocation with
concat and substring:
For every character after the first character in the match, you double the
length of the replacement string until it is at least as long as the length
of the match string, then you only use the number of characters that were in
the match string anyway. Change the loop to:
(while (re-search-forward "^ +" nil t)
  (setq match-length (- (point) (match-beginning 0)))
  (if (> match-length 1)
      (replace-match (make-string match-length ?@))
    (replace-match "@")))
That could be improved further by caching each replacement string of length
> 1, so it is only allocated once... But now, I can see that my version
using make-string does the same amount of string allocation as yours using
substring, and that your use of concat is infrequent (only needed when the
match string jumps to a larger length than has been seen so far). So caching
the replacement string (in an array, indexed by its length) is the way to go.
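The caching idea could look roughly like the sketch below. It is only an illustration: it uses a hash table keyed by length instead of an array, so no maximum length has to be guessed, and the function name my-replace-leading-spaces is made up for this example.

```elisp
(defun my-replace-leading-spaces (char)
  "Replace each run of leading spaces in the current buffer with CHAR.
Every run is replaced by a string of the same length.  Replacement
strings are cached by length, so each one is allocated only once."
  (let ((cache (make-hash-table :test 'eql)))
    (goto-char (point-min))
    (while (re-search-forward "^ +" nil t)
      (let* ((len (- (match-end 0) (match-beginning 0)))
             ;; Look the string up; build and store it on a cache miss.
             (str (or (gethash len cache)
                      (puthash len (make-string len char) cache))))
        ;; FIXEDCASE and LITERAL are t: insert STR exactly as it is.
        (replace-match str t t)))))
```

Called as (my-replace-leading-spaces ?@) in the buffer that holds the input. Whether the cache is a measurable win depends on how many distinct indentation widths the file contains.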
> Instead of '@' (chr$(64)) I would like to use an NBSP (chr$(160)).
> But then the script needs almost 3 minutes. Also, every space
> is replaced by two characters: chr$(194) + chr$(160).
> What is going wrong here?
In UTF-8, NBSP (U+00A0) is 2 bytes: decimal 194 160, aka hex C2 A0.
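This can be checked from Lisp. A small sketch, assuming an Emacs with Unicode-based internals (23 or later); my-utf-8-bytes is a hypothetical helper name:

```elisp
(defun my-utf-8-bytes (char)
  "Return the list of byte values that encode CHAR in UTF-8."
  (append (encode-coding-string (string char) 'utf-8) nil))

(my-utf-8-bytes #xA0) ; NBSP -> (194 160), i.e. hex C2 A0
(my-utf-8-bytes ?@)   ; plain ASCII -> (64)
```

So a single NBSP character in the buffer is written out as two bytes when the file is saved as UTF-8.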
--
Kevin Rodgers
Denver, Colorado, USA
* Re: How to circumvent warning in batch mode
From: Andreas Politz @ 2009-10-09 14:42 UTC (permalink / raw)
To: help-gnu-emacs
Kevin Rodgers <kevin.d.rodgers@gmail.com> writes:
> Decebal wrote:
>> I have the following code:
>> emacs -batch -nw --eval='
>> (let (
>> (match-length)
>> (reg-exp "^ +")
>> (substitute-str "@")
>> )
>> (find-file "input")
>> (goto-char (point-min))
>> (while (re-search-forward "^ +" nil t)
>> (setq match-length (- (point) (match-beginning 0)))
>> (while (> match-length (length substitute-str))
>> (setq substitute-str (concat substitute-str substitute-str)))
>> (replace-match (substring substitute-str 0 match-length))
>> )
>> (write-file "outputEmacs")
>> )
>> '
>> I have several questions about it.
>> The input file is quite big and I get:
>> File input is large (31MB), really open? (y or n)
>> Is there a way to circumvent this?
>
> let-bind large-file-warning-threshold to nil around the call to find-file.
>
>> Is there a way to do this more efficiently? This script needs about 20
>> seconds. When doing it with a Perl script, it takes about 6 seconds.
>
> 1. Put the code in a file (FILE.el) and byte-compile it. Then instead of
> --eval 'CODE' on the command line, use --load FILE.elc
>
> 2. It looks like you are doing a lot of unnecessary string allocation with
> concat and substring:
>
I would suggest removing the body of the while-loop, in order to see if
there is actually a significant amount of time spent there.
Depending on the file, a great deal probably goes into the
initialization of the major-mode. Maybe you can use
`find-file-literally' or some other means, I don't know.
-ap
* Re: How to circumvent warning in batch mode
From: Decebal @ 2009-10-10 8:23 UTC (permalink / raw)
To: help-gnu-emacs
On Oct 9, 4:42 pm, Andreas Politz <poli...@fh-trier.de> wrote:
> I would suggest removing the body of the while-loop, in order to see if
> there is actually a significant amount of time spent there.
That is where most of the time is spent. Without the inner loop it took 5 seconds.
Without the search for the regexp, 3.5 seconds.
And without the write, half a second.
The complete script takes 17.5 seconds.
When the inner loop only has the setq for match-length it takes 5.5
seconds.
When I also have the loop to increase substitute-str it takes 6.5
seconds.
When I change the code to:
(while (re-search-forward reg-exp nil t)
  (replace-match substitute-str))
then it takes 15 seconds.
So it looks like replace-match is very expensive. A candidate for
optimization?
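Instead of commenting parts of the script in and out, the same comparison can be sketched with `benchmark-run' on a synthetic buffer. This is only an illustration: the function name my-time-search-vs-replace and the generated test lines are assumptions, and the numbers will differ from those for the real 31MB file.

```elisp
(require 'benchmark)

(defun my-time-search-vs-replace (lines)
  "Return (SEARCH-SECONDS . REPLACE-SECONDS) for a buffer of LINES lines.
The buffer is synthetic: every line starts with four spaces."
  (with-temp-buffer
    (dotimes (_ lines)
      (insert "    some text\n"))
    (let ((search-only
           ;; Time the regexp search alone, with an empty loop body.
           (car (benchmark-run 1
                  (progn
                    (goto-char (point-min))
                    (while (re-search-forward "^ +" nil t))))))
          (search-and-replace
           ;; Time the same search plus a fixed-string replacement.
           (car (benchmark-run 1
                  (progn
                    (goto-char (point-min))
                    (while (re-search-forward "^ +" nil t)
                      (replace-match "@" t t)))))))
      (cons search-only search-and-replace))))
```

For example, (my-time-search-vs-replace 100000) returns a pair of elapsed times in seconds; the difference between the two gives a rough idea of what replace-match itself costs.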
> Depending on the file, a great deal probably goes into the
> initialization of the major-mode. Maybe you can use
> `find-file-literally' or some other means, I don't know.
I already changed to:
(switch-to-buffer (find-file-noselect input-file t t))
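For reference, a sketch of what the two extra arguments do. The second argument (NOWARN) suppresses warnings such as the large-file prompt, and the third (RAWFILE) visits the file without decoding it, so the buffer is unibyte. The helper name my-open-raw is hypothetical:

```elisp
(defun my-open-raw (file)
  "Visit FILE literally and return its buffer.
NOWARN (second argument) suppresses warnings, including the
large-file prompt.  RAWFILE (third argument) skips coding-system
decoding and format conversion, which leaves the buffer unibyte."
  (find-file-noselect file t t))

;; In a batch script, `with-current-buffer' is an alternative to
;; `switch-to-buffer', since no window needs to display the buffer:
;;
;;   (with-current-buffer (my-open-raw input-file)
;;     ...edit the buffer and save it...)
```

Because nothing is decoded, each byte of the file is one buffer position, which is probably why the slowdown and the two-character NBSP no longer show up with this call.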
* Re: How to circumvent warning in batch mode
From: Decebal @ 2009-10-10 8:50 UTC (permalink / raw)
To: help-gnu-emacs
On Oct 9, 3:43 pm, Kevin Rodgers <kevin.d.rodg...@gmail.com> wrote:
> > The input file is quite big and I get:
> > File input is large (31MB), really open? (y or n)
> > Is there a way to circumvent this?
>
> let-bind large-file-warning-threshold to nil around the call to find-file.
I already use:
(switch-to-buffer (find-file-noselect input-file t t))
> > Is there a way to do this more efficiently? This script needs about 20
> > seconds. When doing it with a Perl script, it takes about 6 seconds.
>
> 1. Put the code in a file (FILE.el) and byte-compile it. Then instead of
> --eval 'CODE' on the command line, use --load FILE.elc
It is part of a script. So I think the compilation would be faster as
a load from disc. Also, how can I pass parameters to an .elc
file?
> 2. It looks like you are doing a lot of unnecessary string allocation with
> concat and substring:
>
> For every character after the first character in the match, you double the
> length of the replacement string until it is at least as long as the length
> of the match string, then you only use the number of characters that were in
> the match string anyway. Change the loop to:
>
> (while (re-search-forward "^ +" nil t)
> (setq match-length (- (point) (match-beginning 0)))
> (if (> match-length 1)
> (replace-match (make-string match-length ?@))
> (replace-match "@")))
That will not work in my case. In the example the replacement string is only
one character long, but it could also be, for example, '1234567890'.
> That could be improved further by caching each replacement string of length
> > 1, so it is only allocated once... But now, I can see that my version
> using make-string does the same amount of string allocation as yours using
> substring, and that your use of concat is infrequent (only needed when the
> match string jumps to a larger length than has been seen so far). So caching
> the replacement string (in an array, indexed by its length) is the way to go.
Making the replacement string longer takes only about a second. The
real work is in the replace-match. Only the coders of Emacs can change
that.
> > Instead of '@' (chr$(64)) I would like to use an NBSP (chr$(160)).
> > But then the script needs almost 3 minutes. Also, every space
> > is replaced by two characters: chr$(194) + chr$(160).
> > What is going wrong here?
>
> In UTF-8, NBSP (U+00A0) is 2 bytes: decimal 194 160, aka hex C2 A0.
That explains the two characters, but why does it take so long?
Because I now use
(switch-to-buffer (find-file-noselect input-file t t))
I do not have this problem anymore.