From: Kevin Rodgers <kevin.d.rodgers@gmail.com>
To: help-gnu-emacs@gnu.org
Subject: Re: How to circumvent warning in batch mode
Date: Fri, 09 Oct 2009 07:43:40 -0600
Message-ID: <hanem4$on7$1@ger.gmane.org>
In-Reply-To: <5ebdc222-a8b5-4eed-9481-39b813da5f1c@j28g2000vbl.googlegroups.com>

Decebal wrote:
> I have the following code:
> emacs -batch -nw --eval='
>   (let (
>         (match-length)
>         (reg-exp "^ +")
>         (substitute-str "@")
>         )
>     (find-file "input")
>     (goto-char (point-min))
>     (while (re-search-forward "^ +" nil t)
>       (setq match-length (- (point) (match-beginning 0)))
>       (while (> match-length (length substitute-str))
>         (setq substitute-str (concat substitute-str substitute-str)))
>       (replace-match (substring substitute-str 0 match-length))
>     )
>     (write-file "outputEmacs")
>   )
> '
> I have several questions about it.
> The input file is quite big and I get:
>     File input is large (31MB), really open? (y or n)
> Is there a way to circumvent this?

Let-bind large-file-warning-threshold to nil around the call to find-file;
with a nil threshold, Emacs never asks for confirmation.
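
A minimal sketch of that let-binding, wrapped in a helper so it can be reused.
The function name my-find-file-no-size-prompt is hypothetical and not part of
Emacs; only large-file-warning-threshold and find-file are real.

```elisp
;; Hypothetical helper: visit FILE without the "really open?" size prompt.
(defun my-find-file-no-size-prompt (file)
  "Visit FILE with the large-file confirmation disabled."
  ;; A nil threshold disables the confirmation for the duration of the let.
  (let ((large-file-warning-threshold nil))
    (find-file file)))
```

In the original script, the call (find-file "input") would become
(my-find-file-no-size-prompt "input"), or the let form could simply be
written inline around find-file.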

> Is there a way to do this more efficiently? This script needs about 20
> seconds. When doing it with a Perl script, it takes about 6 seconds.

1. Put the code in a file (FILE.el) and byte-compile it.  Then instead of
    --eval 'CODE' on the command line, use --load FILE.elc

2. It looks like you are doing a lot of unnecessary string allocation with
    concat and substring:

    For every character after the first character in the match, you double the
    length of the replacement string until it is at least as long as the length
    of the match string, then you only use the number of characters that were in
    the match string anyway.  Change the loop to:

     (while (re-search-forward "^ +" nil t)
       (setq match-length (- (point) (match-beginning 0)))
       (if (> match-length 1)
           (replace-match (make-string match-length ?@))
         (replace-match "@")))

    That could be improved further by caching each replacement string of
    length > 1, so it is only allocated once... But now, I can see that my
    version using make-string does the same amount of string allocation as
    yours using substring, and that your use of concat is infrequent (only
    needed when the match string jumps to a larger length than has been seen
    so far).  So caching the replacement string (in an array, indexed by its
    length) is the way to go.
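
The caching idea could be sketched roughly as follows.  This is an
illustration under assumptions, not a measured improvement: the function
names my-cached-fill and my-replace-leading-spaces are hypothetical, and the
cache size of 256 is an arbitrary bound (longer matches fall back to a fresh
make-string).

```elisp
;; Hypothetical helper: return LEN copies of CHAR, reusing CACHE when possible.
(defun my-cached-fill (cache len char)
  "Return a string of LEN copies of CHAR, cached in the vector CACHE by length."
  (if (< len (length cache))
      ;; aset returns the stored value, so the new string is both
      ;; cached and returned on the first request for this length.
      (or (aref cache len)
          (aset cache len (make-string len char)))
    ;; Assumed fallback: lengths beyond the cache are allocated each time.
    (make-string len char)))

;; Hypothetical driver: same loop as above, but with cached replacements.
(defun my-replace-leading-spaces (char)
  "Replace each run of leading spaces in the current buffer with CHAR repeated."
  (let ((cache (make-vector 256 nil)))  ; 256 is an arbitrary bound
    (goto-char (point-min))
    (while (re-search-forward "^ +" nil t)
      ;; FIXEDCASE and LITERAL are t: insert the string exactly as given.
      (replace-match (my-cached-fill cache
                                     (- (match-end 0) (match-beginning 0))
                                     char)
                     t t))))
```

Saved as FILE.el together with the find-file and write-file calls, this can
be byte-compiled with emacs -batch -f batch-byte-compile FILE.el and then run
with --load FILE.elc as described in point 1.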

> Instead of the '@' or chr$(64) I would like to use a nbsp or chr
> $(160). But then the script needs almost 3 minutes. Also every space
> is replaced by two characters chr$(194) + chr$(160).
> What is going wrong here?

In UTF-8, NBSP (U+00A0) is encoded as 2 bytes: decimal 194 160, aka hex C2 A0.
So the two characters you see are the UTF-8 encoding of a single NBSP.
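
The byte counts can be checked from Lisp, and if a single byte 160 is what
you want in the output, one option is to write the file in Latin-1 instead.
This is a sketch under that assumption; the function name my-write-latin-1
is hypothetical, and whether Latin-1 suits the rest of your data is for you
to judge.

```elisp
;; NBSP encoded as UTF-8 gives two bytes, as Latin-1 a single byte.
(defun my-nbsp-bytes (coding)
  "Return the list of bytes that encode NBSP (U+00A0) in CODING."
  (append (encode-coding-string (string #xA0) coding) nil))

;; Hypothetical helper: write the current buffer encoded as ISO-8859-1.
(defun my-write-latin-1 (file)
  "Write the current buffer to FILE using the iso-latin-1 coding system."
  ;; coding-system-for-write overrides the coding system for this write only.
  (let ((coding-system-for-write 'iso-latin-1))
    (write-region (point-min) (point-max) file)))
```

Here (my-nbsp-bytes 'utf-8) evaluates to (194 160), while
(my-nbsp-bytes 'iso-latin-1) evaluates to (160).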

-- 
Kevin Rodgers
Denver, Colorado, USA





