efficiency question on text manipulation using string vs buffer

unofficial mirror of help-gnu-emacs@gnu.org

* efficiency question on text manipulation using string vs buffer
@ 2009-03-24  1:41 Xah Lee
  2009-03-24 20:19 ` Nikolaj Schumacher
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Xah Lee @ 2009-03-24  1:41 UTC (permalink / raw)
  To: help-gnu-emacs

Emacs Lisp question.

It's said that for text manipulation, operating on the buffer data type
is more efficient than operating on the string data type.

Today I tried to test it, but the difference seems negligible. My
tentative test seems to indicate that after performing 120 thousand
string replacements, the string method is only 1 second slower.

Here are 2 implementations of the same command. The first acts on the
buffer using narrow-to-region. The second deals with a string.

(defun replace-string-pairs-region1 (start end mylist)
  "Replace string pairs in region.
Example call:
 (replace-string-pairs-region1 START END '([\"alpha\" \"α\"] [\"beta\" \"β\"]))
The search string and replace string are all literal and case
sensitive."
  (save-restriction
    (narrow-to-region start end)
    ;; Bind `case-fold-search' to nil so the search really is case
    ;; sensitive, as the docstring says.
    (let ((case-fold-search nil))
      (mapc
       (lambda (arg)
         (goto-char (point-min))
         (while (search-forward (elt arg 0) nil t)
           (replace-match (elt arg 1) t t)))
       mylist))))

(defun replace-string-pairs-region2 (start end mylist)
  "Replace string pairs in region.
Same as `replace-string-pairs-region1' but different implementation."
  (let ((mystr (buffer-substring start end))
        (case-fold-search nil))
    (mapc
     (lambda (x)
       ;; `regexp-quote' keeps the search string literal.
       (setq mystr (replace-regexp-in-string
                    (regexp-quote (elt x 0)) (elt x 1) mystr t t)))
     mylist)
    (delete-region start end)
    ;; Insert at START, not wherever point happens to be.
    (goto-char start)
    (insert mystr)))

It appears to me, testing these commands on a text selection with
about 122k chars that need to be replaced, that the second version is
only 1 second slower. (Both finish within 2 or 3 seconds on a 2007
midrange PC.)

Any comments?

Here are the 2 test functions i used:

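(defun f1 (start end)
  "Time `replace-string-pairs-region1' on the region from START to END."
  (interactive "r")
  (let (starttime endtime)
    (setq starttime (current-time))
    (replace-string-pairs-region1 start end '(["&" "&amp;"]
                                              ["<" "&lt;"]
                                              [">" "&gt;"]))
    (setq endtime (current-time))
    ;; Use the whole timestamp, not only its low-order seconds field,
    ;; so that fractions of a second are reported.
    (message "%f" (float-time (time-subtract endtime starttime)))))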

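(defun f2 (start end)
  "Time `replace-string-pairs-region2' on the region from START to END."
  (interactive "r")
  (let (starttime endtime)
    (setq starttime (current-time))
    (replace-string-pairs-region2 start end '(["&" "&amp;"]
                                              ["<" "&lt;"]
                                              [">" "&gt;"]))
    (setq endtime (current-time))
    ;; Use the whole timestamp, not only its low-order seconds field,
    ;; so that fractions of a second are reported.
    (message "%f" (float-time (time-subtract endtime starttime)))))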

The test region is a buffer with lines like this:
<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&<>&
The file size is 122k bytes. I select the whole buffer, then call f1
or f2, and compare their timing difference.
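
The comparison described above can also be run non-interactively on
generated text. The following is only a sketch: every `my-bench-` name
is hypothetical and not part of the original post, and it assumes the
standard `benchmark' library that ships with Emacs.

```elisp
(require 'benchmark)

(defvar my-bench-pairs '(["&" "&amp;"] ["<" "&lt;"] [">" "&gt;"])
  "Replacement pairs, the same ones used by the test functions.")

(defun my-bench-in-buffer (pairs)
  "Replace PAIRS in the current buffer by editing it in place."
  (let ((case-fold-search nil))
    (dolist (pair pairs)
      (goto-char (point-min))
      (while (search-forward (elt pair 0) nil t)
        (replace-match (elt pair 1) t t)))))

(defun my-bench-via-string (pairs)
  "Replace PAIRS by copying the buffer to a string and reinserting it."
  (let ((str (buffer-string))
        (case-fold-search nil))
    (dolist (pair pairs)
      (setq str (replace-regexp-in-string
                 (regexp-quote (elt pair 0)) (elt pair 1) str t t)))
    (erase-buffer)
    (insert str)))

(defun my-bench-run (fn repeats)
  "Return (ELAPSED-SECONDS . RESULT-TEXT) for FN on generated test text.
REPEATS is the number of \"<>&\" triples in the text."
  (with-temp-buffer
    (dotimes (_ repeats) (insert "<>&"))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (let ((elapsed (car (benchmark-run 1 (funcall fn my-bench-pairs)))))
      (cons elapsed (buffer-string)))))
```

Evaluating (car (my-bench-run #'my-bench-in-buffer 40000)) and the same
form with #'my-bench-via-string gives elapsed times in fractional
seconds for roughly 120k characters of input.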

Thanks.

  Xah
∑ http://xahlee.org/

☄


* Re: efficiency question on text manipulation using string vs buffer
  2009-03-24  1:41 efficiency question on text manipulation using string vs buffer Xah Lee
@ 2009-03-24 20:19 ` Nikolaj Schumacher
       [not found] ` <mailman.3920.1237926004.31690.help-gnu-emacs@gnu.org>
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Nikolaj Schumacher @ 2009-03-24 20:19 UTC (permalink / raw)
  To: Xah Lee; +Cc: help-gnu-emacs

Xah Lee <xahlee@gmail.com> wrote:

> It appears to me, testing these commands on a text selection with
> about 122k chars that needs to be replaced, the second version is only
> 1 second slower? (both finishes within 2 or 3 seconds, on a 2007
> midrange PC)

You should note that the replace-regexp-in-string function is pretty
smart.  It only does one big concat, so not that much string
manipulation actually happens in this case.  The majority of time is
probably spent on the regexp search anyway, making direct comparison moot.
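
The "one big concat" strategy can be sketched roughly as follows. This
is an illustration of the idea, not the actual library source, and the
name `my-replace-literal-in-string' is hypothetical: the pieces between
matches are collected in a list and joined once at the end, instead of
building a new string after every replacement.

```elisp
(defun my-replace-literal-in-string (from to string)
  "Return STRING with every literal occurrence of FROM replaced by TO.
Pieces are collected in a list and concatenated once at the end."
  (if (string= from "")
      string                            ; nothing to search for
    (let ((regexp (regexp-quote from))
          (case-fold-search nil)
          (start 0)
          (pieces '()))
      (while (string-match regexp string start)
        ;; Keep the unchanged text before the match, then the replacement.
        (push (substring string start (match-beginning 0)) pieces)
        (push to pieces)
        (setq start (match-end 0)))
      ;; Keep whatever follows the last match.
      (push (substring string start) pieces)
      (apply #'concat (nreverse pieces)))))

;; (my-replace-literal-in-string "<" "&lt;" "a<b<c")  ; => "a&lt;b&lt;c"
```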


regards,
Nikolaj Schumacher





[parent not found: <mailman.3920.1237926004.31690.help-gnu-emacs@gnu.org>]

* Re: efficiency question on text manipulation using string vs buffer
       [not found] ` <mailman.3920.1237926004.31690.help-gnu-emacs@gnu.org>
@ 2009-03-24 22:26   ` Xah Lee
  0 siblings, 0 replies; 7+ messages in thread
From: Xah Lee @ 2009-03-24 22:26 UTC (permalink / raw)
  To: help-gnu-emacs

On Mar 24, 1:19 pm, Nikolaj Schumacher <m...@nschum.de> wrote:
> XahLee<xah...@gmail.com> wrote:
> > It appears to me, testing these commands on a text selection with
> > about 122k chars that needs to be replaced, the second version is only
> > 1 second slower? (both finishes within 2 or 3 seconds, on a 2007
> > midrange PC)
>
> You should note that the replace-regexp-in-string function is pretty
> smart.  It only does one big concat, so not that much string
> manipulation actually happens in this case.  The majority of time is
> probably spent on the regexp search anyway, making direct comparison moot.

Thanks.

  Xah
∑ http://xahlee.org/

☄



* Re: efficiency question on text manipulation using string vs buffer
  2009-03-24  1:41 efficiency question on text manipulation using string vs buffer Xah Lee
  2009-03-24 20:19 ` Nikolaj Schumacher
       [not found] ` <mailman.3920.1237926004.31690.help-gnu-emacs@gnu.org>
@ 2009-03-25  2:34 ` Kevin Rodgers
       [not found] ` <mailman.3935.1237948501.31690.help-gnu-emacs@gnu.org>
  3 siblings, 0 replies; 7+ messages in thread
From: Kevin Rodgers @ 2009-03-25  2:34 UTC (permalink / raw)
  To: help-gnu-emacs

Xah Lee wrote:
> emacs lisp question.
> 
> it's said that for text manipulation, operation on buffer data type is
> more efficient than operation on string data type.
> 
> today, i tried to test it, but the difference seems negligible ? My
> tentative test seems to indicate, that after performing 120 thousand
> string replacement, the string method is only 1 second slower.
...
> It appears to me, testing these commands on a text selection with
> about 122k chars that needs to be replaced, the second version is only
> 1 second slower? (both finishes within 2 or 3 seconds, on a 2007
> midrange PC)
> 
> Any comments?

The version that takes 3 seconds is 50% slower than the version that
takes 2 seconds.

-- 
Kevin Rodgers
Denver, Colorado, USA






[parent not found: <mailman.3935.1237948501.31690.help-gnu-emacs@gnu.org>]

* Re: efficiency question on text manipulation using string vs buffer
       [not found] ` <mailman.3935.1237948501.31690.help-gnu-emacs@gnu.org>
@ 2009-03-25  2:56   ` Xah Lee
  2009-03-26  3:07     ` Kevin Rodgers
  2009-03-26 16:56     ` Nikolaj Schumacher
  0 siblings, 2 replies; 7+ messages in thread
From: Xah Lee @ 2009-03-25  2:56 UTC (permalink / raw)
  To: help-gnu-emacs

On Mar 24, 7:34 pm, Kevin Rodgers <kevin.d.rodg...@gmail.com> wrote:
> Xah Lee wrote:
> > emacs lisp question.
>
> > it's said that for text manipulation, operation on buffer data type is
> > more efficient than operation on string data type.
>
> > today, i tried to test it, but the difference seems negligible ? My
> > tentative test seems to indicate, that after performing 120 thousand
> > string replacement, the string method is only 1 second slower.
> ...
> > It appears to me, testing these commands on a text selection with
> > about 122k chars that needs to be replaced, the second version is only
> > 1 second slower? (both finishes within 2 or 3 seconds, on a 2007
> > midrange PC)
>
> > Any comments?
>
> The version that takes 3 seconds is 50% slower than the version that
> takes 2 seconds.

LOL Kevin. I think with what little skills i have, i can still do
arithmetic fine.

The point is, the speed difference is negligible for all practical
purposes, at least in this particular code comparison.  (The version
with save-restriction and narrow-to-region might even be slower if the
function is called multiple times, because of its overhead.)

I like the version that repeatedly sets a variable and repeatedly calls
replace-regexp-in-string. This is conceptually simple and the concept
is universal among languages. The version calling save-restriction and
narrow-to-region is Emacs specific and requires some specific knowledge
about the Emacs Lisp environment to understand.

I somewhat fear that repeated or recursive calls to save-restriction or
narrow-to-region or similar might break something. (I don't fully
understand their details, which involve buffers, marks, etc.)
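
One way to check the nesting concern is a small experiment in a
temporary buffer. This is only a sketch, and the function name
`my-nested-narrowing-demo' is hypothetical: it records the accessible
region at each nesting level to see whether each save-restriction puts
back the restriction that was in effect when it was entered.

```elisp
(defun my-nested-narrowing-demo ()
  "Return the accessible region seen at three nesting levels.
The result is a list (INNER OUTER WHOLE) of (POINT-MIN POINT-MAX) pairs."
  (with-temp-buffer
    (insert "0123456789")
    (let (inner outer)
      (save-restriction
        (narrow-to-region 3 9)
        (save-restriction
          (narrow-to-region 4 6)
          (setq inner (list (point-min) (point-max))))
        ;; The inner `save-restriction' has restored the 3..9 narrowing.
        (setq outer (list (point-min) (point-max))))
      ;; The outer one has restored the unnarrowed buffer.
      (list inner outer (list (point-min) (point-max))))))

;; (my-nested-narrowing-demo)  ; => ((4 6) (3 9) (1 11))
```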

  Xah
∑ http://xahlee.org/

☄


* Re: efficiency question on text manipulation using string vs buffer
  2009-03-25  2:56   ` Xah Lee
@ 2009-03-26  3:07     ` Kevin Rodgers
  2009-03-26 16:56     ` Nikolaj Schumacher
  1 sibling, 0 replies; 7+ messages in thread
From: Kevin Rodgers @ 2009-03-26  3:07 UTC (permalink / raw)
  To: help-gnu-emacs

Xah Lee wrote:
> On Mar 24, 7:34 pm, Kevin Rodgers <kevin.d.rodg...@gmail.com> wrote:
>> The version that takes 3 seconds is 50% slower than the version that
>> takes 2 seconds.
> 
> LOL Kevin. I think with what little skills i have, i can still do
> arithmetic fine.

See, I knew I didn't need to put a smiley in there.

-- 
Kevin Rodgers
Denver, Colorado, USA






* Re: efficiency question on text manipulation using string vs buffer
  2009-03-25  2:56   ` Xah Lee
  2009-03-26  3:07     ` Kevin Rodgers
@ 2009-03-26 16:56     ` Nikolaj Schumacher
  1 sibling, 0 replies; 7+ messages in thread
From: Nikolaj Schumacher @ 2009-03-26 16:56 UTC (permalink / raw)
  To: Xah Lee; +Cc: help-gnu-emacs

Xah Lee <xahlee@gmail.com> wrote:

> I like the version that repeatedly sets a variable and repeatedly calls
> replace-regexp-in-string. This is conceptually simple and the concept
> is universal among languages. The version calling save-restriction and
> narrow-to-region is Emacs specific and requires some specific knowledge
> about the Emacs Lisp environment to understand.

When working on data coming from a buffer, you should prefer working in
that buffer, though.  The text might carry overlays, which can be lost
along with point and scrolling positions when the text is deleted and
reinserted.
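
The overlay point can be tried out in a temporary buffer. This is a
sketch under stated assumptions: `my-overlay-demo' is a hypothetical
name, and the overlay uses the default advance settings. It puts an
overlay over part of the text, replaces "<" either in place or through a
string copy, and returns the text the overlay still covers afterwards.

```elisp
(defun my-overlay-demo (use-string)
  "Return the text still covered by an overlay after replacing \"<\".
With USE-STRING non-nil, rewrite the text through a string copy;
otherwise edit the buffer in place."
  (with-temp-buffer
    (insert "a<b and c<d")
    ;; Overlay over "c<d" (buffer positions 9 to 12).
    (let ((ov (make-overlay 9 12)))
      (if use-string
          (let ((str (replace-regexp-in-string
                      "<" "&lt;" (buffer-string) t t)))
            (delete-region (point-min) (point-max))
            (insert str))
        (goto-char (point-min))
        (while (search-forward "<" nil t)
          (replace-match "&lt;" t t)))
      (buffer-substring (overlay-start ov) (overlay-end ov)))))

;; (my-overlay-demo nil)  ; => "c&lt;d"  (overlay follows the edits)
;; (my-overlay-demo t)    ; => ""        (overlay collapsed to nothing)
```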


regards,
Nikolaj Schumacher



