unofficial mirror of bug-gnu-emacs@gnu.org 
From: Lars Ingebrigtsen <larsi@gnus.org>
To: "Benninghofen\, Benjamin Dr." <benjamin.benninghofen@airbus.com>
Cc: Kevin Layer <layer@franz.com>,
	32729@debbugs.gnu.org, 32728@debbugs.gnu.org
Subject: bug#32729: Xemacs 23 times as fast as GNU Emacs
Date: Sat, 12 Oct 2019 05:57:09 +0200
Message-ID: <871rviobu2.fsf@gnus.org>
In-Reply-To: <c3a9871dbc0a42c799d7368e2b6457b2@CD1-4DDAG02-P01.cdmail.common.airbusds.corp>

The example is a bit convoluted, but it's an interesting question -- is
Emacs' handling of process output fast or not?

As a baseline, let's read a GB of zeroes and output to /dev/null:

larsi@marnie:~/src/emacs/trunk$ time dd if=/dev/zero bs=1000 count=1000000 > /dev/null
1000000+0 records in
1000000+0 records out
1000000000 bytes (1.0 GB, 954 MiB) copied, 2.04672 s, 489 MB/s

real	0m2.064s
user	0m0.173s
sys	0m1.684s

(benchmark-run 1 (call-process "dd" nil nil nil "if=/dev/zero" "bs=1000" "count=1000000"))
=> (0.665899839 0 0.0)

So that's faster in Emacs than in the shell, but we're just setting
stdout to /dev/null here, so we're not actually seeing the data at all.

But let's insert the data somewhere:

(benchmark-run 1 (call-process "dd" nil (get-buffer-create " *zeroes*") nil "if=/dev/zero" "bs=1000" "count=1000000"))
=> (4.703641145 0 0.0)

(Note: Don't visit the " *zeroes*" buffer after this, because that will
hang Emacs totally.  I guess the long-line display problem hasn't been
fixed after all?)

4.7s isn't horrible, but it's not good, either.  But most of that time
is taken up in coding system conversion, so:

(let ((coding-system-for-read 'binary))
  (benchmark-run 1 (call-process "dd" nil (get-buffer-create " *zeroes*") nil "if=/dev/zero" "bs=1000" "count=1000000")))
=> (1.750549617 0 0.0)

Which is faster than writing to a file:

larsi@marnie:~/src/emacs/trunk$ time dd if=/dev/zero bs=1000 count=1000000 of=/tmp/foo
1000000+0 records in
1000000+0 records out
1000000000 bytes (1.0 GB, 954 MiB) copied, 2.21987 s, 450 MB/s

real	0m2.325s
user	0m0.168s
sys	0m1.957s

So that's OK.  But what happens when we add a process filter?

(let ((coding-system-for-read 'binary))
  (kill-buffer (get-buffer-create " *zeroes*"))
  (benchmark-run
      1
    (let ((proc (start-process "dd" (get-buffer-create " *zeroes*") "dd"
			       "if=/dev/zero" "bs=1000" "count=1000000")))
      (set-process-filter proc (lambda (proc string)
				 ;; Deliberately empty: discard the output.
				 ))
      (while (and (process-live-p proc)
		  (accept-process-output proc 0.01))))))
=> (16.878995199 993 12.469541476)

That's slow, and we're just discarding the data.  If we insert the data
into the buffer, it's even slower, but not by a surprising amount:

(let ((coding-system-for-read 'binary))
  (kill-buffer (get-buffer-create " *zeroes*"))
  (benchmark-run
      1
    (let ((proc (start-process "dd" (get-buffer-create " *zeroes*") "dd"
			       "if=/dev/zero" "bs=1000" "count=1000000")))
      (set-process-filter proc (lambda (proc string)
				 (with-current-buffer
				     (get-buffer-create " *zeroes*")
				   (goto-char (point-max))
				   (insert string))))
      (while (and (process-live-p proc)
		  (accept-process-output proc 0.01))))))
=> (19.801399562 1000 12.700370797000001)

Byte-compiling the function makes no difference.

So it would seem that the Emacs filter method is just unnecessarily
slow, which I've long suspected.  Creating the strings before calling
the filter is probably what's taking quite a bit of this time, but most
of it goes to garbage collection: about 13 of these 20 seconds are
spent doing that.
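
As a rough check of that split: benchmark-run returns a list of the
form (ELAPSED GC-COUNT GC-ELAPSED), so the GC share of a run is the
third element divided by the first.  The helper name my/gc-fraction
below is hypothetical and only there for illustration:

```elisp
;; Hypothetical helper, not part of Emacs: share of a `benchmark-run'
;; result that was spent in garbage collection.
(defun my/gc-fraction (result)
  "Return GC time divided by total elapsed time for RESULT.
RESULT is a list (ELAPSED GC-COUNT GC-ELAPSED) as returned by
`benchmark-run'."
  (/ (nth 2 result) (nth 0 result)))

;; The two filter runs quoted above:
(my/gc-fraction '(16.878995199 993 12.469541476))        ; discarding filter
(my/gc-fraction '(19.801399562 1000 12.700370797000001)) ; inserting filter
```

For the two filter runs this comes to roughly 0.74 and 0.64, that is,
GC accounts for well over half of the wall-clock time in both cases.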

And it's a real world problem: When reading data from any network
source, you have to use filters because the protocol is usually based on
parsing the output to find out when it's over, so you can't use
sentinels.
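
To make that concrete, here is a minimal sketch of such a filter.  It
assumes a made-up protocol in which a response ends with a line
containing only a dot; the names my/terminator, my/collecting-filter
and the process property my-response are all hypothetical:

```elisp
(defvar my/terminator "\n.\n"
  "Hypothetical end-of-response marker; real protocols define their own.")

(defun my/collecting-filter (proc string)
  "Insert STRING into PROC's buffer and detect a complete response.
Once `my/terminator' has arrived, store the text before it in the
process property `my-response' and delete it from the buffer."
  (when (buffer-live-p (process-buffer proc))
    (with-current-buffer (process-buffer proc)
      (goto-char (point-max))
      (insert string)
      ;; Search the whole buffer, so a terminator that was split
      ;; across two chunks is still found.
      (goto-char (point-min))
      (when (search-forward my/terminator nil t)
        (process-put proc 'my-response
                     (buffer-substring (point-min) (match-beginning 0)))
        (delete-region (point-min) (point))))))
```

It would be installed with (set-process-filter proc
#'my/collecting-filter).  Rescanning from the start of the buffer on
every chunk keeps the sketch short but is quadratic for large
responses; a real implementation would remember where it left off.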

So for Emacs 28 I want to explore adding a new type of filter to
processes: One that doesn't take a string argument, but which just
inserts the data into the buffer, and then calls the filter with the
region positions of what was inserted, which is just as powerful, but
should allow streams to be 10x more efficient.
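
The calling convention could look something like the sketch below.
This is only an emulation on top of today's string-based filters, so
it still pays for the string creation and shows the interface, not the
speedup.  The function my/region-filter-wrapper and the process
property my-region-filter are hypothetical names, not an existing or
planned Emacs API:

```elisp
(defun my/region-filter-wrapper (proc string)
  "Insert STRING into PROC's buffer, then call a region-based callback.
The callback is taken from the hypothetical process property
`my-region-filter' and is called with PROC and the start and end
positions of the text that was just inserted."
  (when (buffer-live-p (process-buffer proc))
    (with-current-buffer (process-buffer proc)
      (save-excursion
        (goto-char (point-max))
        (let ((start (point))
              (callback (process-get proc 'my-region-filter)))
          (insert string)
          (when callback
            (funcall callback proc start (point))))))))

;; Usage, assuming PROC is an existing process with a buffer:
;; (process-put proc 'my-region-filter
;;              (lambda (_proc start end)
;;                (message "got %d chars" (- end start))))
;; (set-process-filter proc #'my/region-filter-wrapper)
```

The callback can parse the buffer between the two positions directly,
without ever receiving the data as a Lisp string.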

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no