From: Dmitry Gutov <dgutov@yandex.ru>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 20154@debbugs.gnu.org
Subject: bug#20154: 25.0.50; json-encode-string is too slow for large strings
Date: Sat, 21 Mar 2015 22:00:46 +0200
Message-ID: <550DCDEE.4090900@yandex.ru>
In-Reply-To: <838ueqvl1o.fsf@gnu.org>
On 03/21/2015 09:58 AM, Eli Zaretskii wrote:
> It depends on your requirements. How fast would it need to run to
> satisfy your needs?
In this case, the buffer contents are encoded to JSON at most once per
keypress. So 50ms or below should be fast enough, especially since most
files are smaller than that.
Of course, I'm sure there are use cases for fast JSON encoding/decoding
of even bigger volumes of data, but they can probably wait until we have
FFI.
> You don't really need regexp replacement functions with all its
> features here, do you? What you need is a way to skip characters that
> are "okay", then replace the character that is "not okay" with its
> encoded form, then repeat.
It doesn't seem like regexp searching is the slow part: save for the GC
pauses, looking for the non-matching regexp in the same string -
(replace-regexp-in-string "x" "z" s1 t t)
- only takes ~3ms.
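A measurement like this can be split into GC and non-GC time with `benchmark-run', which returns (ELAPSED GC-RUNS GC-ELAPSED). The following is only a minimal sketch: the names `my-s1' and `my-regexp-pass-timing' are hypothetical, and the 300000-character digits-only string is an assumed stand-in for the real test input.

```elisp
(require 'benchmark)

;; Assumption: a digits-only string standing in for `s1'.
(defvar my-s1 (make-string 300000 ?1))

(defun my-regexp-pass-timing ()
  "Return (SECONDS-EXCLUDING-GC . GC-SECONDS) per call, averaged over 10 runs."
  (let* ((res (benchmark-run 10
                ;; "x" never matches, so this measures the search alone.
                (replace-regexp-in-string "x" "z" my-s1 t t)))
         (total (nth 0 res))   ; total elapsed seconds, GC included
         (gc (nth 2 res)))     ; seconds spent in GC
    (cons (/ (- total gc) 10) (/ gc 10))))
```

Evaluating (my-regexp-pass-timing) a few times shows how much of the total is GC; the numbers will vary by machine and Emacs build.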
And likewise, after changing them to use `concat' instead of `format',
both alternative json-encode-string implementations that I have "encode"
a numbers-only (without newlines) string of the same length in a few
milliseconds. Again, save for the GC pauses, which can add 30-40ms.
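To illustrate the kind of implementation being discussed, here is a minimal sketch built on `replace-regexp-in-string' with a function as the replacement. It is not either of the two implementations mentioned above; `my-json-escapes' and `my-json-encode-string' are hypothetical names, and `format' is kept only for the rare control characters that have no short escape.

```elisp
;; Hypothetical table: characters with a short JSON escape.
(defconst my-json-escapes
  '((?\" . "\\\"") (?\\ . "\\\\") (?\n . "\\n") (?\t . "\\t")
    (?\r . "\\r") (?\b . "\\b") (?\f . "\\f")))

(defun my-json-encode-string (string)
  "Return STRING as a JSON string literal (illustrative sketch)."
  (concat
   "\""
   (replace-regexp-in-string
    ;; Match a control character (codes 0-31), a double quote, or a backslash.
    "[[:cntrl:]\"\\]"
    (lambda (m)
      (let ((c (aref m 0)))
        (or (cdr (assq c my-json-escapes))
            (format "\\u%04x" c))))
    string t t)
   "\""))
```

On a numbers-only string the regexp never matches, so the function is never called and the cost is the search plus one `concat'.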
> For starters, how fast
> can you iterate through the string with 'skip-chars-forward', stopping
> at characters that need encoding, without actually encoding them, but
> just consing the output string by appending the parts delimited by
> places where 'skip-chars-forward' stopped? That's the lower bound on
> performance using this method.
70-90ms if we simply skip 0-9, even without nreverse-ing and
concatenating. But the change in runtime after adding an (apply #'concat
(nreverse res)) step doesn't look statistically significant. Here's
the implementation I tried:
(defun foofoo (string)
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let (res)
      (while (not (eobp))
        (let ((skipped (skip-chars-forward "0-9")))
          (push (buffer-substring (- (point) skipped) (point))
                res))
        ;; Step over the non-digit, if any; `forward-char' signals at eob.
        (unless (eobp)
          (forward-char 1)))
      res)))
But that actually goes down to 30ms if we don't accumulate the result.
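For comparison, the complete skip-then-encode loop described in the quoted text could look roughly like the sketch below. This is an assumption-laden illustration, not code from this thread: `my-json-encode-via-skip' is a hypothetical name, and only three short escapes are handled, with everything else falling back to \uXXXX.

```elisp
(defun my-json-encode-via-skip (string)
  "Return STRING as a JSON string literal, using `skip-chars-forward'."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((res (list "\"")))
      (while (not (eobp))
        (let ((start (point)))
          ;; Skip everything that needs no escaping.
          (skip-chars-forward "^\"\\\\[:cntrl:]")
          (when (> (point) start)
            (push (buffer-substring start (point)) res)))
        (unless (eobp)
          ;; Point is now on a character that must be escaped.
          (let ((c (char-after)))
            (push (cond ((eq c ?\") "\\\"")
                        ((eq c ?\\) "\\\\")
                        ((eq c ?\n) "\\n")
                        (t (format "\\u%04x" c)))
                  res))
          (forward-char 1)))
      (push "\"" res)
      (apply #'concat (nreverse res)))))
```

Each `push' of a `buffer-substring' allocates a new string, which is where the accumulation cost measured above comes from.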
> I think the latest tendency is the opposite: move to Lisp everything
> that doesn't need to be in C.
Yes, and often that's great, if we're dealing with some piece of UI
infrastructure that only gets called at most a few times per command,
with inputs of size we can anticipate in advance.
> If some specific application needs more
> speed than we can provide, the first thing I'd try is think of a new
> primitive by abstracting your use case enough to be more useful than
> just for JSON.
That's why I suggested doing that with `replace-regexp-in-string' first.
That's a very common feature, and in Python and Ruby it's written in C.
Ruby's calling convention is even pretty close (the replacement can be a
string, or it can take a block, which is a kind of a function).
> Of course, implementing the precise use case in C first is probably a
> prerequisite, since it could turn out that the problem is somewhere
> else, or that even in C you won't get the speed you want.
A fast `replace-regexp-in-string' may not get us where I want, but it
should get us close. It will still be generally useful, and it'll save
us from having two `json-encode-string' implementations - for long and
short strings.
>> Replacing "z" with #'identity (so now we include a function call
>> overhead) increases the averages to 0.15s and 0.10s respectively.
>
> Sounds like the overhead of the Lisp interpreter is a significant
> factor here, no?
Yes and no. Given the 50ms budget, I think we can live with it for now,
when it's the only problem.