From: Sébastien Chapuis
Newsgroups: gmane.emacs.bugs
Subject: bug#31138: Native json slower than json.el
Date: Sat, 23 Mar 2019 09:59:23 +0800
References: <87sh806xwa.fsf@chapu.is> <834lkf7ely.fsf@gnu.org> <878t9own1p.fsf@chapu.is> <838t9o4hvl.fsf@gnu.org>
In-Reply-To: <838t9o4hvl.fsf@gnu.org>
To: Eli Zaretskii
Cc: yyoncho@gmail.com, 31138@debbugs.gnu.org

Hello,

I tried to find the cause of this but still without any success.
Here is a reproducible case:

You can download the json file at:
https://gist.githubusercontent.com/yyoncho/dec968b69185305ed02741e18b27a82d/raw/334b0a51bc52cc3c98edb8ff4bccb5fc4531842b/large.json

Open the file with `emacs -Q large.json`.
Switch to the scratch buffer and run:

```
;; Each result below is (ELAPSED-SECONDS GC-COUNT GC-SECONDS).

;; Native parser, called from the large.json buffer.
(with-current-buffer "large.json"
  (benchmark-run 10 (json-parse-string (buffer-string))))
;;; (2.5371836119999998 10 0.111044641)

;; Native parser, called from inside a temporary buffer.
(with-current-buffer "large.json"
  (let ((str (buffer-string)))
    (benchmark-run 10 (with-temp-buffer (json-parse-string str)))))
;;; (1.510604359 10 0.13192760000000003)

;; json.el parser, called from inside a temporary buffer.
(with-current-buffer "large.json"
  (let ((str (buffer-string)))
    (benchmark-run 10 (with-temp-buffer (json-read-from-string str)))))
;;; (1.970248228 114 1.058150570000001)
```

Thanks,
Sebastien Chapuis

On Sun, 15 Apr 2018 at 23:19, Eli Zaretskii wrote:
>
> > From: Sebastien Chapuis
> > Cc: 31138@debbugs.gnu.org
> > Date: Sun, 15 Apr 2018 16:40:18 +0200
> >
> > > I'm surprised that the slowdown due to the conversion is so large,
> > > though.  It doesn't feel right, even with a 4MB string.
> >
> > I've dug a bit to find out why it is so slow, and I've found that if I
> > wrap `json-parse-string` in a `with-temp-buffer`, it is way
> > faster:
> >
> > results of benchmark-run with a string of 4043212 characters
> > ```
> > (with-temp-buffer (json-parse-string str)):
> > (0.814315554 1 0.11941178500000005)
> >
> > (json-parse-string str):
> > (11.542233167 1 0.14954429599999997)
> >
> > (with-temp-buffer (json-read-from-string str)):
> > (5.9781185610000005 29 4.967349412000001)
> >
> > (json-read-from-string str):
> > (5.601267 24 4.723292248000001)
> > ```
>
> Interesting.
>
> > Any idea why?
>
> Where did str come from?  Did you insert it into the buffer or
> something?  Could that explain the difference in performance?
>
> More generally, can you post the string you are using for the
> benchmarking, and the benchmark code as well?  That would make the
> discussion less abstract.
>
> > > Yes, it's necessary, because the input string may include raw bytes,
> > > which will crash Emacs if not handled properly.
> >
> > The Jansson documentation guarantees that the strings returned
> > from the library are always UTF-8 encoded [1].
>
> You assume that the library has no bugs, yes?  Because if it does,
> then we might crash Emacs by trusting it so much.  Letting invalid
> bytes creep into Emacs buffers and strings is a sure recipe for an
> eventual crash.
>
> > Given that guarantee, is it possible to reconsider the use of
> > code_convert_string?
>
> Since it's already much faster than a Lisp implementation, why would
> we want to risk crashing an Emacs session by omitting the decoding?
>
> > Encoding a string to UTF-8 which is already UTF-8 encoded seems
> > useless.
>
> It's decoding, not encoding, and the process of decoding examines
> every sequence in the byte stream and ensures they are valid UTF-8.
>
> Emacs never trusts any external data to be what the user or Lisp tell
> it is; I see no reason why we should make an exception in this
> particular case.
>
> Thanks.
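
For reference, the point about decoding can be seen from Lisp.  The
following is only a minimal sketch and is not taken from the thread: it
uses the standard `decode-coding-string` and `char-charset`, and the
helper name `my/raw-bytes-after-decode` is hypothetical.  As far as I
understand it, bytes that do not form valid UTF-8 come out of the
decoder as characters in the `eight-bit` charset, so they stay
distinguishable from real text.

```
;; Hypothetical helper, for illustration only.
(defun my/raw-bytes-after-decode (unibyte)
  "Decode UNIBYTE as UTF-8 and return the number of raw-byte characters."
  (let ((decoded (decode-coding-string unibyte 'utf-8))
        (count 0))
    ;; Count the characters the decoder had to keep as raw bytes.
    (dolist (ch (string-to-list decoded) count)
      (when (eq (char-charset ch) 'eight-bit)
        (setq count (1+ count))))))

;; Well-formed UTF-8 bytes for "café".
(my/raw-bytes-after-decode "caf\303\251")

;; A lone \351 byte (Latin-1 "é"), which is not valid UTF-8.
(my/raw-bytes-after-decode "caf\351")
```

A count above zero for the second call would show the per-sequence
check that the decoding step performs on the input.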