From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#31138: Native json slower than json.el
Date: Sun, 15 Apr 2018 18:19:26 +0300
Message-ID: <838t9o4hvl.fsf@gnu.org>
References: <87sh806xwa.fsf@chapu.is> <834lkf7ely.fsf@gnu.org> <878t9own1p.fsf@chapu.is>
In-reply-to: <878t9own1p.fsf@chapu.is> (message from Sebastien Chapuis on Sun, 15 Apr 2018 16:40:18 +0200)
To: Sebastien Chapuis
Cc: 31138@debbugs.gnu.org

> From: Sebastien Chapuis
> Cc: 31138@debbugs.gnu.org
> Date: Sun, 15 Apr 2018 16:40:18 +0200
>
> > I'm surprised that the slowdown due to the conversion is so large,
> > though. It doesn't feel right, even with a 4MB string.
>
> I've digged a bit to know why it is so slow, and I've found that if I'm
> wrapping `json-parse-string` with a `with-temp-buffer`, it is now way
> faster:
>
> results of benchmark-run with a string of 4043212 characters
> ```
> (with-temp-buffer (json-parse-string str)):
> (0.814315554 1 0.11941178500000005)
>
> (json-parse-string str):
> (11.542233167 1 0.14954429599999997)
>
> (with-temp-buffer (json-read-from-string str)):
> (5.9781185610000005 29 4.967349412000001)
>
> (json-read-from-string str):
> (5.601267 24 4.723292248000001)
> ```

Interesting.

> Any idea why ?

Where did str come from? Did you insert it into the buffer or
something? Could that explain the difference in performance?

More generally, can you post the string you are using for the
benchmarking, and the benchmark code as well? That would make the
discussion less abstract.

> > Yes, it's necessary, because the input string may include raw bytes,
> > which will crash Emacs if not handled properly.
>
> The Jansson documentation guarantee that the strings returned
> from the library are always UTF-8 encoded [1].

You assume that the library has no bugs, yes? Because if it does,
then we might crash Emacs by trusting it so much. Letting invalid
bytes creep into Emacs buffers and strings is a sure recipe for an
eventual crash.

> By knowing that guarantee, is it possible to reconsider the use of
> code_convert_string ?

Since it's already much faster than a Lisp implementation, why would
we want to risk crashing an Emacs session by omitting the decoding?

> Encoding a string to UTF-8 which is already UTF-8 encoded seems
> useless..

It's decoding, not encoding, and the process of decoding examines
every sequence in the byte stream and ensures they are valid UTF-8.
Emacs never trusts any external data to be what the user or Lisp tell
it is; I see no reason why we should make an exception in this
particular case.

Thanks.
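
To make the point about decoding concrete, here is a minimal sketch of
strict UTF-8 decoding acting as a validation pass over untrusted bytes.
It is written in Python purely for illustration and is not how Emacs's
code_convert_string is implemented; the helper name decode_untrusted
and the sample inputs are hypothetical.

```python
def decode_untrusted(raw: bytes) -> str:
    """Decode bytes as strict UTF-8.

    Every byte sequence is examined by the decoder; the first invalid
    one is reported as a ValueError instead of reaching the caller.
    (Hypothetical helper, not part of Emacs or Jansson.)
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        raise ValueError(f"invalid UTF-8 at byte offset {err.start}") from err


if __name__ == "__main__":
    # Well-formed input decodes normally.
    print(decode_untrusted("h\u00e9llo".encode("utf-8")))

    # Input that merely claims to be UTF-8 is rejected at the bad byte.
    try:
        decode_untrusted(b"abc\xff\xfe")
    except ValueError as exc:
        print(exc)
```

The cost is one pass over the input, which is the price of not having
to trust whoever produced the bytes.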