From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#31138: Native json slower than json.el
Date: Wed, 24 Apr 2019 18:55:45 +0300
To: Eli Zaretskii
Cc: sebastien@chapu.is, yyoncho@gmail.com, 31138@debbugs.gnu.org
In-Reply-To: <83h8ao4vl0.fsf@gnu.org>

On 23.04.2019 17:58, Eli Zaretskii wrote:

>> Doing that for buffer text as well might be helpful.
>
> In what use cases would this be helpful? Most cases of decoding text
> in a buffer happen when we read text from files, where we already have
> an internal optimization for plain ASCII files.

HTTP request response buffers? They can be as large as several
megabytes, as we have established in this discussion.

> We could perhaps try
> a similar optimization for UTF-8 instead of just ASCII.

I think so.

> Use cases where we read without decoding and then decode buffer
> contents "by hand" are relatively rare, certainly when the stuff to
> decode is so large that the performance gains will be tangible.

Maybe. We could try and benchmark, though.

>> So that's why I mentioned decode-coding-string (though
>> code_convert_string would be a better choice; or decode_coding_object?),
>> as opposed to creating a new specialized function.
>
> code_convert_string also handles encoding, though.

That's just one more comparison. We could also do that in
decode_coding_object instead, but I'm not sure about the overhead of
the intervening code.

>> What I can understand from our testing, this kind of change improves
>> performance for all kinds of strings when the source encoding is
>> utf_8_unix. Even for large ones (despite you expecting otherwise).
>
> I tested 10K strings, and the advantage there already becomes
> relatively small. 10K characters may be a lot for strings, but it
> isn't for buffers.

That's probably true.

I have tried a similar shortcut, removing the code_convert_string call
in json_encode (which is called once from json-parse-string), and that
did not measurably affect its performance.

But it did increase the performance of json-serialize, by more than 2x,
on the same test data I've been using: from 3.8s to 1.6s for 10
iterations.

Can we do something like that? Removing conversion altogether is
probably not an option, but even using Fstring_as_unibyte instead led
to a significant improvement (2.43s with this approach).

diff --git a/src/json.c b/src/json.c
index 7d6d531427..01682473ca 100644
--- a/src/json.c
+++ b/src/json.c
@@ -266,7 +266,7 @@ json_encode (Lisp_Object string)
 {
   /* FIXME: Raise an error if STRING is not a scalar value sequence.  */
-  return code_convert_string (string, Qutf_8_unix, Qt, true, true, true);
+  return string;
 }

 static AVOID
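
For illustration only, here is a rough sketch of the kind of check the
FIXME in json_encode asks for. Returning the string unchanged would
only be safe if its bytes are already strict UTF-8, and Emacs's internal
representation is a superset of that (raw bytes are stored as overlong
two-byte sequences, and characters beyond Unicode use longer forms).
The function name utf8_scalar_sequence_p is hypothetical and does not
exist in Emacs; this is plain standalone C, not tied to Lisp_Object or
any coding-system machinery, and I have not measured it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Emacs: return true if the N bytes
   at S are strictly valid UTF-8 (no overlong forms, no surrogates,
   nothing above U+10FFFF).  */
static bool
utf8_scalar_sequence_p (const unsigned char *s, size_t n)
{
  size_t i = 0;
  while (i < n)
    {
      unsigned char c = s[i];
      size_t len;
      unsigned long min, cp;

      if (c < 0x80)
        {
          /* ASCII fast path: one byte, always valid.  */
          i++;
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        { len = 2; min = 0x80; cp = c & 0x1F; }
      else if ((c & 0xF0) == 0xE0)
        { len = 3; min = 0x800; cp = c & 0x0F; }
      else if ((c & 0xF8) == 0xF0)
        { len = 4; min = 0x10000; cp = c & 0x07; }
      else
        return false;	/* Stray continuation byte or 5-byte lead.  */

      if (n - i < len)
        return false;	/* Truncated sequence.  */

      for (size_t k = 1; k < len; k++)
        {
          if ((s[i + k] & 0xC0) != 0x80)
            return false;	/* Not a continuation byte.  */
          cp = (cp << 6) | (s[i + k] & 0x3F);
        }

      /* Reject overlong encodings, surrogates and out-of-range values.  */
      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;

      i += len;
    }
  return true;
}
```

A single pass like this costs one scan of the bytes, so whether it
beats the full code_convert_string call would still need to be
benchmarked on the same test data.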