From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Dmitry Gutov Newsgroups: gmane.emacs.bugs Subject: bug#31138: Native json slower than json.el Date: Mon, 22 Apr 2019 18:02:35 +0300 Message-ID: <19b19dec-a5a0-e08d-6026-0b9621d38143@yandex.ru> References: <87sh806xwa.fsf@chapu.is> <83h8bqn2ik.fsf@gnu.org> <83zhphliil.fsf@gnu.org> <181b93a3-3861-0481-1b95-8344410d1049@yandex.ru> <83r2a2hdxn.fsf@gnu.org> <21f68973-a684-2a65-82eb-c8f3df90127f@yandex.ru> <83d0lmgez2.fsf@gnu.org> <7d503be9-4d85-3d0b-6829-631ad376ba3d@yandex.ru> <831s22gcci.fsf@gnu.org> <83y349gasn.fsf@gnu.org> <83d0lfag4x.fsf@gnu.org> <5cf45a21-65c3-67ee-f123-be83a6ee7c99@yandex.ru> <83a7gjaen6.fsf@gnu.org> <10ca4e2f-b116-16bc-c81e-24036752c867@yandex.ru> <83lg026xxb.fsf@gnu.org> <0d42dab4-6c5c-be3a-d402-f17b39e7fc3c@yandex.ru> <83k1fm6vly.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="26432"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 Cc: p.stephani2@gmail.com, sebastien@chapu.is, yyoncho@gmail.com, 31138@debbugs.gnu.org To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Mon Apr 22 17:08:38 2019 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1hIaYg-0006kS-4b for geb-bug-gnu-emacs@m.gmane.org; Mon, 22 Apr 2019 17:08:38 +0200 Original-Received: from localhost ([127.0.0.1]:38599 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hIaYe-0000Vp-Nz for geb-bug-gnu-emacs@m.gmane.org; Mon, 22 Apr 2019 11:08:36 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:52245) 
by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hIaTI-0004gd-Do for bug-gnu-emacs@gnu.org; Mon, 22 Apr 2019 11:03:05 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hIaTH-0005P4-5d for bug-gnu-emacs@gnu.org; Mon, 22 Apr 2019 11:03:04 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:37702) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hIaTH-0005OK-2Q for bug-gnu-emacs@gnu.org; Mon, 22 Apr 2019 11:03:03 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1hIaTG-0001E2-I6 for bug-gnu-emacs@gnu.org; Mon, 22 Apr 2019 11:03:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Dmitry Gutov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 22 Apr 2019 15:03:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 31138 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: moreinfo Original-Received: via spool by 31138-submit@debbugs.gnu.org id=B31138.15559453684689 (code B ref 31138); Mon, 22 Apr 2019 15:03:02 +0000 Original-Received: (at 31138) by debbugs.gnu.org; 22 Apr 2019 15:02:48 +0000 Original-Received: from localhost ([127.0.0.1]:51246 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hIaT2-0001DY-8X for submit@debbugs.gnu.org; Mon, 22 Apr 2019 11:02:48 -0400 Original-Received: from mail-wr1-f66.google.com ([209.85.221.66]:33419) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hIaSz-0001DI-Td for 31138@debbugs.gnu.org; Mon, 22 Apr 2019 11:02:47 -0400 Original-Received: by mail-wr1-f66.google.com with SMTP id a3so6499875wrx.0 for <31138@debbugs.gnu.org>; Mon, 22 Apr 2019 08:02:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:subject:to:cc:references:from:message-id:date:user-agent 
:mime-version:in-reply-to:content-language:content-transfer-encoding; bh=Ux1VD7S7y3PpihAdbhKGtW/Vi8uGcW04El3Y+oXJzg8=; b=svDLcDogA2/3Dl711bGix54zYFAjQxxT+1XWjZ50wUHCkSpkbJ4yIkikBbIKG4FmH4 q3j4L6pgXTAVakjyEjhVs+QOCNN9kcisd+9jxJpIbZoSbsRVuai0r/81LeDpd+8XFKw1 239WgGcP0zz6gUzKdApTwD2cvFiykkS+gBekuYj/4ClmJpI5VcN7Jg3DwYfqcyX7lf4P pneyNHNYzFh/DAL4z5UsaqPWOIToylZx6CRmVKFzFAac5SNwp0nExkjnr49g4CewDijr OdgpKG7E++XGOUNBGd4IZcdcfd4ToATTsXoY9pccGsxBWOasbU2PXx/KagQYOweuH8QN bwzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:subject:to:cc:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=Ux1VD7S7y3PpihAdbhKGtW/Vi8uGcW04El3Y+oXJzg8=; b=bSLOXjFtqt1gVCuy3lDR6cdPhqs9hRdn9IyR2JcOuhmn1hSS1DSkm/IQTzc5yao2SU E4G/9sHGaaqu/SuRffnGNGbA2TQ9ueXn9Kc4XsAhODJw5ubXWgu0pQM1c6CxtiqiB9TU W89vnfLEicBP/wu6SXNxb5ktlOBvVt455UyAWYhFqkMWEXDkwzsN2xhqvOMKNytDOZHO M3IiCPEeA3IWBaHwEqT8ACYWNTbmmhafTi2W87/H/opsjU9Mno0S8d3nxgLlS/6gEre9 MTwvfLzquCOhpNt6/cfSXzUWj/cm/HP0Zl6JKWxeDRQoUVpmoHENGlEewHdW7xHCJri+ K5PA== X-Gm-Message-State: APjAAAV6z8+jCzBnwVGhMa/eewZWbimsPfXnEixWX9ugwyMwhyjzoy77 K9IoEwxfkPegz+PH8BRHL/9/d/s5 X-Google-Smtp-Source: APXvYqy63+f5Kcy4zf8eXIO3QJW8Fdhvm1brsa6Eu4DQBTpWYDEUghoMFm/AAcVuYNqTE3mQvvhi4A== X-Received: by 2002:adf:e547:: with SMTP id z7mr12802394wrm.295.1555945359433; Mon, 22 Apr 2019 08:02:39 -0700 (PDT) Original-Received: from [192.168.0.195] ([109.110.245.170]) by smtp.googlemail.com with ESMTPSA id 11sm9791962wmk.17.2019.04.22.08.02.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 Apr 2019 08:02:37 -0700 (PDT) In-Reply-To: <83k1fm6vly.fsf@gnu.org> Content-Language: en-US X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.51.188.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for 
GNU Emacs, the Swiss army knife of text editors"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Original-Sender: "bug-gnu-emacs"
Xref: news.gmane.org gmane.emacs.bugs:158045
Archived-At:

On 22.04.2019 16:02, Eli Zaretskii wrote:

>> Thank you. Tried it, tests now pass, and the performance improvement is
>> the same. I compared the same benchmark (100 iterations, GC disabled for
>> the whole duration), and this patch takes
>>
>> src/emacs -Q --batch -l ~/examples/elisp/json-test.el
>> Elapsed time: 51.153870s
>>
>> down to
>>
>> $ src/emacs -Q --batch -l ~/examples/elisp/json-test.el
>> Elapsed time: 26.268435s
>>
>> Are you still against it? (Just checking).
>
> I'm still against using that patch as is, yes. I'm okay with using
> make_specified_string if before calling it we make sure the string is
> plain ASCII or a series of proper UTF-8 sequences. Not sure how much
> of a performance hit would such tests cost us, but if you are
> interested, let's time them.

Sure.

> (Let me know if you need help in writing the code for the above 2
> tests. I think parse_str_as_multibyte should help a lot.)

I do. At the very least: am I supposed to use parse_str_as_multibyte
similarly to how make_string does, or to write a function similar to
parse_str_as_multibyte?

I can more or less follow its logic, but I don't understand if any of
its callees cannot cope with improper input.

> I guess we should also have some test case with non-ASCII characters,
> if we will introduce these optimizations.

We already do in test/src/json-tests.el, like I previously mentioned.
And the simple patch (which you're against) passes them.

I've put the patch at the end of this email so we're on the same page.

> Thanks.
>
> P.S. I'm still not sure these optimizations will make the OP happy,
> since at some point I heard them saying that our present performance
> is abysmally slow and inadequate.

Well...
IIUC Node.js's JSON parsing is ~10 times as fast. Ruby's parser speeds
vary from ~9 times as fast to ~3 times as fast, for comparison.

For LSP usage, we are of course comparing with Node. But since we're
still here, and lsp-mode has some users, reaching Node's performance
level is likely not a life-or-death situation.

> If that wasn't a wild exaggeration,
> then halving the time will still be inadequate. So maybe we should
> agree in advance whether 30% to 50% improvement will be "good enough",
> before we embark on this adventure.

If we're talking about big changes and increases in complexity, sure, we
should weigh them. But if a simple change gives us even a 20-30%
improvement, why not take it? The reporter is not the only one who
parses JSON in Emacs.

Speaking of bigger improvements... The patch below passes the existing
tests, so we have at least established that the contents of the C
strings that libjansson returns and our "decoded" strings are very often
exactly the same. So most of the time what code_convert_string does is
not really conversion, but in effect verification. I'm betting it's a
frequent situation in other use cases, too.

So one optimization (more complex to implement, I'm sure) would be to
defer creating coding->dst_object inside decode_coding_object until
we're sure we need it (that is, until the source and destination bytes
actually come out different), and if we never do, return src_object in
the end (I'm only talking about the case when dst_object is Qt).

That might improve performance across the board, including during the
encoding step. Or might not, of course. What do you think?
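
For reference, the kind of check discussed above (making sure the bytes
are plain ASCII or proper UTF-8 sequences before handing them to
make_specified_string) could be sketched in standalone C roughly as
follows. This is only an illustration under assumptions: utf8_valid_p is
a hypothetical name, it is not existing Emacs code, and it does not use
parse_str_as_multibyte. It applies the strict Unicode rules (no overlong
forms, no surrogates, nothing above U+10FFFF); whether Emacs would want
exactly these rules is left open here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Emacs: return true if the SIZE bytes
   at DATA are plain ASCII or strictly well-formed UTF-8.  */
static bool
utf8_valid_p (const char *data, size_t size)
{
  const unsigned char *p = (const unsigned char *) data;
  const unsigned char *end = p + size;

  while (p < end)
    {
      unsigned char c = *p++;
      int trail;         /* number of continuation bytes expected */
      unsigned int min;  /* smallest code point allowed at this length */
      unsigned int cp;   /* code point being decoded */

      if (c < 0x80)
        continue;        /* ASCII fast path */
      else if (c >= 0xC2 && c <= 0xDF)
        { trail = 1; min = 0x80; cp = c & 0x1F; }
      else if ((c & 0xF0) == 0xE0)
        { trail = 2; min = 0x800; cp = c & 0x0F; }
      else if (c >= 0xF0 && c <= 0xF4)
        { trail = 3; min = 0x10000; cp = c & 0x07; }
      else
        return false;    /* stray continuation or invalid lead byte */

      if (end - p < trail)
        return false;    /* sequence truncated by end of input */

      for (; trail > 0; trail--)
        {
          unsigned char t = *p++;
          if ((t & 0xC0) != 0x80)
            return false;  /* not a continuation byte */
          cp = (cp << 6) | (t & 0x3F);
        }

      /* Reject overlong forms, UTF-16 surrogates, and out-of-range values.  */
      if (cp < min || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return false;
    }
  return true;
}
```

A caller along the lines of json_make_string could, in principle, take
the fast path when such a check succeeds and fall back to
code_convert_string otherwise. The cost of the extra pass over the bytes
would still need to be timed against the benchmark above.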
diff --git a/src/json.c b/src/json.c
index 928825e034..2b0cc8a313 100644
--- a/src/json.c
+++ b/src/json.c
@@ -225,8 +225,7 @@ json_has_suffix (const char *string, const char *suffix)
 static Lisp_Object
 json_make_string (const char *data, ptrdiff_t size)
 {
-  return code_convert_string (make_specified_string (data, -1, size, false),
-                              Qutf_8_unix, Qt, false, true, true);
+  return make_specified_string (data, -1, size, false);
 }
 
 /* Create a multibyte Lisp string from the NUL-terminated UTF-8