From: Herman, Géza
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Implement fast verisons of json-parse functions
Date: Sat, 30 Mar 2024 11:50:19 +0100
Message-ID: <871q7snffr.fsf@gmail.com>
In-reply-to: <865xx4dv0g.fsf@gnu.org>
To: Eli Zaretskii
Cc: Géza Herman, emacs-devel@gnu.org

Eli Zaretskii writes:

> Thanks. I installed this on the master branch, after adding the
> required commit log messages and some cleanup of unused functions.

Thanks!

> However:
>
>  . 3 tests in test/src/json-tests.el are now failing, where they
>    succeeded before; see the log at the end
>  . the times of the relevant tests don't seem to be faster than
>    the libjansson version, perhaps because this is an unoptimized
>    build

3 test failures:

1. Handling of utf-8 decode errors: the new parser emits
   json-utf8-decode-error instead of json-parse-error (which is what
   the test expects). I can fix this by modifying the test.

2. Handling of a single \0 byte: the new parser emits
   json-end-of-file. I think this is not the best error kind for
   this case, so I'll modify the parser to emit json-parse-error
   instead.
   This is still different from what the test expects
   (wrong-type-argument), but I think there is no reason to treat
   zero bytes specially. Considering the JSON spec, it's the same
   error as any other unexpected byte value.

3. Handling objects with duplicate keys. That's an interesting one.
   With alist/plist objects, the old parser removed duplicate
   members, but the new parser doesn't remove such members, it keeps
   them all. The JSON spec doesn't really say anything about this
   case, so I think we're free to do anything we like. Mattias
   Engdegård had an interesting idea: what if we put alist/plist
   members in reversed order? This way, if one uses assq/plist-get
   to get values by keys, the behavior will be consistent with the
   hash table representation (which keeps the last value of
   duplicate keys). I like the idea of consistency, but I don't like
   that the elements will become reversed after parsing. I had the
   idea that if the hash table kept the first value of duplicate
   keys, then we'd also have consistency. What do you think?

Regarding performance: the new parser becomes significantly faster
only on larger JSON inputs. And yes, an unoptimized build also has
an impact on performance.
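
To make the duplicate-key options from item 3 concrete, here is a small sketch in plain Emacs Lisp. It does not call the parser at all: `my-demo-members` and `my-demo-hash` are hypothetical names made up for this illustration, and the sketch simplifies by using symbol keys with an `eq` hash table on both sides. It only shows how `assq` and a hash table would treat the members of {"a": 1, "a": 2} under each proposal.

```elisp
;; Hypothetical example data: the members of {"a": 1, "a": 2} in
;; document order, as an alist that keeps duplicates.
(defvar my-demo-members '((a . 1) (a . 2))
  "Illustrative alist with a duplicate key; not produced by the parser.")

(defun my-demo-hash (members keep)
  "Build a hash table from MEMBERS, an alist in document order.
KEEP is `last' (later duplicates overwrite earlier ones) or `first'
(later duplicates are ignored).  Hypothetical helper for illustration."
  (let ((table (make-hash-table :test #'eq))
        (missing (make-symbol "missing"))) ; unique \"not found\" marker
    (dolist (pair members table)
      (when (or (eq keep 'last)
                (eq (gethash (car pair) table missing) missing))
        (puthash (car pair) (cdr pair) table)))))

;; assq returns the first matching cell, so document order gives the
;; first value ...
(cdr (assq 'a my-demo-members))                     ; => 1
;; ... while a reversed alist gives the last value, which matches a
;; hash table where the last duplicate wins.
(cdr (assq 'a (reverse my-demo-members)))           ; => 2
(gethash 'a (my-demo-hash my-demo-members 'last))   ; => 2
;; A hash table where the first duplicate wins matches the alist in
;; document order instead.
(gethash 'a (my-demo-hash my-demo-members 'first))  ; => 1
```

Either pairing (reversed alist with last-wins table, or document-order alist with first-wins table) makes lookups agree across representations; the difference is only whether the alist order or the hash table policy changes.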