From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Re: JSON/YAML/TOML/etc. parsing performance Date: Sun, 01 Oct 2017 21:06:13 +0300 Message-ID: <83r2um3fqi.fsf@gnu.org> References: <87poaqhc63.fsf@lifelogs.com> <8360ceh5f1.fsf@gnu.org> <83h8vl5lf9.fsf@gnu.org> Reply-To: Eli Zaretskii NNTP-Posting-Host: blaine.gmane.org X-Trace: blaine.gmane.org 1506881200 26369 195.159.176.226 (1 Oct 2017 18:06:40 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Sun, 1 Oct 2017 18:06:40 +0000 (UTC) Cc: emacs-devel@gnu.org To: Philipp Stephani Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Sun Oct 01 20:06:35 2017 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dyidN-0005j0-CB for ged-emacs-devel@m.gmane.org; Sun, 01 Oct 2017 20:06:33 +0200 Original-Received: from localhost ([::1]:49393 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dyidO-0004Yx-7z for ged-emacs-devel@m.gmane.org; Sun, 01 Oct 2017 14:06:34 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:56385) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dyidH-0004YN-KR for emacs-devel@gnu.org; Sun, 01 Oct 2017 14:06:28 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dyidD-0003Wv-9n for emacs-devel@gnu.org; Sun, 01 Oct 2017 14:06:27 -0400 Original-Received: from fencepost.gnu.org ([2001:4830:134:3::e]:42773) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dyidD-0003Wq-7O; Sun, 01 Oct 2017 14:06:23 -0400 Original-Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1086 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1dyidB-0003Nj-Mh; Sun, 01 Oct 2017 14:06:23 -0400 In-reply-to: (message from Philipp Stephani on Sat, 30 Sep 2017 22:02:55 +0000) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:218990 Archived-At: > From: Philipp Stephani > Date: Sat, 30 Sep 2017 22:02:55 +0000 > Cc: emacs-devel@gnu.org > > Subject: [PATCH] Implement native JSON support using Jansson Thanks, a few more comments/questions. > +#if __has_attribute (warn_unused_result) > +# define ATTRIBUTE_WARN_UNUSED_RESULT __attribute__ ((__warn_unused_result__)) > +#else > +# define ATTRIBUTE_WARN_UNUSED_RESULT > +#endif Hmm... why do we need this attribute? You use it with 2 static functions, so this sounds like a left-over from the development stage? > +static Lisp_Object > +internal_catch_all_1 (Lisp_Object (*function) (void *), void *argument) Can you tell why you needed this (and the similar internal_catch_all)? Is that only because the callbacks could signal an error, or is there another reason? If the former, I'd prefer to simplify the code and its maintenance by treating the error condition in a less drastic manner, and avoiding the call to xsignal. > +static _GL_ARG_NONNULL ((2)) Lisp_Object > +lisp_to_json_toplevel_1 (Lisp_Object lisp, json_t **json) > +{ > + if (VECTORP (lisp)) > + { > + ptrdiff_t size = ASIZE (lisp); > + eassert (size >= 0); > + if (size > SIZE_MAX) > + xsignal1 (Qoverflow_error, build_string ("vector is too long")); I think this error text is too vague. Can we come up with something that describes the problem more accurately? And btw, how can size be greater than SIZE_MAX in this case? This is a valid Lisp object, isn't it? (There are more such tests in the patch, e.g. in lisp_to_json, and I think they, too, are redundant.) > + *json = json_check (json_array ()); > + ptrdiff_t count = SPECPDL_INDEX (); > + record_unwind_protect_ptr (json_release_object, json); > + for (ptrdiff_t i = 0; i < size; ++i) > + { > + int status > + = json_array_append_new (*json, lisp_to_json (AREF (lisp, i))); > + if (status == -1) > + json_out_of_memory (); > + eassert (status == 0); > + } > + eassert (json_array_size (*json) == size); > + clear_unwind_protect (count); > + return unbind_to (count, Qnil); This, too, sounds more complex than it should: you record unwind-protect just so lisp_to_json's subroutines could signal an error due to insufficient memory, right? Why can't we have the out-of-memory check only inside this loop, which you already do, and avoid the checks on lower levels (which undoubtedly cost us extra cycles)? What do those extra checks in json_check buy us? the errors they signal are no more informative than the one in the loop, AFAICT. > +static Lisp_Object > +json_insert (void *data) > +{ > + const struct json_buffer_and_size *buffer_and_size = data; > + if (buffer_and_size->size > PTRDIFF_MAX) > + xsignal1 (Qoverflow_error, build_string ("buffer too large")); > + insert (buffer_and_size->buffer, buffer_and_size->size); I don't think we need this test here, as 'insert' already has the equivalent test in one of its subroutines. > + case JSON_INTEGER: > + { > + json_int_t value = json_integer_value (json); > + if (FIXNUM_OVERFLOW_P (value)) > + xsignal1 (Qoverflow_error, > + build_string ("JSON integer is too large")); > + return make_number (value); This overflow test is also redundant, as make_number already does it. > + case JSON_STRING: > + { > + size_t size = json_string_length (json); > + if (FIXNUM_OVERFLOW_P (size)) > + xsignal1 (Qoverflow_error, build_string ("JSON string is too long")); > + return json_make_string (json_string_value (json), size); Once again, the overflow test is redundant, as make_specified_string (called by json_make_string) already includes an equivalent test. > + case JSON_ARRAY: > + { > + if (++lisp_eval_depth > max_lisp_eval_depth) > + xsignal0 (Qjson_object_too_deep); > + size_t size = json_array_size (json); > + if (FIXNUM_OVERFLOW_P (size)) > + xsignal1 (Qoverflow_error, build_string ("JSON array is too long")); > + Lisp_Object result = Fmake_vector (make_natnum (size), Qunbound); Likewise here: Fmake_vector makes sure the size is not larger than allowed. > + case JSON_OBJECT: > + { > + if (++lisp_eval_depth > max_lisp_eval_depth) > + xsignal0 (Qjson_object_too_deep); > + size_t size = json_object_size (json); > + if (FIXNUM_OVERFLOW_P (size)) > + xsignal1 (Qoverflow_error, > + build_string ("JSON object has too many elements")); > + Lisp_Object result = CALLN (Fmake_hash_table, QCtest, Qequal, > + QCsize, make_natnum (size)); Likewise here: make_natnum does the equivalent test. > + /* Adjust point by how much we just read. Do this here because > + tokener->char_offset becomes incorrect below. */ > + bool overflow = INT_ADD_WRAPV (point, error.position, &point); > + eassert (!overflow); > + eassert (point <= ZV_BYTE); > + SET_PT_BOTH (BYTE_TO_CHAR (point), point); It's better to use SET_PT here, I think. > + define_error (Qjson_out_of_memory, "no free memory for creating JSON object", I'd prefer "not enough memory for creating JSON object". Thanks again for working on this.