From: "Mattias Engdegård" <mattias.engdegard@gmail.com>
To: "\"Herman, Géza\"" <geza.herman@gmail.com>
Cc: Eli Zaretskii <eliz@gnu.org>,
	Philip Kaludercic <philipk@posteo.net>,
	wellons@nullprogram.com, emacs-devel@gnu.org
Subject: Re: I created a faster JSON parser
Date: Tue, 12 Mar 2024 10:26:36 +0100
Message-ID: <437D901F-CEC6-45E0-8ABE-B036A7B0AAF5@gmail.com>
In-Reply-To: <87r0ggdcki.fsf@gmail.com>

On 11 March 2024 at 15:35, Herman, Géza <geza.herman@gmail.com> wrote:

> According to https://github.com/miloyip/nativejson-benchmark, RapidJSON is at least 10x faster than jansson.  I mention this because Emacs doesn't have to stick with my parser; there are possible alternatives, which have JSON serializers as well.

Thanks for the benchmark page reference. Yes, if this turns out to matter more we may consider a faster library. Right now I think your efforts are good enough (at least if we finish the job with a JSON serialiser).

> Yep, the formatting of that table got destroyed when I reformatted the code into GNU style.  I have now restored the table's formatting and added comments for each row/col.  Here's the latest version: https://github.com/geza-herman/emacs/commit/4b5895636c1ec06e630baf47881b246c198af056.patch

Much better, thank you.

>> * Do you really need to maintain line and column during the parse? If
>> you want them for error reporting, you can materialise them from the
>> offset that you already have.
> 
> Yeah, I thought of that, but it turned out that maintaining the line/column doesn't have an impact on performance.

That's just because your code isn't fast enough! We are very disappointed. Very.

> I added that easily, though admittedly it's a little bit awkward to maintain these variables.  If Emacs has a way to tell the line/col position from the byte pointer (both for strings and buffers), I am happy to use that instead.

Since error handling isn't performance-critical it doesn't matter if it's a bit slow. (I'd just count newlines.)
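
A minimal sketch of the newline-counting idea in standalone C. The names `line_col_from_offset` and `struct line_col` are made up for illustration and come from neither Emacs nor the patch. The column is a zero-based byte count, so multibyte characters would need extra handling:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: 1-based line, 0-based byte column.  */
struct line_col
{
  size_t line;
  size_t column;
};

/* Recompute the position of OFFSET within INPUT (LEN bytes long) by
   scanning from the start.  This is O(offset), which is acceptable
   because it only runs when an error is reported.  */
static struct line_col
line_col_from_offset (const char *input, size_t len, size_t offset)
{
  assert (offset <= len);
  struct line_col pos = { 1, 0 };
  for (size_t i = 0; i < offset; i++)
    {
      if (input[i] == '\n')
        {
          /* A newline starts a new line and resets the column.  */
          pos.line++;
          pos.column = 0;
        }
      else
        pos.column++;
    }
  return pos;
}
```

The parser would then keep only the byte offset in its hot loop and call something like this from the error path.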

>> * Are you sure that GC can't run during parsing or that all your Lisp
>> objects are reachable directly from the stack? (It's the
>> `object_workspace` in particular that's worrying me a bit.)
> 
> That's a very good question.  I suppose that object_workspace is invisible to the Lisp VM, as it is just a malloc'd object.  But I've never seen a problem because of this.  What triggers the GC? Is it possible that GC never gets triggered for the duration of the whole parse?  Otherwise it should have collected the objects in object_workspace, causing problems.  (I tried this parser in a loop where GC is triggered hundreds of times.  In the loop, I compared the result to json-read, and everything was fine.)

You can't test that code is GC-safe, you have to show that it's correct by design.

Looking at the code, it is quite possible that GC cannot take place. But the parser can signal errors, and getting into the debugger should open GC windows unless I'm mistaken.

There are some options. `record_unwind_protect_ptr_mark` would be one, and it was made for code like this, but Gerd has been grumbling about it lately. Perhaps it's easier just to disable GC in the dynamic scope (`inhibit_garbage_collection`).



