From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#70007: [PATCH] native JSON encoder
Date: Wed, 27 Mar 2024 21:05:54 +0200
Message-ID: <8634sbijfx.fsf@gnu.org>
In-Reply-To: <291DD5F1-85B8-4647-A40A-EBBD4C51E253@gmail.com>
To: Mattias Engdegård
Cc: casouri@gmail.com, 70007@debbugs.gnu.org

> From: Mattias Engdegård
> Date: Wed, 27 Mar 2024 19:57:24 +0100
> Cc: Yuan Fu,
>  70007@debbugs.gnu.org
>
> Eli, thank you for your comments!

Thanks for working on this in the first place.

> > This rejects unibyte non-ASCII strings, AFAIU, in which case I suggest
> > to think whether we really want that.  E.g., why is it wrong to encode
> > a string to UTF-8, and then send it to JSON?
>
> The way I see it, that would break the JSON abstraction: it transports
> strings of Unicode characters, not strings of bytes.

What's the difference?  AFAIU, JSON expects UTF-8 encoded strings, and
whether those are treated as a sequence of bytes or a sequence of
characters is in the eye of the beholder: the bytestream is the same,
only the interpretation changes.  So I'm not sure I understand how
this would break the assumption.

> A user who for some reason has a string of bytes that encode Unicode
> characters can just decode it in order to prove it to us.  It's not
> the JSON encoder's job to decode the user's strings.

I didn't suggest decoding the input string, not at all.  I suggested
allowing unibyte strings and processing them just like you process
pure-ASCII strings, leaving it to the caller to make sure the string
has only valid UTF-8 sequences.  Forcing callers to decode such
strings is IMO too harsh and largely unjustified.

> (It would also be a pain to deal with and risks slowing down the
> string serialiser even if it's a case that never happens.)

I don't understand why.  Once again, I'm just talking about passing
the bytes through as you do with ASCII characters.
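
To make the pass-through idea concrete, here is a minimal sketch in C.
It is not the code from the patch: json_escape_bytes and its
parameters are hypothetical names used only for illustration, and the
sketch assumes, as suggested above, that the caller guarantees the
input bytes are valid UTF-8.  It escapes only the quote, the
backslash, and control characters, and copies every other byte
unchanged, so bytes >= 0x80 take the same path as ASCII.

```c
#include <stdio.h>

/* Hypothetical helper (an assumption for illustration, not the
   serializer from the patch).  Write the JSON string literal for the
   LEN bytes at SRC into DST, whose capacity is CAP bytes.  The caller
   is assumed to guarantee that SRC holds valid UTF-8; no validation
   or decoding is done here.  Return the number of bytes written
   (excluding the terminating NUL), or -1 if DST is too small.  */
static int
json_escape_bytes (const unsigned char *src, size_t len,
                   char *dst, size_t cap)
{
  size_t o = 0;

  if (cap < 3)                  /* room for "" and the NUL */
    return -1;
  dst[o++] = '"';

  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = src[i];

      /* Worst case per input byte is 6 output bytes (\u00XX); also
         keep room for the closing quote and the NUL.  */
      if (o + 8 > cap)
        return -1;

      if (c == '"' || c == '\\')
        {
          dst[o++] = '\\';
          dst[o++] = (char) c;
        }
      else if (c < 0x20)
        o += (size_t) snprintf (dst + o, 7, "\\u%04x", (unsigned) c);
      else
        /* ASCII and bytes >= 0x80 alike are copied unchanged.  */
        dst[o++] = (char) c;
    }

  dst[o++] = '"';
  dst[o] = '\0';
  return (int) o;
}
```

In this sketch the only per-byte work is the comparison against the
three escape cases, so accepting non-ASCII bytes adds no separate
branch to the loop.  Whether the real serializer can be structured
this way is for the patch author to judge.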