From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Improve error reporting when serializing non-Unicode strings to JSON
Date: Sat, 23 Dec 2017 16:52:16 +0200
Message-ID: <83shc1jy3j.fsf@gnu.org>
References: <20171222210031.30811-1-phst@google.com> <83efnllufm.fsf@gnu.org> <83wp1dk18g.fsf@gnu.org>
In-reply-to: (message from Philipp Stephani on Sat, 23 Dec 2017 14:29:56 +0000)
To: Philipp Stephani
Cc: phst@google.com, emacs-devel@gnu.org

> From: Philipp Stephani
> Date: Sat, 23 Dec 2017 14:29:56 +0000
> Cc: emacs-devel@gnu.org, phst@google.com
>
> OK, but why do we need external functions for doing that? What is
> missing in our own code to detect such a situation?
>
> Not much I think, it's just easiest to use Gnulib functions because
> they are well-documented, have a clean interface, and are probably
> bug-free.
> coding.c has check_utf_8, which is quite similar, but has an
> incompatible interface (it takes struct coding_system objects) and
> also checks for embedded newlines, which isn't necessary here.

So let's use check_utf_8, as its downsides don't sound serious to me,
and OTOH using unistring functions will bloat Emacs for the benefit of
a single use case, not to mention create two different methods for
doing the same job, which IMO is even more confusing to any newcomer
to the Emacs internals.

Btw, doesn't find_charsets_in_text do the same job cleaner and
quicker? AFAIU, all you need is make sure there are no characters
from the 2 eight-bit-* charsets in the text, or did I miss something?