From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Zhiwei Chen
Newsgroups: gmane.emacs.bugs
Subject: bug#50391: 28.0.50; json-read non-ascii data results in malformed string
Date: Sun, 05 Sep 2021 12:19:56 +0800
Message-ID: <877dfvd577.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
To: 50391@debbugs.gnu.org
Xref: news.gmane.io gmane.emacs.bugs:213450

When fetching JSON from Youdao (a dictionary service in China):

#+begin_src elisp
(url-retrieve
 "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
 (lambda (_status)
   ;; Move past the HTTP headers, then save the raw response body.
   (goto-char (1+ url-http-end-of-headers))
   (write-region (point) (point-max) "/tmp/acc1.json")))
#+end_src

Then C-x C-f "/tmp/acc1.json": the file is correctly encoded.

But if the response goes through `json-read' and then `json-insert', the
resulting file is malformed, even though uchardet reports the encoding of
the file as utf-8.

#+begin_src elisp
(url-retrieve
 "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
 (lambda (_status)
   (goto-char (1+ url-http-end-of-headers))
   ;; Parse the response body, then serialize it again into a temp buffer.
   (let ((j (json-read)))
     (with-temp-buffer
       (json-insert j)
       (write-region (point-min) (point-max) "/tmp/acc2.json")))))
#+end_src

#+begin_src shell
diff -u <(hexdump -C /tmp/acc1.json | head -n10) <(hexdump -C /tmp/acc2.json | head -n10) | diff-so-fancy
#+end_src

Screenshot: https://pb.nichi.co/jazz-estate-brave

The diff shows that the first word "累积" is encoded incorrectly in
"/tmp/acc2.json" (it uses `c3 a7 c2 b4 c2 af').

Actually, the output of

#+begin_src shell
echo -n "累积" | hexdump -C
#+end_src

should be `e7 b4 af e7 a7 af' in utf-8, where "累" is represented by
`e7 b4 af' and "积" is represented by `e7 a7 af'.

The environment variable LANG is `en_US.UTF-8'; everything was tested in
`emacs -Q'.

-- 
Zhiwei Chen
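
One way to read the bytes `c3 a7 c2 b4 c2 af': they are exactly what results when each byte of the correct sequence `e7 b4 af' is treated as a separate ISO-8859-1 character and then encoded as utf-8 again. The sketch below only reproduces that byte pattern outside Emacs; it does not show where such a conversion happens inside Emacs, which remains an assumption. The helper name `double_encode_hex' is made up for illustration, and the sketch assumes `iconv', `od', `tr' and `sed' are available.

```shell
# Hypothetical helper: reinterpret stdin as ISO-8859-1, re-encode it as UTF-8,
# and print the resulting bytes as space-separated hex.
double_encode_hex() {
  iconv -f ISO-8859-1 -t UTF-8 \
    | od -An -v -tx1 \
    | tr -s ' \n' ' ' \
    | sed 's/^ //; s/ $//'
}

# The correct UTF-8 bytes e7 b4 af, written as octal escapes for portability.
printf '\347\264\257' | double_encode_hex
```

This prints the same six bytes that the hexdump of "/tmp/acc2.json" shows for the first character, which is consistent with the response body having been handled as raw bytes rather than as decoded utf-8 text somewhere between `json-read' and `json-insert'.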