From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Philipp Stephani Newsgroups: gmane.emacs.bugs Subject: bug#24784: 26.0.50; JSON strings with utf-16 escape codes Date: Mon, 24 Oct 2016 19:57:19 +0000 Message-ID: References: NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary=001a114e2a8e5983f6053fa1cd21 X-Trace: blaine.gmane.org 1477341280 20901 195.159.176.226 (24 Oct 2016 20:34:40 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Mon, 24 Oct 2016 20:34:40 +0000 (UTC) To: Helmut Eller , 24784@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Mon Oct 24 22:34:36 2016 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bylwo-0002vl-J6 for geb-bug-gnu-emacs@m.gmane.org; Mon, 24 Oct 2016 22:34:18 +0200 Original-Received: from localhost ([::1]:49905 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bylwq-0007Fk-Vq for geb-bug-gnu-emacs@m.gmane.org; Mon, 24 Oct 2016 16:34:21 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:39509) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bylNl-0002cG-Oo for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 15:58:07 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bylNi-0004ZD-Mu for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 15:58:05 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:39489) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1bylNi-0004Z2-JB for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 15:58:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1bylNi-0007uH-7A for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 15:58:02 -0400 X-Loop: 
help-debbugs@gnu.org Resent-From: Philipp Stephani Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 24 Oct 2016 19:58:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 24784 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 24784-submit@debbugs.gnu.org id=B24784.147733905930365 (code B ref 24784); Mon, 24 Oct 2016 19:58:02 +0000 Original-Received: (at 24784) by debbugs.gnu.org; 24 Oct 2016 19:57:39 +0000 Original-Received: from localhost ([127.0.0.1]:54888 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bylNK-0007th-Tu for submit@debbugs.gnu.org; Mon, 24 Oct 2016 15:57:39 -0400 Original-Received: from mail-wm0-f46.google.com ([74.125.82.46]:32782) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bylNJ-0007tT-5l for 24784@debbugs.gnu.org; Mon, 24 Oct 2016 15:57:37 -0400 Original-Received: by mail-wm0-f46.google.com with SMTP id c78so19382795wme.0 for <24784@debbugs.gnu.org>; Mon, 24 Oct 2016 12:57:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=tiUMmn7r/VDVciX6UtB2EFj1/kTDsArEALupvQjFS40=; b=fVvbBbfus/TvQizko2U/QmqqDAjdb3NX13Bh0cCDjnjaRxGT/iJ9ZXQpD7rCG7whTB Xg/Wb/VzZdaUF4oeqitfWwgGA4It9uuR8EMOLKShYw97tcZz+4lcrXF9Aesjg9vxTPXF PxICy61QlUT7lJgFelXz02FZ0UMqQjdgluo1/nDashpnOAZKlq6hQ6s7G4gapc1P4Mhd S+JNzoCycAZ69jcOqn0CJRL6wAVGJg9lIkIo8/BHr4u1d3cJQbWAefckADAqIUc39eri O4sHOK1QM3FYW1ZhPHsrvjbp2rGdCAPS7hBsfzuBt+vNg3HGIZ7YGYdb5QN4FjO4Ne3Y O4CQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=tiUMmn7r/VDVciX6UtB2EFj1/kTDsArEALupvQjFS40=; b=G7G2UEmJTu9MhW77x+MqI1SGTXP+XqwxbJnrRxuHLIIyvMFD/qvSspe7cOXWBsdjEd shpjQqX1AZX2ZODJr6Xa0enB0MZ1+JrnUipF8dizbettJuT3cEcQghbBxY/8SwZ1y+xv 
gUrCKt/3r0iz4DxXSSMgi+NDEQRstYSHeS4hxntq0kBIZcyni2ScSPFqQ5WP8BGIy2CP DGHUWUHYYgQp78Y+T9Kqt6StD7I0fmShrcOTlCv3DDGIZI4BsmAO08FT3wX/WDempQKr KzAjAL+BYgFp+Png1tIt6eNIaCYghc8rQqob6UsljWHB4j7dcFkvlMmH9hLeFuJHxcd1 Jnbw== X-Gm-Message-State: AA6/9RlUyAOSuiyvpzFMfxsa0C844y2RROyn6ptBAGc341pgUnTjgNw/oI/FdXq6QILdWZ6eO8Uv1e8LsQp9Xg== X-Received: by 10.28.191.3 with SMTP id p3mr25428167wmf.112.1477339050958; Mon, 24 Oct 2016 12:57:30 -0700 (PDT) In-Reply-To: X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:124976 Archived-At: --001a114e2a8e5983f6053fa1cd21 Content-Type: multipart/alternative; boundary=001a114e2a8e5983f2053fa1cd1f --001a114e2a8e5983f2053fa1cd1f Content-Type: text/plain; charset=UTF-8 Helmut Eller schrieb am Mo., 24. Okt. 2016 um 20:58 Uhr: > > json-read-from-string doesn't parse strings correctly if the the \u > syntax is used to write UTF-16 surrogates: > > (equal (json-read-from-string "\"\\uD834\\uDD1E\"") "\"\U0001D11E\"") > => nil > > The correct result t. To quote RFC 7159[*]: > > To escape an extended character that is not in the Basic Multilingual > Plane, the character is represented as a 12-character sequence, > encoding the UTF-16 surrogate pair. So, for example, a string > containing only the G clef character (U+1D11E) may be represented as > "\uD834\uDD1E". > > [*] https://tools.ietf.org/html/rfc7159#section-7 > > Thanks for reporting, I've attached a patch. --001a114e2a8e5983f2053fa1cd1f Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable


Helmut Eller <eller.helmut@gmail.com> wrote on Mon, 24 Oct 2016 at 20:58:

> json-read-from-string doesn't parse strings correctly if the \u
> syntax is used to write UTF-16 surrogates:
>
>  (equal (json-read-from-string "\"\\uD834\\uDD1E\"") "\"\U0001D11E\"")
>  => nil
>
> The correct result is t.  To quote RFC 7159[*]:
>
>    To escape an extended character that is not in the Basic Multilingual
>    Plane, the character is represented as a 12-character sequence,
>    encoding the UTF-16 surrogate pair.  So, for example, a string
>    containing only the G clef character (U+1D11E) may be represented as
>    "\uD834\uDD1E".
>
> [*] https://tools.ietf.org/html/rfc7159#section-7

Thanks for reporting, I've attached a patch.
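
For reference, the arithmetic behind decoding a surrogate pair such as the \uD834\uDD1E from the RFC example can be sketched as below. This is only a minimal illustration: the function name `my-json-decode-surrogates' is hypothetical and not part of json.el, and the range check is an assumption about how invalid input might be handled.

```emacs-lisp
;; Hypothetical helper for illustration only; the name is an assumption
;; and is not an existing json.el function.
(defun my-json-decode-surrogates (high low)
  "Return the code point encoded by the UTF-16 surrogates HIGH and LOW."
  ;; Assumed behavior: reject anything that is not a high/low pair.
  (unless (and (<= #xD800 high #xDBFF) (<= #xDC00 low #xDFFF))
    (error "Not a valid surrogate pair: #x%X #x%X" high low))
  ;; Each surrogate carries 10 payload bits; together they encode the
  ;; code point minus #x10000.
  (+ #x10000
     (ash (- high #xD800) 10)
     (- low #xDC00)))

;; The G clef from the RFC example: evaluates to #x1D11E (119070).
(my-json-decode-surrogates #xD834 #xDD1E)
```

A reader would apply this only when a \uXXXX escape in the high-surrogate range is immediately followed by a second \uXXXX escape in the low-surrogate range; other escapes can be decoded as single code points.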
--001a114e2a8e5983f2053fa1cd1f-- --001a114e2a8e5983f6053fa1cd21 Content-Type: text/plain; charset=US-ASCII; name="0001-Fix-encoding-of-JSON-surrogate-pairs.txt" Content-Disposition: attachment; filename="0001-Fix-encoding-of-JSON-surrogate-pairs.txt" Content-Transfer-Encoding: base64 Content-ID: <157f844beafc1921ba11> X-Attachment-Id: 157f844beafc1921ba11 RnJvbSA2YzYzMGJkNWIwMDEyNDNkNmI3MTE1MzgwMDg4OTA5YTdhMTgwZGRiIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBQaGlsaXBwIFN0ZXBoYW5pIDxwaHN0QGdvb2dsZS5jb20+CkRh dGU6IE1vbiwgMjQgT2N0IDIwMTYgMjE6NTQ6NTEgKzAyMDAKU3ViamVjdDogW1BBVENIXSBGaXgg ZW5jb2Rpbmcgb2YgSlNPTiBzdXJyb2dhdGUgcGFpcnMKCkpTT04gcmVxdWlyZXMgdGhhdCBzdWNo IHBhaXJzIGJlIHRyZWF0ZWQgYXMgVVRGLTE2IHN1cnJvZ2F0ZSBwYWlycywgbm90CmluZGl2aWR1 YWwgY29kZSBwb2ludHM7IGNmLiBCdWcgIzI0Nzg0LgoKKiBsaXNwL2pzb24uZWwgKGpzb24tcmVh ZC1lc2NhcGVkLWNoYXIpOiBGaXggZGVjb2Rpbmcgb2Ygc3Vycm9nYXRlCnBhaXJzLgooanNvbi0t ZGVjb2RlLXV0Zi0xNi1zdXJyb2dhdGVzKTogTmV3IGRlZnN1YnN0LgoKKiB0ZXN0L2xpc3AvanNv bi10ZXN0cy5lbCAodGVzdC1qc29uLXJlYWQtc3RyaW5nKTogQWRkIHRlc3QgZm9yCnN1cnJvZ2F0 ZSBwYWlycy4KKHRlc3QtanNvbi1lbmNvZGUtc3RyaW5nKTogQWRkIHRlc3QgZm9yIG5vbi1CTVAg Y2hhcmFjdGVyIGVuY29kaW5nLgotLS0KIGxpc3AvanNvbi5lbCAgICAgICAgICAgIHwgMTMgKysr KysrKysrKysrKwogdGVzdC9saXNwL2pzb24tdGVzdHMuZWwgfCAgNyArKysrKy0tCiAyIGZpbGVz IGNoYW5nZWQsIDE4IGluc2VydGlvbnMoKyksIDIgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEv bGlzcC9qc29uLmVsIGIvbGlzcC9qc29uLmVsCmluZGV4IGZkYWM4ZDkuLjViZmRmZDQgMTAwNjQ0 Ci0tLSBhL2xpc3AvanNvbi5lbAorKysgYi9saXNwL2pzb24uZWwKQEAgLTM2Myw2ICszNjMsMTAg QEAganNvbi1zcGVjaWFsLWNoYXJzCiAKIDs7IFN0cmluZyBwYXJzaW5nCiAKKyhkZWZzdWJzdCBq c29uLS1kZWNvZGUtdXRmLTE2LXN1cnJvZ2F0ZXMgKGhpZ2ggbG93KQorICAiUmV0dXJuIHRoZSBj b2RlIHBvaW50IHJlcHJlc2VudGVkIGJ5IHRoZSBVVEYtMTYgc3Vycm9nYXRlcyBISUdIIGFuZCBM T1cuIgorICAoKyAobHNoICgtIGhpZ2ggI3hEODAwKSAxMCkgKC0gbG93ICN4REMwMCkgI3gxMDAw MCkpCisKIChkZWZ1biBqc29uLXJlYWQtZXNjYXBlZC1jaGFyICgpCiAgICJSZWFkIHRoZSBKU09O IHN0cmluZyBlc2NhcGVkIGNoYXJhY3RlciBhdCBwb2ludC4iCiAgIDs7IFNraXAgb3ZlciB0aGUg 
J1wnCkBAIC0zNzIsNiArMzc2LDE1IEBAIGpzb24tcmVhZC1lc2NhcGVkLWNoYXIKICAgICAoY29u ZAogICAgICAoc3BlY2lhbCAoY2RyIHNwZWNpYWwpKQogICAgICAoKG5vdCAoZXEgY2hhciA/dSkp IGNoYXIpCisgICAgIDs7IFNwZWNpYWwtY2FzZSBVVEYtMTYgc3Vycm9nYXRlIHBhaXJzLAorICAg ICA7OyBjZi4gaHR0cHM6Ly90b29scy5pZXRmLm9yZy9odG1sL3JmYzcxNTkjc2VjdGlvbi03Cisg ICAgICgobG9va2luZy1hdAorICAgICAgIChyeCAoZ3JvdXAgKGFueSAiRGQiKSAoYW55ICI4OUFC YWIiKSAoPSAyIChhbnkgIjAtOUEtRmEtZiIpKSkKKyAgICAgICAgICAgIlxcdSIgKGdyb3VwIChh bnkgIkRkIikgKGFueSAiQy1GYy1mIikgKD0gMiAoYW55ICIwLTlBLUZhLWYiKSkpKSkKKyAgICAg IChqc29uLWFkdmFuY2UgMTApCisgICAgICAoanNvbi0tZGVjb2RlLXV0Zi0xNi1zdXJyb2dhdGVz CisgICAgICAgKHN0cmluZy10by1udW1iZXIgKG1hdGNoLXN0cmluZyAxKSAxNikKKyAgICAgICAo c3RyaW5nLXRvLW51bWJlciAobWF0Y2gtc3RyaW5nIDIpIDE2KSkpCiAgICAgICgobG9va2luZy1h dCAiWzAtOUEtRmEtZl1bMC05QS1GYS1mXVswLTlBLUZhLWZdWzAtOUEtRmEtZl0iKQogICAgICAg KGxldCAoKGhleCAobWF0Y2gtc3RyaW5nIDApKSkKICAgICAgICAgKGpzb24tYWR2YW5jZSA0KQpk aWZmIC0tZ2l0IGEvdGVzdC9saXNwL2pzb24tdGVzdHMuZWwgYi90ZXN0L2xpc3AvanNvbi10ZXN0 cy5lbAppbmRleCA3OGNlYmI0Li44OTU4MDAwIDEwMDY0NAotLS0gYS90ZXN0L2xpc3AvanNvbi10 ZXN0cy5lbAorKysgYi90ZXN0L2xpc3AvanNvbi10ZXN0cy5lbApAQCAtMTY3LDE0ICsxNjcsMTcg QEAganNvbi10ZXN0cy0td2l0aC10ZW1wLWJ1ZmZlcgogICAgIChzaG91bGQgKGVxdWFsIChqc29u LXJlYWQtc3RyaW5nKSAiYWJjzrHOss6zIikpKQogICAoanNvbi10ZXN0cy0td2l0aC10ZW1wLWJ1 ZmZlciAiXCJcXG5hc2RcXHUwNDQ0XFx1MDQ0YlxcdTA0MzJmZ2hcXHRcIiIKICAgICAoc2hvdWxk IChlcXVhbCAoanNvbi1yZWFkLXN0cmluZykgIlxuYXNk0YTRi9CyZmdoXHQiKSkpCisgIDs7IEJ1 ZyMyNDc4NAorICAoanNvbi10ZXN0cy0td2l0aC10ZW1wLWJ1ZmZlciAiXCJcXHVEODM0XFx1REQx RVwiIgorICAgIChzaG91bGQgKGVxdWFsIChqc29uLXJlYWQtc3RyaW5nKSAiXFUwMDAxRDExRSIp KSkKICAgKGpzb24tdGVzdHMtLXdpdGgtdGVtcC1idWZmZXIgImZvbyIKICAgICAoc2hvdWxkLWVy cm9yIChqc29uLXJlYWQtc3RyaW5nKSA6dHlwZSAnanNvbi1zdHJpbmctZm9ybWF0KSkpCiAKIChl cnQtZGVmdGVzdCB0ZXN0LWpzb24tZW5jb2RlLXN0cmluZyAoKQogICAoc2hvdWxkIChlcXVhbCAo anNvbi1lbmNvZGUtc3RyaW5nICJmb28iKSAiXCJmb29cIiIpKQogICAoc2hvdWxkIChlcXVhbCAo 
anNvbi1lbmNvZGUtc3RyaW5nICJhXG5cZmIiKSAiXCJhXFxuXFxmYlwiIikpCi0gIChzaG91bGQg KGVxdWFsIChqc29uLWVuY29kZS1zdHJpbmcgIlxuYXNk0YTRi9CyXHUwMDFmXHUwMDdmZmdoXHQi KQotICAgICAgICAgICAgICAgICAiXCJcXG5hc2TRhNGL0LJcXHUwMDFmXHUwMDdmZmdoXFx0XCIi KSkpCisgIChzaG91bGQgKGVxdWFsIChqc29uLWVuY29kZS1zdHJpbmcgIlxuYXNk0YTRi9Cy8J2E nlx1MDAxZlx1MDA3ZmZnaFx0IikKKyAgICAgICAgICAgICAgICAgIlwiXFxuYXNk0YTRi9Cy8J2E nlxcdTAwMWZcdTAwN2ZmZ2hcXHRcIiIpKSkKIAogKGVydC1kZWZ0ZXN0IHRlc3QtanNvbi1lbmNv ZGUta2V5ICgpCiAgIChzaG91bGQgKGVxdWFsIChqc29uLWVuY29kZS1rZXkgImZvbyIpICJcImZv b1wiIikpCi0tIAoyLjEwLjEKCg== --001a114e2a8e5983f6053fa1cd21--