Dmitry Gutov <dgutov@yandex.ru> wrote on Tue, 25 Oct 2016 at 01:19:
Philipp,
Thanks. Some comments:
On 24.10.2016 22:57, Philipp Stephani wrote:
> +(defsubst json--decode-utf-16-surrogates (high low)
IIRC, there might be no actual benefit from making it a defsubst. If
someone could benchmark it, I'd like to see the result.
Agreed; converted to defun. I only used defsubst because some other helper functions also used defsubst.
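For anyone who wants to run the benchmark asked for above, here is a minimal sketch. The function names are hypothetical stand-ins (not the ones in the patch), and the bodies assume the standard UTF-16 surrogate arithmetic. Since the arguments are constants, the byte compiler may fold the inlined call, so a serious measurement should probably feed in varying input.

```elisp
(require 'benchmark)

;; Hypothetical stand-ins for the helper in the patch; the names are
;; made up so that they do not clash with json.el.
(defun my-surrogates-defun (high low)
  "Combine UTF-16 surrogates HIGH and LOW into a code point."
  (+ #x10000 (ash (- high #xD800) 10) (- low #xDC00)))

(defsubst my-surrogates-defsubst (high low)
  "Same as `my-surrogates-defun', but inlined by the byte compiler."
  (+ #x10000 (ash (- high #xD800) 10) (- low #xDC00)))

(defun my-compare-defun-defsubst ()
  "Return timings as a list (DEFUN-RESULT DEFSUBST-RESULT).
Each result has the form (ELAPSED-SECONDS GC-COUNT GC-SECONDS)."
  (list
   (benchmark-run-compiled 100000
     (my-surrogates-defun #xD834 #xDD1E))
   (benchmark-run-compiled 100000
     (my-surrogates-defsubst #xD834 #xDD1E))))
```

Evaluating (my-compare-defun-defsubst) returns the two timing triples side by side; the numbers will of course vary between machines and Emacs builds.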
> +     ;; Special-case UTF-16 surrogate pairs,
> +     ;; cf. https://tools.ietf.org/html/rfc7159#section-7
> +     ((looking-at
> +       (rx (group (any "Dd") (any "89ABab") (= 2 (any "0-9A-Fa-f")))
> +           "\\u" (group (any "Dd") (any "C-Fc-f") (= 2 (any "0-9A-Fa-f")))))
> +      (json-advance 10)
> +      (json--decode-utf-16-surrogates
> +       (string-to-number (match-string 1) 16)
> +       (string-to-number (match-string 2) 16)))
Shouldn't this go below the UTF-8 case, as the less-frequent one?
No, the case below is more general and therefore has to come last.
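To illustrate the ordering point: a high surrogate such as D834 is itself four hex digits, so a general four-hex-digit clause placed first would consume it on its own and never see the pair. Below is a rough standalone sketch, not the code in the patch. It works on a string rather than the buffer, and the names my-decode-surrogates and my-decode-u-escape are hypothetical.

```elisp
(require 'rx)

(defun my-decode-surrogates (high low)
  "Combine UTF-16 surrogates HIGH and LOW into one code point."
  (+ #x10000 (ash (- high #xD800) 10) (- low #xDC00)))

(defun my-decode-u-escape (hex)
  "Decode HEX, the text that follows a leading \"\\u\" in a JSON string.
Return (CODE-POINT . CHARS-CONSUMED), or nil if HEX is malformed."
  (let ((case-fold-search nil))
    (cond
     ;; Specific case first: high surrogate, literal \u, low surrogate.
     ((string-match
       (rx bos
           (group (any "Dd") (any "89ABab") (= 2 (any "0-9A-Fa-f")))
           "\\u"
           (group (any "Dd") (any "C-Fc-f") (= 2 (any "0-9A-Fa-f"))))
       hex)
      (cons (my-decode-surrogates
             (string-to-number (match-string 1 hex) 16)
             (string-to-number (match-string 2 hex) 16))
            10))
     ;; General case last: any four hex digits.  This pattern would
     ;; also match a lone high surrogate, hence the ordering.
     ((string-match (rx bos (= 4 (any "0-9A-Fa-f"))) hex)
      (cons (string-to-number (match-string 0 hex) 16) 4)))))
```

With this ordering, (my-decode-u-escape "D834\\uDD1E") yields the single code point #x1D11E and consumes 10 characters, while "00e9" falls through to the general clause and consumes 4.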
>=C2=A0 (ert-deftest test-json-encode-string ()
>=C2=A0 =C2=A0 (should (equal (json-encode-string "foo") "=
;\"foo\""))
>=C2=A0 =C2=A0 (should (equal (json-encode-string "a\n\fb") &q=
uot;\"a\\n\\fb\""))
> -=C2=A0 (should (equal (json-encode-string "\nasd=C3=91=E2=80=9E=
=C3=91=E2=80=B9=C3=90=C2=B2\u001f\u007ffgh\t")
> -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0"\=
"\\nasd=C3=91=E2=80=9E=C3=91=E2=80=B9=C3=90=C2=B2\\u001f\u007ffgh\\t\&=
quot;")))
> +=C2=A0 (should (equal (json-encode-string "\nasd=C3=91=E2=80=9E=
=C3=91=E2=80=B9=C3=90=C2=B2=C3=B0 =E2=80=9E=C5=BE\u001f\u007ffgh\t")
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0"\=
"\\nasd=C3=91=E2=80=9E=C3=91=E2=80=B9=C3=90=C2=B2=C3=B0 =E2=80=9E=C5=
=BE\\u001f\u007ffgh\\t\"")))
Why are we testing string encoding here?
It's not 100% related to the patch, but I think it can be included for symmetry reasons (testing encoding as well as decoding).