From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Philipp Stephani Newsgroups: gmane.emacs.devel Subject: Re: bug#23750: 25.0.95; bug in url-retrieve or json.el Date: Wed, 28 Dec 2016 18:35:58 +0000 Message-ID: References: <6d0c8c2e-8428-2fdb-0d6e-899f7b9d7ffd@nifty.com> <8053af81-80e1-a24a-f649-8ffc86963ed5@nifty.com> <0cc7fab4-9a2c-6a8d-def7-36bd50317ca3@yandex.ru> <7f9a799f-de88-fd78-0cdc-dac0928f1503@nifty.com> <308bb78f-8be3-092d-d877-e129d340242b@nifty.com> <4dc615e7-ec73-60a5-426e-0d6986f15d76@yandex.ru> <0cb406fb-ffc4-a4ad-557a-2cacc99b8e75@nifty.com> <86ccb4af-5719-c017-26bb-fc06b4c904d2@yandex.ru> <83r35uxkr5.fsf@gnu.org> <4e12d4ad-cd6b-3087-5d7c-449d4c1886e2@yandex.ru> <83lgw1q9uu.fsf@gnu.org> <83eg1tq8is.fsf@gnu.org> <787e5206-53e0-752f-a339-4608d2f7ad39@yandex.ru> <8360n5q6j4.fsf@gnu.org> <83r35sq02k.fsf@gnu.org> <83mvffvrh7.fsf@gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=001a114706941362e30544bc3e54 X-Trace: blaine.gmane.org 1482950186 29434 195.159.176.226 (28 Dec 2016 18:36:26 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Wed, 28 Dec 2016 18:36:26 +0000 (UTC) Cc: larsi@gnus.org, dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Dec 28 19:36:22 2016 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cMJ5H-0006lC-Cx for ged-emacs-devel@m.gmane.org; Wed, 28 Dec 2016 19:36:19 +0100 Original-Received: from localhost ([::1]:60457 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cMJ5M-0002v6-0Z for ged-emacs-devel@m.gmane.org; Wed, 28 Dec 2016 13:36:24 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:42647) by lists.gnu.org with esmtp (Exim 4.71) 
(envelope-from ) id 1cMJ5F-0002uv-5r for emacs-devel@gnu.org; Wed, 28 Dec 2016 13:36:18 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cMJ5D-00082F-3O for emacs-devel@gnu.org; Wed, 28 Dec 2016 13:36:17 -0500 Original-Received: from mail-wm0-x235.google.com ([2a00:1450:400c:c09::235]:35625) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1cMJ58-00080a-MO; Wed, 28 Dec 2016 13:36:10 -0500 Original-Received: by mail-wm0-x235.google.com with SMTP id a197so287815413wmd.0; Wed, 28 Dec 2016 10:36:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=2PvqLa7hgZi0vN5LnSgqgUSCKGnVdfeRNOXZBFj46Mc=; b=FY/m9W5xD2isnGAu4Ks3ACt86neSM0fwdCwjLV15Adh3DsWxT0Lk2861ZbXHfEky1W wuKobuibAD5syMlDS630DbG6ShakeAzsVi9yKiZfUKthSwqnS8eZ2iPYUFujWnKVtpgp VE7MhNtJgdtCglnBatJLisPhQNHnyuR502P2v/++L0p4oLDTXZDOvp1evUe32PgUnrox O6BvSk1PV7eUHQTxvJ0BJWO2HDs23+g5mV+EkV735qiEW7qN5/Y/Syx40KICQzgsfN+J 6lq7OjruTyIoHH20gdEVtHZ/MArDLRny9V5VMY64yES62qCuDF1MAJ7t2GeM9q4fgLcz hP2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=2PvqLa7hgZi0vN5LnSgqgUSCKGnVdfeRNOXZBFj46Mc=; b=iYnG2mp87YhqXA0Bi9+RGRRcNMmUYGMbFA0KrpQnow2dpQNcoYwEau/5S5gcC+Ol4o GMFaQ/FPlIsv6wAGOv8CWI5dDMzoUgh/NRDpocCTrJRz+2blukRIByCwQQT7EBuqm3vk +6+asuyoEWcEPhk+RnGIt+s8OVIcTaxmZdrCqUCZKC6vk6wDLa7e2dj1AXklkxEkgyt0 EOm0YC6LnoDOjoPdbUMgXw2AGqycNSQZzI6s3FSo1WZSQNZhAW5WWyOnw18ZV9yl6Fav OibHqoAU8WpoBM4Gfc6R2xxRx9j7YO4c9eTS9vNFYAe8SaSlmrWTsV7CuVTJ/vLUJxhS BiTA== X-Gm-Message-State: AIkVDXIs3jPBsVWLzI5NdJ7KIjJNm8XSfv6oM59bpgV83YsiFTTFnveJO2B836kokUTHCeMt4I73Lr+r4QqUDQ== X-Received: by 10.28.216.65 with SMTP id p62mr34093167wmg.92.1482950169550; Wed, 28 Dec 2016 10:36:09 -0800 (PST) In-Reply-To: 
<83mvffvrh7.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2a00:1450:400c:c09::235 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:210922 Archived-At:
--001a114706941362e30544bc3e54
Content-Type: text/plain; charset=UTF-8

Eli Zaretskii wrote on Wed, 28 Dec 2016 at 19:28:

> > From: Philipp Stephani
> > Date: Wed, 28 Dec 2016 18:09:52 +0000
> > Cc: larsi@gnus.org, dgutov@yandex.ru, kentaro.nakazawa@nifty.com,
> >     emacs-devel@gnu.org
> >
> > Eli Zaretskii wrote on Wed, 30 Nov 2016 at 19:45:
> >
> > > From: Philipp Stephani
> > > Date: Wed, 30 Nov 2016 18:23:14 +0000
> > > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
> > >
> > > > Yes, this is not a json.el problem at all. It does the correct thing,
> > > > and shouldn't be changed.
> > >
> > > ??? Why should any code care whether a pure-ASCII string is marked as
> > > unibyte or as multibyte? Both are "correct".
> > >
> > > I guess the problem is that process-send-string cares. If it didn't,
> > > we wouldn't have the problem.
> >
> > I don't think I follow. The error we are talking about is signaled
> > from url-http-create-request, not from process-send-string.
> >
> > Yes, but url-http-create-request only cares about unibyte strings
> > because the request it creates is passed to process-send-string,
> > which special-cases unibyte strings.
>
> How do you see that process-send-string special-cases unibyte strings?

The send_process function has two branches, one for unibyte, one for
multibyte.

> > > For URL, we'd need functions like
> > > (byte-array-length s) = (length (string-to-unibyte s))
> >
> > Why do you need this? string-to-unibyte is well-defined only for
> > unibyte or ASCII strings (if we forget the raw bytes for a moment), so
> > length will do.
> >
> > We need it because we have to send the byte length in a header. We
> > can't just use (length s) because it would silently give a wrong
> > result.
>
> We are miscommunicating. string-to-unibyte can only meaningfully be
> called on a pure-ASCII string, and for pure-ASCII strings 'length'
> will count bytes. So I see no need for 'byte-array-length' if its
> implementation is as you indicated.

That depends on how you want to represent byte arrays/octet streams in
Emacs. If you want to represent them using unibyte strings, then you
indeed only need `length'. But some earlier messages sounded like you
wanted to represent byte arrays using either unibyte strings or
byte-only multibyte strings. In that case `string-to-unibyte' is
necessary.

> > > (process-send-bytes s) = (process-send-string (string-to-unibyte s))
> >
> > Why is this needed? process-send-string already encodes its argument,
> > which produces a unibyte string.
> >
> > We can't give a multibyte string to process-send-string, because we
> > have to pass the length in bytes in a header first. Therefore we have
> > to encode any string before passing it to process-send-string.
>
> Once you encoded the string, why do you need anything except calling
> process-send-string?

The byte size has to be sent in a Content-Length HTTP header. If
url-request-data is a unibyte string, that's not a problem (except for
the newline conversion behavior in send_string): you can just use
`length'. But if it's a multibyte string, you need to encode it first
to find the byte length.
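To make the Content-Length point concrete, here is a minimal sketch of
how the byte length could be computed for both kinds of string. The
helper name `my-url--body-byte-length' is hypothetical (it is not part
of url.el), and defaulting to `utf-8' is an assumption made only for
this illustration; the right coding system depends on the request.

```elisp
;; Hypothetical helper, not part of url.el: return the number of bytes
;; DATA would occupy on the wire, e.g. for a Content-Length header.
(defun my-url--body-byte-length (data &optional coding-system)
  "Return the byte length of the string DATA.
Unibyte DATA is measured as is.  Multibyte DATA is first encoded
with CODING-SYSTEM (assumed default for this sketch: `utf-8')."
  (if (multibyte-string-p data)
      ;; `length' on a multibyte string counts characters, not bytes,
      ;; so encode first and measure the resulting unibyte string.
      (length (encode-coding-string data (or coding-system 'utf-8)))
    ;; A unibyte string is already a byte sequence.
    (length data)))

(my-url--body-byte-length "abc")       ; pure ASCII => 3
(my-url--body-byte-length "\303\244")  ; unibyte, two raw bytes => 2
(my-url--body-byte-length "\u00e4")    ; multibyte, one character => 2
```

The last two calls describe the same UTF-8 payload; only the second can
be measured with plain `length', which is why a multibyte body has to be
encoded before the header is written.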
--001a114706941362e30544bc3e54--