From: Philipp Stephani
Newsgroups: gmane.emacs.devel
Subject: Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
Date: Wed, 28 Dec 2016 18:18:25 +0000
To: Eli Zaretskii
Cc: larsi@gnus.org, dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
In-Reply-To: <83polcpzwk.fsf@gnu.org>

Eli Zaretskii wrote on Wed, 30 Nov 2016 at 19:48:

> > From: Philipp Stephani
> > Date: Wed, 30 Nov 2016 18:25:09 +0000
> > Cc: emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, dgutov@yandex.ru
> >
> > > That's right -- why should any code care? Yet url.el does.
> >
> > No, it doesn't, not if the string is plain ASCII.
> >
> > But in that case it isn't, it's morally a byte array.
>
> Yes, because the internal representation of characters in Emacs is a
> superset of UTF-8.

That has nothing to do with characters. A byte array is conceptually
different from a character string.

> > What Emacs lacks is good support for byte arrays.
>
> Unibyte strings are byte arrays.  What do you think we lack in that regard?

If unibyte strings should be used for byte arrays, then the URL functions
should indeed signal an error whenever url-request-data is a multibyte
string, as HTTP requests are conceptually byte arrays, not character strings.

> > For HTTP, process-send-string shouldn't need to deal
> > with encoding or EOL conversion, it should just accept a byte array and
> > send that, unmodified.
>
> I disagree.  Handling unibyte strings is a nuisance, so Emacs allows
> most applications be oblivious about them, and just handle
> human-readable text.

That is the wrong approach (byte arrays and character strings are
fundamentally different types, and mixing them together only causes pain),
and it cannot work when implementing network protocols. HTTP requests are
*not* human-readable text, they are byte arrays. Attempting to handle
Unicode strings can't work because we wouldn't know the number of encoded
bytes.
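
To make the byte-count point concrete, here is a minimal sketch outside Emacs, written in Python only because its `bytes` and `str` types are already separate. The function name `build_request`, the path `/echo` and the host `example.org` are hypothetical and appear nowhere in url.el; the sketch shows just the idea of refusing a character string as a request body and deriving Content-Length from the already-encoded bytes.

```python
def build_request(path: str, host: str, body: bytes) -> bytes:
    """Assemble a minimal HTTP/1.1 POST request as a byte array.

    Hypothetical helper for illustration; not part of any real library.
    """
    if not isinstance(body, (bytes, bytearray)):
        # A character string has no byte length until it is encoded,
        # so refuse it instead of guessing an encoding.
        raise TypeError("body must be bytes; encode text explicitly first")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"  # counts bytes, not characters
        "\r\n"
    ).encode("ascii")
    return head + bytes(body)


# "naive cafe" with two accented letters, written with escapes.
text = "na\u00efve caf\u00e9"
body = text.encode("utf-8")   # the caller picks the encoding explicitly
request = build_request("/echo", "example.org", body)
```

The character count of `text` and the byte count of `body` differ as soon as a non-ASCII character is present, which is why the length header can only be computed after encoding.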

