From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Philipp Stephani
> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Wed, 28 Dec 2016 18:18:25 +0000
> Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com,
>        dgutov@yandex.ru
>
>  > > That's right -- why should any code care? Yet url.el does.
>  >
>  > No, it doesn't, not if the string is plain ASCII.
>  >
>  > But in that case it isn't, it's morally a byte array.
>
>  Yes, because the internal representation of characters in Emacs is a
>  superset of UTF-8.
>
> That has nothing to do with characters. A byte array is conceptually different from a character string.
In Emacs, they are both implemented using very similar objects.
>  > What Emacs lacks is good support for byte arrays.
>
>  Unibyte strings are byte arrays. What do you think we lack in that regard?
>
> If unibyte strings should be used for byte arrays, then the URL functions should indeed signal an error
> whenever url-request-data is a multibyte string, as HTTP requests are conceptually byte arrays, not character
> strings.
Which is what we do now.
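
To make the distinction concrete, here is a minimal sketch of that kind of check. The helper `my/request-body-bytes' is hypothetical and only for illustration; it is not the actual url.el code. It assumes the caller encodes text with `encode-coding-string' before handing it over.

```elisp
;; Sketch only: `my/request-body-bytes' is a hypothetical helper,
;; not a function from url.el.
(defun my/request-body-bytes (data)
  "Return DATA unchanged if it is a unibyte string, else signal an error."
  (if (multibyte-string-p data)
      (error "Request body must be a unibyte string; encode it first")
    data))

;; Encoding text yields a unibyte string whose length is the byte count.
(let* ((text "na\u00efve")                  ; multibyte: one non-ASCII character
       (body (encode-coding-string text 'utf-8)))
  (list (length text)                       ; number of characters in TEXT
        (multibyte-string-p body)           ; nil, because BODY is unibyte
        (length (my/request-body-bytes body)))) ; number of bytes to send
```

The character count and the byte count differ as soon as the text is not plain ASCII, which is why the encoded, unibyte form is the one whose length belongs in Content-Length.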
>  > For HTTP, process-send-string shouldn't need to deal
>  > with encoding or EOL conversion, it should just accept a byte array and send that, unmodified.
>
>  I disagree. Handling unibyte strings is a nuisance, so Emacs allows
>  most applications to be oblivious about them, and just handle
>  human-readable text.
>
> That is the wrong approach (byte arrays and character strings are fundamentally different types, and mixing
> them together only causes pain), and it cannot work when implementing network protocols. HTTP requests
> are *not* human-readable text, they are byte arrays. Attempting to handle Unicode strings can't work because
> we wouldn't know the number of encoded bytes.
You are arguing against a long and quite painful history of non-ASCII
strings in Emacs.  What we have now is based on a lot of experience
and at least two very large refactoring jobs.  Going back would be a
very bad idea indeed, as we've been there already, and users didn't
like that.  Some of us are old enough to remember the notorious \201
bytes creeping into text files and mail messages, due to that.  Never
again.