all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Eli Zaretskii <eliz@gnu.org>
To: Yuri Khan <yuri.v.khan@gmail.com>
Cc: p.stephani2@gmail.com, emacs-devel@gnu.org,
	kentaro.nakazawa@nifty.com, larsi@gnus.org, dgutov@yandex.ru
Subject: Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
Date: Fri, 02 Dec 2016 17:45:02 +0200	[thread overview]
Message-ID: <83k2bimj29.fsf@gnu.org> (raw)
In-Reply-To: <CAP_d_8URYN1QPGAXnB7RLVzs758MgPzSMz7bvM05wbRU1OWJLg@mail.gmail.com> (message from Yuri Khan on Fri, 2 Dec 2016 21:53:16 +0700)

> From: Yuri Khan <yuri.v.khan@gmail.com>
> Date: Fri, 2 Dec 2016 21:53:16 +0700
> Cc: Dmitry Gutov <dgutov@yandex.ru>, Philipp Stephani <p.stephani2@gmail.com>, 
> It is really unfortunate that we talk about ASCII strings, unibyte
> strings, multibyte strings, as if that was a meaningful
> classification.

It is meaningful when you work on Emacs code.

> The real dichotomy is between text (aka strings) and MIME-type-tagged
> byte arrays.

That might be so in the context of HTTP, but in general, byte arrays
("raw bytes" in Emacs parlance) are not limited to MIME types.
Moreover, there are very frequent use cases where Emacs code needs to
work with a byte array whose type is unknown, or even cannot be known
at all, because it doesn't come with any meta-data of any kind.

> In order to send a string over HTTP, one must encode it
> to a byte array and tag it as "text/plain; charset=utf-8" or
> "text/html; charset=utf-8" or application/json (no charset parameter
> because json must always be encoded in one of utf-* for transmission).
> Conversely, a byte array received over HTTP can, MIME type allowing,
> decoded into a string.
> 
> The fact that there exist strings for which encoding and decoding are
> identity transforms should be regarded only as an implementation
> detail.

You are talking generalities here, whereas this discussion is about
Emacs-specific internal issues.  In Emacs, a plain-ASCII string is
indistinguishable from a "byte array" whose bytes are all below 128.
They have the same representation.  To muddy the water even more, a
plain-ASCII string can be "marked" as multibyte (again, internally),
but it should be clear that such a "mark" has no meaning at all for
ASCII text.

From the Lisp application POV, whether a plain-ASCII string it
receives or processes is marked as unibyte or multibyte is entirely
random.  So if some ASCII text is accepted by an Emacs API involved in
sending HTTP requests, while an identical ASCII string is rejected,
it could be a source of surprises and bug reports.

That is the core of the issues discussed here.

> Attempts by libraries and frameworks to silently DTRT for this
> subset lead to applications neglecting to properly encode or tag
> strings, leading, in turn, to breakage in presence of multilingual
> text.

Based on Emacs experience of dealing with multibyte text and its
encoding/decoding, the conclusion was that it is better to silently
DTRT where we can be sure we know how.  Making a point of educating
users by harsh measures such as signaling errors where Emacs could
easily proceed, is generally not welcome.  We will see if this case is
any different.



  reply	other threads:[~2016-12-02 15:45 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-29  8:22 bug#23750: 25.0.95; bug in url-retrieve or json.el Kentaro NAKAZAWA
2016-11-29  9:54 ` Andreas Schwab
2016-11-29 10:06   ` Kentaro NAKAZAWA
2016-11-29 10:08     ` Dmitry Gutov
2016-11-29 10:23       ` Kentaro NAKAZAWA
2016-11-29 10:34         ` Lars Ingebrigtsen
2016-11-29 10:38           ` Kentaro NAKAZAWA
2016-11-29 10:42             ` Lars Ingebrigtsen
2016-11-29 10:48               ` Kentaro NAKAZAWA
2016-11-29 10:49               ` Dmitry Gutov
2016-11-29 10:50             ` Dmitry Gutov
2016-11-29 10:55               ` Kentaro NAKAZAWA
2016-11-29 10:59                 ` Dmitry Gutov
2016-11-29 11:03                   ` Kentaro NAKAZAWA
2016-11-29 11:05                     ` Dmitry Gutov
2016-11-29 11:12                       ` Kentaro NAKAZAWA
2016-11-29 17:23                       ` Eli Zaretskii
2016-11-29 23:09                         ` Philipp Stephani
2016-11-29 23:18                           ` Philipp Stephani
2016-11-30 15:11                             ` Eli Zaretskii
2016-11-30 15:20                               ` Lars Ingebrigtsen
2016-11-30 15:43                                 ` Eli Zaretskii
2016-11-30 15:46                                   ` Lars Ingebrigtsen
2016-11-30  0:16                           ` Dmitry Gutov
2016-11-30 15:13                             ` Eli Zaretskii
2016-11-30 15:17                               ` Dmitry Gutov
2016-11-30 15:32                                 ` Stefan Monnier
2016-11-30 15:42                                 ` Eli Zaretskii
2016-11-30 15:45                                   ` Dmitry Gutov
2016-11-30 15:48                                     ` Lars Ingebrigtsen
2016-11-30 16:25                                       ` Eli Zaretskii
2016-11-30 16:27                                         ` Lars Ingebrigtsen
2016-11-30 16:42                                           ` Eli Zaretskii
2016-11-30 18:25                                             ` Philipp Stephani
2016-11-30 18:48                                               ` Eli Zaretskii
2016-12-28 18:18                                                 ` Philipp Stephani
2016-12-28 18:34                                                   ` Eli Zaretskii
2016-12-28 18:45                                                     ` Philipp Stephani
2016-12-28 18:55                                                       ` Eli Zaretskii
2016-12-28 19:03                                                       ` Andreas Schwab
2016-11-30 18:23                                         ` Philipp Stephani
2016-11-30 18:44                                           ` Eli Zaretskii
2016-12-28 18:09                                             ` Philipp Stephani
2016-12-28 18:27                                               ` Eli Zaretskii
2016-12-28 18:35                                                 ` Philipp Stephani
2016-12-28 18:45                                                   ` Eli Zaretskii
2016-12-28 18:22                                       ` Philipp Stephani
2016-12-28 18:57                                         ` Lars Ingebrigtsen
2016-12-30  0:07                                           ` Richard Stallman
2016-12-30 14:15                                             ` Lars Ingebrigtsen
2016-12-30 16:59                                               ` Eli Zaretskii
2017-01-21 15:39                                                 ` Lars Ingebrigtsen
2017-01-21 15:56                                                   ` Eli Zaretskii
2017-01-21 16:30                                                     ` Lars Ingebrigtsen
2017-01-21 22:58                                                       ` Stefan Monnier
2017-01-24 20:04                                                         ` Lars Ingebrigtsen
2017-01-28  9:52                                                           ` Elias Mårtenson
2017-01-28 14:16                                                             ` Lars Ingebrigtsen
2016-12-30 21:38                                               ` Richard Stallman
2016-11-30 16:23                                     ` Eli Zaretskii
2016-12-01  0:30                                       ` Dmitry Gutov
2016-12-01 17:17                                         ` Eli Zaretskii
2016-12-02 13:18                                           ` Dmitry Gutov
2016-12-02 14:24                                             ` Eli Zaretskii
2016-12-02 14:35                                               ` Dmitry Gutov
2016-12-02 15:20                                                 ` Eli Zaretskii
2016-12-02 14:53                                               ` Yuri Khan
2016-12-02 15:45                                                 ` Eli Zaretskii [this message]
2016-12-02 15:51                                                 ` Lars Ingebrigtsen
2016-12-02 15:58                                                   ` Eli Zaretskii
2016-12-02 15:29                                             ` Lars Ingebrigtsen
2016-12-02 15:32                                               ` Dmitry Gutov
2016-12-02 15:48                                                 ` Lars Ingebrigtsen
2016-12-02 15:56                                                   ` Dmitry Gutov
2016-12-02 16:02                                                     ` Lars Ingebrigtsen
2016-12-02 16:06                                                       ` Dmitry Gutov
2016-12-02 16:31                                                         ` Lars Ingebrigtsen
2016-12-02 23:13                                                           ` Dmitry Gutov
2016-12-03  0:37                                                             ` Lars Ingebrigtsen
2016-12-03  1:27                                                               ` Dmitry Gutov
2016-12-03  8:12                                                               ` Eli Zaretskii
2016-12-03 10:01                                                                 ` Lars Ingebrigtsen
2016-12-03 16:00                                                                   ` Stefan Monnier
2016-12-03 20:01                                                                     ` Lars Ingebrigtsen
2016-12-03 20:57                                                                       ` Andreas Schwab
2016-12-28 18:25                                         ` Philipp Stephani
2016-11-30 15:06                           ` Eli Zaretskii
2016-11-30 15:31                             ` Stefan Monnier
  -- strict thread matches above, loose matches on Subject: below --
2016-06-12  2:22 Leo Liu
2016-06-13 15:02 ` Dmitry Gutov
2016-06-13 17:55   ` Stefan Monnier
2016-06-13 19:26     ` Dmitry Gutov
2016-06-14  0:30       ` Stefan Monnier
2016-06-19 18:14         ` Dmitry Gutov
2016-06-19 18:25           ` Eli Zaretskii
2016-06-19 18:30             ` John Wiegley
2016-06-19 18:45               ` Dmitry Gutov
2016-06-19 19:56                 ` John Wiegley
2016-06-19 20:05                   ` Dmitry Gutov
2016-06-19 21:07                     ` John Wiegley
2016-06-20  1:28                       ` Glenn Morris
2016-06-20  4:22                         ` John Wiegley
2016-06-20 12:39                           ` Lars Ingebrigtsen
2016-07-01 20:49                             ` John Wiegley
2016-06-20 14:42                           ` Eli Zaretskii
2016-06-23 17:14                           ` Glenn Morris
2016-06-20  1:26                   ` Glenn Morris
2016-06-20  2:58                   ` Dmitry Gutov
2016-06-19 18:36             ` Dmitry Gutov
2016-06-20  0:15               ` Leo Liu
2016-06-20 14:39                 ` Eli Zaretskii
2016-06-20  2:40               ` Eli Zaretskii
2016-06-20  2:51                 ` Dmitry Gutov
2016-06-20 14:38                   ` Eli Zaretskii
2016-06-20 14:54                     ` Dmitry Gutov
2016-06-20 15:03                       ` Eli Zaretskii
2016-06-20 17:16                     ` Dmitry Gutov
2016-06-20 20:17                       ` Eli Zaretskii
2016-06-20 20:27                         ` Dmitry Gutov
2016-06-21  2:30                           ` Eli Zaretskii
2016-06-21 13:51                             ` Dmitry Gutov
2016-06-21 15:18                               ` Eli Zaretskii
2016-06-22  1:08                                 ` John Wiegley
2016-06-22  2:36                                   ` Eli Zaretskii
2016-06-22 18:21                                   ` Dmitry Gutov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83k2bimj29.fsf@gnu.org \
    --to=eliz@gnu.org \
    --cc=dgutov@yandex.ru \
    --cc=emacs-devel@gnu.org \
    --cc=kentaro.nakazawa@nifty.com \
    --cc=larsi@gnus.org \
    --cc=p.stephani2@gmail.com \
    --cc=yuri.v.khan@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.