bug#23750: 25.0.95; bug in url-retrieve or json.el

unofficial mirror of emacs-devel@gnu.org 

* bug#23750: 25.0.95; bug in url-retrieve or json.el
@ 2016-11-29  8:22 Kentaro NAKAZAWA
  2016-11-29  9:54 ` Andreas Schwab
  0 siblings, 1 reply; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29  8:22 UTC (permalink / raw)
  To: dgutov, emacs-devel

Why can't I use multibyte text in HTTP requests?
The following valid HTTP request fails.

(require 'json)
(let* ((content "ほげ <- VALID utf-8 Japanese multibyte text")
       (url "https://api.github.com/gists")
       (url-request-method "POST")
       (url-request-data
        (json-encode
         `(("description" . "test")
           ("public" . false)
           ("files" . (("test.txt" . (("content" . ,content)))))))))
  (with-current-buffer (url-retrieve-synchronously url)
    (buffer-string)))
=> url-http-create-request: Multibyte text in HTTP request: POST /gists
HTTP/1.1

Please apply the following patch.

--- url-http.el.orig	2016-09-15 17:16:04.000000000 +0900
+++ url-http.el	2016-11-29 17:10:57.018703500 +0900
@@ -351,16 +351,12 @@
              (if url-http-data
                  (concat
                   "Content-length: " (number-to-string
-                                      (length url-http-data))
+                                      (string-bytes url-http-data))
                   "\r\n"))
              ;; End request
              "\r\n"
              ;; Any data
              url-http-data))
-    ;; Bug#23750
-    (unless (= (string-bytes request)
-               (length request))
-      (error "Multibyte text in HTTP request: %s" request))
     (url-http-debug "Request is: \n%s" request)
     request))




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29  8:22 bug#23750: 25.0.95; bug in url-retrieve or json.el Kentaro NAKAZAWA
@ 2016-11-29  9:54 ` Andreas Schwab
  2016-11-29 10:06   ` Kentaro NAKAZAWA
  0 siblings, 1 reply; 88+ messages in thread
From: Andreas Schwab @ 2016-11-29  9:54 UTC (permalink / raw)
  To: Kentaro NAKAZAWA; +Cc: emacs-devel, dgutov

On Nov 29 2016, Kentaro NAKAZAWA <kentaro.nakazawa@nifty.com> wrote:

> Why can't I use multibyte text in HTTP requests?

You need to encode it.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29  9:54 ` Andreas Schwab
@ 2016-11-29 10:06   ` Kentaro NAKAZAWA
  2016-11-29 10:08     ` Dmitry Gutov
  0 siblings, 1 reply; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29 10:06 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel, dgutov

On 2016/11/29 18:54, Andreas Schwab wrote:

> You need to encode it.

The text is encoded as UTF-8.
Valid UTF-8 text also contains multibyte characters.
(By multibyte text I mean text for which
(/= (string-bytes text) (length text)) => t.)

How can I correctly POST multibyte text?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:06   ` Kentaro NAKAZAWA
@ 2016-11-29 10:08     ` Dmitry Gutov
  2016-11-29 10:23       ` Kentaro NAKAZAWA
  0 siblings, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-29 10:08 UTC (permalink / raw)
  To: Kentaro NAKAZAWA, Andreas Schwab; +Cc: emacs-devel

On 29.11.2016 12:06, Kentaro NAKAZAWA wrote:

> The text is encoded as UTF-8.
> Valid UTF-8 text also contains multibyte characters.
> (By multibyte text I mean text for which
> (/= (string-bytes text) (length text)) => t.)
>
> How can I correctly POST multibyte text?

You encode it to a unibyte string using encode-coding-string.
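
A minimal sketch of that distinction, assuming nothing beyond built-in
functions (the `demo-` variable names are made up for illustration):
`encode-coding-string` takes a multibyte string of characters and returns
a unibyte string of bytes. The escapes `\u307b\u3052` spell the two
Japanese characters from the report, so the snippet does not depend on how
the file itself is decoded.

```elisp
;; Illustrative sketch; `demo-text' and `demo-bytes' are hypothetical names.
(defvar demo-text "\u307b\u3052"
  "Decoded text: a multibyte string of 2 characters.")
(defvar demo-bytes (encode-coding-string demo-text 'utf-8)
  "The same text encoded as UTF-8: a unibyte string of raw bytes.")

(multibyte-string-p demo-text)    ; non-nil: elements are characters
(multibyte-string-p demo-bytes)   ; nil: each element is one byte
(length demo-text)                ; number of characters
(length demo-bytes)               ; number of bytes in the UTF-8 encoding

;; For a unibyte string, `string-bytes' and `length' always agree, which is
;; the property the url-http check quoted in this thread tests for.
(= (string-bytes demo-bytes) (length demo-bytes))
```

The encoded string is what belongs in `url-request-data`; the decoded one
is what the rest of the Lisp code normally works with.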




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:08     ` Dmitry Gutov
@ 2016-11-29 10:23       ` Kentaro NAKAZAWA
  2016-11-29 10:34         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29 10:23 UTC (permalink / raw)
  To: Dmitry Gutov, Andreas Schwab; +Cc: emacs-devel



On 2016/11/29 19:08, Dmitry Gutov wrote:

> You encode it to a unibyte string using encode-coding-string.

(let* ((content (encode-coding-string
                 "ほげ <- VALID utf-8 Japanese multibyte text"
                 'us-ascii))
=> The following text was POSTed.
?? <- VALID utf-8 Japanese multibyte text
^^Two question marks

(let* ((content (encode-coding-string
                 "ほげ <- VALID utf-8 Japanese multibyte text"
                 'raw-text))
=> url-http-create-request: Multibyte text in HTTP request: POST /gists
HTTP/1.1

I tried various things but I do not know how to do it ...




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:23       ` Kentaro NAKAZAWA
@ 2016-11-29 10:34         ` Lars Ingebrigtsen
  2016-11-29 10:38           ` Kentaro NAKAZAWA
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-11-29 10:34 UTC (permalink / raw)
  To: Kentaro NAKAZAWA; +Cc: emacs-devel

Kentaro NAKAZAWA <kentaro.nakazawa@nifty.com> writes:

> (let* ((content (encode-coding-string
>                  "ほげ <- VALID utf-8 Japanese multibyte text"
>                  'us-ascii))

Use

(encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:34         ` Lars Ingebrigtsen
@ 2016-11-29 10:38           ` Kentaro NAKAZAWA
  2016-11-29 10:42             ` Lars Ingebrigtsen
  2016-11-29 10:50             ` Dmitry Gutov
  0 siblings, 2 replies; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29 10:38 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

On 2016/11/29 19:34, Lars Ingebrigtsen wrote:

> (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)

=> url-http-create-request: Multibyte text in HTTP request: POST /gists
HTTP/1.1

It is the same result.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:38           ` Kentaro NAKAZAWA
@ 2016-11-29 10:42             ` Lars Ingebrigtsen
  2016-11-29 10:48               ` Kentaro NAKAZAWA
  2016-11-29 10:49               ` Dmitry Gutov
  2016-11-29 10:50             ` Dmitry Gutov
  1 sibling, 2 replies; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-11-29 10:42 UTC (permalink / raw)
  To: Kentaro NAKAZAWA; +Cc: emacs-devel

Kentaro NAKAZAWA <kentaro.nakazawa@nifty.com> writes:

> On 2016/11/29 19:34, Lars Ingebrigtsen wrote:
>
>> (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)
>
> => url-http-create-request: Multibyte text in HTTP request: POST /gists
> HTTP/1.1
>
> It is the same result.

Uhm...  how about

(string-as-unibyte
 (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8))

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:42             ` Lars Ingebrigtsen
@ 2016-11-29 10:48               ` Kentaro NAKAZAWA
  2016-11-29 10:49               ` Dmitry Gutov
  1 sibling, 0 replies; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29 10:48 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

On 2016/11/29 19:42, Lars Ingebrigtsen wrote:

> (string-as-unibyte
>  (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8))

=> url-http-create-request: Multibyte text in HTTP request: POST /gists
HTTP/1.1

This is also the same result...




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:42             ` Lars Ingebrigtsen
  2016-11-29 10:48               ` Kentaro NAKAZAWA
@ 2016-11-29 10:49               ` Dmitry Gutov
  1 sibling, 0 replies; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-29 10:49 UTC (permalink / raw)
  To: Lars Ingebrigtsen, Kentaro NAKAZAWA; +Cc: emacs-devel

On 29.11.2016 12:42, Lars Ingebrigtsen wrote:

> (string-as-unibyte
>  (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8))

That shouldn't be necessary.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:38           ` Kentaro NAKAZAWA
  2016-11-29 10:42             ` Lars Ingebrigtsen
@ 2016-11-29 10:50             ` Dmitry Gutov
  2016-11-29 10:55               ` Kentaro NAKAZAWA
  1 sibling, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-29 10:50 UTC (permalink / raw)
  To: Kentaro NAKAZAWA, Lars Ingebrigtsen; +Cc: emacs-devel

On 29.11.2016 12:38, Kentaro NAKAZAWA wrote:
> On 2016/11/29 19:34, Lars Ingebrigtsen wrote:
>
>> (encode-coding-string "ほげ <- VALID utf-8 Japanese multibyte text" 'utf-8)
>
> => url-http-create-request: Multibyte text in HTTP request: POST /gists
> HTTP/1.1
>
> It is the same result.

Do you have a full example to reproduce this?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:50             ` Dmitry Gutov
@ 2016-11-29 10:55               ` Kentaro NAKAZAWA
  2016-11-29 10:59                 ` Dmitry Gutov
  0 siblings, 1 reply; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29 10:55 UTC (permalink / raw)
  To: Dmitry Gutov, Lars Ingebrigtsen; +Cc: emacs-devel

On 2016/11/29 19:50, Dmitry Gutov wrote:

> Do you have a full example to reproduce this?

(require 'json)
(let* ((content "ほげ <- VALID utf-8 Japanese multibyte text")
       (url "https://api.github.com/gists")
       (url-request-method "POST")
       (url-request-data
        (json-encode
         `(("description" . "test")
           ("public" . false)
           ("files" . (("test.txt" . (("content" . ,content)))))))))
  (with-current-buffer (url-retrieve-synchronously url)
    (buffer-string)))

Evaluate the above in *scratch*; it posts to a private anonymous gist.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:55               ` Kentaro NAKAZAWA
@ 2016-11-29 10:59                 ` Dmitry Gutov
  2016-11-29 11:03                   ` Kentaro NAKAZAWA
  0 siblings, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-29 10:59 UTC (permalink / raw)
  To: Kentaro NAKAZAWA, Lars Ingebrigtsen; +Cc: emacs-devel

On 29.11.2016 12:55, Kentaro NAKAZAWA wrote:
> On 2016/11/29 19:50, Dmitry Gutov wrote:
>
>> Do you have a full example to reproduce this?
>
> (require 'json)
> (let* ((content "ほげ <- VALID utf-8 Japanese multibyte text")
>        (url "https://api.github.com/gists")
>        (url-request-method "POST")
>        (url-request-data
>         (json-encode
>          `(("description" . "test")
>            ("public" . false)
>            ("files" . (("test.txt" . (("content" . ,content)))))))))
>   (with-current-buffer (url-retrieve-synchronously url)
>     (buffer-string)))

Where is the encode-coding-string call?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 10:59                 ` Dmitry Gutov
@ 2016-11-29 11:03                   ` Kentaro NAKAZAWA
  2016-11-29 11:05                     ` Dmitry Gutov
  0 siblings, 1 reply; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29 11:03 UTC (permalink / raw)
  To: Dmitry Gutov, Lars Ingebrigtsen; +Cc: emacs-devel

On 2016/11/29 19:59, Dmitry Gutov wrote:

> Where is the encode-coding-string call?

Sorry, this is it.

(let* ((content (encode-coding-string
                 "ほげ <- VALID utf-8 Japanese multibyte text"
                 'utf-8))
       (url "https://api.github.com/gists")
       (url-request-method "POST")
       (url-request-data
        (json-encode
         `(("description" . "test")
           ("public" . false)
           ("files" . (("test.txt" . (("content" . ,content)))))))))
  (with-current-buffer (url-retrieve-synchronously url)
    (buffer-string)))





* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 11:03                   ` Kentaro NAKAZAWA
@ 2016-11-29 11:05                     ` Dmitry Gutov
  2016-11-29 11:12                       ` Kentaro NAKAZAWA
  2016-11-29 17:23                       ` Eli Zaretskii
  0 siblings, 2 replies; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-29 11:05 UTC (permalink / raw)
  To: Kentaro NAKAZAWA, Lars Ingebrigtsen; +Cc: emacs-devel

On 29.11.2016 13:03, Kentaro NAKAZAWA wrote:

> (let* ((content (encode-coding-string
>                  "ほげ <- VALID utf-8 Japanese multibyte text"
>                  'utf-8))
>        (url "https://api.github.com/gists")
>        (url-request-method "POST")
>        (url-request-data
>         (json-encode
>          `(("description" . "test")
>            ("public" . false)
>            ("files" . (("test.txt" . (("content" . ,content)))))))))
>   (with-current-buffer (url-retrieve-synchronously url)
>     (buffer-string)))

json-encode returns a multibyte string. Try this:

(let* ((content "ほげ <- VALID utf-8 Japanese multibyte text")
       (url "https://api.github.com/gists")
       (url-request-method "POST")
       (url-request-data
        (encode-coding-string
         (json-encode
          `(("description" . "test")
            ("public" . false)
            ("files" . (("test.txt" . (("content" . ,content)))))))
         'utf-8)))
  (with-current-buffer (url-retrieve-synchronously url)
    (buffer-string)))




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 11:05                     ` Dmitry Gutov
@ 2016-11-29 11:12                       ` Kentaro NAKAZAWA
  2016-11-29 17:23                       ` Eli Zaretskii
  1 sibling, 0 replies; 88+ messages in thread
From: Kentaro NAKAZAWA @ 2016-11-29 11:12 UTC (permalink / raw)
  To: Dmitry Gutov, Lars Ingebrigtsen; +Cc: emacs-devel

On 2016/11/29 20:05, Dmitry Gutov wrote:

> json-encode returns a multibyte string. Try this:

It worked! Thank you for showing me the correct code!
I confirmed the correct result with the code below.

(let* ((content "ほげ <- VALID utf-8 Japanese multibyte text")
       (url "https://api.github.com/gists")
       (url-request-method "POST")
       (url-request-data
        (encode-coding-string
         (json-encode
          `(("description" . "test")
            ("public" . false)
            ("files" . (("test.txt" . (("content" . ,content)))))))
         'utf-8)))
  (with-current-buffer (url-retrieve-synchronously url)
    (when (url-http-parse-headers)
      (search-forward-regexp "\n\\s-*\n" nil t)
      (browse-url (cdr (assoc 'html_url (json-read)))))))




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 11:05                     ` Dmitry Gutov
  2016-11-29 11:12                       ` Kentaro NAKAZAWA
@ 2016-11-29 17:23                       ` Eli Zaretskii
  2016-11-29 23:09                         ` Philipp Stephani
  1 sibling, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-29 17:23 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: larsi, kentaro.nakazawa, emacs-devel

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 29 Nov 2016 13:05:39 +0200
> Cc: emacs-devel@gnu.org
> 
> On 29.11.2016 13:03, Kentaro NAKAZAWA wrote:
> 
> > (let* ((content (encode-coding-string
> >                  "ほげ <- VALID utf-8 Japanese multibyte text"
> >                  'utf-8))
> >        (url "https://api.github.com/gists")
> >        (url-request-method "POST")
> >        (url-request-data
> >         (json-encode
> >          `(("description" . "test")
> >            ("public" . false)
> >            ("files" . (("test.txt" . (("content" . ,content)))))))))
> >   (with-current-buffer (url-retrieve-synchronously url)
> >     (buffer-string)))
> 
> json-encode returns a multibyte string.

Any idea why?  Is it again that 'concat' misfeature, when one of the
strings is pure-ASCII, but happens to be multibyte?  Maybe we should
do something about that.

Thanks.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 17:23                       ` Eli Zaretskii
@ 2016-11-29 23:09                         ` Philipp Stephani
  2016-11-29 23:18                           ` Philipp Stephani
                                             ` (2 more replies)
  0 siblings, 3 replies; 88+ messages in thread
From: Philipp Stephani @ 2016-11-29 23:09 UTC (permalink / raw)
  To: Eli Zaretskii, Dmitry Gutov; +Cc: larsi, kentaro.nakazawa, emacs-devel


Eli Zaretskii <eliz@gnu.org> wrote on Tue, 29 Nov 2016 at 18:24:

> > From: Dmitry Gutov <dgutov@yandex.ru>
> > Date: Tue, 29 Nov 2016 13:05:39 +0200
> > Cc: emacs-devel@gnu.org
> >
> > On 29.11.2016 13:03, Kentaro NAKAZAWA wrote:
> >
> > > (let* ((content (encode-coding-string
> > >                  "ほげ <- VALID utf-8 Japanese multibyte text"
> > >                  'utf-8))
> > >        (url "https://api.github.com/gists")
> > >        (url-request-method "POST")
> > >        (url-request-data
> > >         (json-encode
> > >          `(("description" . "test")
> > >            ("public" . false)
> > >            ("files" . (("test.txt" . (("content" . ,content)))))))))
> > >   (with-current-buffer (url-retrieve-synchronously url)
> > >     (buffer-string)))
> >
> > json-encode returns a multibyte string.
>
> Any idea why?


Because (symbol-name 'false) returns a multibyte string. I guess the
ultimate reason is that the reader always creates multibyte strings for
symbol names.


> Is it again that 'concat' misfeature, when one of the
> strings is pure-ASCII, but happens to be multibyte?


Why is it a misfeature? I'd expect a concatenation of multibyte and unibyte
strings to either implicitly upgrade to a multibyte string (as in Python
2) or raise a signal (as in Python 3).
That url-retrieve breaks in this case is unfortunate, but I guess we can't
do much about it without breaking other stuff. Maybe the behavior regarding
unibyte and multibyte strings (e.g. what kinds of strings the reader and
`concat' generate) should simply be documented.



* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 23:09                         ` Philipp Stephani
@ 2016-11-29 23:18                           ` Philipp Stephani
  2016-11-30 15:11                             ` Eli Zaretskii
  2016-11-30  0:16                           ` Dmitry Gutov
  2016-11-30 15:06                           ` Eli Zaretskii
  2 siblings, 1 reply; 88+ messages in thread
From: Philipp Stephani @ 2016-11-29 23:18 UTC (permalink / raw)
  To: Eli Zaretskii, Dmitry Gutov; +Cc: larsi, kentaro.nakazawa, emacs-devel


Philipp Stephani <p.stephani2@gmail.com> wrote on Wed, 30 Nov 2016 at
00:09:

> That url-retrieve breaks in this case is unfortunate, but I guess we can't
> do much about it without breaking other stuff.
>

Ah, I guess the URL functions could simply call string-to-unibyte, that
should do the right thing in all cases.



* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 23:09                         ` Philipp Stephani
  2016-11-29 23:18                           ` Philipp Stephani
@ 2016-11-30  0:16                           ` Dmitry Gutov
  2016-11-30 15:13                             ` Eli Zaretskii
  2016-11-30 15:06                           ` Eli Zaretskii
  2 siblings, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-30  0:16 UTC (permalink / raw)
  To: Philipp Stephani, Eli Zaretskii; +Cc: larsi, kentaro.nakazawa, emacs-devel

On 30.11.2016 01:09, Philipp Stephani wrote:

> Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always creates multibyte strings for symbol names.

Yes. For the same reason,

(json-encode-alist '((a . "abc")))

also returns a multibyte string. And we're likely to see symbols as keys 
a lot.
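
A small sketch for checking this locally, assuming only json.el and
built-in functions (the `demo-` names are hypothetical). It prints the
representation of the `json-encode-alist` result rather than asserting
it, since that may depend on the Emacs version and on how the symbol was
read, and then encodes the result so that it is unibyte either way.

```elisp
(require 'json)

;; `demo-json' and `demo-json-bytes' are hypothetical names for illustration.
(defvar demo-json (json-encode-alist '((a . "abc")))
  "JSON text produced from an alist with a symbol key.")

;; No particular answer is claimed here; inspect it on your own Emacs.
(message "multibyte? %S" (multibyte-string-p demo-json))

;; Encoding yields a unibyte string regardless of the answer above.
(defvar demo-json-bytes (encode-coding-string demo-json 'utf-8)
  "UTF-8 bytes of `demo-json', suitable for `url-request-data'.")
(multibyte-string-p demo-json-bytes)    ; nil
```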




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 23:09                         ` Philipp Stephani
  2016-11-29 23:18                           ` Philipp Stephani
  2016-11-30  0:16                           ` Dmitry Gutov
@ 2016-11-30 15:06                           ` Eli Zaretskii
  2016-11-30 15:31                             ` Stefan Monnier
  2 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 15:06 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Tue, 29 Nov 2016 23:09:57 +0000
> Cc: larsi@gnus.org, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
> 
>  > json-encode returns a multibyte string.
> 
>  Any idea why? 
> 
> Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always
> creates multibyte strings for symbol names.

I'm not sure I understand how symbol-name comes into play here.  Can
you help me understand this?

>  Is it again that 'concat' misfeature, when one of the
>  strings is pure-ASCII, but happens to be multibyte?
> 
> Why is it a misfeature?

Because a pure-ASCII string doesn't need to be multibyte; it only
becomes that by accident.  The net result is that this misfeature
gets in the way when you want to produce a unibyte string by
concatenating an encoded string and some ASCII text.

> I'd expect a concatenation of multibyte and unibyte strings to either implicitly upgrade
> to a multibyte string (as in Python 2) or raise a signal (as in Python 3).

But when all the strings are either unibyte or pure-ASCII, we could
produce a unibyte string without losing anything.
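
The `concat` behavior under discussion can be sketched as follows. This
is an illustration using built-in functions only; the `demo-` names are
hypothetical, and `string-to-multibyte` is used here merely to obtain a
pure-ASCII string that is stored as multibyte.

```elisp
;; Hypothetical names, for illustration only.
(defvar demo-encoded (encode-coding-string "\u307b\u3052" 'utf-8)
  "Unibyte string holding the UTF-8 bytes of a two-character sample.")
(defvar demo-ascii-mb (string-to-multibyte "abc")
  "Pure-ASCII text that happens to be stored as a multibyte string.")
(defvar demo-mixed (concat demo-ascii-mb demo-encoded)
  "Result of concatenating a multibyte and a unibyte string.")

;; One multibyte argument is enough to make the whole result multibyte.
(multibyte-string-p demo-mixed)                    ; non-nil

;; The non-ASCII bytes became raw-byte characters, which occupy more than
;; one byte internally, so the url-http comparison no longer holds.
(= (string-bytes demo-mixed) (length demo-mixed))  ; nil

;; With only unibyte arguments the result stays unibyte.
(multibyte-string-p (concat "abc" demo-encoded))   ; nil
```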




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-29 23:18                           ` Philipp Stephani
@ 2016-11-30 15:11                             ` Eli Zaretskii
  2016-11-30 15:20                               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 15:11 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Tue, 29 Nov 2016 23:18:21 +0000
> Cc: larsi@gnus.org, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
> 
> Ah, I guess the URL functions could simply call string-to-unibyte, that should do the right thing in all cases. 

That would bring back the problem which caused us to introduce the
test which triggered this bug report.  string-to-unibyte can produce
results that might surprise naïve users, and it also can signal an
error whose text is not fit for showing to users.

We are trying to avoid using that function, for these very reasons.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30  0:16                           ` Dmitry Gutov
@ 2016-11-30 15:13                             ` Eli Zaretskii
  2016-11-30 15:17                               ` Dmitry Gutov
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 15:13 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

> Cc: larsi@gnus.org, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Wed, 30 Nov 2016 02:16:36 +0200
> 
> On 30.11.2016 01:09, Philipp Stephani wrote:
> 
> > Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always creates multibyte strings for symbol names.
> 
> Yes. For the same reason,
> 
> (json-encode-alist '((a . "abc")))
> 
> also returns a multibyte string. And we're likely to see symbols as keys 
> a lot.

Can we do something about that in json-encode-* functions?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:13                             ` Eli Zaretskii
@ 2016-11-30 15:17                               ` Dmitry Gutov
  2016-11-30 15:32                                 ` Stefan Monnier
  2016-11-30 15:42                                 ` Eli Zaretskii
  0 siblings, 2 replies; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-30 15:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

On 30.11.2016 17:13, Eli Zaretskii wrote:

> Can we do something about that in json-encode-* functions?

json-encode uses the previously mentioned symbol-name, which returns 
multibyte values. What would we do about that?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:11                             ` Eli Zaretskii
@ 2016-11-30 15:20                               ` Lars Ingebrigtsen
  2016-11-30 15:43                                 ` Eli Zaretskii
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-11-30 15:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, emacs-devel, kentaro.nakazawa, dgutov

Eli Zaretskii <eliz@gnu.org> writes:

> We are trying to avoid using that function, for these very reasons.

Indeed.

The entire url-retrieve interface is more than a little broken in many
small ways.

In the next-generation URL library interface (the `with-url' thing
discussed intermittently the past few years) I think it would make sense
to supply the caller with a method to say what charset you want stuff
like this to be encoded with.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:06                           ` Eli Zaretskii
@ 2016-11-30 15:31                             ` Stefan Monnier
  0 siblings, 0 replies; 88+ messages in thread
From: Stefan Monnier @ 2016-11-30 15:31 UTC (permalink / raw)
  To: emacs-devel

> But when all the strings are either unibyte or pure-ASCII, we could
> produce a unibyte string without losing anything.

Actually, technically, if we take a multibyte string which only contains
pure-ASCII and convert it to unibyte, we lose information: with
a multibyte string, we can compare the `size` and the `size_byte`
fields, and if they're equal we know we have a pure-ASCII string,
whereas with a unibyte string, we'd have to scan the whole string
looking for a byte >= 128 to determine that it's pure-ASCII.

So maybe the change should be that when concat has to combine a unibyte
string and a multibyte string, it should first look to see if the
multibyte string has `size == size_byte` and if so, generate
a unibyte string.
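
At the Lisp level the same test might look like the sketch below. It is
not a proposed patch to `concat`: `demo-pure-ascii-p` and
`demo-ascii-to-unibyte` are hypothetical helpers written for this
illustration, and the multibyte branch relies on `string-bytes` and
`length` playing the roles of `size_byte` and `size`.

```elisp
;; Hypothetical helpers; neither exists in Emacs under these names.
(defun demo-pure-ascii-p (string)
  "Return non-nil if STRING contains only ASCII characters."
  (if (multibyte-string-p string)
      ;; Multibyte: every non-ASCII character takes at least two bytes, so
      ;; equal byte and character counts mean pure ASCII.
      (= (string-bytes string) (length string))
    ;; Unibyte: no shortcut, scan for a byte >= 128.
    (let ((ascii t))
      (dotimes (i (length string))
        (when (>= (aref string i) 128)
          (setq ascii nil)))
      ascii)))

(defun demo-ascii-to-unibyte (string)
  "Return a unibyte copy of STRING if it is pure-ASCII multibyte text.
Any other STRING is returned unchanged."
  (if (and (multibyte-string-p string)
           (demo-pure-ascii-p string))
      ;; The conversion cannot fail here because of the check above.
      (string-to-unibyte string)
    string))
```

Non-ASCII multibyte text passes through `demo-ascii-to-unibyte` untouched,
so it would still need an explicit `encode-coding-string` call.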

        Stefan


* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:17                               ` Dmitry Gutov
@ 2016-11-30 15:32                                 ` Stefan Monnier
  2016-11-30 15:42                                 ` Eli Zaretskii
  1 sibling, 0 replies; 88+ messages in thread
From: Stefan Monnier @ 2016-11-30 15:32 UTC (permalink / raw)
  To: emacs-devel

>> Can we do something about that in json-encode-* functions?
> json-encode uses the previously mentioned symbol-name, which returns
> multibyte values. What would we do about that?

We need to encode the symbol name since it's a plain string which can
contain non-ASCII chars.


        Stefan





* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:17                               ` Dmitry Gutov
  2016-11-30 15:32                                 ` Stefan Monnier
@ 2016-11-30 15:42                                 ` Eli Zaretskii
  2016-11-30 15:45                                   ` Dmitry Gutov
  1 sibling, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 15:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

> Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com,
>  emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Wed, 30 Nov 2016 17:17:18 +0200
> 
> On 30.11.2016 17:13, Eli Zaretskii wrote:
> 
> > Can we do something about that in json-encode-* functions?
> 
> json-encode uses the previously mentioned symbol-name, which returns 
> multibyte values. What would we do about that?

Check that the value returned by symbol-name is pure-ASCII, and if so,
make it unibyte?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:20                               ` Lars Ingebrigtsen
@ 2016-11-30 15:43                                 ` Eli Zaretskii
  2016-11-30 15:46                                   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 15:43 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Philipp Stephani <p.stephani2@gmail.com>,  dgutov@yandex.ru,  kentaro.nakazawa@nifty.com,  emacs-devel@gnu.org
> Date: Wed, 30 Nov 2016 16:20:20 +0100
> 
> In the next-generation URL library interface (the `with-url' thing
> discussed intermittently the past few years) I think it would make sense
> to supply the caller with a method to say what charset you want stuff
> like this to be encoded with.

Would they ever want anything except utf-8?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:42                                 ` Eli Zaretskii
@ 2016-11-30 15:45                                   ` Dmitry Gutov
  2016-11-30 15:48                                     ` Lars Ingebrigtsen
  2016-11-30 16:23                                     ` Eli Zaretskii
  0 siblings, 2 replies; 88+ messages in thread
From: Dmitry Gutov @ 2016-11-30 15:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

On 30.11.2016 17:42, Eli Zaretskii wrote:

>> json-encode uses the previously mentioned symbol-name, which returns
>> multibyte values. What would we do about that?
>
> Check that the value returned by symbol-name is pure-ASCII, and if so,
> make it unibyte?

In json-encode? Should it really deal with that concern explicitly?

I could understand an idea along the lines of "use a different 
algorithm", but calling encode-coding-string inside json-encode sounds odd.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:43                                 ` Eli Zaretskii
@ 2016-11-30 15:46                                   ` Lars Ingebrigtsen
  0 siblings, 0 replies; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-11-30 15:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

Eli Zaretskii <eliz@gnu.org> writes:

> Would they ever want anything except utf-8?

Standard HTTP values should be URL-encoded (or similar) anyway, so
non-URL-encoded values are for pretty non-standard use.  So I would
expect people to create interfaces in whatever charset they happen to
think of.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:45                                   ` Dmitry Gutov
@ 2016-11-30 15:48                                     ` Lars Ingebrigtsen
  2016-11-30 16:25                                       ` Eli Zaretskii
  2016-12-28 18:22                                       ` Philipp Stephani
  2016-11-30 16:23                                     ` Eli Zaretskii
  1 sibling, 2 replies; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-11-30 15:48 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel, p.stephani2, kentaro.nakazawa

Dmitry Gutov <dgutov@yandex.ru> writes:

> In json-encode? Should it really deal with that concern explicitly?
>
> I could understand an idea along the lines of "use a different
> algorithm", but calling encode-coding-string inside json-encode sounds
> odd.

Yes, this is not a json.el problem at all.  It does the correct thing,
and shouldn't be changed.

It's just url.el being lacking in features, as usual.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:45                                   ` Dmitry Gutov
  2016-11-30 15:48                                     ` Lars Ingebrigtsen
@ 2016-11-30 16:23                                     ` Eli Zaretskii
  2016-12-01  0:30                                       ` Dmitry Gutov
  1 sibling, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 16:23 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

> Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com,
>  emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Wed, 30 Nov 2016 17:45:25 +0200
> 
> On 30.11.2016 17:42, Eli Zaretskii wrote:
> 
> >> json-encode uses the previously mentioned symbol-name, which returns
> >> multibyte values. What would we do about that?
> >
> > Check that the value returned by symbol-name is pure-ASCII, and if so,
> > make it unibyte?
> 
> In json-encode? Should it really deal with that concern explicitly?

Since both the original issue and this one are at least indirectly
caused by json.el, it might make sense.

> I could understand an idea along the lines of "use a different 
> algorithm", but calling encode-coding-string inside json-encode sounds odd.

I didn't mean encode-coding-string, I meant string-make-unibyte, which
for a pure-ASCII string doesn't touch the contents.
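
A minimal sketch of that claim, assuming a sample string built with
string-to-multibyte so that it is reliably marked multibyte.  The helper
name my-describe-string is hypothetical, and string-make-unibyte is, as
far as is known, marked obsolete in later Emacs releases:

```elisp
(defun my-describe-string (s)
  "Return (MULTIBYTE-P CHARS BYTES) for the string S.
Hypothetical helper, for illustration only."
  (list (multibyte-string-p s) (length s) (string-bytes s)))

(let* ((multi (string-to-multibyte "description")) ; ASCII, marked multibyte
       (uni (string-make-unibyte multi)))          ; same text, marked unibyte
  ;; Only the multibyte mark differs; the character and byte counts
  ;; and the contents are the same for both strings.
  (list (my-describe-string multi)
        (my-describe-string uni)
        (string= multi uni)))
```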



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:48                                     ` Lars Ingebrigtsen
@ 2016-11-30 16:25                                       ` Eli Zaretskii
  2016-11-30 16:27                                         ` Lars Ingebrigtsen
  2016-11-30 18:23                                         ` Philipp Stephani
  2016-12-28 18:22                                       ` Philipp Stephani
  1 sibling, 2 replies; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 16:25 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  p.stephani2@gmail.com,  kentaro.nakazawa@nifty.com,  emacs-devel@gnu.org
> Date: Wed, 30 Nov 2016 16:48:09 +0100
> 
> Yes, this is not a json.el problem at all.  It does the correct thing,
> and shouldn't be changed.

??? Why should any code care whether a pure-ASCII string is marked as
unibyte or as multibyte?  Both are "correct".



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 16:25                                       ` Eli Zaretskii
@ 2016-11-30 16:27                                         ` Lars Ingebrigtsen
  2016-11-30 16:42                                           ` Eli Zaretskii
  2016-11-30 18:23                                         ` Philipp Stephani
  1 sibling, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-11-30 16:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

Eli Zaretskii <eliz@gnu.org> writes:

>> Yes, this is not a json.el problem at all.  It does the correct thing,
>> and shouldn't be changed.
>
> ??? Why should any code care whether a pure-ASCII string is marked as
> unibyte or as multibyte?  Both are "correct".

That's right -- why should any code care?  Yet url.el does.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 16:27                                         ` Lars Ingebrigtsen
@ 2016-11-30 16:42                                           ` Eli Zaretskii
  2016-11-30 18:25                                             ` Philipp Stephani
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 16:42 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: dgutov@yandex.ru,  p.stephani2@gmail.com,  kentaro.nakazawa@nifty.com,  emacs-devel@gnu.org
> Date: Wed, 30 Nov 2016 17:27:05 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Yes, this is not a json.el problem at all.  It does the correct thing,
> >> and shouldn't be changed.
> >
> > ??? Why should any code care whether a pure-ASCII string is marked as
> > unibyte or as multibyte?  Both are "correct".
> 
> That's right -- why should any code care?  Yet url.el does.

No, it doesn't, not if the string is plain ASCII.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 16:25                                       ` Eli Zaretskii
  2016-11-30 16:27                                         ` Lars Ingebrigtsen
@ 2016-11-30 18:23                                         ` Philipp Stephani
  2016-11-30 18:44                                           ` Eli Zaretskii
  1 sibling, 1 reply; 88+ messages in thread
From: Philipp Stephani @ 2016-11-30 18:23 UTC (permalink / raw)
  To: Eli Zaretskii, Lars Ingebrigtsen; +Cc: emacs-devel, kentaro.nakazawa, dgutov


Eli Zaretskii <eliz@gnu.org> schrieb am Mi., 30. Nov. 2016 um 17:25 Uhr:

> > From: Lars Ingebrigtsen <larsi@gnus.org>
> > Cc: Eli Zaretskii <eliz@gnu.org>,  p.stephani2@gmail.com,
> kentaro.nakazawa@nifty.com,  emacs-devel@gnu.org
> > Date: Wed, 30 Nov 2016 16:48:09 +0100
> >
> > Yes, this is not a json.el problem at all.  It does the correct thing,
> > and shouldn't be changed.
>
> ??? Why should any code care whether a pure-ASCII string is marked as
> unibyte or as multibyte?  Both are "correct".
>

I guess the problem is that process-send-string cares. If it didn't, we
wouldn't have the problem.
For URL, we'd need functions like
  (byte-array-length s) = (length (string-to-unibyte s))
  (process-send-bytes s) = (process-send-string (string-to-unibyte s))
(conceptually; process-send-string also does EOL conversion, which should
never be done for HTTP bodies.)
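
Spelled out as a rough sketch, the two conceptual functions could look
like this.  The my- names are hypothetical, nothing like them exists in
Emacs, and the process coding system (including EOL conversion) is
deliberately left untouched here:

```elisp
(defun my-byte-array-length (s)
  "Return the number of bytes in S.
S must be unibyte or pure ASCII; `string-to-unibyte' signals an
error for any other multibyte text.  Hypothetical helper."
  (length (string-to-unibyte s)))

(defun my-process-send-bytes (process s)
  "Send S to PROCESS after checking that it is a plain byte sequence.
Hypothetical helper.  `process-send-string' still applies the
coding system of PROCESS, so this does not address EOL conversion."
  (process-send-string process (string-to-unibyte s)))
```

With these, text that has not been encoded yet is rejected with an error
instead of being sent with a wrong length.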


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 16:42                                           ` Eli Zaretskii
@ 2016-11-30 18:25                                             ` Philipp Stephani
  2016-11-30 18:48                                               ` Eli Zaretskii
  0 siblings, 1 reply; 88+ messages in thread
From: Philipp Stephani @ 2016-11-30 18:25 UTC (permalink / raw)
  To: Eli Zaretskii, Lars Ingebrigtsen; +Cc: dgutov, kentaro.nakazawa, emacs-devel


Eli Zaretskii <eliz@gnu.org> schrieb am Mi., 30. Nov. 2016 um 17:42 Uhr:

> > From: Lars Ingebrigtsen <larsi@gnus.org>
> > Cc: dgutov@yandex.ru,  p.stephani2@gmail.com,
> kentaro.nakazawa@nifty.com,  emacs-devel@gnu.org
> > Date: Wed, 30 Nov 2016 17:27:05 +0100
> >
> > Eli Zaretskii <eliz@gnu.org> writes:
> >
> > >> Yes, this is not a json.el problem at all.  It does the correct thing,
> > >> and shouldn't be changed.
> > >
> > > ??? Why should any code care whether a pure-ASCII string is marked as
> > > unibyte or as multibyte?  Both are "correct".
> >
> > That's right -- why should any code care?  Yet url.el does.
>
> No, it doesn't, not if the string is plain ASCII.
>
>
But in that case it isn't, it's morally a byte array.
What Emacs lacks is good support for byte arrays. For HTTP,
process-send-string shouldn't need to deal with encoding or EOL conversion,
it should just accept a byte array and send that, unmodified.


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 18:23                                         ` Philipp Stephani
@ 2016-11-30 18:44                                           ` Eli Zaretskii
  2016-12-28 18:09                                             ` Philipp Stephani
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 18:44 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Wed, 30 Nov 2016 18:23:14 +0000
> Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
> 
>  > Yes, this is not a json.el problem at all. It does the correct thing,
>  > and shouldn't be changed.
> 
>  ??? Why should any code care whether a pure-ASCII string is marked as
>  unibyte or as multibyte? Both are "correct".
> 
> I guess the problem is that process-send-string cares. If it didn't, we wouldn't have the problem.

I don't think I follow.  The error we are talking about is signaled
from url-http-create-request, not from process-send-string.

> For URL, we'd need functions like
> (byte-array-length s) = (length (string-to-unibyte s))

Why do you need this?  string-to-unibyte is well-defined only for
unibyte or ASCII strings (if we forget the raw bytes for a moment), so
length will do.

> (process-send-bytes s) = (process-send-string (string-to-unibyte s))

Why is this needed?  process-send-string already encodes its argument,
which produces a unibyte string.

> (conceptually; process-send-string also does EOL conversion, which should never be done for HTTP
> bodies.) 

I don't understand why.  There are protocols that require CR-LF, no?
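
As a small illustration of what EOL conversion does to a body, assuming
the body is encoded with encode-coding-string (the helper name
my-encoded-length is hypothetical):

```elisp
(defun my-encoded-length (text coding-system)
  "Return the number of bytes TEXT occupies once encoded with CODING-SYSTEM.
Hypothetical helper, for illustration only."
  (length (encode-coding-string text coding-system)))

;; The same three characters give different byte counts depending on
;; the EOL variant of the coding system.
(list (my-encoded-length "a\nb" 'utf-8-unix)  ; LF is kept as is
      (my-encoded-length "a\nb" 'utf-8-dos))  ; LF becomes CR LF
```

A Content-length computed before such a conversion would no longer match
the bytes that are actually sent.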




^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 18:25                                             ` Philipp Stephani
@ 2016-11-30 18:48                                               ` Eli Zaretskii
  2016-12-28 18:18                                                 ` Philipp Stephani
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-11-30 18:48 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Wed, 30 Nov 2016 18:25:09 +0000
> Cc: emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, dgutov@yandex.ru
> 
>  > That's right -- why should any code care? Yet url.el does.
> 
>  No, it doesn't, not if the string is plain ASCII.
> 
> But in that case it isn't, it's morally a byte array.

Yes, because the internal representation of characters in Emacs is a
superset of UTF-8.

> What Emacs lacks is good support for byte arrays.

Unibyte strings are byte arrays.  What do you think we lack in that regard?

> For HTTP, process-send-string shouldn't need to deal
> with encoding or EOL conversion, it should just accept a byte array and send that, unmodified.

I disagree.  Handling unibyte strings is a nuisance, so Emacs allows
most applications to be oblivious to them, and just handle
human-readable text.
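
For reference, a short sketch of a unibyte string used as a byte array.
The helper name my-utf8-bytes is hypothetical; the three bytes are the
UTF-8 encoding of the first character of the string in the original
report:

```elisp
(defun my-utf8-bytes (text)
  "Return the UTF-8 encoding of TEXT as a list of byte values.
Hypothetical helper, for illustration only."
  (append (encode-coding-string text 'utf-8) nil))

;; Elements of a unibyte string are plain integers between 0 and 255.
(my-utf8-bytes "ほ")

;; Going the other way: build the byte array by hand, then decode it.
(decode-coding-string (unibyte-string #xe3 #x81 #xbb) 'utf-8)
```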



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 16:23                                     ` Eli Zaretskii
@ 2016-12-01  0:30                                       ` Dmitry Gutov
  2016-12-01 17:17                                         ` Eli Zaretskii
  2016-12-28 18:25                                         ` Philipp Stephani
  0 siblings, 2 replies; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-01  0:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

On 30.11.2016 18:23, Eli Zaretskii wrote:

> Since both the original issue and this one are at least indirectly
> caused by json.el, it might make sense.

Triggered, more like. JSON is a frequently-used format, but there are 
others. And the same problems will remain when e.g. plain text is used.

> I didn't mean encode-coding-string, I meant string-make-unibyte, which
> for a pure-ASCII string doesn't touch the contents.

Either way, I don't think it's a great idea. Quite the opposite: by 
allowing the programmer to avoid calling `encode-coding-string' in more 
cases, we'll just make the problem in their code harder to find, until 
some user of that code really does need to transfer multibyte content.

Further, now that Emacs 25 is out, and we are allowed to have more 
breaking changes in Emacs 26, I think we should change the check at the 
end of url-http-create-request to just use multibyte-string-p.

Barring some unforeseen consequences, this will solidify the requirement 
that the caller needs to deal with encoding explicitly in all cases, 
before passing the request body to the transport level.
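
On the caller's side, that requirement amounts to something like the
following sketch.  The helper name my-json-request-body is hypothetical,
and UTF-8 is an assumption about what the server expects:

```elisp
(require 'json)

(defun my-json-request-body (object)
  "Return OBJECT serialized as JSON and encoded to a unibyte UTF-8 string.
Hypothetical helper; the result is suitable for `url-request-data'."
  (encode-coding-string (json-encode object) 'utf-8))

;; Hypothetical usage (the URL is a placeholder, not a real endpoint):
;; (let ((url-request-method "POST")
;;       (url-request-extra-headers '(("Content-Type" . "application/json")))
;;       (url-request-data (my-json-request-body '(("description" . "test")))))
;;   (url-retrieve-synchronously "https://example.invalid/"))
```

Because the result is unibyte, length and string-bytes agree on it, so
either form of the check in url-http-create-request accepts it.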

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-01  0:30                                       ` Dmitry Gutov
@ 2016-12-01 17:17                                         ` Eli Zaretskii
  2016-12-02 13:18                                           ` Dmitry Gutov
  2016-12-28 18:25                                         ` Philipp Stephani
  1 sibling, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-01 17:17 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

> Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com,
>  emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Thu, 1 Dec 2016 02:30:15 +0200
> 
> On 30.11.2016 18:23, Eli Zaretskii wrote:
> 
> > Since both the original issue and this one are at least indirectly
> > caused by json.el, it might make sense.
> 
> Triggered, more like.

Nothing wrong with that.  If some issue isn't a bug, but gets in the
way of a broad class of applications, it is okay to silently DTRT for
that class only, in some central place that serves the class.

> Either way, I don't think it's a great idea. Quite the opposite: by 
> allowing the programmer to avoid calling `encode-coding-string' in more 
> cases, we'll just make the problem in their code harder to find, until 
> some user of that code really does need to transfer multibyte content.

I don't think we will win any hearts by nagging application
programmers when we could silently DTRT ourselves.

> Further, now that Emacs 25 is out, and we are allowed to have more 
> breaking changes in Emacs 26, I think we should change the check at the 
> end of url-http-create-request to just use multibyte-string-p.
> 
> Barring some unforeseen consequences, this will solidify the requirement 
> that the caller needs to deal with encoding explicitly in all cases, 
> before passing the request body to the transport level.

Can you show me a patch to that effect, or point me to where it was
posted in the past?  I'm afraid I no longer remember those details.

Thanks.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-01 17:17                                         ` Eli Zaretskii
@ 2016-12-02 13:18                                           ` Dmitry Gutov
  2016-12-02 14:24                                             ` Eli Zaretskii
  2016-12-02 15:29                                             ` Lars Ingebrigtsen
  0 siblings, 2 replies; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-02 13:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

On 01.12.2016 19:17, Eli Zaretskii wrote:

> Nothing wrong with that.  If some issue isn't a bug, but gets in the
> way of a broad class of applications,

I don't think it's useful to extract applications that use JSON+HTTP 
with ASCII-only payloads into a separate class.

Most of the time (or at least very often) it depends on the user, what 
kind of payload gets sent (with multibyte characters or not).

> it is okay to silently DTRT for
> that class only, in some central place that serves the class.

Those central places are coding.c and url/url-*.el. Not sure what can be 
done there, though.

> I don't think we will win any hearts by nagging application
> programmers when we could silently DTRT ourselves.

We can win the hearts of some users, long term, by making the API such 
that it's harder to do the wrong thing.

You yourself suggested multibyte-string-p originally, and I suggested 
the current more permissive approach more or less because the new 
release was very close:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23750#83

> Can you show me a patch to that effect, or point me to where it was
> posted in the past?  I'm afraid I no longer remember those details.

Something like this:

diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index e0e080e..affd5c2 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -358,9 +358,8 @@ url-http-create-request
               ;; Any data
               url-http-data))
      ;; Bug#23750
-    (unless (= (string-bytes request)
-               (length request))
-      (error "Multibyte text in HTTP request: %s" request))
+    (when (multibyte-string-p request)
+      (error "Multibyte text in HTTP request: %s, please translate any multibyte components to unibyte using `encode-coding-string'" request))
      (url-http-debug "Request is: \n%s" request)
      request))





^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 13:18                                           ` Dmitry Gutov
@ 2016-12-02 14:24                                             ` Eli Zaretskii
  2016-12-02 14:35                                               ` Dmitry Gutov
  2016-12-02 14:53                                               ` Yuri Khan
  2016-12-02 15:29                                             ` Lars Ingebrigtsen
  1 sibling, 2 replies; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-02 14:24 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

> Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com,
>  emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Fri, 2 Dec 2016 15:18:48 +0200
> 
> > it is okay to silently DTRT for
> > that class only, in some central place that serves the class.
> 
> Those central places are coding.c and url/url-*.el.

That's not what I meant (and coding.c is definitely not the place),
but let's leave this alone.

> diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
> index e0e080e..affd5c2 100644
> --- a/lisp/url/url-http.el
> +++ b/lisp/url/url-http.el
> @@ -358,9 +358,8 @@ url-http-create-request
>                ;; Any data
>                url-http-data))
>       ;; Bug#23750
> -    (unless (= (string-bytes request)
> -               (length request))
> -      (error "Multibyte text in HTTP request: %s" request))
> +    (when (multibyte-string-p request)
> +      (error "Multibyte text in HTTP request: %s, please translate any multibyte components to unibyte using `encode-coding-string'" request))
>       (url-http-debug "Request is: \n%s" request)
>       request))

This will also reject pure-ASCII strings that just happen to be
multibyte, although there will be no problem with such an HTTP
request.  Do we really want to disallow that use case?
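
To make the difference concrete, here is a sketch of the two checks as
stand-alone predicates.  The my- names are hypothetical; the bodies
mirror the conditions in the patch above:

```elisp
(defun my-permissive-ok-p (request)
  "Non-nil if REQUEST passes the current check.
That check accepts any string in which every character takes one byte."
  (= (string-bytes request) (length request)))

(defun my-strict-ok-p (request)
  "Non-nil if REQUEST passes the proposed check, i.e. it is unibyte."
  (not (multibyte-string-p request)))

;; Three kinds of input:
;;   pure ASCII marked multibyte  -> permissive accepts, strict rejects
;;   unencoded non-ASCII text     -> both reject
;;   text encoded to UTF-8 bytes  -> both accept
(mapcar (lambda (s) (list (my-permissive-ok-p s) (my-strict-ok-p s)))
        (list (string-to-multibyte "POST /gists HTTP/1.1")
              "ほげ"
              (encode-coding-string "ほげ" 'utf-8)))
```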



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 14:24                                             ` Eli Zaretskii
@ 2016-12-02 14:35                                               ` Dmitry Gutov
  2016-12-02 15:20                                                 ` Eli Zaretskii
  2016-12-02 14:53                                               ` Yuri Khan
  1 sibling, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-02 14:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

On 02.12.2016 16:24, Eli Zaretskii wrote:

> This will also reject pure-ASCII strings that just happen to be
> multibyte, although there will be no problem with such an HTTP
> request.  Do we really want to disallow that use case?

That's the whole point of the patch. I think I've explained why in the 
previous message.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 14:24                                             ` Eli Zaretskii
  2016-12-02 14:35                                               ` Dmitry Gutov
@ 2016-12-02 14:53                                               ` Yuri Khan
  2016-12-02 15:45                                                 ` Eli Zaretskii
  2016-12-02 15:51                                                 ` Lars Ingebrigtsen
  1 sibling, 2 replies; 88+ messages in thread
From: Yuri Khan @ 2016-12-02 14:53 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Philipp Stephani, Emacs developers, kentaro.nakazawa,
	Lars Magne Ingebrigtsen, Dmitry Gutov

On Fri, Dec 2, 2016 at 9:24 PM, Eli Zaretskii <eliz@gnu.org> wrote:

>> +    (when (multibyte-string-p request)
>> +      (error "Multibyte text in HTTP request: %s, please translate any multibyte components to unibyte using `encode-coding-string'" request))
>>       (url-http-debug "Request is: \n%s" request)
>>       request))
>
> This will also reject pure-ASCII strings that just happen to be
> multibyte, although there will be no problem with such an HTTP
> request.  Do we really want to disallow that use case?

It is really unfortunate that we talk about ASCII strings, unibyte
strings, multibyte strings, as if that was a meaningful
classification.

The real dichotomy is between text (aka strings) and MIME-type-tagged
byte arrays. In order to send a string over HTTP, one must encode it
to a byte array and tag it as "text/plain; charset=utf-8" or
"text/html; charset=utf-8" or application/json (no charset parameter
because json must always be encoded in one of utf-* for transmission).
Conversely, a byte array received over HTTP can, MIME type allowing,
be decoded into a string.

The fact that there exist strings for which encoding and decoding are
identity transforms should be regarded only as an implementation
detail. Attempts by libraries and frameworks to silently DTRT for this
subset lead to applications neglecting to properly encode or tag
strings, leading, in turn, to breakage in presence of multilingual
text.
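
A minimal sketch of that model, assuming a payload is represented as a
(CONTENT-TYPE . BYTES) pair.  That representation and the my- names are
hypothetical, chosen only to keep the tag and the bytes together:

```elisp
(defun my-make-text-payload (text)
  "Encode TEXT as UTF-8 and tag it; return (CONTENT-TYPE . BYTES).
Hypothetical helper, for illustration only."
  (cons "text/plain; charset=utf-8"
        (encode-coding-string text 'utf-8)))

(defun my-read-text-payload (payload)
  "Decode PAYLOAD, a (CONTENT-TYPE . BYTES) pair tagged as UTF-8 text.
Hypothetical helper; a real one would parse the charset parameter."
  (decode-coding-string (cdr payload) 'utf-8))

;; Encoding followed by decoding gives back the original text, whether
;; or not the text is pure ASCII.
(my-read-text-payload (my-make-text-payload "ほげ"))
```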

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 14:35                                               ` Dmitry Gutov
@ 2016-12-02 15:20                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-02 15:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, larsi, kentaro.nakazawa

> Cc: p.stephani2@gmail.com, larsi@gnus.org, kentaro.nakazawa@nifty.com,
>  emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Fri, 2 Dec 2016 16:35:32 +0200
> 
> On 02.12.2016 16:24, Eli Zaretskii wrote:
> 
> > This will also reject pure-ASCII strings that just happen to be
> > multibyte, although there will be no problem with such an HTTP
> > request.  Do we really want to disallow that use case?
> 
> That's the whole point of the patch. I think I've explained why in the 
> previous message.

Fine, let's try.

Thanks.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 13:18                                           ` Dmitry Gutov
  2016-12-02 14:24                                             ` Eli Zaretskii
@ 2016-12-02 15:29                                             ` Lars Ingebrigtsen
  2016-12-02 15:32                                               ` Dmitry Gutov
  1 sibling, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-02 15:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, kentaro.nakazawa, p.stephani2, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> -    (unless (= (string-bytes request)
> -               (length request))
> -      (error "Multibyte text in HTTP request: %s" request))
> +    (when (multibyte-string-p request)
> +      (error "Multibyte text in HTTP request: %s, please translate

This is going to break many current callers.  Most people aren't doing
anything as weird as trying to transmit non-ASCII text via any of these
headers (it's a very uncommon thing to do), but are just passing in
normal Emacs strings (containing nothing but ASCII, as is proper).

These will all fail if you do this, for no real gain.

Sorry to keep harping on about this, but the current url-* interface is
inadequate.  We should leave it be and move on to create a new,
well-defined url-fetching interface.

I hope to get time to do that during my next holiday, which should be in
February.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 15:29                                             ` Lars Ingebrigtsen
@ 2016-12-02 15:32                                               ` Dmitry Gutov
  2016-12-02 15:48                                                 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-02 15:32 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: Eli Zaretskii, kentaro.nakazawa, p.stephani2, emacs-devel

On 02.12.2016 17:29, Lars Ingebrigtsen wrote:

> This is going to break many current callers.  Most people aren't doing
> anything as weird as trying to transmit non-ASCII text via any of these
> headers (it's a very uncommon thing to do), but are just passing in
> normal Emacs strings (containing nothing but ASCII, as is proper).

Do you have some examples?

> These will all fail if you do this, for no real gain.

That's debatable.

> Sorry to keep harping on about this, but the current url-* interface is
> inadequate.  We should leave it be and move on to create a new,
> well-defined url-fetching interface.

I'm sure a well-defined interface will need to have a required 
"encoding" step, or an argument somewhere, at least.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 14:53                                               ` Yuri Khan
@ 2016-12-02 15:45                                                 ` Eli Zaretskii
  2016-12-02 15:51                                                 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-02 15:45 UTC (permalink / raw)
  To: Yuri Khan; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, larsi, dgutov

> From: Yuri Khan <yuri.v.khan@gmail.com>
> Date: Fri, 2 Dec 2016 21:53:16 +0700
> Cc: Dmitry Gutov <dgutov@yandex.ru>, Philipp Stephani <p.stephani2@gmail.com>, 
> 
> It is really unfortunate that we talk about ASCII strings, unibyte
> strings, multibyte strings, as if that was a meaningful
> classification.

It is meaningful when you work on Emacs code.

> The real dichotomy is between text (aka strings) and MIME-type-tagged
> byte arrays.

That might be so in the context of HTTP, but in general, byte arrays
("raw bytes" in Emacs parlance) are not limited to MIME types.
Moreover, there are very frequent use cases where Emacs code needs to
work with a byte array whose type is unknown, or even cannot be known
at all, because it doesn't come with any meta-data of any kind.

> In order to send a string over HTTP, one must encode it
> to a byte array and tag it as "text/plain; charset=utf-8" or
> "text/html; charset=utf-8" or application/json (no charset parameter
> because json must always be encoded in one of utf-* for transmission).
> Conversely, a byte array received over HTTP can, MIME type allowing,
> be decoded into a string.
> 
> The fact that there exist strings for which encoding and decoding are
> identity transforms should be regarded only as an implementation
> detail.

You are talking generalities here, whereas this discussion is about
Emacs-specific internal issues.  In Emacs, a plain-ASCII string is
indistinguishable from a "byte array" whose bytes are all below 128.
They have the same representation.  To muddy the water even more, a
plain-ASCII string can be "marked" as multibyte (again, internally),
but it should be clear that such a "mark" has no meaning at all for
ASCII text.

From the Lisp application POV, whether a plain-ASCII string it
receives or processes is marked as unibyte or multibyte is entirely
random.  So if some ASCII text is accepted by an Emacs API involved in
sending HTTP requests, while an identical ASCII string is rejected,
it could be a source of surprises and bug reports.

That is the core of the issues discussed here.

> Attempts by libraries and frameworks to silently DTRT for this
> subset lead to applications neglecting to properly encode or tag
> strings, leading, in turn, to breakage in presence of multilingual
> text.

Based on Emacs experience of dealing with multibyte text and its
encoding/decoding, the conclusion was that it is better to silently
DTRT where we can be sure we know how.  Making a point of educating
users by harsh measures such as signaling errors where Emacs could
easily proceed, is generally not welcome.  We will see if this case is
any different.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 15:32                                               ` Dmitry Gutov
@ 2016-12-02 15:48                                                 ` Lars Ingebrigtsen
  2016-12-02 15:56                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-02 15:48 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 02.12.2016 17:29, Lars Ingebrigtsen wrote:
>
>> This is going to break many current callers.  Most people aren't doing
>> anything as weird as trying to transmit non-ASCII text via any of these
>> headers (it's a very uncommon thing to do), but are just passing in
normal Emacs strings (containing nothing but ASCII, as is proper).
>
> Do you have some examples?

(multibyte-string-p (symbol-name 'a))
=> t

> I'm sure a well-defined interface will need to have a required
> "encoding" step, or an argument somewhere, at least.

Yes, of course.  The interface will allow the caller to specify the
charset of the data.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 14:53                                               ` Yuri Khan
  2016-12-02 15:45                                                 ` Eli Zaretskii
@ 2016-12-02 15:51                                                 ` Lars Ingebrigtsen
  2016-12-02 15:58                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-02 15:51 UTC (permalink / raw)
  To: Yuri Khan
  Cc: Eli Zaretskii, Dmitry Gutov, kentaro.nakazawa, Philipp Stephani,
	Emacs developers

Yuri Khan <yuri.v.khan@gmail.com> writes:

> The real dichotomy is between text (aka strings) and MIME-type-tagged
> byte arrays.

To nit-pick (this is emacs-devel, after all): "Byte array" isn't very
meaningful, either.  The standards talk about octet streams.  :-)

But you're right, of course: This function has a string-based interface,
which is pretty meaningless, since no protocols (well, extremely few)
deal with characters -- they only deal with octet streams.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 15:48                                                 ` Lars Ingebrigtsen
@ 2016-12-02 15:56                                                   ` Dmitry Gutov
  2016-12-02 16:02                                                     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-02 15:56 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel

On 02.12.2016 17:48, Lars Ingebrigtsen wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
>
>> On 02.12.2016 17:29, Lars Ingebrigtsen wrote:
>>
>>> This is going to break many current callers.  Most people aren't doing
>>> anything as weird as trying to transmit non-ASCII text via any of these
>>> headers (it's a very uncommon thing to do), but are just passing in
>>> normal Emacs strings (containing nothing but ASCII, as is proper).
>>
>> Do you have some examples?
>
> (multibyte-string-p (symbol-name 'a))
> => t

Examples of things "most people" are doing "trying to transmit" "nothing 
but ASCII" using the URL package, please.

>> I'm sure a well-defined interface will need to have a required
>> "encoding" step, or an argument somewhere, at least.
>
> Yes, of course.  The interface will allow the caller to specify the
> charset of the data.

And at least make it clear that the parameter will default to UTF-8, right?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 15:51                                                 ` Lars Ingebrigtsen
@ 2016-12-02 15:58                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-02 15:58 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: p.stephani2, emacs-devel, dgutov, kentaro.nakazawa, yuri.v.khan

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Philipp Stephani <p.stephani2@gmail.com>,  Emacs developers <emacs-devel@gnu.org>,  kentaro.nakazawa@nifty.com,  Dmitry Gutov <dgutov@yandex.ru>
> Date: Fri, 02 Dec 2016 16:51:28 +0100
> 
> But you're right, of course: This function has a string-based interface,
> which is pretty meaningless, since no protocols (well, extremely few)
> deal with characters -- they only deal with octet streams.

The Emacs implementation of an octet stream is a unibyte string.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 15:56                                                   ` Dmitry Gutov
@ 2016-12-02 16:02                                                     ` Lars Ingebrigtsen
  2016-12-02 16:06                                                       ` Dmitry Gutov
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-02 16:02 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> Examples of things "most people" are doing "trying to transmit"
> "nothing but ASCII" using the URL package, please.

I'm not sure what you want an example of.  That most people try to
transmit nothing but ASCII?  That they may end up with multibyte ASCII
strings without having "meaning" to (because it should make no
difference)?

The first thing is trivially true, and the second I think is also pretty
much self-evident:

(multibyte-string-p (buffer-substring (point) (- (point) 10)))
=> t

> And at least make it clear that the parameter will default to UTF-8, right?

Of course.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 16:02                                                     ` Lars Ingebrigtsen
@ 2016-12-02 16:06                                                       ` Dmitry Gutov
  2016-12-02 16:31                                                         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-02 16:06 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel

On 02.12.2016 18:02, Lars Ingebrigtsen wrote:

>> Examples of things "most people" are doing "trying to transmit"
>> "nothing but ASCII" using the URL package, please.
>
> I'm not sure what you want an example of.  That most people try to
> transmit nothing but ASCII?

Yes.

> That they may end up with multibyte ASCII
> strings without having "meaning" to (because it should make no
> difference)?

> The first thing is trivially true, and the second I think is also pretty
> much self-evident:
>
> (multibyte-string-p (buffer-substring (point) (- (point) 10)))
> => t

It's absolutely not a given that most applications or libraries that 
people write with Elisp will end up sending ASCII-only text.

Especially if those applications are then available publicly for other 
people to use.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 16:06                                                       ` Dmitry Gutov
@ 2016-12-02 16:31                                                         ` Lars Ingebrigtsen
  2016-12-02 23:13                                                           ` Dmitry Gutov
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-02 16:31 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

>> I'm not sure what you want an example of.  That most people try to
>> transmit nothing but ASCII?
>
> Yes.

Normal web applications require that you URL-encode (or similar) any
data you send to them.  These encodings are ASCII only.

Here's a typical example of how this is used:

   (let ((url-request-method "POST")
	 (url-request-extra-headers
	  (list (cons "Content-Type"
		      (concat "multipart/form-data; boundary="
			      boundary))))
	 (url-request-data
	  (mm-url-encode-multipart-form-data values boundary)))

The output from mm-url-encode-multipart-form-data is ASCII, and is
typically multibyte.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 16:31                                                         ` Lars Ingebrigtsen
@ 2016-12-02 23:13                                                           ` Dmitry Gutov
  2016-12-03  0:37                                                             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-02 23:13 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: p.stephani2, Eli Zaretskii, kentaro.nakazawa, emacs-devel

On 02.12.2016 18:31, Lars Ingebrigtsen wrote:

> Normal web applications require that you URL-encode (or similar) any
> data you send to them.  These encodings are ASCII only.
>
> Here's a typical example of how this is used:
>
>    (let ((url-request-method "POST")
> 	 (url-request-extra-headers
> 	  (list (cons "Content-Type"
> 		      (concat "multipart/form-data; boundary="
> 			      boundary))))
> 	 (url-request-data
> 	  (mm-url-encode-multipart-form-data values boundary)))

Thanks!

> The output from mm-url-encode-multipart-form-data is ASCII, and is
> typically multibyte.

If we make the proposed change, this function will violate the contract 
on url-request-data (if the usage described above is its main use case).

Luckily, this function is part of Emacs, so we can fix it in the same patch.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-02 23:13                                                           ` Dmitry Gutov
@ 2016-12-03  0:37                                                             ` Lars Ingebrigtsen
  2016-12-03  1:27                                                               ` Dmitry Gutov
  2016-12-03  8:12                                                               ` Eli Zaretskii
  0 siblings, 2 replies; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-03  0:37 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: p.stephani2, emacs-devel, Eli Zaretskii, kentaro.nakazawa

Dmitry Gutov <dgutov@yandex.ru> writes:

> If we make the proposed change, this function will violate the
> contract on url-request-data (if the usage described above is its main
> use case).
>
> Luckily, this function is part of Emacs, so we can fix it in the same patch.

I'm sorry, I'm not sure how to respond to this without making
accusations of a bad faith response on your part.

This is a function with an ill-defined interface, but virtually all
callers here understand what the interface is ("don't put anything into
the body that isn't ASCII").  Even if wonkily defined, this works for
virtually all callers, in-tree or not.

You're proposing a change that would make virtually all these usages of
this (ill-defined) function fail.

The real fix for this extremely obscure problem is 1) to remove the
`error' call you introduced in Emacs 25.1, and 2) make the
Content-Length header reflect the number of octets transferred instead
of the number of bytes in the URL string.  This would have moved the
number of successful calls to `url-retrieve' from (I'm guesstimating)
99.9995% to 99.999995%, and people who wanted to send iso8859-1 text to
web servers would still fail.  But these people are pretty rare.

Your proposal would move the number of successful calls to
`url-retrieve' with a body to around 0%.

At this point I'm not sure what else to say.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-03  0:37                                                             ` Lars Ingebrigtsen
@ 2016-12-03  1:27                                                               ` Dmitry Gutov
  2016-12-03  8:12                                                               ` Eli Zaretskii
  1 sibling, 0 replies; 88+ messages in thread
From: Dmitry Gutov @ 2016-12-03  1:27 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: p.stephani2, emacs-devel, Eli Zaretskii, kentaro.nakazawa

On 03.12.2016 02:37, Lars Ingebrigtsen wrote:

> I'm sorry, I'm not sure how to respond to this without making
> accusations of a bad faith response on your part.

All I'm trying to do here is to introduce a more meaningful, stronger 
typing. See Yuri's comment on why that can be important.

I don't really know if the benefits really outweigh the inconvenience, 
but the only example you gave so far can be trivially solved from our side.

That leaves clients that perform "url encoding" manually using their own 
code, but there might be none of them, for all I know.

IME, JSON encoding is more popular than that, and those users are 
affected already.

> This is a function with an ill-defined interface, but virtually all
> callers here understand what the interface is ("don't put anything into
> the body that isn't ASCII").  Even if wonkily defined, this works for
> virtually all callers, in-tree or not.

> You're proposing a change that would make virtually all these usages of
> this (ill-defined) function fail.

True.

> The real fix for this extremely obscure problem is 1) to remove the
> `error' call you introduced in Emacs 25.1, and 2) make the
> Content-Length header reflect the number of octets transferred instead
> of the number of bytes in the URL string.  This would have moved the
> number of successful calls to `url-retrieve' from (I'm guesstimating)
> 99.9995% to 99.999995%, and people who wanted to send iso8859-1 text to
> web servers would still fail.  But these people are pretty rare.
>
> Your proposal would move the number of successful calls to
> `url-retrieve' with a body to around 0%.

Not true. All current users of json.el, at least, who have updated their 
code for Emacs 25, won't be affected. And I imagine they represent a 
significant fraction of `url-retrieve' users.




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-03  0:37                                                             ` Lars Ingebrigtsen
  2016-12-03  1:27                                                               ` Dmitry Gutov
@ 2016-12-03  8:12                                                               ` Eli Zaretskii
  2016-12-03 10:01                                                                 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-03  8:12 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: p.stephani2@gmail.com,  Eli Zaretskii <eliz@gnu.org>,  kentaro.nakazawa@nifty.com,  emacs-devel@gnu.org
> Date: Sat, 03 Dec 2016 01:37:19 +0100
> 
> I'm sorry, I'm not sure how to respond to this without making
> accusations of a bad faith response on your part.

Please don't.  There's no bad faith on anyone's side here.

> make the Content-Length header reflect the number of octets
> transferred instead of the number of bytes in the URL string.

How do you propose to compute the number of transferred octets, given
that the URL request payload is a string?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-03  8:12                                                               ` Eli Zaretskii
@ 2016-12-03 10:01                                                                 ` Lars Ingebrigtsen
  2016-12-03 16:00                                                                   ` Stefan Monnier
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-03 10:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

Eli Zaretskii <eliz@gnu.org> writes:

> How do you propose to compute the number of transferred octets, given
> that the URL request payload is a string?

Just use `string-bytes' instead of `length'.  This happens to work since
almost all web services expect utf-8, and our strings happen to be
utf-8, too.  (The few callers that are sending a different charset
already presumably know to encode their data, or their applications
would be failing already.)

Yes, it's yucky, but this is an ill-defined function.  And we should
emphasise backwards compatibility instead of breaking people's code, I
think.
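
To make the difference between the three counts concrete, here is a
rough sketch.  The helper my-octet-counts is hypothetical (it is not
part of url-http.el); it returns the character count, the size of the
internal representation, and the size after encoding.

```elisp
;; Hypothetical helper for illustration; not part of url-http.el.
(defun my-octet-counts (text coding-system)
  "Return (CHARS INTERNAL-BYTES WIRE-OCTETS) for TEXT under CODING-SYSTEM."
  (list (length text)                  ; characters
        (string-bytes text)            ; bytes in Emacs's internal form
        (length (encode-coding-string text coding-system)))) ; octets sent

;; The two hiragana from the original report, built from code points so
;; the example does not depend on how this file is decoded.
(my-octet-counts (string #x307b #x3052) 'utf-8)

;; U+00E9 sent as Latin-1: here the wire size differs from `string-bytes'.
(my-octet-counts (string #xe9) 'iso-8859-1)
```

The first call should give (2 6 6) and the second (1 2 1): for utf-8
the `string-bytes' value matches what goes over the wire, while for
other charsets it need not.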

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-03 10:01                                                                 ` Lars Ingebrigtsen
@ 2016-12-03 16:00                                                                   ` Stefan Monnier
  2016-12-03 20:01                                                                     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Stefan Monnier @ 2016-12-03 16:00 UTC (permalink / raw)
  To: emacs-devel

> Just use `string-bytes' instead of `length'.

IIRC the problem with that is if the string is the result of
concatenating a unibyte and a multibyte string, in which case the string
may only contain bytes (and hence `length` gives the right result) yet
`string-bytes` and `length` will return different results (because the
≥128 bytes are encoded as 2 bytes in the multibyte representation).
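
A minimal sketch of that case, assuming current `concat' behaviour (the
function name my-concat-demo is made up for the example): one raw byte
is concatenated with ASCII text that happens to be marked multibyte.

```elisp
;; Hypothetical demo function, for illustration only.
(defun my-concat-demo ()
  "Concatenate a raw byte with multibyte ASCII text and measure the result."
  (let* ((raw   (unibyte-string #x80))         ; a single byte >= 128
         (ascii (string-to-multibyte "abc"))   ; ASCII, marked multibyte
         (mixed (concat raw ascii)))           ; the result is multibyte
    (list (multibyte-string-p mixed)
          (length mixed)                       ; counts the 4 intended bytes
          (string-bytes mixed))))              ; raw byte takes 2 bytes here

(my-concat-demo)
```

The result should be (t 4 5): four octets are meant to be sent, but
`string-bytes' reports five.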

        Stefan


* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-03 16:00                                                                   ` Stefan Monnier
@ 2016-12-03 20:01                                                                     ` Lars Ingebrigtsen
  2016-12-03 20:57                                                                       ` Andreas Schwab
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-03 20:01 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> IIRC the problem with that is if the string is the result of
> concatenating a unibyte and a multibyte string, in which case the string
> may only contain bytes (and hence `length` gives the right result) yet
> `string-bytes` and `length` will return different results (because the
> ≥128 bytes are encoded as 2 bytes in the multibyte representation).

Hm...  I see...  I think...  :-)

Can `string-bytes' return a different number than

(with-temp-buffer
  (set-buffer-multibyte nil)
  (insert string)
  (buffer-size))

?

In any case, this latter is what we want, because those are the octets
that will be transmitted to the server.  Unless there's another
subtlety I'm not aware of, which seems likely.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-03 20:01                                                                     ` Lars Ingebrigtsen
@ 2016-12-03 20:57                                                                       ` Andreas Schwab
  0 siblings, 0 replies; 88+ messages in thread
From: Andreas Schwab @ 2016-12-03 20:57 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel

On Dec 03 2016, Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Can `string-bytes' return a different number than
>
> (with-temp-buffer
>   (set-buffer-multibyte nil)
>   (insert string)
>   (buffer-size))
>
> ?

ELISP> (string-bytes "\200")
1 (#o1, #x1, ?\C-a)
ELISP> (string-bytes (string-make-multibyte "\200"))
2 (#o2, #x2, ?\C-b)
ELISP> (let ((string "\200")) (with-temp-buffer
  (set-buffer-multibyte nil)
  (insert string)
  (buffer-size)))
1 (#o1, #x1, ?\C-a)
ELISP> (let ((string (string-make-multibyte "\200"))) (with-temp-buffer
  (set-buffer-multibyte nil)
  (insert string)
  (buffer-size)))
1 (#o1, #x1, ?\C-a)
ELISP> 

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 18:44                                           ` Eli Zaretskii
@ 2016-12-28 18:09                                             ` Philipp Stephani
  2016-12-28 18:27                                               ` Eli Zaretskii
  0 siblings, 1 reply; 88+ messages in thread
From: Philipp Stephani @ 2016-12-28 18:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov


Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:45:

> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Wed, 30 Nov 2016 18:23:14 +0000
> > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
> >
> >  > Yes, this is not a json.el problem at all. It does the correct thing,
> >  > and shouldn't be changed.
> >
> >  ??? Why should any code care whether a pure-ASCII string is marked as
> >  unibyte or as multibyte? Both are "correct".
> >
> > I guess the problem is that process-send-string cares. If it didn't,
> > we wouldn't have the problem.
>
> I don't think I follow.  The error we are talking about is signaled
> from url-http-create-request, not from process-send-string.
>

Yes, but url-http-create-request only cares about unibyte strings because
the request it creates is passed to process-send-string, which
special-cases unibyte strings.


>
> > For URL, we'd need functions like
> > (byte-array-length s) = (length (string-to-unibyte s))
>
> Why do you need this?  string-to-unibyte is well-defined only for
> unibyte or ASCII strings (if we forget the raw bytes for a moment), so
> length will do.
>

We need it because we have to send the byte length in a header. We can't
just use (length s) because it would silently give a wrong result.


>
> > (process-send-bytes s) = (process-send-string (string-to-unibyte s))
>
> Why is this needed?  process-send-string already encodes its argument,
> which produces a unibyte string.
>

We can't give a multibyte string to process-send-string, because we have to
pass the length in bytes in a header first. Therefore we have to encode any
string before passing it to process-send-string.


>
> > (conceptually; process-send-string also does EOL conversion, which
> > should never be done for HTTP bodies.)
>
> I don't understand why.  There are protocols that require CR-LF, no?
>
>
Yes, but HTTP request/response bodies should just be byte arrays and no
conversion whatsoever should happen. After all, the body could be a binary
data format.



* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 18:48                                               ` Eli Zaretskii
@ 2016-12-28 18:18                                                 ` Philipp Stephani
  2016-12-28 18:34                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 88+ messages in thread
From: Philipp Stephani @ 2016-12-28 18:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel


Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:48:

> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Wed, 30 Nov 2016 18:25:09 +0000
> > Cc: emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, dgutov@yandex.ru
> >
> >  > That's right -- why should any code care? Yet url.el does.
> >
> >  No, it doesn't, not if the string is plain ASCII.
> >
> > But in that case it isn't, it's morally a byte array.
>
> Yes, because the internal representation of characters in Emacs is a
> superset of UTF-8.
>

That has nothing to do with characters. A byte array is conceptually
different from a character string.


>
> > What Emacs lacks is good support for byte arrays.
>
> Unibyte strings are byte arrays.  What do you think we lack in that regard?
>

If unibyte strings should be used for byte arrays, then the URL functions
should indeed signal an error whenever url-request-data is a multibyte
string, as HTTP requests are conceptually byte arrays, not character
strings.


>
> > For HTTP, process-send-string shouldn't need to deal
> > with encoding or EOL conversion, it should just accept a byte array and
> > send that, unmodified.
>
> I disagree.  Handling unibyte strings is a nuisance, so Emacs allows
> most applications to be oblivious about them, and just handle
> human-readable text.
>

That is the wrong approach (byte arrays and character strings are
fundamentally different types, and mixing them together only causes pain),
and it cannot work when implementing network protocols. HTTP requests are
*not* human-readable text, they are byte arrays. Attempting to handle
Unicode strings can't work because we wouldn't know the number of encoded
bytes.



* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-11-30 15:48                                     ` Lars Ingebrigtsen
  2016-11-30 16:25                                       ` Eli Zaretskii
@ 2016-12-28 18:22                                       ` Philipp Stephani
  2016-12-28 18:57                                         ` Lars Ingebrigtsen
  1 sibling, 1 reply; 88+ messages in thread
From: Philipp Stephani @ 2016-12-28 18:22 UTC (permalink / raw)
  To: Lars Ingebrigtsen, Dmitry Gutov
  Cc: Eli Zaretskii, kentaro.nakazawa, emacs-devel


Lars Ingebrigtsen <larsi@gnus.org> wrote on Wed, 30 Nov 2016 at 16:48:

> Dmitry Gutov <dgutov@yandex.ru> writes:
>
> > In json-encode? Should it really deal with that concern explicitly?
> >
> > I could understand an idea along the lines of "use a different
> > algorithm", but calling encode-coding-string inside json-encode sounds
> > odd.
>
> Yes, this is not a json.el problem at all.  It does the correct thing,
> and shouldn't be changed.
>

Agreed. Neither symbol-function nor concat nor the JSON function do
anything wrong here.


>
> It's just url.el being lacking in features, as usual.
>
>
>
I don't think url.el needs to grow features for encoding; after all, Emacs
already has functions for that. I'd rather add an explicit check for
unibyte-ness of url-request-data and document that url-request-data must be
a unibyte string or nil.
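
Under that contract the encoding step would live in the caller.  A
minimal sketch, assuming the service expects UTF-8 (the helper name
my-json-request-body is hypothetical):

```elisp
(require 'json)

;; Hypothetical caller-side helper; assumes the server expects UTF-8.
(defun my-json-request-body (object)
  "Serialize OBJECT as JSON and encode it to UTF-8 octets (a unibyte string)."
  (encode-coding-string (json-encode object) 'utf-8))

;; The result is unibyte, so `length' counts the octets to be sent.
(let ((body (my-json-request-body
             `(("description" . "test")
               ("content" . ,(string #x307b #x3052))))))
  (list (multibyte-string-p body)
        (= (length body) (string-bytes body))))
```

A caller would then bind url-request-data to the returned string before
calling url-retrieve-synchronously.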



* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-01  0:30                                       ` Dmitry Gutov
  2016-12-01 17:17                                         ` Eli Zaretskii
@ 2016-12-28 18:25                                         ` Philipp Stephani
  1 sibling, 0 replies; 88+ messages in thread
From: Philipp Stephani @ 2016-12-28 18:25 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii; +Cc: larsi, kentaro.nakazawa, emacs-devel


Dmitry Gutov <dgutov@yandex.ru> wrote on Thu, 1 Dec 2016 at 01:30:

>
> Further, now that Emacs 25 is out, and we are allowed to have more
> breaking changes in Emacs 26, I think we should change the check at the
> end of url-http-create-request to just use multibyte-string-p.
>
>
I think that's a good idea. (The check should also be moved to the front
and documented, but those are minor nits.)
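
Roughly, such an up-front check could look like the sketch below.  The
function name my-url-check-request-data and the error text are
assumptions made for illustration; this is not the code in url-http.el.

```elisp
;; Hypothetical validation helper; not the actual url-http.el code.
(defun my-url-check-request-data (data)
  "Signal an error unless DATA is nil or a unibyte string; return DATA."
  (when (multibyte-string-p data)      ; nil for unibyte strings and for nil
    (error "url-request-data must be a unibyte string or nil"))
  data)

;; Already-encoded octets pass through unchanged.
(my-url-check-request-data (encode-coding-string "abc" 'utf-8))
```

Note that this rejects multibyte strings even when they contain only
ASCII, which is exactly the case debated elsewhere in this thread.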



* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:09                                             ` Philipp Stephani
@ 2016-12-28 18:27                                               ` Eli Zaretskii
  2016-12-28 18:35                                                 ` Philipp Stephani
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-28 18:27 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, emacs-devel, kentaro.nakazawa, dgutov

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Wed, 28 Dec 2016 18:09:52 +0000
> Cc: larsi@gnus.org, dgutov@yandex.ru, kentaro.nakazawa@nifty.com, 
> 	emacs-devel@gnu.org
> 
> 
> Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:45:
> 
>  > From: Philipp Stephani <p.stephani2@gmail.com>
>  > Date: Wed, 30 Nov 2016 18:23:14 +0000
>  > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
>  >
>  > > Yes, this is not a json.el problem at all. It does the correct thing,
>  > > and shouldn't be changed.
>  >
>  > ??? Why should any code care whether a pure-ASCII string is marked as
>  > unibyte or as multibyte? Both are "correct".
>  >
>  > I guess the problem is that process-send-string cares. If it didn't, we wouldn't have the problem.
> 
>  I don't think I follow. The error we are talking about is signaled
>  from url-http-create-request, not from process-send-string.
> 
> Yes, but url-http-create-request only cares about unibyte strings because the request it creates is passed to
> process-send-string, which special-cases unibyte strings.

How do you see that process-send-string special-cases unibyte strings?

>  > For URL, we'd need functions like
>  > (byte-array-length s) = (length (string-to-unibyte s))
> 
>  Why do you need this? string-to-unibyte is well-defined only for
>  unibyte or ASCII strings (if we forget the raw bytes for a moment), so
>  length will do.
> 
> We need it because we have to send the byte length in a header. We can't just use (length s) because it
> would silently give a wrong result.

We are miscommunicating.  string-to-unibyte can only meaningfully be
called on a pure-ASCII string, and for pure-ASCII strings 'length'
will count bytes.  So I see no need for 'byte-array-length' if its
implementation is as you indicated.

>  > (process-send-bytes s) = (process-send-string (string-to-unibyte s))
> 
>  Why is this needed? process-send-string already encodes its argument,
>  which produces a unibyte string.
> 
> We can't give a multibyte string to process-send-string, because we have to pass the length in bytes in a
> header first. Therefore we have to encode any string before passing it to process-send-string.

Once you encoded the string, why do you need anything except calling
process-send-string?




* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:18                                                 ` Philipp Stephani
@ 2016-12-28 18:34                                                   ` Eli Zaretskii
  2016-12-28 18:45                                                     ` Philipp Stephani
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-28 18:34 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Wed, 28 Dec 2016 18:18:25 +0000
> Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, 
> 	dgutov@yandex.ru
> 
>  > > That's right -- why should any code care? Yet url.el does.
>  >
>  > No, it doesn't, not if the string is plain ASCII.
>  >
>  > But in that case it isn't, it's morally a byte array.
> 
>  Yes, because the internal representation of characters in Emacs is a
>  superset of UTF-8.
> 
> That has nothing to do with characters. A byte array is conceptually different from a character string.

In Emacs, they are both implemented using very similar objects.

>  > What Emacs lacks is good support for byte arrays.
> 
>  Unibyte strings are byte arrays. What do you think we lack in that regard?
> 
> If unibyte strings should be used for byte arrays, then the URL functions should indeed signal an error
> whenever url-request-data is a multibyte string, as HTTP requests are conceptually byte arrays, not character
> strings.

Which is what we do now.

>  > For HTTP, process-send-string shouldn't need to deal
>  > with encoding or EOL conversion, it should just accept a byte array and send that, unmodified.
> 
>  I disagree. Handling unibyte strings is a nuisance, so Emacs allows
>  most applications to be oblivious to them, and just handle
>  human-readable text.
> 
> That is the wrong approach (byte arrays and character strings are fundamentally different types, and mixing
> them together only causes pain), and it cannot work when implementing network protocols. HTTP requests
> are *not* human-readable text, they are byte arrays. Attempting to handle Unicode strings can't work because
> we wouldn't know the number of encoded bytes.

You are arguing against a long and quite painful history of non-ASCII
strings in Emacs.  What we have now is based on a lot of experience
and at least two very large refactoring jobs.  Going back would be a
very bad idea indeed, as we've been there already, and users didn't
like that.  Some of us are old enough to remember the notorious \201
bytes creeping into text files and mail messages, due to that.  Never
again.

Our experience is that we should keep use of unibyte strings in Lisp
application code to the absolute minimum, ideally zero.  Once we
arrived at that conclusion, we've been living happily ever after.
This minor issue we are discussing here is certainly not worth
repeating past mistakes for which we paid plenty in sweat and blood.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:27                                               ` Eli Zaretskii
@ 2016-12-28 18:35                                                 ` Philipp Stephani
  2016-12-28 18:45                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 88+ messages in thread
From: Philipp Stephani @ 2016-12-28 18:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel

Eli Zaretskii <eliz@gnu.org> wrote on Wed, 28 Dec 2016 at 19:28:

> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Wed, 28 Dec 2016 18:09:52 +0000
> > Cc: larsi@gnus.org, dgutov@yandex.ru, kentaro.nakazawa@nifty.com,
> >       emacs-devel@gnu.org
> >
> >
> > Eli Zaretskii <eliz@gnu.org> wrote on Wed, 30 Nov 2016 at 19:45:
> >
> >  > From: Philipp Stephani <p.stephani2@gmail.com>
> >  > Date: Wed, 30 Nov 2016 18:23:14 +0000
> >  > Cc: dgutov@yandex.ru, kentaro.nakazawa@nifty.com, emacs-devel@gnu.org
> >  >
> >  > > Yes, this is not a json.el problem at all. It does the correct thing,
> >  > > and shouldn't be changed.
> >  >
> >  > ??? Why should any code care whether a pure-ASCII string is marked as
> >  > unibyte or as multibyte? Both are "correct".
> >  >
> >  > I guess the problem is that process-send-string cares. If it didn't, we wouldn't have the problem.
> >
> >  I don't think I follow. The error we are talking about is signaled
> >  from url-http-create-request, not from process-send-string.
> >
> > Yes, but url-http-create-request only cares about unibyte strings because the request it creates is passed to
> > process-send-string, which special-cases unibyte strings.
>
> How do you see that process-send-string special-cases unibyte strings?
>

The send_process function has two branches, one for unibyte, one for
multibyte.


>
> >  > For URL, we'd need functions like
> >  > (byte-array-length s) = (length (string-to-unibyte s))
> >
> >  Why do you need this? string-to-unibyte is well-defined only for
> >  unibyte or ASCII strings (if we forget the raw bytes for a moment), so
> >  length will do.
> >
> > We need it because we have to send the byte length in a header. We can't just use (length s) because it
> > would silently give a wrong result.
>
> We are miscommunicating.  string-to-unibyte can only meaningfully be
> called on a pure-ASCII string, and for pure-ASCII strings 'length'
> will count bytes.  So I see no need for 'byte-array-length' if its
> implementation is as you indicated.
>

That depends on how you want to represent byte arrays/octet streams in
Emacs. If you want to represent them using unibyte strings, then you indeed
only need `length'. But some earlier messages sounded like you wanted to
represent byte arrays either using unibyte strings or byte-only multibyte
strings. In that case `string-to-unibyte' is necessary.


>
> >  > (process-send-bytes s) = (process-send-string (string-to-unibyte s))
> >
> >  Why is this needed? process-send-string already encodes its argument,
> >  which produces a unibyte string.
> >
> > We can't give a multibyte string to process-send-string, because we have to pass the length in bytes in a
> > header first. Therefore we have to encode any string before passing it to process-send-string.
>
> Once you encoded the string, why do you need anything except calling
> process-send-string?
>
>
The byte size should be added as a Content-length HTTP header. If
url-request-data is a unibyte string, that's not a problem (except for the
newline conversion behavior in send_string), you can just use `length'. But
if it's a multibyte string, you need to encode first to find the byte
length.
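
To make the difference concrete, here is a minimal sketch in Emacs Lisp.
The helper name `my-body-lengths' is made up for illustration, and UTF-8
is only an assumed encoding; the right one depends on what the server
expects.

```elisp
(defun my-body-lengths (body)
  "Return (CHARS . BYTES) for the string BODY.
CHARS is what `length' reports for BODY as given.  BYTES is the
length of BODY after encoding it as UTF-8 (an assumption made for
this sketch).  BODY is expected to be a multibyte or pure-ASCII
string.  Hypothetical helper, not part of url.el."
  (cons (length body)
        (length (encode-coding-string body 'utf-8))))

;; Pure ASCII: both counts agree.
(my-body-lengths "test")
;; Two Japanese characters, three UTF-8 bytes each: the counts differ.
(my-body-lengths "\u307b\u3052")
```

For the ASCII string both numbers are 4. For the two-character string
`length' gives 2 while the encoded form has 6 bytes, and 6 is the value
a Content-length header would need.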

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:34                                                   ` Eli Zaretskii
@ 2016-12-28 18:45                                                     ` Philipp Stephani
  2016-12-28 18:55                                                       ` Eli Zaretskii
  2016-12-28 19:03                                                       ` Andreas Schwab
  0 siblings, 2 replies; 88+ messages in thread
From: Philipp Stephani @ 2016-12-28 18:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel

Eli Zaretskii <eliz@gnu.org> wrote on Wed, 28 Dec 2016 at 19:35:

> > From: Philipp Stephani <p.stephani2@gmail.com>
> > Date: Wed, 28 Dec 2016 18:18:25 +0000
> > Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com,
> >       dgutov@yandex.ru
> >
> >  > > That's right -- why should any code care? Yet url.el does.
> >  >
> >  > No, it doesn't, not if the string is plain ASCII.
> >  >
> >  > But in that case it isn't, it's morally a byte array.
> >
> >  Yes, because the internal representation of characters in Emacs is a
> >  superset of UTF-8.
> >
> > That has nothing to do with characters. A byte array is conceptually different from a character string.
>
> In Emacs, they are both implemented using very similar objects.
>

Yes, that's why I said "conceptually different". The concepts may be
different, but the implementation might still be the same.


>
> >  > What Emacs lacks is good support for byte arrays.
> >
> >  Unibyte strings are byte arrays. What do you think we lack in that regard?
> >
> > If unibyte strings should be used for byte arrays, then the URL functions should indeed signal an error
> > whenever url-request-data is a multibyte string, as HTTP requests are conceptually byte arrays, not character
> > strings.
>
> Which is what we do now.
>

There is no such check for url-request-data. There's an overall check for
the complete request, but that also doesn't check for unibyte-ness.


>
> >  > For HTTP, process-send-string shouldn't need to deal
> >  > with encoding or EOL conversion, it should just accept a byte array and send that, unmodified.
> >
> >  I disagree. Handling unibyte strings is a nuisance, so Emacs allows
> >  most applications to be oblivious to them, and just handle
> >  human-readable text.
> >
> > That is the wrong approach (byte arrays and character strings are fundamentally different types, and mixing
> > them together only causes pain), and it cannot work when implementing network protocols. HTTP requests
> > are *not* human-readable text, they are byte arrays. Attempting to handle Unicode strings can't work because
> > we wouldn't know the number of encoded bytes.
>
> You are arguing against a long and quite painful history of non-ASCII
> strings in Emacs.  What we have now is based on a lot of experience
> and at least two very large refactoring jobs.  Going back would be a
> very bad idea indeed, as we've been there already, and users didn't
> like that.  Some of us are old enough to remember the notorious \201
> bytes creeping into text files and mail messages, due to that.  Never
> again.
>

I'm not suggesting going back, too much would be broken.


>
> Our experience is that we should keep use of unibyte strings in Lisp
> application code to the absolute minimum, ideally zero.  Once we
> arrived at that conclusion, we've been living happily ever after.
> This minor issue we are discussing here is certainly not worth
> repeating past mistakes for which we paid plenty in sweat and blood.
>

If you want unibyte strings to represent octet streams, then unibyte
strings must be usable in application code, because octet streams are a
concept that exists in reality, and applications must be able to support
them in some way. If you don't want unibyte strings, then you need to
provide some different way to represent octet streams.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:35                                                 ` Philipp Stephani
@ 2016-12-28 18:45                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-28 18:45 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Wed, 28 Dec 2016 18:35:58 +0000
> Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, 
> 	dgutov@yandex.ru
> 
>  How do you see that process-send-string special-cases unibyte strings?
> 
> The send_process function has two branches, one for unibyte, one for multibyte.

That's not special-casing.  That's polymorphism, if you like: Emacs
silently does TRT for both.

>  We are miscommunicating. string-to-unibyte can only meaningfully be
>  called on a pure-ASCII string, and for pure-ASCII strings 'length'
>  will count bytes. So I see no need for 'byte-array-length' if its
>  implementation is as you indicated.
> 
> That depends on how you want to represent byte arrays/octet streams in Emacs. If you want to represent
> them using unibyte strings, then you indeed only need `length'. But some earlier messages sounded like you
> wanted to represent byte arrays either using unibyte strings or byte-only multibyte strings. In that case
> `string-to-unibyte' is necessary.

No, it's not.  Multibyte strings that include raw bytes are converted
to single bytes when you encode them.

>  Once you encoded the string, why do you need anything except calling
>  process-send-string?
> 
> The byte size should be added as a Content-length HTTP header. If url-request-data is a unibyte string, that's
> not a problem (except for the newline conversion behavior in send_string), you can just use `length'. But if it's
> a multibyte string, you need to encode first to find the byte length. 

I thought we've just agreed that multibyte strings there should not be
allowed.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:45                                                     ` Philipp Stephani
@ 2016-12-28 18:55                                                       ` Eli Zaretskii
  2016-12-28 19:03                                                       ` Andreas Schwab
  1 sibling, 0 replies; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-28 18:55 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: larsi, dgutov, kentaro.nakazawa, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Wed, 28 Dec 2016 18:45:43 +0000
> Cc: larsi@gnus.org, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com, 
> 	dgutov@yandex.ru
> 
>  > That has nothing to do with characters. A byte array is conceptually different from a character string.
> 
>  In Emacs, they are both implemented using very similar objects.
> 
> Yes, that's why I said "conceptually different". The concepts may be different, but the implementation
> might still be the same.

If the implementation is the same, then the concepts are not very
different to begin with, and the abstraction will sooner or later
leak into applications.

>  Our experience is that we should keep use of unibyte strings in Lisp
>  application code to the absolute minimum, ideally zero. Once we
>  arrived at that conclusion, we've been living happily ever after.
>  This minor issue we are discussing here is certainly not worth
>  repeating past mistakes for which we paid plenty in sweat and blood.
> 
> If you want unibyte strings to represent octet streams, then unibyte strings must be usable in application
> code

They are usable, but using them requires knowledge and proficiency
that many Lisp developers lack, and it also has some
unpleasant pitfalls.

> because octet streams are a concept that exists in reality, and applications must be able to support
> them in some way. If you don't want unibyte strings, then you need to provide some different way to represent
> octet streams. 

We use unibyte strings where we must, and otherwise prefer multibyte
ones.  In most cases the unibyte strings exist in Emacs internals, so
that Lisp applications will not have to deal with them.  This case is
one of the few exceptions.

If you are still unconvinced and think that we need some separate
representation for byte arrays, consider this: when Emacs starts, it
takes some time until it bootstraps itself enough to learn how to
decode non-ASCII strings, such as file names.  Until then, all file
names are unibyte strings, and Emacs still must handle them correctly,
because otherwise it would be impossible to build or start it in a
directory that includes non-ASCII characters.

This and other similar subtleties are the reason why using anything
but a string for raw byte arrays is not a good idea, IMO and IME.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:22                                       ` Philipp Stephani
@ 2016-12-28 18:57                                         ` Lars Ingebrigtsen
  2016-12-30  0:07                                           ` Richard Stallman
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-28 18:57 UTC (permalink / raw)
  To: Philipp Stephani
  Cc: Eli Zaretskii, emacs-devel, kentaro.nakazawa, Dmitry Gutov

Philipp Stephani <p.stephani2@gmail.com> writes:

> I don't think url.el needs to grow features for encoding; after all, Emacs
> already has functions for that. I'd rather add an explicit check for
> unibyte-ness of url-request-data and document that url-request-data must be
> a unibyte string or nil. 

Nah.  If you want to do something here, just compute the correct length
header (as previously discussed), and virtually all callers will be happy.
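
A rough sketch of what computing the header from the encoded body could
look like.  `my-http-body-with-length' is a hypothetical helper, not the
code in url-http.el, and encoding multibyte data as UTF-8 is an
assumption of this sketch rather than something settled in this thread.

```elisp
(defun my-http-body-with-length (data)
  "Return (HEADER . BODY) for the request payload DATA.
If DATA is a multibyte string it is first encoded as UTF-8 (an
assumption of this sketch); a unibyte string is used unchanged.
HEADER is a Content-length line counting the bytes of BODY.
Hypothetical helper, not part of url.el."
  (let ((body (if (multibyte-string-p data)
                  (encode-coding-string data 'utf-8)
                data)))
    ;; BODY is unibyte here, so `length' counts bytes.
    (cons (format "Content-length: %d\r\n" (length body))
          body)))

;; Two Japanese characters become six bytes on the wire.
(my-http-body-with-length "\u307b\u3052")
```

The point of returning the encoded body together with the header is that
the length is measured on exactly the bytes that get sent.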

I've started working on a `with-url' functionality that'll replace the
current mess.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:45                                                     ` Philipp Stephani
  2016-12-28 18:55                                                       ` Eli Zaretskii
@ 2016-12-28 19:03                                                       ` Andreas Schwab
  1 sibling, 0 replies; 88+ messages in thread
From: Andreas Schwab @ 2016-12-28 19:03 UTC (permalink / raw)
  To: Philipp Stephani
  Cc: Eli Zaretskii, emacs-devel, kentaro.nakazawa, larsi, dgutov

On Dec 28 2016, Philipp Stephani <p.stephani2@gmail.com> wrote:

> If you want unibyte strings to represent octet streams, then unibyte
> strings must be usable in application code, because octet streams are a
> concept that exists in reality, and applications must be able to support
> them in some way. If you don't want unibyte strings, then you need to
> provide some different way to represent octet streams.

Octet streams are basically encoded strings, and we use unibyte strings
for encoded strings.  That's the only place where unibyte strings should
be used in Emacs.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-28 18:57                                         ` Lars Ingebrigtsen
@ 2016-12-30  0:07                                           ` Richard Stallman
  2016-12-30 14:15                                             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Richard Stallman @ 2016-12-30  0:07 UTC (permalink / raw)
  To: Lars Ingebrigtsen
  Cc: p.stephani2, dgutov, kentaro.nakazawa, eliz, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I've started working on a `with-url' functionality that'll replace the
  > current mess.

The name `with-url' suggests that Emacs has some sort of "current URL",
and that this macro temporarily specifies some particular URL as current.

That's not the case, is it?  So the name `with-url' doesn't fit
what it does.  (What does it do?)

We should change the name to something that fits what it does.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-30  0:07                                           ` Richard Stallman
@ 2016-12-30 14:15                                             ` Lars Ingebrigtsen
  2016-12-30 16:59                                               ` Eli Zaretskii
  2016-12-30 21:38                                               ` Richard Stallman
  0 siblings, 2 replies; 88+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-30 14:15 UTC (permalink / raw)
  To: Richard Stallman; +Cc: p.stephani2, dgutov, kentaro.nakazawa, emacs-devel

Richard Stallman <rms@gnu.org> writes:

> The name `with-url' suggests that Emacs has some sort of "current URL",
> and that this macro temporarily specifies some particular URL as current.
>
> That's not the case, is it?  So the name `with-url' doesn't fit
> what it does.  (What does it do?)

It's like `with-temp-buffer' and its cousins: It generates a new
buffer, executes the body in that buffer, and kills the buffer when the
form finishes.

The contents of the buffer come from the specified URL, of course.  See
the recent discussion of with-url on emacs-devel.
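
For readers who have not seen that discussion, the behaviour described
above can be approximated with the existing `url-retrieve-synchronously'.
This is only a sketch under that assumption: `my-with-url-contents' is a
made-up name, and the actual macro may differ in every detail.

```elisp
(require 'url)

(defmacro my-with-url-contents (url &rest body)
  "Fetch URL synchronously and evaluate BODY in the response buffer.
The buffer is killed when BODY finishes, even on error.  For HTTP
the buffer holds the response headers followed by the body.
Hypothetical sketch, not the real `with-url'."
  (declare (indent 1) (debug t))
  (let ((u (make-symbol "url"))
        (buf (make-symbol "buf")))
    `(let* ((,u ,url)
            (,buf (url-retrieve-synchronously ,u)))
       ;; `url-retrieve-synchronously' returns nil when nothing was fetched.
       (unless ,buf
         (error "Could not retrieve %s" ,u))
       (unwind-protect
           (with-current-buffer ,buf
             ,@body)
         (when (buffer-live-p ,buf)
           (kill-buffer ,buf))))))
```

A call would then read (my-with-url-contents "https://example.org/"
(buffer-string)), where the URL is just a placeholder.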

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-30 14:15                                             ` Lars Ingebrigtsen
@ 2016-12-30 16:59                                               ` Eli Zaretskii
  2017-01-21 15:39                                                 ` Lars Ingebrigtsen
  2016-12-30 21:38                                               ` Richard Stallman
  1 sibling, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2016-12-30 16:59 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, rms, dgutov

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Fri, 30 Dec 2016 15:15:26 +0100
> Cc: p.stephani2@gmail.com, dgutov@yandex.ru, kentaro.nakazawa@nifty.com,
> 	emacs-devel@gnu.org
> 
> Richard Stallman <rms@gnu.org> writes:
> 
> > The name `with-url' suggests that Emacs has some sort of "current URL",
> > and that this macro temporarily specifies some particular URL as current.
> >
> > That's not the case, is it?  So the name `with-url' doesn't fit
> > what it does.  (What does it do?)
> 
> It's like `with-temp-buffer' and its cousins: It generates a new
> buffer, executes the body in that buffer, and kills the buffer when the
> form finishes.

How about 'with-fetched-url', then?



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-30 14:15                                             ` Lars Ingebrigtsen
  2016-12-30 16:59                                               ` Eli Zaretskii
@ 2016-12-30 21:38                                               ` Richard Stallman
  1 sibling, 0 replies; 88+ messages in thread
From: Richard Stallman @ 2016-12-30 21:38 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, dgutov

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > That's not the case, is it?  So the name `with-url' doesn't fit
  > > what it does.  (What does it do?)

  > It's like `with-temp-buffer' and its cousins: It generates a new
  > buffer, executes the body in that buffer, and kills the buffer when the
  > form finishes.

It sounds useful, but the name isn't clear.  Let's call it
`with-url-contents'; that fits what it does.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2016-12-30 16:59                                               ` Eli Zaretskii
@ 2017-01-21 15:39                                                 ` Lars Ingebrigtsen
  2017-01-21 15:56                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2017-01-21 15:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, emacs-devel, kentaro.nakazawa, rms, dgutov

Eli Zaretskii <eliz@gnu.org> writes:

>> It's like `with-temp-buffer' and its cousins: It generates a new
>> buffer, executes the body in that buffer, and kills the buffer when the
>> form finishes.
>
> How about 'with-fetched-url', then?

Hm...  I'm not sure it gives us more clarity.  It should really be
`with-content-fetched-from-specified-url', but that's a bit long, right?
So I think `with-url' is fine for anybody who's working with these
things.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2017-01-21 15:39                                                 ` Lars Ingebrigtsen
@ 2017-01-21 15:56                                                   ` Eli Zaretskii
  2017-01-21 16:30                                                     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Eli Zaretskii @ 2017-01-21 15:56 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: p.stephani2, dgutov, kentaro.nakazawa, rms, emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Sat, 21 Jan 2017 16:39:12 +0100
> Cc: p.stephani2@gmail.com, emacs-devel@gnu.org, kentaro.nakazawa@nifty.com,
> 	rms@gnu.org, dgutov@yandex.ru
> 
> > How about 'with-fetched-url', then?
> 
> Hm...  I'm not sure it gives us more clarity.  It should really be
> `with-content-fetched-from-specified-url', but that's a bit long, right?
> So I think `with-url' is fine for anybody who's working with these
> things.

Both Richard and myself came up with almost identical comments on
with-url, so I hope you will reconsider.



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2017-01-21 15:56                                                   ` Eli Zaretskii
@ 2017-01-21 16:30                                                     ` Lars Ingebrigtsen
  2017-01-21 22:58                                                       ` Stefan Monnier
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2017-01-21 16:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, dgutov, kentaro.nakazawa, rms, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> Both Richard and myself came up with almost identical comments on
> with-url, so I hope you will reconsider.

Perhaps we could have a vote.  The contenders are `with-url',
`with-fetched-url', `with-url-contents' and
`with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2017-01-21 16:30                                                     ` Lars Ingebrigtsen
@ 2017-01-21 22:58                                                       ` Stefan Monnier
  2017-01-24 20:04                                                         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Stefan Monnier @ 2017-01-21 22:58 UTC (permalink / raw)
  To: emacs-devel

>>>>> "Lars" == Lars Ingebrigtsen <larsi@gnus.org> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>> Both Richard and myself came up with almost identical comments on
>> with-url, so I hope you will reconsider.

> Perhaps we could have a vote.  The contenders are `with-url',
> `with-fetched-url', `with-url-contents' and
> `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'.

I vote against with-url and
with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url.
The other two seem fine,


        Stefan




^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2017-01-21 22:58                                                       ` Stefan Monnier
@ 2017-01-24 20:04                                                         ` Lars Ingebrigtsen
  2017-01-28  9:52                                                           ` Elias Mårtenson
  0 siblings, 1 reply; 88+ messages in thread
From: Lars Ingebrigtsen @ 2017-01-24 20:04 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Perhaps we could have a vote.  The contenders are `with-url',
>> `with-fetched-url', `with-url-contents' and
>> `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'.
>
> I vote against with-url and
> with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url.
> The other two seem fine,

OK, then we have 1 vote for `with-url', 1.5 votes for `with-fetched-url'
and `with-url-contents' each, and zero for
`with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'.

The competition is heating up!

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2017-01-24 20:04                                                         ` Lars Ingebrigtsen
@ 2017-01-28  9:52                                                           ` Elias Mårtenson
  2017-01-28 14:16                                                             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 88+ messages in thread
From: Elias Mårtenson @ 2017-01-28  9:52 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel

Who is allowed to vote? I consider with-url to be less than ideal and not
very clear. with-url-contents is a lot better.

Regards,
Elias


On 25 Jan 2017 4:06 AM, "Lars Ingebrigtsen" <larsi@gnus.org> wrote:

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Perhaps we could have a vote.  The contenders are `with-url',
>> `with-fetched-url', `with-url-contents' and
>> `with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'.
>
> I vote against with-url and
> with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url.
> The other two seem fine,

OK, then we have 1 vote for `with-url', 1.5 votes for `with-fetched-url'
and `with-url-contents' each, and zero for
`with-contents-in-a-buffer-fetched-from-somewhere-specified-by-the-following-url'.

The competition is heating up!

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
  2017-01-28  9:52                                                           ` Elias Mårtenson
@ 2017-01-28 14:16                                                             ` Lars Ingebrigtsen
  0 siblings, 0 replies; 88+ messages in thread
From: Lars Ingebrigtsen @ 2017-01-28 14:16 UTC (permalink / raw)
  To: Elias Mårtenson; +Cc: Stefan Monnier, emacs-devel

Elias Mårtenson <lokedhs@gmail.com> writes:

> Who is allowed to vote? I consider with-url to be less than ideal and not very
> clear. with-url-contents is a lot better. 

OK, `with-url-contents' is now the clear leader here with 2.5 votes!

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 88+ messages in thread

end of thread, other threads:[~2017-01-28 14:16 UTC | newest]

Thread overview: 88+ messages
2016-11-29  8:22 bug#23750: 25.0.95; bug in url-retrieve or json.el Kentaro NAKAZAWA
2016-11-29  9:54 ` Andreas Schwab
2016-11-29 10:06   ` Kentaro NAKAZAWA
2016-11-29 10:08     ` Dmitry Gutov
2016-11-29 10:23       ` Kentaro NAKAZAWA
2016-11-29 10:34         ` Lars Ingebrigtsen
2016-11-29 10:38           ` Kentaro NAKAZAWA
2016-11-29 10:42             ` Lars Ingebrigtsen
2016-11-29 10:48               ` Kentaro NAKAZAWA
2016-11-29 10:49               ` Dmitry Gutov
2016-11-29 10:50             ` Dmitry Gutov
2016-11-29 10:55               ` Kentaro NAKAZAWA
2016-11-29 10:59                 ` Dmitry Gutov
2016-11-29 11:03                   ` Kentaro NAKAZAWA
2016-11-29 11:05                     ` Dmitry Gutov
2016-11-29 11:12                       ` Kentaro NAKAZAWA
2016-11-29 17:23                       ` Eli Zaretskii
2016-11-29 23:09                         ` Philipp Stephani
2016-11-29 23:18                           ` Philipp Stephani
2016-11-30 15:11                             ` Eli Zaretskii
2016-11-30 15:20                               ` Lars Ingebrigtsen
2016-11-30 15:43                                 ` Eli Zaretskii
2016-11-30 15:46                                   ` Lars Ingebrigtsen
2016-11-30  0:16                           ` Dmitry Gutov
2016-11-30 15:13                             ` Eli Zaretskii
2016-11-30 15:17                               ` Dmitry Gutov
2016-11-30 15:32                                 ` Stefan Monnier
2016-11-30 15:42                                 ` Eli Zaretskii
2016-11-30 15:45                                   ` Dmitry Gutov
2016-11-30 15:48                                     ` Lars Ingebrigtsen
2016-11-30 16:25                                       ` Eli Zaretskii
2016-11-30 16:27                                         ` Lars Ingebrigtsen
2016-11-30 16:42                                           ` Eli Zaretskii
2016-11-30 18:25                                             ` Philipp Stephani
2016-11-30 18:48                                               ` Eli Zaretskii
2016-12-28 18:18                                                 ` Philipp Stephani
2016-12-28 18:34                                                   ` Eli Zaretskii
2016-12-28 18:45                                                     ` Philipp Stephani
2016-12-28 18:55                                                       ` Eli Zaretskii
2016-12-28 19:03                                                       ` Andreas Schwab
2016-11-30 18:23                                         ` Philipp Stephani
2016-11-30 18:44                                           ` Eli Zaretskii
2016-12-28 18:09                                             ` Philipp Stephani
2016-12-28 18:27                                               ` Eli Zaretskii
2016-12-28 18:35                                                 ` Philipp Stephani
2016-12-28 18:45                                                   ` Eli Zaretskii
2016-12-28 18:22                                       ` Philipp Stephani
2016-12-28 18:57                                         ` Lars Ingebrigtsen
2016-12-30  0:07                                           ` Richard Stallman
2016-12-30 14:15                                             ` Lars Ingebrigtsen
2016-12-30 16:59                                               ` Eli Zaretskii
2017-01-21 15:39                                                 ` Lars Ingebrigtsen
2017-01-21 15:56                                                   ` Eli Zaretskii
2017-01-21 16:30                                                     ` Lars Ingebrigtsen
2017-01-21 22:58                                                       ` Stefan Monnier
2017-01-24 20:04                                                         ` Lars Ingebrigtsen
2017-01-28  9:52                                                           ` Elias Mårtenson
2017-01-28 14:16                                                             ` Lars Ingebrigtsen
2016-12-30 21:38                                               ` Richard Stallman
2016-11-30 16:23                                     ` Eli Zaretskii
2016-12-01  0:30                                       ` Dmitry Gutov
2016-12-01 17:17                                         ` Eli Zaretskii
2016-12-02 13:18                                           ` Dmitry Gutov
2016-12-02 14:24                                             ` Eli Zaretskii
2016-12-02 14:35                                               ` Dmitry Gutov
2016-12-02 15:20                                                 ` Eli Zaretskii
2016-12-02 14:53                                               ` Yuri Khan
2016-12-02 15:45                                                 ` Eli Zaretskii
2016-12-02 15:51                                                 ` Lars Ingebrigtsen
2016-12-02 15:58                                                   ` Eli Zaretskii
2016-12-02 15:29                                             ` Lars Ingebrigtsen
2016-12-02 15:32                                               ` Dmitry Gutov
2016-12-02 15:48                                                 ` Lars Ingebrigtsen
2016-12-02 15:56                                                   ` Dmitry Gutov
2016-12-02 16:02                                                     ` Lars Ingebrigtsen
2016-12-02 16:06                                                       ` Dmitry Gutov
2016-12-02 16:31                                                         ` Lars Ingebrigtsen
2016-12-02 23:13                                                           ` Dmitry Gutov
2016-12-03  0:37                                                             ` Lars Ingebrigtsen
2016-12-03  1:27                                                               ` Dmitry Gutov
2016-12-03  8:12                                                               ` Eli Zaretskii
2016-12-03 10:01                                                                 ` Lars Ingebrigtsen
2016-12-03 16:00                                                                   ` Stefan Monnier
2016-12-03 20:01                                                                     ` Lars Ingebrigtsen
2016-12-03 20:57                                                                       ` Andreas Schwab
2016-12-28 18:25                                         ` Philipp Stephani
2016-11-30 15:06                           ` Eli Zaretskii
2016-11-30 15:31                             ` Stefan Monnier
