unofficial mirror of help-gnu-emacs@gnu.org
* url-retrieve-synchronously results differ from curl
@ 2015-01-20  2:28 Artur Malabarba
  2015-01-20  7:37 ` Nicolas Richard
  2015-01-21  8:01 ` Thien-Thi Nguyen
  0 siblings, 2 replies; 13+ messages in thread
From: Artur Malabarba @ 2015-01-20  2:28 UTC (permalink / raw)
  To: help-gnu-emacs, Sean Allred

There are further details on this phenomenon at this link
http://emacs.stackexchange.com/q/7545/50

In short, we're trying to do a post request with
`url-retrieve-synchronously', but it's yielding a different result than the
same request using curl. The url request *works*, in that it performs the
post and returns sane results. However, these results are not the expected
ones, and performing the same request using curl does return the expected
results.

It's possible this is just a bug in the API we're posting to, but it's more
likely we just haven't built this request correctly in
url-retrieve-synchronously. This would explain why it's being handled
differently from the request we make with curl. We'd appreciate the input
from anyone who knows more about the `url' package, before we run off
blaming the API. :-)

Less relevant details can be found at the linked question, but below is the
basic url call we're using. The address and data-args variables are defined
elsewhere, but I'm just looking to know whether we're doing something wrong
with the url call.

      (let ((url-automatic-caching t)
            (url-inhibit-uncompression t)
            (url-request-data data-args)
            (url-request-method "post")
            (url-request-extra-headers
             '(("Content-Type" . "application/x-www-form-urlencoded"))))
        (with-current-buffer
            (url-retrieve-synchronously address)
          (goto-char (point-min))
          (search-forward "\n\n")
          (delete-region (point-min) (point))
          (buffer-string)))

For comparison, here's the curl command we use (which returns different
results)
         (format "curl --silent -X POST --data %S %s"
           data-args
           address)

Thanks a lot
Artur Malabarba
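
A side note on the curl command above: `%S' prints its argument in Lisp
read syntax, which looks like shell double quoting but does not protect
characters such as `$' or backquotes from the shell. A minimal sketch of
the same command built with `shell-quote-argument' follows; the helper
name `my-curl-post-command' is hypothetical and not part of the original
code.

```elisp
;; Sketch only: `my-curl-post-command' is a hypothetical helper name.
(defun my-curl-post-command (data address)
  "Return a curl command line that POSTs DATA to ADDRESS.
Both arguments are quoted for the shell with `shell-quote-argument'."
  (format "curl --silent -X POST --data %s %s"
          (shell-quote-argument data)
          (shell-quote-argument address)))
```

The result can be passed to `shell-command-to-string' in place of the
`format' call shown above.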



* Re: url-retrieve-synchronously results differ from curl
  2015-01-20  2:28 url-retrieve-synchronously results differ from curl Artur Malabarba
@ 2015-01-20  7:37 ` Nicolas Richard
  2015-01-20 15:35   ` Artur Malabarba
  2015-01-21  8:01 ` Thien-Thi Nguyen
  1 sibling, 1 reply; 13+ messages in thread
From: Nicolas Richard @ 2015-01-20  7:37 UTC (permalink / raw)
  To: Artur Malabarba; +Cc: Sean Allred, help-gnu-emacs

Artur Malabarba <bruce.connor.am@gmail.com> writes:
> It's possible this is just a bug in the API we're posting to, but it's
> more likely we just haven't built this request correctly in
> url-retrieve-synchronously. This would explain why it's being handled
> differently from the request we make with curl.

(I'm going a bit off topic here, sorry about that.)

Did you inspect the actual request being sent, e.g. with wireshark (this
is GPL software)? Perhaps you can spot a difference and get at least a
starting point.

If you have never used wireshark, here's how I would do it:

1. type something in the filter box (near the top) like:
   (ip.src==208.118.235.148 or ip.dst == 208.118.235.148) and http
   That would only show HTTP traffic from/to gnu.org.
2. select a capture interface (e.g. eth0) and hit start
3. send the requests
4. hit "Stop" and inspect what you've got.

HTH,

-- 
Nicolas




* Re: url-retrieve-synchronously results differ from curl
  2015-01-20  7:37 ` Nicolas Richard
@ 2015-01-20 15:35   ` Artur Malabarba
  2015-01-21  7:13     ` Nicolas Richard
  0 siblings, 1 reply; 13+ messages in thread
From: Artur Malabarba @ 2015-01-20 15:35 UTC (permalink / raw)
  To: Nicolas Richard; +Cc: Sean Allred, help-gnu-emacs

>> It's possible this is just a bug in the API we're posting to, but it's
>> more likely we just haven't built this request correctly in
>> url-retrieve-synchronously. This would explain why it's being handled
>> differently from the request we make with curl.
>
> (I'm going a bit off topic here, sorry about that.)
>
> Did you inspect the actual request being sent via e.g. wireshark (this
> is GPL software) ? Perhaps you can spot a difference and get at least a
> starting point.
>
> If you never used wireshark, here's how I would do it :
>
> 1. type something in the filter box (near the top) like:
>    (ip.src==208.118.235.148 or ip.dst == 208.118.235.148) and http
>    That would only show HTTP traffic from/to gnu.org.
> 2. select a capture interface (e.g. eth0) and hit start
> 3. send the requests
> 4. hit "Stop" and inspect what you've got.

Thanks for the suggestion. I've managed to follow the steps above, but
I can't say I fully understand the results.

A single call to `url-retrieve-synchronously' yields 11 entries
meeting the ip.dst filter (most TCP and a few TLS). I tried looking
through these entries to figure out what data/headers were being sent,
but had no success.




* Re: url-retrieve-synchronously results differ from curl
  2015-01-20 15:35   ` Artur Malabarba
@ 2015-01-21  7:13     ` Nicolas Richard
  2015-01-21 13:14       ` Artur Malabarba
  2015-01-21 14:27       ` Sean Allred
  0 siblings, 2 replies; 13+ messages in thread
From: Nicolas Richard @ 2015-01-21  7:13 UTC (permalink / raw)
  To: Artur Malabarba; +Cc: Sean Allred, help-gnu-emacs

Artur Malabarba <bruce.connor.am@gmail.com> writes:
> A single call to `url-retrieve-synchronously' yields 11 entries
> meeting the ip.dst filter (most TCP and a few TLS). I tried looking
> through these entries to figure out what data/headers were being sent,
> but had no success.

Does restricting to 'http' help? This is the kind of thing I have:
http://i.imgur.com/cOuz8QR.png

-- 
Nicolas




* Re: url-retrieve-synchronously results differ from curl
  2015-01-20  2:28 url-retrieve-synchronously results differ from curl Artur Malabarba
  2015-01-20  7:37 ` Nicolas Richard
@ 2015-01-21  8:01 ` Thien-Thi Nguyen
  2015-01-21 14:14   ` Artur Malabarba
  1 sibling, 1 reply; 13+ messages in thread
From: Thien-Thi Nguyen @ 2015-01-21  8:01 UTC (permalink / raw)
  To: help-gnu-emacs


() Artur Malabarba <bruce.connor.am@gmail.com>
() Tue, 20 Jan 2015 00:28:46 -0200

               (url-request-method "post")

IIRC, HTTP specifies the spelling as "POST" (all upcase).

-- 
Thien-Thi Nguyen
   GPG key: 4C807502
   (if you're human and you know it)
      read my lisp: (responsep (questions 'technical)
                               (not (via 'mailing-list)))
                     => nil


* Re: url-retrieve-synchronously results differ from curl
  2015-01-21  7:13     ` Nicolas Richard
@ 2015-01-21 13:14       ` Artur Malabarba
  2015-01-21 13:42         ` Nicolas Richard
  2015-01-21 14:27       ` Sean Allred
  1 sibling, 1 reply; 13+ messages in thread
From: Artur Malabarba @ 2015-01-21 13:14 UTC (permalink / raw)
  To: Nicolas Richard; +Cc: Sean Allred, help-gnu-emacs

Restricting to http yields no results (the list becomes empty), so
maybe I'm doing something wrong.

The api call is indeed https, with the domain `api.stackexchange.com',
but none of the entries in the list use the http protocol. I thought I
might be using a wrong destination IP in the filter (which I
discovered by pinging the domain above), but even if I remove the
(ip.dst == 198.252.206.140) part I still get an empty list for the
`http' filter (also for the `http2').

I also tried changing the interface to `any' (in case I was using the
wrong one), but the result is the same. I get some http entries listed
if I do some browsing, but I get nothing by contacting the API with
url-retrieve-synchronously.

2015-01-21 5:13 GMT-02:00 Nicolas Richard <theonewiththeevillook@yahoo.fr>:
> Artur Malabarba <bruce.connor.am@gmail.com> writes:
>> A single call to `url-retrieve-synchronously' yields 11 entries
>> meeting the ip.dst filter (most TCP and a few TLS). I tried looking
>> through these entries to figure out what data/headers were being sent,
>> but had no success.
>
> Does restricting to 'http' help ? This is the kind of thing I have :
> http://i.imgur.com/cOuz8QR.png
>
> --
> Nicolas




* Re: url-retrieve-synchronously results differ from curl
  2015-01-21 13:14       ` Artur Malabarba
@ 2015-01-21 13:42         ` Nicolas Richard
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Richard @ 2015-01-21 13:42 UTC (permalink / raw)
  To: bruce.connor.am; +Cc: Sean Allred, help-gnu-emacs

Le 21/01/2015 14:14, Artur Malabarba a écrit :
> The api call is indeed https, with the domain `api.stackexchange.com',
> but none of the entries in the list use the http protocol. I thought I
> might be using a wrong destination IP in the filter (which I
> discovered by pinging the domain above), but even if I remove the
> (ip.dst == 198.252.206.140) part I still get an empty list for the
> `http' filter (also for the `http2').

I had overlooked the fact you were using https, sorry about this. I
confirm that when I try with an https url, the output seems unusable.

Nicolas.
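
Since an HTTPS capture only shows encrypted TLS records, one alternative
is to have url.el log what it sends. The following is a minimal sketch,
assuming the `url-debug' option from url-util.el works as documented:
when it is non-nil, the library writes its debug messages to the
*URL-DEBUG* buffer, and as far as I can tell these include the request
text. The function name `my-url-retrieve-with-debug' is hypothetical.

```elisp
(require 'url)

;; Sketch only: `my-url-retrieve-with-debug' is a hypothetical name.
(defun my-url-retrieve-with-debug (address)
  "Retrieve ADDRESS synchronously with url.el debug logging enabled.
Return the response buffer.  Debug output, if any, is written to the
buffer named *URL-DEBUG*."
  (let ((url-debug t))
    (url-retrieve-synchronously address)))
```

After a call, switch to *URL-DEBUG* and compare the logged request line
and headers with what `curl --verbose' reports for the same request.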




* Re: url-retrieve-synchronously results differ from curl
  2015-01-21  8:01 ` Thien-Thi Nguyen
@ 2015-01-21 14:14   ` Artur Malabarba
  0 siblings, 0 replies; 13+ messages in thread
From: Artur Malabarba @ 2015-01-21 14:14 UTC (permalink / raw)
  To: Thien-Thi Nguyen; +Cc: help-gnu-emacs

>
>                (url-request-method "post")
>
> IIRC, HTTP specifies the spelling as "POST" (all upcase).
>

Oh god. This actually works! It fixes the bug.

Why the hell would it work 50% with lower case method?! I mean, the posting
succeeds anyway, only the return value is kind of off.

I'll do some more testing later, but this does seem to be the culprit.
Thanks a lot, Thien.
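
For reference, a minimal sketch of the request with the method forced to
upper case before it is bound. The names `my-http-method' and
`my-url-post' are hypothetical; the variables are the same ones used in
the original call.

```elisp
(require 'url)

;; Hypothetical helpers; not part of url.el.
(defun my-http-method (method)
  "Return the string METHOD in upper case, e.g. \"post\" -> \"POST\"."
  (upcase method))

(defun my-url-post (address data)
  "POST DATA (a form-encoded string) to ADDRESS and return the response body."
  (let ((url-request-method (my-http-method "post"))
        (url-request-data data)
        (url-request-extra-headers
         '(("Content-Type" . "application/x-www-form-urlencoded"))))
    (with-current-buffer (url-retrieve-synchronously address)
      (goto-char (point-min))
      (search-forward "\n\n")           ; skip past the response headers
      (buffer-substring (point) (point-max)))))
```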



* Re: url-retrieve-synchronously results differ from curl
  2015-01-21  7:13     ` Nicolas Richard
  2015-01-21 13:14       ` Artur Malabarba
@ 2015-01-21 14:27       ` Sean Allred
  2015-01-21 16:29         ` Thien-Thi Nguyen
  2015-01-22  1:14         ` Artur Malabarba
  1 sibling, 2 replies; 13+ messages in thread
From: Sean Allred @ 2015-01-21 14:27 UTC (permalink / raw)
  To: Nicolas Richard; +Cc: help-gnu-emacs, Artur Malabarba

Artur discovered something this morning that I wanted to share here as he works on a proper bug report. (We’re both especially busy today, unfortunately.) Perhaps someone can shed light on this behavior.

Binding `url-request-method` to `"POST"` rather than `"post"` seems to fix the issues en masse. This has not undergone proper testing, but it definitely seems to work [1]. I apologize that I don’t have an example that works out-of-the-box – I have very limited experience with web stuff – but using “POST” rather than “post” makes a significant difference.

The question we both have at this point is *why* – *why* does “post” ‘partly’ work? *Why* does “POST” work fully? And frankly, in my personal opinion, *why* aren’t these methods taken in as symbols in the first place?

Either Artur or I will be filing a proper bug report for url.el on this issue as time allows (assuming of course that someone doesn’t beat both of us to it).

(Unless this thread can count as a bug-report?)

All the best,
Sean Allred

[1]: https://gist.github.com/90be70f12cd097be1247, reproduced below

(defconst tmp:access-token
  ;; Needed to post answers.  Considered a secret.  If you would like
  ;; a key, use `sx-authenticate' from sx.el (available from MELPA)
  ;; and look in ~/.emacs.d/.sx/auth.el
  "YOUR ACCESS TOKEN HERE")

(defconst tmp:key
  ;; not considered a secret
  "0TE6s1tveCpP9K5r5JNDNQ((")

(defun tmp:api-bug (use-curl access-token key)
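  "Post a test answer to the formatting sandbox.
If USE-CURL is non-nil, send the request with curl; otherwise use
`url-retrieve-synchronously'.  ACCESS-TOKEN and KEY authenticate the call."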
  (let ((random-body-1 (md5 (current-time-string)))
        (random-body-2 (md5 (md5 (current-time-string))))
        (method "https://api.stackexchange.com/2.2/questions/3122/answers/add")
        (args
         (mapconcat
          #'identity
          `(,(format "access_token=%s"
                     (replace-regexp-in-string
                      "%" "%%" (url-hexify-string access-token)))
            ,(format "key=%s"
                     (replace-regexp-in-string
                      "%" "%%" (url-hexify-string key)))
            "site=meta"
            "pagesize=100"
            "filter=%%21GoYr1we0U5inG5G7wBg4JBGpbgX%%29C7LDqpy-%%2AbfwPOujOr4SR4W%%29bLNSyYUpQDdTwTj.XChTFB0gfLaAJq0hv"
            "body=this-is-an-answer-test-for-sx.el--%s")
          "&")))

    (if use-curl
        (shell-command-to-string
         (format
          "curl --silent -X POST --data %S %s | gunzip"
          (format args random-body-2)
          method))
      (let ((url-automatic-caching t)
            (url-inhibit-uncompression t)
            (url-request-data (format args random-body-1))
            ;; emacs-devel: note vvvv
            (url-request-method "POST")
            (url-request-extra-headers
             '(("Content-Type" . "application/x-www-form-urlencoded"))))
        (with-current-buffer
            (url-retrieve-synchronously method)
          (goto-char (point-min))
          (search-forward "\n\n")
          (delete-region (point-min) (point))
          (buffer-string))))))

(require 'json)

(json-read-from-string (tmp:api-bug t tmp:access-token tmp:key))

((total . 1)                            ; this data structure used curl
 (page_size . 100)
 (page . 1)
 (quota_remaining . 9979)
 (quota_max . 10000)
 (has_more . :json-false)
 (items .
        [((link . "http://meta.stackexchange.com/questions/3122//247295#247295")
          (body_markdown . "this-is-an-answer-test-for-sx.el--e3eeb6228ed9c2c58e5385b73493f0f0")
          (share_link . "http://meta.stackexchange.com/a/247295/188148")
          (answer_id . 247295)
          (creation_date . 1421684370)
          (last_activity_date . 1421684370)
          (score . 0)
          (upvoted . :json-false)
          (downvoted . :json-false)
          (owner
           (display_name . "Sean Allred")
           (reputation . 160)))]))

(json-read-from-string (tmp:api-bug nil tmp:access-token tmp:key))

((total . 1)                            ; this data structure used "POST"
 (page_size . 100)
 (page . 1)
 (quota_remaining . 9999)
 (quota_max . 10000)
 (has_more . :json-false)
 (items .
        [((link . "http://meta.stackexchange.com/questions/3122//247396#247396")
          (body_markdown . "this-is-an-answer-test-for-sx.el--5b447b87e7078ed0fd34a3169ee84319")
          (share_link . "http://meta.stackexchange.com/a/247396/188148")
          (answer_id . 247396)
          (creation_date . 1421847252)
          (last_activity_date . 1421847252)
          (score . 0)
          (upvoted . :json-false)
          (downvoted . :json-false)
          (owner
           (display_name . "Sean Allred")
           (reputation . 160)))]))


((total . 1)                            ; this data structure used "post"
 (page_size . 100)
 (page . 1)
 (quota_remaining . 9977)
 (quota_max . 10000)
 (has_more . :json-false)
 (items .
        [((link . "http://meta.stackexchange.com/questions/3122//247297#247297")
          (share_link . "http://meta.stackexchange.com/a/247297/188148")
          (answer_id . 247297)
          (creation_date . 1421684555)
          (last_activity_date . 1421684555)
          (score . 0)
          (owner
           (display_name . "Sean Allred")
           (reputation . 160)))]))




* Re: url-retrieve-synchronously results differ from curl
  2015-01-21 14:27       ` Sean Allred
@ 2015-01-21 16:29         ` Thien-Thi Nguyen
  2015-01-22  1:14         ` Artur Malabarba
  1 sibling, 0 replies; 13+ messages in thread
From: Thien-Thi Nguyen @ 2015-01-21 16:29 UTC (permalink / raw)
  To: help-gnu-emacs


() Sean Allred <code@seanallred.com>
() Wed, 21 Jan 2015 09:27:16 -0500

   The question we both have at this point is *why* – *why* does
   “post” ‘partly’ work? *Why* does “POST” work fully?

That can only be answered precisely by the programmers of the
server software.  They were probably smitten by Postel, blech,
and spewful of one-armed-if expressions and other such half-
sense.  [Insert more curmudgeonly harrumphing here.  :-D]

-- 
Thien-Thi Nguyen
   GPG key: 4C807502
   (if you're human and you know it)
      read my lisp: (responsep (questions 'technical)
                               (not (via 'mailing-list)))
                     => nil


* Re: url-retrieve-synchronously results differ from curl
  2015-01-21 14:27       ` Sean Allred
  2015-01-21 16:29         ` Thien-Thi Nguyen
@ 2015-01-22  1:14         ` Artur Malabarba
  2015-01-22  2:41           ` Artur Malabarba
  2015-01-22  9:00           ` Thien-Thi Nguyen
  1 sibling, 2 replies; 13+ messages in thread
From: Artur Malabarba @ 2015-01-22  1:14 UTC (permalink / raw)
  To: Sean Allred; +Cc: Nicolas Richard, help-gnu-emacs

> Binding url-request-method to "POST" rather than "post" seems to fix
> the issues en masse. This has not undergone proper testing, but it
> definitely seems to work [1].

I'd just like to confirm (now that I've done a bit more testing) that
this does indeed fix the issue. So thanks again to Thien-Thi for
suggesting that.

I'll see if I can find out where this happens in url.el. If not, I'll
just file a bug report. IMHO, the package should either support both
versions indiscriminately, or warn the user if they use the wrong
version. Even better (as Sean suggests) would be to support and
encourage the use of symbols.




* Re: url-retrieve-synchronously results differ from curl
  2015-01-22  1:14         ` Artur Malabarba
@ 2015-01-22  2:41           ` Artur Malabarba
  2015-01-22  9:00           ` Thien-Thi Nguyen
  1 sibling, 0 replies; 13+ messages in thread
From: Artur Malabarba @ 2015-01-22  2:41 UTC (permalink / raw)
  To: Sean Allred; +Cc: help-gnu-emacs

Upon further (much much further) investigation, it seems that nothing
in the url package could cause this specific behavior. Still, I see
many places where the package will silently misbehave if the user downcases
the "get" method (and a couple other methods). Given that
`url-request-method' makes no mention of that, I think the following
would be worthy patches (and I'm willing to write them if people
agree):

1. Document in `url-request-method' that method names are uppercase
strings (the current version doesn't even say they are strings).
2. Warn the user (with a message?) in `url-retrieve-internal' if the
above variable is dynamically bound to a lowercase string.
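
A rough sketch of what the check in item 2 could look like, written as
standalone helpers rather than as a patch to `url-retrieve-internal'.
Both function names are hypothetical, and url.el does not ship such a
check.

```elisp
;; Hypothetical helpers; url.el does not provide this check.
(defun my-url-method-suspicious-p (method)
  "Return non-nil if METHOD is a string that is not entirely upper case."
  (and (stringp method)
       (not (string= method (upcase method)))))

(defun my-url-check-method (method)
  "Warn with `message' when METHOD is not upper case.
Return METHOD unchanged so the call can wrap an existing value."
  (when (my-url-method-suspicious-p method)
    (message "HTTP method %S is not upper case; did you mean %S?"
             method (upcase method)))
  method)
```

The check only reports the problem and leaves the value alone, which
matches the "warn the user" proposal rather than silently correcting it.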

2015-01-21 23:14 GMT-02:00 Artur Malabarba <bruce.connor.am@gmail.com>:
>> Binding url-request-method to "POST" rather than "post" seems to fix
>> the issues en masse. This has not undergone proper testing, but it
>> definitely seems to work [1].
>
> I'd just like to confirm (now that I've done a bit more testing) that
> this does indeed fix the issue. So thanks again to Thien-Thi for
> suggesting that.
>
> I'll see if I can find out where this happens in url.el. If not, I'll
> just file a bug report. IMHO, the package should either support both
> versions indiscriminately, or warn the user if they use the wrong
> version. Even better (as Sean suggests) would be to support and
> encourage the use of symbols.




* Re: url-retrieve-synchronously results differ from curl
  2015-01-22  1:14         ` Artur Malabarba
  2015-01-22  2:41           ` Artur Malabarba
@ 2015-01-22  9:00           ` Thien-Thi Nguyen
  1 sibling, 0 replies; 13+ messages in thread
From: Thien-Thi Nguyen @ 2015-01-22  9:00 UTC (permalink / raw)
  To: help-gnu-emacs


() Artur Malabarba <bruce.connor.am@gmail.com>
() Wed, 21 Jan 2015 23:14:56 -0200

   I'll see if I can find out where this happens in url.el.
   If not, I'll just file a bug report.  IMHO, the package
   should either support both versions indiscriminately, or
   warn the user if they use the wrong version.

This is not a legitimate complaint.  HTTP specifies the
all-upcase spelling explicitly, so url.el DTRT already.
(Nod to pedants: Actually, RFC1945 sez "The method is
case-sensitive." and lists "POST" in the RHS in the
‘Method’ non-terminal production...)

If anything, a bug-report is indicated for the server side:
unrecognized method names should elicit status code 501
(error; "not implemented").
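
On the client side, a reply like that is easy to detect once the status
line has been read. A minimal sketch, assuming the raw response text
starts with the usual "HTTP/x.y NNN reason" status line;
`my-http-status-code' is a hypothetical helper name.

```elisp
;; Hypothetical helper; parses only the first line of a raw response.
(defun my-http-status-code (response)
  "Return the numeric status code from the status line of RESPONSE.
Return nil if RESPONSE does not start with an HTTP status line."
  (when (string-match "\\`HTTP/[0-9.]+ \\([0-9][0-9][0-9]\\)" response)
    (string-to-number (match-string 1 response))))
```

A caller could compare the result against 501 and report the rejected
method instead of trying to parse the body.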

   Even better (as Sean suggests) would be to support and
   encourage the use of symbols.

This is a separate issue, but legitimate.  Personally, i'd love
to see more symbols and fewer strings-as-symbols, all around.
(Insert rant on languages lacking a "symbol" data type, here.)
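
Until something like that exists in url.el, a thin wrapper can provide
the symbol interface. A minimal sketch follows; the name
`my-method-string' is hypothetical, and url.el itself still receives a
string through `url-request-method'.

```elisp
;; Hypothetical helper; url.el still receives a string.
(defun my-method-string (method)
  "Convert METHOD, a symbol such as `post', to an upper-case string."
  (unless (symbolp method)
    (signal 'wrong-type-argument (list 'symbolp method)))
  (upcase (symbol-name method)))

;; Usage: (let ((url-request-method (my-method-string 'post))) ...)
```

Rejecting strings outright keeps the two spellings from mixing again at
the call site.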

-- 
Thien-Thi Nguyen
   GPG key: 4C807502
   (if you're human and you know it)
      read my lisp: (responsep (questions 'technical)
                               (not (via 'mailing-list)))
                     => nil

