unofficial mirror of bug-guile@gnu.org 
From: Mark H Weaver <mhw@netris.org>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: 32528@debbugs.gnu.org
Subject: bug#32528: http-post breaks with XML response payload containing boundary
Date: Tue, 28 Aug 2018 17:51:14 -0400
Message-ID: <87bm9mf9d9.fsf@netris.org>
In-Reply-To: <874lfiltkg.fsf@elephly.net> (Ricardo Wurmus's message of "Sat, 25 Aug 2018 10:49:19 +0200")

Ricardo Wurmus <rekado@elephly.net> writes:

> I’m having a problem with http-post and I think it might be a bug.  I’m
> talking to a Debbugs SOAP service over HTTP by sending (via POST) an XML
> request.  The Debbugs SOAP service responds with a string of XML.
>
> Here’s a simplified version of what I do:
>
>   (use-module (web http))
>   (let ((req-xml "<soap:Envelope xmlns:soap...>"))
>     (receive (response body)
>         (http-post uri
>                    #:body req-xml
>                    #:headers
>                    `((content-type . (text/xml))
>                      (content-length . ,(string-length req-xml))))
>      ;; Do something with the response body
>      (xml->sxml body #:trim-whitespace? #t)))
>
> This fails for some requests with an error like this:
>
>     web/http.scm:1609:23: Bad Content-Type header: multipart/related; type="text/xml"; start="<main_envelope>"; boundary="=-=-="

[...]

> The reason why it fails is that Guile processes the response and treats
> the *payload* contained in the XML response as HTTP.

No, this was a good guess, but it's not actually the problem.

If you add --save-headers to the wget command line, you'll see the full
response, and the HTTP headers are what's being parsed, as it should be.
It looks like this (except that I removed the carriage returns below):

  HTTP/1.1 200 OK
  Date: Tue, 28 Aug 2018 21:40:30 GMT
  Server: Apache
  SOAPServer: SOAP::Lite/Perl/1.11
  Strict-Transport-Security: max-age=63072000
  Content-Length: 32650
  X-Content-Type-Options: nosniff
  X-Frame-Options: sameorigin
  X-XSS-Protection: 1; mode=block
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: multipart/related; type="text/xml"; start="<main_envelope>"; boundary="=-=-="
  
  <?xml [...]

The problem is simply that our Content-Type header parser is broken.
It's very simplistic and merely splits the string wherever ';' is found,
and then checks to make sure there's only one '=' in each parameter,
without taking into account that quoted strings in the parameters might
include those characters.
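
To make the failure mode concrete, here is a minimal sketch in Guile.
It is not the code in web/http.scm and not the eventual fix; the names
naive-params, split-outside-quotes, strip-quotes, parse-param and
sketch-parse-content-type are all made up for illustration.  It
contrasts splitting on every ';' and '=' with a split that skips
delimiters inside quoted strings, assuming values are either bare
tokens or double-quoted strings.  Backslash escapes are stepped over
but not unescaped.

```scheme
;; The header value from the failing response above.
(define header
  "multipart/related; type=\"text/xml\"; start=\"<main_envelope>\"; boundary=\"=-=-=\"")

;; Naive approach: split on every ';', then on every '='.
;; A parameter whose quoted value contains '=' yields more than two pieces.
(define (naive-params str)
  (map (lambda (part) (string-split (string-trim-both part) #\=))
       (cdr (string-split str #\;))))

;; Split STR at DELIM, ignoring any DELIM that occurs inside double quotes.
(define (split-outside-quotes str delim)
  (let loop ((chars (string->list str))
             (in-quotes? #f)
             (current '())
             (parts '()))
    (define (finish)
      (reverse (cons (list->string (reverse current)) parts)))
    (cond
     ((null? chars) (finish))
     ;; Inside quotes, a backslash escapes the next character: keep both.
     ((and in-quotes? (char=? (car chars) #\\) (pair? (cdr chars)))
      (loop (cddr chars) in-quotes?
            (cons* (cadr chars) (car chars) current) parts))
     ;; A double quote toggles the quoted state.
     ((char=? (car chars) #\")
      (loop (cdr chars) (not in-quotes?) (cons (car chars) current) parts))
     ;; A delimiter outside quotes ends the current part.
     ((and (not in-quotes?) (char=? (car chars) delim))
      (loop (cdr chars) #f '()
            (cons (list->string (reverse current)) parts)))
     (else
      (loop (cdr chars) in-quotes? (cons (car chars) current) parts)))))

;; Remove one pair of surrounding double quotes, if present.
(define (strip-quotes value)
  (let ((len (string-length value)))
    (if (and (>= len 2)
             (char=? (string-ref value 0) #\")
             (char=? (string-ref value (- len 1)) #\"))
        (substring value 1 (- len 1))
        value)))

;; Split a parameter at the first '=' only; the name is a plain token,
;; so the first '=' always precedes any quoted string.
(define (parse-param part)
  (let* ((part (string-trim-both part))
         (idx (string-index part #\=)))
    (if idx
        (cons (substring part 0 idx)
              (strip-quotes (substring part (+ idx 1))))
        (cons part #f))))

;; Return (MEDIA-TYPE (NAME . VALUE) ...), all as strings.
(define (sketch-parse-content-type str)
  (let ((parts (split-outside-quotes str #\;)))
    (cons (string-trim-both (car parts))
          (map parse-param (cdr parts)))))
```

With this header, (naive-params header) breaks the boundary parameter
into five pieces instead of two, which is the kind of input the "only
one '='" check rejects, whereas (sketch-parse-content-type header)
returns

  ("multipart/related" ("type" . "text/xml")
   ("start" . "<main_envelope>") ("boundary" . "=-=-="))
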

I'll work on a proper parser for Content-Type headers.

      Thanks,
        Mark




