From: Mark H Weaver <mhw@netris.org>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: 32528@debbugs.gnu.org
Subject: bug#32528: http-post breaks with XML response payload containing boundary
Date: Tue, 28 Aug 2018 17:51:14 -0400
Message-ID: <87bm9mf9d9.fsf@netris.org>
In-Reply-To: <874lfiltkg.fsf@elephly.net> (Ricardo Wurmus's message of "Sat, 25 Aug 2018 10:49:19 +0200")
Ricardo Wurmus <rekado@elephly.net> writes:
> I’m having a problem with http-post and I think it might be a bug. I’m
> talking to a Debbugs SOAP service over HTTP by sending (via POST) an XML
> request. The Debbugs SOAP service responds with a string of XML.
>
> Here’s a simplified version of what I do:
>
> (use-modules (web client)     ; http-post
>              (ice-9 receive)  ; receive
>              (sxml simple))   ; xml->sxml
> (let ((req-xml "<soap:Envelope xmlns:soap...>"))
>   (receive (response body)
>       (http-post uri
>                  #:body req-xml
>                  #:headers
>                  `((content-type . (text/xml))
>                    (content-length . ,(string-length req-xml))))
>     ;; Do something with the response body
>     (xml->sxml body #:trim-whitespace? #t)))
>
> This fails for some requests with an error like this:
>
> web/http.scm:1609:23: Bad Content-Type header: multipart/related; type="text/xml"; start="<main_envelope>"; boundary="=-=-="
[...]
> The reason why it fails is that Guile processes the response and treats
> the *payload* contained in the XML response as HTTP.
No, this was a good guess, but it's not actually the problem.
If you add --save-headers to the wget command line, you'll see the full
response, and the HTTP headers are what's being parsed, as it should be.
It looks like this (except that I removed the carriage returns below):
HTTP/1.1 200 OK
Date: Tue, 28 Aug 2018 21:40:30 GMT
Server: Apache
SOAPServer: SOAP::Lite/Perl/1.11
Strict-Transport-Security: max-age=63072000
Content-Length: 32650
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
X-XSS-Protection: 1; mode=block
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: multipart/related; type="text/xml"; start="<main_envelope>"; boundary="=-=-="

<?xml [...]
The problem is simply that our Content-Type header parser is broken.
It's very simplistic: it merely splits the string wherever ';' is found
and then checks that each parameter contains exactly one '=', without
taking into account that quoted strings in the parameters might include
those characters. Here the quoted boundary value "=-=-=" contains
several '=' characters, so the header is rejected.
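To illustrate the failure mode, here is a minimal sketch in Python. It is
not Guile's code from web/http.scm, and the names naive_parse,
split_outside_quotes, unquote and quote_aware_parse are made up for this
example. The first function mimics the split-on-';' and count-the-'='
approach; the others show one possible quote-aware alternative, assuming
parameters use double-quoted strings with backslash escapes.

```python
HEADER = ('multipart/related; type="text/xml"; '
          'start="<main_envelope>"; boundary="=-=-="')


def naive_parse(value):
    """Mimic the simplistic approach: split on ';', require exactly one '='."""
    media_type, *params = value.split(";")
    parsed = {}
    for param in params:
        if param.count("=") != 1:
            raise ValueError(f"Bad Content-Type header: {value}")
        name, val = param.split("=")
        parsed[name.strip()] = val.strip()
    return media_type.strip(), parsed


def split_outside_quotes(value, sep=";"):
    """Split on sep, ignoring separators inside double-quoted strings."""
    parts, current = [], []
    in_quotes = escaped = False
    for ch in value:
        if escaped:
            escaped = False            # character after a backslash is literal
        elif in_quotes and ch == "\\":
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
            continue                   # drop the separator itself
        current.append(ch)
    parts.append("".join(current))
    return parts


def unquote(text):
    """Strip surrounding double quotes and resolve backslash escapes."""
    text = text.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        out, chars = [], iter(text[1:-1])
        for ch in chars:
            out.append(next(chars, "") if ch == "\\" else ch)
        return "".join(out)
    return text


def quote_aware_parse(value):
    """Parse a Content-Type value into (media type, parameter dict)."""
    media_type, *params = split_outside_quotes(value)
    parsed = {}
    for param in params:
        # Split at the first '=' only; later '=' belong to the value.
        name, sep, val = param.partition("=")
        if not sep:
            raise ValueError(f"Bad Content-Type header: {value}")
        parsed[name.strip().lower()] = unquote(val)
    return media_type.strip().lower(), parsed


if __name__ == "__main__":
    try:
        naive_parse(HEADER)
    except ValueError as err:
        print("naive parser:", err)
    print("quote-aware parser:", quote_aware_parse(HEADER))
```

With the header from the response above, naive_parse raises on the
boundary parameter, while quote_aware_parse returns the media type along
with the type, start and boundary parameters.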
I'll work on a proper parser for Content-Type headers.
Thanks,
Mark