From: Christopher Allan Webber <cwebber@dustycloud.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 24075@debbugs.gnu.org
Subject: bug#24075: tls/https support in Guile (through r6rs binary ports?)
Date: Sun, 06 Nov 2016 12:32:38 -0600 [thread overview]
Message-ID: <87k2cg5uih.fsf@dustycloud.org> (raw)
In-Reply-To: <87fun56987.fsf@gnu.org>
[-- Attachment #1: Type: text/plain, Size: 9101 bytes --]
Some less good news: I found out that the https stuff is not working
right for all sites. I tested though... the code works *before* I
wrapped it in custom-binary-input/output-port.
After being wrapped though, strange things happen. For some sites (eg
"https://webmention.net/") things seem fine:
scheme@(guile-user)> (http-get (string->uri "https://webmention.net/"))
$7 = #<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers: ((server . "nginx/1.9.10") (date . #<date nanosecond: 0 second: 46 minute: 15 hour: 18 day: 6 month: 11 year: 2016 zone-offset: 0>) (content-type text/html (charset . "UTF-8")) (transfer-encoding (chunked)) (connection close) (x-powered-by . "PHP/5.6.21-1+donate.sury.org~trusty+4")) port: #<closed: file f25310>>
$8 = "<!DOCTYPE html>\n<html>\n<head>\n <title>Webmention</title>\n <link rel=\"stylesheet\" href=\"/styles.css\">\n</head>\n<body>\n\n<div class=\"page\">\n \n <h1>Webmention</h1>\n \n <div class=\"subtitle\">Webmention is a simple way to notify any URL when you link to it from your site.</div>\n \n <ul class=\"links\">\n <li>The Webmention specification is being developed under the <a href=\"https://www.w3.org/wiki/Socialwg\">W3C Social Web Working Group</a>.</li>\n <li class=\"main\"><a href=\"https://www.w3.org/TR/webmention/\">Latest published version</a></li>\n <li class=\"main\"><a href=\"http://webmention.net/draft/\">Latest editor's draft</a></li>\n <li class=\"main\"><a href=\"http://webmention.net/implementations/\">Implementations</a></li>\n <li>The specification was contributed to the W3C by the IndieWeb community. More information and history of the spec can be found on the <a href=\"https://indieweb.org/webmention\">IndieWeb wiki</a>.</li>\n </ul>\n \n</div>\n\n</body>\n</html>"
For other sites, especially ones where the pages are larger, things are
broken. For example, let's try to pull down the site of friend Joey
Hess:
scheme@(guile-user)> (http-get (string->uri "https://joeyh.name/"))
$9 = #<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers: ((date . #<date nanosecond: 0 second: 58 minute: 15 hour: 18 day: 6 month: 11 year: 2016 zone-offset: 0>) (server . "Apache/2.4.10 (Debian)") (last-modified . #<date nanosecond: 0 second: 43 minute: 34 hour: 14 day: 24 month: 10 year: 2016 zone-offset: 0>) (etag "195c-53f9d4af683f3" . #t) (accept-ranges bytes) (content-length . 6492) (vary accept-encoding) (cache-control (max-age . 0)) (expires . #<date nanosecond: 0 second: 58 minute: 15 hour: 18 day: 6 month: 11 year: 2016 zone-offset: 0>) (connection close) (content-type text/html)) port: #<closed: file f4c070>>
$10 = "moz-background-size: cover;\n -o-background-size: cover;\n background-size: cover;\n}\n.sidebar {\n background: none;\n border: none;\n}\ninput#searchbox {\n display: none;\n}\n#pageinfo {\n display: none;\n}\n.pageheader .actions ul {\n border-bottom: none;\n}\n#pagebody {\n margin-left: 20%;\n}\n.archivepagedate {\n font-size: 0.5em;\n}\n.actions {\n display: none;\n}\n</style>\n</div>\n\n\n\n\n<table>\n<tr>\n<td width=\"33%\" valign=top><h3>personal</h3>\n\n<p><a href=\"./blog/\">blog</a><br/>\n<a href=\"./pics/\">pics</a><br/>\n<a href=\"./contact/\">contact me</a><br/>\n<a href=\"./todo/\">todo</a></p>\n\n\n\n</td>\n<td width=\"33%\" valign=top><h3>technical</h3>\n\n<p><a href=\"./code/\">code</a><br/>\n<a href=\"./vcshome/\">vcshome</a><br/>\n<a href=\"./talks/\">talks</a><br/>\n<a href=\"./screencasts/\">screencasts</a><br/>\n<a href=\"./termcast/\">termcasts</a><br/>\n<a href=\"./rfc/\">rfcs</a><br/>\n<a href=\"./boxen/\">boxen</a></p>\n\n\n\n</td>\n<td width=\"33%\" valign=top><h3>fun</h3>\n\n<p><a href=\"./learnstofly/\">Joey Learns to Fly</a><br/>\n<a href=\"http://olduse.net/\">oldusenet</a><br/>\n<a href=\"./languages/\">languages</a><br/>\n<a href=\"./yurt/\">yurt</a><br/>\n<a href=\"./caving/\">caving</a><br/>\n<a href=\"./grep/\">grep</a><br/>\n<a href=\"./meta/\">meta</a></p>\n\n\n\n</td>\n</tr>\n</table>\n\n\n\n\n<h3>interviews</h3>\n\n<p><a href=\"http://joey.hess.usesthis.com\">2012: The Setup</a></p>\n\n<blockquote><p>\"When power is low, I often hack in the evenings by lantern light.\"</p></blockquote>\n\n<p><a href=\"http://zgrimshell.github.io/posts/interviews-with-floss-developers-joey-hess.html\">2015: Life after Debian</a></p>\n\n<blockquote><p>\"I want to build worthwhile things that might last.\"</p></blockquote>\n\n<p><a href=\"http://lwn.net/Articles/672352/\">2016: Linux Weekly News</a></p>\n\n<blockquote><p>\"I still see myself as a beginner, and certainly not an exemplar.\"</p></blockquote>\n\n\n\n\n\n\n</section>\n\n\n\n\n\n\n\n</div>\n\n<footer id=\"footer\" class=\"pagefooter\" role=\"contentinfo\">\n\n<nav id=\"pageinfo\">\n\n\n\n\n\n\n\n\n\n\n\n<div class=\"pagedate\">\nLast edited <time datetime=\"2015-03-02T15:14:09Z\" class=\"relativedate\" title=\"Mon, 02 Mar 2015 10:14:09 -0500\">mid-morning Monday, March 2nd, 2015</time>\n<!-- Created <time datetime=\"2006-03-19T23:58:19Z\" pubdate=\"pubdate\" class=\"relativedate\" title=\"Sun, 19 Mar 2006 18:58:19 -0500\">Sunday evening, March 19th, 2006</time> -->\n</div>\n\n</nav>\n\n\n<!-- from joey -->\n</footer>\n\n</article>\n\n</body>\n</html>\n\" title=\"Thu, 22 Sep 2016 16:13:21 -0400\">at teatime on Thursday, September 22nd, 2016</time>\n\n</span>\n</div>\n<div class=\"archivepage\">\n\n<a href=\"./blog/entry/PoW_bucket_bloom/\">PoW bucket bloom: throttling anonymous clients with proof of work, token buckets, and bloom filters</a><br />\n\n<span class=\"archivepagedate\">\nPosted <time datetime=\"2016-09-13T05:14:47Z\" pubdate=\"pubdate\" class=\"relativedate\" title=\"Tue, 13 Sep 2016 01:14:47 -0400\">late Monday night, September 13th, 2016</time>\n\n</span>\n</div>\n<div class=\"archivepage\">\n\n<a href=\"./blog/entry/late_summer/\">late summer</a><br />\n\n<span class=\"archivepagedate\">\nPosted <time datetime=\"2016-08-31T01:15:40Z\" pubdate=\"pubdate\" class=\"relativedate\" title=\"Tue, 30 Aug 2016 21:15:40 -0400\">late Tuesday evening, August 30th, 2016</time>\n\n</span>\n</div>\n\n\n\n\n\n</aside>\n\n\n\n<div id=\"pagebody\">\n\n<section id=\"content\" role=\"main\">\n\n\n<div>\n<style>\nhtml { \n background: url(joeykite.jpg) no-repeat center center fixed; \n -webkit-background-size: cover;\n -\x00r\x00\x00\x00e\x00\x00\x00t\x00\x00\x00u\x00\x00\x00r\x00\x00\x00n\x00\x00\x00e\x00\x00\x00d\x00\x00\x00 \x00\x00\x00a\x00\x00\x00s\x00\x00\x00 \x00\x00\x00a\x00\x00\x00 \x00\x00\x00b\x00\x00\x00y\x00\x00\x00t\x00\x00\x00e\x00\x00\x00v\x00\x00\x00e\x00\x00\x00c\x00\x00\x00t\x00\x00\x00o\x00\x00\x00r\x00\x00\x00.\x00\x00\x00\n\x00\x00\x00\n\x00\x00\x00H\x00\x00\x00o\x00\x00\x00w\x00\x00\x00e\x00\x00\x00v\x00\x00\x00e\x00\x00\x00r\x00\x00\x00,\x00\x00\x00 \x00\x00\x00i\x00\x00\x00f\x00\x00\x00 \x00\x00\x00S\x00\x00\x00T\x00\x00\x00R\x00\x00\x00E\x00\x00\x00A\x00\x00\x00M\x00\x00\x00I\x00\x00\x00N\x00\x00\x00G\x00\x00\x00?\x00\x00\x00 \x00\x00\x00i\x00\x00\x00s\x00\x00\x00 \x00\x00\x00t\x00\x00\x00r\x00\x00\x00u\x00\x00\x00e\x00\x00\x00,\x00\x00\x00 \x00\x00\x00i\x00\x00\x00n\x00\x00\x00s\x00\x00\x00t\x00\x00\x00e\x00\x00\x00a\x00\x00\x00d\x00\x00\x00 \x00\x00\x00o\x00\x00\x00f\x00\x00\x00 \x00\x00\x00e\x00\x00\x00a\x00\x00\x00g\x00\x00\x00e\x00\x00\x00r\x00\x00\x00l\x00\x00\x00y\x00\x00\x00 \x00\x00\x00r\x00\x00\x00e\x00\x00\x00a\x00\x00\x00d\x00\x00\x00i\x00\x00\x00n\x00\x00\x00g\x00\x00\x00 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00e\x00\x00\x00 \x00\x00\x00r\x00\x00\x00e\x00\x00\x00s\x00\x00\x00p\x00\x00\x00o\x00\x00\x00n\x00\x00\x00s\x00\x00\x00e\x00\x00\x00\n\x00\x00\x00b\x00\x00\x00o\x00\x00\x00d\x00\x00\x00y\x00\x00\x00 \x00\x00\x00f\x00\x00\x00r\x00\x00\x00o\x00\x00\x00m\x00\x00\x00 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00e\x00\x00\x00 \x00\x00\x00s\x00\x00\x00e\x00\x00\x00r\x00\x00\x00v\x00\x00\x00e\x00\x00\x00r\x00\x00\x00,\x00\x00\x00 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00i\x00\x00\x00s\x00\x00"
(truncated a bit)
First of all, the response body starts in the wrong place... it should
start with "<!DOCTYPE html>". Then, somewhere in the middle it switches
to garbage output. I'm not sure why.
Again, it's fine before being wrapped in the custom-binary-input/output-port.
So either it's my fault (could well be) or there's a bug in the
custom-binary-input/output-port implementation. I feel like I don't
know enough to be sure. I would assume it's on my end, but since I
think this is the first major use of that interface, a bug seems hardly
impossible.
Anyway, to test this bug you'll need to have gnutls compiled with a
newer Guile. I've attached the hacky guix package I'm using to test
this. Then you'll want to do:
$ guix environment --ad-hoc gnutls-with-guile-next guile-next
(You need guile-next even if doing gnutls-with-guile-next in the
environment to enable the Guile 2.2 paths.)
I could use some help on this... I'm afraid that if I've done something
wrong, I'm not knowledgeable enough to know how to get out of the problem.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-web-Add-https-support-through-gnutls.patch --]
[-- Type: text/x-patch, Size: 9746 bytes --]
From 91c0a4a728ca4bf2e9468cdc849c350dd3f7380f Mon Sep 17 00:00:00 2001
From: Christopher Allan Webber <cwebber@dustycloud.org>
Date: Thu, 17 Sep 2015 15:14:54 -0500
Subject: [PATCH] web: Add https support through gnutls.
Since importing gnutls directly would result in a dependency cycle,
we load gnutls lazily.
This uses code originally written for Guix by Ludovic
* module/web/client.scm: (%http-receive-buffer-size)
(warn-no-gnutls-return-false, gnutls-module, ensure-gnutls)
(gnutls-ref, tls-wrap): New variables.
(open-socket-for-uri): Wrap in tls when uri scheme is https.
* doc/ref/web.texi (open-socket-for-uri): Document gnutls usage.
---
doc/ref/web.texi | 6 +-
module/web/client.scm | 175 +++++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 158 insertions(+), 23 deletions(-)
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index becdc28..c2f3f61 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -1422,7 +1422,11 @@ the lower-level HTTP, request, and response modules.
@end example
@deffn {Scheme Procedure} open-socket-for-uri uri
-Return an open input/output port for a connection to URI.
+Return an open input/output port for a connection to URI. Guile
+dynamically loads gnutls for https support; for more information, see
+@xref{Guile Preparations,
+how to install the GnuTLS bindings for Guile,, gnutls-guile,
+GnuTLS-Guile}.
@end deffn
@deffn {Scheme Procedure} http-get uri arg...
diff --git a/module/web/client.scm b/module/web/client.scm
index f24a4d7..f0fba49 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -1,6 +1,6 @@
;;; Web client
-;; Copyright (C) 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
+;; Copyright (C) 2011, 2012, 2013, 2014, 2015, 2016 Free Software Foundation, Inc.
;; This library is free software; you can redistribute it and/or
;; modify it under the terms of the GNU Lesser General Public
@@ -43,8 +43,11 @@
#:use-module (srfi srfi-1)
#:use-module (srfi srfi-9)
#:use-module (srfi srfi-9 gnu)
+ #:use-module ((rnrs io ports)
+ #:prefix rnrs-ports:)
#:export (current-http-proxy
open-socket-for-uri
+ open-connection-for-uri
http-get
http-get*
http-head
@@ -54,11 +57,113 @@
http-trace
http-options))
+(define %http-receive-buffer-size
+ ;; Size of the HTTP receive buffer.
+ 65536)
+
+;; Provide access to the gnutls-module, but fail gracefully if not available.
+;; Why take this route and not just straight up import the module?
+;; Guile can't depend on gnutls because gnutls includes Guile as a dependency.
+;; There's some risk of dependency cycles, so lazily resolving things only
+;; once needed helps!
+
+(define warn-no-gnutls-return-false
+ (lambda _
+ (format (current-error-port)
+ "warning: (gnutls) module not available\n")
+ #f))
+
+(define gnutls-module
+ (delay
+ (catch 'misc-error
+ (lambda ()
+ (let ((module (resolve-interface '(gnutls))))
+ ;; In some 2.1/2.2 installations installed alongside Guile 2.0, gnutls
+ ;; can be imported but the bindings are broken as "unknown type".
+ ;; Here we check that gnutls-version is the right type (a procedure)
+ ;; to make sure the bindings are ok.
+ (if (procedure? (module-ref module 'gnutls-version))
+ module
+ (warn-no-gnutls-return-false))))
+ warn-no-gnutls-return-false)))
+
+(define (ensure-gnutls)
+ (if (not (force gnutls-module))
+ (throw 'gnutls-not-available "(gnutls) module not available")))
+
+(define (gnutls-ref symbol)
+ "Fetch method-symbol from the gnutls module"
+ (module-ref (force gnutls-module) symbol))
+
(define current-http-proxy
(make-parameter (let ((proxy (getenv "http_proxy")))
(and (not (equal? proxy ""))
proxy))))
+(define (tls-wrap port server)
+ "Return PORT wrapped in a TLS connection to SERVER. SERVER must be a DNS
+host name without trailing dot."
+ (define (log level str)
+ (format (current-error-port)
+ "gnutls: [~a|~a] ~a" (getpid) level str))
+
+ (ensure-gnutls)
+
+ (let ((session ((gnutls-ref 'make-session)
+ (gnutls-ref 'connection-end/client))))
+
+ ;; Some servers such as 'cloud.github.com' require the client to support
+ ;; the 'SERVER NAME' extension. However, 'set-session-server-name!' is
+ ;; not available in older GnuTLS releases. See
+ ;; <http://bugs.gnu.org/18526> for details.
+ (if (module-defined? (force gnutls-module)
+ 'set-session-server-name!)
+ ((gnutls-ref 'set-session-server-name!)
+ session (gnutls-ref 'server-name-type/dns) server)
+ (format (current-error-port)
+ "warning: TLS 'SERVER NAME' extension not supported~%"))
+
+ ((gnutls-ref 'set-session-transport-fd!) session (fileno port))
+ ((gnutls-ref 'set-session-default-priority!) session)
+
+ ;; The "%COMPAT" bit allows us to work around firewall issues (info
+ ;; "(gnutls) Priority Strings"); see <http://bugs.gnu.org/23311>.
+ ;; Explicitly disable SSLv3, which is insecure:
+ ;; <https://tools.ietf.org/html/rfc7568>.
+ ((gnutls-ref 'set-session-priorities!) session "NORMAL:%COMPAT:-VERS-SSL3.0")
+
+ ((gnutls-ref 'set-session-credentials!) session
+ ((gnutls-ref 'make-certificate-credentials)))
+
+ ;; Uncomment the following lines in case of debugging emergency.
+ ;;(set-log-level! 10)
+ ;;(set-log-procedure! log)
+
+ ((gnutls-ref 'handshake) session)
+ (let ((record ((gnutls-ref 'session-record-port) session)))
+ (define (read! bv start count)
+ (define read-bv (get-bytevector-n record count))
+ (if (eof-object? read-bv)
+ 0 ; read! returns 0 on eof-object
+ (let ((read-bv-len (bytevector-length read-bv)))
+ (bytevector-copy! read-bv 0 bv 0 read-bv-len)
+ read-bv-len)))
+ (define (write! bv start count)
+ (put-bytevector record bv start count)
+ count)
+ (define (get-position)
+ (rnrs-ports:port-position record))
+ (define (set-position! new-position)
+ (rnrs-ports:set-port-position! record new-position))
+ (define (close)
+ (unless (port-closed? port)
+ (close-port port))
+ (unless (port-closed? record)
+ (close-port record)))
+ (make-custom-binary-input/output-port "gnutls wrapped port" read! write!
+ get-position set-position!
+ close))))
+
(define (ensure-uri uri-or-string)
(cond
((string? uri-or-string) (string->uri uri-or-string))
@@ -81,27 +186,53 @@
0))
(lambda (ai1 ai2)
(equal? (addrinfo:addr ai1) (addrinfo:addr ai2))))))
-
- (let loop ((addresses addresses))
- (let* ((ai (car addresses))
- (s (with-fluids ((%default-port-encoding #f))
- ;; Restrict ourselves to TCP.
- (socket (addrinfo:fam ai) SOCK_STREAM IPPROTO_IP))))
- (catch 'system-error
- (lambda ()
- (connect s (addrinfo:addr ai))
-
- ;; Buffer input and output on this port.
- (setvbuf s 'block)
- ;; If we're using a proxy, make a note of that.
- (when http-proxy (set-http-proxy-port?! s #t))
- s)
- (lambda args
- ;; Connection failed, so try one of the other addresses.
- (close s)
- (if (null? (cdr addresses))
- (apply throw args)
- (loop (cdr addresses))))))))
+ (define https?
+ (eq? 'https (uri-scheme uri)))
+ (define (open-socket)
+ (let loop ((addresses addresses))
+ (let* ((ai (car addresses))
+ (s (with-fluids ((%default-port-encoding #f))
+ ;; Restrict ourselves to TCP.
+ (socket (addrinfo:fam ai) SOCK_STREAM IPPROTO_IP))))
+ (catch 'system-error
+ (lambda ()
+ (connect s (addrinfo:addr ai))
+
+ ;; Buffer input and output on this port.
+ (setvbuf s 'block)
+ ;; If we're using a proxy, make a note of that.
+ (when http-proxy (set-http-proxy-port?! s #t))
+ s)
+ (lambda args
+ ;; Connection failed, so try one of the other addresses.
+ (close s)
+ (if (null? (cdr addresses))
+ (apply throw args)
+ (loop (cdr addresses))))))))
+
+ (let-syntax ((with-https-proxy
+ (syntax-rules ()
+ ((_ exp)
+ ;; For HTTPS URIs, honor 'https_proxy', not 'http_proxy'.
+ ;; FIXME: Proxying is not supported for https.
+ (let ((thunk (lambda () exp)))
+ (if (and https?
+ current-http-proxy)
+ (parameterize ((current-http-proxy #f))
+ (when (and=> (getenv "https_proxy")
+ (negate string-null?))
+ (format (current-error-port)
+ "warning: 'https_proxy' is ignored~%"))
+ (thunk))
+ (thunk)))))))
+ (with-https-proxy
+ (let ((s (open-socket)))
+ ;; Buffer input and output on this port.
+ (setvbuf s _IOFBF %http-receive-buffer-size)
+
+ (if https?
+ (tls-wrap s (uri-host uri))
+ s)))))
(define (extend-request r k v . additional)
(let ((r (set-field r (request-headers)
--
2.10.2
next prev parent reply other threads:[~2016-11-06 18:32 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-26 15:55 bug#24075: tls/https support in Guile (through r6rs binary ports?) Christopher Allan Webber
2016-08-04 20:37 ` Andy Wingo
2016-08-21 15:58 ` Christopher Allan Webber
2016-11-05 18:39 ` Christopher Allan Webber
2016-11-05 19:02 ` Ludovic Courtès
2016-11-06 17:37 ` Christopher Allan Webber
2016-11-06 21:13 ` Ludovic Courtès
2016-11-07 18:14 ` Christopher Allan Webber
2016-11-07 20:13 ` Ludovic Courtès
2016-11-06 18:32 ` Christopher Allan Webber [this message]
2016-11-06 21:06 ` Ludovic Courtès
2016-11-07 4:40 ` Christopher Allan Webber
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k2cg5uih.fsf@dustycloud.org \
--to=cwebber@dustycloud.org \
--cc=24075@debbugs.gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).