unofficial mirror of guile-user@gnu.org 
From: swedebugia <swedebugia@riseup.net>
To: Roel Janssen <roel@gnu.org>
Cc: guile-user <guile-user-bounces+swedebugia=riseup.net@gnu.org>,
	guile-user@gnu.org
Subject: Re: Trouble parsing a response (Was: Re: New library: guile-wikidata)
Date: Thu, 3 Jan 2019 13:25:37 +0100
Message-ID: <568bae0a-0012-bd5e-d251-9b03aee23245@riseup.net>
In-Reply-To: <7ebe1814-8aff-2d09-dc54-f978b4d16fe0@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 6646 bytes --]

On 2018-12-13 23:03, Roel Janssen wrote:
> 
> 
> On 13-12-18 17:06, swedebugia@riseup.net wrote:
>> On 2018-12-13 16:01, swedebugia@riseup.net wrote:
>> snip
>>
>>>
>>> I tried with the file attached but got this because the driver does not
>>> support URIs but only host, port, type, token:
>>
>> Ah, I saw now that you already implemented URI on master :)
>> https://github.com/roelj/guile-sparql/blob/master/sparql/driver.scm
>>
>> When I try calling this
>> ;; Example query to wikidata listing cats
>> (sparql-query
>>   "SELECT ?item
>> WHERE
>> {
>> ?item wdt:P31 wd:Q146.
>> }
>> LIMIT 10
>> "
>>   #:uri "https://query.wikidata.org/sparql"
>>   ;; #:port 80
>>   #:type "application/sparql-results+json"
>>   ;;  #:token "..."
>>   #:store-backend 'blazegraph
>>   )
>>
>> I get this fine result:
>> #<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers:
>> ((date . #<date nanosecond: 0 second: 12 minute: 32 hour: 15 day: 13
>> month: 12 year: 2018 zone-offset: 0>) (content-type
>> application/sparql-results+json (charset . "utf-8")) (transfer-encoding
>> (chunked)) (connection close) (server . "nginx/1.13.6") (x-served-by .
>> "wdqs1005") (access-control-allow-origin . "*") (cache-control public
>> (max-age . 300)) (vary accept accept-encoding) (x-varnish . "644531744,
>> 572094009, 417977651") (via "1.1 varnish (Varnish/5.1)" "1.1 varnish
>> (Varnish/5.1)" "1.1 varnish (Varnish/5.1)") (accept-ranges bytes) (age .
>> 0) (x-cache . "cp1079 pass, cp3030 pass, cp3030 pass") (x-cache-status .
>> "pass") (server-timing . "cache;desc=\"pass\"")
>> (strict-transport-security . "max-age=106384710; includeSubDomains;
>> preload") (set-cookie .
>> "WMF-Last-Access=13-Dec-2018;Path=/;HttpOnly;secure;Expires=Mon, 14 Jan
>> 2019 12:00:00 GMT") (set-cookie .
>> "WMF-Last-Access-Global=13-Dec-2018;Path=/;Domain=.wikidata.org;HttpOnly;secure;Expires=Mon,
>> 14 Jan 2019 12:00:00 GMT") (x-analytics . "https=1;nocookies=1")
>> (x-client-ip . "83.185.90.53")) port: #<input-output: file 8c37e70>>
>>
>> My problem now is that I don't know how to separate the header from the
>> port-file.
>>
>> Ah, reading here
>> https://www.gnu.org/software/guile/manual/html_node/Responses.html#Responses 
>>
>> I found (response-port).
>>
> 
> Here's what I did in SPARQLing-genomics:
> https://github.com/UMCUGenetics/sparqling-genomics/blob/master/web/www/pages/query-response.scm#L86 
> 
> 
> So basically:
> 
> (use-modules
>    (ice-9 receive)
>    (ice-9 rdelim)
>    (web response))
> 
> (receive (header port)
>    ;; Note: "text/csv" is the only format that is consistent for
>    ;; multiple SPARQL back-ends (Virtuoso, BlazeGraph, ...)
>    (sparql-query ... #:type "text/csv")
>    (if (= (response-code header) 200) ; This means the query went OK.
>      (call-some-function port)
>      #f)) ; Deal with errors at the #f.
> 
> (define (call-some-function port)
>    (let ((line (read-line port)))
>      (if (eof-object? line)
>        #t
>        (begin
>          (format #t "Line: ~a~%" line)
>          ;; Tail-recurse until we have processed each line.
>          (call-some-function port)))))
> 
> The SPARQLing-genomics code deals with more error codes, and processes 
> the lines in a more useful way.
> 
>> Unfortunately this only took me one step further as I run into this
>> instead when trying to parse the port with (json->scm):
>>
>> Backtrace:
>>             7 (apply-smob/1 #<catch-closure 9769550>)
>> In ice-9/boot-9.scm:
>>      705:2  6 (call-with-prompt _ _ #<procedure default-prompt-handle…>)
>> In ice-9/eval.scm:
>>      619:8  5 (_ #(#(#<directory (guile-user) 9759910>)))
>> In ice-9/boot-9.scm:
>>     2312:4  4 (save-module-excursion _)
>>    3831:12  3 (_)
>> In sdb-test.scm:
>>       24:1  2 (_)
>> In json/parser.scm:
>>     311:18  1 (json-read-number _)
>>     148:28  0 (read-number _)
>>
>> json/parser.scm:148:28: In procedure read-number:
>> Throw to key `json-invalid' with args `(#<json-parser port:
>> #<input-output: string 98dea80>>)'.
>>
>> Maybe this is a bug in (json)?
> 
> It looks like the JSON response is not (only) JSON, or simply invalid.
> Maybe the "text/xml" or "text/csv" content-type will work better for 
> you.  I noticed that each back-end provides their own structure for XML 
> and JSON, so I used the somewhat quirky CSV format as a work-for-all 
> response type.
> 
> I hope this helps.

You were right!

After much trial and error, my debugging ended with this:

$ env |grep ssl
GIT_SSL_CAINFO=/home/egil/.guix-profile/etc/ssl/certs/ca-certificates.crt
SSL_CERT_DIR=/home/egil/.guix-profile/etc/ssl/certs
$ guile --version
guile (GNU Guile) 2.2.4
$ guix --version
guix (GNU Guix) 0.16.0 <- installed from binary 0.16 on parabola.

$ guile -s test2.scm
Line: 1aa
Line: item
Line: http://www.wikidata.org/entity/Q28114532
Line: http://www.wikidata.org/entity/Q28114535
Line: http://www.wikidata.org/entity/Q28665865
Line: http://www.wikidata.org/entity/Q28792126
Line: http://www.wikidata.org/entity/Q30600575
Line: http://www.wikidata.org/entity/Q42442324
Line: http://www.wikidata.org/entity/Q43260736
Line: http://www.wikidata.org/entity/Q48895080
Line: http://www.wikidata.org/entity/Q49581026
Line: http://www.wikidata.org/entity/Q50378472
Line:
Line: 0
Line:
Backtrace:
            9 (apply-smob/1 #<catch-closure 18835c0>)
In ice-9/boot-9.scm:
     705:2  8 (call-with-prompt _ _ #<procedure default-prompt-handler (k proc)>)
In ice-9/eval.scm:
     619:8  7 (_ #(#(#<directory (guile-user) 18f2140>)))
In ice-9/boot-9.scm:
    2312:4  6 (save-module-excursion _)
   3831:12  5 (_)
In test2.scm:
     51:14  4 (read #<input-output: string 1c22d20>)
In ice-9/rdelim.scm:
    195:24  3 (read-line _ _)
In unknown file:
            2 (%read-line #<input-output: string 1c22d20>)
In web/client.scm:
    142:24  1 (read! #vu8(48 13 10 13 10 103 47 101 110 116 105 116 121 47 81 52 56 56 57 53 48 56 48 13 10 104 116 116 112 58 47 47 119 119 119 46 119 ?) ?)
In unknown file:
            0 (get-bytevector-some #<input-output: string 1cf51c0>)

ERROR: In procedure get-bytevector-some:
Throw to key `gnutls-error' with args `(#<gnutls-error-enum The TLS connection was non-properly terminated.> read_from_session_record_port)'.

Can anyone reproduce this? (Run the attached test2.scm.)
Is this a bug in Guile?
How do I ignore this error?
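
One possible reading of the output above, offered as a sketch rather than
a confirmed diagnosis: the "1aa" and "0" lines look like HTTP chunk-size
markers. 0x1aa is 426, which matches the "item" line (6 bytes with CRLF)
plus ten result lines of 42 bytes each, and the response quoted earlier in
this thread carried (transfer-encoding (chunked)). If so, the loop is
reading the raw connection port and keeps reading past the final chunk,
which is where the TLS error shows up. The example below runs offline on a
hand-made chunked body and compares a raw read with a read through
make-chunked-input-port from (web http). The raw-body string and the
helper names read-lines and bytes-port are made up for illustration.

```scheme
(use-modules (ice-9 rdelim)
             (ice-9 binary-ports)
             (rnrs bytevectors)
             (web http))

;; Hand-made chunked body (an assumption for illustration): one 12-byte
;; chunk (hex "c") followed by the terminating zero-length chunk.
(define raw-body "c\r\nitem\r\nQ146\r\n\r\n0\r\n\r\n")

(define (read-lines port)
  ;; Collect every line on PORT, dropping the trailing carriage return.
  (let loop ((acc '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse acc)
          (loop (cons (string-trim-right line #\return) acc))))))

(define (bytes-port str)
  ;; Wrap STR in a fresh binary input port.
  (open-bytevector-input-port (string->utf8 str)))

;; Reading the framing as if it were data, as the loop in test2.scm does.
(define raw-lines (read-lines (bytes-port raw-body)))

;; Letting (web http) strip the chunk framing first.
(define decoded-lines
  (read-lines (make-chunked-input-port (bytes-port raw-body))))

(format #t "raw:     ~s~%" raw-lines)
(format #t "decoded: ~s~%" decoded-lines)
```

The raw read should include the size markers and empty lines, while the
decoded read should stop cleanly after the last data line, without asking
the underlying port for more bytes.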

-- 
Cheers Swedebugia

[-- Attachment #2: test2.scm --]
[-- Type: text/x-scheme, Size: 2011 bytes --]

;; Example query to wikidata listing cats                               
(use-modules
 (ice-9 receive)
 (ice-9 rdelim)
 (ice-9 textual-ports)
 (web response)
 (web client)
 (web uri))

(define q2
  "
SELECT ?item
WHERE                                                                   
{                                                                       
?item wdt:P31 wd:Q146.                                                  
}                                                                       
LIMIT 10                                                                
")
(define post-url "https://query.wikidata.org/sparql")

(define json "application/sparql-results+json")
(define csv "text/csv") 
(define type csv)

(define* (old-url-encoding input #:optional (index 0) (output ""))
  (if (< (string-length input) 3)
      (string-append output input)
      (let ((triple (substring/read-only input index (+ index 3))))
        (if (string= triple "%20")
            (old-url-encoding (string-drop input 3)
                              0
                              (string-append output "+"))
            (old-url-encoding (string-drop input 1)
                              0
                              (string-append output
                                             (string
                                              (string-ref input index))))))))

(define r
  (let ((query q2))
   (http-post post-url
              #:body (string-append "query=" (old-url-encoding
                                              (uri-encode query)))
              #:streaming? #t
              #:headers
              `((user-agent   . "GNU Guile")
                (content-type . (application/x-www-form-urlencoded))
                (accept       . ((,(string->symbol type))))))))

(define (read port)
  (let ((line (read-line port)))
    (if (eof-object? line)
        #t
        (begin
          (format #t "Line: ~a~%" line)
          ;; Tail-recurse until we have processed each line.
          (read port)))))

(read (response-port r))
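
A possible variant of the reading loop, as a sketch under assumptions
rather than a verified fix: with #:streaming? #t, http-post returns the
response and a body port as two values, and the version below reads from
that second value instead of (response-port r). It also treats a
gnutls-error as end of input, which answers "how do I ignore this error"
only in the crudest sense, since it would also hide a genuinely truncated
response. The names port->lines and fetch-lines are hypothetical, and
fetch-lines is defined but not called here because it needs network access.

```scheme
(use-modules (ice-9 rdelim)
             (ice-9 receive)
             (web client))

(define (port->lines port)
  ;; Return the lines on PORT as a list, without trailing carriage
  ;; returns.  A gnutls-error raised while reading ends the loop, just
  ;; like end of file does.
  (let loop ((acc '()))
    (let ((line (catch 'gnutls-error
                  (lambda () (read-line port))
                  (lambda (key . args) #f))))
      (if (or (not line) (eof-object? line))
          (reverse acc)
          (loop (cons (string-trim-right line #\return) acc))))))

(define (fetch-lines url body headers)
  ;; POST BODY to URL and return the response body as a list of lines.
  ;; The second value of http-post is the body port when #:streaming? is
  ;; true; the response object itself is not used here.
  (receive (response body-port)
      (http-post url #:body body #:headers headers #:streaming? #t)
    (port->lines body-port)))

;; Offline check of the line reader on a string port.
(format #t "~s~%" (port->lines (open-input-string "item\r\nQ146\r\n")))
```

In test2.scm this would replace the final (read (response-port r)) call,
with fetch-lines receiving post-url, the "query=..." body, and the same
header list.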

