all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* New library: guile-wikidata
@ 2018-12-04 22:03 swedebugia
  2018-12-05  8:55 ` Alex Sassmannshausen
  0 siblings, 1 reply; 2+ messages in thread
From: swedebugia @ 2018-12-04 22:03 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 2207 bytes --]

Hi

I worked hard for a few hours playing with guile and came up with this.

Next step is to implement looking up the entities for the result and
filter on instance of (the properties). E.g. software.

:D

What do you think?
I tried reuse code and abstract away in a a la Guile. This is my first
program so feel free to give it a review and ideas for improvement.

It depends on (guix import json) right now and I think we should
upstream those two json-to-alist functions to guile-json.

Example output:

scheme@(guile-user) [25]> (show-first-result "gcc")
1.: Q178940: GNU Compiler Collection: compiler system with support for
various programming languages

scheme@(guile-user) [25]> (show-first-result "emacs")
1.: Q189722: Emacs: family of text editors

scheme@(guile-user) [25]> (show-first-result "guile")
1.: Q601244: Guile: character from the Street Fighter fighting game
series

scheme@(guile-user) [25]> (show-first-result "openssh")
1.: Q847062: OpenSSH: set of computer programs providing encrypted
communication sessions

scheme@(guile-user) [25]> (show-first-result "automake")
1.: Q1324275: Automake: tool for generating GNU Standards-compliant
Makefiles


Not so reliable output for bioinformatics packages:

scheme@(guile-user) [25]> (show-first-result "aragorn")
1.: Q180322: Aragorn: character from the Lord of the Rings
$19 = #t
scheme@(guile-user) [25]> (show-first-result "bamm")
1.: Q12819323: Bamm: Uzbek name of deep and dark sounds of stringed
instruments
$20 = #t
scheme@(guile-user) [25]> (show-first-result "bamtools")
1.: Q30426432: BamTools: a C++ API and toolkit for analyzing and
managing BAM files.: scientific article
$21 = #t
scheme@(guile-user) [25]> (show-first-result "bcftools")
1.: Q31041251: BCFtools/RoH: a hidden Markov model approach for
detecting autozygosity from next-generation sequencing data.:\
 scientific article
$22 = #t
scheme@(guile-user) [25]> (show-first-result "bedops")
1.: Q36076319: BEDOPS: high-performance genomic feature operations.:
scientific article
$23 = #t
scheme@(guile-user) [25]> (show-first-result "blast+")
1.: Q179057: explosion: sudden release of energy through high
temperatures and gas expansion
$24 = #t


-- 
Cheers
Swedebugia

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: wd.scm --]
[-- Type: text/plain; charset=us-ascii; name=wd.scm, Size: 6482 bytes --]

;;; Copyright © 2018 swedebugia <swedebugia@riseup.net>
;;;
;;; This file is part of guile-wikidata.
;;;
;;; guile-wikidata is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; guile-wikidata is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with guile-wikidata.  If not, see <http://www.gnu.org/licenses/>.

(use-modules (ice-9 format)
	     (json)
	     (guix import json)
	     (srfi srfi-1)
	     (web uri)
	     )

;; Inspired by PYPI wikidata_suggest
(define (wbsearch-uri name)
  "Build URI for the Wikidata wbsearchintities API."
  (let ((url "https://www.wikidata.org/w/api.php")
	(& "&")
	(= "=")
	(search name)
	(action "wbsearchentities")
	(format "json")
	;; Hardcoded en for now
	(language "en")
	(type "item")
	(continue "0")
	(limit "10"))
    (string->uri
     (string-append url "?"
		    "search" = search &
		    "action" = action &
		    "format" = format &
		    "language" = language &
		    "type" = type &
		    "continue" = continue &
		    "limit" = limit
		    ))))

;; Inspired by
;; https://opendata.stackexchange.com/questions/5248/how-to-get-the-name-of-a-wikidata-item
(define (wbget-uri qid)
  "Build URI for the Wikidata wbsearchintities API."
  (let ((url "https://www.wikidata.org/w/api.php")
	(& "&")
	(= "=")
	(ids qid)
	(action "wbgetentities")
	(format "json")
	;; Hardcoded en for now
	(language "en"))
    (string->uri
     (string-append url "?"
		    "ids" = ids &
		    "action" = action &
		    "format" = format &
		    "language" = language &
		    ))))

(define (wdquery uri)
  "Fetch the data, return an alist"
  (json-fetch-alist uri))

(define (extract-qid alist)
  "Accept unnested alist and output Q-ID."
  (if (list? alist)
      (assoc-ref alist "id")
      (error "extract-qid: Not a proper list" alist)))

(define (extract-label alist)
  "Accept unnested ALIST and output label."
  (if (list? alist)
      (assoc-ref alist "label")
      (begin
	(error "extract-label: Not a proper list" )
	(display alist))))

(define (extract-desc alist)
  "Accept unnested ALIST and output label."
  (if (list? alist)
      ;; TODO add error handling when no descriptions
      (assoc-ref alist "description")
      (begin
	(error "extract-desc: Not a proper list" )
	(display alist))))

(define (extract-all alist)
  (if (list? alist)
      `(("label" . ,(extract-label alist))
       ("description" . ,(extract-desc alist))
       ("id" . ,(extract-qid alist)))
      (begin
	(error "extract-all: Not a proper list" )
	(display alist))))

(define (extract-result name)
  "Returns list with each element being an alist of label, desc, qid"
  (map extract-all (assoc-ref (wdquery (wbsearch-uri name)) "search")))

;; BROKEN for some reason
;; (define (output-field qid)
;;   "Generate output field with Q-ID for the package record."
;;   (if (string? qid)
;;       `(wikidata ,qid)
;;       ;;(error "Not a string")
;;       ;; Parse as pair
;;       (if (pair? qid)
;; 	  (let ((qid (cdr qid)))
;; 	    `(wikidata ,qid))
;; 	  ;;Not a pair?
;; 	  ;; debug
;; 	  (format #t "debug qid: ~a ~%"  qid)
;; 	  )))

(define (extract-first-result alist)
  "Extract first result from nested alist"
  ;; TODO add error if no results
  (if (equal? (first (first alist)) "success")
      (first (assoc-ref alist "search"))
      (error "success not found"))
  )

;; Broken for some reason :-/
;; (define (parse-results query num)
;;   (let lp
;;       ;;Variable
;;       ((num))
;;     ;;Bindings?
;;     (if (> num 0)
;; 	;;Body
;; 	(let (; Reverse list first so they end up in the right order.
;; 	      (result (take (reverse (extract-result query)) num)))
;; 	  (format #t "~a: ~a: ~a ~%" (number->string num) (extract-label result) (extract-desc result)))
;; 	;; Subtract
;; 	(lp (- num 1))
;; 	;; Tail
;; 	num)))

;;test
;; (parse-results "emacs" 2)

(define (show-first-result query)
  ;; First result which is often the right one :)
  (let ((result (first (extract-result query))))
    (format #t "1.: ~a: ~a: ~a ~%"
    	    (extract-qid result) (extract-label result) (extract-desc result))
    ))

;;testing :)
;;(display "first result:") (newline)
(show-first-result "openssh")
;; (newline)

;; (show-first-result "xorg")
;; (newline)
;; (display "extract-label") (newline)
;; (display (extract-label (first (assoc-ref (wdquery (wbsearch-uri "xorg")) "search"))))
;; (newline)
;; (output-field (extract-qid (first (assoc-ref (wdquery (wbsearch-uri "xorg")) "search"))))

;; (format #t "parse-results: ~%")
;; (parse-results "gnome3" 1)
;; (newline)

;; (format #t "take: ~%")


;; (take (extract-result "gnome") 1)

;; (define (show-3 package-name)
;;   "Correlate PACKAGE-NAME with Wikidata Q-IDs and add field to package record for all matches."
;;   (let* ((q1 (take (assoc-ref (wdquery (wbsearch-uri package-name)) "search")) 2)
;; 	 (num 2))
;;     (while ((>= num 0))
;;       (let (((take q1 num)
;; 	     (l (extract-label q1))
;; 	    (d (extract-desc q1))
;; 	    (id (extract-qid q1))))
;;        ;; descriptions are sometimes missing
;;        (if (d)
;; 	   (format #t "~a : ~a : ~a ~%" l d id)
;; 	   (format #t "~a : ~a ~%" l id))
;;        ;;debug
;;        (display "debug: ")
;;        (display q1) (newline)
;;        ;;(format #t "~a ~%"(string-append t ":" d))
;;        ;;(format #t "~a ~%"(string-append t ":" id))
;;        )
;;       (- num 1))))

;; (format #t "Show-3: ~%")
;; (show-3 "gnome3")


;;(assoc-ref (assoc-ref query "search") "label")

;; def _wikipedia(name, lang='en'):
;; url = "https://%s.wikipedia.org/w/api.php" % lang
;; params = {
;; 	  "action": "query",
;; 	  "list": "search",
;; 	  "format": "json",
;; 	  "srnamespace": "0",
;; 	  "srsearch": name
;; 	  }
;; sug = None
;; results = requests.get(url, params=params).json()
;; if len(results['query']['search']) > 0:
;; sug = results['query']['search'][0]['title']
;; elif 'suggestion' in results['query']['searchinfo'] and \
;; name != results['query']['searchinfo']['suggestion']:
;; sug = _wikipedia(results['query']['searchinfo']['suggestion'], lang)
;; return sug
	  

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: New library: guile-wikidata
  2018-12-04 22:03 New library: guile-wikidata swedebugia
@ 2018-12-05  8:55 ` Alex Sassmannshausen
  0 siblings, 0 replies; 2+ messages in thread
From: Alex Sassmannshausen @ 2018-12-05  8:55 UTC (permalink / raw)
  To: swedebugia; +Cc: guix-devel

Hi Swedebugia,

This looks pretty cool! Congrats on getting this far :-)

I would love to be able to have a nice high-level library to introduce
Wikipedia info in my programs, so please continue!

Alex

swedebugia@riseup.net writes:

> Hi
>
> I worked hard for a few hours playing with guile and came up with this.
>
> Next step is to implement looking up the entities for the result and
> filter on instance of (the properties). E.g. software.
>
> :D
>
> What do you think?
> I tried reuse code and abstract away in a a la Guile. This is my first
> program so feel free to give it a review and ideas for improvement.
>
> It depends on (guix import json) right now and I think we should
> upstream those two json-to-alist functions to guile-json.
>
> Example output:
>
> scheme@(guile-user) [25]> (show-first-result "gcc")
> 1.: Q178940: GNU Compiler Collection: compiler system with support for
> various programming languages
>
> scheme@(guile-user) [25]> (show-first-result "emacs")
> 1.: Q189722: Emacs: family of text editors
>
> scheme@(guile-user) [25]> (show-first-result "guile")
> 1.: Q601244: Guile: character from the Street Fighter fighting game
> series
>
> scheme@(guile-user) [25]> (show-first-result "openssh")
> 1.: Q847062: OpenSSH: set of computer programs providing encrypted
> communication sessions
>
> scheme@(guile-user) [25]> (show-first-result "automake")
> 1.: Q1324275: Automake: tool for generating GNU Standards-compliant
> Makefiles
>
>
> Not so reliable output for bioinformatics packages:
>
> scheme@(guile-user) [25]> (show-first-result "aragorn")
> 1.: Q180322: Aragorn: character from the Lord of the Rings
> $19 = #t
> scheme@(guile-user) [25]> (show-first-result "bamm")
> 1.: Q12819323: Bamm: Uzbek name of deep and dark sounds of stringed
> instruments
> $20 = #t
> scheme@(guile-user) [25]> (show-first-result "bamtools")
> 1.: Q30426432: BamTools: a C++ API and toolkit for analyzing and
> managing BAM files.: scientific article
> $21 = #t
> scheme@(guile-user) [25]> (show-first-result "bcftools")
> 1.: Q31041251: BCFtools/RoH: a hidden Markov model approach for
> detecting autozygosity from next-generation sequencing data.:\
>  scientific article
> $22 = #t
> scheme@(guile-user) [25]> (show-first-result "bedops")
> 1.: Q36076319: BEDOPS: high-performance genomic feature operations.:
> scientific article
> $23 = #t
> scheme@(guile-user) [25]> (show-first-result "blast+")
> 1.: Q179057: explosion: sudden release of energy through high
> temperatures and gas expansion
> $24 = #t

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2018-12-05  8:56 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-12-04 22:03 New library: guile-wikidata swedebugia
2018-12-05  8:55 ` Alex Sassmannshausen

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.