From mboxrd@z Thu Jan 1 00:00:00 1970
From: swedebugia
Subject: Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
Date: Thu, 1 Nov 2018 23:33:48 +0100
Message-ID: <301f9d56-2091-2fbc-da89-c2ca53d0f580@riseup.net>
References: <20181101102150.naklct2uiujtp2rl@thebird.nl>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="------------BF918D7EBCC4EAC9769F8029"
Return-path:
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53341) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gILS2-0006OL-8G for guix-devel@gnu.org; Thu, 01 Nov 2018 18:28:33 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gILRu-0003P1-CT for guix-devel@gnu.org; Thu, 01 Nov 2018 18:28:28 -0400
Received: from mx1.riseup.net ([198.252.153.129]:55638) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gILRt-0003Hr-QY for guix-devel@gnu.org; Thu, 01 Nov 2018 18:28:22 -0400
In-Reply-To: <20181101102150.naklct2uiujtp2rl@thebird.nl>
Content-Language: en-US
List-Id: "Development of GNU Guix and the GNU System distribution."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
Sender: "Guix-devel"
To: Pjotr Prins
Cc: guix-devel@gnu.org

This is a multi-part message in MIME format.
--------------BF918D7EBCC4EAC9769F8029
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

Hi :)

On 2018-11-01 11:21, Pjotr Prins wrote:
> On Thu, Nov 01, 2018 at 10:44:19AM +0100, swedebugia wrote:
>> Also to help us to associate new and existing packages with
>> wikidata-entries we could devise a guile-programmed way to associate
>> wikidata-entries to existing package objects and perhaps use this to
>> populate new package-recipes created with guix import
>>
>> Guix would then be the first package manager to both be completely free
>> of proprietary software and to leverage knowledge from Wikidata and WP.
>>
>> What do you think?
> Absolutely the way forward. Totally excited you want to run with this!
>
> Wikidata is linked data and dry and by using it we can share between
> distros and software building projects (conda, easybuild etc.). It
> scales because the software maintainers themselves will be encouraged
> to update project information - such as a reference to a mailing list
> - which is the only way to really keep up-to-date in a scalable way.
> Wikipedia will use that information too. Wikidata is DRY.

At first I did not understand DRY, then found https://en.wikipedia.org/wiki/Don't_repeat_yourself

Agreed, it would be nice if changes to Wikidata were driven by the authors of the programs and shared with leaf nodes (package managers, etc.).

> People don't realise it, but in science to find the right tools for
> the job is often a challenge. Wikidata will help us create sections
> for tools that address certain tasks. For example variant calling in
> sequencing data. I and others here have a direct stake in solving this
> problem. This meta-information does not belong in Guix, so we need a
> place to handle it.

I agree. Though it would be good to find a way to keep queries to wikidata to a minimum. Caching maybe.

> When you have a proof-of-concept we can even
> consider writing a paper about it.

Eh, unfortunately my guile and guix fu is not really anything to brag about, yet ;-). I am reading up on guile and trying to understand the code in guix.

Right now my skill level is at
* finding spelling errors and unclear text in the manual
* contributing new simple packages (about to package recoll https://en.wikipedia.org/wiki/Recoll and splint)
* adding better error messages to guix
* sharing new ideas

This wikidata endeavor would likely take some time for me to accomplish with a good mentor.

First up is deciding whether the core procedures interacting with wikidata should be in guix or as a separate module. I suggest a separate module.

The next step is writing client procedures to interface with the Wikidata SPARQL API. This has already been done in Python 3 (beta), see https://github.com/dahlia/wikidata (GPLv3+).

We could piggyback on this client (essentially making guix dependent on python :/) or better yet contribute to one of the existing guile sql libraries:
* https://sourceforge.net/projects/guile-simplesql/files/latest/download (unmaintained since 2014 it seems)
* https://github.com/opencog/guile-dbi (active fork)

The last one looks most promising but I have not looked at the code yet.

> Will you join our Guix event at FOSDEM? This would be an interesting
> working group.

Thanks for the invitation. I will think about it.

I look forward to honing my guile and guix skills :)

Cheers
Swedebugia

--------------BF918D7EBCC4EAC9769F8029
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 7bit

Hi :)

On 2018-11-01 11:21, Pjotr Prins wrote:
> On Thu, Nov 01, 2018 at 10:44:19AM +0100, swedebugia wrote:
>> Also to help us to associate new and existing packages with
>> wikidata-entries we could devise a guile-programmed way to associate
>> wikidata-entries to existing package objects and perhaps use this to
>> populate new package-recipes created with guix import
>>
>> Guix would then be the first package manager to both be completely free
>> of proprietary software and to leverage knowledge from Wikidata and WP.
>>
>> What do you think?
> Absolutely the way forward. Totally excited you want to run with this!
>
> Wikidata is linked data and dry and by using it we can share between
> distros and software building projects (conda, easybuild etc.). It
> scales because the software maintainers themselves will be encouraged
> to update project information - such as a reference to a mailing list
> - which is the only way to really keep up-to-date in a scalable way.
> Wikipedia will use that information too. Wikidata is DRY.

At first I did not understand DRY, then found https://en.wikipedia.org/wiki/Don't_repeat_yourself

Agreed, it would be nice if changes to Wikidata were driven by the authors of the programs and shared with leaf nodes (package managers, etc.).

> People don't realise it, but in science to find the right tools for
> the job is often a challenge. Wikidata will help us create sections
> for tools that address certain tasks. For example variant calling in
> sequencing data. I and others here have a direct stake in solving this
> problem. This meta-information does not belong in Guix, so we need a
> place to handle it.
I agree. Though it would be good to find a way to keep queries to wikidata to a minimum. Caching maybe.
> When you have a proof-of-concept we can even
> consider writing a paper about it.
Eh, unfortunately my guile and guix fu is not really anything to brag about, yet ;-). I am reading up on guile and trying to understand the code in guix.

Right now my skill level is at
  • finding spelling errors and unclear text in the manual
  • contributing new simple packages (about to package recoll https://en.wikipedia.org/wiki/Recoll and splint)
  • adding better error messages to guix
  • sharing new ideas


This wikidata endeavor would likely take some time for me to accomplish with a good mentor.

First up is deciding whether the core procedures interacting with wikidata should be in guix or as a separate module. I suggest a separate module.

The next step is writing client procedures to interface with the Wikidata SPARQL API. This has already been done in Python 3 (beta), see https://github.com/dahlia/wikidata (GPLv3+).

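We could piggyback on this client (essentially making guix dependent on python :/) or better yet contribute to one of the existing guile sql libraries:
  • https://sourceforge.net/projects/guile-simplesql/files/latest/download (unmaintained since 2014 it seems)
  • https://github.com/opencog/guile-dbi (active fork)

The last one looks most promising but I have not looked at the code yet.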
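
To make the client-plus-caching idea a bit more concrete, here is a rough Python sketch (Python only because the existing client mentioned above is Python; a Guile version would have the same shape). It is not a design for the final module. The assumptions are mine: the public endpoint https://query.wikidata.org/sparql, matching packages on their home page via the Wikidata property P856 (official website), and the names build_homepage_query, fetch_json and CachedLookup, which are all hypothetical. The cache is just an in-memory dictionary so that the same home page is only looked up remotely once.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Characters that must not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN_IRI_CHARS = set('<>"{}|^`\\ \t\r\n')


def build_homepage_query(homepage):
    """Return a SPARQL query for items whose official website (P856) is `homepage`."""
    if any(ch in _FORBIDDEN_IRI_CHARS for ch in homepage):
        raise ValueError("home page is not usable as a SPARQL IRI: %r" % homepage)
    return (
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT ?item WHERE { ?item wdt:P856 <%s> . } LIMIT 5" % homepage
    )


def fetch_json(query):
    """Send `query` to the endpoint and return the decoded JSON result (uses the network)."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_SPARQL_ENDPOINT + "?" + params,
        # Hypothetical client name; public endpoints usually expect a User-Agent.
        headers={"User-Agent": "guix-wikidata-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


class CachedLookup:
    """Look up Wikidata items by home page, asking the remote side once per home page."""

    def __init__(self, fetch=fetch_json):
        self._fetch = fetch      # injectable, so the lookup can be tried offline
        self._cache = {}         # home page -> list of entity URIs
        self.remote_calls = 0    # number of queries actually sent

    def items_for_homepage(self, homepage):
        if homepage not in self._cache:
            self.remote_calls += 1
            data = self._fetch(build_homepage_query(homepage))
            # Standard SPARQL JSON results: one binding per matching row.
            self._cache[homepage] = [
                row["item"]["value"] for row in data["results"]["bindings"]
            ]
        return self._cache[homepage]
```

Calling CachedLookup().items_for_homepage(...) with a package's home-page URL would send one query and return a list of entity URIs (possibly empty); a second call with the same URL is answered from the dictionary. A real module would probably want the cache on disk and shared between runs, which this sketch does not attempt.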

> Will you join our Guix event at FOSDEM? This would be an interesting
> working group.
Thanks for the invitation. I will think about it.

I look forward to honing my guile and guix skills :)
Cheers
Swedebugia
--------------BF918D7EBCC4EAC9769F8029--