* [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
@ 2018-11-01 9:44 swedebugia
2018-11-01 10:21 ` Pjotr Prins
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: swedebugia @ 2018-11-01 9:44 UTC (permalink / raw)
To: guix-devel
Hi
I am a contributor to OSM and have seen how combining OSM and
Wikidata/Wikipedia (WP) has been very useful.
I got the idea that adding Wikidata entries to Guix package objects would
be fruitful because:
It makes it possible to build a more useful list of packages, e.g. by showing
links to WP entries for the program in the user's local language. (E.g.
by firing up a browser from Emacs or the shell, or by populating a (per
channel) HTML package list (with screenshots, local WP links, etc.) and
firing up a throw-away web server instance serving this with e.g. guix
package --list-available-packages-html)
It would also perhaps be of benefit to WP contributors because we could
easily compute statistics on how many of the packages in Guix have a
Wikidata entry and/or WP entry. This could perhaps lead to the creation
of more articles for notable packages or to improving WP articles with
outdated release information.
*Implementation:*
It could be implemented by adding the fields to package-objects.
The rationale for adding screenshot-url to the recipe is that
parsing wikidata->en-WP->url-for-first-image for every package in our
list is quite expensive. Better to do it once and perhaps update all the
screenshot-urls once a year or so.
The rationale for adding WP (the list of Wikipedias with an article in the
wikidata entry, e.g. ("en" "sv" "es")) to the recipe is that
parsing wikidata->WP for every package in our list is quite expensive. Better
to do it once and perhaps update once a year or so.
Also to help us to associate new and existing packages with
wikidata-entries we could devise a guile-programmed way to associate
wikidata-entries to existing package objects and perhaps use this to
populate new package-recipes created with /guix import/
Guix would then be the first package manager to both be completely free
of proprietary software and to leverage knowledge from Wikidata and WP.
What do you think?
/Cheers/
Swedebugia
PS: we could further improve our recipes by adding fields like "release
date" either via guix import from upstream or by populating from WP.
This would make it easy for WP-contributors to track when new releases
happen and perhaps with a script automatically update WP-articles based
on our recipes when we have newer information.
PPS: Perhaps over time it will even be feasible for WP to use our
synopsis/descriptions somehow. This would enable us to integrate
descriptions and translations for programs. E.g. a WP-contributor sees
that an article in spanish for program x does not yet exist in es.WP but
a translated synopsis and description already exists in Guix.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-01 9:44 [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes swedebugia
@ 2018-11-01 10:21 ` Pjotr Prins
2018-11-01 22:33 ` swedebugia
2018-11-01 13:37 ` Amirouche Boubekki
2018-11-01 15:00 ` Tobias Geerinckx-Rice
2 siblings, 1 reply; 10+ messages in thread
From: Pjotr Prins @ 2018-11-01 10:21 UTC (permalink / raw)
To: swedebugia; +Cc: guix-devel
On Thu, Nov 01, 2018 at 10:44:19AM +0100, swedebugia wrote:
> Also to help us to associate new and existing packages with
> wikidata-entries we could devise a guile-programmed way to associate
> wikidata-entries to existing package objects and perhaps use this to
> populate new package-recipes created with guix import
>
> Guix would then be the first package manager to both be completely free
> of proprietary software and to leverage knowledge from Wikidata and WP.
>
> What do you think?
Absolutely the way forward. Totally excited you want to run with this!
Wikidata is linked data and dry and by using it we can share between
distros and software building projects (conda, easybuild etc.). It
scales because the software maintainers themselves will be encouraged
to update project information - such as a reference to a mailing list
- which is the only way to really keep up-to-date in a scalable way.
Wikipedia will use that information too. Wikidata is DRY.
People don't realise it, but in science to find the right tools for
the job is often a challenge. Wikidata will help us create sections
for tools that address certain tasks. For example variant calling in
sequencing data. I and others here have a direct stake in solving this
problem. This meta-information does not belong in Guix, so we need a
place to handle it. When you have a proof-of-concept we can even
consider writing a paper about it.
Will you join our Guix event at FOSDEM? This would be an interesting
working group. I know also a few people outside Guix who will be
interested.
Pj.
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-01 10:21 ` Pjotr Prins
@ 2018-11-01 22:33 ` swedebugia
2018-11-02 7:24 ` Pjotr Prins
0 siblings, 1 reply; 10+ messages in thread
From: swedebugia @ 2018-11-01 22:33 UTC (permalink / raw)
To: Pjotr Prins; +Cc: guix-devel
Hi :)
On 2018-11-01 11:21, Pjotr Prins wrote:
> On Thu, Nov 01, 2018 at 10:44:19AM +0100, swedebugia wrote:
>> Also to help us to associate new and existing packages with
>> wikidata-entries we could devise a guile-programmed way to associate
>> wikidata-entries to existing package objects and perhaps use this to
>> populate new package-recipes created with guix import
>>
>> Guix would then be the first package manager to both be completely free
>> of proprietary software and to leverage knowledge from Wikidata and WP.
>>
>> What do you think?
> Absolutely the way forward. Totally excited you want to run with this!
>
> Wikidata is linked data and dry and by using it we can share between
> distros and software building projects (conda, easybuild etc.). It
> scales because the software maintainers themselves will be encouraged
> to update project information - such as a reference to a mailing list
> - which is the only way to really keep up-to-date in a scalable way.
> Wikipedia will use that information too. Wikidata is DRY.
First I did not understand DRY. Found
https://en.wikipedia.org/wiki/Don't_repeat_yourself
Agreed, it would be nice if changes to Wikidata would be driven by the
authors of the programs and shared with leaf nodes (package managers, etc.)
> People don't realise it, but in science to find the right tools for
> the job is often a challenge. Wikidata will help us create sections
> for tools that address certain tasks. For example variant calling in
> sequencing data. I and others here have a direct stake in solving this
> problem. This meta-information does not belong in Guix, so we need a
> place to handle it.
I agree. Though it would be good to find a way to keep queries to
wikidata to a minimum. Caching maybe.
> When you have a proof-of-concept we can even
> consider writing a paper about it.
Eh, unfortunately my guile and guix fu is not really anything to brag
about, yet ;-). I am reading up on guile and trying to understand the
code in guix.
Right now my skill level is at
* finding spelling errors and unclear text in the manual
* contributing new simple packages (about to package recoll
https://en.wikipedia.org/wiki/Recoll and splint)
* adding better error messages to guix
* sharing new ideas
This wikidata endeavor would likely take some time for me to accomplish
with a good mentor.
First up is deciding whether the core procedures interacting with
wikidata should be in guix or as a separate module. I suggest separate
module.
Then writing client procedures to interface with the SPARQL API in
wikidata. This has already been done in python 3 (beta) see
https://github.com/dahlia/wikidata gplv3+
We could piggyback on this client (essentially making guix dependent on
python :/) or better yet contribute to one of the existing guile sql
libraries:
* https://sourceforge.net/projects/guile-simplesql/files/latest/download
(unmaintained since 2014 it seems)
* https://github.com/opencog/guile-dbi (active fork)
The last one looks most promising but I did not look at the code yet.
> Will you join our Guix event at FOSDEM? This would be an interesting
> working group.
Thanks for the invitation. I will think about it.
I look forward to honing my guile and guix skills :)
Cheers
Swedebugia
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-01 22:33 ` swedebugia
@ 2018-11-02 7:24 ` Pjotr Prins
2018-11-02 11:37 ` swedebugia
0 siblings, 1 reply; 10+ messages in thread
From: Pjotr Prins @ 2018-11-02 7:24 UTC (permalink / raw)
To: swedebugia; +Cc: guix-devel
On Thu, Nov 01, 2018 at 11:33:48PM +0100, swedebugia wrote:
> Wikidata is linked data and dry and by using it we can share between
> distros and software building projects (conda, easybuild etc.). It
> scales because the software maintainers themselves will be encouraged
> to update project information - such as a reference to a mailing list
> - which is the only way to really keep up-to-date in a scalable way.
> Wikipedia will use that information too. Wikidata is DRY.
>
> First I did not understand DRY. Found
> [1]https://en.wikipedia.org/wiki/Don't_repeat_yourself
> Agreed, it would be nice if changes to Wikidata would be driven by the
> authors of the programs and shared with leaf nodes (package managers,
> etc.)
Yeah, sorry for the jargon.
> I agree. Though it would be good to find a way to keep queries to
> wikidata to a minimum. Caching maybe.
It lends itself naturally to caching. We can use it to fetch links for
the website - that would be a good start - and later see if we can
enrich package descriptions in Guix itself. In both cases the user
should decide whether they want to use internet access or a cache
instead.
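A minimal sketch of such a cache, in Python, might look like the following. It is only an illustration of the idea, not something that exists in Guix: the `WikidataCache` class, the `fetch` callable, the one-year default age and the `offline` switch are all hypothetical names chosen for this example.

```python
import json
import os
import time


class WikidataCache:
    """Tiny file-backed cache keyed by Wikidata identifier (e.g. "Q937466").

    `fetch` is a hypothetical callable that takes an identifier and
    returns JSON-serialisable data; it is only called on a cache miss.
    """

    def __init__(self, path, fetch, max_age_seconds=365 * 24 * 3600, offline=False):
        self.path = path
        self.fetch = fetch
        self.max_age_seconds = max_age_seconds
        self.offline = offline  # when True, never touch the network
        self.entries = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                self.entries = json.load(handle)

    def get(self, qid, now=None):
        now = time.time() if now is None else now
        entry = self.entries.get(qid)
        fresh = entry is not None and now - entry["fetched_at"] < self.max_age_seconds
        if fresh or self.offline:
            # Offline mode returns stale data, or None if nothing was cached.
            return entry["data"] if entry is not None else None
        data = self.fetch(qid)
        self.entries[qid] = {"fetched_at": now, "data": data}
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.entries, handle)
        return data
```

With `offline=True` the user never triggers a network request, which matches the point that internet access should be the user's decision.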
> When you have a proof-of-concept we can even
> consider writing a paper about it.
>
> Eh, unfortunately my guile and guix fu is not really anything to brag
> about, yet ;-). I am reading up on guile and trying to understand the
> code in guix.
> Right now my skill level is at
> * finding spelling errors and unclear text in the manual
> * contribute new simple packages (about to package recoll
> [2]https://en.wikipedia.org/wiki/Recoll and splint)
> * adding better error messages to guix
> * sharing new ideas
>
> This wikidata endeavor would likely take some time for me to accomplish
> with a good mentor.
No problem! I think it is actually a very good learning project. We
can help. Start small is my advice.
> First up is deciding whether the core procedures interacting with
> wikidata should be in guix or as a separate module. I suggest separate
> module.
Agree. I think it can be a tool that is separate from Guix itself.
Just start with a simple query and store that either as an
S-expression or as JSON. I think (eventually) we ought to do both so
other languages may use output too. Have a look at the tooling that
generates the website.
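As a rough illustration of storing one result both ways, the Python sketch below writes the same record as JSON and as an S-expression. The `to_sexp` helper and the sample record are made up for this example; the conversion assumes dictionary keys are valid Scheme symbols and only handles simple strings, numbers, booleans and lists.

```python
import json


def to_sexp(value):
    """Render nested dicts, lists, strings and numbers as an S-expression."""
    if isinstance(value, dict):
        # A dict becomes an association list of (key . value) pairs.
        pairs = " ".join(f"({key} . {to_sexp(item)})" for key, item in value.items())
        return f"({pairs})"
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(to_sexp(item) for item in value) + ")"
    if isinstance(value, bool):  # must come before the generic number case
        return "#t" if value else "#f"
    if value is None:
        return "#f"
    if isinstance(value, str):
        # JSON string escaping is close enough to Scheme's for simple text.
        return json.dumps(value, ensure_ascii=False)
    return str(value)


# Made-up sample record; the language list is purely illustrative.
record = {"name": "mailman", "wikidata": "Q937466", "wikipedia": ["en", "sv", "es"]}

as_json = json.dumps(record)
as_sexp = to_sexp(record)
```

Here `as_sexp` is `((name . "mailman") (wikidata . "Q937466") (wikipedia . ("en" "sv" "es")))`, which a Scheme reader should accept as an association list.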
> Then writing client procedures to interface with the SPARQL API in
> wikidata. This has already been done in python 3 (beta) see
> [3]https://github.com/dahlia/wikidata gplv3+
>
> We could piggyback on this client (essentially making guix dependent on
> python :/) or better yet contribute to one of the existing guile sql
> libraries:
> * [4]https://sourceforge.net/projects/guile-simplesql/files/latest/do
> wnload (unmaintained since 2014 it seems)
> * [5]https://github.com/opencog/guile-dbi (active fork)
>
> The last one looks most promising but I did not look at the code yet.
Personally I would use the Python stuff first and then slowly replace
that with Guile. That way you get to results fast and we can improve over
time. I personally take no issue with mixing stuff. And because it is
a separate tool it is your choice anyway. I think also, initially, we
should build a separate website that can display all this information.
That way you have full freedom on implementation and experiments.
> Will you join our Guix event at FOSDEM? This would be an interesting
> working group.
>
> Thanks for the invitation. I will think about it.
> I look forward to hone my guile and guix skills :)
Please come. FOSDEM is awesome.
Pj.
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-02 7:24 ` Pjotr Prins
@ 2018-11-02 11:37 ` swedebugia
2018-11-02 15:37 ` Pjotr Prins
0 siblings, 1 reply; 10+ messages in thread
From: swedebugia @ 2018-11-02 11:37 UTC (permalink / raw)
To: Pjotr Prins; +Cc: guix-devel
On 2018-11-02 08:24, Pjotr Prins wrote:
> On Thu, Nov 01, 2018 at 11:33:48PM +0100, swedebugia wrote:
>> When you have a proof-of-concept we can even consider writing a
>> paper about it.
That would probably be fun :) I have not written a paper in a long time,
and never in the field of computing.
>> This wikidata endeavor would likely take some time for me to
>> accomplish with a good mentor.
>
> No problem! I think it is actually a very good learning project. We
> can help. Start small is my advice.
Thank you! I feel motivated.
>> First up is deciding whether the core procedures interacting with
>> wikidata should be in guix or as a separate module. I suggest
>> separate module.
>
> Agree. I think it can be a tool that is separate from Guix itself.
> Just start with a simple query and store that either as an
> S-expression or as JSON. I think (eventually) we ought to do both so
> other languages may use output too. Have a look at the tooling that
> generates the website.
Ok. Is there a json guile module?
Will take a close look at the python module.
>> Then writing client procedures to interface with the SPARQL API in
>> wikidata. This has already been done in python 3 (beta) see
>> [3]https://github.com/dahlia/wikidata gplv3+
>>
>> We could piggyback on this client (essentially making guix
>> dependent on python :/) or better yet contribute to one of the
>> existing guile sql libraries:
>
> Personally I would use the Python stuff first and then slowly replace
> that with Guile. That way you get to results fast and we can improve
> over time. I personally take no issue with mixing stuff. And because
> it is a separate tool it is your choice anyway. I think also,
> initially, we should build a separate website that can display all
> this information. That way you have full freedom on implementation
> and experiments.
How would I go about mixing python and guile?
Export the list of package records from guile -> JSON and import in a
python script?
Can I call a python-script from guile and receive input from it?
So something like:
iterate over record fields
calling a python script to fetch data from wikidata
acting on the data
feeding it to the console/web template code
fire up the webserver serving the html
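One way the Python half of this pipeline could look is sketched below: it reads a JSON list of package records on standard input and writes the enriched list to standard output, so any caller that can open a pipe could drive it. The record layout, the `wikidata` and `wikidata-data` keys and the `lookup` callable are assumptions made for this illustration.

```python
import json
import sys


def enrich(packages, lookup):
    """Attach Wikidata data to each package dict that carries a "wikidata" id.

    `lookup` is a hypothetical callable mapping an identifier such as
    "Q937466" to whatever data should be shown for that package.
    """
    enriched = []
    for package in packages:
        qid = package.get("wikidata")
        extra = lookup(qid) if qid else None
        enriched.append({**package, "wikidata-data": extra})
    return enriched


def main(lookup, stdin=sys.stdin, stdout=sys.stdout):
    """Read a JSON list of packages, enrich it, and write JSON back out."""
    packages = json.load(stdin)
    json.dump(enrich(packages, lookup), stdout)
    stdout.write("\n")
```

A script would end with a call such as `main(some_lookup)`. On the Guile side, the `(ice-9 popen)` module is one place to look for starting such a script and reading its output through a pipe.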
Cheers
Swedebugia
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-02 11:37 ` swedebugia
@ 2018-11-02 15:37 ` Pjotr Prins
0 siblings, 0 replies; 10+ messages in thread
From: Pjotr Prins @ 2018-11-02 15:37 UTC (permalink / raw)
To: swedebugia; +Cc: guix-devel
On Fri, Nov 02, 2018 at 12:37:15PM +0100, swedebugia wrote:
> Ok. Is there a json guile module?
Yup. Dave wrote one, I believe. There is one shipped with Guix. It is
probably this one https://directory.fsf.org/wiki/Guile-json#Details
> How would I go about mixing python and guile? Export the list of
> package records from guile -> JSON and import in a python script?
> Can I call a python-script from guile and receive input from it?
> So something like: iterate over record fields calling a python
> script to fetch data from wikidata acting on the data feeding it
> to the console/web template code fire up the webserver serving
> the html
That could work, for sure.
Pj.
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-01 9:44 [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes swedebugia
2018-11-01 10:21 ` Pjotr Prins
@ 2018-11-01 13:37 ` Amirouche Boubekki
2018-11-01 23:25 ` swedebugia
2018-11-01 15:00 ` Tobias Geerinckx-Rice
2 siblings, 1 reply; 10+ messages in thread
From: Amirouche Boubekki @ 2018-11-01 13:37 UTC (permalink / raw)
To: swedebugia; +Cc: guix-devel
Hello,
Like Pjotr I think it's a very good idea and the way forward.
Find below my comments with some modulation.
On Thu, 1 Nov 2018 at 10:39, swedebugia <swedebugia@riseup.net> wrote:
>
> Hi
>
> I am a contributor to OSM and have seen how combining OSM and Wikidata/Wikipedia (WP) has been very useful.
>
> I got the idea of adding Wikidata-entries to guix package objects would be fruitful because:
The idea is to add a wikidata identifier for guix packages.
For those that are not familiar with wikidata here is a little summary
of my own.
Wikidata is a Wikimedia project that puts together structured data about the world.
Wikidata is itself a wiki, like Wikipedia, that anybody can improve. The goal
of the project is to have a machine-readable form of knowledge. One of the use
cases for that is to easily keep Wikipedia (and other wikis) up to date regarding
metadata. Simply put, one could generate so-called infoboxes on Wikipedia
from Wikidata.
See https://www.wikidata.org/wiki/Q937466 for the GNU Mailman wikidata entity.
> It makes it possible to a more useful list of packages e.g. by showing links to WP entries for the program in the users local language.
> (E.g. by firing up a browser from emacs or the shell, or by populating a (per channel) html package list (with screenshots, local WP-links, etc.)
> and firing up a throw-away web-server instance serving this with e.g. guix package --list-available-packages-html)
The benefits for guix project:
Immediate benefit:
- It will be easier to translate description and synopsis
- Improve guix packages discover-ability via wikidata SPARQL endpoint
(e.g. give me all guix packages that deal with biology)
- Grab screenshot and other media or metadata about a given package
Other benefits:
- If upstream and other distros adopt wikidata as the Single Source Of
Truth, it will help with packaging and keeping guix up-to-date
- Everything is connected!
> It would also perhaps be of benefit to WP-contributors because we could easily make statistics for how many of the packages
> in guix a Wikidata-entry and/or WP-entry exists. Thus perhaps leading to creating of more articles for notable packages or improving
> WP-articles with outdated release information.
This will be of great benefit for wikidata.
>
> Implementation:
>
> It could be implemented by adding the fields to package-objects.
nitpick, those are records in guile scheme.
> The rationale for adding screenshot-url to the recipe is that this parsing of wikidata->en-WP->url-for-first-image
> for every package in our list is quite expensive. Better to do it once and perhaps update all the screenshot-urls
> once a year or so.
I think the screenshot-url field will not be very helpful, since that can be
fetched based on the wikidata identifier.
>
> The rationale for adding WP (list of Wikipedias with an article in the wikidata entry, e.g. ("en" "sv" "es")
> to the recipe is that this parsing of wikidata->WP for every package in our list is quite expensive. Better
> to do it once and perhaps update once a year or so.
Based on the wikidata entry, you can use SPARQL to retrieve the
Wikipedia page in various languages
and use Wikimedia Commons links to fetch a screenshot.
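A sketch of what such a lookup could involve, in Python, is given below. It only builds the query and parses a response, without making any network request. It assumes the public Wikidata Query Service endpoint and its predefined `wd:`, `wdt:` and `schema:` prefixes, and it uses property P18 ("image") as a stand-in for a screenshot; the function names are invented for this example.

```python
import json
import re
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint (assumed here, not taken from the thread).
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def sitelinks_query(qid):
    """Build a SPARQL query for the Wikipedia articles and image of one item."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    return (
        "SELECT ?article ?lang ?image WHERE {\n"
        f"  ?article schema:about wd:{qid} ;\n"
        "           schema:inLanguage ?lang ;\n"
        "           schema:isPartOf ?site .\n"
        '  FILTER(CONTAINS(STR(?site), "wikipedia.org"))\n'
        f"  OPTIONAL {{ wd:{qid} wdt:P18 ?image . }}\n"
        "}"
    )


def request_url(qid):
    """Return the GET URL that asks the endpoint for JSON results."""
    params = {"query": sitelinks_query(qid), "format": "json"}
    return SPARQL_ENDPOINT + "?" + urlencode(params)


def parse_bindings(response_text):
    """Turn a SPARQL JSON result into ({language: article URL}, image URL or None)."""
    bindings = json.loads(response_text)["results"]["bindings"]
    articles = {row["lang"]["value"]: row["article"]["value"] for row in bindings}
    images = [row["image"]["value"] for row in bindings if "image" in row]
    return articles, (images[0] if images else None)
```

Because the package definition would then only need the identifier, the list of languages and any image can be looked up (and cached) outside the recipe.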
Simply put, I think we should not add more fields than are necessary to
build the package, and should push to wikidata
the information guix might need for purposes other than distribution and build.
The benefit of this approach is that the package definition is not
overloaded with fields and non-code contributors
can still contribute to guix by submitting a screenshot to Wikimedia
Commons and editing wikidata.
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-01 13:37 ` Amirouche Boubekki
@ 2018-11-01 23:25 ` swedebugia
2018-11-03 7:53 ` Catonano
0 siblings, 1 reply; 10+ messages in thread
From: swedebugia @ 2018-11-01 23:25 UTC (permalink / raw)
To: Amirouche Boubekki; +Cc: guix-devel
Hi Amirouche
On 2018-11-01 14:37, Amirouche Boubekki wrote:
>
>> Implementation:
>>
>> It could be implemented by adding the fields to package-objects.
> nitpick, those are records in guile scheme.
Did you mean to correct my use of "object" here?
You are right about that.
Scheme Syntax: define-record-type type
    (constructor fieldname …)
    predicate
    (fieldname accessor [modifier]) …
I am finally beginning to understand all these words and Scheme ways of doing
things.
Have I understood correctly that we have at least one nested record type
in guix? E.g. the package record contains an origin record and a lot of
other fields.
>> The rationale for adding screenshot-url to the recipe is that this parsing of wikidata->en-WP->url-for-first-image
>> for every package in our list is quite expensive. Better to do it once and perhaps update all the screenshot-urls
>> once a year or so.
> I think the screenshot-url field will not be very helpful that can be
> fetched based on wikidata identifier.
Ok, I understand your point.
/Swedebugia
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-01 23:25 ` swedebugia
@ 2018-11-03 7:53 ` Catonano
0 siblings, 0 replies; 10+ messages in thread
From: Catonano @ 2018-11-03 7:53 UTC (permalink / raw)
To: swedebugia; +Cc: guix-devel
On Fri, 2 Nov 2018 at 00:20, swedebugia <swedebugia@riseup.net>
wrote:
> Hi Amirouche
>
> On 2018-11-01 14:37, Amirouche Boubekki wrote:
>
>
> Implementation:
>
> It could be implemented by adding the fields to package-objects.
>
> nitpick, those are records in guile scheme.
>
> Did you mean to correct my use of "object" here?
> You are right about that.
>
> Scheme Syntax: define-record-type type
>     (constructor fieldname …)
>     predicate
>     (fieldname accessor [modifier]) …
>
> I finally begin to understand all these words and Scheme-ways of doing
> things.
>
> Have I understood correctly that we have at least 1 nested record type in
> guix? E.g. the package record contains an origin record and a lot of other
> fields.
>
yes, that's correct
* Re: [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes
2018-11-01 9:44 [Feature idea] Adding wikidata, wikipedia & screenshot-url fields to package-recipes swedebugia
2018-11-01 10:21 ` Pjotr Prins
2018-11-01 13:37 ` Amirouche Boubekki
@ 2018-11-01 15:00 ` Tobias Geerinckx-Rice
2 siblings, 0 replies; 10+ messages in thread
From: Tobias Geerinckx-Rice @ 2018-11-01 15:00 UTC (permalink / raw)
To: swedebugia; +Cc: guix-devel
Hullo swedebugia,
swedebugia wrote:
> I am a contributor to OSM and have seen how combining OSM and
> Wikidata/Wikipedia (WP) has been very useful.
I'm not too familiar with Wikidata but, like Pjotr and Amirouche,
think it's a worthwhile idea to explore and discuss!
> Guix would then be the first package manager to both be
> completely
> free of proprietary software and to leverage knowledge from
> Wikidata
> and WP.
Something to seriously consider here are the FSDG. Where do we
draw the line between ‘external content’ (as we presumably treat
home pages, which currently contain everything from proprietary
software recommendations to #TeamWhite race war propaganda) and
Guix when such WD/WP data is presented inside our UI?
The FSD[0] comes to mind. Coincidentally, a new mail[1] was just
posted to the directory-discuss list about Wikidata. It should be
enlightening to read the responses, if any.
I also assume that we'll be keeping our current set of
metadata. The requirement for a full-blown Internet connection
just to explore (not install) packages would be a regression.
Kind regards,
T G-R
[0]: https://directory.fsf.org/wiki/Free_Software_Directory:About
[1]: https://lists.gnu.org/archive/html/directory-discuss/2018-11/msg00000.html
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/guix.git