From: swedebugia <swedebugia@riseup.net>
To: guix-devel@gnu.org
Subject: Re: Re-approaching package tagging
Date: Wed, 19 Dec 2018 08:42:24 +0100
Message-ID: <8622fccd-52f3-bd5e-6f3e-2cb460f4430d@riseup.net>
In-Reply-To: <261b0ff4-53f8-6c54-1d3e-4e0ed8128d91@riseup.net>
On 2018-12-19 07:51, swedebugia wrote:
> On 2018-12-18 08:48, Catonano wrote:
>>
>>
>> On Mon, 17 Dec 2018 at 22:10, swedebugia <swedebugia@riseup.net>
>> wrote:
>>
>> Hi :)
>>
>> On 2018-12-17 20:01, Christopher Lemmer Webber wrote:
>> > Hello,
>> >
>> > In the past when we've discussed package tagging, I think Ludo' has
>> > been against it, primarily because it's a giant source of
>> > bikeshedding. I agree that it's a huge space for bikeshedding... no
>> > space provides more bikeshedding than naming things, and tagging
>> > things is a many to many naming system.
>> >
>> > However, I will say that finding packages based on topical interest
>> > is pretty hard right now. If I want to find all the available
>> > roguelikes:
>> >
>> > cwebber@jasmine:~$ guix package -A rogue
>> > hyperrogue 10.5 out gnu/packages/games.scm:3652:2
>> > roguebox-adventures 2.2.1 out gnu/packages/games.scm:1047:2
>> >
>> > Hm, that's strange, there's definitely more roguelikes that should
>> > show up than that! A more specific search is even worse:
>> >
>> > cwebber@jasmine:~$ guix package -A roguelike
>> > cwebber@jasmine:~$
>> >
>> > What I should have gotten back:
>> > - angband
>> > - cataclysm-dda
>> > - crawl
>> > - crawl-tiles
>> > - hyperrogue
>> > - nethack
>> > - roguebox-adventures
>> > - tome4
>> >
>> > So I only got 1/4 of the entries I was interested in in my first
>> > query. Too bad!
>> >
>> > I get that we're opening up space for bikeshedding and *that's
>> > true*. But it seems like not doing so makes things hard on users.
>> >
>> > What do you think? Is there a way to open the (pandora's?) box of
>> > tags safely?
>>
>> Yes and no.
>>
>> Pjotr and I have discussed this in relation to biotech software. He
>> said that many scientists have a hard time finding the right tools
>> for the job.
>>
>> I proposed tight integration between Wikidata[1] (every piece of
>> software in the world will eventually have an item there) and Guix
>> (a QID on every package, plus lookup/category integration), leaving
>> all the categorizing to them. Ha, problem sidestepped: they are
>> bikeshedding experts over there in wikiland! :D
>>
>> The advantage of this is that everyone using Wikidata (every package
>> manager) could pull the same categorization, so we only do it once,
>> in a central place.
>>
>> What do you think?
>>
>> --
>>
>>
>> There is also the Free Software Directory:
>> https://directory.fsf.org/wiki/Main_Page
>>
>> I don't know what the relationship between Wikidata and the FSD is.
>>
>> Does Wikidata import data from the FSD? Or vice versa?
>>
>
> I don't know. For now at least they keep a reference to the FSD on
> software entries that exist in the FSD.
>
> We could integrate the FSD also but I have yet to investigate if they
> provide an API for their entries.
>
> Anyways I view FSD as a subset of Wikidata/Wikipedia. Wikidata is the
> node and FSD the leaf. Wikidata/Wikipedia will probably within a few
> years contain the data or links to the data that now exists in the FSD.
>
> Correct me if I'm wrong but the only advantage of FSD over Wikidata &
> Wikipedia is that they do not include references to proprietary software
> at all.
>
> In my view it is more feasible to compile the information in a
> structured way in a central node and then pull the relevant bits to
> the leaf.
>
> E.g. FSD of the future could be generated from all wikidata-entries and
> extracts of wikipedia that are an instance of
> https://www.wikidata.org/wiki/Q341. This would avoid fragmentation and
> help concentrate on building a large shared collective source of all
> knowledge within the wiki-community. FSD could exist anyhow and surely
> help enrich the upstream data.
>
> Similarly we could generate a Wikipedia subset without any entries
> pointing to (evil) private corporations (any entry that is part of
> https://www.wikidata.org/wiki/Q5621421 or whatever). I can't imagine
> what this would be good for, but it is possible.
>
> I cannot imagine that the information in the FSD would not be accepted
> in any of the Wikimedia projects. I could be wrong though, as I
> honestly have not visited or studied the FSD very much.
>
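
To make the "generate the FSD from Wikidata" idea quoted above a bit more
concrete, here is a rough Python sketch, not a tested pipeline. It only
builds a SPARQL query and a request URL for the public Wikidata Query
Service; nothing is sent. P31 is Wikidata's "instance of" property and
Q341 is the item linked above. The function names, the LIMIT, and the
assumption that the relevant software items carry P31 = Q341 directly
are mine.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(class_qid: str = "Q341", limit: int = 10) -> str:
    """Build a SPARQL query listing items that are an instance of class_qid.

    Assumption: items are linked to the class directly via P31
    ("instance of"); subclasses are not followed in this sketch.
    """
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_url(query: str) -> str:
    """Return a GET URL for the query; the caller decides whether to fetch it."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    print(build_query())
    print(build_url(build_query()))
```

A Guix-side lookup could work the same way in reverse: start from the
QID stored on a package and ask for its categories.
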
Also, the license of the FSD (GFDL 1.2) differs from both Wikidata (CC0)
and Wikipedia (CC-BY-SA 4.0 + GFDL 1.2).
This is not to their advantage in the long run.

I fear the FSD is already becoming unmaintained and obsolete, with people
favoring more open and smarter solutions from the Wikimedia projects (at
least I am).

When it comes to completeness, at least 500,000 packages are missing
from both Wikidata and the FSD (450,000+ MIT- and CC0-licensed npm
packages).

Would any of you like to import those twice? I wouldn't, and as I see it
Wikidata is far better suited to getting the job done and done well,
with a big community backing it up with tools, bots, manual edits,
and so on. Who wants to update new versions in two places when we have
over half a million free software packages to juggle?

Here is a small comparison example.

Top 8 JS packages according to
https://github.com/search?o=desc&q=js&s=stars&type=Repositories
(900,000+ repositories in total!), after I filtered out a few results
that are not software:

1. angular.js
   Wikidata: https://www.wikidata.org/wiki/Q28925578
   FSD: https://directory.fsf.org/wiki/Angular2
2. node
   Wikidata: https://www.wikidata.org/wiki/Q756100
   FSD: https://directory.fsf.org/wiki/Node#tab=Overview
3. axios
   Wikidata: not found
   FSD: not found
4. three.js
   Wikidata: https://www.wikidata.org/wiki/Q3525922
   FSD: https://directory.fsf.org/wiki/Three.js
5. socket.io
   Wikidata: https://www.wikidata.org/wiki/Q7552998
   FSD: not found (poor search function in my view)
6. reveal.js
   Wikidata: not found
   FSD: not found
7. chart.js
   Wikidata: not found
   FSD: not found
8. json-server
   Wikidata: not found
   FSD: not found

Wikidata already contains far more entries, and more data on the entries
I compared (e.g. node, npm, gcc), than the FSD, despite being a much
younger project.
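
For anyone who wants the tally of the list above without counting by
hand, here is a tiny Python sketch. The table is transcribed from my
list; the names RESULTS and coverage are only illustrative, and
socket.io is counted as missing from the FSD even though that may just
be the search function failing me.

```python
# (package, found in Wikidata, found in FSD), transcribed from the list above.
RESULTS = [
    ("angular.js", True, True),
    ("node", True, True),
    ("axios", False, False),
    ("three.js", True, True),
    ("socket.io", True, False),  # FSD search found nothing
    ("reveal.js", False, False),
    ("chart.js", False, False),
    ("json-server", False, False),
]


def coverage(results, column):
    """Count how many rows have a hit in the given column (1 = Wikidata, 2 = FSD)."""
    return sum(1 for row in results if row[column])


wikidata_hits = coverage(RESULTS, 1)
fsd_hits = coverage(RESULTS, 2)

print(f"Wikidata: {wikidata_hits}/{len(RESULTS)}")
print(f"FSD:      {fsd_hits}/{len(RESULTS)}")
```

That gives 4 of 8 for Wikidata and 3 of 8 for the FSD, so even among
the most starred JS projects half have no entry in either.
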
--
Cheers Swedebugia