unofficial mirror of guix-devel@gnu.org 
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Ian Eure <ian@retrospec.tv>,  guix-devel <guix-devel@gnu.org>
Subject: Re: Concerns/questions around Software Heritage Archive
Date: Thu, 09 May 2024 12:00:10 -0400
Message-ID: <87pltv0yd1.fsf@gmail.com>
In-Reply-To: <87r0eky0bb.fsf@gnu.org> ("Ludovic Courtès"'s message of "Thu, 02 May 2024 12:28:56 +0200")

Hi Ian, Ludovic.

Ludovic Courtès <ludo@gnu.org> writes:

> Hi Ian,
>
> Ian Eure <ian@retrospec.tv> skribis:
>
>> Summarizing the situation:
>>
>> - SHF has an opaque, difficult, and undocumented process for
>>   handling name changes.  I’d like to stress again that this is
>>   *not* strictly a transgender issue (though it likely affects them
>>   more, or in worse/different ways) -- it is a human respect issue.
>>   Many, many more cisgender people change their name than
>>   transgender people.
>
> It is also not strictly an SWH issue: how does Internet Archive handle
> name changes?  What about append-only storage in general?  We’ve
> discussed this already.

>> - SHF gave their archive to HuggingFace, an "AI" company which is
>>   generating derived works with no attribution or provenance, in
>>   ways which violate both the licenses of the projects used to train
>>   their model, and the SHF principles for LLMs.
>
> [...]
>
>> - Has Guix reached out to SHF to express these concerns / get a
>>   response?
>
> I’ve seen and participated in informal discussions, but that’s all I
> know.  Maintainers?

We haven't.  Given that some improvements were apparently already made by
SWH in response to the concerns raised, it seems the dialogue should continue.

>> - Whether a public or private response, what would Guix consider to
>>   be an acceptable response?  An unacceptable response?
>> - How long is Guix willing to wait for a response?
>
> Free software people, myself included, have expressed disappointment
> regarding the use of code harvested by SWH for HuggingFace’s training.
> Stefano Zacchiroli of SWH responded to these concerns on Mastodon back
> in March, as you probably saw.
>
> One important point is that copyleft code is excluded from the training
> dataset; I was able to anecdotally check that for GPL code such as Guix
> using their interface (there was a thread on Mastodon but I can’t find
> it): <https://huggingface.co/spaces/bigcode/in-the-stack>.  That
> addresses my main concern.
>
> Remaining concerns include the weak wording of the principles put
> forward by SWH in its statement on LLMs:
> <https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/>.
> I think this is something worth discussing further with them (it’s
> already been brought up notably on Mastodon).  It’s not clear to me
> whether this is a task for Guix as a project.

I don't think it is a task for Guix specifically, but rather for all
users of SWH or interested parties.

-- 
Thanks,
Maxim

