From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Ian Eure <ian@retrospec.tv>, guix-devel <guix-devel@gnu.org>
Subject: Re: Concerns/questions around Software Heritage Archive
Date: Thu, 09 May 2024 12:00:10 -0400
Message-ID: <87pltv0yd1.fsf@gmail.com>
In-Reply-To: <87r0eky0bb.fsf@gnu.org> ("Ludovic Courtès"'s message of "Thu, 02 May 2024 12:28:56 +0200")
Hi Ian, Ludovic.
Ludovic Courtès <ludo@gnu.org> writes:
> Hi Ian,
>
> Ian Eure <ian@retrospec.tv> skribis:
>
>> Summarizing the situation:
>>
>> - SHF has an opaque, difficult, and undocumented process for
>> handling name changes. I’d like to stress again that this is
>> *not* strictly a transgender issue (though it likely affects them
>> more, or in worse/different ways) -- it is a human respect issue.
>> Many, many more cisgender people change their name than
>> transgender people.
>
> It is also not strictly an SWH issue: how does Internet Archive handle
> name changes? What about append-only storage in general? We’ve
> discussed this already.
>> - SHF gave their archive to HuggingFace, an "AI" company which is
>> generating derived works with no attribution or provenance, in
>> ways which violate both the licenses of the projects used to train
>> their model, and the SHF principles for LLMs.
>
> [...]
>
>> - Has Guix reached out to SHF to express these concerns / get a
>> response?
>
> I’ve seen and participated in informal discussions, but that’s all I
> know. Maintainers?
We haven't. Given some improvements were apparently already made by SWH
in response to concerns raised, it seems the dialogue should continue.
>> - Whether a public or private response, what would Guix consider to
>> be an acceptable response? An unacceptable response?
>> - How long is Guix willing to wait for a response?
>
> Free software people, myself included, have expressed disappointment
> regarding the use of code harvested by SWH for HuggingFace’s training.
> Stefano Zacchiroli of SWH responded to these concerns on Mastodon back
> in March, as you probably saw.
>
> One important point is that copyleft code is excluded from the training
> dataset; I was able to anecdotally check that for GPL code such as Guix
> using their interface (there was a thread on Mastodon but I can’t find
> it): <https://huggingface.co/spaces/bigcode/in-the-stack>. That
> addresses my main concern.
>
> Remaining concerns include the weak wording of the principles put
> forward by SWH in its statement on LLMs:
> <https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/>.
> I think this is something worth discussing further with them (it’s
> already been brought up notably on Mastodon). It’s not clear to me
> whether this is a task for Guix as a project.
I don't think it is a task for Guix specifically, but rather for all
users of SWH or interested parties.
--
Thanks,
Maxim