From: Ludovic Courtès
To: Ian Eure
Cc: guix-devel
Subject: Re: Concerns/questions around Software Heritage Archive
In-Reply-To: <87frvfan0r.fsf@retrospec.tv> (Ian Eure's message of "Sat, 20 Apr 2024 11:48:20 -0700")
References: <87il1mupco.fsf@meson> <87frvfan0r.fsf@retrospec.tv>
Date: Thu, 02 May 2024 12:28:56 +0200
Message-ID: <87r0eky0bb.fsf@gnu.org>

Hi Ian,

Ian Eure wrote:

> Summarizing the situation:
>
> - SHF has an opaque, difficult, and undocumented process for
>   handling name changes.  I’d like to stress again that this is
>   *not* strictly a transgender issue (though it likely affects them
>   more, or in worse/different ways) -- it is a human respect issue.
>   Many, many more cisgender people change their name than
>   transgender people.

It is also not strictly an SWH issue: how does Internet Archive handle
name changes?  What about append-only storage in general?  We’ve
discussed this already.

> - SHF gave their archive to HuggingFace, an "AI" company which is
>   generating derived works with no attribution or provenance, in
>   ways which violate both the licenses of the projects used to train
>   their model, and the SHF principles for LLMs.

[...]

> - Has Guix reached out to SHF to express these concerns / get a
>   response?

I’ve seen and participated in informal discussions, but that’s all I
know.  Maintainers?

> - Whether a public or private response, what would Guix consider to
>   be an acceptable response?  An unacceptable response?
> - How long is Guix willing to wait for a response?

Free software people, myself included, have expressed disappointment
regarding the use of code harvested by SWH for HuggingFace’s training.
Stefano Zacchiroli of SWH responded to these concerns on Mastodon back
in March, as you probably saw.

One important point is that copyleft code is excluded from the training
dataset; I was able to anecdotally check that for GPL code such as Guix
using their interface (there was a thread on Mastodon but I can’t find
it): .

That addresses my main concern.

Remaining concerns include the weak wording of the principles put
forward by SWH in its statement on LLMs: .

I think this is something worth discussing further with them (it’s
already been brought up notably on Mastodon).  It’s not clear to me
whether this is a task for Guix as a project.

(I do not forget that, in the meantime, Microsoft ingests everything
that’s on GitHub, including copyleft code, and including clones of repos
that were not initially hosted there.)

I’m not sure this is the kind of answer you expected, but I hope it
makes sense!

Ludo’.