From: Simon Tournier
To: Ludovic Courtès
Cc: 宋文武, Ryan Prior, Nicolas Graves, guix-devel@gnu.org, zamfofex
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
In-Reply-To: <87r0r3je82.fsf@gnu.org>
References: <868rf5e71j.fsf@gmail.com> <87ilcweumh.fsf@envs.net> <87v8gtzvu3.fsf@gmail.com> <87r0r3je82.fsf@gnu.org>
Date: Tue, 30 May 2023 15:15:22 +0200
Message-ID: <87wn0qrmdx.fsf@gmail.com>

Hi Ludo,

On Fri., 26 May 2023 at 17:37, Ludovic Courtès wrote:

>> Well, I do not know if we have reached a conclusion.  From my point of
>> view, both can be included *if* their licenses are compatible with Free
>> Software – included the weights (pre-trained model) as licensed data.
>
> We discussed it in 2019:
>
>   https://issues.guix.gnu.org/36071

Your concern in this thread was:

    My point is about whether these trained neural network data are
    something that we could distribute per the FSDG.

    https://issues.guix.gnu.org/36071#3-lineno21

and we discussed this specific concern for the package leela-zero.
Quoting 3 messages:

    Perhaps we could do the same, but I’d like to hear what others
    think.

    Back to this patch: I think it’s fine to accept it as long as the
    software necessary for training is included.

    The whole link is worth a click since there seems to be a ‘server
    component’ involved as well.

    https://issues.guix.gnu.org/36071#3-lineno31
    https://issues.guix.gnu.org/36071#5-lineno52
    https://issues.guix.gnu.org/36071#6-lineno18

And somehow I am raising the same concern for packages using weights.
We could discuss case by case; instead, I find it important to sketch
guidelines about the weights, because they would help decide what to do
with neural networks such as “Leela Chess Zero” [1] or others (see
below).

1: https://issues.guix.gnu.org/63088

> This LWN article on the debate that then took place in Debian is
> insightful:
>
>   https://lwn.net/Articles/760142/

As pointed out in #36071 mentioned above, this LWN article is a digest
of a Debian discussion, and it is also worth having a look at the raw
material (arguments):

    https://lists.debian.org/debian-devel/2018/07/msg00153.html

> To me, there is no doubt that neural networks are a threat to user
> autonomy: hard to train by yourself without very expensive hardware,
> next to impossible without proprietary software, plus you need that huge
> amount of data available to begin with.

About the “others” from above, please note that GNU Backgammon, already
packaged in Guix under the name ‘gnubg’, raises similar questions. :-)
Quoting the webpage [2]:

    Tournament match and money session cube handling and cubeful
    play.  All governed by underlying cubeless money game based
    neural networks.

As Russ Allbery points out [3], and as I tried to do in this thread, it
seems hard to distinguish data resulting from pre-processing, such as
training, from data that merely result from well-fitted parameters.

2: https://www.gnu.org/software/gnubg/
3: https://lwn.net/Articles/760199/

> As a project, we don’t have guidelines about this though.  I don’t know
> if we can come up with general guidelines or if we should, at least as a
> start, look at things on a case-by-case basis.

Somehow, without guidelines to help us decide, it is harder to review
#63088 [1], which asks for the inclusion of lc0, and hard to know what
to do about GNU Backgammon.

For these specific cases, what do we do? :-)

Cheers,
simon