From: Simon Tournier <zimon.toutoune@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 宋文武 <iyzsong@envs.net>, "Ryan Prior" <rprior@protonmail.com>,
"Nicolas Graves" <ngraves@ngraves.fr>,
guix-devel@gnu.org, zamfofex <zamfofex@twdb.moe>
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
Date: Tue, 30 May 2023 15:15:22 +0200
Message-ID: <87wn0qrmdx.fsf@gmail.com>
In-Reply-To: <87r0r3je82.fsf@gnu.org>
Hi Ludo,
On Fri, 26 May 2023 at 17:37, Ludovic Courtès <ludo@gnu.org> wrote:
>> Well, I do not know if we have reached a conclusion. From my point of
>> view, both can be included *if* their licenses are compatible with Free
>> Software – included the weights (pre-trained model) as licensed data.
>
> We discussed it in 2019:
>
> https://issues.guix.gnu.org/36071
Your concern in this thread was:
        My point is about whether these trained neural network data are
        something that we could distribute per the FSDG.
        https://issues.guix.gnu.org/36071#3-lineno21
and we discussed this specific concern for the package leela-zero.
Quoting 3 messages:
        Perhaps we could do the same, but I’d like to hear what others think.

        Back to this patch: I think it’s fine to accept it as long as the
        software necessary for training is included.

        The whole link is worth a click since there seems to be a ‘server
        component’ involved as well.

        https://issues.guix.gnu.org/36071#3-lineno31
        https://issues.guix.gnu.org/36071#5-lineno52
        https://issues.guix.gnu.org/36071#6-lineno18
And somehow I am raising the same concern for packages using weights. We
could discuss case by case; instead, I find it important to sketch
guidelines about the weights because they would help us decide what to do
with neural networks such as “Leela Chess Zero” [1] or others (see below).
1: https://issues.guix.gnu.org/63088
> This LWN article on the debate that then took place in Debian is
> insightful:
>
> https://lwn.net/Articles/760142/
As pointed out in #36071 mentioned above, this LWN article is a digest of
a Debian discussion, and it is also worth having a look at the raw
material (arguments):
https://lists.debian.org/debian-devel/2018/07/msg00153.html
> To me, there is no doubt that neural networks are a threat to user
> autonomy: hard to train by yourself without very expensive hardware,
> next to impossible without proprietary software, plus you need that huge
> amount of data available to begin with.
About the “others” from above, please note that GNU Backgammon, already
packaged in Guix under the name ‘gnubg’, raises similar questions. :-)
Quoting the webpage [2]:
        Tournament match and money session cube handling and cubeful
        play. All governed by underlying cubeless money game based
        neural networks.
As Russ Allbery points out [3], and as I tried to do in this thread, it
seems hard to distinguish data resulting from pre-processing such as
training from data that simply result from well-fitted parameters.
2: https://www.gnu.org/software/gnubg/
3: https://lwn.net/Articles/760199/
> As a project, we don’t have guidelines about this though. I don’t know
> if we can come up with general guidelines or if we should, at least as a
> start, look at things on a case-by-case basis.
Somehow, without guidelines to help us decide, it is harder to review
#63088 [1], which asks for the inclusion of lc0, and hard to know what to
do about GNU Backgammon.
On these specific cases, what do we do? :-)
Cheers,
simon