Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)

all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed

From: zamfofex <zamfofex@twdb.moe>
To: "Ludovic Courtès" <ludo@gnu.org>,
	"Simon Tournier" <zimon.toutoune@gmail.com>,
	宋文武 <iyzsong@envs.net>
Cc: Ryan Prior <rprior@protonmail.com>,
	Nicolas Graves <ngraves@ngraves.fr>,
	guix-devel@gnu.org
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
Date: Mon, 29 May 2023 00:57:38 -0300 (BRT)	[thread overview]
Message-ID: <289152483.2528051.1685332658386@privateemail.com> (raw)
In-Reply-To: <87r0r3je82.fsf@gnu.org>

> To me, there is no doubt that neural networks are a threat to user
> autonomy: hard to train by yourself without very expensive hardware,
> next to impossible without proprietary software, plus you need that huge
> amount of data available to begin with.
> 
> As a project, we don’t have guidelines about this though.  I don’t know
> if we can come up with general guidelines or if we should, at least as a
> start, look at things on a case-by-case basis.

I feel like it’s important to have a guideline for this, at least if the issue becomes recurrent too frequently.

To me, a sensible *base criterion* is whether the user is able to practically produce their own networks (either from scratch, or by using the an existing networks) using free software alone. I feel like this solves the issue of user autonomy being in risk.

By “practically produce”, I mean within reasonable time (somewhere between a few minutes and a few weeks depending on the scope) and using exclusively hardware they physically own (assuming they own reasonbly recent hardware to run Guix, at least).

The effect is that the user shouldn’t be bound to the provided networks, and should be able to train their own for their own purposes if they so choose, even if using the existing networks during that training. (And in the context of Guix, the neural network needs to be packaged for the user to be able to use it that way.)

Regarding Lc0 specifically, that is already possible! The Lc0 project has a training client that can use existing networks and a set of configurations to train your own special‐purpose network. (And although this client supports proprietary software, it is able to run using exclusively free software too.) In fact, there are already community‐provided networks for Lc0[1], which sometimes can play even more accurately than the official ones (or otherwise play differently in various specialised ways).

Of course, this might seem very dissatisfying in the same way as providing binary seeds for software programs is. In the sense that if you require an existing network to further train networks, rather than being able to start a network from scratch (in this case). But I feel like (at least under my “base criterion”), the effects of this to the user are not as significant, since the effects of the networks are limited compared to those of actual programs.

In the sense that, even though you might want to argue that “the network affects the behavior of the program using it” in the same way as “a Python source file affects the behavior of its interpreter”, the effect of the network file for the program is limited compared to that of a Python program. It’s much more like how an image would affect the affect the behavior of the program displaying it. More concretely, there isn’t a trust issue to be solved, because the network doesn’t have as many capabilities (theoretical or practical) as a program does.

I say “practical capabilities” in the sense of being access user resources and data for purposes they don’t want. (E.g. By accessing/modifying their files, sharing their data through the Internet without their acknowledgement, etc.)

I say “theoretical capabilities” in the sense of doing things the user doesn’t want nor expects, i.e. thinking about using computations as a tool for some purpose. (E.g. Even sandboxed/containerised programs can be harmful, because the program could behave in a way the user doesn’t want without letting the user do something about it.)

The only autonomy‐disrespecting (or perhaps rather freedom‐disrespecting) issue is when the user is stuck with the provided network, and doesn’t have any tools to (practically) change how the program behaves by creating a different network that suits their needs. (Which is what my “base criterion” tries to defend against.) This is not the case with Lc0, as I said.

Finally, I will also note that, in addition to the aforementioned[2] fact that Stockfish (already packaged) does use pre‐trained neural networks too, the lastest versions of Stockfish (from 14 onward) use neural networks that have themselves been indirectly trained using the networks from the Lc0 project.[3]

[1]: See <https://lczero.org/play/networks/basics/#training-data>
[2]: It was mentioned in <https://issues.guix.gnu.org/63088>
[3]: See <https://stockfishchess.org/blog/2021/stockfish-14/>

next prev parent reply	other threads:[~2023-05-29  4:32 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-03 18:07 Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?) Ryan Prior
2023-04-03 20:48 ` Nicolas Graves via Development of GNU Guix and the GNU System distribution.
2023-04-03 21:18   ` Jack Hill
2023-04-06  8:42 ` Simon Tournier
2023-04-06 13:41   ` Kyle
2023-04-06 14:53     ` Simon Tournier
2023-05-13  4:13   ` 宋文武
2023-05-15 11:18     ` Simon Tournier
2023-05-26 15:37       ` Ludovic Courtès
2023-05-29  3:57         ` zamfofex [this message]
2023-05-30 13:15         ` Simon Tournier
2023-07-02 19:51           ` Ludovic Courtès
2023-07-03  9:39             ` Simon Tournier
2023-07-04 13:05               ` zamfofex
2023-07-04 20:03                 ` Vagrant Cascadian
  -- strict thread matches above, loose matches on Subject: below --
2023-04-07  5:50 Nathan Dehnel
2023-04-07  9:42 ` Simon Tournier
2023-04-08 10:21   ` Nathan Dehnel
2023-04-11  8:37     ` Simon Tournier
2023-04-11 12:41       ` Nathan Dehnel
2023-04-12  9:32         ` Csepp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=289152483.2528051.1685332658386@privateemail.com \
    --to=zamfofex@twdb.moe \
    --cc=guix-devel@gnu.org \
    --cc=iyzsong@envs.net \
    --cc=ludo@gnu.org \
    --cc=ngraves@ngraves.fr \
    --cc=rprior@protonmail.com \
    --cc=zimon.toutoune@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.