unofficial mirror of guix-devel@gnu.org 
From: Simon Tournier <zimon.toutoune@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 宋文武 <iyzsong@envs.net>, "Ryan Prior" <rprior@protonmail.com>,
	"Nicolas Graves" <ngraves@ngraves.fr>,
	guix-devel@gnu.org, zamfofex <zamfofex@twdb.moe>
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
Date: Tue, 30 May 2023 15:15:22 +0200
Message-ID: <87wn0qrmdx.fsf@gmail.com>
In-Reply-To: <87r0r3je82.fsf@gnu.org>

Hi Ludo,

On ven., 26 mai 2023 at 17:37, Ludovic Courtès <ludo@gnu.org> wrote:

>> Well, I do not know if we have reached a conclusion.  From my point of
>> view, both can be included *if* their licenses are compatible with Free
>> Software – included the weights (pre-trained model) as licensed data.
>
> We discussed it in 2019:
>
>   https://issues.guix.gnu.org/36071

Your concern in this thread was:

        My point is about whether these trained neural network data are
        something that we could distribute per the FSDG.

        https://issues.guix.gnu.org/36071#3-lineno21

and we discussed this specific concern for the package leela-zero.
Quoting 3 messages:

        Perhaps we could do the same, but I’d like to hear what others think.

        Back to this patch: I think it’s fine to accept it as long as the
        software necessary for training is included.

        The whole link is worth a click since there seems to be a ‘server
        component’ involved as well.

        https://issues.guix.gnu.org/36071#3-lineno31
        https://issues.guix.gnu.org/36071#5-lineno52
        https://issues.guix.gnu.org/36071#6-lineno18


And somehow I am raising the same concern for packages using weights.  We
could discuss case by case; instead, I find it important to sketch
guidelines about the weights because they would help us decide what to do
with neural networks, such as “Leela Chess Zero” [1] or others (see below).

1: https://issues.guix.gnu.org/63088


> This LWN article on the debate that then took place in Debian is
> insightful:
>
>   https://lwn.net/Articles/760142/

As pointed out in #36071 mentioned above, this LWN article is a digest of
a Debian discussion, and it is also worth looking at the raw material
(the arguments):

https://lists.debian.org/debian-devel/2018/07/msg00153.html


> To me, there is no doubt that neural networks are a threat to user
> autonomy: hard to train by yourself without very expensive hardware,
> next to impossible without proprietary software, plus you need that huge
> amount of data available to begin with.

About the “others” from above, please note that GNU Backgammon, already
packaged in Guix under the name ‘gnubg’, raises similar questions. :-)

Quoting the webpage [2]:

        Tournament match and money session cube handling and cubeful
        play. All governed by underlying cubeless money game based
        neural networks.


As Russ Allbery points out [3], and as I tried to do in this thread, it
seems hard to distinguish data that result from pre-processing, such as
training, from data that merely result from well-fitted parameters.


2: https://www.gnu.org/software/gnubg/
3: https://lwn.net/Articles/760199/


> As a project, we don’t have guidelines about this though.  I don’t know
> if we can come up with general guidelines or if we should, at least as a
> start, look at things on a case-by-case basis.

Somehow, if we do not have guidelines to help us decide, it is harder to
review #63088 [1], which asks for the inclusion of lc0, and it is hard to
know what to do about GNU Backgammon.

In these specific cases, what do we do? :-)


Cheers,
simon

