unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Ryan Prior <rprior@protonmail.com>
To: Nicolas Graves <ngraves@ngraves.fr>,
	"licensing@fsf.org" <licensing@fsf.org>
Cc: guix-devel@gnu.org
Subject: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
Date: Mon, 03 Apr 2023 18:07:19 +0000	[thread overview]
Message-ID: <xanfHBZT3lYlyrr_OqHHMWkunLeZZlcxzY37_T3TeZo7mfJClD5-OTbkXDH2f3lMTkn94YIFVUj-Z31BP2Wj0W2rISNP6glC2PzXcPdb560=@protonmail.com> (raw)


Hi there FSF Licensing! (CC: Guix devel, Nicholas Graves) This morning I read through the FSDG to see if it gives any guidance on when machine learning model weights are appropriate for inclusion in a free system. It does not seem to offer much.

Many ML models are advertising themselves as "open source", including the llama model that Nicholas (quoted below) is interested in including into Guix. However, according to what I can find in Meta's announcement (https://ai.facebook.com/blog/large-language-model-llama-meta-ai/) and the project's documentation (https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) the model itself is not covered by the GPLv3 but rather "a noncommercial license focused on research use cases." I cannot find the full text of this license anywhere in 20 minutes of searching, perhaps others have better ideas how to find it or perhaps the Meta team would provide a copy if we ask.

Free systems will see incentive to include trained models in their distributions to support use cases like automatic live transcription of audio, recognition of objects in photos and video, and natural language-driven help and documentation features. I hope we can update the FSDG to help ensure that any such inclusion fully meets the requirements of freedom for all our users.

Cheers,
Ryan


------- Original Message -------
On Monday, April 3rd, 2023 at 4:48 PM, Nicolas Graves via "Development of GNU Guix and the GNU System distribution." <guix-devel@gnu.org> wrote:


> 
> 
> 
> Hi Guix!
> 
> I've recently contributed a few tools that make a few OSS machine
> learning programs usable for Guix, namely nerd-dictation for dictation
> and llama-cpp as a converstional bot.
> 
> In the first case, I would also like to contribute parameters of some
> localized models so that they can be used more easily through Guix. I've
> already discussed this subject when submitting these patches, without a
> clear answer.
> 
> In the case of nerd-dictation, the model parameters that can be used
> are listed here : https://alphacephei.com/vosk/models
> 
> One caveat is that using all these models can take a lot of space on the
> servers, a burden which is not useful because no build step are really
> needed (except an unzip step). In this case, we can use the
> #:substitutable? #f flag. You can find an example of some of these
> packages right here :
> https://git.sr.ht/~ngraves/dotfiles/tree/main/item/packages.scm
> 
> So my question is: Should we add this type of models in packages for
> Guix? If yes, where should we put them? In machine-learning.scm? In a
> new file machine-learning-models.scm (such a file would never need new
> modules, and it might avoid some confusion between the tools and the
> parameters needed to use the tools)?
> 
> 
> --
> Best regards,
> Nicolas Graves


             reply	other threads:[~2023-04-03 18:08 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-03 18:07 Ryan Prior [this message]
2023-04-03 20:48 ` Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?) Nicolas Graves via Development of GNU Guix and the GNU System distribution.
2023-04-03 21:18   ` Jack Hill
2023-04-06  8:42 ` Simon Tournier
2023-04-06 13:41   ` Kyle
2023-04-06 14:53     ` Simon Tournier
2023-05-13  4:13   ` 宋文武
2023-05-15 11:18     ` Simon Tournier
2023-05-26 15:37       ` Ludovic Courtès
2023-05-29  3:57         ` zamfofex
2023-05-30 13:15         ` Simon Tournier
2023-07-02 19:51           ` Ludovic Courtès
2023-07-03  9:39             ` Simon Tournier
2023-07-04 13:05               ` zamfofex
2023-07-04 20:03                 ` Vagrant Cascadian
  -- strict thread matches above, loose matches on Subject: below --
2023-04-07  5:50 Nathan Dehnel
2023-04-07  9:42 ` Simon Tournier
2023-04-08 10:21   ` Nathan Dehnel
2023-04-11  8:37     ` Simon Tournier
2023-04-11 12:41       ` Nathan Dehnel
2023-04-12  9:32         ` Csepp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='xanfHBZT3lYlyrr_OqHHMWkunLeZZlcxzY37_T3TeZo7mfJClD5-OTbkXDH2f3lMTkn94YIFVUj-Z31BP2Wj0W2rISNP6glC2PzXcPdb560=@protonmail.com' \
    --to=rprior@protonmail.com \
    --cc=guix-devel@gnu.org \
    --cc=licensing@fsf.org \
    --cc=ngraves@ngraves.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).