Date: Mon, 03 Apr 2023 18:07:19 +0000
To: Nicolas Graves, "licensing@fsf.org"
From: Ryan Prior
Cc: guix-devel@gnu.org
Subject: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)

Hi there FSF Licensing! (CC: Guix devel, Nicolas Graves)

This morning I read through the FSDG to see if it gives any guidance on when
machine learning model weights are appropriate for inclusion in a free system.
It does not seem to offer much.

Many ML models advertise themselves as "open source", including the llama
model that Nicolas (quoted below) is interested in including in Guix.
However, according to what I can find in Meta's announcement
(https://ai.facebook.com/blog/large-language-model-llama-meta-ai/) and the
project's documentation
(https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md), the model
itself is not covered by the GPLv3 but rather by "a noncommercial license
focused on research use cases." I could not find the full text of this license
in 20 minutes of searching. Perhaps others have better ideas for how to find
it, or perhaps the Meta team would provide a copy if we ask.

Free systems will have an incentive to include trained models in their
distributions to support use cases like automatic live transcription of audio,
recognition of objects in photos and video, and natural language-driven help
and documentation features. I hope we can update the FSDG to help ensure that
any such inclusion fully meets the requirements of freedom for all our users.

Cheers,
Ryan

------- Original Message -------
On Monday, April 3rd, 2023 at 4:48 PM, Nicolas Graves via "Development of GNU Guix and the GNU System distribution." wrote:

>
>
>
> Hi Guix!
>
> I've recently contributed a few tools that make a few OSS machine
> learning programs usable for Guix, namely nerd-dictation for dictation
> and llama-cpp as a conversational bot.
>
> In the first case, I would also like to contribute parameters of some
> localized models so that they can be used more easily through Guix. I've
> already discussed this subject when submitting these patches, without a
> clear answer.
>
> In the case of nerd-dictation, the model parameters that can be used
> are listed here: https://alphacephei.com/vosk/models
>
> One caveat is that using all these models can take a lot of space on the
> servers, a burden which is not useful because no build steps are really
> needed (except an unzip step). In this case, we can use the
> #:substitutable? #f flag. You can find an example of some of these
> packages right here:
> https://git.sr.ht/~ngraves/dotfiles/tree/main/item/packages.scm
>
> So my question is: Should we add this type of model in packages for
> Guix? If yes, where should we put them? In machine-learning.scm? In a
> new file machine-learning-models.scm (such a file would never need new
> modules, and it might avoid some confusion between the tools and the
> parameters needed to use the tools)?
>
>
> --
> Best regards,
> Nicolas Graves