From: Vagrant Cascadian
To: zamfofex, Simon Tournier, Ludovic Courtès
Cc: 宋文武, Ryan Prior, Nicolas Graves, guix-devel@gnu.org
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
In-Reply-To: <1353752735.686806.1688475901148@privateemail.com>
References: <868rf5e71j.fsf@gmail.com> <87ilcweumh.fsf@envs.net> <87v8gtzvu3.fsf@gmail.com> <87r0r3je82.fsf@gnu.org> <87wn0qrmdx.fsf@gmail.com> <87cz1aum5j.fsf@gnu.org> <87wmzh5o5v.fsf@gmail.com> <1353752735.686806.1688475901148@privateemail.com>
Date: Tue, 04 Jul 2023 13:03:13 -0700
Message-ID: <875y6zxx4u.fsf@wireframe>
List-Id: "Development of GNU Guix and the GNU System distribution."

On 2023-07-04, zamfofex wrote:
>> On 07/03/2023 6:39 AM -03 Simon Tournier wrote:
>>
>> Well, I do not see any difference between pre-trained weights and icons
>> or sound or good fitted-parameters (e.g., the package
>> python-scikit-learn has a lot ;-)). As I said elsewhere, I do not see
>> the difference between pre-trained neural network weights and genomic
>> references (e.g., the package r-bsgenome-hsapiens-1000genomes-hs37d5).
>
> I feel like, although this might (arguably) not be the case for
> leela-zero nor Lc0 specifically, for certain machine learning
> projects, a pretrained network can affect the program’s behavior so
> deeply that it might be considered a program itself! Such networks
> usually approximate an arbitrary function. The more complex the model
> is, the more complex the behavior of this function can be, and thus
> the closer to being an arbitrary program it is.
>
> But this “program” has no source code, it is effectively created in
> this binary form that is difficult to analyse.
>
> In any case, I feel like the issue Ludovic was talking about “user
> autonomy” is fairly relevant (as I understand it). For icons, images,
> and other similar kinds of assets, it is easy enough for the user to
> replace them, or create their own if they want. But for pretrained
> networks, even if they are under a free license, the user might not be
> able to easily create their own network that suits their purposes.
>
> For example, for an image recognition software, there might be data
> provided by the maintainers of the program that is able to recognise a
> specific set of objects in input images, but the user might want to
> use it to recognise a different kind of object. If it is too costly
> for the user to train a new network for their purposes (in terms of
> hardware and time required), the user is effectively entirely bound by
> the decisions of the maintainers of the software, and they can’t
> change it to suit their purposes.

For a more concrete example, with facial recognition in particular, many
models are quite good at recognition of faces of people of predominantly
white European descent, and not very good with people of other
backgrounds, in particular with darker skin. The models frequently
reflect the blatant and subtle biases of the society in which they are
created, and the creators who develop the models.

This can have disastrous consequences when using these models without
that understanding... (or even if you do understand the general biases!)

This seems like a significant issue for user freedom; with source code,
you can at least in theory examine the biases of the software you are
using.

live well,
  vagrant