From: Csepp <raingloom@riseup.net>
To: Nathan Dehnel <ncdehnel@gmail.com>
Cc: Simon Tournier <zimon.toutoune@gmail.com>,
rprior@protonmail.com, guix-devel@gnu.org
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
Date: Wed, 12 Apr 2023 11:32:34 +0200
Message-ID: <87sfd5qvub.fsf@riseup.net>
In-Reply-To: <CAEEhgEtzQiDXUQk2+z5HYr9dV=dFzo0W_s7ROcryf9c=0A_-2g@mail.gmail.com>
Nathan Dehnel <ncdehnel@gmail.com> writes:
> a) Bit-identical re-training of ML models is similar to #2; others said
> that bit-identical re-training of ML model weights does not protect
> much against biased training. The only protection against biased
> training is human expertise.
>
> Yeah, I didn't mean to give the impression that I thought
> bit-reproducibility was the silver bullet for AI backdoors with that
> analogy. I guess my argument is this: if they release the training
> info, either 1) it does not produce the bias/backdoor of the trained
> model, so there's no problem, or 2) it does, in which case an expert
> will be able to look at it and go "wait, that's not right", and will
> raise an alarm, and it will go public. The expert does not need to be
> affiliated with guix, but guix will eventually hear about it. Similar
> to how a normal security vulnerability works.
>
> b) The resources (human, financial, hardware, etc.) for re-training are,
> in most cases, not affordable. Not because it would be
> difficult or because the task is complex (that is covered by
> point a)), but because the resource requirements are
> just too high.
>
> Maybe distributed substitutes could change that equation?
Probably not; it would require distributed *builds*. Right now Guix
can't even use distcc, so it definitely can't use remote GPUs.