From: Saku Laesvuori <saku@laesvuori.fi>
To: Nathan Dehnel <ncdehnel@gmail.com>
Cc: guix-devel@gnu.org
Subject: Re: Binary descriptors for OpenCV
Date: Thu, 3 Aug 2023 09:18:50 +0300
Message-ID: <20230803061850.lugers327lcxcrrq@X-kone>
In-Reply-To: <CAEEhgEuXekoBhcNquuLkHVeUP86PZ3z8px1_r=2K6yWrQfUKcg@mail.gmail.com>
> >You can always check what kind of data the program gives to the
> >neural network as the program is free software. If the data is valid
> >runtime input it is also valid training data.
>
> That's not necessarily true. Like an image generating program will be
> trained on image + caption pairs, but running it involves giving it
> just the captions. Thus, running the model doesn't inherently show you
> how to retrain the model.

In that case, aren't the captions the input to the model and the images
the output? The training process tries to minimize the error between the
correct output from the data set and the output the model generates.
From using the model you know what formats are expected as input and
output, so you do have the information required for retraining.

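To illustrate what "minimize the error between the correct output and
the generated output" means, here is a minimal sketch in Python. It
assumes a toy linear model and a mean-squared-error loss; the names
`predict`, `mse` and `train` and the data are hypothetical and do not
come from any program discussed in this thread. The only thing the
training loop needs to know about the data is the (input, output) format.

```python
def predict(weights, x):
    """Toy model: one input number in, one output number out."""
    w, b = weights
    return w * x + b


def mse(weights, pairs):
    """Mean squared error between correct and generated outputs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in pairs) / len(pairs)


def train(weights, pairs, lr=0.05, steps=2000):
    """Plain gradient descent on the mean squared error."""
    w, b = weights
    n = len(pairs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


# Hypothetical data set of (input, correct output) pairs from y = 2x + 1.
pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
trained = train((0.0, 0.0), pairs)
```

After training, `trained` should be close to `(2.0, 1.0)`, the parameters
that generated the data.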
> >You can't exactly *know* that any extra training doesn't break the
> >model but the same holds for editing the original training data.
>
> You can know with more certainty that it doesn't break the model.

Well, that depends on what kind of data editing and extra training we
are comparing. If we remove a tiny bit from the original data set, that
is obviously less likely to break the model than retraining it with bad
data. But if you had a new training data set, retraining the pretrained
model on it would be no different from adding the new data set to the
original training data and training from scratch.
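
The comparison above can be sketched with a toy example in Python. This
is only an illustration under strong assumptions: a convex linear model,
a hypothetical `train` function, and made-up data where the old and new
sets follow the same rule (y = 2x + 1). It does not show that the same
holds for real neural networks.

```python
def train(weights, pairs, lr=0.05, steps=2000):
    """Gradient descent on mean squared error for the model y = w*x + b."""
    w, b = weights
    n = len(pairs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


# Hypothetical data: both sets are drawn from y = 2x + 1.
original_data = [(0.0, 1.0), (1.0, 3.0)]
new_data = [(2.0, 5.0), (3.0, 7.0)]

# Option 1: take the pretrained weights and keep training on the new set.
pretrained = train((0.0, 0.0), original_data)
continued = train(pretrained, new_data)

# Option 2: merge the new set into the original one and start over.
from_scratch = train((0.0, 0.0), original_data + new_data)
```

In this toy case both options use the same training loop and end up with
nearly the same weights; only the starting weights and the data passed in
differ.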

I should probably mention that I have never tried retraining models in
practice and I'm basing all my arguments on my theoretical understanding
of how machine learning models work.