Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm))

unofficial mirror of emacs-tangents@gnu.org

From: Ihor Radchenko <yantar92@posteo.net>
To: rms@gnu.org
Cc: emacs-tangents@gnu.org, jporterbugs@gmail.com, ahyatt@gmail.com,
	team@khoj.dev
Subject: Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm))
Date: Sat, 09 Sep 2023 10:28:35 +0000
Message-ID: <87fs3n7i98.fsf@localhost>
In-Reply-To: <E1qelzp-0003lR-Nq@fencepost.gnu.org>

Richard Stallman <rms@gnu.org> writes:

> 1. In Wikipedia, a contributor voluntarily chooses to participate in editing.
> Editing participation is separate from consulting the encyclopedia.  This
> fits the word "collaborating".
>
> By contrast, when the developers of ChatGPT make it learn from the
> user, that "contribution" is neither voluntary nor active.  It is more
> "being taken advantage of" than "collaborating".

It is actually voluntary now. According to
https://techunwrapped.com/you-can-now-make-chatgpt-not-train-with-your-queries/,
one can disable or enable training on user queries.
It is enabled by default, though, so it is opt-out rather than opt-in.

> 2. Wikipedia is a community project to develop a free/libre work.  (It
> is no coincidence that this resembles the GNU Project.)  Morally it
> deserves community support, despite some things it handles badly.
>
> By contrast, ChatGPT is neither a community project nor free/libre.
> That's perhaps why it arranges to manipulate people into "contributing"
> rather than letting them choose.

Indeed, they do hold coercive power, as people have no way to copy the
model and run it independently.

However, I do not care much about OpenAI's corporate practices - they are
as bad as what we are used to from other big tech SaaSS companies. A more
interesting question to discuss is a genuinely collaborative effort to
train a libre (not ChatGPT) model.

Currently, improving models is a rather sequential process. If there is
one publicly available model, anyone can download the weights, train
them further locally, and share the results. However, if multiple people
take the _same_ version of the model and train it independently, the
results, AFAIK, cannot be combined.

As Andrew mentioned, the approach of "patching" a model is quite an
interesting idea - if such "patches" can be combined, we can
get rid of the above obstacle to collaborative _ethical_ development of
models.
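
To make the "patch" idea concrete, here is a toy sketch in Python. It
assumes that a patch is simply the difference between fine-tuned and
base weights, and that patches are combined by adding them to the base.
That is an assumption for illustration, not a description of how any
particular patching technique works, and the function names are made
up. Whether such a sum keeps the improvements of each patch in a real
network is exactly the open question.

```python
# Toy illustration only: a "model" is a dict of named weight lists,
# and a "patch" is the per-weight difference from a shared base model.
# make_patch and apply_patches are hypothetical names.

def make_patch(base, tuned):
    """Return the delta that turns `base` into `tuned`."""
    return {name: [t - b for b, t in zip(base[name], tuned[name])]
            for name in base}

def apply_patches(base, patches):
    """Add every patch to a copy of the base weights, one after another."""
    result = {name: list(values) for name, values in base.items()}
    for patch in patches:
        for name, delta in patch.items():
            result[name] = [w + d for w, d in zip(result[name], delta)]
    return result

base = {"layer": [1.0, 2.0, 3.0]}
tuned_by_alice = {"layer": [1.5, 2.0, 3.0]}  # changed the first weight
tuned_by_bob = {"layer": [1.0, 2.0, 2.0]}    # changed the last weight

patch_a = make_patch(base, tuned_by_alice)
patch_b = make_patch(base, tuned_by_bob)

# Both contributors started from the same base, so their deltas can be
# stacked on top of it without either one overwriting the other.
combined = apply_patches(base, [patch_a, patch_b])
print(combined)  # {'layer': [1.5, 2.0, 2.0]}
```

In this toy case the two patches touch different weights, so the sum is
unambiguous. When they touch the same weights, the sum is still defined,
but nothing guarantees the result behaves well.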

However, if the "patching" technology can only serve a single "patch" on
top of the main model, there is a problem. Improving libre neural
networks will become difficult unless people use a collaborative server
to continuously improve a model.

Such a collaborative server, similar to ChatGPT, will combine "editing"
(training) and "consulting" together. And, unlike in Wikipedia, these
activities are hard to separate.

This raises a moral question about practical ways to improve libre
neural networks without falling into SaaSS practices.

As a practical example, there is https://github.com/khoj-ai/khoj/, a
libre neural network interface in development (it features Emacs support).
They recently started the https://khoj.dev/ cloud, aimed at people who
cannot afford to run the models locally. This discussion might be one of
the ethical considerations in using such a cloud.

I CCed the khoj devs.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
