From: Ihor Radchenko
Newsgroups: gmane.emacs.tangents
Subject: Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm))
Date: Sat, 09 Sep 2023 10:28:35 +0000
Message-ID: <87fs3n7i98.fsf@localhost>
References: <87v8d0iqa5.fsf@posteo.net> <87cyz3vaws.fsf@localhost> <87a5tzsbvl.fsf@localhost>
To: rms@gnu.org
Cc: emacs-tangents@gnu.org, jporterbugs@gmail.com, ahyatt@gmail.com, team@khoj.dev
List-Id: Emacs news and miscellaneous discussions outside the scope of other Emacs mailing lists

Richard Stallman writes:

> 1. In Wikipedia, a contributor voluntarily chooses to participate in editing.
> Editing participation is separate from consulting the encyclopedia. This
> fits the word "collaborating".
>
> By contrast, when the developers of ChatGPT make it learn from the
> user, that "contribution" is neither voluntary nor active.
> It is more "being taken advantage of" than "collaborating".

It is actually voluntary now - according to
https://techunwrapped.com/you-can-now-make-chatgpt-not-train-with-your-queries/,
one can disable or enable training on user queries. By default, it is
enabled though.

> 2. Wikipedia is a community project to develop a free/libre work. (It
> is no coincidence that this resembles the GNU Project.) Morally it
> deserves community support, despite some things it handles badly.
>
> By contrast, ChatGPT is neither a community project nor free/libre.
> That's perhaps why it arranges to manipulate people into "contributing"
> rather than letting them choose.

Indeed, they do hold coercive power, as people cannot copy the model or
run it independently. However, I do not care much about OpenAI's
corporate practices - they are as bad as we are used to in other big
tech SaaSS companies.

A more interesting question to discuss might be a genuine collaborative
effort to train a libre (not ChatGPT) model.

Currently, improving models is a rather sequential process. If there is
one publicly available model, anyone can download the weights, train
them locally, and share the results. However, if multiple people take
the _same_ version of the model and train it, the results, AFAIK,
cannot be combined.

As Andrew mentioned, the approach of "patching" a model is quite an
interesting idea - if such "patches" can be combined, we can get rid of
the above concern about collaborative _ethical_ development of models.

However, if the "patching" technology can only serve a single "patch" +
main model, there is a problem. Improving libre neural networks will
become difficult unless people use a collaborative server to
continuously improve a model.

Such a collaborative server, similar to ChatGPT, will combine "editing"
(training) and "consulting" together. And, unlike Wikipedia, these
activities are hard to separate.
This raises a moral question about practical ways to improve libre
neural networks without falling into SaaSS practices.

As a practical example, there is https://github.com/khoj-ai/khoj/, a
libre neural network interface in development (it features Emacs
support). They recently started the https://khoj.dev/ cloud service,
aimed at people who cannot afford to run the models locally. This
discussion might be one of the ethical considerations of using such a
cloud service.

I have CCed the khoj devs.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at