On 8/17/2023 10:08 AM, Daniel Fleischer wrote:
> That is not accurate; LLMs can definitely run locally on your machine.
> Models can be downloaded and run using Python. Here is an LLM released
> under Apache 2 license [0]. There are "black-box" models, served in the
> cloud, but the revolution we're in is precisely because many models are
> released freely and can be run (and trained) locally, even on a laptop.
>
> [0] https://huggingface.co/mosaicml/mpt-7b
The link says that this model has been pretrained, which is certainly
useful for the average person who doesn't want (or doesn't have the
resources) to perform the training themselves, but from the
documentation, it's not clear how I *would* perform the training myself
if I were so inclined. (I've only toyed with LLMs, so I'm not an expert
at more "advanced" cases like this.)
Training these models is fairly straightforward, at least if you are familiar with the area. The transformer implementation from the original "Attention Is All You Need" paper is at
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py under an Apache license, and the LLMs we are talking about here use the same technique to train and execute. They change some parameters and add things like more attention heads, but keep the fundamental architecture the same.
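To give a feel for how small the core of that architecture is, here is a minimal NumPy sketch of scaled dot-product attention with a configurable number of heads. It is my own illustration, not the tensor2tensor code: the function names, the random weights, and the tiny dimensions are all made up, and it leaves out masking, layer normalization, and the feed-forward layers that a real model has.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    """Compute softmax(q k^T / sqrt(d_k)) v and return (output, weights)."""
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over x (seq_len, d_model), split across num_heads heads."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    out, _ = scaled_dot_product_attention(q, k, v)
    # Concatenate the heads again: (num_heads, seq_len, d_head) -> (seq_len, d_model)
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ w_o

# Hypothetical toy sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out_2 = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=2)
out_4 = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=4)
```

Changing num_heads only changes how the same projections are sliced up, which is the sense in which "more attention heads" is a parameter change rather than a new architecture.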
I'm not an expert, but I believe that due to the use of stochastic processes in training, even if you had the exact code, parameters and data used in training, you would never be able to reproduce the exact model they make available. The result should perhaps be equivalent in quality, but the weights would not be the same.
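As a toy illustration of that point (again my own sketch, with a made-up train function and a one-variable linear model standing in for an LLM), the same code and data trained with two different random seeds end up with different weights but about the same loss:

```python
import numpy as np

def train(seed, steps=500, lr=0.05):
    """Fit y = 3x + 1 with minibatch SGD; the seed controls init and batch order."""
    # The dataset is identical for every run (fixed data seed).
    data_rng = np.random.default_rng(1234)
    x = data_rng.uniform(-1, 1, size=200)
    y = 3.0 * x + 1.0 + data_rng.normal(scale=0.1, size=200)

    # Everything stochastic about training comes from this generator.
    rng = np.random.default_rng(seed)
    w, b = rng.normal(size=2)
    for _ in range(steps):
        idx = rng.integers(0, len(x), size=16)   # random minibatch
        err = w * x[idx] + b - y[idx]
        w -= lr * 2 * np.mean(err * x[idx])      # gradient of mean squared error
        b -= lr * 2 * np.mean(err)
    loss = np.mean((w * x + b - y) ** 2)
    return (w, b), loss

params_a, loss_a = train(seed=1)
params_b, loss_b = train(seed=2)
params_a_again, _ = train(seed=1)

print("seed 1:", params_a, loss_a)
print("seed 2:", params_b, loss_b)
```

In this toy, rerunning with the same seed does give back identical weights. My assumption is that a real training run is much harder to pin down, since it has other sources of randomness (for example the order of floating-point operations on parallel hardware) that a seed alone does not necessarily control.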
I do see that the documentation mentions the training datasets used, but
it also says that "great efforts have been taken to clean the
pretraining data". Am I able to access the cleaned datasets? I looked
over their blog post[1], but I didn't see anything describing this in
detail.
While I certainly appreciate the effort people are making to produce
LLMs that are more open than OpenAI (a low bar), I'm not sure if
providing several gigabytes of model weights in binary format is really
providing the *source*. It's true that you can still edit these models
in a sense by fine-tuning them, but you could say the same thing about a
project that only provided the generated output from GNU Bison, instead
of the original input to Bison.
To me, it should be about freedom. Not absolute freedom, but relative freedom: do you, the user, have the same amount of freedom as anyone else, including the creator? For the LLMs hosted on Hugging Face and many other research LLMs, the answer is yes. You have the freedom to fine-tune the model, as does the creator. You cannot change the base model in any meaningful way, but neither can the creator, because no one knows how to do that yet. You cannot understand the model, but neither can the creator: while some progress has been made in understanding simple things about simple LLMs like GPT-2, the modern LLMs are too complex for anyone to make sense of.
(Just to be clear, I don't mean any of the above to be leading
questions. I really don't know the answers, and using analogies to
previous cases like Bison can only get us so far. I truly hope there
*is* a freedom-respecting way to interface with LLMs, but I also think
it's worth taking some extra care at the beginning so we can choose the
right path forward.)
[1] https://www.mosaicml.com/blog/mpt-7b