* [NonGNU ELPA] New package: llm
@ 2023-08-07 23:54 Andrew Hyatt (68+ messages in thread)
From: Andrew Hyatt
To: Emacs-Devel devel

Hi everyone,

I've created a new package called llm, for the purpose of abstracting the interface to various large language model providers. There are many LLM packages already, but it would be wasteful for all of them to try to be compatible with a range of LLM provider APIs (local LLMs such as Llama 2, and API providers such as OpenAI and Google Cloud's Vertex). This package attempts to solve this problem by defining generic functions which can then be implemented by different LLM providers. I have started with just two: OpenAI and Vertex. Llama 2 would be a next choice, but I don't yet have it working on my system. In addition, I'm starting with just two pieces of core functionality: chat and embeddings. Extending to async is probably something that I will do next.

You can see the code at https://github.com/ahyatt/llm.

I would prefer that this go to NonGNU ELPA, because I suspect people would like to contribute interfaces to different LLMs, and not all of them will have FSF papers.

Your thoughts would be appreciated, thank you!

^ permalink raw reply [flat|nested] 68+ messages in thread
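[Editor's illustration] To make the generic-function design concrete, here is a minimal sketch of what such an abstraction could look like. The names (`llm-chat`, `llm-embedding`, `llm-openai`) are illustrative assumptions based on the description above, not necessarily the package's actual API:

```elisp
;; Sketch only: illustrative names, not necessarily llm.el's real API.
(require 'cl-lib)

;; In llm.el: provider-agnostic generic functions that clients call.
(cl-defgeneric llm-chat (provider prompt)
  "Send PROMPT to PROVIDER and return the response text.")

(cl-defgeneric llm-embedding (provider string)
  "Return a vector embedding of STRING computed by PROVIDER.")

;; In a separate library such as llm-openai.el: one implementation.
(cl-defstruct llm-openai key chat-model)

(cl-defmethod llm-chat ((provider llm-openai) prompt)
  "Hypothetical implementation that would call the provider's HTTP API."
  ;; A real implementation would issue an HTTP request authenticated
  ;; with (llm-openai-key provider); the network code is elided here.
  (ignore provider prompt)
  (error "Sketch only: no network code"))
```

A client package would then depend only on the generics in llm.el; the provider struct it is handed decides which implementation runs.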
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-08  5:42 Philip Kaludercic
From: Philip Kaludercic
To: Andrew Hyatt
Cc: Emacs-Devel devel

Andrew Hyatt <ahyatt@gmail.com> writes:

> Hi everyone,
>
> I've created a new package called llm, for the purpose of abstracting the
> interface to various large language model providers. There are many LLM
> packages already, but it would be wasteful for all of them to try to be
> compatible with a range of LLM provider APIs (local LLMs such as Llama 2,
> and API providers such as OpenAI and Google Cloud's Vertex). [...]

Llama was the model that could be executed locally, and the other two are "real" services, right?

> You can see the code at https://github.com/ahyatt/llm.
>
> I would prefer that this go to NonGNU ELPA, because I suspect people would
> like to contribute interfaces to different LLMs, and not all of them will
> have FSF papers.

I cannot estimate how important LLMs will be in the future, but it might be worth having something like this in the core at some point. Considering that a module seems to be around 150-200 lines, and that new models appear relatively infrequently (at least to my understanding), I don't know if the "advantage" of accepting contributions from people who haven't signed the CA carries that much weight, as opposed to the general benefit that all users would enjoy from having the technology integrated into Emacs itself, in a way that other packages (and perhaps even the core help system) could profit from it.

> Your thoughts would be appreciated, thank you!
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-08 15:08 Spencer Baugh
From: Spencer Baugh
To: emacs-devel

Philip Kaludercic <philipk@posteo.net> writes:

> in a way that other packages (and perhaps even the core help
> system) could profit from it.

Now I'm imagining all kinds of integration with C-h to automatically help users with Emacs tasks, such as an analog of apropos-command which queries an LLM for help. And maybe integration with M-x report-emacs-bug. And maybe an M-x doctor which *actually works* :)
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-08 15:09 Andrew Hyatt
From: Andrew Hyatt
To: Philip Kaludercic
Cc: Emacs-Devel devel

On Tue, Aug 8, 2023 at 1:42 AM Philip Kaludercic <philipk@posteo.net> wrote:

> Llama was the model that could be executed locally, and the other two
> are "real" services, right?

That's correct.

> I cannot estimate how important LLMs will be in the future, but it
> might be worth having something like this in the core at some point.
> Considering that a module seems to be around 150-200 lines, and that
> new models appear relatively infrequently (at least to my
> understanding), I don't know if the "advantage" of accepting
> contributions from people who haven't signed the CA carries that much
> weight, as opposed to the general benefit that all users would enjoy
> from having the technology integrated into Emacs itself, in a way that
> other packages (and perhaps even the core help system) could profit
> from it.

That seems reasonable. I don't have a strong opinion here, so if others want to see this in GNU ELPA instead, I'm happy to do that.

> Your thoughts would be appreciated, thank you!
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-09  3:47 Richard Stallman
From: Richard Stallman
To: Andrew Hyatt
Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

All the large language models are unjust -- either the models are nonfree software, released under a license that denies freedom 0 (https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html), or they are not released at all, only made available for use as SaaSS (https://www.gnu.org/philosophy/who-does-that-server-really-serve.html).

If a language model is much better known than GNU Emacs, it is ok to have code in Emacs to make it more convenient to use Emacs along with the language model. If the language model is not so well known, then Emacs should not mention it _in any way_. This is in the GNU coding standards.

If Emacs is to have commands specifically to support them, we should make those commands inform the user -- every user of each of those commands -- of how they mistreat the user.

It is enough to display a message explaining the situation in a way that it will really be seen. Displaying this for the first invocation on each day would be sufficient. Doing it more often would be annoying.

Would someone please implement this?

--
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-09  4:37 Andrew Hyatt
From: Andrew Hyatt
To: rms
Cc: emacs-devel

On Tue, Aug 8, 2023 at 11:47 PM Richard Stallman <rms@gnu.org> wrote:

> It is enough to display a message explaining the situation
> in a way that it will really be seen.
>
> Displaying this for the first invocation on each day
> would be sufficient. Doing it more often would be annoying.

What you are saying is consistent with the GNU coding standards. However, I think any message about this would be annoying, personally, and would be a deterrent for clients to use this library.

How about this, which I think would satisfy your concerns: we contribute ONLY llm.el, which mentions no implementations of LLMs, no companies, and no specific language models, to GNU ELPA. With only the interface, I believe there is nothing to warn the user about, and clients have something in GNU ELPA to code against. If some day there is an LLM that qualifies for inclusion because it is sufficiently free, it can be added to GNU ELPA as well.

All implementations can then separately be made available on some other package library not associated with GNU. In this scenario, I wouldn't have warnings on those implementations, just as the many LLM-based packages on various alternative ELPAs do not have warnings today.

If it still seems wrong to you to have an interface in GNU ELPA whose most popular interfaces today involve non-free software, then perhaps it might be best to leave this package out of GNU or NonGNU ELPA for now. I think that would be a shame, since the flexibility it provides is likely the only good hedge against over-reliance on SaaS LLMs, especially in the case that acceptable LLMs are developed. Such LLMs are likely available today (research ones come to mind), but I don't have good visibility into that world or the likelihood that they would be useful to Emacs users.

> Would someone please implement this?
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-13  1:43 Richard Stallman
From: Richard Stallman
To: Andrew Hyatt
Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> What you are saying is consistent with the GNU coding standards. However, I
> think any message about this would be annoying, personally, and would be a
> deterrent for clients to use this library.

I am sure it would be a little annoying. But assuming the user can type SPC and move on from that message, the annoyance will be quite little.

If the library is quite useful, I doubt anyone would be deterred. If anyone minded the message enough to stop using the package, perse could edit it out of the code.

This issue is an example of those where two different values are pertinent. There is convenience, which counts but is superficial. And there is the purpose of the GNU system, which for 40 years has led the fight against injustice in software. That value is deep and, in the long term, the most important value of all.

When they conflict in a specific practical matter, there is always pressure to prioritize convenience. But that is not wise. The right approach is to look for a compromise which serves both goals. I am sure we can find one here.

I suggested showing the message once a day, because that is what first occurred to me. But there are lots of ways to vary the details of the compromise. Here's an idea. For each language model, it could display the message the first, second, fifth, and tenth time the user starts that mode, and after that every tenth time. With this method, the frequency of little annoyance will diminish quickly, but the point will not be forgotten.

As long as we do not overvalue minor inconvenience, there will be good solutions.

--
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
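[Editor's illustration] The decaying schedule sketched above (warn on the 1st, 2nd, 5th, and 10th use of each model, then every 10th) is simple to express in Emacs Lisp. All names here are made up for illustration; this is not code from any actual package:

```elisp
(require 'cl-lib)

(defvar my-llm-use-counts (make-hash-table :test #'equal)
  "How many times each language model has been invoked.")

(defun my-llm-maybe-warn (model)
  "Warn about nonfree MODEL on a decaying schedule.
Warns on the 1st, 2nd, 5th, and 10th invocation, then every 10th."
  (let ((n (cl-incf (gethash model my-llm-use-counts 0))))
    (when (or (memql n '(1 2 5 10))
              (and (> n 10) (zerop (mod n 10))))
      (lwarn 'llm :warning
             "%s is a nonfree service; see the documentation for details"
             model))))
```

A persistent variant would save the counts across sessions; the in-memory hash table above resets the schedule each time Emacs starts.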
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-13  1:43 Richard Stallman
From: Richard Stallman
To: Andrew Hyatt
Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> What you are saying is consistent with the GNU coding standards. However, I
> think any message about this would be annoying, personally, and would be a
> deterrent for clients to use this library.

I am sure it would be a little annoying. But assuming the user can type SPC and move on from that message, the annoyance will be quite little.

If the library is quite useful, I doubt anyone would be deterred. If anyone minded the message enough to stop using the package, perse could edit it out of the code.

This issue is an example of those where two different values are pertinent. There is convenience, which counts but is superficial. And there is the purpose of the GNU system, which for 40 years has led the fight against injustice in software. That value is deep and, in the long term, the most important value of all.

When they conflict in a specific practical matter, there is always pressure to prioritize convenience. But that is not wise. The right approach is to look for a compromise which serves both goals. I am sure we can find one here.

I suggested showing the message once a day, because that is what first occurred to me. But there are lots of ways to vary the details. Here's an idea. For each language model, it could display the message the first, second, fifth, and tenth time the user starts that mode, and after that every tenth time. With this, the frequency of little annoyance will diminish soon, but the point will not be forgotten.

You made suggestions for how to exclude more code from Emacs itself; support for obscure language models we probably should exclude. But there is no need to exclude the support for the well-known ones, as I've explained.

And we can do better than that! We can educate the users about what is wrong with those systems -- something that the media hysteria fails to mention at all. That is important -- let's use Emacs for it!

> All implementations can then separately be made available on some other
> package library not associated with GNU. In this scenario, I wouldn't have
> warnings on those implementations, just as the many LLM-based packages on
> various alternative ELPAs do not have warnings today.

They ought to show warnings -- the issue is exactly the same. We should not slide quietly into acceptance and normalization of a new systematic injustice. Opposing it is our job.

--
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-13  2:11 Emanuel Berg
From: Emanuel Berg
To: emacs-devel

Richard Stallman wrote:

> You made suggestions for how to exclude more code from Emacs
> itself; support for obscure language models we probably
> should exclude. But there is no need to exclude the support
> for the well-known ones, as I've explained.

We should include as much as possible, but it doesn't really matter whether we include it in vanilla Emacs or in a package in ELPA, as long as it is included. Rather, in vanilla Emacs, whenever something isn't included, the message would be: "you have opened a file for programming in X, which is currently partially unsupported in vanilla Emacs, but note there are 7 packages in ELPA, including a major mode, to do exactly that ...". And when enough people get annoyed by that message, one would consider it about time to move it from ELPA into vanilla Emacs ...

--
underground experts united
https://dataswamp.org/~incal
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-15  5:14 Andrew Hyatt
From: Andrew Hyatt
To: rms
Cc: emacs-devel

On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <rms@gnu.org> wrote:

> I suggested showing the message once a day, because that is what first
> occurred to me. But there are lots of ways to vary the details.
> Here's an idea. For each language model, it could display the message
> the first, second, fifth, and tenth time the user starts that mode, and
> after that every tenth time. With this, the frequency of little
> annoyance will diminish soon, but the point will not be forgotten.

Is there anything else in Emacs that does something similar? I'd like to look at how other modules do the same thing and how they communicate things to the user.

I believe we can output something, but at least some of the LLM calls are asynchronous, and, as a library, even when not async, we have no idea about the UI context we're in. Suddenly throwing up a window in a function that has no side effects seems unfriendly to clients of the library. Perhaps we could just use the "warn" function, which is more in line with what might be expected from a library. And the user can suppress the warning if needed.

> They ought to show warnings -- the issue is exactly the same.
>
> We should not slide quietly into acceptance and normalization of a new
> systematic injustice. Opposing it is our job.

I don't doubt that or disagree; I'd just rather we oppose it in documentation or code comments, not during runtime. The other packages aren't under GNU control, and their authors may have different philosophies. It would be unfortunate if that worked out to the disadvantage of users, who have for whatever reason chosen to use an LLM provider while being well aware that it is not a free system. I'm curious what others think.
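[Editor's illustration] The "warn" mechanism Andrew alludes to is Emacs's standard warning facility, which has exactly the property he wants: users who have seen the message can suppress it. A sketch, assuming a warning type like `(llm nonfree)` (a made-up name for illustration):

```elisp
;; The library displays the warning via the standard warning mechanism:
(lwarn '(llm nonfree) :warning
       "This LLM provider is a nonfree service")

;; A user who has read the warning can silence future pop-ups:
(add-to-list 'warning-suppress-types '(llm nonfree))

;; ...or keep it out of the *Warnings* buffer entirely:
(add-to-list 'warning-suppress-log-types '(llm nonfree))
```

Because `lwarn` only logs to a buffer (and optionally pops it up) rather than prompting, it is safe to call from asynchronous callbacks where the library has no idea what the user is doing.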
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-15 17:12 Jim Porter
From: Jim Porter
To: Andrew Hyatt, rms
Cc: emacs-devel

On 8/14/2023 10:14 PM, Andrew Hyatt wrote:

> I don't doubt that or disagree; I'd just rather we oppose it in
> documentation or code comments, not during runtime.

I'd be hesitant to add support for these LLMs even *with* a warning message at runtime. That's not to say there should never be a GNU project with support for any LLM, but that I think we should tread carefully. Among other things, I'm curious about what the *models* the LLMs use would mean to the FSF. Are they "just data", or should we treat them more like object code? What does an LLM that fully adheres to FSF principles actually look like?

I'm not personally aware of any official FSF stance on LLMs, so that would be the next step as I see it, before publishing any code. Again, that doesn't mean Emacs should never have an LLM package, just that some detailed guidance from the FSF would make it a lot clearer (to me, at least) how to progress.

- Jim
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-17  2:02 Richard Stallman
From: Richard Stallman
To: Jim Porter
Cc: ahyatt, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> Among other things, I'm curious about what the FSF would say
> about the *models* the LLMs use. Are they "just data", or should we
> treat them more like object code? What does an LLM that fully adheres to
> FSF principles actually look like?

I've been thinking about this, and my tentative conclusion is that that precise question is not crucial, because what is certain is that they are part of the control over the system's behavior. So they ought to be released under a free license.

In the examples I've heard of, that is never the case. Either they are secret -- users can only use them on a server, which is SaaSS; see https://gnu.org/philosophy/who-does-that-server-really-serve.html -- or they are released under nonfree licenses that restrict freedom 0; see https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html.

As I recall, we don't have a rule against features to interface with servers whose code is not released, and we certainly don't have a rule against code in Emacs to interact with nonfree software _provided said software is well known_ -- that is why it is ok to have code to interact with Windows and Android.

ISTR we have features in Emacs for talking to servers whose code is not released. But does anyone recall better than I do?

--
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-17  2:48 Andrew Hyatt
From: Andrew Hyatt
To: rms
Cc: Jim Porter, emacs-devel

On Wed, Aug 16, 2023 at 10:02 PM Richard Stallman <rms@gnu.org> wrote:

> ISTR we have features in Emacs for talking to servers whose code
> is not released. But does anyone recall better than I do?

There is the "excorporate" package on GNU ELPA that talks to corporate Exchange servers (although perhaps there are free variants that also speak this protocol?). There's the "metar" package on GNU ELPA that receives the weather from the metar system. A brief search didn't find any code for that, but it might exist.

The other interesting find was "sql-oracle", along with interfaces to other, similar nonfree SQL servers in the main Emacs Lisp sources. It is a server, although the interface used is local and mediated by a program. But it is an interface to nonfree utility software. There is no warning given, but a message in `sql--help-docstring' asks the user to consider free alternatives.
* Re: [NonGNU ELPA] New package: llm
@ 2023-08-19  1:51 Richard Stallman
From: Richard Stallman
To: Andrew Hyatt
Cc: jporterbugs, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> There's the "metar" package on GNU ELPA that receives the
> weather from the metar system.

Wikipedia says that METAR is a format, not a system. Where precisely does that package get the data? Servers run by whom?

Servers that publish data of public interest are normally NOT SaaSS. (Indeed, most servers are NOT SaaSS.) SaaSS means using a server to do computing that naturally is yours. You ask for some computing to be done, send the input, and get the output back. Services that give you METAR data do computing that you might find useful, but I think that computing isn't specifically yours, so it isn't SaaSS.

> A brief search didn't find any code for
> that, but it might exist.

I can't make sense of that. Didn't find any code for what?

> The other interesting find was "sql-oracle", along with interfaces to
> other, similar nonfree SQL servers in the main Emacs Lisp sources. It is
> a server, although the interface used is local and mediated by a program.
> But it is an interface to nonfree utility software. There is no warning
> given, but a message in `sql--help-docstring' asks the user to consider
> free alternatives.

This sounds like SaaSS to me. Maybe we should add such a warning here.

--
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
* Re: [NonGNU ELPA] New package: llm 2023-08-19 1:51 ` Richard Stallman @ 2023-08-19 9:08 ` Ihor Radchenko 2023-08-21 1:12 ` Richard Stallman 0 siblings, 1 reply; 68+ messages in thread From: Ihor Radchenko @ 2023-08-19 9:08 UTC (permalink / raw) To: rms; +Cc: Andrew Hyatt, jporterbugs, emacs-devel Richard Stallman <rms@gnu.org> writes: > > The other interesting find was "sql-oracle", as well as other nonfree > > similar sql servers in the main emacs lisp. It is a server, although the > > interface used is local and mediated by a program. But it is an interface > > to a nonfree utility software. There is no warning given, but a message in > > `sql--help-docstring' asks the user to consider free alternatives. > > This sounds like SaaSS to me. Maybe we should add such a warning here. AFAIU, this has been discussed recently in https://list.orgmode.org/orgmode/E1pJoMI-0001Rf-Rq@fencepost.gnu.org/ -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-19 9:08 ` Ihor Radchenko @ 2023-08-21 1:12 ` Richard Stallman 2023-08-21 8:26 ` Ihor Radchenko 0 siblings, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-08-21 1:12 UTC (permalink / raw) To: Ihor Radchenko; +Cc: ahyatt, jporterbugs, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > This sounds like SaaSS to me. Maybe we should add such a warning here. > AFAIU, this has been discussed recently in > https://list.orgmode.org/orgmode/E1pJoMI-0001Rf-Rq@fencepost.gnu.org/ That seems to be a copy of the message I sent. It looks like maybe there was a discussion after that. Could you please tell me what conclusions or ideas that discussion reached? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-21 1:12 ` Richard Stallman @ 2023-08-21 8:26 ` Ihor Radchenko 0 siblings, 0 replies; 68+ messages in thread From: Ihor Radchenko @ 2023-08-21 8:26 UTC (permalink / raw) To: rms; +Cc: ahyatt, jporterbugs, emacs-devel Richard Stallman <rms@gnu.org> writes: > Could you please tell me what conclusions or ideas that discussion > reached? Among other things, we have discussed Oracle SQL support in sql.el: https://list.orgmode.org/orgmode/E1pKtph-00082q-4Z@fencepost.gnu.org/ Richard Stallman <rms@gnu.org> writes: > ... > > The 'support' is essentially specialised comint based interfaces tweaked > > to work with the various SQL database engine command line clients such > > as psql for Postgres and sqlplus for Oracle. This involves codes to use > > the comint buffer to send commands/regions to the SQL client and read > > back the results and run interactive 'repl' like sessions with the > > client. > > Thanks. > > Based on our general policies, it is ok to do this. It is ok for > Postgres because that is free software. It is ok for Oracle because > that is widely known. Another relevant bit is related to the fact that Oracle SQL, through its free CLI, may actually connect to a SaaS server. https://list.orgmode.org/orgmode/E1pKtpq-00086w-9s@fencepost.gnu.org/ Richard Stallman <rms@gnu.org> writes: > ... > > I am not sure about SaaSS - even postgresql (free software) may be used > > as a service provider by running it on server the user does not control. > > For sure, it CAN be used that way. If a Lisp package is designed to > work with a subprocess, a user can certainly rig it to talk with a > remote server. It is the nature of free software that people can > customize it, even so as to do something foolish with it. When a user > does this, it's per responsibility, not ours. > > We should not distribute specific support or recommendations to use > the Lisp package in that particular way. 
I also suggested the following, although I did not yet find time to open a discussion on emacs-devel: https://list.orgmode.org/orgmode/87k015e80p.fsf@localhost/ Ihor Radchenko <yantar92@posteo.net> writes: > Richard Stallman <rms@gnu.org> writes: > >> > Would it then make sense to note the reasons why we support one or >> > another non-free software in a separate file like etc/NON-FREE-SUPPORT? >> >> I think it is a good idea to document the reasoning for these >> decision. But I think it does not necessarily have to be centralized >> in one file for all of Emacs. Another alternative, also natural, >> would be to describe these decisions with the code that implements the >> support. > > Will file header be a good place? > > Note that there is little point adding the reasons behind supporting > non-free software if they cannot be easily found. Ideally, it should be > a standard place documented as code convention. Then, people can > consistently check the reasons (or lack of) behind each individual > non-free software support decision. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-17 2:02 ` Richard Stallman 2023-08-17 2:48 ` Andrew Hyatt @ 2023-08-17 17:08 ` Daniel Fleischer 2023-08-19 1:49 ` Richard Stallman 2023-08-21 4:48 ` Jim Porter 1 sibling, 2 replies; 68+ messages in thread From: Daniel Fleischer @ 2023-08-17 17:08 UTC (permalink / raw) To: Richard Stallman; +Cc: Jim Porter, ahyatt, emacs-devel Richard Stallman <rms@gnu.org> writes: > In the examples I've heard of, that is never the case. Either they > are secret -- users can only use them on a server, which is SaaSS, see > https://gnu.org/philosophy/who-does-that-server-really-serve.html -- > or they are released under nonfree licenses that restrict freedom 0; > see https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html. That is not accurate; LLMs can definitely run locally on your machine. Models can be downloaded and run using Python. Here is an LLM released under the Apache 2 license [0]. There are "black-box" models, served in the cloud, but the revolution we're seeing is happening precisely because many models are released freely and can be run (and trained) locally, even on a laptop. [0] https://huggingface.co/mosaicml/mpt-7b -- Daniel Fleischer ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-17 17:08 ` Daniel Fleischer @ 2023-08-19 1:49 ` Richard Stallman 2023-08-19 8:15 ` Daniel Fleischer 2023-08-21 4:48 ` Jim Porter 1 sibling, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-08-19 1:49 UTC (permalink / raw) To: Daniel Fleischer; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > In the examples I've heard of, that is never the case. Either they > > are secret -- users can only use them on a server, which is SaaSS, see > > https://gnu.org/philosophy/who-does-that-server-really-serve.html -- > > or they are released under nonfree licenses that restrict freedom 0; > > see https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html. > That is not accurate; LLMs can definitely run locally on your machine. We are slightly miscommunicating. Yes there are models that could run locally on your machine, but all the ones I know of were released under a nonfree license. > Here is an LLM released > under Apache 2 license [0]. I haven't seen this before. Maybe it is an exception. Could you confirm that this is a language model itself, not the program that runs the language model? > There are "black-box" models, served in the > cloud, Could we please not use the term "cloud"? There is no cloud, only various companies' computers. See https://gnu.org/philosophy/words-to-avoid.html#CloudComputing. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-19 1:49 ` Richard Stallman @ 2023-08-19 8:15 ` Daniel Fleischer 2023-08-21 1:12 ` Richard Stallman 0 siblings, 1 reply; 68+ messages in thread From: Daniel Fleischer @ 2023-08-19 8:15 UTC (permalink / raw) To: Richard Stallman; +Cc: emacs-devel Local LLMs usually run using the Python `transformers' library; in order to interact with them using a REST API, some glue code is needed, for example: https://github.com/go-skynet/LocalAI The API is based on OpenAI's, which is what others are following and is thus relevant for the API access the llm package is going to offer. Richard Stallman <rms@gnu.org> writes: > We are slightly miscommunicating. Yes there are models that could run > locally on your machine, but all the ones I know of were released > under a nonfree license. > Could you confirm that this is a language model itself, not the > program that runs the language model? The most popular software framework for running LLMs is called `transformers' (named after the models' architecture): https://github.com/huggingface/transformers (Apache 2) Huggingface also offers free hosting for models and data sets. There are several families of free models: - XGEN https://huggingface.co/Salesforce/xgen-7b-8k-base - MPT https://huggingface.co/mosaicml/mpt-7b - Falcon https://huggingface.co/tiiuae/falcon-7b These are git projects; see, e.g., https://huggingface.co/tiiuae/falcon-7b/tree/main. These models are released under Apache 2. The models contain the weights (compressed numerical matrices) and possibly some Python code files; they explicitly depend on the `transformers' library and the `pytorch' neural networks library (BSD-3). -- Daniel Fleischer ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-19 8:15 ` Daniel Fleischer @ 2023-08-21 1:12 ` Richard Stallman 0 siblings, 0 replies; 68+ messages in thread From: Richard Stallman @ 2023-08-21 1:12 UTC (permalink / raw) To: Daniel Fleischer; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The most popular software framework for running LLMs is called > `transformers' (named after the models' architecture): Ok. Is that a problem in any way? If the `transformers' library is libre, I think it is not a problem. Do you know why they use the name "huggingface"? It seems very strange to me as an anglophone. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-17 17:08 ` Daniel Fleischer 2023-08-19 1:49 ` Richard Stallman @ 2023-08-21 4:48 ` Jim Porter 2023-08-21 5:12 ` Andrew Hyatt ` (2 more replies) 1 sibling, 3 replies; 68+ messages in thread From: Jim Porter @ 2023-08-21 4:48 UTC (permalink / raw) To: Daniel Fleischer, Richard Stallman; +Cc: ahyatt, emacs-devel On 8/17/2023 10:08 AM, Daniel Fleischer wrote: > That is not accurate; LLMs can definitely run locally on your machine. > Models can be downloaded and ran using Python. Here is an LLM released > under Apache 2 license [0]. There are "black-box" models, served in the > cloud, but the revolution we're is precisely because many models are > released freely and can be ran (and trained) locally, even on a laptop. > > [0] https://huggingface.co/mosaicml/mpt-7b The link says that this model has been pretrained, which is certainly useful for the average person who doesn't want (or doesn't have the resources) to perform the training themselves, but from the documentation, it's not clear how I *would* perform the training myself if I were so inclined. (I've only toyed with LLMs, so I'm not an expert at more "advanced" cases like this.) I do see that the documentation mentions the training datasets used, but it also says that "great efforts have been taken to clean the pretraining data". Am I able to access the cleaned datasets? I looked over their blog post[1], but I didn't see anything describing this in detail. While I certainly appreciate the effort people are making to produce LLMs that are more open than OpenAI (a low bar), I'm not sure if providing several gigabytes of model weights in binary format is really providing the *source*. It's true that you can still edit these models in a sense by fine-tuning them, but you could say the same thing about a project that only provided the generated output from GNU Bison, instead of the original input to Bison. 
(Just to be clear, I don't mean any of the above to be leading questions. I really don't know the answers, and using analogies to previous cases like Bison can only get us so far. I truly hope there *is* a freedom-respecting way to interface with LLMs, but I also think it's worth taking some extra care at the beginning so we can choose the right path forward.) [1] https://www.mosaicml.com/blog/mpt-7b ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-21 4:48 ` Jim Porter @ 2023-08-21 5:12 ` Andrew Hyatt 2023-08-21 6:03 ` Jim Porter 2023-08-21 6:36 ` Daniel Fleischer 2023-08-22 1:06 ` Richard Stallman 2 siblings, 1 reply; 68+ messages in thread From: Andrew Hyatt @ 2023-08-21 5:12 UTC (permalink / raw) To: Jim Porter; +Cc: Daniel Fleischer, Richard Stallman, emacs-devel [-- Attachment #1: Type: text/plain, Size: 3703 bytes --] On Mon, Aug 21, 2023 at 12:48 AM Jim Porter <jporterbugs@gmail.com> wrote: > On 8/17/2023 10:08 AM, Daniel Fleischer wrote: > > That is not accurate; LLMs can definitely run locally on your machine. > > Models can be downloaded and ran using Python. Here is an LLM released > > under Apache 2 license [0]. There are "black-box" models, served in the > > cloud, but the revolution we're is precisely because many models are > > released freely and can be ran (and trained) locally, even on a laptop. > > > > [0] https://huggingface.co/mosaicml/mpt-7b > > The link says that this model has been pretrained, which is certainly > useful for the average person who doesn't want (or doesn't have the > resources) to perform the training themselves, but from the > documentation, it's not clear how I *would* perform the training myself > if I were so inclined. (I've only toyed with LLMs, so I'm not an expert > at more "advanced" cases like this.) > The training of these is fairly straightforward, at least if you are familiar with the area. The code for implementing transformers in the original "Attention is All You Need" paper is at https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py under an Apache License, and the LLM we are talking about here use this technique to train and execute, changing some parameters and adding things like more attention heads, but keeping the fundamental architecture the same. 
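The architecture referenced above centers on scaled dot-product attention, softmax(QK^T / sqrt(d)) V. As a toy, dependency-free sketch of that core computation (illustrative only: this is not the tensor2tensor code linked above, and real models add multiple heads, masking, and learned projection matrices):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    Q, K, V are lists of row vectors (lists of floats)."""
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Each output row is a convex combination of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Tiny example: 2 queries, 3 key/value pairs, dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(attention(Q, K, V))
```

Since each attention weight vector sums to one, every output row lies inside the range spanned by the value rows; stacking blocks of this operation (plus feed-forward layers) is, fundamentally, what the models under discussion share.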
I'm not an expert, but I believe that due to the use of stochastic processes in training, even if you had the exact code, parameters and data used in training, you would never be able to reproduce the model they make available. It should be equivalent in quality, perhaps, but not the same. > > I do see that the documentation mentions the training datasets used, but > it also says that "great efforts have been taken to clean the > pretraining data". Am I able to access the cleaned datasets? I looked > over their blog post[1], but I didn't see anything describing this in > detail. > > While I certainly appreciate the effort people are making to produce > LLMs that are more open than OpenAI (a low bar), I'm not sure if > providing several gigabytes of model weights in binary format is really > providing the *source*. It's true that you can still edit these models > in a sense by fine-tuning them, but you could say the same thing about a > project that only provided the generated output from GNU Bison, instead > of the original input to Bison. > To me, I believe it should be about freedom. Not absolute freedom, but relative freedom: do you, the user, have the same amount of freedom as anyone else, including the creator? For the LLMs like huggingface and many other research LLMs, the answer is yes. You do have the freedom to fine-tune the model, as does the creator. You cannot change the base model in any meaningful way, but neither can the creator, because no one knows how to do that yet. You cannot understand the model, but neither can the creator, because while some progress has been made in understanding simple things about simple LLMs like GPT-2, the modern LLMs are too complex for anyone to make sense out of. > > (Just to be clear, I don't mean any of the above to be leading > questions. I really don't know the answers, and using analogies to > previous cases like Bison can only get us so far. 
I truly hope there > *is* a freedom-respecting way to interface with LLMs, but I also think > it's worth taking some extra care at the beginning so we can choose the > right path forward.) > > [1] https://www.mosaicml.com/blog/mpt-7b > [-- Attachment #2: Type: text/html, Size: 4798 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-21 5:12 ` Andrew Hyatt @ 2023-08-21 6:03 ` Jim Porter 0 siblings, 0 replies; 68+ messages in thread From: Jim Porter @ 2023-08-21 6:03 UTC (permalink / raw) To: Andrew Hyatt; +Cc: Daniel Fleischer, Richard Stallman, emacs-devel On 8/20/2023 10:12 PM, Andrew Hyatt wrote: > The training of these is fairly straightforward, at least if you are > familiar with the area. ... the LLM we are talking about here use this technique to train and execute, changing some parameters and adding things like more attention heads, but keeping the fundamental architecture the same. I think the parameters would be a key part of this (or potentially all of the code they used for the training, if it does something unique), as well as the *actual* training datasets. That's why I'm especially concerned about the line in their docs saying "great efforts have been taken to clean the pretraining data". I couldn't find out whether they provided the cleaned data or only the "raw" data. From my understanding, properly cleaning the data is labor-intensive, and you wouldn't be able to reproduce another team's efforts in that area unless they gave you a diff or something equivalent. > I'm not an expert, but I believe that due to the use of stochastic > processes in training, even if you had the exact code, parameters and > data used in training, you would never be able to reproduce the model > they make available. It should be equivalent in quality, perhaps, but > not the same. This is a problem for reproducibility (it would be nice if you could *verify* that a model was built the way its makers said it was), but I don't think it's a critical problem for freedom. > To me, I believe it should be about freedom. Not absolute freedom, but > relative freedom: do you, the user, have the same amount of freedom as > anyone else, including the creator? For the LLMs like huggingface and > many other research LLMs, the answer is yes. 
So long as the creators provide all the necessary parameters to retrain the model from "scratch", I think I'd agree. If some of these aren't provided (cleaned datasets, training parameters, any direct human intervention if applicable, etc), then I think the answer is no. For example, the creator could decide that one data source is bad for some reason, and retrain their model without it. Would I be able to do that work independently with just what the creator has given me? I see that there was a presentation at LibrePlanet 2023 (or maybe shortly after) by Leandro von Werra of HuggingFace on the ethics of code-generating LLMs[1]. It says that it hasn't been published online yet, though. This might not be the final answer on all the concerns about incorporating LLMs into Emacs, but hopefully it would help. In practice though, I think if Emacs were to support communicating with LLMs, it would be good if - at minimum - we could direct users to an essay explaining the potential ethical/freedom issues with them. On that note, maybe we could also take a bit of inspiration from Emacs dynamic modules. They require a GPL compatibility symbol[2] in order to load, and perhaps a hypothetical 'llm-foobar' package that interfaces with the 'foobar' LLM could announce whether it respects users' freedom via some variable/symbol. Freedom-respecting LLMs wouldn't need a warning message then. We could even forbid packages that talk to particularly "bad" LLMs. (I suppose we can't stop users from writing their own packages and just lying about whether they're ok, but we could prevent their inclusion in ELPA.) [1] https://www.fsf.org/bulletin/2023/spring/trademarks-volunteering-and-code-generating-llm [2] https://www.gnu.org/software/emacs/manual/html_node/elisp/Module-Initialization.html ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-21 4:48 ` Jim Porter 2023-08-21 5:12 ` Andrew Hyatt @ 2023-08-21 6:36 ` Daniel Fleischer 2023-08-22 1:06 ` Richard Stallman 2 siblings, 0 replies; 68+ messages in thread From: Daniel Fleischer @ 2023-08-21 6:36 UTC (permalink / raw) To: Jim Porter; +Cc: Richard Stallman, ahyatt, emacs-devel Jim Porter <jporterbugs@gmail.com> writes: > The link says that this model has been pretrained, which is certainly > useful for the average person who doesn't want (or doesn't have the > resources) to perform the training themselves, but from the > documentation, it's not clear how I *would* perform the training > myself if I were so inclined. (I've only toyed with LLMs, so I'm not > an expert at more "advanced" cases like this.) When I say people can train models themselves I mean "fine-tuning", which is the process of taking an existing model and making it learn to do a specific task by showing it a small number of examples, as few as 1,000. There are advanced techniques that can train a model by modifying a small percentage of its weights; this type of training can be done in a few hours on a laptop. See https://huggingface.co/docs/peft/index for a tool to do that. > I do see that the documentation mentions the training datasets used, > but it also says that "great efforts have been taken to clean the > pretraining data". Am I able to access the cleaned datasets? I looked > over their blog post[1], but I didn't see anything describing this in > detail. > > While I certainly appreciate the effort people are making to produce > LLMs that are more open than OpenAI (a low bar), I'm not sure if > providing several gigabytes of model weights in binary format is > really providing the *source*. It's true that you can still edit these > models in a sense by fine-tuning them, but you could say the same > thing about a project that only provided the generated output from GNU > Bison, instead of the original input to Bison. 
To a large degree, the model is the weights. Today's models mainly share a single architecture, called a transformer decoder. Once you specify the architecture and a few hyper-parameters in a config file, the model is entirely determined by the weights. https://huggingface.co/mosaicml/mpt-7b/blob/main/config.json Put differently, today's models differ mainly by their weights, not architectural differences. As for reproducibility, the truth is one cannot reproduce the models, either theoretically or practically. The models can contain 7, 14, 30, 60 billion parameters which are floating-point numbers; it is impossible to reproduce them exactly, as there are many sources of randomness in the training process. Practically, pretraining is expensive; it requires hundreds of GPUs, and training costs are $100,000 for small models and up to millions for larger models. Some models do release the training data, see e.g. https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T A side note: we are in a stage where our theoretical understanding is lacking while practical applications are flourishing. Things move very, very fast, and there is a strong drive to productize this technology, making people and companies invest a lot of resources into this. However, the open-source aspect is amazing; the fact that the architecture, code and insights are shared between everyone and even some companies share the models they pretrained under open licensing (taking upon themselves the high cost of training) is a huge win to everyone, including the open source and scientific communities because now the innovation can come from anywhere. -- Daniel Fleischer ^ permalink raw reply [flat|nested] 68+ messages in thread
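The point above, that a config file plus the weights determines the model, also makes rough size arithmetic easy. A standard back-of-envelope count for a transformer decoder is about 12·L·d² weight entries plus the token-embedding matrix. The numbers below are illustrative round figures for a "7B"-class model, not values read from the linked config.json:

```python
def approx_param_count(n_layers, d_model, vocab_size):
    """Rough transformer-decoder parameter count: each layer has
    ~4*d^2 attention weights (Q, K, V, and output projections) plus
    ~8*d^2 feed-forward weights, i.e. ~12*d^2 total, plus the token
    embedding matrix. Biases and layer norms are ignored."""
    per_layer = 12 * d_model ** 2
    embedding = vocab_size * d_model
    return n_layers * per_layer + embedding

# Illustrative round numbers in the ballpark of a 7B model's config
# (hypothetical values, not copied from any specific config file):
total = approx_param_count(n_layers=32, d_model=4096, vocab_size=50000)
print(f"{total / 1e9:.2f} billion parameters")
```

This is why the marketing names ("7b", "30b") track the weight count directly: with the architecture fixed, the hyper-parameters in the config determine the size, and the weights are the model.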
* Re: [NonGNU ELPA] New package: llm 2023-08-21 4:48 ` Jim Porter 2023-08-21 5:12 ` Andrew Hyatt 2023-08-21 6:36 ` Daniel Fleischer @ 2023-08-22 1:06 ` Richard Stallman 2 siblings, 0 replies; 68+ messages in thread From: Richard Stallman @ 2023-08-22 1:06 UTC (permalink / raw) To: Jim Porter; +Cc: danflscr, ahyatt, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > While I certainly appreciate the effort people are making to produce > LLMs that are more open than OpenAI (a low bar), I'm not sure if > providing several gigabytes of model weights in binary format is really > providing the *source*. It's true that you can still edit these models > in a sense by fine-tuning them, but you could say the same thing about a > project that only provided the generated output from GNU Bison, instead > of the original input to Bison. I don't think that is valid. Bison processing is very different from training a neural net. Incremental retraining of a trained neural net is the same kind of processing as the original training -- except that you use other data and it produces a neural net that is trained differently. My conclusion is that the trained neural net is effectively a kind of source code. So we don't need to demand the "original training data" as part of a package's source code. That data does not have to be free, published, or available. > In practice though, I think if Emacs were to support communicating with > LLMs, it would be good if - at minimum - we could direct users to an > essay explaining the potential ethical/freedom issues with them. I agree, in principle. But it needs to be an article that the GNU Project can endorse. 
-- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-15 5:14 ` Andrew Hyatt 2023-08-15 17:12 ` Jim Porter @ 2023-08-16 2:30 ` Richard Stallman 2023-08-16 5:11 ` Tomas Hlavaty 1 sibling, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-08-16 2:30 UTC (permalink / raw) To: Andrew Hyatt; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I suggested showing the message once a day, because that is what first > > occurred to me. But there are lots of ways to vary the details. > > Here's an idea. For each language model, it could display the > > message the first, second, fifth, tenth, and after that every tenth > > time the user starts that mode. With this, the frequency of little > > annoyance will diminish soon, but the point will not be forgotten. > > > Is there anything else in Emacs that does something similar? I'd like to > look at how other modules do the same thing and how they communicate things > to the user. There are various features in Emacs that display some sort of notice temporarily and make it easy to move past. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
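The decaying schedule quoted above (show the notice on the 1st, 2nd, 5th, and 10th start of the mode, then every tenth start thereafter) is easy to pin down as a predicate. The real package would of course use Emacs Lisp; this Python sketch is only meant to make the proposed logic concrete:

```python
def should_show_notice(start_count):
    """Return True if the nonfree-LLM notice should be shown on this
    start of the mode. start_count is 1-based: show on the 1st, 2nd,
    5th, and 10th start, then on every 10th start after that."""
    if start_count in (1, 2, 5, 10):
        return True
    return start_count > 10 and start_count % 10 == 0

shown = [n for n in range(1, 41) if should_show_notice(n)]
print(shown)  # → [1, 2, 5, 10, 20, 30, 40]
```

An Elisp implementation would only need a per-model start counter persisted across sessions (e.g. via a customizable alist) and this same predicate to decide when to call the warning function.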
* Re: [NonGNU ELPA] New package: llm 2023-08-16 2:30 ` Richard Stallman @ 2023-08-16 5:11 ` Tomas Hlavaty 2023-08-18 2:10 ` Richard Stallman 0 siblings, 1 reply; 68+ messages in thread From: Tomas Hlavaty @ 2023-08-16 5:11 UTC (permalink / raw) To: rms, Andrew Hyatt; +Cc: emacs-devel On Tue 15 Aug 2023 at 22:30, Richard Stallman <rms@gnu.org> wrote: > There are various features in Emacs that display some sort of notice > temporarily and make it easy to move past. Is there a way to review the notices later? ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-16 5:11 ` Tomas Hlavaty @ 2023-08-18 2:10 ` Richard Stallman 0 siblings, 0 replies; 68+ messages in thread From: Richard Stallman @ 2023-08-18 2:10 UTC (permalink / raw) To: Tomas Hlavaty; +Cc: ahyatt, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > There are various features in Emacs that display some sort of notice > > temporarily and make it easy to move past. > Is there a way to review the notices later? These notices are displayed in various ways. For some of the notices, there are ways to review them. But I don't think there is any one way that covers all. It might be a good thing to create one -- and that would not be fundamentally hard. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-13 1:43 ` Richard Stallman 2023-08-13 2:11 ` Emanuel Berg 2023-08-15 5:14 ` Andrew Hyatt @ 2023-08-27 1:07 ` Andrew Hyatt 2023-08-27 13:11 ` Philip Kaludercic ` (2 more replies) 2 siblings, 3 replies; 68+ messages in thread From: Andrew Hyatt @ 2023-08-27 1:07 UTC (permalink / raw) To: rms; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 4246 bytes --] I've now made the changes requested to the llm package on github ( https://github.com/ahyatt/llm). Because what was requested was a warning to the user, I used `lwarn', and have added an option to turn the warnings off (and the user can turn the warnings off through the warning mechanism as well, via `warning-suppress-log-types'). To save you the trouble of looking at the code to see what exactly it says, here's the function I'm using to warn: (defun llm--warn-on-nonfree (name tos) "Issue a warning if `llm-warn-on-nonfree' is non-nil. NAME is the human readable name of the LLM (e.g 'Open AI'). TOS is the URL of the terms of service for the LLM. All non-free LLMs should call this function on each llm function invocation." (when llm-warn-on-nonfree (lwarn '(llm nonfree) :warning "%s API is not free software, and your freedom to use it is restricted. See %s for the details on the restrictions on use." name tos))) If this is sufficient, please consider accepting this package into GNU ELPA (see above where we decided this is a better fit than the Non-GNU ELPA). On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > What you are saying is consistent with the GNU coding standard. > However, I > > think any message about this would be annoying, > > I am sure it would be a little annoying. 
But assuming the user can > type SPC and move on from that message, the annoyance will be quite > little. > > personally, and would > be a > > deterrent for clients to use this library. > > If the library is quite useful I doubt anyone would be deterred. > If anyone minded the message enough to stop using the package, perse > could > edit this out of the code. > > This issue is an example of those where two different values are > pertinent. There is convenience, which counts but is superficial. > And there is the purpose of the GNU system, which for 40 years has led > the fight against injustice in software. That value is deep and, in the > long term, the most important value of all. > > When they conflict in a specific practical matter, there is always > pressure to prioritize convenience. But that is not wise. > The right approach is to look for a compromise which serves both > goals. I am sure we can find one here. > > I suggested showing the message once a day, because that is what first > occurred to me. But there are lots of ways to vary the details. > Here's an idea. For each language model, it could display the > message the first, second, fifth, tenth, and after that every tenth > time the user starts that mode. With this, the frequency of little > annoyance will diminish soon, but the point will not be forgotten. > > > You made suggestions for how to exclude more code from Emacs itself, > and support for obscure language models we probably should exclude. > But there is no need to exclude the support for the well-known ones, > as I've explained. > > And we can do better than that! We can educate the users about what > is wrong with those systems -- something that the media hysteria fails > to mention at all. That is important -- let's use Emacs for it!
In this scenario, I wouldn't > have > > warnings on those implementations, just as the many llm-based packages > on > > various alternative ELPAs do not have warnings today. > > They ought to show warnings -- the issue is exactly the same. > > We should not slide quietly into acceptance and normalization of a new > systematic injustice. Opposing it is our job. > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > [-- Attachment #2: Type: text/html, Size: 5224 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
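For users who want control over this warning, Emacs's standard warning facility offers several levels of control over a warning type like `(llm nonfree)'. A minimal sketch (`llm-warn-on-nonfree' is the option quoted above; the two `warning-suppress-*' variables are standard Emacs):

```elisp
;; Option 1: disable the llm package's own warnings via its option.
(setq llm-warn-on-nonfree nil)

;; Option 2: keep the option enabled, but suppress *display* of this
;; warning type through the standard warning mechanism; the warning is
;; still logged to the *Warnings* buffer.
(add-to-list 'warning-suppress-types '(llm nonfree))

;; Option 3: suppress both display and logging of this warning type.
(add-to-list 'warning-suppress-log-types '(llm nonfree))
```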
* Re: [NonGNU ELPA] New package: llm 2023-08-27 1:07 ` Andrew Hyatt @ 2023-08-27 13:11 ` Philip Kaludercic 2023-08-28 1:31 ` Richard Stallman 2023-08-27 18:36 ` Jim Porter 2023-09-04 1:27 ` Richard Stallman 2 siblings, 1 reply; 68+ messages in thread From: Philip Kaludercic @ 2023-08-27 13:11 UTC (permalink / raw) To: Andrew Hyatt; +Cc: rms, emacs-devel Andrew Hyatt <ahyatt@gmail.com> writes: > I've now made the changes requested to the llm package on github ( > https://github.com/ahyatt/llm). > > Because what was requested was a warning to the user, I used `lwarn', and > have added an option to turn the warnings off (and the user can turn the > warnings off through the warning mechanism as well, via > `warning-suppress-log-types'). > > To save you the trouble of looking at the code to see what exactly it says, > here's the function I'm using to warn: > > (defun llm--warn-on-nonfree (name tos) > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > NAME is the human readable name of the LLM (e.g 'Open AI'). > > TOS is the URL of the terms of service for the LLM. > > All non-free LLMs should call this function on each llm function > invocation." > (when llm-warn-on-nonfree > (lwarn '(llm nonfree) :warning "%s API is not free software, and your > freedom to use it is restricted. > See %s for the details on the restrictions on use." name tos))) > > If this is sufficient, please consider accepting this package into GNU ELPA > (see above where we decided this is a better fit than the Non-GNU ELPA). I would be fine with this, and would go ahead if there are no objections. > > On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <rms@gnu.org> wrote: > >> [[[ To any NSA and FBI agents reading my email: please consider ]]] >> [[[ whether defending the US Constitution against all enemies, ]]] >> [[[ foreign or domestic, requires you to follow Snowden's example. ]]] >> >> > What you are saying is consistent with the GNU coding standard. 
>> However, I >> > think any message about this would be annoying, >> >> I am sure it would be a little annoying. But assuming the user can >> type SPC and move on from that message, the annoyance will be quite >> little. >> >> personally, and would >> be a >> > deterrent for clients to use this library. >> >> If the library is quite useful I doubt anyone would be deterred. >> If anyone minded the message enough to stop using the package, perse >> could >> edit this out of the code. >> >> This issue is an example of those where two different values are >> pertinent. There is convenience, which counts but is superficial. >> And there is the purpose of the GNU system, which for 40 years has led >> the fight against injustice in software. That value is deep and, in the >> long term, the most important value of all. >> >> When they conflict in a specific practical matter, there is always >> pressure to prioritize convenience. But that is not wise. >> The right approach is to look for a compromise which serves both >> goals. I am sure we can find one here. >> >> I suggested showing the message once a day, because that is what first >> occurred to me. But there are lots of ways to vary the details. >> Here's an idea. For each language model, it could display the >> message the first, second, fifth, tenth, and after that every tenth >> time the user starts that mode. With this, the frequency of little >> annoyance will diminish soon, but the point will not be forgotten. >> >> >> You made suggestions for how to exclude more code from Emacs itself, >> and support for obscure language models we probably should exclude. >> But there is no need to exclude the support for the well-known ones, >> as I've explained. >> >> And we can do better than that! We can educate the users about what >> is wrong with those systems -- something that the media hysteria fails >> to mention at all. That is important -- let's use Emacs for it!
>> >> > All implementations can then separately be made available on some other >> > package library not associated with GNU. In this scenario, I wouldn't >> have >> > warnings on those implementations, just as the many llm-based packages >> on >> > various alternative ELPAs do not have warnings today. >> >> They ought to show warnings -- the issue is exactly the same. >> >> We should not slide quietly into acceptance and normalization of a new >> systematic injustice. Opposing it is our job. >> >> -- >> Dr Richard Stallman (https://stallman.org) >> Chief GNUisance of the GNU Project (https://gnu.org) >> Founder, Free Software Foundation (https://fsf.org) >> Internet Hall-of-Famer (https://internethalloffame.org) >> >> >> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-27 13:11 ` Philip Kaludercic @ 2023-08-28 1:31 ` Richard Stallman 2023-08-28 2:32 ` Andrew Hyatt 0 siblings, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-08-28 1:31 UTC (permalink / raw) To: Philip Kaludercic, ahyatt; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > (defun llm--warn-on-nonfree (name tos) > > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > > NAME is the human readable name of the LLM (e.g 'Open AI'). > > > > TOS is the URL of the terms of service for the LLM. > > > > All non-free LLMs should call this function on each llm function > > invocation." > > (when llm-warn-on-nonfree > > (lwarn '(llm nonfree) :warning "%s API is not free software, and your > > freedom to use it is restricted. > > See %s for the details on the restrictions on use." name tos))) I presume that the developers judge whether any given LLM calls for a warning, and add a call to this function if it does. Right? The basic approach looks right, but it raises two questions about details: 1. What exactly is the criterion for deciding whether a given LLM should call this function? In other words, what are the conditions on which we should warn the user? Let's discuss that to make sure we get it right. 2. Is it better to include the TOS URL in the warning, or better NOT to include it and thus avoid helping bad guys publicize their demands? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-28 1:31 ` Richard Stallman @ 2023-08-28 2:32 ` Andrew Hyatt 2023-08-28 2:59 ` Jim Porter 0 siblings, 1 reply; 68+ messages in thread From: Andrew Hyatt @ 2023-08-28 2:32 UTC (permalink / raw) To: rms; +Cc: Philip Kaludercic, emacs-devel [-- Attachment #1: Type: text/plain, Size: 2623 bytes --] On Sun, Aug 27, 2023 at 9:32 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > > (defun llm--warn-on-nonfree (name tos) > > > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > > > NAME is the human readable name of the LLM (e.g 'Open AI'). > > > > > > TOS is the URL of the terms of service for the LLM. > > > > > > All non-free LLMs should call this function on each llm function > > > invocation." > > > (when llm-warn-on-nonfree > > > (lwarn '(llm nonfree) :warning "%s API is not free software, and > your > > > freedom to use it is restricted. > > > See %s for the details on the restrictions on use." name tos))) > > I presume that the developers judge whether any given LLM calls for a > warning, and add a call to this function if it does. Right? > > The basic approach looks right, bit it raises two questions about > details: > > 1. What exactly is the criterion for deciding whether a given LLM > should call this function? In other words, what are the conditions on > which we should warn the user? Let's discuss that to make sure we > get it right. > After following Jim Porter's suggestion above, here is the new function, and you can see the advice we're giving in the docstring: (cl-defgeneric llm-nonfree-message-info (provider) "If PROVIDER is non-free, return info for a warning. This should be a cons of the name of the LLM, and the URL of the terms of service. 
If the LLM is free and has no restrictions on use, this should return nil. Since this function already returns nil, there is no need to override it." (ignore provider) nil) So, "free and no restrictions on use". I'm happy to link to any resources to help users understand better if you think it is needed. > > 2. Is it better to include the TOS URL in the warning, or better NOT > to include it and thus avoid helping bad guys publicize their demands? I think it's best to include it. To claim there are restrictions on use, but not reference those same restrictions strikes me as incomplete, from the point of view of the user who will be looking at the warning. > > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > [-- Attachment #2: Type: text/html, Size: 3929 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
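To sketch how a provider would plug into this generic: a non-free provider overrides it, while a free provider simply inherits the nil default. This is a hedged illustration; the `llm-openai' struct shape and the TOS URL below are placeholders, not necessarily what the llm package actually defines.

```elisp
(require 'cl-lib)

;; Hypothetical provider struct, for illustration only.
(cl-defstruct llm-openai key)

;; A non-free provider overrides the generic to return (NAME . TOS-URL);
;; free providers inherit the nil default and trigger no warning.
(cl-defmethod llm-nonfree-message-info ((_provider llm-openai))
  (cons "Open AI" "https://example.com/terms-of-service"))
```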
* Re: [NonGNU ELPA] New package: llm 2023-08-28 2:32 ` Andrew Hyatt @ 2023-08-28 2:59 ` Jim Porter 2023-08-28 4:54 ` Andrew Hyatt 2023-08-31 2:10 ` Richard Stallman 0 siblings, 2 replies; 68+ messages in thread From: Jim Porter @ 2023-08-28 2:59 UTC (permalink / raw) To: Andrew Hyatt, rms; +Cc: Philip Kaludercic, emacs-devel On 8/27/2023 7:32 PM, Andrew Hyatt wrote: > After following Jim Porter's suggestion above, here is the new function, > and you can see the advice we're giving in the docstring: > > (cl-defgeneric llm-nonfree-message-info (provider) > "If PROVIDER is non-free, return info for a warning. > This should be a cons of the name of the LLM, and the URL of the > terms of service. > > If the LLM is free and has no restrictions on use, this should > return nil. Since this function already returns nil, there is no > need to override it." > (ignore provider) > nil) For what it's worth, I was thinking about having the default be the opposite: warn users by default, since we don't really know if an LLM provider is free unless the Elisp code indicates it. (Otherwise, it could simply mean the author of that provider forgot to override 'llm-nonfree-message-info'.) In other words, assume the worst by default. :) That said, if everyone else thinks this isn't an issue, I won't stamp my feet about it. As for the docstring, I see that many models use ordinary software licenses, such as the Apache license. That could make it easier for us to define the criteria for a libre provider: is the model used by the provider available under a license the FSF considers a free software license?[1] (For LLM providers that you use by making a web request, we could also expect that all the code for their web API is libre too. However, that code is comparatively uninteresting, and so long as you could get the model to use on a self-hosted system[2], I don't see a need to warn the user.) 
(Also, if you prefer to avoid having to say '(ignore provider)', you can also prefix 'provider' with an underscore. That'll make the byte compiler happy.) [1] https://www.gnu.org/licenses/license-list.en.html [2] At least, in theory. A user might not have enough computing power to use the model in practice, but I don't think that matters for this case. ^ permalink raw reply [flat|nested] 68+ messages in thread
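Applying the underscore tip, the default definition of the generic could be written as follows (a sketch; behavior is identical, only the byte compiler's unused-argument warning goes away):

```elisp
(cl-defgeneric llm-nonfree-message-info (_provider)
  "If PROVIDER is non-free, return a cons of its name and TOS URL.
Return nil, the default, if the LLM is free."
  nil)
```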
* Re: [NonGNU ELPA] New package: llm 2023-08-28 2:59 ` Jim Porter @ 2023-08-28 4:54 ` Andrew Hyatt 2023-08-31 2:10 ` Richard Stallman 1 sibling, 0 replies; 68+ messages in thread From: Andrew Hyatt @ 2023-08-28 4:54 UTC (permalink / raw) To: Jim Porter; +Cc: rms, Philip Kaludercic, emacs-devel [-- Attachment #1: Type: text/plain, Size: 3158 bytes --] On Sun, Aug 27, 2023 at 10:59 PM Jim Porter <jporterbugs@gmail.com> wrote: > On 8/27/2023 7:32 PM, Andrew Hyatt wrote: > > After following Jim Porter's suggestion above, here is the new function, > > and you can see the advice we're giving in the docstring: > > > > (cl-defgeneric llm-nonfree-message-info (provider) > > "If PROVIDER is non-free, return info for a warning. > > This should be a cons of the name of the LLM, and the URL of the > > terms of service. > > > > If the LLM is free and has no restrictions on use, this should > > return nil. Since this function already returns nil, there is no > > need to override it." > > (ignore provider) > > nil) > > For what it's worth, I was thinking about having the default be the > opposite: warn users by default, since we don't really know if an LLM > provider is free unless the Elisp code indicates it. (Otherwise, it > could simply mean the author of that provider forgot to override > 'llm-nonfree-message-info'.) In other words, assume the worst by > default. :) That said, if everyone else thinks this isn't an issue, I > won't stamp my feet about it. > I agree that it'd be nice to have that property. That's the way I had it initially, but since you need info if it's non-free (the name / TOS), but not if it is free, the design where free was the default was the simplest. The alternative was one method indicating it was free/nonfree and the other, if non-free, to provide the additional information. > > As for the docstring, I see that many models use ordinary software > licenses, such as the Apache license. 
That could make it easier for us > to define the criteria for a libre provider: is the model used by the > provider available under a license the FSF considers a free software > license?[1] (For LLM providers that you use by making a web request, we > could also expect that all the code for their web API is libre too. > However, that code is comparatively uninteresting, and so long as you > could get the model to use on a self-hosted system[2], I don't see a > need to warn the user.) > I agree that it'd be nice to define this in a more clear way, but we also can just wait until someone proposes a free LLM to include to judge it. We can always bring it back to the emacs-devel list if there is uncertainty. The hosting code is not that relevant here. For these companies, there would be restrictions on the use of the model even if there were no other unfree software in the middle (kind of like how Llama 2 is). Notably, no company is going to want the user to train competing models with their model. This is the most common restriction on freedoms of the user. > > (Also, if you prefer to avoid having to say '(ignore provider)', you can > also prefix 'provider' with an underscore. That'll make the byte > compiler happy.) > TIL, that's a great tip, thanks! > > [1] https://www.gnu.org/licenses/license-list.en.html > > [2] At least, in theory. A user might not have enough computing power to > use the model in practice, but I don't think that matters for this case. > [-- Attachment #2: Type: text/html, Size: 4262 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-28 2:59 ` Jim Porter 2023-08-28 4:54 ` Andrew Hyatt @ 2023-08-31 2:10 ` Richard Stallman 2023-08-31 9:06 ` Ihor Radchenko 1 sibling, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-08-31 2:10 UTC (permalink / raw) To: Jim Porter; +Cc: ahyatt, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > As for the docstring, I see that many models use ordinary software > licenses, such as the Apache license. That could make it easier for us > to define the criteria for a libre provider: is the model used by the > provider available under a license the FSF considers a free software > license In general, an LLM system consists of two parts: the engine, which is a program written in a programming language, and the trained neural network. For the system to be free, both parts must be free. A number of engines are free software, but it is unusual for a trained neural network to be free. I think that "model" refers to the trained neural network. That's how models are implemented. To figure out whether a program is free by scanning it is hard to do reliably. That is why for LibreJS we designed a more precise method for indicating licenses on parts of a file. I recommend against trying to do this. It should not be a lot of work for a human to check this and get a reliable result. That applies to LLM systems that you download and run on your own machine. As for LLMs that run on servers, they are a different issue entirely. They are all SaaSS (Service as a Software Substitute), and SaaSS is always unjust. See https://gnu.org/philosophy/who-does-that-server-really-serve.html for explanation. So if you contact it over the internet, it should get a warning with a reference to that page. Maybe there is no need to pass info about the terms of service. 
Only a service can impose terms of service, and the mere fact that it is a service, rather than a program to download and run, inherently means the user does not control its operation. That by itself is reason for a notice that it is bad. Any restrictions imposed by terms of service could add to the bad. Perhaps it would be good to mention that that second injustice exists. Maybe it would be good to say, This language model treats users unjustly because it does the user's computing on a computer where the user has no control over its operation. It is "Service as a Software Substitute", as we call it. See https://gnu.org/philosophy/who-does-that-server-really-serve.html. In addition, it imposes "terms of service", restrictions over what users can do with the system. That is a second injustice. If society needs to restrict some of the uses of language model systems, it should do so by democratically passing laws to penalize those actions -- regardless of how they are done -- and not by allowing companies to impose restrictions arbitrarily on users. The laws would be more effective at achieving the goal, as well as avoiding giving anyone unjust power over others. I think that it is better to present the URL of the web site's front page rather than the terms of service themselves. If we point the user at the terms of service, we are directly helping the company impose them. If the user visits the front page, perse can easily find the terms of service. But we will not have directly promoted attention to them. This is a compromise between two flaws. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-31 2:10 ` Richard Stallman @ 2023-08-31 9:06 ` Ihor Radchenko 2023-08-31 16:29 ` chad ` (2 more replies) 0 siblings, 3 replies; 68+ messages in thread From: Ihor Radchenko @ 2023-08-31 9:06 UTC (permalink / raw) To: rms; +Cc: Jim Porter, ahyatt, emacs-devel Richard Stallman <rms@gnu.org> writes: > As for LLMs that run on servers, they are a different issue entirely. > They are all SaaSS (Service as a Software Substitute), and SaaSS is > always unjust. > > See https://gnu.org/philosophy/who-does-that-server-really-serve.html > for explanation. I do not fully agree here. A number of more powerful LLMs have very demanding hardware requirements. For example, some LLMs require 64+ GB of RAM to run: https://github.com/ggerganov/llama.cpp#memorydisk-requirements. Not every PC is able to handle it, even if both the engine and the neural network weights are free. In such a scenario, the base assumption you make in https://gnu.org/philosophy/who-does-that-server-really-serve.html may no longer hold for most users: "Suppose that any software tasks you might need for the job are implemented in free software, and you have copies, and you have whatever data you might need, as well as computers of whatever speed, functionality and capacity might be required." Thus, for many users (owning less powerful computers) LLMs as a service are going to be SaaS, not SaaSS. (Given that the SaaS LLM has free licence and users who choose to buy the necessary hardware retain their freedom to run the same LLM on their hardware.) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-31 9:06 ` Ihor Radchenko @ 2023-08-31 16:29 ` chad 2023-09-01 9:53 ` Ihor Radchenko 2023-09-04 1:27 ` Richard Stallman 2023-09-04 1:27 ` [NonGNU ELPA] New package: llm Richard Stallman 2 siblings, 1 reply; 68+ messages in thread From: chad @ 2023-08-31 16:29 UTC (permalink / raw) To: emacs-tangents; +Cc: Jim Porter, ahyatt, Ihor Radchenko, rms [-- Attachment #1: Type: text/plain, Size: 2796 bytes --] On Thu, Aug 31, 2023 at 5:06 AM Ihor Radchenko <yantar92@posteo.net> wrote: > Richard Stallman <rms@gnu.org> writes: > > > As for LLMs that run on servers, they are a different issue entirely. > > They are all SaaSS (Service as a Software Substitute), and SaaSS is > > always unjust. > > > > See https://gnu.org/philosophy/who-does-that-server-really-serve.html > > for explanation. > > I do not fully agree here. [...] > Thus, for many users (owning less powerful computers) LLMs as a service > are going to be SaaS, not SaaSS. (Given that the SaaS LLM has free > licence and users who choose to buy the necessary hardware retain their > freedom to run the same LLM on their hardware.) > It's a somewhat subtle, gnarly point, and I didn't find a way to express it as well as Ihor Radchenko here, but I will add: the ability for a free software-loving user to run their own SaaS is both increasing and decreasing in ease recently. On the one hand, it's difficult these days to run a personal email service and not get trapped by the shifting myriad of overlapping spam/fraud/monopoly `protection' features, at least if you want to regularly send email to a wide variety of users. On the other hand, it's increasingly viable to have a hand-held machine that's a tiny fraction of a space-cadet keyboard running (mostly; binary blobs are a pernicious evil) free software that easily connects back to one's own free-software "workstation" for medium and large jobs, even while avoiding "the cloud trap", as it were. 
(Such things have been a long-time hobby/interest of mine, dating back before my time as a professional programmer. They're still not common, but they're getting increasingly moreso; native Android support for emacs, as one example, will likely help.) For large AI models specifically: there are many users for whom it is not practical to _actually_ recreate the model from scratch everywhere they might want to use it. It is important for computing freedom that such recreations be *possible*, but it will be very limiting to insist that everyone who wants to use such services actually do so, in a manner that seems to me to be very similar to not insisting that every potential emacs user actually compile their own. In this case there's the extra wrinkle that the actual details of recreating the currently-most-interesting large language models involves both _gigantic_ amounts of resources and also a fairly large amount of not-directly-reproducible randomness involved. It might be worth further consideration. Just now, re-reading this seems like a topic better suited to emacs-tangents or even gnu-misc-discuss, so I'm changing the CC there. Apologies if this causes an accidental fork. I hope that helps, ~Chad [-- Attachment #2: Type: text/html, Size: 3471 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-31 16:29 ` chad @ 2023-09-01 9:53 ` Ihor Radchenko 0 siblings, 0 replies; 68+ messages in thread From: Ihor Radchenko @ 2023-09-01 9:53 UTC (permalink / raw) To: chad; +Cc: emacs-tangents, Jim Porter, ahyatt, rms chad <yandros@gmail.com> writes: > For large AI models specifically: there are many users for whom it is not > practical to _actually_ recreate the model from scratch everywhere they > might want to use it. It is important for computing freedom that such > recreations be *possible*, but it will be very limiting to insist that > everyone who wants to use such services actually do so, in a manner that > seems to me to be very similar to not insisting that every potential emacs > user actually compile their own. In this case there's the extra wrinkle > that the actual details of recreating the currently-most-interesting large > language models involves both _gigantic_ amounts of resources and also a > fairly large amount of not-directly-reproducible randomness involved. It > might be worth further consideration. Let me refer to another message by RMS: >> > While I certainly appreciate the effort people are making to produce >> > LLMs that are more open than OpenAI (a low bar), I'm not sure if >> > providing several gigabytes of model weights in binary format is really >> > providing the *source*. It's true that you can still edit these models >> > in a sense by fine-tuning them, but you could say the same thing about a >> > project that only provided the generated output from GNU Bison, instead >> > of the original input to Bison. >> >> I don't think that is valid. >> Bison processing is very different from training a neural net. >> Incremental retraining of a trained neural net >> is the same kind of processing as the original training -- except >> that you use other data and it produces a neural net >> that is trained differently. 
>> >> My conclusion is that the trained neural net is effectively a kind of >> source code. So we don't need to demand the "original training data" >> as part of a package's source code. That data does not have to be >> free, published, or available. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-31 9:06 ` Ihor Radchenko 2023-08-31 16:29 ` chad @ 2023-09-04 1:27 ` Richard Stallman 2023-09-06 12:25 ` Ihor Radchenko 2023-09-06 12:51 ` Is ChatGPT SaaSS? (was: [NonGNU ELPA] New package: llm) Ihor Radchenko 2023-09-04 1:27 ` [NonGNU ELPA] New package: llm Richard Stallman 2 siblings, 2 replies; 68+ messages in thread From: Richard Stallman @ 2023-09-04 1:27 UTC (permalink / raw) To: Ihor Radchenko; +Cc: jporterbugs, ahyatt, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > As for LLMs that run on servers, they are a different issue entirely. > > They are all SaaSS (Service as a Software Substitute), and SaaSS is > > always unjust. > > > > See https://gnu.org/philosophy/who-does-that-server-really-serve.html > > for explanation. > I do not fully agree here. A number of more powerful LLMs have very > limiting hardware requirements. For example, some LLMs require 64+Gbs of > RAM to run: That is true, and it is unfortunate. There may be no practical way to run a certain model except for SaaSS. That does not alter the injustice of SaaSS. So we should not silence our criticism of SaaSS in those cases. If a user decides to run that model as SaaSS, given this situation, that is per responsibility. We will not try to prevent per from doing so. But we should inform per of the injustice so perse can make that decision, aware of the injustice. This is very important. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-04 1:27 ` Richard Stallman @ 2023-09-06 12:25 ` Ihor Radchenko 2023-09-06 12:51 ` Is ChatGPT SaaSS? (was: [NonGNU ELPA] New package: llm) Ihor Radchenko 1 sibling, 0 replies; 68+ messages in thread From: Ihor Radchenko @ 2023-09-06 12:25 UTC (permalink / raw) To: rms; +Cc: jporterbugs, ahyatt, emacs-devel [ I am not sure if this discussion is still relevant enough for emacs-devel. Let me know if we need to move to emacs-tangents or other more suitable place ] Richard Stallman <rms@gnu.org> writes: > > > As for LLMs that run on servers, they are a different issue entirely. > > > They are all SaaSS (Service as a Software Substitute), and SaaSS is > > > always unjust. > > > > > > See https://gnu.org/philosophy/who-does-that-server-really-serve.html > > > for explanation. > > > I do not fully agree here. A number of more powerful LLMs have very > > limiting hardware requirements. For example, some LLMs require 64+Gbs of > > RAM to run: > > That is true, and it is unfortunate. There may be no practical way > to run a certain model except for SaaSS. Clarification: that might still be possible if one runs LLMs on a remote virtual or dedicated server. Such a server is technically not what the user _owns physically_. However, there is still the freedom to run an arbitrary version (including a modified version) of the LLM on such a server. Also, it looks like hosted servers running libre software are becoming a rather common thing. For example, I have seen BBB servers offered for rent: https://owncube.com/bbb_en.html and even LLM servers: https://ownai.org/ The problem is that the extent to which users are allowed to control that running software is more limited compared to what one can enjoy with full SSH access or in a physically owned machine. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. 
Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) 2023-09-04 1:27 ` Richard Stallman 2023-09-06 12:25 ` Ihor Radchenko @ 2023-09-06 12:51 ` Ihor Radchenko 2023-09-06 16:59 ` Andrew Hyatt ` (2 more replies) 1 sibling, 3 replies; 68+ messages in thread From: Ihor Radchenko @ 2023-09-06 12:51 UTC (permalink / raw) To: rms, emacs-tangents; +Cc: jporterbugs, ahyatt [ Moving this to emacs-tangents ] Richard Stallman <rms@gnu.org> writes: > > > As for LLMs that run on servers, they are a different issue entirely. > > > They are all SaaSS (Service as a Software Substitute), and SaaSS is > > > always unjust. > > > > > > See https://gnu.org/philosophy/who-does-that-server-really-serve.html > > > for explanation. > > > I do not fully agree here. A number of more powerful LLMs have very > > limiting hardware requirements. For example, some LLMs require 64+Gbs of > > RAM to run: > > That is true, and it is unfortunate. There may be no practical way > to run a certain model except for SaaSS. > > That does not alter the injustice of SaaSS. So we should not silence > our criticism of SaaSS in those cases, This is a rather theoretical consideration, but, talking about ChatGTP (owned by OpenAI) specifically, should it even be considered SaaSS? https://www.gnu.org/philosophy/who-does-that-server-really-serve.html says: Using a joint project's servers isn't SaaSS because the computing you do in this way isn't your own. For instance, if you edit pages on Wikipedia, you are not doing your own computing; rather, you are collaborating in Wikipedia's computing. Wikipedia controls its own servers, but organizations as well as individuals encounter the problem of SaaSS if they do their computing in someone else's server. Then, ChatGTP is using the user input to train their model: https://techunwrapped.com/you-can-now-make-chatgpt-not-train-with-your-queries/ ... what is constant is that the company can use our conversations with ChatGPT to train the model. 
This is not a surprise or a secret; the company has always reported it. There is no doubt that ChatGTP itself is not libre - its model is not available to the public. However, users of the ChatGPT model are technically providing input that is collaboratively editing that model's weights (training the model further). So, using ChatGTP is a little bit akin to editing Wikipedia pages - collaborating to improve ChatGTP. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) 2023-09-06 12:51 ` Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) Ihor Radchenko @ 2023-09-06 16:59 ` Andrew Hyatt 2023-09-09 0:37 ` Richard Stallman 2023-09-06 22:52 ` Emanuel Berg 2023-09-09 0:38 ` Richard Stallman 2 siblings, 1 reply; 68+ messages in thread From: Andrew Hyatt @ 2023-09-06 16:59 UTC (permalink / raw) To: Ihor Radchenko; +Cc: rms, emacs-tangents, jporterbugs [-- Attachment #1: Type: text/plain, Size: 2995 bytes --] On Wed, Sep 6, 2023 at 8:50 AM Ihor Radchenko <yantar92@posteo.net> wrote: > [ Moving this to emacs-tangents ] > > Richard Stallman <rms@gnu.org> writes: > > > > > As for LLMs that run on servers, they are a different issue > entirely. > > > > They are all SaaSS (Service as a Software Substitute), and SaaSS is > > > > always unjust. > > > > > > > > See > https://gnu.org/philosophy/who-does-that-server-really-serve.html > > > > for explanation. > > > > > I do not fully agree here. A number of more powerful LLMs have very > > > limiting hardware requirements. For example, some LLMs require > 64+Gbs of > > > RAM to run: > > > > That is true, and it is unfortunate. There may be no practical way > > to run a certain model except for SaaSS. > > > > That does not alter the injustice of SaaSS. So we should not silence > > our criticism of SaaSS in those cases, > > This is a rather theoretical consideration, but, talking about ChatGTP > (owned by OpenAI) specifically, should it even be considered SaaSS? > > https://www.gnu.org/philosophy/who-does-that-server-really-serve.html > says: > > Using a joint project's servers isn't SaaSS because the computing > you do in this way isn't your own. For instance, if you edit pages > on Wikipedia, you are not doing your own computing; rather, you are > collaborating in Wikipedia's computing. 
Wikipedia controls its own > servers, but organizations as well as individuals encounter the > problem of SaaSS if they do their computing in someone else's > server. > > Then, ChatGTP is using the user input to train their model: > > https://techunwrapped.com/you-can-now-make-chatgpt-not-train-with-your-queries/ > > ... what is constant is that the company can use > our conversations with ChatGPT to train the model. This is not a > surprise or a secret, the company has always reported it. > > There is no doubt that ChatGTP itself is not libre - its model is not > available to public. However, users of the ChatGPT model are technically > providing input that is collaboratively editing that model weights > (training the model further). So, using ChatGTP is a little bit akin > editing Wikipedia pages - collaborating to improve ChatGTP. > In addition, you can pay money to train your own model (via fine-tuning) on top of Open AI's model. Most other providers also let you do this. The model is "yours", and the training is controlled by you with no restrictions I know of. You can't separate it from the underlying model (for technical reasons). I don't know the legal aspects of restrictions on using your own fine-tuned model, but you still access it via SaaSS. > > -- > Ihor Radchenko // yantar92, > Org mode contributor, > Learn more about Org mode at <https://orgmode.org/>. > Support Org development at <https://liberapay.com/org-mode>, > or support my work at <https://liberapay.com/yantar92> > [-- Attachment #2: Type: text/html, Size: 4347 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) 2023-09-06 16:59 ` Andrew Hyatt @ 2023-09-09 0:37 ` Richard Stallman 0 siblings, 0 replies; 68+ messages in thread From: Richard Stallman @ 2023-09-09 0:37 UTC (permalink / raw) To: Andrew Hyatt; +Cc: yantar92, emacs-tangents, jporterbugs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > In addition, you can pay money to train your own model (via fine-tuning) on > top of Open AI's model. Most other providers also let you do this. The > model is "yours", and the training is controlled by you with no > restrictions I know of. You can't separate it from the underlying model > (for technical reasons). I don't know the legal aspects of restrictions on > using your own fine-tuned model, but you still access it via SaaSS. Thanks for informing me about this -- it is interesting. However, in regard to freedom and control, it doesn't really change things. That scheme is the neural network equivalent of making a patch set with which to modify the standard version of a program. If you could get your own copy of the program source, apply your patch set, and run that patched version, you'd have control over that version. Under a suitable free license, that would amount to free software. The situation with TeX is that way. But what they are doing makes it SaaSS all the way. It's legitimate to offer a service of virtual servers on which you can run your choice of system and software. But systems and software that you can't extract from those particular virtual servers are not free software. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) 2023-09-06 12:51 ` Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) Ihor Radchenko 2023-09-06 16:59 ` Andrew Hyatt @ 2023-09-06 22:52 ` Emanuel Berg 2023-09-07 7:28 ` Lucien Cartier-Tilet 2023-09-09 0:38 ` Richard Stallman 2 siblings, 1 reply; 68+ messages in thread From: Emanuel Berg @ 2023-09-06 22:52 UTC (permalink / raw) To: emacs-tangents Ihor Radchenko wrote: >>>> As for LLMs that run on servers, they are a different >>>> issue entirely. They are all SaaSS (Service as a Software >>>> Substitute), and SaaSS is always unjust. >>>> >>>> See >>>> https://gnu.org/philosophy/who-does-that-server-really-serve.html >>>> for explanation. >>> >>> I do not fully agree here. A number of more powerful LLMs >>> have very limiting hardware requirements. For example, >>> some LLMs require 64+Gbs of RAM to run: >> >> That is true, and it is unfortunate. There may be no >> practical way to run a certain model except for SaaSS. >> >> That does not alter the injustice of SaaSS. So we should >> not silence our criticism of SaaSS in those cases, > > This is a rather theoretical consideration, but, talking > about ChatGTP (owned by OpenAI) specifically, should it even > be considered SaaSS? Can't we, i.e. GNU, offer a server where you can run FOSS programs that offer services? Wouldn't that be "their" software to people who use it? They could improve the software like any other FOSS project and, if necessary even, fork it to also run on the server alongside. -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) 2023-09-06 22:52 ` Emanuel Berg @ 2023-09-07 7:28 ` Lucien Cartier-Tilet 2023-09-07 7:57 ` Emanuel Berg 0 siblings, 1 reply; 68+ messages in thread From: Lucien Cartier-Tilet @ 2023-09-07 7:28 UTC (permalink / raw) To: emacs-tangents [-- Attachment #1.1.1: Type: text/plain, Size: 1085 bytes --] Emanuel Berg <incal@dataswamp.org> writes: > Can't we, i.e. GNU, offer a server where you can run FOSS > programs that offer services? > > Wouldn't that be "their" software to people who use it? > > They could improve the software like any other FOSS project > and, if necessary even, fork it to also run on the > server alongside. This is exactly what a French non-profit organisation (Framasoft [1]) does; they offer multiple services with FOSS software such as: - Framadate (simple polls), - Framaform (alternative to Google Forms), - Framapad (alternative to Google Docs), - Mobilizon (event planner), And quite a few others, I’ll let you check the list of their services [2]. They are even the main developers of both PeerTube, an AGPL video service on the Fediverse, and of Mobilizon which I mentioned above and which is also AGPL. [1] <https://framasoft.org/en/> [2] <https://degooglisons-internet.org/en/> -- Lucien “Phundrak” Cartier-Tilet <https://phundrak.com> (Français) <https://phundrak.com/en> (English) Sent from GNU/Emacs [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 861 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) 2023-09-07 7:28 ` Lucien Cartier-Tilet @ 2023-09-07 7:57 ` Emanuel Berg 0 siblings, 0 replies; 68+ messages in thread From: Emanuel Berg @ 2023-09-07 7:57 UTC (permalink / raw) To: emacs-tangents Lucien Cartier-Tilet wrote: >> Can't we, i.e. GNU, offer a server where you can run FOSS >> programs that offer services? >> >> Wouldn't that be "their" software to people who use it? >> >> They could improve the software like any other FOSS project >> and, if necessary even, fork it to also run on the >> server alongside. > > This is exactly what a French non-profit organisation (Framasoft [1]) > does, they offer multiple services with FOSS software such as: > - Framadate (simple polls), > - Framaform (alternative to Google Forms), > - Framapad (alternative to Google Docs), > - Mobilizon (event plannifier), > And quite a few others, I’ll let you check the list of their services [2]. > > They are even the main developers of both PeerTube, an AGPL > video service on the Fediverse, and of Mobilizon which > I mentioned above and which is also AGPL. Ikr? Good work! -- underground experts united https://dataswamp.org/~incal ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) 2023-09-06 12:51 ` Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm) Ihor Radchenko 2023-09-06 16:59 ` Andrew Hyatt 2023-09-06 22:52 ` Emanuel Berg @ 2023-09-09 0:38 ` Richard Stallman 2023-09-09 10:28 ` Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm)) Ihor Radchenko 2 siblings, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-09-09 0:38 UTC (permalink / raw) To: Ihor Radchenko; +Cc: emacs-tangents, jporterbugs, ahyatt [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > ... what is constant is that the company can use > our conversations with ChatGPT to train the model. This is not a > surprise or a secret, the company has always reported it. > There is no doubt that ChatGTP itself is not libre - its model is not > available to public. However, users of the ChatGPT model are technically > providing input that is collaboratively editing that model weights > (training the model further). So, using ChatGTP is a little bit akin > editing Wikipedia pages - collaborating to improve ChatGTP. That is a valid point, at the practical level. But the differences are crucial. 1. In Wikipedia, a contributor voluntarily chooses to participate in editing. Editing participation is separate from consulting the encyclopedia. This fits the word "collaborating." By contrast, when the developers of ChatGTP make it learn from the user, that "contribution" is neither voluntary nor active. It is more "being taken advantage of" than "collaborating". 2. Wikipedia is a community project to develop a free/libre work. (It is no coincidence that this resembles the GNU Project.) Morally it deserves community support, despite some things it handles badly. 
By contrast, ChatGTP is neither a community project nor free/libre. That's perhaps why it arranges to manipulate people into "contributing" rather than letting them choose. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm)) 2023-09-09 0:38 ` Richard Stallman @ 2023-09-09 10:28 ` Ihor Radchenko 2023-09-09 11:19 ` Jean Louis 2023-09-10 0:22 ` Richard Stallman 0 siblings, 2 replies; 68+ messages in thread From: Ihor Radchenko @ 2023-09-09 10:28 UTC (permalink / raw) To: rms; +Cc: emacs-tangents, jporterbugs, ahyatt, team Richard Stallman <rms@gnu.org> writes: > 1. In Wikipedia, a contributor voluntarily chooses to participate in editing, > Editing participation is separate from consulting the encyclopedia. This > fits the word "collaborating. > > By contrast, a when the develoers of ChatGTP make it learn from the > user, that "contribution" is neither voluntary nor active. It is more > "being taken advantage of" than "collaborating". It is actually voluntary now - according to https://techunwrapped.com/you-can-now-make-chatgpt-not-train-with-your-queries/, one can disable or enable training on user queries. By default, it is enabled though. > 2. Wikipedia is a community project to develop a free/libre work. (It > is no coincidence that this resembles the GNU Project.) Morally it > deserves community support, despite some things it handles badly. > > By contrast, ChatGTP is neither a community project nor free/libre. > That's perhaps why it arranges to manipulate people into "contributing" > rather than letting them choose. Indeed, they do hold coercive power as people have no way to copy or run the model independently. However, I do not care much about OpenAI corporate practices - they are as bad as we are used to in other bigtech SaaSS companies. What might be a more interesting question to discuss is an actual, genuine collaborative effort to train a libre (not ChatGTP) model. Currently, improving models is a rather sequential process. If there is one publicly available model, anyone can download the weights, train them locally, and share the results. 
However, if multiple people take a single _same_ version of the model and train it, the results, AFAIK, cannot be combined. As Andrew mentioned, the approach of "patching" a model is quite an interesting idea - if such "patches" can be combined, we can get rid of the above concern and allow collaborative _ethical_ development of models. However, if the "patching" technology can only serve a single "patch" + main model, there is a problem. Improving libre neural networks will become difficult, unless people utilize a collaborative server to continuously improve a model. Such a collaborative server, similar to ChatGPT, will combine "editing" (training) and "consulting" together. And, unlike Wikipedia, these activities are hard to separate. This raises a moral question about practical ways to improve libre neural networks without falling into SaaSS practices. As a practical example, there is https://github.com/khoj-ai/khoj/ - a libre neural network interface in development (it features Emacs support). They recently started https://khoj.dev/ - a cloud aimed at people who cannot afford to run the models locally. This discussion might be one of the ethical considerations of using such a cloud. I CCed the khoj devs. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 68+ messages in thread
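For the LoRA family of fine-tuning methods, the "patches" discussed above are additive low-rank updates to the base weights, and additive updates can in principle be merged by summation. A toy NumPy sketch (hypothetical shapes, not any real model) of combining two independently trained patches:

```python
import numpy as np

rng = np.random.default_rng(0)

# Base model weight matrix, shared by everyone (e.g. one attention projection).
d, r = 8, 2                      # hidden size and patch rank (toy values)
W_base = rng.normal(size=(d, d))

# Two contributors train low-rank "patches" (A @ B) independently,
# each starting from the *same* W_base.
patch_1 = rng.normal(size=(d, r)) @ rng.normal(size=(r, d))
patch_2 = rng.normal(size=(d, r)) @ rng.normal(size=(r, d))

# Because the updates are additive, they can be merged after the fact:
W_merged = W_base + patch_1 + patch_2

# The order of merging does not matter, so contributions combine much like
# patches to source code that touch separate concerns.
W_merged_other_order = (W_base + patch_2) + patch_1
assert np.allclose(W_merged, W_merged_other_order)
```

Whether the merged behavior is sensible is a separate empirical question - summing weights is not the same as joint training - but the mechanics show why an adapter-style ecosystem could, in principle, support Wikipedia-style decentralized contribution.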
* Re: Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm)) 2023-09-09 10:28 ` Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm)) Ihor Radchenko @ 2023-09-09 11:19 ` Jean Louis 2023-09-10 0:22 ` Richard Stallman 1 sibling, 0 replies; 68+ messages in thread From: Jean Louis @ 2023-09-09 11:19 UTC (permalink / raw) To: Ihor Radchenko; +Cc: emacs-tangents, jporterbugs, ahyatt, team * Ihor Radchenko <yantar92@posteo.net> [2023-09-09 13:28]: > > By contrast, ChatGTP is neither a community project nor free/libre. > > That's perhaps why it arranges to manipulate people into "contributing" > > rather than letting them choose. > > Indeed, they do hold coercive power as people have no choice to copy run > the model independently. There is free software for that type of artificial intelligence. People do have a choice: Llama, Llama 2, Alpaca, GPT4All, Dolly, Vicuna, etc. I think that "they do hold coercive power" is out of reality. To find out if they have coercive power you should find the victim of coercion and be able to name the victim. The verb coerce has 1 sense (first 1 from tagged texts) 1. (2) coerce, hale, squeeze, pressure, force -- (to cause to do through pressure or necessity, by physical, moral or intellectual means: "She forced him to take a job in the city"; "He squeezed her for information") Otherwise it sounds like propaganda. There are many services online; nobody needs to use them, and I see no coercion there. > However, I do not care much about OpenAI corporate practices - they are > as bad as we are used to in other bigtech SaaSS companies. What might be > a more interesting question to discuss is actual genuine collaborative > effort training a libre (not ChatGTP) model. Their closed software example is followed by free software. I see that as positive, not "as bad as we are used to in other bigtech..." 
I do not see anything bad here; I see that the company offers a service and customers can freely decide to take the service or not. Keeping farms of servers for that purpose is very expensive. There must be some exchange between customers and the company. Even Wikipedia, and GNU and free software projects, need funds to continue. -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm)) 2023-09-09 10:28 ` Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm)) Ihor Radchenko 2023-09-09 11:19 ` Jean Louis @ 2023-09-10 0:22 ` Richard Stallman 2023-09-10 2:18 ` Debanjum Singh Solanky 1 sibling, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-09-10 0:22 UTC (permalink / raw) To: Ihor Radchenko; +Cc: emacs-tangents, jporterbugs, ahyatt, team [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > However, if the "patching" technology can only serve a single "patch" + > main model, there is a problem. Improving libre neural networks will > become difficult, unless people utilize collaborative server to > continuously improve a model. > Such collaborative server, similar to ChatGPT, will combine "editing" > (training) and "consulting" together. And, unlike Wikipedia, these > activities are hard to separate. If the users in this "community" can't move their work outside of a private "collaborative server", they are in effect prisoners of that server. Whoever keeps them stuck there will have power, and that will tempt per to mistreat them with it. > This raises a moral question about practical ways to improve libre > neural networks without falling into SaaSS practices. From the example above, I conclude it is crucial that people who use a particular platform to modify and run the model have the feasible freedom of copying their modified versions off that platform and onto any other platform that satisfies the specs needed to run these models. 
-- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Collaborative training of Libre LLMs (was: Is ChatGTP SaaSS? (was: [NonGNU ELPA] New package: llm)) 2023-09-10 0:22 ` Richard Stallman @ 2023-09-10 2:18 ` Debanjum Singh Solanky 0 siblings, 0 replies; 68+ messages in thread From: Debanjum Singh Solanky @ 2023-09-10 2:18 UTC (permalink / raw) To: rms; +Cc: Ihor Radchenko, emacs-tangents, jporterbugs, ahyatt, team [-- Attachment #1: Type: text/plain, Size: 1995 bytes --] > > However, if the "patching" technology can only serve a single "patch" + > > main model, there is a problem. Improving libre neural networks will > > become difficult, unless people utilize collaborative server to > > continuously improve a model. > > > Such collaborative server, similar to ChatGPT, will combine "editing" > > (training) and "consulting" together. And, unlike Wikipedia, these > > activities are hard to separate. > > If the users in this "community" can't move their work outside of a > private "collaborative server", they are in effect prisoners of that > server. Whoever keeps them stuck there will have power, and that will > tempt per to mistreat them with it. > Unlike traditional software, AI systems rely critically on the usage data generated to improve the original model. Using copyleft-licensed models may be enough to prevent a server owner from being able to train a better closed model? This would prevent them from holding users hostage on their server. > > This raises a moral question about practical ways to improve libre > > neural networks without falling into SaaSS practices. > > From the example above, I conclude it is crucial that people who use a > particular platform to modify and run the model have the feasible > freedom of copying their modified versions off that platform and onto > any other platform that satisfies the specs needed to run these models. > Platform portability does not address how to improve libre neural networks in an open, community-guided way. 
To collaboratively develop better open models we'd need the generated usage data to be publicly shareable. Attempts like open-assistant (https://open-assistant.io) that share usage data under cc-by-sa may be a good enough solution for this. But it'll fall on the server owners to get explicit user consent and clean sensitive usage data to share this data publicly without liability. -- Debanjum Singh Solanky Founder, Khoj (https://khoj.dev/) [-- Attachment #2: Type: text/html, Size: 3980 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
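The consent-and-scrubbing step described above is mechanical enough to sketch. A hypothetical Python filter (the record fields and license tag are invented for illustration, not any real Khoj or open-assistant schema) that keeps only opted-in records and redacts obvious personal identifiers before publication:

```python
import re

# Redact email addresses as a stand-in for general PII scrubbing.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scrub(text):
    """Redact obvious personal identifiers (here: just email addresses)."""
    return EMAIL_RE.sub("[redacted-email]", text)

def publishable(records):
    """Keep only records whose author explicitly opted in, with scrubbing applied."""
    return [
        {"prompt": scrub(r["prompt"]), "license": "CC-BY-SA-4.0"}
        for r in records
        if r.get("consent") is True
    ]

# Hypothetical usage log: one consented entry, one not.
logs = [
    {"prompt": "Contact me at alice@example.org", "consent": True},
    {"prompt": "private query, do not share", "consent": False},
]
shared = publishable(logs)
```

A real pipeline would need far more aggressive PII detection than a single regex; the point is only that consent gating and redaction happen before any data leaves the server.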
* Re: [NonGNU ELPA] New package: llm 2023-08-31 9:06 ` Ihor Radchenko 2023-08-31 16:29 ` chad 2023-09-04 1:27 ` Richard Stallman @ 2023-09-04 1:27 ` Richard Stallman 2 siblings, 0 replies; 68+ messages in thread From: Richard Stallman @ 2023-09-04 1:27 UTC (permalink / raw) To: Ihor Radchenko; +Cc: jporterbugs, ahyatt, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > As for LLMs that run on servers, they are a different issue entirely. > > They are all SaaSS (Service as a Software Substitute), and SaaSS is > > always unjust. > > > > See https://gnu.org/philosophy/who-does-that-server-really-serve.html > > for explanation. > I do not fully agree here. A number of more powerful LLMs have very > limiting hardware requirements. For example, some LLMs require 64+Gbs of > RAM to run: That is true, and it is unfortunate. There may be no practical way to run a certain model except for SaaSS. That does not alter the injustice of SaaSS. So we should not silence our criticism of SaaSS in those cases. If a user decides to run that model as SaaSS, given this situation, that is per responsibility. We will not try to prevent per from doing so. But we should inform per of the injustice so perse can make that decision aware of the injustice. This is very important. > In such scenario, the base assumption you make in > https://gnu.org/philosophy/who-does-that-server-really-serve.html may no > longer hold for most users: > "Suppose that any software tasks you might need for the job are > implemented in free software, and you have copies, and you have whatever > data you might need, as well as computers of whatever speed, > functionality and capacity might be required." > Thus, for many users (owning less powerful computers) LLMs as a service > are going to be SaaS, not SaaSS. 
> (Given that the SaaS LLM has free licence and users who choose to buy the necessary hardware retain their > freedom to run the same LLM on their hardware.) I think that is a misunderstanding. The text quoted from the page says you have whatever data you might need, as well as computers of whatever speed, functionality and capacity might be required. Whether that is feasible for usual users in some real case is not part of the question. This question is part of the thought experiment. This is meant to clarify the concept of "SaaSS". It is not part of the DEFINITION of SaaSS. It is not intended to say that any service that needs so many resources that you could not run it yourself _is not_ SaaSS. I don't think that practical details about what other choices are feasible for a user affect whether use of a certain service is SaaSS. In principle, those should be independent questions. I will take a look at clarifying that page. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-27 1:07 ` Andrew Hyatt 2023-08-27 13:11 ` Philip Kaludercic @ 2023-08-27 18:36 ` Jim Porter 2023-08-28 0:19 ` Andrew Hyatt 2023-09-04 1:27 ` Richard Stallman 2023-09-04 1:27 ` Richard Stallman 2 siblings, 2 replies; 68+ messages in thread From: Jim Porter @ 2023-08-27 18:36 UTC (permalink / raw) To: Andrew Hyatt, rms; +Cc: emacs-devel On 8/26/2023 6:07 PM, Andrew Hyatt wrote: > To save you the trouble of looking at the code to see what exactly it > says, here's the function I'm using to warn: > > (defun llm--warn-on-nonfree (name tos) > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > NAME is the human readable name of the LLM (e.g 'Open AI'). > > TOS is the URL of the terms of service for the LLM. > > All non-free LLMs should call this function on each llm function > invocation." > (when llm-warn-on-nonfree > (lwarn '(llm nonfree) :warning "%s API is not free software, and > your freedom to use it is restricted. > See %s for the details on the restrictions on use." name tos))) To make this easier on third parties writing their own implementations for other LLMs, maybe you could make this (mostly) automatic? I see that you're using 'cl-defgeneric' in the code, so what about something like this? (cl-defgeneric llm-free-p (provider) "Return non-nil if PROVIDER is a freedom-respecting model." nil) (cl-defmethod llm-free-p ((provider my-free-llm)) t) Then, if all user-facing functions have some implementation that always calls this (maybe using the ":before" key for the generic functions?), third parties won't forget to set up the warning code; instead, they'll need to explicitly mark their LLM provider as free. I also see that there's a defcustom ('llm-warn-on-nonfree') that lets people opt out of this. I think it's a good idea to give users that control, but should this follow a similar pattern to 'inhibit-startup-echo-area-message'? 
Its docstring says: > The startup message is in the echo area as it provides information > about GNU Emacs and the GNU system in general, which we want all > users to see. As this is the least intrusive startup message, > this variable gets specialized treatment to prevent the message > from being disabled site-wide by systems administrators, while > still allowing individual users to do so. > > Setting this variable takes effect only if you do it with the > customization buffer or if your init file contains a line of this > form: > (setq inhibit-startup-echo-area-message "YOUR-USER-NAME") If we want it to be easy for users to opt out of the message, but hard for admins (or other packages) to automate opting out, something like the above might make sense. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-27 18:36 ` Jim Porter @ 2023-08-28 0:19 ` Andrew Hyatt 2023-09-04 1:27 ` Richard Stallman 1 sibling, 0 replies; 68+ messages in thread From: Andrew Hyatt @ 2023-08-28 0:19 UTC (permalink / raw) To: Jim Porter; +Cc: rms, emacs-devel [-- Attachment #1: Type: text/plain, Size: 2988 bytes --] On Sun, Aug 27, 2023 at 2:36 PM Jim Porter <jporterbugs@gmail.com> wrote: > On 8/26/2023 6:07 PM, Andrew Hyatt wrote: > > To save you the trouble of looking at the code to see what exactly it > > says, here's the function I'm using to warn: > > > > (defun llm--warn-on-nonfree (name tos) > > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > > NAME is the human readable name of the LLM (e.g 'Open AI'). > > > > TOS is the URL of the terms of service for the LLM. > > > > All non-free LLMs should call this function on each llm function > > invocation." > > (when llm-warn-on-nonfree > > (lwarn '(llm nonfree) :warning "%s API is not free software, and > > your freedom to use it is restricted. > > See %s for the details on the restrictions on use." name tos))) > > To make this easier on third parties writing their own implementations > for other LLMs, maybe you could make this (mostly) automatic? I see that > you're using 'cl-defgeneric' in the code, so what about something like > this? > > (cl-defgeneric llm-free-p (provider) > "Return non-nil if PROVIDER is a freedom-respecting model." > nil) > > (cl-defmethod llm-free-p ((provider my-free-llm)) > t) > > Then, if all user-facing functions have some implementation that always > calls this (maybe using the ":before" key for the generic functions?), > third parties won't forget to set up the warning code; instead, they'll > need to explicitly mark their LLM provider as free. > Good idea. I implemented something close to what you suggest, but I had to make a few changes to get it to be workable. Thank you for the suggestion! 
> I also see that there's a defcustom ('llm-warn-on-nonfree') that lets > people opt out of this. I think it's a good idea to give users that > control, but should this follow a similar pattern to > 'inhibit-startup-echo-area-message'? Its docstring says: > > > The startup message is in the echo area as it provides information > > about GNU Emacs and the GNU system in general, which we want all > > users to see. As this is the least intrusive startup message, > > this variable gets specialized treatment to prevent the message > > from being disabled site-wide by systems administrators, while > > still allowing individual users to do so. > > > > Setting this variable takes effect only if you do it with the > > customization buffer or if your init file contains a line of this > > form: > > (setq inhibit-startup-echo-area-message "YOUR-USER-NAME") > > If we want it to be easy for users to opt out of the message, but hard > for admins (or other packages) to automate opting out, something like > the above might make sense. > Very interesting, thanks. I took a look at the implementation, and I'd prefer not to do anything like that (which involves looking through the user's init file, and seems like it would miss at least some cases). For now, I'll keep it simple. [-- Attachment #2: Type: text/html, Size: 3788 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
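The generic-function scheme discussed above can be sketched as follows. This is a minimal illustration pieced together from the fragments quoted in this thread — the struct name `llm-example-provider` and the TOS URL are placeholders, and the sketch assumes the `llm-chat` generic and `llm--warn-on-nonfree` helper shown elsewhere in the thread; it is not the package's verbatim source.

```elisp
;; Sketch of the "automatic non-free warning" pattern: providers are
;; plain structs, a generic function reports their freedom status
;; (defaulting to nil, i.e. free), and a :before method on each entry
;; point issues the warning so implementations cannot forget it.
(require 'cl-lib)
(require 'subr-x)  ; for `when-let' on older Emacs versions

(cl-defgeneric llm-nonfree-message-info (provider)
  "Return nil if PROVIDER is free, else a cons of (NAME . TOS-URL)."
  (ignore provider)
  nil)

;; A hypothetical non-free provider opts in by overriding the generic:
(cl-defstruct llm-example-provider)
(cl-defmethod llm-nonfree-message-info ((_ llm-example-provider))
  '("Example AI" . "https://example.invalid/terms"))

;; The :before method runs automatically before every `llm-chat' call:
(cl-defmethod llm-chat :before (provider _prompt)
  "Issue a warning if the LLM is non-free."
  (when-let ((info (llm-nonfree-message-info provider)))
    (llm--warn-on-nonfree (car info) (cdr info))))
```

With this arrangement a free provider needs no extra code at all, which matches the "explicitly mark your provider" design suggested above.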
* Re: [NonGNU ELPA] New package: llm 2023-08-27 18:36 ` Jim Porter 2023-08-28 0:19 ` Andrew Hyatt @ 2023-09-04 1:27 ` Richard Stallman 2023-09-04 5:18 ` Andrew Hyatt 1 sibling, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-09-04 1:27 UTC (permalink / raw) To: Jim Porter; +Cc: ahyatt, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > If we want it to be easy for users to opt out of the message, but hard > for admins (or other packages) to automate opting out, something like > the above might make sense. I think that would be good here. But I think the moral warning message for LLM should be displayed in the main display area. Users will be able to suppress it once they know the point; but unless/until they do, we want it to make an impression. So, when it is displayed, it should not be hidden in an obscure part of the screen like the echo area. That will also make it easy to inform users HOW to suppress the message after having seen it. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-04 1:27 ` Richard Stallman @ 2023-09-04 5:18 ` Andrew Hyatt 2023-09-07 1:21 ` Richard Stallman 0 siblings, 1 reply; 68+ messages in thread From: Andrew Hyatt @ 2023-09-04 5:18 UTC (permalink / raw) To: rms; +Cc: Jim Porter, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1546 bytes --] On Sun, Sep 3, 2023 at 9:27 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > If we want it to be easy for users to opt out of the message, but hard > > for admins (or other packages) to automate opting out, something like > > the above might make sense. > > I think that would be good here. > > But I think the moral warning message for LLM should be displayed in > the main display area. Users will be able to suppress it once they know > the point; but unless/until they do, we want it to make an impression. > So, when it is displayed, it should not be hidden in an obscure part > of the screen like the echo area. > > That will also make it easy to inform users HOW to suppress the message > after having seen it. > The warn functionality in Emacs does this already: it will pop up a buffer with a warning. The user can choose, by clicking on the (-) symbol to the left, to suppress the warning, or suppress the popup. Since the warn functionality is built into Emacs, I prefer to use it rather than create a similar functionality that is nonstandard. > > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > [-- Attachment #2: Type: text/html, Size: 2310 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
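For reference, the built-in warning mechanism described above works roughly like this (standard Emacs warning APIs; the provider name and URL here are placeholders, not the package's actual values):

```elisp
;; `lwarn' pops up a *Warnings* buffer; the suppression widgets shown
;; there customize the variables below, which users can also set
;; directly in their init file.
(lwarn '(llm nonfree) :warning
       "%s API is not free software; see %s for the restrictions on use"
       "Example Provider" "https://example.invalid/terms")

;; Stop the popup for this warning type (the warning is still logged):
(add-to-list 'warning-suppress-types '(llm nonfree))

;; Suppress this warning type entirely, including logging:
(add-to-list 'warning-suppress-log-types '(llm nonfree))
```

Because suppression is keyed on the `(llm nonfree)` type symbol, the user's choice persists across all providers that warn through the same mechanism.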
* Re: [NonGNU ELPA] New package: llm 2023-09-04 5:18 ` Andrew Hyatt @ 2023-09-07 1:21 ` Richard Stallman 2023-09-12 4:54 ` Andrew Hyatt 0 siblings, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-09-07 1:21 UTC (permalink / raw) To: Andrew Hyatt; +Cc: jporterbugs, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The warn functionality in Emacs does this already: it will pop up a buffer > with a warning. The user can choose, by clicking on the (-) symbol to the > left, to suppress the warning, or suppress the popup. Since the warn > functionality is built into Emacs, I prefer to use it rather than create a similar > functionality that is nonstandard. That is a good approach for this. A few days ago, someone asked if it might be possible to have a general Emacs-wide way of customizing warnings and notifications that would apply to the various mechanisms. It could be a good idea. If someone wants to think about what specific customizations this might do, that might lead to ideas to implement. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-07 1:21 ` Richard Stallman @ 2023-09-12 4:54 ` Andrew Hyatt 2023-09-12 9:57 ` Philip Kaludercic 2023-09-12 15:05 ` Stefan Kangas 0 siblings, 2 replies; 68+ messages in thread From: Andrew Hyatt @ 2023-09-12 4:54 UTC (permalink / raw) To: rms; +Cc: jporterbugs, emacs-devel [-- Attachment #1: Type: text/plain, Size: 2033 bytes --] To bring this thread back to the original purpose: It doesn't seem like there are any objections to having this package in GNU ELPA, in its current form. I'd like to resolve this long-running discussion by committing the first version. I believe I have commit access, so if no one does object, I can add this to GNU ELPA myself. I'll do so on Friday (September 15th), unless someone wants me to hold off. Another question is whether this should be one package or many. The many-package option would have the llm and llm-fake package in the main llm package, with a package for all llm clients, such as llm-openai and llm-vertex (which are the two options I have now). If someone has an opinion on this, please let me know. On Wed, Sep 6, 2023 at 9:21 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > The warn functionality in emacs does this already: it will pop up a > buffer > > with a warning. The user can choose, by clicking on the (-) symbol to > the > > left, to suppress the warning, or suppress the popup. Since the warn > > functionality is built-into emacs, I prefer to use it then create a > similar > > functionality that is nonstandard. > > That is a good approach for this. > > A few days ago, someone asked if it might be possible > to have a general Emacs-wide way of customizing warnings > and notifications that would apply to the various mechanisms. > It could be a good idea. 
If someone wants to think about what > specific customizations this might do, that might lead to ideas > to implement. > > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > [-- Attachment #2: Type: text/html, Size: 2716 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-12 4:54 ` Andrew Hyatt @ 2023-09-12 9:57 ` Philip Kaludercic 2023-09-12 15:05 ` Stefan Kangas 1 sibling, 0 replies; 68+ messages in thread From: Philip Kaludercic @ 2023-09-12 9:57 UTC (permalink / raw) To: Andrew Hyatt; +Cc: rms, jporterbugs, emacs-devel Andrew Hyatt <ahyatt@gmail.com> writes: > To bring this thread back to the original purpose: It doesn't seem like > there are any objections to having this package in GNU ELPA, in its current > form. I'd like to resolve this long-running discussion by committing the > first version. I believe I have commit access, so if no one does object, I > can add this to GNU ELPA myself. I'll do so on Friday (September 15th), > unless someone wants me to hold off. No objections from my side. > Another question is whether this should be one package or many. The > many-package option would have the llm and llm-fake package in the main llm > package, with a package for all llm clients, such as llm-openai and > llm-vertex (which are the two options I have now). If someone has an > opinion on this, please let me know. I think it would be easier to have it distributed as a single package, but no strong opinions here. > > > On Wed, Sep 6, 2023 at 9:21 PM Richard Stallman <rms@gnu.org> wrote: > >> [[[ To any NSA and FBI agents reading my email: please consider ]]] >> [[[ whether defending the US Constitution against all enemies, ]]] >> [[[ foreign or domestic, requires you to follow Snowden's example. ]]] >> >> > The warn functionality in emacs does this already: it will pop up a >> buffer >> > with a warning. The user can choose, by clicking on the (-) symbol to >> the >> > left, to suppress the warning, or suppress the popup. Since the warn >> > functionality is built-into emacs, I prefer to use it then create a >> similar >> > functionality that is nonstandard. >> >> That is a good approach for this. 
>> >> A few days ago, someone asked if it might be possible >> to have a general Emacs-wide way of customizing warnings >> and notifications that would apply to the various mechanisms. >> It could be a good idea. If someone wants to think about what >> specific customizations this might do, that might lead to ideas >> to implement. >> >> >> -- >> Dr Richard Stallman (https://stallman.org) >> Chief GNUisance of the GNU Project (https://gnu.org) >> Founder, Free Software Foundation (https://fsf.org) >> Internet Hall-of-Famer (https://internethalloffame.org) >> >> >> ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-12 4:54 ` Andrew Hyatt 2023-09-12 9:57 ` Philip Kaludercic @ 2023-09-12 15:05 ` Stefan Kangas 2023-09-19 16:26 ` Andrew Hyatt 1 sibling, 1 reply; 68+ messages in thread From: Stefan Kangas @ 2023-09-12 15:05 UTC (permalink / raw) To: Andrew Hyatt, rms; +Cc: jporterbugs, emacs-devel Andrew Hyatt <ahyatt@gmail.com> writes: > Another question is whether this should be one package or many. The > many-package option would have the llm and llm-fake package in the main llm > package, with a package for all llm clients, such as llm-openai and > llm-vertex (which are the two options I have now). If someone has an > opinion on this, please let me know. It's easier for users if it's just one package. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-12 15:05 ` Stefan Kangas @ 2023-09-19 16:26 ` Andrew Hyatt 2023-09-19 16:34 ` Philip Kaludercic 0 siblings, 1 reply; 68+ messages in thread From: Andrew Hyatt @ 2023-09-19 16:26 UTC (permalink / raw) To: Stefan Kangas; +Cc: rms, jporterbugs, emacs-devel [-- Attachment #1: Type: text/plain, Size: 843 bytes --] I've submitted the configuration for llm and set up the branch from my repository last Friday. However, I'm still not seeing this package being reflected in GNU ELPA's package archive. I followed the instructions, but perhaps there's some step that I've missed, or it is only periodically rebuilt? On Tue, Sep 12, 2023 at 11:05 AM Stefan Kangas <stefankangas@gmail.com> wrote: > Andrew Hyatt <ahyatt@gmail.com> writes: > > > Another question is whether this should be one package or many. The > > many-package option would have the llm and llm-fake package in the main > llm > > package, with a package for all llm clients, such as llm-openai and > > llm-vertex (which are the two options I have now). If someone has an > > opinion on this, please let me know. > > It's easier for users if it's just one package. > [-- Attachment #2: Type: text/html, Size: 1238 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-19 16:26 ` Andrew Hyatt @ 2023-09-19 16:34 ` Philip Kaludercic 2023-09-19 18:19 ` Andrew Hyatt 0 siblings, 1 reply; 68+ messages in thread From: Philip Kaludercic @ 2023-09-19 16:34 UTC (permalink / raw) To: Andrew Hyatt; +Cc: Stefan Kangas, rms, jporterbugs, emacs-devel [-- Attachment #1: Type: text/plain, Size: 1080 bytes --] Andrew Hyatt <ahyatt@gmail.com> writes: > I've submitted the configuration for llm and set up the branch from my > repository last Friday. However, I'm still not seeing this package being > reflected in GNU ELPA's package archive. I followed the instructions, but > perhaps there's some step that I've missed, or it is only periodically > rebuilt? Did you try to run make build/llm? I get this error: --8<---------------cut here---------------start------------->8--- $ make build/llm emacs --batch -l /home/philip/.../elpa/admin/elpa-admin.el \ -f elpaa-batch-make-one-package llm Cloning branch llm: Preparing worktree (new branch 'externals/llm') branch 'externals/llm' set up to track 'origin/externals/llm'. HEAD is now at 39ae6fc794 Assign copyright to FSF, in preparation of inclusion to GNU ELPA Debugger entered--Lisp error: (search-failed ";;; llm.el ends here") ... --8<---------------cut here---------------end--------------->8--- In other words the footer is missing. I have prepared a patch that would address that and a few other checkdoc issues: [-- Attachment #2: Type: text/plain, Size: 4851 bytes --] diff --git a/llm.el b/llm.el index 11b508cb36..08f07b65ca 100644 --- a/llm.el +++ b/llm.el @@ -23,9 +23,9 @@ ;;; Commentary: ;; This file defines a generic interface for LLMs (large language models), and -;; functionality they can provide. Not all LLMs will support all of these, but +;; functionality they can provide. Not all LLMs will support all of these, but ;; programs that want to integrate with LLMs can code against the interface, and -;; users can then choose the LLMs they want to use. 
It's advisable to have the +;; users can then choose the LLMs they want to use. It's advisable to have the ;; possibility of using multiple LLMs when that make sense for different ;; functionality. ;; @@ -50,7 +50,7 @@ (defun llm--warn-on-nonfree (name tos) "Issue a warning if `llm-warn-on-nonfree' is non-nil. -NAME is the human readable name of the LLM (e.g 'Open AI'). +NAME is the human readable name of the LLM (e.g \"Open AI\"). TOS is the URL of the terms of service for the LLM. @@ -72,7 +72,7 @@ EXAMPLES is a list of conses, where the car is an example inputs, and cdr is the corresponding example outputs. This is optional. INTERACTIONS is a list message sent by either the llm or the -user. It is a list of `llm-chat-prompt-interaction' objects. This +user. It is a list of `llm-chat-prompt-interaction' objects. This is required. TEMPERATURE is a floating point number with a minimum of 0, and @@ -80,8 +80,7 @@ maximum of 1, which controls how predictable the result is, with 0 being the most predicatable, and 1 being the most creative. This is not required. -MAX-TOKENS is the maximum number of tokens to generate. This is optional. -" +MAX-TOKENS is the maximum number of tokens to generate. This is optional." context examples interactions temperature max-tokens) (cl-defstruct llm-chat-prompt-interaction @@ -102,19 +101,20 @@ This should be a cons of the name of the LLM, and the URL of the terms of service. If the LLM is free and has no restrictions on use, this should -return nil. Since this function already returns nil, there is no +return nil. Since this function already returns nil, there is no need to override it." (ignore provider) nil) (cl-defgeneric llm-chat (provider prompt) "Return a response to PROMPT from PROVIDER. -PROMPT is a `llm-chat-prompt'. The response is a string." +PROMPT is a `llm-chat-prompt'. The response is a string." 
(ignore provider prompt) (signal 'not-implemented nil)) (cl-defmethod llm-chat ((_ (eql nil)) _) - (error "LLM provider was nil. Please set the provider in the application you are using.")) + "Catch trivial configuration mistake." + (error "LLM provider was nil. Please set the provider in the application you are using")) (cl-defmethod llm-chat :before (provider _) "Issue a warning if the LLM is non-free." @@ -130,7 +130,8 @@ ERROR-CALLBACK receives the error response." (signal 'not-implemented nil)) (cl-defmethod llm-chat-async ((_ (eql nil)) _ _ _) - (error "LLM provider was nil. Please set the provider in the application you are using.")) + "Catch trivial configuration mistake." + (error "LLM provider was nil. Please set the provider in the application you are using")) (cl-defmethod llm-chat-async :before (provider _ _ _) "Issue a warning if the LLM is non-free." @@ -143,7 +144,8 @@ ERROR-CALLBACK receives the error response." (signal 'not-implemented nil)) (cl-defmethod llm-embedding ((_ (eql nil)) _) - (error "LLM provider was nil. Please set the provider in the application you are using.")) + "Catch trivial configuration mistake." + (error "LLM provider was nil. Please set the provider in the application you are using")) (cl-defmethod llm-embedding :before (provider _) "Issue a warning if the LLM is non-free." @@ -159,7 +161,8 @@ error signal and a string message." (signal 'not-implemented nil)) (cl-defmethod llm-embedding-async ((_ (eql nil)) _ _ _) - (error "LLM provider was nil. Please set the provider in the application you are using.")) + "Catch trivial configuration mistake." + (error "LLM provider was nil. Please set the provider in the application you are using")) (cl-defmethod llm-embedding-async :before (provider _ _ _) "Issue a warning if the LLM is non-free." @@ -169,7 +172,7 @@ error signal and a string message." (cl-defgeneric llm-count-tokens (provider string) "Return the number of tokens in STRING from PROVIDER. 
This may be an estimate if the LLM does not provide an exact -count. Different providers might tokenize things in different +count. Different providers might tokenize things in different ways." (ignore provider) (with-temp-buffer @@ -199,3 +202,4 @@ This should only be used for logging or debugging." ""))) (provide 'llm) +;;; llm.el ends here [-- Attachment #3: Type: text/plain, Size: 559 bytes --] > > On Tue, Sep 12, 2023 at 11:05 AM Stefan Kangas <stefankangas@gmail.com> > wrote: > >> Andrew Hyatt <ahyatt@gmail.com> writes: >> >> > Another question is whether this should be one package or many. The >> > many-package option would have the llm and llm-fake package in the main >> llm >> > package, with a package for all llm clients, such as llm-openai and >> > llm-vertex (which are the two options I have now). If someone has an >> > opinion on this, please let me know. >> >> It's easier for users if it's just one package. >> ^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-09-19 16:34 ` Philip Kaludercic @ 2023-09-19 18:19 ` Andrew Hyatt 0 siblings, 0 replies; 68+ messages in thread From: Andrew Hyatt @ 2023-09-19 18:19 UTC (permalink / raw) To: Philip Kaludercic; +Cc: Stefan Kangas, emacs-devel, jporterbugs, rms [-- Attachment #1: Type: text/plain, Size: 7223 bytes --] On Tue, Sep 19, 2023 at 12:34 PM Philip Kaludercic <philipk@posteo.net> wrote: > Andrew Hyatt <ahyatt@gmail.com> writes: > > > I've submitted the configuration for llm and set up the branch from my > > repository last Friday. However, I'm still not seeing this package being > > reflected in GNU ELPA's package archive. I followed the instructions, > but > > perhaps there's some step that I've missed, or it is only periodically > > rebuilt? > > Did you try to run make build/llm? I get this error: I did build it, and it seemed to work. I’m not sure what I’m doing differently but I appreciate the patch which I’ll apply later today. Thank you for your help. > > > --8<---------------cut here---------------start------------->8--- > $ make build/llm > emacs --batch -l /home/philip/.../elpa/admin/elpa-admin.el \ > -f elpaa-batch-make-one-package llm > Cloning branch llm: > Preparing worktree (new branch 'externals/llm') > branch 'externals/llm' set up to track 'origin/externals/llm'. > HEAD is now at 39ae6fc794 Assign copyright to FSF, in preparation of > inclusion to GNU ELPA > > Debugger entered--Lisp error: (search-failed ";;; llm.el ends here") > ... > --8<---------------cut here---------------end--------------->8--- > > In other words the footer is missing. I have prepared a patch that > would address that and a few other checkdoc issues: > > diff --git a/llm.el b/llm.el > index 11b508cb36..08f07b65ca 100644 > --- a/llm.el > +++ b/llm.el > @@ -23,9 +23,9 @@ > > ;;; Commentary: > ;; This file defines a generic interface for LLMs (large language > models), and > -;; functionality they can provide. 
Not all LLMs will support all of > these, but > +;; functionality they can provide. Not all LLMs will support all of > these, but > ;; programs that want to integrate with LLMs can code against the > interface, and > -;; users can then choose the LLMs they want to use. It's advisable to > have the > +;; users can then choose the LLMs they want to use. It's advisable to > have the > ;; possibility of using multiple LLMs when that make sense for different > ;; functionality. > ;; > @@ -50,7 +50,7 @@ > > (defun llm--warn-on-nonfree (name tos) > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > -NAME is the human readable name of the LLM (e.g 'Open AI'). > +NAME is the human readable name of the LLM (e.g \"Open AI\"). > > TOS is the URL of the terms of service for the LLM. > > @@ -72,7 +72,7 @@ EXAMPLES is a list of conses, where the car is an example > inputs, and cdr is the corresponding example outputs. This is optional. > > INTERACTIONS is a list message sent by either the llm or the > -user. It is a list of `llm-chat-prompt-interaction' objects. This > +user. It is a list of `llm-chat-prompt-interaction' objects. This > is required. > > TEMPERATURE is a floating point number with a minimum of 0, and > @@ -80,8 +80,7 @@ maximum of 1, which controls how predictable the result > is, with > 0 being the most predicatable, and 1 being the most creative. > This is not required. > > -MAX-TOKENS is the maximum number of tokens to generate. This is optional. > -" > +MAX-TOKENS is the maximum number of tokens to generate. This is > optional." > context examples interactions temperature max-tokens) > > (cl-defstruct llm-chat-prompt-interaction > @@ -102,19 +101,20 @@ This should be a cons of the name of the LLM, and > the URL of the > terms of service. > > If the LLM is free and has no restrictions on use, this should > -return nil. Since this function already returns nil, there is no > +return nil. 
Since this function already returns nil, there is no > need to override it." > (ignore provider) > nil) > > (cl-defgeneric llm-chat (provider prompt) > "Return a response to PROMPT from PROVIDER. > -PROMPT is a `llm-chat-prompt'. The response is a string." > +PROMPT is a `llm-chat-prompt'. The response is a string." > (ignore provider prompt) > (signal 'not-implemented nil)) > > (cl-defmethod llm-chat ((_ (eql nil)) _) > - (error "LLM provider was nil. Please set the provider in the > application you are using.")) > + "Catch trivial configuration mistake." > + (error "LLM provider was nil. Please set the provider in the > application you are using")) > > (cl-defmethod llm-chat :before (provider _) > "Issue a warning if the LLM is non-free." > @@ -130,7 +130,8 @@ ERROR-CALLBACK receives the error response." > (signal 'not-implemented nil)) > > (cl-defmethod llm-chat-async ((_ (eql nil)) _ _ _) > - (error "LLM provider was nil. Please set the provider in the > application you are using.")) > + "Catch trivial configuration mistake." > + (error "LLM provider was nil. Please set the provider in the > application you are using")) > > (cl-defmethod llm-chat-async :before (provider _ _ _) > "Issue a warning if the LLM is non-free." > @@ -143,7 +144,8 @@ ERROR-CALLBACK receives the error response." > (signal 'not-implemented nil)) > > (cl-defmethod llm-embedding ((_ (eql nil)) _) > - (error "LLM provider was nil. Please set the provider in the > application you are using.")) > + "Catch trivial configuration mistake." > + (error "LLM provider was nil. Please set the provider in the > application you are using")) > > (cl-defmethod llm-embedding :before (provider _) > "Issue a warning if the LLM is non-free." > @@ -159,7 +161,8 @@ error signal and a string message." > (signal 'not-implemented nil)) > > (cl-defmethod llm-embedding-async ((_ (eql nil)) _ _ _) > - (error "LLM provider was nil. 
Please set the provider in the > application you are using.")) > + "Catch trivial configuration mistake." > + (error "LLM provider was nil. Please set the provider in the > application you are using")) > > (cl-defmethod llm-embedding-async :before (provider _ _ _) > "Issue a warning if the LLM is non-free." > @@ -169,7 +172,7 @@ error signal and a string message." > (cl-defgeneric llm-count-tokens (provider string) > "Return the number of tokens in STRING from PROVIDER. > This may be an estimate if the LLM does not provide an exact > -count. Different providers might tokenize things in different > +count. Different providers might tokenize things in different > ways." > (ignore provider) > (with-temp-buffer > @@ -199,3 +202,4 @@ This should only be used for logging or debugging." > ""))) > > (provide 'llm) > +;;; llm.el ends here > > > > > > On Tue, Sep 12, 2023 at 11:05 AM Stefan Kangas <stefankangas@gmail.com> > > wrote: > > > >> Andrew Hyatt <ahyatt@gmail.com> writes: > >> > >> > Another question is whether this should be one package or many. The > >> > many-package option would have the llm and llm-fake package in the > main > >> llm > >> > package, with a package for all llm clients, such as llm-openai and > >> > llm-vertex (which are the two options I have now). If someone has an > >> > opinion on this, please let me know. > >> > >> It's easier for users if it's just one package. > >> > [-- Attachment #2: Type: text/html, Size: 8668 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-27 1:07 ` Andrew Hyatt 2023-08-27 13:11 ` Philip Kaludercic 2023-08-27 18:36 ` Jim Porter @ 2023-09-04 1:27 ` Richard Stallman 2 siblings, 0 replies; 68+ messages in thread From: Richard Stallman @ 2023-09-04 1:27 UTC (permalink / raw) To: Andrew Hyatt; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > (defun llm--warn-on-nonfree (name tos) > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > NAME is the human readable name of the LLM (e.g 'Open AI'). > TOS is the URL of the terms of service for the LLM. > All non-free LLMs should call this function on each llm function > invocation." > (when llm-warn-on-nonfree > (lwarn '(llm nonfree) :warning "%s API is not free software, and your > freedom to use it is restricted. > See %s for the details on the restrictions on use." name tos))) > If this is sufficient, please consider accepting this package into GNU ELPA > (see above where we decided this is a better fit than the Non-GNU ELPA). I think GNU ELPA (or Emacs core) is the right place for this package, once whatever details get finalized. It is not a mere nice add-on; rather, it is something that various other packages will depend on. Anything like that should be a part of Emacs. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm 2023-08-07 23:54 [NonGNU ELPA] New package: llm Andrew Hyatt 2023-08-08 5:42 ` Philip Kaludercic 2023-08-09 3:47 ` Richard Stallman @ 2023-08-09 3:47 ` Richard Stallman 2023-08-09 4:06 ` Andrew Hyatt 2 siblings, 1 reply; 68+ messages in thread From: Richard Stallman @ 2023-08-09 3:47 UTC (permalink / raw) To: Andrew Hyatt; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I've created a new package called llm, for the purpose of abstracting the > interface to various large language model providers. Note that packages in core Emacs or in GNU ELPA should not depend on anything in NonGNU ELPA. If llm is meant for other packages to use, it should be in GNU ELPA, not NonGNU ELPA. Why did you plan to put it in NonGNU ELPA? > I prefer that this is NonGNU, because I suspect people would like to > contribute interfaces to different LLM, and not all of them will have FSF > papers. I don't follow the logic here. It looks like the llm package is intended to be generic, so it would be used by other packages to implement support for specific models. If the llm package is on GNU ELPA, it can be used from packages no matter how those packages are distributed. But if the llm package is in NonGNU ELPA, it can only be used from packages in NonGNU ELPA. Have I misunderstood the intended design? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [NonGNU ELPA] New package: llm
  From: Andrew Hyatt @ 2023-08-09 4:06 UTC
  To: rms; +Cc: emacs-devel

On Tue, Aug 8, 2023 at 11:47 PM Richard Stallman <rms@gnu.org> wrote:

> > I've created a new package called llm, for the purpose of abstracting
> > the interface to various large language model providers.
>
> Note that packages in core Emacs or in GNU ELPA should not depend on
> anything in NonGNU ELPA.  If llm is meant for other packages to use,
> it should be in GNU ELPA, not NonGNU ELPA.
>
> Why did you plan to put it in NonGNU ELPA?

The logic was the same logic you quote below (I'll explain better what my
point was below), but I agree that it would limit the use, so GNU ELPA
makes more sense.  Another factor was that I am using request.el, which is
not in GNU ELPA, so I'd have to rewrite it, which complicates the code.

> > I prefer that this is NonGNU, because I suspect people would like to
> > contribute interfaces to different LLM, and not all of them will have
> > FSF papers.
>
> I don't follow the logic here.  It looks like the llm package is
> intended to be generic, so it would be used by other packages to
> implement support for specific models.  If the llm package is on GNU
> ELPA, it can be used from packages no matter how those packages are
> distributed.

It wasn't about use, it's more about accepting significant code
contributions, which is less restricted with NonGNU ELPA, since I wouldn't
have to ask for FSF papers.

> But if the llm package is in NonGNU ELPA, it can only be used from
> packages in NonGNU ELPA.
>
> Have I misunderstood the intended design?

You understood correctly.  This is a package designed to be used as a
library from other packages.
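The library-from-other-packages design discussed above rests on Emacs's cl-generic dispatch: llm defines generic functions, and each provider backend specializes them. The sketch below illustrates that pattern only; the names (`my-llm-chat`, `my-llm-openai`) are hypothetical stand-ins, not the llm package's actual API, and the method body fakes the network call.

```elisp
;; Illustrative sketch of the generic-function abstraction described
;; above.  Names are hypothetical; the real llm package's API differs.
(require 'cl-lib)

;; A provider is just a struct carrying whatever the backend needs.
(cl-defstruct my-llm-openai
  "Hypothetical Open AI provider configuration."
  key model)

;; The abstract layer: one generic function per capability.
(cl-defgeneric my-llm-chat (provider prompt)
  "Send PROMPT to PROVIDER and return the response text.")

;; Each backend specializes the generic function on its provider type.
(cl-defmethod my-llm-chat ((provider my-llm-openai) prompt)
  ;; A real backend would issue an HTTP request here; this stub
  ;; merely echoes its inputs.
  (format "[%s reply to: %s]" (my-llm-openai-model provider) prompt))

;; Client packages call only the generic function, so they work with
;; any provider object, however that backend is distributed.
(my-llm-chat (make-my-llm-openai :model "gpt-4") "Hello")
```

Because the dispatch happens on the provider object a user configures, a client package never needs a compile-time dependency on any particular backend, which is what makes the ELPA placement of the abstract layer the deciding factor.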
* Re: [NonGNU ELPA] New package: llm
  From: Richard Stallman @ 2023-08-12 2:44 UTC
  To: Andrew Hyatt; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> It wasn't about use, it's more about accepting significant code
> contributions, which is less restricted with NonGNU ELPA, since I
> wouldn't have to ask for FSF papers.

This problem does arise, but it isn't a big problem in practice.  We get
lots of significant code contributions for the Emacs core and GNU ELPA.
It is only rarely that there is an obstacle.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)