* Help building Pen.el (GPT for emacs) @ 2021-06-30 4:36 Shane Mulligan 2021-07-02 13:30 ` Jean Louis ` (3 more replies) 0 siblings, 4 replies; 75+ messages in thread From: Shane Mulligan @ 2021-06-30 4:36 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 599 bytes --] Hey guys. It looks like OpenAI is collaborating with GitHub on their GPT stuff, so any assistance in building an editor in emacs would be greatly appreciated. I made a start 4 months ago, link below: - https://github.com/semiosis/pen.el/ - https://copilot.github.com/ - https://news.ycombinator.com/item?id=27676266 GPT-3+vscode is an emacs killer, but emacs is a much better platform for building this stuff, so please help! Thanks. Shane How to contact me: 🇦🇺 00 61 421 641 250 🇳🇿 00 64 21 1462 759 <+64-21-1462-759> mullikine@gmail.com [-- Attachment #2: Type: text/html, Size: 4720 bytes --] ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-06-30 4:36 Help building Pen.el (GPT for emacs) Shane Mulligan @ 2021-07-02 13:30 ` Jean Louis 2021-07-02 13:40 ` Jean Louis ` (2 subsequent siblings) 3 siblings, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-02 13:30 UTC (permalink / raw) To: Shane Mulligan; +Cc: emacs-devel * Shane Mulligan <mullikine@gmail.com> [2021-06-30 07:31]: > Hey guys. It looks like OpenAI is collaborating with GitHub on their GPT > stuff, so any assistance in building an editor in emacs would be greatly > appreciated. I made a start 4 months ago, link below: > > https://github.com/semiosis/pen.el/ In regards to licensing, it is recommended that you apply properly the license in each file, not just main one file. There is reason for that, files may be re-used and distributed not only in the package. Each file should point to the license. Please do {C-h C-c} and search for it. I will do more of review when I get it to work. How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. <one line to give the program's name and a brief idea of what it does.> Copyright (C) <year> <name of author> This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. 
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>. Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: <program> Copyright (C) <year> <name of author> This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see <https://www.gnu.org/licenses/>. -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
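Jean's point that every source file should carry at least the copyright line and a pointer to the license can be checked mechanically. The following is only a minimal sketch: the function name `missing_license_notice` is hypothetical, and the two search strings are taken from the notice quoted above, so adjust them if your header is worded differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of pen.el): print every file among the
# arguments that lacks a copyright line or a mention of the GNU GPL.
missing_license_notice() {
    for f in "$@"; do
        # Skip arguments that are not regular files (e.g. unmatched globs).
        [ -f "$f" ] || continue
        if ! grep -q 'Copyright (C)' "$f" || ! grep -q 'GNU General Public License' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example: check the Emacs Lisp files in an assumed checkout layout.
missing_license_notice ./*.el ./src/*.el
```

Run from the top of a checkout, this lists the files that still need a header; an empty result means every checked file mentions both strings.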
* Re: Help building Pen.el (GPT for emacs) 2021-06-30 4:36 Help building Pen.el (GPT for emacs) Shane Mulligan 2021-07-02 13:30 ` Jean Louis @ 2021-07-02 13:40 ` Jean Louis 2021-07-02 13:57 ` Jean Louis 2021-07-15 11:58 ` Stefan Kangas 3 siblings, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-02 13:40 UTC (permalink / raw) To: Shane Mulligan; +Cc: emacs-devel * Shane Mulligan <mullikine@gmail.com> [2021-06-30 07:31]: > Hey guys. It looks like OpenAI is collaborating with GitHub on their GPT > stuff, so any assistance in building an editor in emacs would be greatly > appreciated. I made a start 4 months ago, link below: > > https://github.com/semiosis/pen.el/ I could install `openai' by using `pip', so far so good. Though using your API key I have to reject for privacy purposes, so I have applied on their website. -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs)
From: Jean Louis @ 2021-07-02 13:57 UTC
To: Shane Mulligan; +Cc: emacs-devel

* Shane Mulligan <mullikine@gmail.com> [2021-06-30 07:31]:
> Hey guys. It looks like OpenAI is collaborating with GitHub on their GPT
> stuff, so any assistance in building an editor in emacs would be greatly
> appreciated. I made a start 4 months ago, link below:
> https://github.com/semiosis/pen.el/

Sadly, I cannot install the required dependency: I cannot find libclang
in the list of my packages.

error: failed to run custom build command for `emacs_module v0.10.0`

Caused by:
  process didn't exit successfully: `/home/data1/protected/Programming/git/emacs-yamlmod/target/release/build/emacs_module-a4300f25c129cfa5/build-script-build` (exit code: 101)
--- stderr
thread 'main' panicked at 'Unable to find libclang: "couldn\'t find any valid shared libraries matching: [\'libclang.so\', \'libclang-*.so\', \'libclang.so.*\'], set the `LIBCLANG_PATH` environment variable to a path where one of these files can be found (invalid: [])"', /home/data1/protected/.cargo/registry/src/github.com-1ecc6299db9ec823/bindgen-0.48.1/src/lib.rs:1652:31
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

--
Jean
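The build error in this message names its own remedy: point `LIBCLANG_PATH` at a directory containing one of the listed shared libraries. A minimal sketch of that step follows. The helper name `find_libclang_dir` and the candidate directories are assumptions made for illustration, not something pen.el or emacs-yamlmod provides, and libclang still has to be installed first through whatever means your system offers.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory among the arguments that
# contains a libclang shared library matching the patterns listed in the
# bindgen error message (libclang.so, libclang-*.so, libclang.so.*).
find_libclang_dir() {
    for dir in "$@"; do
        for lib in "$dir"/libclang.so "$dir"/libclang-*.so "$dir"/libclang.so.*; do
            if [ -e "$lib" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    return 1
}

# The candidate directories below are guesses; adjust them for your system.
if found=$(find_libclang_dir /usr/lib /usr/lib64 /usr/local/lib /usr/lib/llvm-*/lib); then
    export LIBCLANG_PATH="$found"
    echo "LIBCLANG_PATH=$LIBCLANG_PATH"
else
    echo "libclang not found; install the LLVM/Clang libraries first" >&2
fi
```

With `LIBCLANG_PATH` exported in the same shell, rerunning the cargo build should get past this particular panic, provided the library found is one bindgen accepts.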
* Re: Help building Pen.el (GPT for emacs)
From: Shane Mulligan @ 2021-07-03 6:34 UTC
To: Shane Mulligan, emacs-devel

Hi Jean,

Thanks for emailing me to help. I'm actively trying to set up a Docker
image to make this whole thing much easier. Hopefully I can get this done
in the next few days.
I will let you know when I have made progress.

Thank you,
* Re: Help building Pen.el (GPT for emacs)
From: Jean Louis @ 2021-07-03 22:21 UTC
To: Shane Mulligan; +Cc: emacs-devel

* Shane Mulligan <mullikine@gmail.com> [2021-07-03 15:06]:
> Hi Jean,
>
> Thanks for emailing me to help. I'm actively trying to set up a docker
> image to make this whole thing much easier. Hopefully I can get this done
> in the next few days.
> I will let you know when I have made progress.

I don't see how that is easier. It will not be easier for me, as I
cannot use Docker.

You should perhaps describe the dependencies without assuming that
everybody uses the apt package manager: give the URLs of the
dependencies in addition to their possible package names, but not how
to build them. That way every user can get the source independently of
the package manager.

Please also think of BSD and other systems where Docker may not work.

--
Jean
* Re: Help building Pen.el (GPT for emacs) 2021-07-03 22:21 ` Jean Louis @ 2021-07-03 23:21 ` Arthur Miller 2021-07-03 23:42 ` Jean Louis 0 siblings, 1 reply; 75+ messages in thread From: Arthur Miller @ 2021-07-03 23:21 UTC (permalink / raw) To: Shane Mulligan; +Cc: emacs-devel Jean Louis <bugs@gnu.support> writes: > Please also think on BSD and other systems where maybe docker does not work. https://wiki.freebsd.org/Docker ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-03 23:21 ` Arthur Miller @ 2021-07-03 23:42 ` Jean Louis 2021-07-12 3:24 ` Shane Mulligan 0 siblings, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-03 23:42 UTC (permalink / raw) To: Arthur Miller; +Cc: Shane Mulligan, emacs-devel * Arthur Miller <arthur.miller@live.com> [2021-07-04 02:26]: > Jean Louis <bugs@gnu.support> writes: > > > Please also think on BSD and other systems where maybe docker does not work. > > https://wiki.freebsd.org/Docker Not on DragonFlyBSD -- and even FreeBSD is broken as you read. -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-03 23:42 ` Jean Louis @ 2021-07-12 3:24 ` Shane Mulligan 2021-07-17 23:53 ` Richard Stallman 0 siblings, 1 reply; 75+ messages in thread From: Shane Mulligan @ 2021-07-12 3:24 UTC (permalink / raw) To: Arthur Miller, Shane Mulligan, emacs-devel [-- Attachment #1: Type: text/plain, Size: 826 bytes --] Hi Jean, I have working setup instructions for Debian 10 and a working docker image. https://github.com/semiosis/pen.el/blob/master/installation.org https://github.com/semiosis/pen.el/blob/master/Dockerfile I will also try to get working on other platforms in the future. Shane On Sun, Jul 4, 2021 at 11:45 AM Jean Louis <bugs@gnu.support> wrote: > * Arthur Miller <arthur.miller@live.com> [2021-07-04 02:26]: > > Jean Louis <bugs@gnu.support> writes: > > > > > Please also think on BSD and other systems where maybe docker does not > work. > > > > https://wiki.freebsd.org/Docker > > Not on DragonFlyBSD -- and even FreeBSD is broken as you read. > > -- > Jean > > Take action in Free Software Foundation campaigns: > https://www.fsf.org/campaigns > > In support of Richard M. Stallman > https://stallmansupport.org/ > [-- Attachment #2: Type: text/html, Size: 1739 bytes --] ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-12 3:24 ` Shane Mulligan @ 2021-07-17 23:53 ` Richard Stallman 0 siblings, 0 replies; 75+ messages in thread From: Richard Stallman @ 2021-07-17 23:53 UTC (permalink / raw) To: Shane Mulligan; +Cc: mullikine, arthur.miller, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] What does pen.el do? What is its relation to GPT-3? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-03 6:34 ` Shane Mulligan 2021-07-03 22:21 ` Jean Louis @ 2021-07-23 15:37 ` Jean Louis 1 sibling, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-23 15:37 UTC (permalink / raw) To: Shane Mulligan; +Cc: emacs-tangents I have tried following: 1. emacs -q 2. load src/init.el 3. then add the directory src to variable load-path 4. then to load pen.el I could see that it is asking for s-join function which it cannot find, it looked like Emacs was looping, beeping, totally unresponsive. I had to kill it. I am using Emacs development version, are you also using it? In general I was thinking that init.el and init-setup.el will set it up for me. The file installation.org is unclear, what do I need to do to start using pen.el ? $ emacs -q Connection lost to X server ':0' When compiled with GTK, Emacs cannot recover from X disconnects. This is a GTK bug: https://gitlab.gnome.org/GNOME/gtk/issues/221 For details, see etc/PROBLEMS. Fatal error 6: Aborted Backtrace: emacs(+0x1516d4)[0x55ba843c36d4] emacs(+0x463d8)[0x55ba842b83d8] emacs(+0x468d3)[0x55ba842b88d3] emacs(+0x45940)[0x55ba842b7940] emacs(+0x459fe)[0x55ba842b79fe] /usr/lib/libX11.so.6(_XIOError+0x5f)[0x7fac94d5bb9f] /usr/lib/libX11.so.6(_XEventsQueued+0xa7)[0x7fac94d59217] /usr/lib/libX11.so.6(XPending+0x62)[0x7fac94d4aa92] /usr/lib/libgdk-3.so.0(+0x8d722)[0x7fac95530722] /usr/lib/libglib-2.0.so.0(g_main_context_prepare+0x1b0)[0x7fac94ed5bc0] /usr/lib/libglib-2.0.so.0(+0xa7a06)[0x7fac94f29a06] /usr/lib/libglib-2.0.so.0(g_main_context_pending+0x2a)[0x7fac94ed371a] /usr/lib/libgtk-3.so.0(gtk_events_pending+0x10)[0x7fac95779690] emacs(+0x105c5d)[0x55ba84377c5d] emacs(+0x13e932)[0x55ba843b0932] emacs(+0x13f635)[0x55ba843b1635] emacs(+0x228e9b)[0x55ba8449ae9b] emacs(+0x993fa)[0x55ba8430b3fa] emacs(+0x7d3c9)[0x55ba842ef3c9] emacs(+0x82525)[0x55ba842f4525] emacs(+0x83cf4)[0x55ba842f5cf4] emacs(+0x686f2)[0x55ba842da6f2] emacs(+0x943a9)[0x55ba843063a9] 
emacs(+0x14729e)[0x55ba843b929e] emacs(+0x1b4047)[0x55ba84426047] emacs(+0x137404)[0x55ba843a9404] emacs(+0x1b6773)[0x55ba84428773] emacs(+0x1373ab)[0x55ba843a93ab] emacs(+0x13cd76)[0x55ba843aed76] emacs(+0x13d0a2)[0x55ba843af0a2] emacs(+0x4e929)[0x55ba842c0929] /usr/lib/libc.so.6(__libc_start_main+0xd5)[0x7fac938bfb25] emacs(+0x4ef2e)[0x55ba842c0f2e] Aborted ~/Programming/git/pen.el $ -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
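The four manual steps in this report can also be run non-interactively, so that a load error is printed in the terminal instead of leaving a graphical session unresponsive. The sketch below makes several assumptions: `PEN_DIR` is a hypothetical variable, `pen.el` is assumed to live in `src/` next to `init.el`, and it is not known whether pen.el's init files are meant to work in batch mode. Note also that `s-join` is defined by the third-party s.el library, so one plausible cause of the reported failure is that this library was not installed or not on `load-path`.

```shell
#!/bin/sh
# Sketch: reproduce "load src/init.el, add src to load-path, load pen.el"
# in batch mode. PEN_DIR is a hypothetical variable naming a checkout.
PEN_DIR=${PEN_DIR:-$HOME/Programming/git/pen.el}

# Print the emacs command line for a given checkout directory.
#   --batch  run without a display or user init file, then exit
#   -L DIR   prepend DIR to load-path
#   -l FILE  load the Lisp file FILE
pen_load_command() {
    printf 'emacs --batch -L %s/src -l %s/src/init.el -l %s/src/pen.el\n' "$1" "$1" "$1"
}

pen_load_command "$PEN_DIR"

# Only run the command when the checkout and emacs are actually present.
if [ -f "$PEN_DIR/src/pen.el" ] && command -v emacs >/dev/null 2>&1; then
    emacs --batch -L "$PEN_DIR/src" -l "$PEN_DIR/src/init.el" -l "$PEN_DIR/src/pen.el" \
        || echo "loading pen.el failed; see the messages above" >&2
fi
```

In batch mode an error raised while loading a file is printed and Emacs exits, which makes a missing function such as `s-join` easier to spot than in a session that keeps beeping.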
* Re: Help building Pen.el (GPT for emacs) 2021-06-30 4:36 Help building Pen.el (GPT for emacs) Shane Mulligan ` (2 preceding siblings ...) 2021-07-02 13:57 ` Jean Louis @ 2021-07-15 11:58 ` Stefan Kangas 2021-07-15 12:40 ` dick ` (2 more replies) 3 siblings, 3 replies; 75+ messages in thread From: Stefan Kangas @ 2021-07-15 11:58 UTC (permalink / raw) To: Shane Mulligan; +Cc: Emacs developers Shane Mulligan <mullikine@gmail.com> writes: > Hey guys. It looks like OpenAI is collaborating with GitHub on their GPT stuff, so any assistance in building an editor in emacs would be greatly appreciated. I made a start 4 months ago, link below: > > https://github.com/semiosis/pen.el/ > > https://copilot.github.com/ > > https://news.ycombinator.com/item?id=27676266 > > GPT-3+vscode is an emacs killer, but emacs is a much better platform for building this stuff, so please help! Thanks. Could you briefly elaborate on the capabilities of "GPT-3+vscode" and why you think it is an Emacs killer? What are (briefly) the capabilities of your package so far, and how does it compare to what the competition has? How much work remains if we would want to catch up? Thanks. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs)
From: dick @ 2021-07-15 12:40 UTC
To: Stefan Kangas; +Cc: Shane Mulligan, Emacs developers

SK> Could you briefly elaborate on the capabilities of "GPT-3+vscode" and why
SK> you think it is an Emacs killer?

“If the people believe there’s an imaginary river out there, you don’t
tell them there’s no river there. You build an imaginary bridge over
the imaginary river.” -- Nikita Khrushchev

The Copilot brouhaha takes the premature optimization and pointless
speculation endemic to emacs-devel to another level.
* Re: Help building Pen.el (GPT for emacs)
From: Shane Mulligan @ 2021-07-15 23:52 UTC
To: Stefan Kangas; +Cc: Emacs developers

Hi Stefan and dick,

* Response to Stefan
** Capabilities of "GPT-3+vscode" (Copilot)
Copilot uses a specialised version of GPT-3 called codex which is optimised
to generate code. Copilot is technologically capable of also querying
(classifying) code. The simultaneous usage of both AI generation and
classification completely changes the way you use an editing environment.

*** Why it's an emacs killer
I must coin a term, imaginary programming, for the sake of shortening my
explanation. Imaginary programming is imaginary in the mathematical and
creative sense. It's a dimension of programming non-existent in emacs. It's
stochastic and allows you to predict what will happen without needing to
write correct code.

Clear demonstration here:
http://semiosis.github.io/posts/nlsh-natural-language-shell/

- meta-prompts (see below)
- there is no GPT support for emacs.
- This is the elephant in the room.
- Aside from what I, a single person in the world, am struggling to put
  together. I'm rate-limited. There needs to be core support for
  integrating and designing prompts.
- I'm working on core features rather than building prompts.
- An analogy: Copilot + GPT-3 + vscode are now firmly in the area of
  imaginary programming, which is completely missing from emacs. And
  that's very worrisome.
- It's the cause of anxiety about what the future of programming is and
  whether it is going to make all coders redundant.
- It's closed source so people are literally unable to see the way ahead.
- It's an attack against open-source.

** Risky capabilities of Copilot in the near future
- meta-prompts that encode the provenance of text. This is an existential
  risk to coding, generally, because right now, today, GitHub (Microsoft)
  and OpenAI (the not-for-profit turned for-profit) are using Copilot to
  encode the way that people solve problems in order to further automate
  that process, and it's closed-source.
- imaginary generation of user interfaces, such as emacs. There are
  decades' worth of text from emacs online and on GitHub. That means GPT-3
  and codex are already quite capable of imagining at least part of the
  emacs user interface via prompting. The question is, do you want emacs
  to be in control of your system or a copilot that you can't control?

** Capabilities of "Pen.el"
My vision for Pen.el has always been much broader.

- n-many language models
- n-many prompts (classification and generation)
- sharing prompts, open-source
- file format for encoding the provenance of text
  - https://github.com/semiosis/ink.el
- fully transparent
- Use emacs as an interface to remain in control of conversations,
  whether they be infinitely many chatbots or solving communication
  barriers between people.
- When I use words such as infinite, I mean it in the truest sense, and
  it's not hype. I've done my best to be a harbinger.
- Literally, just select any topic and create a chatbot for it.
  - https://semiosis.github.io/posts/gpt-3-for-building-mind-maps-with-an-ai-tutor-for-any-topic/

*** To catch up and surpass and save open-source
- We need a centralised repository of 'prompts', like melpa
  - This doesn't exist yet because the technology is closed-source inside
    GitHub copilot and there are not any standard formats.
  - I have made a start with this
    - https://github.com/semiosis/prompts/
- Pen needs to be integrated (working on this currently)
- GPT-J needs to be integrated (working on this currently)

I have many blog articles now in which I try to demonstrate what the
capabilities of Pen are.

https://mullikine.github.io/tags/pen/

* In response to dick's message:
"The Copilot brouhaha takes the premature optimization and pointless
speculation endemic to emacs-devel to another level."

I do not believe this is pointless speculation.
* Re: Help building Pen.el (GPT for emacs) 2021-07-15 23:52 ` Shane Mulligan @ 2021-07-16 7:30 ` tomas 2021-07-17 0:33 ` Shane Mulligan 2021-07-17 7:52 ` Jean Louis 0 siblings, 2 replies; 75+ messages in thread From: tomas @ 2021-07-16 7:30 UTC (permalink / raw) To: Shane Mulligan; +Cc: Stefan Kangas, Emacs developers [-- Attachment #1: Type: text/plain, Size: 1104 bytes --] On Fri, Jul 16, 2021 at 11:52:41AM +1200, Shane Mulligan wrote: > Hi Stefan and dick, > > * Reponse to Stefan > ** Capabilities of "GPT-3+vscode" (Copilot) > Copilot uses a specialised version of GPT-3 called codex which is optimised > to generate code. GPT-3 is not free software [1]. Only the service is accessible to us, mere mortals. As for Copilot, one could even argue that it harvests [2] free software at the costs of all of us. As far as I am concerned, I'll put as much distance as I can between myself and Copilot (or Github, for the same reasons). I often asked myself how Github could have been worth $7 billion to Microsoft. Now I begin to understand. Cheers [1] "Microsoft announced on September 22, 2020 that it had licensed 'exclusive' use of GPT-3; others can still use the public API to receive output, but only Microsoft has access to GPT-3’s underlying code." https://en.wikipedia.org/wiki/GPT-3 [2] https://mjg59.dreamwidth.org/ https://juliareda.eu/2021/07/github-copilot-is-not-infringing-your-copyright/ - tomás [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs)
From: Shane Mulligan @ 2021-07-17 0:33 UTC
To: tomas; +Cc: Stefan Kangas, Emacs developers

I completely agree with Tomas. The neural network weights should be given
the same software license as the code they have been trained on, and they
need to do more to support a FOSS analog of the technology.

I am expecting a conversation with Nat Friedman on a collaboration to
bring Copilot to emacs. If he contacts me I will raise this issue.
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 0:33 ` Shane Mulligan @ 2021-07-17 7:54 ` tomas 0 siblings, 0 replies; 75+ messages in thread From: tomas @ 2021-07-17 7:54 UTC (permalink / raw) To: Shane Mulligan; +Cc: Stefan Kangas, Emacs developers [-- Attachment #1: Type: text/plain, Size: 200 bytes --] On Sat, Jul 17, 2021 at 12:33:30PM +1200, Shane Mulligan wrote: > I completely agree with Tomas [...] Thanks for the links in your other post. It'll take me a while to mull over them :) Cheers - t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-16 7:30 ` tomas 2021-07-17 0:33 ` Shane Mulligan @ 2021-07-17 7:52 ` Jean Louis 1 sibling, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-17 7:52 UTC (permalink / raw) To: tomas; +Cc: Stefan Kangas, Shane Mulligan, Emacs developers * tomas@tuxteam.de <tomas@tuxteam.de> [2021-07-16 10:31]: > On Fri, Jul 16, 2021 at 11:52:41AM +1200, Shane Mulligan wrote: > > Hi Stefan and dick, > > > > * Reponse to Stefan > > ** Capabilities of "GPT-3+vscode" (Copilot) > > Copilot uses a specialised version of GPT-3 called codex which is optimised > > to generate code. > > GPT-3 is not free software [1]. Only the service is accessible to us, > mere mortals. > > As for Copilot, one could even argue that it harvests [2] free software > at the costs of all of us. > > As far as I am concerned, I'll put as much distance as I can between > myself and Copilot (or Github, for the same reasons). > > I often asked myself how Github could have been worth $7 billion to > Microsoft. Now I begin to understand. 👍👍👍 -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-15 11:58 ` Stefan Kangas 2021-07-15 12:40 ` dick 2021-07-15 23:52 ` Shane Mulligan @ 2021-07-17 0:51 ` Richard Stallman 2021-07-17 2:36 ` Shane Mulligan 2 siblings, 1 reply; 75+ messages in thread From: Richard Stallman @ 2021-07-17 0:51 UTC (permalink / raw) To: Stefan Kangas; +Cc: mullikine, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] The idea of "GPT for Emacs" is, alas, not an option. GPT-3 is nonfree software. I think it is not even released. Thus, we cannot include it in a free system; we cannot distribute it with Emacs. It would be possible to utilize GPT-3 running on Microsoft's server by sending it questions -- but that is SaaSS, which is an injustice similar to nonfree software. For explanation of this issue, see https://gnu.org/philosophy/who-does-that-server-really-serve.html. For ethical reasons we don't recommend SaaSS in GNU software, and a fortiori we don't distribute or recommend code to invoke SaaSS. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 0:51 ` Richard Stallman @ 2021-07-17 2:36 ` Shane Mulligan 2021-07-17 9:01 ` Eli Zaretskii 2021-07-17 23:53 ` Richard Stallman 0 siblings, 2 replies; 75+ messages in thread From: Shane Mulligan @ 2021-07-17 2:36 UTC (permalink / raw) To: rms; +Cc: Stefan Kangas, Emacs developers [-- Attachment #1: Type: text/plain, Size: 3104 bytes --] Thank you for tuning in Richard. :) I think the end-goal should be to have a close collaboration with EleutherAI, who already have an open-source alternative to the Copilot model. It's called GPT-j. ελευθερία is a greek word that means Freedom. EleutherAI are open-sourcing language models. The problem is that there are very few people within EleutherAI using emacs and few people who can help. If you'd please excuse my speculative musings, emacs has 40 years of design waiting to be augmented with GPT3 and I believe that emacs is way ahead of the competition. It's a gold rush really. Name a package and I can name an augmentation. GPT is orthogonal to coding the way macros are orthoganal to functions. emacs has tens of thousands of packages which are essentially just a skeleton for GPT to become the body, so this is why I recommend fostering a prompts repository right now. For example, take nano-emacs and turn it into the best writers environment ever. Take 'erc' and make it the first IRC client to automatically translate all messages into any type of dialect -- French, Klingon or Pirate. Company-mode + GPT = Copilot. Org-roam + GPT = A multiversal prose editor ( https://github.com/socketteer/loom) Org-brain + GPT = a mind map, which automatically generates and suggests nodes, then lets you talk to a chatbot tutor on any weird topic you can think of. VSCode literally cant do this stuff because it doesn't have the structure created yet. The biggest bottleneck to unlocking GPT-3's potential is the latency of the human imagination to cope with anything that departs from realism. 
I'm a little overwhelmed building Pen.el, but EleutherAI has been very helpful in supporting my project and guiding me to the right projects. It is, in my humble opinion, still important to foster a FOSS prompts repository in the meantime. https://www.eleuther.ai/projects/gpt-neox/
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 2:36 ` Shane Mulligan @ 2021-07-17 9:01 ` Eli Zaretskii 2021-07-17 9:27 ` Shane Mulligan 2021-07-17 23:53 ` Richard Stallman 1 sibling, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-17 9:01 UTC (permalink / raw) To: Shane Mulligan; +Cc: stefan, rms, emacs-devel > From: Shane Mulligan <mullikine@gmail.com> > Date: Sat, 17 Jul 2021 14:36:15 +1200 > Cc: Stefan Kangas <stefan@marxist.se>, Emacs developers <emacs-devel@gnu.org> > > I think the end-goal should be to have a close collaboration with EleutherAI, who already have an > open-source alternative to the Copilot model. It's called GPT-j. > ελευθερία is a greek word that means Freedom. EleutherAI are open-sourcing language models. > The problem is that there are very few people within EleutherAI using emacs and few people who can help. I'm not sure I understand what features in Emacs this could enable. And the references you provided don't seem to answer this question (or maybe the answer is buried deeper than I'm prepared to dig at this point). I understand that EleutherAI doesn't seem to support programming at this point, only natural language (is that true?), but that still means there could be any number of useful features where it could help. But what are they? The stuff on the EleutherAI site is oriented towards people who work in the machine learning domain, not to programmers who design applications that could take advantage of those capabilities, so it's not easy to understand what these capabilities have in store for Emacs. Thus, description of relevant Emacs features, whether existing or imaginary, with enough details for us to be able to discuss that intelligently, will be appreciated. I don't think this discussion will be meaningful without at least some idea of what we are trying to accomplish. 
> If you'd please excuse my speculative musings, emacs has 40 years of design waiting to be augmented with > GPT3 and I believe that emacs is way ahead of the competition. It's a gold rush really. Why do you think Emacs is better fitted to this than other editors? It sounds like most of the processing is done server-side, so what exactly is the significance of Emacs being the client? > Name a package and I can name an augmentation. Is this based on what these services (EleutherAI in particular) can do, or are these just unrelated fantasies? We need ideas based on capabilities that exist, not on what could exist years from now. AI history is chock-full of ideas that didn't work out. > Take 'erc' and make it the first IRC client to automatically translate all messages into any type of dialect -- > French, Klingon or Pirate. How is this different from existing translation servers? > Company-mode + GPT = Copilot. I don't see how this is true. Copilot is not just generalized completion, and AFAIU doesn't fit into the presentation methods used by Company. What am I missing? > Org-roam + GPT = A multiversal prose editor (https://github.com/socketteer/loom) I couldn't understand what that does, looking at the above URL. Any details how it works and how it helps the writer? > Org-brain + GPT = a mind map, which automatically generates and suggests nodes, then lets you talk to a > chatbot tutor on any weird topic you can think of. Does this capability really exist? Thanks. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 9:01 ` Eli Zaretskii @ 2021-07-17 9:27 ` Shane Mulligan 2021-07-17 21:02 ` Shane Mulligan 2021-07-17 21:35 ` Juri Linkov 0 siblings, 2 replies; 75+ messages in thread From: Shane Mulligan @ 2021-07-17 9:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stefan Kangas, rms, Emacs developers Hi Eli, It's nice to talk again. At this stage I am only seeking to inform you of this new technology, which will be transformative to programming and open source, and to show you that we have some quick catching up to do to integrate it into emacs so Microsoft does not have a monopoly on the technology. "I understand that EleutherAI doesn't seem to support programming at this point, only natural language (is that true?)" The existing models, which are not optimised for code, can still handle code well. GPT-J is EleutherAI's code model. It's designed as a direct competitor to Codex (the Copilot model) and trained on open-source code. The other part of Copilot is the automatic fine-tuning of the model to enable it to learn your behaviour. This would be trickier to distribute as an open-source service and probably isn't necessary, but GPT-J supports it. "any number of useful features where it could help." Name an emacs package and I can explain how GPT will affect that package. For `dired-git-info-mode`, for instance, a model connected to GPT can explain what files are for. "Name a package and I can name an augmentation." This is not fantasy. I have many examples. I have blogged for this exact purpose, to explain to people what OpenAI will be working on behind closed doors, to build a version for emacs.
- https://mullikine.github.io/posts/explainshell-with-gpt-3/ - https://mullikine.github.io/posts/nlsh-natural-language-shell/ - https://mullikine.github.io/posts/context-menus-based-on-gpt-3/ - https://mullikine.github.io/posts/autocompleting-anything-with-gpt-3-in-emacs/ - https://mullikine.github.io/posts/translating-haskell-to-clojure-with-gpt-3/ - https://mullikine.github.io/posts/a-natural-language-database-using-a-single-gpt-prompt/ - https://mullikine.github.io/posts/imaginary-programming-with-gpt-3/ - https://mullikine.github.io/posts/creating-a-playground-for-gpt-3-in-emacs/ "How is this different from existing translation servers?" GPT can replace Google search, Google Translate, and many other services, and GPT can respond to requests with equal time for each request. It can also be used like Stack Overflow to answer questions about many common problems. "Org-brain + GPT = a mind map, which automatically generates and suggests nodes, then lets you talk to a chatbot tutor on any weird topic you can think of." "Does this capability really exist?" Yes, it does. I have demonstrated it. - https://mullikine.github.io/posts/gpt-3-for-building-mind-maps-with-an-ai-tutor-for-any-topic/ This is on my readme for my GPT project for emacs which supports GPT-3 and EleutherAI. https://github.com/semiosis/pen.el At its heart, emacs is an operating system based on a tty, which is a text stream. emacs supports a text-only mode. This makes it ideally suited for training an LM such as a GPT (Generative Pre-trained Transformer). emacs lisp provides a skeleton around which NLP functions can be built. Ultimately, emacs will become a fractal in the latent space of a future LM (language model). A graphical editor would not benefit from this effect until much later on.
emacs could, if supported, become the vehicle for controllable text generation, or has the potential to become that, only actually surpassed when the imaginary programming environment is normal and other interfaces can be prompted into existence. Between then and now we can write prompt functions to help preserve emacs. This is my inspiration for the project. It sounds like science fiction, I know.
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 9:27 ` Shane Mulligan @ 2021-07-17 21:02 ` Shane Mulligan 2021-07-18 5:38 ` Jean Louis ` (2 more replies) 2021-07-17 21:35 ` Juri Linkov 1 sibling, 3 replies; 75+ messages in thread From: Shane Mulligan @ 2021-07-17 21:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stefan Kangas, rms, Emacs developers The following is why emacs needs open-source prompts -- ones that don't learn from you and aren't sold to you. - Ones that you write for yourself. - An open-source prompts MELPA at the very least! As I tried to describe before, it's a fundamentally new way of programming: an extension of Donald Knuth's literate programming becoming imaginary programming, but being hijacked by Microsoft. Microsoft GPT is an attack on the innermost workings of emacs -- the text stream. So embracing the open-source alternatives from EleutherAI is crucial. I have said enough. I leave you with this article. https://venturebeat.com/2021/07/16/openai-disbands-its-robotics-research-team/ """ OpenAI said recently that GPT-3 is now being used in more than 300 different apps by “tens of thousands” of developers and producing 4.5 billion words per day. Toward the end of 2020, Microsoft announced that it would exclusively license GPT-3 to develop and deliver AI solutions for customers, as well as creating new products that harness the power of NLG. Microsoft recently announced that GPT-3 will be integrated “deeply” with Power Apps, its low-code app development platform — specifically for formula generation. The AI-powered features will allow a user building an ecommerce app, for example, to describe a programming goal using conversational language like “find products where the name starts with ‘kids.'” """
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 21:02 ` Shane Mulligan @ 2021-07-18 5:38 ` Jean Louis 2021-07-18 5:38 ` Jean Louis 2021-07-18 6:42 ` Eli Zaretskii 2 siblings, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-18 5:38 UTC (permalink / raw) To: Shane Mulligan; +Cc: Eli Zaretskii, Stefan Kangas, rms, Emacs developers Issues related to AI: ===================== - Obviously there are licensing issues: taking snippets from everywhere without attribution and licensing compliance has recently caused much discussion and protest. I am actually glad for that, as people are becoming aware that the GPL protects their work, and it is now clear how much Github is abusing the GPL and other free software. - And I do not think it should be in GNU ELPA due to the above reasons. To try the software functionality, I have pulled your Git repository again. However, there is now a directory change and installation is not straightforward. Why don't you simply make an Emacs package as a .tar, as described in the Emacs Lisp manual? See: (info "(elisp) Multi-file Packages") or at least make sure that the user can add the load path: (add-to-list 'load-path (expand-file-name ".")) and then: M-x load-library RET pen RET I cannot load it that way, as it currently asks for the package `projectile'. Is it really necessary? -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/
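As a rough sketch of the manual route Jean describes, something like the following could go in an init file. The checkout path `~/src/pen.el` is an assumption made up for illustration, and the feature name `pen` is only inferred from the `load-library` invocation above; the actual repository layout may differ.

```elisp
;; Sketch only: the checkout location below is hypothetical.
(let ((pen-dir (expand-file-name "~/src/pen.el")))
  (when (file-directory-p pen-dir)
    ;; Make the checkout visible to `load-library' and `require'.
    (add-to-list 'load-path pen-dir)
    ;; NOERROR = t: return nil instead of signalling when no library
    ;; named "pen" is found.  Errors raised while the library loads
    ;; (for example a missing dependency) still propagate.
    (require 'pen nil t)))
```

A packaged .tar built as in (info "(elisp) Multi-file Packages") would instead be installed with M-x package-install-file, with no init-file changes needed.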
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 21:02 ` Shane Mulligan 2021-07-18 5:38 ` Jean Louis @ 2021-07-18 5:38 ` Jean Louis 2021-07-18 7:03 ` Eli Zaretskii 2021-07-18 6:42 ` Eli Zaretskii 2 siblings, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-18 5:38 UTC (permalink / raw) To: Shane Mulligan; +Cc: Eli Zaretskii, Stefan Kangas, rms, Emacs developers * Shane Mulligan <mullikine@gmail.com> [2021-07-18 00:03]: > Microsoft GPT is an attack on the innermost workings of emacs -- the text > stream. So embracing the OpenSource alternatives from EleutherAI is > crucial. How does that solve the licensing problems? -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/
* Re: Help building Pen.el (GPT for emacs) 2021-07-18 5:38 ` Jean Louis @ 2021-07-18 7:03 ` Eli Zaretskii 2021-07-18 8:00 ` Shane Mulligan 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-18 7:03 UTC (permalink / raw) To: Jean Louis; +Cc: stefan, mullikine, rms, emacs-devel > Date: Sun, 18 Jul 2021 08:38:52 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: Eli Zaretskii <eliz@gnu.org>, Stefan Kangas <stefan@marxist.se>, > rms@gnu.org, Emacs developers <emacs-devel@gnu.org> > > * Shane Mulligan <mullikine@gmail.com> [2021-07-18 00:03]: > > Microsoft GPT is an attack on the innermost workings of emacs -- the text > > stream. So embracing the OpenSource alternatives from EleutherAI is > > crucial. > > How does that solve the licensing problems? Please take discussions of the GPT and OpenAI licensing to emacs-tangents. It isn't relevant to this list.
* Re: Help building Pen.el (GPT for emacs) 2021-07-18 7:03 ` Eli Zaretskii @ 2021-07-18 8:00 ` Shane Mulligan 2021-07-19 17:00 ` Jean Louis 0 siblings, 1 reply; 75+ messages in thread From: Shane Mulligan @ 2021-07-18 8:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stefan Kangas, rms, Jean Louis, Emacs developers * Thank you all for your attention I will move my conversations to emacs-tangents@gnu.org until asked to return. ** Response to Juri Linkov > Can it help to write git commit messages from diffs? > Has anyone tried to train a model on the existing > git commit logs? This would be a killer feature. Now you are catching on. I already have a task for myself to do this when I have the time. I am focusing on core features though. ** Response to Richard Stallman *** What does pen.el do? Pen.el stands for Prompt Engineering in emacs. Prompt Engineering is the art of describing what you would like a language model (transformer) to do. It is a new type of programming, example oriented; like literate programming, but manifested automatically. A transformer takes some text (called a prompt) and continues it. However, the continuation is the superset of all NLP tasks, as the generation can also be a classification, for instance. Those NLP tasks extend beyond world languages and into programming languages (whatever has been 'indexed' or 'learned') from these large LMs. Pen.el is an editing environment for designing 'prompts' to LMs. It is better than anything that exists, even at OpenAI or at Microsoft. I have been working on it and preparing for this for a long time. These prompts are example-based tasks. There are a number of design patterns which Pen.el is seeking to encode into a domain-specific language called 'examplary' for example-oriented programming. Pen.el creates functions 1:1 for a prompt to an emacs lisp function.
Emacs is Grammarly, Google Translate, Copilot, Stack Overflow and infinitely many other services all rolled into one, and allows you to have a parallel to all these services that is completely private and open source -- that is, if you have downloaded the EleutherAI model locally. *** What is its relation to GPT-3? Pen.el is GPL and completely separate from GPT-3, but currently GPT-3 is the only standardised service on which to model the prompt-engineering workflow. No such API or standard exists yet, so I am designing my own interface and prompt format standard. *** "Perhaps that is a good path, but we need to know more." I encourage GNU to look into EleutherAI. I will continue to work with EleutherAI and make it a priority to bring this technology to emacs. However, I have absolutely no support from anyone and this project is too big for me alone. I'm focusing on core Pen.el features right now and will seek help from the EleutherAI community to build the open-source component and host the free GPT. ** Response to Jean Louis - And I do not think it should be in GNU ELPA due to above reasons. I am glad I have forewarned you guys. This is my current goal. Help in my project would be appreciated. I cannot do it alone and I cannot convince all of you. > Why don't you simply make an Emacs package as .tar as described in Emacs Lisp manual? Thank you for taking a look at my emacs package. It's not ready yet for a MELPA merge. I hope that I will be able to find some help in order to prepare it, but the rules are very strict and this may not happen. > How does that solve the licensing problems? The current EleutherAI model which competes with GPT-3 is GPT-Neo. It is MIT licensed. Also the data it has been trained on is MIT licensed. The current EleutherAI model which competes with Codex is GPT-J. It is licensed under the Apache-2.0 License. Both models are trained on The Pile, which is licensed MIT.
https://github.com/EleutherAI/the-pile > "AI-related developments out there, and who purchased whom and for how much, is not appropriate." Understood. ** Thank you all for your time Best of luck and contact me any time. Sincerely, Shane Mulligan
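To make the "1:1 prompt to function" idea from the message above concrete, here is a minimal sketch in Emacs Lisp. It is not Pen.el's actual implementation or prompt format: the names `my-prompt-complete-function`, `my-prompt-fill`, `my-define-prompt-function` and `my-translate`, and the `<1>`-style placeholders, are all invented for illustration, and the backend is a stub rather than a real language-model call.

```elisp
;; Hypothetical sketch: none of these names come from Pen.el.

(defvar my-prompt-complete-function
  (lambda (prompt) (concat "[completion for: " prompt "]"))
  "Function called with the filled-in prompt; returns the continuation.
This stub only echoes its input.  A real backend would query a
language model (local or remote) here.")

(defun my-prompt-fill (template args)
  "Replace <1>, <2>, ... in TEMPLATE with the corresponding ARGS.
Placeholders with no matching argument are left unchanged."
  (replace-regexp-in-string
   "<\\([0-9]+\\)>"
   (lambda (match)
     ;; Inside the replacement function the match data refers to MATCH.
     (let ((idx (string-to-number (match-string 1 match))))
       (or (and (> idx 0) (nth (1- idx) args))
           match)))
   template t t))

(defmacro my-define-prompt-function (name arglist docstring template)
  "Define NAME as an ordinary function backed by the prompt TEMPLATE.
ARGLIST is a plain list of required string arguments, substituted
for <1>, <2>, ... in order."
  (declare (indent defun) (doc-string 3))
  `(defun ,name ,arglist
     ,docstring
     (funcall my-prompt-complete-function
              (my-prompt-fill ,template (list ,@arglist)))))

;; One prompt, one Emacs Lisp function.
(my-define-prompt-function my-translate (text language)
  "Ask the backend to translate TEXT into LANGUAGE."
  "Translate the following text into <2>:\n<1>\nTranslation:")
```

With the stub backend, `(my-translate "Hello" "French")` simply returns the filled-in prompt wrapped in "[completion for: ...]". Swapping `my-prompt-complete-function` for a function that talks to a locally hosted model would leave every function defined this way unchanged, which is the appeal of keeping prompts and backends separate.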
* Re: Help building Pen.el (GPT for emacs) 2021-07-18 8:00 ` Shane Mulligan @ 2021-07-19 17:00 ` Jean Louis 2021-07-23 6:51 ` Shane Mulligan 0 siblings, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-19 17:00 UTC (permalink / raw) To: Shane Mulligan; +Cc: Eli Zaretskii, emacs-tangents, Stefan Kangas, rms * Shane Mulligan <mullikine@gmail.com> [2021-07-18 11:01]: > Pen.el stands for Prompt Engineering in emacs. > Prompt Engineering is the art of describing what you would > like a language model (transformer) to do. It is a new type of programming, > example oriented; > like literate programming, but manifested automatically. Sounds like a replacement for a programmer's mind. > A transformer takes some text (called a prompt) and continues > it. However, the continuation is the superset of all NLP tasks, Where is the definition of the abbreviation NLP? > as the generation can also be a classification, for instance. Those > NLP tasks extend beyond world languages and into programming > languages (whatever has been 'indexed' or 'learned') from these > large LMs. What is the definition of the abbreviation LM? > Pen.el is an editing environment for designing 'prompts' to LMs. It > is better than anything that exists, even at OpenAI or at > Microsoft. I have been working on it and preparing for this for a > long time. Good, but is there a video to show what it really does? > These prompts are example-based tasks. There are a number of design > patterns which Pen.el is seeking to encode into a domain-specific > language called 'examplary' for example-oriented programming. Do you mean "exemplary" or "examplary"? Is it a spelling mistake? I have to ask, as your description is still pretty abstract without a particular example. > Pen.el creates functions 1:1 for a prompt to an emacs lisp function. The above does not tell me anything.
> Emacs is Grammarly, Google Translate, Copilot, Stack Overflow and > infinitely many other services all rolled into one and allows you to > have a private parallel to all these services that is completely > private and open source -- that is if you have downloaded the > EleutherAI model locally. I understand that it is a kind of information fetching, but that does not solve the licensing issues; it sounds like licensing hell. > ** Response to Jean Louis > - And I do not think it should be in GNU ELPA due to above reasons. > > I am glad I have forewarned you guys. This is my current goal. Help > in my project would be appreciated. I cannot do it alone and I > cannot convince all of you. Why don't you talk about the licensing issues? Taking code without proper licensing compliance is, IMHO, not an option. It sounds like a problem generator. > > Why don't you simply make an Emacs package as .tar as described in Emacs > Lisp manual? > Thank you for taking a look at my emacs package. It's not ready yet > for Melpa merge. I hope that I will be able to find some help in > order to prepare it, but the rules are very strict and this may not > happen. I did not say to put it in Melpa. You can make a package for yourself and your users, so that users can run M-x package-install-file. That is really not related to any online Emacs package repository. It is a way to install Emacs packages no matter where one gets them. > > How does that solve the licensing problems? > The current EleutherAI model which competes with GPT-3 is GPT-Neo. > It is MIT licensed. That is good. But the code that is generated and injected requires proper attribution. > Also the data it has been trained on is MIT licensed. Yes, and then the program should also handle the proper attributions automatically. You cannot just say "MIT licensed"; this has to be proven, the source has to be found and proper attributions applied. Why don't you implement proper licensing?
Please find ONE license that you are using from code that is being used as database for generation of future code and provide link to it. Then show how is license complied to. > The current EleutherAI model which competes with Codex is GPT-j. > It is licensed with Apache-2.0 License That is good, but I am referring to the generated code. -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs)
  2021-07-19 17:00 ` Jean Louis
@ 2021-07-23  6:51 ` Shane Mulligan
  2021-07-23 10:12 ` Jean Louis
  2021-07-25  1:06 ` Richard Stallman
  0 siblings, 2 replies; 75+ messages in thread
From: Shane Mulligan @ 2021-07-23 6:51 UTC
To: Jean Louis; +Cc: Eli Zaretskii, emacs-tangents, Stefan Kangas, rms

Hi Jean and GNU friends,

GPT is potentially the best thing to happen to emacs in a very long time. It will bring back power from the corporations and save it to your computer, open source and transparent, and offline.

Please consider including a collaborative, open source prompts repository in the next version of emacs. So far I'm yet to see anything like it, but I see in commercial products everywhere that they have full domain over this new type of code.

I am trying to build up relationships in my project Pen.el with others who value open source. gptprompts.org, for example. This is to create a catalogue for pen.el.

One thing we have just introduced is a field to specify a licence for each prompt. However, I must say that prompts are more like functions. *Soft prompts* are very granular prompts as they have been reduced to a minimal number of characters using optimisation.

https://arxiv.org/abs/2104.08691

Therefore, there must be support for prompting at the syntax level of emacs, in my opinion. And it is also clear now that, since a prompt looks more like binary code, this is a new type of function definition and a new type of programming is emerging.

A prompt function is a function defined by a version of a Language Model (LM) and a prompt (input), but as is the case in haskell, every function may be reduced to one that takes a single input and returns a single output. In other words, most prompt functions will be parameterised and have an arity greater than one.

I am building a collaborative imaginary programming environment in emacs. This is an editing environment where people can integrate LMs into emacs, extending emacs with prompt functions. The power of this is profound and beyond belief. I have coined the term "prompt functions", so don't expect to be able to find it online if you go searching.

Here is a new corporation which is creating a prompt engineering environment. However, they do not have their own operating system to integrate prompting into. That's why emacs is years ahead, potentially. A prompt is merely a function with a language model as a parameter. Without integration, it's quite useless.

https://gpt3demo.com/apps/mantiumai

I think a prompts database -- something like Datomic or other RDF-like, immutable storage -- must be added into the GNU organisation to store selected prompts and generations, and a GPL or EleutherAI GPT model should ultimately be integrated into core emacs via some low level syntax through partnership with EleutherAI.

I would expect in the future to download emacs along with an open-source GPT model, and be able to create prompt functions as easily as creating macros. A 1:1 prompt:function database of sorts is a good starting point in my opinion, but remembering the generations is also important. But the scale is immense. This is why a p2p database that can remember immutably is important, in my opinion.

If this seems too grand of scale, then at the very least consider a GNU prompts repository.

> Sounds like a replacement for a programmer's mind.

Yes it is. It trivialises the implementation and requires that programmers now be more imaginative, and they will be supported by the language model. Rather than writing an implementation, a function is defined by the input types and a Language Model and version of the language model.

> Where is definition of the abbreviation NLP?

NLP stands for Natural Language Processing. Until recently, code was not considered part of that domain, but the truth is NLP algorithms are extremely useful for code generation, code search and code understanding.

> What is definition of the abbreviation LM?

LM stands for Language Model. It is a statistical model of language, rather than one that uses formal grammars. Emacs lisp functions and macros do not have a syntax for stochastic/probabilistic programming.

> Good, but is there a video to show what it really does?

Here is an online catalogue of GPT tools. Pen.el is among the developer tools.

https://gpt3demo.com/category/developer-tools

=Pen.el= and emacs have the potential to do all the things for all of the products in =gpt3demo.com=.

I would like to demonstrate Pen.el with this particular video which I have created to demonstrate a new type of programming -- collaborative within a language model.

https://mullikine.github.io/posts/caching-and-saving-results-of-prompt-functions-in-pen-el/

https://asciinema.org/a/MhOU0eMnJsRpXf2Ak9YStPlz8

> Do you mean "exemplary" or "examplary", is it spelling mistake?

I am building a DSL for encoding prompt design patterns to generate prompt functions for emacs.

http://github.com/semiosis/pen.el/blob/master/src/pen-examplary.el

> Pen.el creates functions 1:1 for a prompt to an emacs lisp function.

What this means is that a prompt may be parameterized to define a relation (i.e. function) and therefore code, and I have chosen to create one parameterized function per prompt. The prompt text, once associated to a LM, becomes a type of query (i.e. code), so prompts should not be discounted as being any less than such, and qualify for the GPL3 license.

> I understand that it is kind of fetching information, but that does not solve licensing issues, it sounds like licensing hell.
This is exactly why a GPL LM or compatible LM is absolutely crucial and needs to be integrated, otherwise all imaginary code will be violating and harvesting open source for the foreseeable future as there is no alternative.

Sincerely,
Shane

Shane Mulligan

How to contact me:
🇦🇺 00 61 421 641 250
🇳🇿 00 64 21 1462 759 <+64-21-1462-759>
mullikine@gmail.com
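A minimal sketch of the "prompt function" idea described in the message above: a prompt with numbered slots, paired with a language model passed in as a parameter, becomes one Emacs Lisp function. Everything named here is an assumption made for illustration: the `example-` functions, the `<N>` slot syntax as handled below, and the COMPLETER argument that stands in for a model. None of it is Pen.el's actual API, and no real model is called.

#+BEGIN_SRC emacs-lisp
;; Hypothetical sketch; not Pen.el's real API.

(defun example-fill-prompt (template &rest args)
  "Return TEMPLATE with each slot <N> replaced by the Nth element of ARGS.
Slots without a matching argument are left untouched."
  (replace-regexp-in-string
   "<\\([0-9]+\\)>"
   (lambda (slot)
     ;; The match data refer to SLOT, the text of the current match.
     (let ((n (string-to-number (match-string 1 slot))))
       (or (and (> n 0) (nth (1- n) args))
           slot)))
   template t t))

(defmacro example-defprompt (name template)
  "Define NAME as a prompt function built from TEMPLATE.
The generated function takes COMPLETER, a function from a prompt
string to a generated string (a stand-in for a language model),
followed by one string argument per slot in TEMPLATE."
  `(defun ,name (completer &rest args)
     (funcall completer
              (apply #'example-fill-prompt ,template args))))

;; One prompt, one function.
(example-defprompt example-translate
  "Translate the following <1> text into <2>:\n<3>\n")

;; `identity' stands in for a model here, so the call simply returns
;; the filled-in prompt that would be sent.
(example-translate #'identity "English" "French" "Good morning")
#+END_SRC

Passing the model as an ordinary function argument keeps the sketch independent of any particular service; swapping in a local or remote completer would not change the prompt definitions.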
* Re: Help building Pen.el (GPT for emacs)
  2021-07-23  6:51 ` Shane Mulligan
@ 2021-07-23 10:12 ` Jean Louis
  2021-07-23 10:54 ` Eli Zaretskii
  2021-07-25  1:06 ` Richard Stallman
  1 sibling, 1 reply; 75+ messages in thread
From: Jean Louis @ 2021-07-23 10:12 UTC
To: Shane Mulligan; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, rms

* Shane Mulligan <mullikine@gmail.com> [2021-07-23 10:07]:
> Hi Jean and GNU friends,
>
> GPT is potentially the best thing to happen to emacs in a very long time. It will bring back power from the corporations and save it to your computer, open source and transparent, and offline.

Description is too abstract. Show me how it brought power from corporations and saved it to your computer?

The term "open source" is vague, you know it? So I do not know what is meant with it. There is no clear licensing that you are proposing, nor have you so far explained how licensing is solved. If you have not read the licenses please let me know, as then you most probably do not know to what I am referring.

Example:

You are using artificial intelligence, but in fact pieces of codes are in chunks copied from other sources without attribution and without knowing if licenses are compatible.

If that issue is not solved I do not see why anybody serious would use AI to create code, as that would potentially generate so many legal problems. And it does so now. So many people gave up on Github because Github does not comply with licenses when using Copilot.

You are more or less proposing the same conflict to come to Emacs and I did not see where is your solution?

> I understand that it is kind of fetching information, but that does not solve licensing issues, it sounds like licensing hell. This is exactly why a GPL LM or compatible LM is absolutely crucial and needs to be integrated, otherwise all imaginary code will be violating and harvesting open source for the foreseeable future as there is no alternative.

So how? That should be the first thing to start with, as one cannot even experiment in public without it. Experimenting at home is fine, but as soon as anything is published in public without compliance to licenses it generates problems.

--
Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/
* Re: Help building Pen.el (GPT for emacs)
  2021-07-23 10:12 ` Jean Louis
@ 2021-07-23 10:54 ` Eli Zaretskii
  2021-07-23 11:32 ` Jean Louis
  0 siblings, 1 reply; 75+ messages in thread
From: Eli Zaretskii @ 2021-07-23 10:54 UTC
To: Jean Louis; +Cc: stefan, emacs-tangents, mullikine, rms

> Date: Fri, 23 Jul 2021 13:12:12 +0300
> From: Jean Louis <bugs@gnu.support>
> Cc: Eli Zaretskii <eliz@gnu.org>, Stefan Kangas <stefan@marxist.se>, emacs-tangents@gnu.org, rms@gnu.org
>
> You are using artificial intelligence, but in fact pieces of codes are in chunks copied from other sources without attribution and without knowing if licenses are compatible.

That's not what happens with these services: they don't _copy_ code from other software (that won't work, because the probability of the variables being called by other names is 100%, and thus such code, if pasted into your program, will not compile). What they do, they extract ideas and algorithms from those other places, and express them in terms of your variables and your data types. So licenses are not relevant here.
* Re: Help building Pen.el (GPT for emacs)
  2021-07-23 10:54 ` Eli Zaretskii
@ 2021-07-23 11:32 ` Jean Louis
  2021-07-23 11:51 ` Eli Zaretskii
  2021-07-24  1:14 ` Richard Stallman
  0 siblings, 2 replies; 75+ messages in thread
From: Jean Louis @ 2021-07-23 11:32 UTC
To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms

* Eli Zaretskii <eliz@gnu.org> [2021-07-23 13:55]:
> > Date: Fri, 23 Jul 2021 13:12:12 +0300
> > From: Jean Louis <bugs@gnu.support>
> > Cc: Eli Zaretskii <eliz@gnu.org>, Stefan Kangas <stefan@marxist.se>, emacs-tangents@gnu.org, rms@gnu.org
> >
> > You are using artificial intelligence, but in fact pieces of codes are in chunks copied from other sources without attribution and without knowing if licenses are compatible.
>
> That's not what happens with these services: they don't _copy_ code from other software (that won't work, because the probability of the variables being called by other names is 100%, and thus such code, if pasted into your program, will not compile). What they do, they extract ideas and algorithms from those other places, and express them in terms of your variables and your data types. So licenses are not relevant here.

According to online reviews chunks of code is copied even verbatim and people find from where.

Even if modified, it still requires licensing compliance.

If code compiles or not is irrelevant. If one runs it or not is also irrelevant, code need not even run.

I do not believe that any of the AI-s so far "extract ideas". I never heard of it. Which algorithms is there on this planet that may extract idea?

I also do not believe that AI extracts algorithms, though AI has its own algorithms to generate the relevant possible code.

If newly generated code is modification from other code, what we know now that it is, and is based on, that requires licensing attributions.

That licenses are relevant one can see from online discussions related to Github Copilot:

https://fossa.com/blog/analyzing-legal-implications-github-copilot/

https://medium.com/geekculture/githubs-ai-copilot-might-get-you-sued-if-you-use-it-c1cade1ea229

https://fosspost.org/github-copilot/

https://artificialintelligence-news.com/2021/07/06/experts-debate-github-latest-ai-tool-violates-copyright-law/

https://www.opensourceforu.com/2021/07/github-copilot-sparks-debates-around-open-source-licenses/

That is just a small excerpt from a very large debate online related to licensing issues.

Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/
* Re: Help building Pen.el (GPT for emacs)
  2021-07-23 11:32 ` Jean Louis
@ 2021-07-23 11:51 ` Eli Zaretskii
  2021-07-23 12:47 ` Jean Louis
  2021-07-24  1:14 ` Richard Stallman
  1 sibling, 1 reply; 75+ messages in thread
From: Eli Zaretskii @ 2021-07-23 11:51 UTC
To: Jean Louis; +Cc: stefan, emacs-tangents, mullikine, rms

> Date: Fri, 23 Jul 2021 14:32:00 +0300
> From: Jean Louis <bugs@gnu.support>
> Cc: mullikine@gmail.com, stefan@marxist.se, emacs-tangents@gnu.org, rms@gnu.org
>
> > That's not what happens with these services: they don't _copy_ code from other software (that won't work, because the probability of the variables being called by other names is 100%, and thus such code, if pasted into your program, will not compile). What they do, they extract ideas and algorithms from those other places, and express them in terms of your variables and your data types. So licenses are not relevant here.
>
> According to online reviews chunks of code is copied even verbatim and people find from where.

That cannot be true. It is nonsense to copy unrelated code into a program and tell people this is what they should use.

> If code compiles or not is irrelevant. If one runs it or not is also irrelevant, code need not even run.

A feature or service that is based on this idea will never fly, believe me. Which programmer would want to have code pasted into his/her program that would cause compilation errors or, worse, break it at run time?

> I do not believe that any of the AI-s so far "extract ideas". I never heard of it. Which algorithms is there on this planet that may extract idea?

That's a very general question, it is impossible to answer it in a post to a mailing list. If you are really interested, you will have to read up on that. But you are wrong in your beliefs.

> If newly generated code is modification from other code, what we know now that it is, and is based on, that requires licensing attributions.

Once again, your assumptions are all wrong, so your conclusions are also wrong. Why not try one of these services and see what they actually do, before you pass your (quite harsh) judgment on them, and on the modern state of AI in general?

> That licenses are relevant one can see from online discussions related to Github Copilot:

That people ask these questions and discuss this doesn't mean the problem is real. Many people don't really understand what copyright means and how to apply it to program code. People also ask questions about the GPL, and there's a vociferous group of people who think the copyright assignment of code to the FSF means you give up all your rights in the code you've written. None of that is true, but still the rumors and the heated discussions go on and on. Their existence proves nothing, except that some people misunderstand something.
* Re: Help building Pen.el (GPT for emacs)
  2021-07-23 11:51 ` Eli Zaretskii
@ 2021-07-23 12:47 ` Jean Louis
  2021-07-23 13:39 ` Shane Mulligan
  2021-07-23 19:33 ` Eli Zaretskii
  0 siblings, 2 replies; 75+ messages in thread
From: Jean Louis @ 2021-07-23 12:47 UTC
To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms

* Eli Zaretskii <eliz@gnu.org> [2021-07-23 14:51]:
> > According to online reviews chunks of code is copied even verbatim and people find from where.
>
> That cannot be true. It is nonsense to copy unrelated code into a program and tell people this is what they should use.

I wonder how sure you are in that, did you do the online research? It is not about related or unrelated, I do believe that AI finds and generates related code. But here are references disputing how "it cannot be true":

https://hacker-news.news/post/27710287

https://mmacvicar.medium.com/it-is-best-if-copilot-copies-everything-d84506128e5a

https://loudlabs.nl/news/githubs-commercial-ai-tool-was-built-from-open-source-code/

> > If code compiles or not is irrelevant. If one runs it or not is also irrelevant, code need not even run.
>
> A feature or service that is based on this idea will never fly, believe me. Which programmer would want to have code pasted into his/her program that would cause compilation errors or, worse, break it at run time?

Of course people want code to run. Just that copyright laws don't handle technical functionality. It is irrelevant if a program works or does not work. There are thousands of copyrighted programs that cannot work any more as the devices are not on the market; they are still under copyright.

> > I do not believe that any of the AI-s so far "extract ideas". I never heard of it. Which algorithms is there on this planet that may extract idea?
>
> That's a very general question, it is impossible to answer it in a post to a mailing list. If you are really interested, you will have to read up on that. But you are wrong in your beliefs.
>
> > If newly generated code is modification from other code, what we know now that it is, and is based on, that requires licensing attributions.
>
> Once again, your assumptions are all wrong, so your conclusions are also wrong. Why not try one of these services and see what they actually do, before you pass your (quite harsh) judgment on them, and on the modern state of AI in general?

I can hear you say that I am wrong and my conclusions are wrong, though I gave you references enough to research it on the Internet, which will tell you that there are possible serious licensing problems with such generated code.

> > That licenses are relevant one can see from online discussions related to Github Copilot:
>
> That people ask these questions and discuss this doesn't mean the problem is real. Many people don't really understand what copyright means and how to apply it to program code.

Well said! Though that is not relevant.

Question is very particular, specific and concrete:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

How do Pen.el and the background AI services ensure licensing compliance?

I would appreciate if you find a solution to that or stay on that subject, as whether I am wrong or right is not relevant; what I wish is to have assurance that it is free software. Prove me wrong by providing exact references, not only to one country's law but also to other countries' laws, the laws that make it legal, or to how otherwise the legality of such code is justified and how users may get free software.

For example you may wish to mention "fair use", and on the other hand similar laws must be found in other countries that would justify it to be free software.

As long as you don't tackle those subjects there is no legal solution for Pen.el and background AI to be used with assurance that the software is truly free software.

Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/
* Re: Help building Pen.el (GPT for emacs)
  2021-07-23 12:47 ` Jean Louis
@ 2021-07-23 13:39 ` Shane Mulligan
  2021-07-23 14:39 ` Jean Louis
  2021-07-26  0:16 ` Richard Stallman
  2021-07-23 19:33 ` Eli Zaretskii
  1 sibling, 2 replies; 75+ messages in thread
From: Shane Mulligan @ 2021-07-23 13:39 UTC
To: Jean Louis; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, rms

Hi Jean, Eli, GNU,

> "open source"

I am referring to free software in the spirit of GNU. Free as in freedom, from oppression, from an attack against creative and cognitive intelligence.

> GPT is potentially the best thing to happen to emacs in a very long time. It will bring back power from the corporations and save it to your computer, open source and transparent, and offline.

The way this will work is you will download the free GPT model, such as GPT-j, GPT-neo or GPT-neox, and then you will have an offline and private alternative to many things you previously would go online for.

I have been working 5 months on demonstrations this whole time and I have informed you guys via emails, using specific demonstrations. I've even hand-picked them for you.

Now you ask again and I'll give you another, but you are missing the point by focusing on one example when the possibilities are infinite.

If I was a computing pioneer and I had to convince you of the importance of AI and all I had was the lambda calculus, would you see it? And you ask me for another cherry-picked demo, but it is very much beyond your current understanding. It's so much beyond what you believe is possible that you ask for an example. I have shown you 10+ already.

GPT turns emacs into something very powerful beyond your current comprehension. It's so profound that it will replace many of the online and offline services you may have come to take for granted. It goes way beyond that too. This is the telos and purpose of emacs. It can save free software by absorbing GPT. GPT is not a toy.

Here is another demo. The below instructions were given to me by the tutor in Pen.el when I asked it for help.

There are two ways to quit Emacs, the hard way, and the easy way. In the hard way, you type M-x kill-emacs, and press enter. In the easy way, you press C-x C-c.

The following is a prompt that created this interactive function.

#+BEGIN_SRC yaml
  prompt: |
    This is a conversation between a human and a brilliant AI.
    The topic is "<2>".

    Human: Hello, are you my <1> tutor?
    ###
    AI: Hi there. Yes I am. How can I help you?
    ###
    Human: Thanks. I have a question. <2>
    ###
    AI: I would be happy to answer your question.
#+END_SRC

Here is the recording of me doing that:

https://asciinema.org/a/SCUhm3l11N3w5eilUfewBDCiP

In the future the core prompts will be only a few bytes and most prompts we use will be defined using types alone, or subjective to previously executed prompt functions, collaboratively.

A prompt may be defined by type names alone, plus the version of a LM; the rest is inferred or subjective to peer to peer prompts:

#+BEGIN_SRC emacs-lisp
  (defprompt ("short lines of code" "regex"))
#+END_SRC

You haven't yet understood the profundity of GPT and are doing a great disservice to free software by stifling imaginary programming inside emacs.

Eli is right in that these LMs are not copy pasting. They are indeed generating from a statistical model. I know many people who are creating their own models on themselves. There will be infinitely many of these models in the future. GNU needs one. LMs may be trained on subsets of the code from GitHub to ensure there is no possibility for derivative works from conflicting licensed code.

> You are more or less proposing the same conflict to come to Emacs and I did not see where is your solution?

Emacs is dead without GPT. That's why I raised the issue. It's dead because it can be imagined by LMs and will lose its power. It'll be just another imagined environment. Software is changing and Emacs can't miss out because it represents freedom.

> So how?

As I have said countless times in the past and is clearly visible from their website, EleutherAI has MIT and Apache licensed alternative models which you can freely download.

Here is a completer you can freely use:

https://github.com/samrawal/emacs-secondmate

This will not replace everything from =gpt3demo.com=. You're failing to see the full picture here. It's absolutely vital for emacs' survival to have GPT incorporated. Make it happen.

> As soon as anything is published in public without compliance to licenses it generates problems.

Prompts are completely at the license of the person who created them, even if they are queries to GPT3. Just as I can write SQL to Microsoft SQL Server and license my SQL as GPL3. Prompts are code and I'm talking about a new type of programming that is nonexistent in emacs.

My suggestions:
- Create prompts database
  - So people can collaborate on open source prompts
  - So people can extend emacs with language models
  - So people know it's ok to use their imagination and emacs supports creative intelligence
- Integrate prompt functions into emacs somehow
  - defprompt
- Optionally ship GPT-neo and GPT-j with emacs
- Consider creating a prompting server
- Consider a database for saving generations

=Pen.el= is GPL3. There's nothing wrong with typing on a keyboard so it's fully compliant with licensing. =Pen.el= allows you to select the completion engine and you may use a libre completion engine such as GPT-j, GPT-neo or GPT-neox.

> "Prove me wrong"

Do me a favour and do some research yourself. I have too much to do.
Sincerely,
Shane

Shane Mulligan

How to contact me:
🇦🇺 00 61 421 641 250
🇳🇿 00 64 21 1462 759 <+64-21-1462-759>
mullikine@gmail.com
* Re: Help building Pen.el (GPT for emacs) 2021-07-23 13:39 ` Shane Mulligan @ 2021-07-23 14:39 ` Jean Louis 2021-07-26 0:16 ` Richard Stallman 1 sibling, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-23 14:39 UTC (permalink / raw) To: Shane Mulligan; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, rms * Shane Mulligan <mullikine@gmail.com> [2021-07-23 16:40]: > Hi Jean, Eli, GNU, > > > "open source" > I am referring to free software in the spirit > of GNU. Free as in freedom, from oppression, > from an attack against creative and cognitive > intelligence. Alright, though in GNU project we don't use the term "Open Source" as it was never about it. The term is "free software". Open Source is today vague, people use the term "Open" for things which are not so open, including source code, there is open source from Microsoft which is proprietary and yet called "open source", there we have now OpenAI which is not free software, and so on. See "Open": https://www.gnu.org/philosophy/words-to-avoid.html#Open > Now you ask again and I'll give you another, but you are missing the > point by focusing on one example when the possibilities are > infinite. I hope that generated code will not take longer time to verify then writing it by hand. Not to mistake me, if the databases are free as in the definition of free software and your code is free, then I am definitely for that, and I like AI, we have too little of the artificial intelligence in 21st century. We are under developed civilization in that regard. The movie 2001: A Space Odyssey was made in 1968, prediction was already that we would have space ships with Hal AI that guides us, but we are not far from bedroom with Amazon spying "AI" devices. It is very important to have all parts free as in free software definition. Review again: https://www.gnu.org/philosophy/free-sw.html > GPT turns emacs into something very powerful beyond your current > comprehension. 
It's so profound that it will replace many of the > online and offline services you may have come to take for > granted. It goes way beyond that too. Yes, I am asking for less abstract, more practical examples. I have no use from the hype. > Here is another demo. > The below instructions were given to me by the > tutor in Pen.el when I asked it for help. > > There are two ways to quit Emacs, the hard way, and the easy way. > In the hard way, you type M-x kill-emacs, and press enter. > In the easy way, you press C-x C-c. I have to look on it realistically from my angle, so I do not see any use in this example. I see that AI guessed something and generated text. I would not agree that M-x kill-emacs is hard way, and that C-x C-c is easy way. I would rather say that easy way is to choose File and Quit menu options. > The following is a prompt that created this interactive function. > > #+BEGIN_SRC yaml > prompt: | > This is a conversation between a human and a brilliant AI. > The topic is "<2>". > > Human: Hello, are you my <1> tutor? > ### > AI: Hi there. > Yes I am. > How can I help you? > ### > Human: Thanks. I have a . <2> > ### > AI: I would be happy to answer your question. > #+END_SRC I am sorry, I wish to see example of usefulness. I will go over your previous examples. It is definitely possible that I neglected it, but you know from beginning that I am interested in this. I have my reasons why I am interested as I do generate a lot of text and I wish to spare my writings. In the above quote I do not see that prompt, and how is relevant to how to quit Emacs. > Here is the recording of me doing that: > > https://asciinema.org/a/SCUhm3l11N3w5eilUfewBDCiP I have clicked on that link and could not find exact reference. I found "haskell lsp with HIE", something about "htop" and "stackexchange". 
> A prompt may be defined by type names alone, > plus the version of a LM; the rest is inferred > or subjective to peer to peer prompts: > > #+BEGIN_SRC emacs-lisp > (defprompt ("short lines of code" "regex")) > #+END_SRC > > You haven't yet understood the profundity of GPT and > doing a great disservice to free software by > stifling imaginary programming inside emacs. The above hype paragraphs are suspiciously AI-looking. > > You are more or less proposing the same conflict to come to Emacs and > > I did not see where is your solution? > Emacs is dead without GPT. That's why I raised the issue. It's dead > because it can be imagined by LMs and will lose its power. It'll be > just another imagined environment. Software is changing and Emacs > can't miss out because it represents freedom. That was not context of my question. Did you read last email to Eli about it? It seem like you either ignored my question or you keep using AI to generate hype about it. Julia Reda from Germany is at least trying to answer my question related to licensing compliance here: https://juliareda.eu/2021/07/github-copilot-is-not-infringing-your-copyright/ So there are at least ways to go to understand how it complies or could comply to licenses or be liberated from licenses. > You're failing to see the full picture here. > It's absolutely vital for emacs' survival to > have GPT incorporated. Make it happen. I'm having a hard time following your proposal. I'm not sure what you're asking. But I am too old to understand it. Though it is not my age that matters. It is your experience, your knowledge, your education, your understanding, your wisdom. In all of these you and me have to be the best judge of what is good for you and me and what is not. The above paragraph was created by AI with small corrections. It says nothing just as the above quoted paragraph says nothing essential. "full picture", "absolutely vital", "Emacs survival", "Make it happen" -- that is sales pitch. 
And I am sales manager btw. That AI is useful in general, no need to convince me. Question was about licensing and I see that some activists like Julia Reda, whom I have contacted previously in relation to copyright issues in EU, have found legal justifications for licensing compliance. That is what I wanted to know. The issue is however not closed with the assumptions of Julia Reda, as she may know EU laws, but not all jurisdictions are quite aligned, so we still have to be vigilant and follow up on that. IMHO, you should incorporate justifications used by Mrs. Reda in your commentary or README files with references so that licensing becomes clear for future readers. > > As soon as anything is published in public > > without compliance to licenses it generates > > problems. > Prompts are completely at the license of the > person who created it, even if they are > queries to GPT3. If prompts never appear in published works those are not relevant. If they do appear, their licensing compliance is assured by programmer or author. Even prompts could be coming from proprietary software. > My suggestions: > - Create prompts database > - So people can collaborate on open source prompts > - So people can extend emacs with language models > - So people know it's ok to use their imagination and emacs supports > creative intelligence > - Integrate prompt functions into emacs somehow? > - defprompt > - Optionally ship GPT-neo and GPT-j with emacs > - Consider creating a prompting server > - Consider a database for saving generations I find them all good ideas, I just wish I could see more practical examples: - prompts database, should be on the website? Where? Hosted by which party? Is it centralized or decentralized? - collaboration by which means? Email, chat? Website forum? How exactly? - I understand your 3rd point. - GPT-neo and GPT-j are how big? -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. 
Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-23 13:39 ` Shane Mulligan 2021-07-23 14:39 ` Jean Louis @ 2021-07-26 0:16 ` Richard Stallman 2021-07-26 0:28 ` Shane Mulligan 1 sibling, 1 reply; 75+ messages in thread From: Richard Stallman @ 2021-07-26 0:16 UTC (permalink / raw) To: Shane Mulligan; +Cc: eliz, stefan, emacs-tangents, bugs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > GPT turns emacs into something very powerful > beyond your current comprehension. It's so > profound that it will replace many of the > online and offline services you may have come > to take for granted. It goes way beyond that too. Unfortunately, telling me that something is "powerful beyond [my] current comprehension" does not help me start to comprehend any of it. Would you like to name some of the services that GPT would replace? I might learn something concrete from that. > Here is the recording of me doing that: > https://asciinema.org/a/SCUhm3l11N3w5eilUfewBDCiP I looked at that page, but I have no idea what it means. The page shows three boxes side by side. Each seems to contain some code, or maybe parameter specs, in a language I don't know. I clicked on the first box and it brought me to a similar page with three other boxes. It talks about "asciicasts" but I don't know what that means. If it is something to be viewed, how can I do so? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-26 0:16 ` Richard Stallman @ 2021-07-26 0:28 ` Shane Mulligan 2021-07-30 3:20 ` Shane Mulligan 0 siblings, 1 reply; 75+ messages in thread From: Shane Mulligan @ 2021-07-26 0:28 UTC (permalink / raw) To: rms; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, Jean Louis [-- Attachment #1: Type: text/plain, Size: 2511 bytes --] Hey Richard and all. I have just participated in the Augment Minds unconference and have a recorded demo of Pen.el. I will also be presenting the demo to Nat Friedman. I have made some references to the new Codex model and how it has stolen the inspiration from free software. The point I'm making is this: Pen.el and software which combines GPT into the operating system is the future, and I'm alerting GNU to this first but I'm also showing GitHub. This is for the following reasons: - The Copilot/Codex model is a disgrace - We need a free repository of prompts and prompt functions for emacs I hope the demo which I will send in the next day or two (or whenever it becomes available) will be informative. It will be easier than the asciicast. Thank you. Shane Mulligan How to contact me: 🇦🇺 00 61 421 641 250 🇳🇿 00 64 21 1462 759 <+64-21-1462-759> mullikine@gmail.com [-- Attachment #2: Type: text/html, Size: 4961 bytes --] ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-26 0:28 ` Shane Mulligan @ 2021-07-30 3:20 ` Shane Mulligan 2021-07-30 6:55 ` Jean Louis 0 siblings, 1 reply; 75+ messages in thread From: Shane Mulligan @ 2021-07-30 3:20 UTC (permalink / raw) To: rms; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, Jean Louis [-- Attachment #1.1: Type: text/plain, Size: 3705 bytes --] Hey guys. In the last week I have been writing a thesis for Imaginary Programming, which aims to make all of this clear and formalised. I am very sorry if I have sounded frustrated, but I think that this is so important for free software that a GPL-4 may be required to protect people, and also that Copilot and OpenAI's Codex and GPT-3 models infringe upon the spirit of GPL-3 code. I have attached the thesis to this email. https://github.com/semiosis/imaginary-programming-thesis/blob/master/thesis.org I am working around the clock to finish this thesis and have it published, but it's really important to have these protections in place before the huge suite of SaaS services and Microsoft apps hit the market which are using Copilot and Codex to generate derivative works and applications built upon the backs of free software developers. Thank you. Shane Mulligan How to contact me: 🇦🇺 00 61 421 641 250 🇳🇿 00 64 21 1462 759 <+64-21-1462-759> mullikine@gmail.com [-- Attachment #1.2: Type: text/html, Size: 8049 bytes --] [-- Attachment #2: thesis.org --] [-- Type: application/octet-stream, Size: 26754 bytes --] * Imaginary programming is a new programming paradigm based on language models ** Abstract Imaginary code is code whose behaviour is influenced by LMs. The side effects or return values of imaginary code, therefore, are imagined by a LM, but may also be used to facilitate the imagination of the programmer and may be considered to be a bicycle for the imagination. This is very obvious when interacting with an imaginary REPL. I will attempt to formalise imaginary programming, make some demonstrations of programming within this paradigm and explore some useful data structures and algorithms that are both impurely and purely imaginary. I'll also give an example of an imaginary programming language that I have created (perhaps the first of its kind), Examplary. The motivation for formalising imaginary programming is not purely academic. Imaginary code needs to be recognised as code so that it may be protected by the GPL. I also posit that models of NL, if trained on source code, create a holographic representation of the software, which I argue is a derivative work and a reflection of the original code. I argue that a holographic representation of software both within (author inspiration) and without (how it is used) is just another representation of the software, alongside the original source code, just as functions may be represented differently. ** Introduction The recently burgeoning and soon to diminish programming paradigm of prompt engineering is about to be superseded by prompt-tuning and the fine-tuning of LMs which will further occlude the way that software works. 
Prompt engineering has barely had its time in the spotlight and as a result has not established itself as a sovereign programming paradigm. However, imaginary programming is a broader definition that encapsulates all programming that solicits LMs and uses their output to effect change in a program's logic and will outlast prompt engineering as a useful concept. In contrast with imaginary code, ordinary code has not yet been contaminated by a LM, and we say that it has no imaginary dimension to it. Impure imaginary code is where ordinary code intersects with pure imaginary code. An impure imaginary function is a function that queries a LM to directly affect its own logic or output. We say that an impure imaginary function is grounded [to reality] because it's connecting base reality to a LM. The output and behaviour of an impure imaginary function is directly influenced by base reality plus a query to a LM. The query [or prompt] to the LM may be in part constructed manually through prompt engineering, or in part constructed automatically via prompt tuning, or in part constructed or eliminated by the fine-tuning of a LM. Even after fine-tuning, there is still a query to be formulated to the LM, and that query may indeed be the empty string. Considering that large LMs such as GPT-3 can perform multiple tasks, the process of refining a query through prompt-engineering, prompt-tuning or fine-tuning also characterises the expected output from the LM. All that is left is to map a prompt along with its associated LM to a function and then you have a prompt function. Prompt functions reconcile LMs with programming languages. A prompt function is just a function that prompts a LM and may optionally be parameterized with template variables that are substituted into the prompt or also contain hyperparameters to affect the LM's operation. Such functions are the basis for services such as GitHub Copilot. 
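The definition of a prompt function given above can be sketched in a few lines. The following Python is only an illustration under stated assumptions and is not how Pen.el is implemented: the names =expand_template=, =truncate_at_stop=, =make_prompt_function= and =fake_complete= are hypothetical, the =<1>= / =<2>= placeholders and the =###= stop sequence follow the prompts shown in this thesis, and the LM is replaced by a stub so that the example runs offline.

```python
import re
from typing import Callable, Sequence


def expand_template(template: str, values: Sequence[str]) -> str:
    """Substitute <1>, <2>, ... with the corresponding positional value."""
    def lookup(match: re.Match) -> str:
        index = int(match.group(1)) - 1
        # Placeholders without a supplied value are left untouched.
        return values[index] if 0 <= index < len(values) else match.group(0)
    return re.sub(r"<(\d+)>", lookup, template)


def truncate_at_stop(text: str, stop_sequences: Sequence[str]) -> str:
    """Cut the completion at the first occurrence of each stop sequence."""
    for stop in stop_sequences:
        position = text.find(stop)
        if position != -1:
            text = text[:position]
    return text


def make_prompt_function(template: str,
                         complete: Callable[..., str],
                         stop_sequences: Sequence[str] = ("###",),
                         **hyperparameters) -> Callable[..., str]:
    """Map a prompt template plus a LM backend to an ordinary function."""
    def prompt_function(*values: str) -> str:
        prompt = expand_template(template, values)
        raw = complete(prompt, **hyperparameters)
        return truncate_at_stop(raw, stop_sequences).strip()
    return prompt_function


def fake_complete(prompt: str, **hyperparameters) -> str:
    """Assumption: a stand-in for a real LM; it always returns this text."""
    return " pwd\n###\nInput: List files\nOutput: ls -l"


# A hypothetical natural-language shell built from a two-variable template.
nlsh = make_prompt_function(
    "# List of one-liner shell commands for <1>.\nInput: <2>\nOutput:",
    fake_complete,
    temperature=0.8,
    max_tokens=60,
)

print(nlsh("Arch Linux", "Print the current directory"))  # prints "pwd"
```

Replacing =fake_complete= with a call to an actual LM is what would make the resulting function impure imaginary code in the sense used here: the template variables and hyperparameters stay ordinary, while the returned value is imagined by the model.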
** Impure imaginary code is useful Impure imaginary code is very obviously useful as such code utilises LMs that are trained to perform useful tasks. GPT-3, for example, is a generalist and only requires a tiny amount of prompt design and/or fine-tuning to direct it to perform the task you want. Following are some demonstrations of using impure imaginary code to construct part of an imaginary programming environment, perform code generation, transpile code and translate world languages. *** Useful impure imaginary functions **** With one =Pen.el= system The following prompt function definition function associates a prompt with a LM (OpenAI's GPT-3 davinci) and defines the parameters for a function in emacs lisp. #+BEGIN_SRC yaml -n :async :results verbatim code title: bash one liner generator on OS from natural language doc: Get a bash one liner on OS from natural language notes: - "rlprompt is used here outside of pen.el" rlprompt: nlsh <1> prompt: | # List of one-liner shell commands for <1>. # Language: Shell # Operating System: <1> Input: Print the current directory Output: pwd ### Input: List files Output: ls -l ### Input: Change directory to /tmp Output: cd /tmp ### repeater: | Input: {} Output: lm-command: "openai-complete.sh" engine: davinci temperature: 0.8 max-tokens: 60 top-p: 1 stop-sequences: - "###" vars: - Operating System - command examples: - Arch Linux - Install package postprocessor: 'sed ''s/^Output: //''' conversation-mode: true #+END_SRC The following is the generated documentation for the interactive prompt function in emacs. #+BEGIN_SRC text -n :async :results verbatim code pf-bash-one-liner-generator-from-natural-language is an interactive function defined in pen-example-config.el. 
Signature (pf-bash-one-liner-generator-from-natural-language &optional TASK-DESCRIPTION &key NO-SELECT-RESULT) Documentation bash one liner generator from natural language Get a bash one liner from natural language path: - /home/shane/source/git/spacemacs/prompts/prompts/bash-one-liner.prompt examples: - shift last argument Key Bindings This command is not in any keymaps. References pf-bash-one-liner-generator-from-natural-language is unused in pen-example-config.el. #+END_SRC Below is the generated interactive function in emacs lisp. #+BEGIN_SRC emacs-lisp -n :async :results verbatim code (lambda (&optional task-description &rest --cl-rest--) "bash one liner generator from natural language\nGet a bash one liner from natural language\n\npath:\n- /home/shane/source/git/spacemacs/prompts/prompts/bash-one-liner.prompt\n\nexamples:\n- shift last argument\n\n(fn &optional TASK-DESCRIPTION &key NO-SELECT-RESULT)" (interactive (list (if mark-active (pen-selected-text) (if nil (etv "shift last argument") (read-string-hist "task-description: " "shift last argument"))))) (let* ((no-select-result (car (cdr (plist-member --cl-rest-- ':no-select-result))))) (progn (let ((--cl-keys-- --cl-rest--)) (while --cl-keys-- (cond ((memq (car --cl-keys--) '(:no-select-result :allow-other-keys)) (setq --cl-keys-- (cdr (cdr --cl-keys--)))) ((car (cdr (memq ':allow-other-keys --cl-rest--))) (setq --cl-keys-- nil)) (t (error "Keyword argument %s not one of (:no-select-result)" (car --cl-keys--)))))) (cl-block pf-bash-one-liner-generator-from-natural-language (let* ((final-prompt "The following is a list of one-liners for the linux command-line:\n\n# get newest file in directory bash\n$ ls -t * | head -1\n###\n# Find with invert match - e.g. find every file that is not mp3\n$ find . -name '*' -type f -not -path '*.mp3'\n###\n# Recursively remove all \"node_modules\" folders\n$ find . 
-name \"node_modules\" -exec rm -rf '{}' +\n###\n# <1>\n$\n") (final-max-tokens (str (if (variable-p 'max-tokens) (eval 'max-tokens) 60))) (final-stop-sequences (if (variable-p 'stop-sequences) (eval 'stop-sequences) '("###"))) (vals (mapcar 'str (if (not (interactive-p)) (progn (cl-loop for sym in '(task-description) for iarg in '((if mark-active (pen-selected-text) (if nil (etv "shift last argument") (read-string-hist "task-description: " "shift last argument")))) collect (let* ((initval (eval sym))) (if (and (not initval) iarg) (eval iarg) initval)))) (cl-loop for v in '(task-description) until (eq v '&key) collect (eval v))))) (vals (cl-loop for tp in (-zip-fill nil vals 'nil) collect (let* ((v (car tp)) (pp (cdr tp))) (if pp (pen-sn pp v) v)))) (i 1) (final-prompt (pen-expand-template final-prompt vals)) (prompt-end-pos (or (byte-string-search "<:pp>" "The following is a list of one-liners for the linux command-line:\n\n# get newest file in directory bash\n$ ls -t * | head -1\n###\n# Find with invert match - e.g. find every file that is not mp3\n$ find . -name '*' -type f -not -path '*.mp3'\n###\n# Recursively remove all \"node_modules\" folders\n$ find . 
-name \"node_modules\" -exec rm -rf '{}' +\n###\n# <1>\n$\n") (string-bytes final-prompt))) (final-prompt (string-replace "<:pp>" "" final-prompt)) (final-prompt (if nil (sor (pen-snc nil final-prompt) (concat "prompt-filter " nil " failed.")) final-prompt)) (pen-sh-update (or pen-sh-update (>= (prefix-numeric-value current-global-prefix-arg) 4))) (shcmd (pen-log (concat (sh-construct-envs `(("PEN_PROMPT" ,(pen-encode-string final-prompt)) ("PEN_LM_COMMAND" ,"openai-complete.sh") ("PEN_ENGINE" ,"davinci") ("PEN_MAX_TOKENS" ,(pen-expand-template final-max-tokens vals)) ("PEN_TEMPERATURE" ,(pen-expand-template (str 0.8) vals)) ("PEN_STOP_SEQUENCE" ,(pen-encode-string (str (if (variable-p 'stop-sequence) (eval 'stop-sequence) "###")))) ("PEN_TOP_P" ,1) ("PEN_CACHE" ,nil) ("PEN_N_COMPLETIONS" ,5) ("PEN_END_POS" ,prompt-end-pos))) " " "upd lm-complete"))) (resultsdirs (cl-loop for i in (number-sequence 1 1) collect (progn (message (concat "pf-bash-one-liner-generator-from-natural-language" " query " (int-to-string i) "...")) (let ((ret (pen-prompt-snc shcmd i))) (message (concat "pf-bash-one-liner-generator-from-natural-language" " done " (int-to-string i))) ret)))) (results (-uniq (flatten-once (cl-loop for rd in resultsdirs collect (if (sor rd) (->> (glob (concat rd "/*")) (mapcar 'e/cat) (mapcar (lambda (r) (if (and nil (sor nil)) (pen-sn nil r) r))) (mapcar (lambda (r) (if (and (variable-p 'prettify) prettify nil (sor nil)) (pen-sn nil r) r))) (mapcar (lambda (r) (if (not nil) (s-trim-left r) r))) (mapcar (lambda (r) (if (not nil) (s-trim-right r) r))) (mapcar (lambda (r) (cl-loop for stsq in final-stop-sequences do (let ((matchpos (string-search stsq r))) (if matchpos (setq r (s-truncate matchpos r ""))))) r))) (list (message "Try UPDATE=y or debugging"))))))) (result (if no-select-result (length results) (cl-fz results :prompt (concat "pf-bash-one-liner-generator-from-natural-language" ": ") :select-only-match t)))) (if no-select-result results (if (interactive-p) 
(cond ((>= (prefix-numeric-value current-prefix-arg) 4) (etv result)) ((and nil mark-active) (replace-region result)) ((or nil nil) (insert result)) (t (etv result))) result))))))) #+END_SRC The above function creates a NL shell. This enables you to generate shell commands based on NL and it is parameterized to enable you to specify the operating system that the commands generated should run on. #+BEGIN_SRC emacs-lisp -n :async :results raw (list2str (pf-bash-one-liner-generator-on-os-from-natural-language "Arch Linux" "Disable firewall" :no-select-result t)) #+END_SRC Here is a list of suggestions generated from the above prompt function. #+BEGIN_SRC text -n :async :results verbatim code iptables -F iptables -P OUTPUT DROP sed -i 's/^[ \t]*firewall=.*$/firewall=0/' /etc/sysconfig/iptables systemctl stop iptables.service sudo systemctl stop iptables sudo ufw disable #+END_SRC You may also run it as a REPL. https://semiosis.github.io/posts/imaginary-programming-with-gpt-3/ #+BEGIN_SRC yaml -n :async :results verbatim code title: Code interpreter kickstarter future-titles: - Code interpreter kickstarter doc: Given a line of code, infer the result of running that code prompt-version: 4 prompt: | Code examples: Language: Python Input: print(random.randint(0,9)) Output: 5 ### Language: Bash Input: Str="Learn Linux from LinuxHint"; subStr=${Str:6:5} Output: Linux ### repeater: | Language: <1> Input: {} Output: issues: engine: davinci temperature: 0.8 max-tokens: 60 top-p: 1 stop-sequences: - "##" - "\n" vars: - language - code examples: - haskell - '"Hello" ++ " " ++ "World"' prefer-external: true external: iol similarity-test: string-equal quality-script: levenshtein -s conversation-mode: true n-test-runs: 5 #+END_SRC #+BEGIN_SRC emacs-lisp -n :async :results raw (car (pf-code-interpreter-kickstarter "Haskell" "\"Hello\" ++ \" \" ++ \"World\"" :no-select-result t)) #+END_SRC #+BEGIN_SRC text -n :async :results verbatim code Hello World #+END_SRC **** With two =Pen.el= 
systems ***** Using a common language model Translating communications with a world language translation prompt function. #+BEGIN_SRC yaml -n :async :results verbatim code title: Translate from world language X to Y prompt-version: 3 doc: This prompt translates English text to any world langauge prompt: | ### # English: Hello # Russian: Zdravstvuyte # Italian: Salve # Japanese: Konnichiwa # German: Guten Tag # French: Bonjour # Spanish: Hola ### # English: Happy birthday! # French: Bon anniversaire ! # German: Alles Gute zum Geburtstag! # Italian: Buon compleanno! # Indonesian: Selamat ulang tahun! ### # <1>: <3> # <2>: engine: davinci temperature: 0.5 max-tokens: 200 top-p: 1 stop-sequences: - "#" vars: - from-language - to-language - phrase preprocessors: - cat - cat - pen-s onelineify postprocessor: pen-s unonelineify examples: - English - French - Goodnight var-defaults: - "(or (sor (nth 0 (pf-get-language (pen-selected-text) :no-select-result t))) (read-string-hist \"Pen From language: \"))" - "(read-string-hist \"Pen To language: \")" - "(pen-selected-text)" filter: on #+END_SRC A demonstration of two people who understand different world languages using a common LM to understand one another. #+NAME: fromenglish #+BEGIN_SRC text -n :async :results verbatim code Happy birthday To you #+END_SRC #+BEGIN_SRC emacs-lisp -n :async :results code raw ;; Alice translates into french for Bob (car (pf-translate-from-world-language-x-to-y "English" "French" "Happy birthday\nTo you" :no-select-result t)) #+END_SRC #+NAME: fromfrench #+BEGIN_SRC text -n :async :results verbatim code Bon anniversaire A vous #+END_SRC #+BEGIN_SRC text -n :async :results verbatim code Merci beaucoup #+END_SRC #+BEGIN_SRC emacs-lisp -n :async :results code raw ;; Bob translates back into English for Alice (car (pf-translate-from-world-language-x-to-y "French" "English" "Merci\nbeaucoup" :no-select-result t)) #+END_SRC #+BEGIN_SRC text -n :async :results verbatim code Thank you! 
#+END_SRC https://asciinema.org/a/7YnSnrrLgbiFlyMyYxBgaZYUb #+BEGIN_EXPORT html <!-- Play on asciinema.com --> <!-- <a title="asciinema recording" href="https://asciinema.org/a/7YnSnrrLgbiFlyMyYxBgaZYUb" target="_blank"><img alt="asciinema recording" src="https://asciinema.org/a/7YnSnrrLgbiFlyMyYxBgaZYUb.svg" /></a> --> <!-- Play on the blog --> <script src="https://asciinema.org/a/7YnSnrrLgbiFlyMyYxBgaZYUb.js" id="asciicast-7YnSnrrLgbiFlyMyYxBgaZYUb" async></script> #+END_EXPORT ***** With different language models - GPT-neo and GPT-3? - curie vs davinci? - Generate a story about a meeting with one prompt - Summarize with bullet points - meeting-bullets-to-summary.prompt *** An impure imaginary data structure **** With one =Pen.el= system - Natural language database entry **** With two =Pen.el= systems - Database prompt **** With three =Pen.el= systems - Database prompt *** TODO Find a useful impure imaginary algorithm **** With one =Pen.el= system - Translate from X to Y - Backtranslate from Y to X Find a better prompt? **** With two =Pen.el= systems **** With three =Pen.el= systems ** Pure imaginary code is useful Pure imaginary programming is a type of programming where the original language models may not even be known. I demonstrate that collaborative pure imaginary programming is useful. *** Translation between two =Pen.el= systems with different language models A common library of pure imaginary functions. #+BEGIN_SRC emacs-lisp -n :async :results verbatim code ("translate" "prose" "from" "to") #+END_SRC Pure imaginary functions can be composed. #+BEGIN_SRC emacs-lisp -n :async :results verbatim code ("translate" ("make analogy about" "topic") "from" "to") #+END_SRC ** Imaginary programming languages are required to work with language models *** Examplary - Part of it is task-oriented, which defers imagination to a language model to understand what it means. - Part of it is example-oriented, which is pure-imaginary. 
*** Example-oriented
#+BEGIN_SRC emacs-lisp -n :async :results verbatim code
;; Convert lines to regex.
(xl-defprompt ("lines of code" regex)
  ;; :task "Convert lines to regex"
  ;; Generate input with this
  ;; :gen "examplary-edit-generator shane"
  :gen examplary-edit-generator
  :filter "grex"
  ;; The third argument (if supplied) should be incorrect output (a counterexample).
  ;; If the 2nd argument is left out, it will be generated by the command specified by :external
  :examples (("example 1\nexample2")
             ("example 2\nexample3" "^example [23]$")
             ("pi4\npi5" "^pi[45]$" "pi4\npi5"))
  :lm-command "openai-complete.sh")
#+END_SRC

*** Task oriented
#+BEGIN_SRC emacs-lisp -n :async :results verbatim code
("translate" ("make analogy about" "topic") "from" "to")
#+END_SRC

** Projecting the code back to the starting LM is possible
- Semantic search on existing documents
- Semantic search on existing functions in emacs

** Language models encode holographic representations of software
It's important to avoid mixing training data of varying licenses when training LMs. One risk is that in the future, as holographic representations of software are used more in place of running original source code (i.e. as LMs are used more to simulate software), a software's hologram is more likely to be used in ways that violate the original license or the spirit of the license. LMs bring with them understanding of the way software is used, and also an understanding of the inspiration that went into designing that software. The issue is that this is all automated and right now new software companies are staking their future on LMs and using said models to their fullest.
Therefore, the inexorable conclusion is that software that has been used to train these models will be used holographically, perhaps even more than in its original form. The holographic representation, which encodes the value of the software (the way it is used, as opposed to the way it is written), is what matters more, and that is what is being exploited.

If the original code of a piece of free software was part of the training data of a NN alongside software under other, conflicting licenses, then that effectively relicenses the software without consent, going forward into the future.

*** Generating parts of emacs with GPT-3
I am able to generate parts of GPL-protected software using LMs, and can query the LMs as to how that software is used. Therefore, the software exists now in the latent space of a language model in the form of a hologram, within and without the source code.

Language models encode contrived associations made between different pieces of software in order to create an accurate model that is useful for simulation, code generation, code understanding and modelling the usage of software.

- The holographic representation

*** =0.9 / 1= is still stealing

** Counter arguments
*** It's not imaginary, it's just... English? more like, stochastic programming?
Imaginary programming is more of an activity and a style of programming, and is not really concerned with the amount of uncertainty. Your code might take a trip through someone else's LM along the way and be projected back to your own. That means that some of the logic is completely obscured and you have to make assumptions.

You may collaborate on a user interface or program with others, and since that code can't be fully understood by one person because of the veil, you are compelled to imagine in order to create something useful. A person must build their own interface from the pure imaginary functions that are shared.

It's a paradigm completely made up, so it's useful as far as it's useful.
All this is based on the idea that we will have many finetuned and completely different transformer models and we must learn to communicate.

The NeverEnding Story also influenced my thoughts. Once everyone stops believing in Fantasia it ceases to exist, as does the utility of applications built in pure imaginary code.

^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-30 3:20 ` Shane Mulligan @ 2021-07-30 6:55 ` Jean Louis 0 siblings, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-30 6:55 UTC (permalink / raw) To: Shane Mulligan; +Cc: Eli Zaretskii, emacs-tangents, Stefan Kangas, rms * Shane Mulligan <mullikine@gmail.com> [2021-07-30 06:20]: > Hey guys. > > In the last week I have been writing a thesis for Imaginary Programming, > which aims to make all of this clear and formalised. > > I am very sorry if I have sounded frustrated, but I think that this is so > important for free software and a GPL-4 may be required to protect people, > but also that Copilot and OpenAI's Codex and GPT-3 models infringe upon the > spirit of GPT-3 code. > > I will attach the thesis into this email. > > https://github.com/semiosis/imaginary-programming-thesis/blob/master/thesis.org Please try to see if you can help on this: FSF-funded call for white papers on philosophical and legal questions around Copilot https://www.fsf.org/blogs/licensing/fsf-funded-call-for-white-papers-on-philosophical-and-legal-questions-around-copilot -- Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-23 12:47 ` Jean Louis 2021-07-23 13:39 ` Shane Mulligan @ 2021-07-23 19:33 ` Eli Zaretskii 2021-07-24 3:07 ` Jean Louis 1 sibling, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-23 19:33 UTC (permalink / raw) To: Jean Louis; +Cc: stefan, emacs-tangents, mullikine, rms > Date: Fri, 23 Jul 2021 15:47:21 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: mullikine@gmail.com, stefan@marxist.se, emacs-tangents@gnu.org, > rms@gnu.org > > * Eli Zaretskii <eliz@gnu.org> [2021-07-23 14:51]: > > > According to online reviews chunks of code is copied even verbatim and > > > people find from where. > > > > That cannot be true. It is nonsense to copy unrelated code into a > > program and tell people this is what they should use. > > I wonder how sure you are in that, did you do the online research? It > is not about related or unrelated, I do believe that AI finds and > generates related code. But > > Here are references disputing how "it cannot be true": You take everything you read in these blogs for granted? Did you actually see the original code which these allude to? > > > If code compiles or not is irrelevant. If one runs it or not is also > > > irrelevant, code need not even run. > > > > A feature or service that is based on this idea will never fly, > > believe me. Which program would want to have code pasted into his/her > > program that would cause compilation errors or, worse, break it at run > > time? > > Of course people want code to fun. Just that copyright laws don't > handle technical functionality. It is irrelevant if program works or > does not work. For copyright purposes, it doesn't. But for the programmer who uses the code it very much does. So if these services give them code that doesn't work they will not use it, and eventually will stop using the services. > > Once again, your assumptions are all wrong, so your conclusions are > > also wrong. 
Why not try one of these services and see what they > > actually do, before you pass your (quite harsh) judgment on them, and > > on the modern state of AI in general? > > I can hear you how I am wrong, conclusions are wrong, though I gave > you references enough to research it on Internet that will tell that > there are possible serious licensing problems with such generated > code. See above. > Question is very particular, specific and concrete: > ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ > > How does Pen.el and background AI services ensure of licensing > compliance? > > I would appreciate if you find solution to that or stay on that > subject, as if I am wrong or right is not relevant, what I wish is to > have assurance that it is free software. Prove me wrong by providing > exact references in not only on country's law but also other > countries' laws, the lows that make it legal, or how otherwise the > legality of such code is justified and how users may get free > software. I'm sorry, but I don't work for you. If you have problems with using code from these services, then the onus is on you to do the research and make up your own mind. The discussion here is not about the code these services give their users, it's whether and how Emacs can make use of those services. Emacs allows the user to write proprietary code, and there's no legal issues when the user does that. Emacs also allows the user to copy someone else's code without permission, and that's not a problem for Emacs when the user does that. > As long as you don't tackle those subjects there is no legal solution > for Pen.el and background AI to be used with assurance that software > is truly free software. You confuse "free software" with "software being used to write free programs". They are not the same. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-23 19:33 ` Eli Zaretskii @ 2021-07-24 3:07 ` Jean Louis 2021-07-24 7:32 ` Eli Zaretskii 2021-07-25 1:09 ` Richard Stallman 0 siblings, 2 replies; 75+ messages in thread From: Jean Louis @ 2021-07-24 3:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms * Eli Zaretskii <eliz@gnu.org> [2021-07-23 22:34]: > > Here are references disputing how "it cannot be true": > > You take everything you read in these blogs for granted? Did you > actually see the original code which these allude to? In the case of Copilot, GitHub admits: https://docs.github.com/en/github/copilot/research-recitation "This investigation demonstrates that GitHub Copilot can quote a body of code verbatim, but that it rarely does so, and when it does, it mostly quotes code that everybody quotes, and mostly at the beginning of a file, as if to break the ice." And the fact that it is "rare" does not make it less of a problem for copyright purposes, as the new author cannot know which part of the code is a "rare" verbatim copy. > > Question is very particular, specific and concrete: > > ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ > > > > How does Pen.el and background AI services ensure of licensing > > compliance? > > > > I would appreciate if you find solution to that or stay on that > > subject, as if I am wrong or right is not relevant, what I wish is to > > have assurance that it is free software. Prove me wrong by providing > > exact references in not only on country's law but also other > > countries' laws, the lows that make it legal, or how otherwise the > > legality of such code is justified and how users may get free > > software. > > I'm sorry, but I don't work for you. 
If you have problems with using > code from these services, then the onus is on you to do the research > and make up your own mind. The discussion here is not about the code > these services give their users, it's whether and how Emacs can make > use of those services. Emacs allows the user to write proprietary > code, and there's no legal issues when the user does that. Emacs also > allows the user to copy someone else's code without permission, and > that's not a problem for Emacs when the user does that. If you don't wish to correspond, don't, you are free. I have never said nor implied "you work for me" and I cannot see how that is relevant to the question. If you participate in the discussion and respond to my question relating to licensing compliance, then provide a reference justifying its legality. Or simply say you don't have such. Your employment is neither the subject of my question nor relevant. I am not a user of proprietary software and I don't consider options of writing proprietary software. Nor am I participating in the discussion to foster ideas of creating proprietary software. I am a free software user and for that specific case I am interested in how the licensing issue is solved. However, my question is at least answered by my online research as I have already found the references: 1. Julia Reda's reference; and 2. 
OpenAI_RFC-84-FR-58141.pdf https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf Conclusions are: - legal justifications exists for US jurisdiction as the companies providing the AI are strong enough to find their ways, they are playing on the card as given in references above; as somebody already said, I doubt they would use "fair use" doctrine if the AI would be trained on proprietary software such as Windows; - conflict is serious and it is out there among the people and remains unsolved; AI has been trained on GPL and other free software and is used by corporations to generate new code without attributions; people complain that it is misuse of intentions of authors; - overall international legal situation is thus unclear, especially considering that free software spans the whole world, not just the US jurisdiction, as what may work within US is not same among all jurisdictions; > > As long as you don't tackle those subjects there is no legal solution > > for Pen.el and background AI to be used with assurance that software > > is truly free software. > > You confuse "free software" with "software being used to write free > programs". They are not the same. Maybe I have expressed myself in such way as not to get the point understood. It must be so, as I have finally found the first legal references myself. For Pen.el I have never made any relevance to legality question I made, and I have the pen.el repository over here and license is clear. Never mentioned it. I have not made reference to "software being used to write free programs" as a server side service I did not tackle that, it is most probably proprietary software or some versions could be free software. But that is not relevant. What is at hand is: ━━━━━━━━━━━━━━━━━━━ 1. There is pool of GPL and other free software which authors expect compliance to their licenses; 2. Large corporation is trying to use "fair use" doctrine on the pool of software to create a service; 3. 
Service generates new software, sometimes duplicating verbatim code; The question, which still remains largely unsolved, is how authors who use newly generated code can be sure that the generated software is free software and complies with the GPL and other free software licenses. Conclusion as of 2021-07-24 is that authors cannot be sure as there are legal uncertainties. Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 3:07 ` Jean Louis @ 2021-07-24 7:32 ` Eli Zaretskii 2021-07-24 7:54 ` Jean Louis 2021-07-25 1:09 ` Richard Stallman 1 sibling, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 7:32 UTC (permalink / raw) To: Jean Louis; +Cc: stefan, emacs-tangents, mullikine, rms > Date: Sat, 24 Jul 2021 06:07:18 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: mullikine@gmail.com, stefan@marxist.se, emacs-tangents@gnu.org, > rms@gnu.org > > * Eli Zaretskii <eliz@gnu.org> [2021-07-23 22:34]: > > > Here are references disputing how "it cannot be true": > > > > You take everything you read in these blogs for granted? Did you > > actually see the original code which these allude to? > > In case of Copilot, Github admits: > https://docs.github.com/en/github/copilot/research-recitation Yes, and there are cases of real code stealing out there. The only thing that proves is that some mistaken or dishonest operators can do this. > > > I would appreciate if you find solution to that or stay on that > > > subject, as if I am wrong or right is not relevant, what I wish is to > > > have assurance that it is free software. Prove me wrong by providing > > > exact references in not only on country's law but also other > > > countries' laws, the lows that make it legal, or how otherwise the > > > legality of such code is justified and how users may get free > > > software. > > > > I'm sorry, but I don't work for you. If you have problems with using > > code from these services, then the onus is on you to do the research > > and make up your own mind. The discussion here is not about the code > > these services give their users, it's whether and how Emacs can make > > use of those services. Emacs allows the user to write proprietary > > code, and there's no legal issues when the user does that. 
Emacs also > > allows the user to copy someone else's code without permission, and > > that's not a problem for Emacs when the user does that. > > If you don't wish to correspond, don't, you are free. > > I have never said nor implied "you work for me" and I cannot see how > is that relevant to the question. You consistently take the stance that implies, and many times explicitly states, that (a) you represent the views of the GNU project, and (b) the GNU project should or should not do this and that. Then, when people like me object, you demand that they prove something to you, or else. But no one here is under any obligation of proving anything to you, and your views and opinions (which are quite radical, I must say) are your own and no one else's. They are your own responsibility, and if you want them to be proven or dis-proven, you should do that yourself. > If you participate in discussion and respond to my question relating > to licensing compliance, then provide a reference justifying its > legality. Or simply say you don't have such. Your employment is not > subject of my question nor relevant. I could ask you to do the same. You never provided any reference justifying the legality, just a lot of blogs that spread FUD (whose motivation, which many times is struggle against Free Software, I described in my previous message). If you demand something of your correspondents, please live up to the same high standards, or stop demanding that others do. Quoting a random selection of blog postings is NOT research and does NOT justify anything, except that the issue is being "discussed" by some people. It doesn't even mean that those discussions are serious, let alone that whoever posts those opinions doesn't have an agenda. > I am not user of proprietary software and I don't consider options of > writing proprietary software. Neither I am participating in discussion > to foster ideas of creation of proprietary software. 
> > I am free software user and for that specific case I am interested how > the licensing issue is solved. You are free to do whatever you like in your work; that is your prerogative and no one else's. But here we discuss what the Emacs project should or should not do about this technology, not your private decisions. > 2. OpenAI_RFC-84-FR-58141.pdf > https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf > > Conclusions are: > > - legal justifications exists for US jurisdiction as the companies > providing the AI are strong enough to find their ways, they are > playing on the card as given in references above; as somebody > already said, I doubt they would use "fair use" doctrine if the AI > would be trained on proprietary software such as Windows; > > - conflict is serious and it is out there among the people and remains > unsolved; AI has been trained on GPL and other free software and is > used by corporations to generate new code without attributions; > people complain that it is misuse of intentions of authors; > > - overall international legal situation is thus unclear, especially > considering that free software spans the whole world, not just the > US jurisdiction, as what may work within US is not same among all > jurisdictions; That's not what the above document concludes. Quote: Conclusion We submit that: I. Under current law, training AI systems constitutes fair use. II. Policy considerations underlying fair use doctrine support the finding that training AI systems constitute fair use. III. Nevertheless, legal uncertainty on the copyright implications of training AI systems imposes substantial costs on AI developers and so should be authoritatively resolved. > Conclusion as of 2021-07-24 is that authors cannot be sure as there > are legal uncertainties. Those are your personal conclusions. 
They don't follow, and sometimes directly contradict, the references you yourself posted (sometimes so much so that I wonder whether we really read the same text). My opinions differ substantially, for the reasons I explained above and in other messages. (Full disclosure: part of my daytime job is development of sophisticated AI-based algorithms that use machine learning technologies for various practical purposes, including analysis of "natural language" text.) ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 7:32 ` Eli Zaretskii @ 2021-07-24 7:54 ` Jean Louis 2021-07-24 8:50 ` Eli Zaretskii 0 siblings, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-24 7:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms * Eli Zaretskii <eliz@gnu.org> [2021-07-24 10:33]: > > I have never said nor implied "you work for me" and I cannot see how > > is that relevant to the question. > > You consistently take the stance that implies, and many times > explicitly states, that (a) you represent the views of the GNU > project I have never in my life said so, you please stop with it. > and (b) the GNU project should or should not do this and that. I have always been following the GNU project as such because I agree with its principles and guidance as published on the GNU website. I will definitely compare issues at hand with the already well known GNU guidelines just as you and other people do. For example when there is recommendation of proprietary software I will say that GNU project does not endorse such. > Then, when people like me object, you demand that they prove > something to you, or else. The issues are totally unrelated. I am sorry for your misunderstandings. I was interested to find out how is legality solved when re-using the code generated by AI. Nothing else beyond that. And I have found references and made conclusions. > > If you participate in discussion and respond to my question relating > > to licensing compliance, then provide a reference justifying its > > legality. Or simply say you don't have such. Your employment is not > > subject of my question nor relevant. > > I could ask you to do the same. You never provided any reference > justifying the legality, just a lot of blogs that spread FUD (whose > motivation, which many times is struggle against Free Software, I > described in my previous message). Take it easy. 
I asked simple question, refined the question, and found references, made conclusions of legality justification in US jurisdiction and that there are unsolved global problem. There is no need to expand discussion into directions which is purely irrelevant to the question about licensing compliance. > > 2. OpenAI_RFC-84-FR-58141.pdf > > https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf > > > > Conclusions are: > > > > - legal justifications exists for US jurisdiction as the companies > > providing the AI are strong enough to find their ways, they are > > playing on the card as given in references above; as somebody > > already said, I doubt they would use "fair use" doctrine if the AI > > would be trained on proprietary software such as Windows; > > > > - conflict is serious and it is out there among the people and remains > > unsolved; AI has been trained on GPL and other free software and is > > used by corporations to generate new code without attributions; > > people complain that it is misuse of intentions of authors; > > > > - overall international legal situation is thus unclear, especially > > considering that free software spans the whole world, not just the > > US jurisdiction, as what may work within US is not same among all > > jurisdictions; > > That's not what the above document concludes. Quote: My conclusions are not what document conclude. I never said so. The document is related exclusively to US jurisdiction and its final determination is vague. It is their proposal, and not a court decision. They have openly said that they are financially strong enough and will try to defend any cases by using "fair use" doctrine in the US. This however, does not solve all jurisdictions. "fair use" doctrine is also not finally solved in the US. > Conclusion > > We submit that: > > I. Under current law, training AI systems constitutes fair use. 
That is their opinion, as that is a corporation that has the strength to submit such document and of course that they found some legal defense. If anybody starts complaining it will be a court case that will give final judgments. > II. Policy considerations underlying fair use doctrine support the > finding that training AI systems constitute fair use. > > III. Nevertheless, legal uncertainty on the copyright implications > of training AI systems imposes substantial costs on AI developers > and so should be authoritatively resolved. Everything from that document relates to US jurisdiction only. It is one-sided thus biased document, clearly opposing the views of many GPL authors. It is the corporation's defense argument for ripping off the GPL software. They found the way and wish to play on that card. It is document of conflict, not document of friendship or collaboration. It is document of one-sided defense, not a document that contributes to free software. It is document that defends proprietary software, not document that fosters free software. Thus it does not resolve anything in the community. It serves to one party only. Would that corporation release all software as free software that would bring or make a new leap forward. They are making one big step backwards. One can see that by number of aware free software developers canceling the GitHub accounts. > > Conclusion as of 2021-07-24 is that authors cannot be sure as there > > are legal uncertainties. > > Those are your personal conclusions. Personal definitely, but not as the only one with the same opinion, which should be clear from references which were probably left unread. Legality of free software on the planet was ensured by the GPL license. Maybe the license was never planned to be international, but it does function well internationally. Legality of AI generated code and "free use" doctrine in the US is at this point of development yet far from functioning well internationally. 
> (Full disclosure: part of my daytime job is development of > sophisticated AI-based algorithms that use machine learning > technologies for various practical purposes, including analysis of > "natural language" text.) That is great, it is technical part of software. Keep doing. Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 7:54 ` Jean Louis @ 2021-07-24 8:50 ` Eli Zaretskii 2021-07-24 16:16 ` Jean Louis 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 8:50 UTC (permalink / raw) To: Jean Louis; +Cc: stefan, emacs-tangents, mullikine, rms > Date: Sat, 24 Jul 2021 10:54:07 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: mullikine@gmail.com, stefan@marxist.se, emacs-tangents@gnu.org, > rms@gnu.org > > * Eli Zaretskii <eliz@gnu.org> [2021-07-24 10:33]: > > > I have never said nor implied "you work for me" and I cannot see how > > > is that relevant to the question. > > > > You consistently take the stance that implies, and many times > > explicitly states, that (a) you represent the views of the GNU > > project > > I have never in my life said so, you please stop with it. Please re-read your postings. They say otherwise. I realize that you didn't intend that, but that's how your words sound, here and elsewhere. If you want to avoid such interpretation, please take great care to tone down your categorical statements, use qualifiers like IMO and AFAIK and "I think", and generally make sure your words say that's your opinion, not an absolute truth, let alone something the GNU project decided to do or is doing. > For example when there is recommendation of proprietary software I > will say that GNU project does not endorse such. Please consider adding "AFAIK" or somesuch, otherwise this sounds like you are speaking for the project. > I was interested to find out how is legality solved when re-using the > code generated by AI. Nothing else beyond that. And I have found > references and made conclusions. And I challenged your conclusions. Nothing beyond that. > > That's not what the above document concludes. Quote: > > My conclusions are not what document conclude. I never said so. You didn't say otherwise, either. 
Someone who didn't have time to read the document could think that it's what the document concluded. That's why I posted the actual quotation, so that people could draw their own conclusions, or decide they do want to read the document itself. > The document is related exclusively to US jurisdiction and its final > determination is vague. It is their proposal, and not a court > decision. They have openly said that they are financially strong > enough and will try to defend any cases by using "fair use" doctrine > in the US. All true, but the same can be said about all the other posts and blogs you quoted. Which leaves us none the wiser about the problem. We still need to assess the original facts and data and make our own conclusions. I did, and my conclusions are starkly different from yours. > Everything from that document relates to US jurisdiction only. It is > one-sided thus biased document, clearly opposing the views of many GPL > authors. > > It is the corporations defense argument for ripping off the GPL > software. They found the way and wish to play on that card. > > It is document of conflict, not document of friendship or > collaboration. It is document of one-sidede defense, not a document > that contributes to free software. > > It is document that defend proprietary software, not document that > fosters free software. > > Thus it does not resolve anything in the community. It serves to one > party only. And yet you draw conclusions from it about how GNU and Emacs should behave about this technology? How does that make sense? > > > Conclusion as of 2021-07-24 is that authors cannot be sure as there > > > are legal uncertainties. > > > > Those are your personal conclusions. > > Personal definitely, but not as the only one with the same opinion, How does that make any difference? Should GNU and Emacs take the fact that several people expressed this opinion as meaning it is the truth for our purposes? 
Especially since some (perhaps many) of them are driven by motivation that is explicitly anti-Free Software? > which should be clear from references which left probably unread. Please don't assume I didn't read those references. You have no basis for making such nasty assumptions about me, let alone expressing them publicly here. > Legality of free software on the planet was ensured by the GPL > license. Maybe the license was never planned to be international, but > it does function well internationally. > > Legality of AI generated code and "free use" doctrine in the US is at > this point of development yet far from functioning well > internationally. Which, to me, says that we should carefully examine this issue by ourselves, not draw any premature conclusions from the hoop-la out there. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 8:50 ` Eli Zaretskii @ 2021-07-24 16:16 ` Jean Louis 2021-07-24 16:44 ` Eli Zaretskii 0 siblings, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-24 16:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms * Eli Zaretskii <eliz@gnu.org> [2021-07-24 11:51]: > Please re-read your postings. They say otherwise. I realize that you > didn't intend that, but that's how your words sound, here and > elsewhere. If you want to avoid such interpretation, please take > great care to tone down your categorical statements, use qualifiers > like IMO and AFAIK and "I think", and generally make sure your words > say that's your opinion, not an absolute truth, let alone something > the GNU project decided to do or is doing. > > > For example when there is recommendation of proprietary software I > > will say that GNU project does not endorse such. > Please consider adding "AFAIK" or somesuch, otherwise this sounds like > you are speaking for the project. Thanks, but no, all opinions are private. What is official for GNU project is on the GNU website, in the GNU manuals and other documentation. Though, that is not the subject of this thread. You have said your opinion though you did not mention not even one legal reference to the questions about licensing compliance that I have mentioned. So I keep it as your differing opinion though without references to legalities I do not find it relevant. Julia Reda's opinion I find relevant, and I found it online. I wish I could find it as answer on this mailing list sooner than online, but it is not so. > And yet you draw conclusions from it about how GNU and Emacs should > behave about this technology? How does that make sense? I never said so, that is misunderstanding. Once again, my question related GPL licensing compliance is relevant to GNU project, and to GPL licensed software authors. Licensing is legality. 
It is not related to technological parts. I have never mentioned technology and how GNU and Emacs should behave. In the thread of Pen.el subject I wanted to find out how is compliance to licenses solved. Don't make fuss about the simple questions. Maybe is better to wait and let maybe somebody else jump in and answer it. You seem to personally chase me that I stop asking questions? It does not really seem welcoming, it seems like I did something bad to you and you are pushing with force to stop me asking such simple banal question. We talk about GPL licensing compliance for years in various GNU related discussion within GNU project and without GNU project. I was asking German companies about licensing compliance to GNU GPL software and had such a nice conversation with them and they agreed to comply to it, and provided sources. You please make it easy, as I am asking logical question, don't call me radical as that has negative connotations. > > Legality of AI generated code and "free use" doctrine in the US is at > > this point of development yet far from functioning well > > internationally. > > Which, to me, says that we should carefully examine this issue by > ourselves, not draw any premature conclusions from the hoop-la out > there. Remember that it was me who first responded to original poster and installed pen.el and tried to run it, at that time I did not have the OpenAI key, but now I have it. From that, it should be obvious that I am interested in the technology. Without even looking online (due to my limited Internet) I have asked about licensing compliance, there was no answer until I found it today from online sources. That question is related to adopting the technology, not to rejecting it. If you wish to adopt anything into private use one has to have permissions, or in this case "fair use" exemption granted by US government. 
It should be obvious that I have referenced legal advisors, the attorneys who made that document, including Julia Reda, known as an activist in Germany, and it was me who found the references and listed them here. Besides those really deficient expressions, if you have constructive references on how each jurisdiction would accept "fair use", let me know; otherwise leave this discussion, and me, in peace. Stay on subject, and don't call me names, as you and I didn't graze the sheep together. Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 16:16 ` Jean Louis @ 2021-07-24 16:44 ` Eli Zaretskii 2021-07-24 18:01 ` Jean Louis 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 16:44 UTC (permalink / raw) To: Jean Louis; +Cc: stefan, emacs-tangents, mullikine, rms > Date: Sat, 24 Jul 2021 19:16:34 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: mullikine@gmail.com, stefan@marxist.se, emacs-tangents@gnu.org, > rms@gnu.org > > > > For example when there is recommendation of proprietary software I > > > will say that GNU project does not endorse such. > > > Please consider adding "AFAIK" or somesuch, otherwise this sounds like > > you are speaking for the project. > > Thanks, but no, all opinions are private. Again, yours don't sound private, and many people will become confused at best. > You have said your opinion though you did not mention not even one > legal reference to the questions about licensing compliance that I > have mentioned. So I keep it as your differing opinion though without > references to legalities I do not find it relevant. This is a public list. It is important for me to state my differing opinions so that people could make up their own minds. > You seem to personally chase me that I stop asking questions? No. I'm trying to correct the wrong impression your postings could have made on people reading them. > It does not really seem welcoming, it seems like I did something bad > to you and you are pushing with force to stop me asking such simple > banal question. I said exactly what I think was wrong with your postings. I wish you would change the style and wording when you speak on these issues, but I have no real control on what you will do. > You please make it easy, as I am asking logical question, don't call > me radical as that has negative connotations. I explained in detail why I said that. I'm sorry to conclude that you disregard that. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 16:44 ` Eli Zaretskii @ 2021-07-24 18:01 ` Jean Louis 0 siblings, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-24 18:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms Eli, my question is solved: the subject was "Pen.el", and I found the legal justification offered by third parties. I would like to try Pen.el, but I have technical problems here. Thanks for the suggestions, I will think about them. Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 3:07 ` Jean Louis 2021-07-24 7:32 ` Eli Zaretskii @ 2021-07-25 1:09 ` Richard Stallman 1 sibling, 0 replies; 75+ messages in thread From: Richard Stallman @ 2021-07-25 1:09 UTC (permalink / raw) To: Jean Louis; +Cc: stefan, eliz, mullikine, emacs-tangents [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > And fact that it is "rare" does not make it a less problem for > copyright purposes as the new author cannot know which part of the > code has used "rare" verbatim. I think that is correct. Falling into this pit may be unusual, but that doesn't make it painless. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-23 11:32 ` Jean Louis 2021-07-23 11:51 ` Eli Zaretskii @ 2021-07-24 1:14 ` Richard Stallman 2021-07-24 2:10 ` Shane Mulligan 2021-07-24 6:49 ` Eli Zaretskii 1 sibling, 2 replies; 75+ messages in thread From: Richard Stallman @ 2021-07-24 1:14 UTC (permalink / raw) To: Jean Louis; +Cc: stefan, eliz, mullikine, emacs-tangents [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > That's not what happens with these services: they don't _copy_ code > > from other software (that won't work, because the probability of the > > variables being called by other names is 100%, and thus such code, if > > pasted into your program, will not compile). What they do, they > > extract ideas and algorithms from those other places, and express them > > in terms of your variables and your data types. So licenses are not > > relevant here. > According to online reviews chunks of code is copied even verbatim and > people find from where. Even if modified, it still requires licensing > compliance. From what I have read, it seems that the behavior of copilot runs on a spectrum from the first description to the second description. I expect that in many cases, nothing copyrightable has been copied, but in some cases copilot does copy a substantial amount from a copyrighted work. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 1:14 ` Richard Stallman @ 2021-07-24 2:10 ` Shane Mulligan 2021-07-24 2:34 ` Shane Mulligan 2021-07-24 6:49 ` Eli Zaretskii 1 sibling, 1 reply; 75+ messages in thread From: Shane Mulligan @ 2021-07-24 2:10 UTC (permalink / raw) To: rms; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, Jean Louis It's a bit like whitewashing because it's reconstructing generatively by finding artificial/contrived associations between different works that the author had not intended but may have been part of their inspiration, and it compresses the information based on these associations. It's a bit like running a lossy 'zip' on the internet and then decompressing probabilistically. When run deterministically (set the temperature of GPT to 0), you may actually see 'snippets' from various places, every time, with the same input generating the same snippets. So the source material is important. What GitHub did was very, very bad but they did it anyway. That doesn't mean GPT is bad, it just means they zipped up content they should not have and created this language 'index' (or 'codex', as they call it). What they really should do, if they are honest people, is train separate models on subsets of GitHub code grouped by licence, and release each model under the same licence as its training code. Shane Mulligan ^ permalink raw reply [flat|nested] 75+ messages in thread
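Note on the "temperature of GPT to 0" remark above: a minimal sketch of why that setting makes output repeatable. This is not the GPT implementation; the function name `sample_token` and the toy scores in `logits` are hypothetical, chosen only to illustrate temperature-scaled sampling. At temperature 0 the highest-scoring token is always chosen, so the same input yields the same output on every run; at higher temperatures the choice is random.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores using temperature scaling."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Divide scores by the temperature, then apply a numerically stable softmax.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    weights = [math.exp(score - peak) for score in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [2.0, 1.5, 0.3]  # hypothetical scores for three candidate tokens
rng = random.Random(0)

greedy = [sample_token(logits, 0, rng) for _ in range(5)]
sampled = [sample_token(logits, 1.0, rng) for _ in range(5)]

print(greedy)   # always [0, 0, 0, 0, 0]
print(sampled)  # varies with the random seed
```

If a model has memorised a passage, greedy decoding would reproduce it every time for the same prompt, which is consistent with the behaviour described in the message.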
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 2:10 ` Shane Mulligan @ 2021-07-24 2:34 ` Shane Mulligan 2021-07-24 3:14 ` Shane Mulligan 0 siblings, 1 reply; 75+ messages in thread From: Shane Mulligan @ 2021-07-24 2:34 UTC (permalink / raw) To: rms; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, Jean Louis This is why the technology is a bit like a personal Google search or Stack Overflow, which you can store offline because it's an index of the internet that is capable of reconstruction. But it's not limited to code generation. Codex is nothing. Emacs + GPT would carve a large piece out of M$. Codex is a model trained for the purpose of generating code, but GPT models will become abundant for all tasks, including image and audio synthesis and understanding. Emacs is a complete operating system. VSCode is geared towards programming. Emacs can do infinitely more things with GPT than VSCode can because it's holistic. Even the 'eliza' in emacs can pass the Turing test with GPT. GPT can run sequences of commands in emacs to automate entire workflows with natural language. But the future is in collaborative GPT. The basis/base truth would become versions of LMs or ontologies. Right now that's EleutherAI. Shane Mulligan ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 2:34 ` Shane Mulligan @ 2021-07-24 3:14 ` Shane Mulligan 0 siblings, 0 replies; 75+ messages in thread From: Shane Mulligan @ 2021-07-24 3:14 UTC (permalink / raw) To: rms; +Cc: Eli Zaretskii, Stefan Kangas, emacs-tangents, Jean Louis Proprietary code from within the M$ ecosystem is uninspired and bad code by comparison. Open source code is the gold mine so M$ will not like being told they cannot use open source to compile codex. It's a complete r*pe of open source. GPT is trained on public language and language belongs to people generally, not some select group. It's not meant to be a tool for controlling people. GPT is literally the soul of a billion people and should be public domain and not feared by GNU but instead rescued. Sorry for the rhetoric! -- Shane Mulligan ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 1:14 ` Richard Stallman 2021-07-24 2:10 ` Shane Mulligan @ 2021-07-24 6:49 ` Eli Zaretskii 2021-07-24 7:33 ` Jean Louis 2021-07-24 7:41 ` Philip Kaludercic 1 sibling, 2 replies; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 6:49 UTC (permalink / raw) To: rms; +Cc: mullikine, emacs-tangents, stefan, bugs > From: Richard Stallman <rms@gnu.org> > Date: Fri, 23 Jul 2021 21:14:23 -0400 > Cc: stefan@marxist.se, eliz@gnu.org, mullikine@gmail.com, > emacs-tangents@gnu.org > > > > That's not what happens with these services: they don't _copy_ code > > > from other software (that won't work, because the probability of the > > > variables being called by other names is 100%, and thus such code, if > > > pasted into your program, will not compile). What they do, they > > > extract ideas and algorithms from those other places, and express them > > > in terms of your variables and your data types. So licenses are not > > > relevant here. > > > According to online reviews chunks of code is copied even verbatim and > > people find from where. Even if modified, it still requires licensing > > compliance. > > From what I have read, it seems that the behavior of copilot runs on a > spectrum from the first description to the second description. I > expect that in many cases, nothing copyrightable has been copied, but > in some cases copilot does copy a substantial amount from a > copyrighted work. It cannot be a verbatim copy, because at least the variables, and sometimes also the data types, need to be renamed. Whether the result is still under the original copyright cannot be established without actually comparing the two versions of the code. So any general flat rejection of the idea of these services on these grounds is not serious, IMO. 
Of course, someone like Jean will not use any code until a bunch of lawyers submit an official opinion about the legal implications, but IMO that's a radical view that doesn't make a lot of sense, especially since none of the code accessible openly via the net can be proprietary, for obvious reasons. Jean could do whatever he personally likes, but his radical views don't necessarily bind the GNU project in general and Emacs in particular. Moreover, ironically Jean bases his views on opinions and issues expressed by clear opponents of Free Software. The strongest drive behind many of these blogs' aversion from these services is the fear that GPL-licensed code creeps into proprietary software produced by enterprises and their software subcontractors, because that would require them to make the sources available or at least put them at a risk of lawsuits. It is a well-known fact that most, if not all, software contracts for proprietary software nowadays include explicit prohibition of using GPL-licensed code in the product. It is those people that serve these contracts and enterprises who drive the whoop-la about licensing issues in code offered by these AI-based services. So before embracing their FUD and biased opinions, I really suggest to actually look at the code, compare it with the original, and make an independent assessment of both whether it's a "copy" from the copyright POV and of the licenses of the original code. ^ permalink raw reply [flat|nested] 75+ messages in thread
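Note on "compare it with the original": a minimal sketch of one way such a comparison could be started when identifiers have been renamed. It assumes both snippets are Python source; the helper names `normalized_tokens` and `similarity` and the sample snippets are hypothetical. Each identifier is replaced with a positional placeholder before the token sequences are compared, so a pure renaming scores as identical. The score measures textual similarity only and says nothing about whether something is a "copy" in the copyright sense.

```python
import difflib
import io
import keyword
import tokenize

# Token types that carry layout or comments rather than program structure.
SKIPPED = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
           tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def normalized_tokens(source):
    """Tokenize Python source, replacing each identifier with a placeholder."""
    names = {}
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # The first distinct identifier becomes ID0, the next ID1, and so on.
            tokens.append(names.setdefault(tok.string, f"ID{len(names)}"))
        else:
            tokens.append(tok.string)
    return tokens


def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized token streams."""
    matcher = difflib.SequenceMatcher(None, normalized_tokens(a),
                                      normalized_tokens(b))
    return matcher.ratio()


original = "def total(items):\n    s = 0\n    for x in items:\n        s += x\n    return s\n"
renamed = "def acc(values):\n    r = 0\n    for v in values:\n        r += v\n    return r\n"
different = "def greet(name):\n    print('hello', name)\n"

print(similarity(original, renamed))    # 1.0: identical apart from names
print(similarity(original, different))  # lower: different structure
```

A high score would only flag a pair for closer manual review; the licence of the original and the amount of copyrightable expression involved would still need to be assessed separately.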
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 6:49 ` Eli Zaretskii @ 2021-07-24 7:33 ` Jean Louis 2021-07-24 8:10 ` Eli Zaretskii 2021-07-24 7:41 ` Philip Kaludercic 1 sibling, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-24 7:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mullikine, emacs-tangents, stefan, rms Eli, I do take care of licensing when re-using somebody's software, and when publishing software or distributing it. There is nothing "radical" about it. Concerns of other people are also not radical. Intention of authors is not respected even if there is legal circumvention in the US such as "fair use", that does not fly in other jurisdictions. I do understand you have some unsolved issues or something you cannot handle related to licensing as you are more for technical side, but please don't call it "radical" as that does not teach people about GPL licensing. Jean ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 7:33 ` Jean Louis @ 2021-07-24 8:10 ` Eli Zaretskii 2021-07-24 8:21 ` Jean Louis 2021-07-24 8:35 ` Jean Louis 0 siblings, 2 replies; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 8:10 UTC (permalink / raw) To: Jean Louis; +Cc: mullikine, emacs-tangents, stefan, rms > Date: Sat, 24 Jul 2021 10:33:57 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: rms@gnu.org, stefan@marxist.se, mullikine@gmail.com, > emacs-tangents@gnu.org > > > Eli, I do take care of licensing when re-using somebody's software, > and when publishing software or distributing it. > > There is nothing "radical" about it. Considering the license of the code is not radical, indeed. But the criteria you personally apply when considering that _are_ radical. You posted enough opinions about these matters to make that abundantly clear. There's nothing wrong with having such views, they are your personal views, and are entirely legitimate. All I'm saying is that the Emacs project should not be guided by such views, for the reasons I explained. > Concerns of other people are also not radical. No, but your interpretation of those "concerns" is. > Intention of authors is not respected even if there is legal > circumvention in the US such as "fair use", that does not fly in > other jurisdictions. So you agree that the problems you raised don't seem to exist at least in the US? > I do understand you have some unsolved issues or something you cannot > handle related to licensing No, I don't have any unsolved issues. > as you are more for technical side ??? What is that supposed to mean? > but please don't call it "radical" as that does not teach people > about GPL licensing. When I see a radical view, I call it "radical". Promoting Free Software requires healthy pragmatism, because we want the Free Software to flourish and remain relevant by picking up the advances in technology. 
Rejecting such new technologies just because there are some doubts expressed by someone in some blog is "radical", and IMO eventually detrimental to Free Software development. We should instead carefully and independently assess the issues and make our own judgment based on specific details of each such development. We cannot run away from every idea because some people say it might cause trouble in some cases. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 8:10 ` Eli Zaretskii @ 2021-07-24 8:21 ` Jean Louis 2021-07-24 8:35 ` Jean Louis 1 sibling, 0 replies; 75+ messages in thread From: Jean Louis @ 2021-07-24 8:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mullikine, emacs-tangents, stefan, rms * Eli Zaretskii <eliz@gnu.org> [2021-07-24 11:11]: > There's nothing wrong with having such views, they are your personal > views, and are entirely legitimate. All I'm saying is that the Emacs > project should not be guided by such views, for the reasons I > explained. My question has been resolved. It is over. Did you read it? ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 8:10 ` Eli Zaretskii 2021-07-24 8:21 ` Jean Louis @ 2021-07-24 8:35 ` Jean Louis 2021-07-24 8:59 ` Eli Zaretskii 1 sibling, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-24 8:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mullikine, emacs-tangents, stefan, rms * Eli Zaretskii <eliz@gnu.org> [2021-07-24 11:11]: > When I see a radical view, I call it "radical". Promoting Free > Software requires healthy pragmatism, because we want the Free > Software to flourish and remain relevant by picking up the advances in > technology. Rejecting such new technologies just because there's some > doubts expressed by someone in some blog is "radical", and IMO > eventually detrimental to Free Software development. We should > instead carefully and independently assess the issues and make our own > judgment based on specific details of each such development. We > cannot run away of every idea because some people say it might cause > trouble in some cases. I am totally for advances in technology as long as we foster free software and freedom in computing. We have too little AI today, in the 21st century. The question is definitely not as general as presented in your paragraph above. It is very specific: how to solve the licensing issues. There is no doubt that code may be copied verbatim; here is the official GitHub documentation related to Copilot: https://docs.github.com/en/github/copilot/research-recitation It is not related to various other AIs, etc. I am not sure if the same AI is even used in Pen.el. It may not be relevant. In the Copilot documentation it says: Quote: This investigation demonstrates that GitHub Copilot can quote a body of code verbatim, but that it rarely does so, and when it does, it mostly quotes code that everybody quotes, and mostly at the beginning of a file, as if to break the ice. 
Additionally, I have been using OpenAI and found that verbatim responses are not a mere 0.1 percent; I could find the pages on the Internet from which verbatim paragraphs were taken. I am still in the playground. I can get paragraphs from our competitors' websites as a response. I still have to discover the "I" in the AI in the playground of the OpenAI service. The licensing issues I have raised, and for which I have found a partial solution, are in no way about rejecting the technology, but rather about adopting it in free software. My question was how we can adopt the generated code into free software (for example by using Pen.el), as it generates code by using other GPL free software without attribution. It is partially resolved in the US, though unproven and in great conflict with authors. It does not give assurance. I am not sure whether the code I generate is really "original" and free of infringement. Is the OpenAI company giving me some kind of guarantee that I will be held free of liability if I use that code? Thus, while those issues may be temporarily brushed off with "fair use" in the US, they remain unsolved there until the first few court cases or a class action suit, and are not resolved at the international level at all. Julia Reda's statement does not apply in all jurisdictions. At this moment there is no verified legal statement by, let us say, FSF attorneys, legal experts or some other organization that would confirm the legal status of such generated code or text at the international level. Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 8:35 ` Jean Louis @ 2021-07-24 8:59 ` Eli Zaretskii 2021-07-24 16:18 ` Jean Louis 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 8:59 UTC (permalink / raw) To: Jean Louis; +Cc: mullikine, emacs-tangents, stefan, rms > Date: Sat, 24 Jul 2021 11:35:41 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: rms@gnu.org, stefan@marxist.se, mullikine@gmail.com, > emacs-tangents@gnu.org > > There are no doubts that code may be copied verbatim, as here is > authorized and official documentation by Github related to Copilot: > https://docs.github.com/en/github/copilot/research-recitation So we won't use that GitHub stuff. Which we won't anyway, because we avoid GitHub in general. How does this help us decide about the general usability of this technology? It doesn't. > At this moment there is no verified legal statement by let us say FSF > attorneys or legal experts or some other organization that will > confirm legal status of such generated code or text on international > level. I suggest leaving the legal stuff to the legal experts. We should assess the data and provide them with facts, not make the decisions for them. Which means this discussion, and your suggestions that we should already stay away from this technology, are premature at best, if not in the wrong place at the wrong time. So please don't discourage independent assessment of this technology by posting half-baked "legal" opinions from people with questionable motivation (present company excluded, of course) representing that this technology is legally incompatible with Free Software. IMO, we don't yet know enough for any such definitive opinions and conclusions. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 8:59 ` Eli Zaretskii @ 2021-07-24 16:18 ` Jean Louis 2021-07-24 16:45 ` Eli Zaretskii 0 siblings, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-24 16:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mullikine, emacs-tangents, stefan, rms * Eli Zaretskii <eliz@gnu.org> [2021-07-24 12:00]: > So please don't discourage independent assessment of this technology > by posting half-baked "legal" opinions from people with questionable > motivation (present company excluded, of course) representing that > this technology is legally incompatible with Free Software. IMO, we > don't yet know enough for any such definitive opinions and > conclusions. Quite contrary, the GPL licensing compliance question is related to adoption and expansion of free software. I have never stated it is incompatible with free software, I have asked how is GPL licensing compliance solved. Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 16:18 ` Jean Louis @ 2021-07-24 16:45 ` Eli Zaretskii 2021-07-24 17:57 ` Jean Louis 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 16:45 UTC (permalink / raw) To: Jean Louis; +Cc: mullikine, emacs-tangents, stefan, rms > Date: Sat, 24 Jul 2021 19:18:02 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: rms@gnu.org, stefan@marxist.se, mullikine@gmail.com, > emacs-tangents@gnu.org > > * Eli Zaretskii <eliz@gnu.org> [2021-07-24 12:00]: > > So please don't discourage independent assessment of this technology > > by posting half-baked "legal" opinions from people with questionable > > motivation (present company excluded, of course) representing that > > this technology is legally incompatible with Free Software. IMO, we > > don't yet know enough for any such definitive opinions and > > conclusions. > > Quite contrary, the GPL licensing compliance question is related to > adoption and expansion of free software. > > I have never stated it is incompatible with free software, I have > asked how is GPL licensing compliance solved. No, you said we shouldn't use this. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 16:45 ` Eli Zaretskii @ 2021-07-24 17:57 ` Jean Louis 2021-07-24 18:15 ` Eli Zaretskii 0 siblings, 1 reply; 75+ messages in thread From: Jean Louis @ 2021-07-24 17:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: mullikine, emacs-tangents, stefan, rms * Eli Zaretskii <eliz@gnu.org> [2021-07-24 19:46]: > > I have never stated it is incompatible with free software, I have > > asked how is GPL licensing compliance solved. > > No, you said we shouldn't use this. Sorry for misunderstandings. I don't remember ever saying that we shouldn't use this. That would not seem logical to me, as I remember that my intention was to find out how licensing compliance is solved, so that it becomes clear how it works. Question was directed to author of Pen.el and there was no clear answer neither from you, so I found myself from online research that at least in US for now it is based on "fair use" doctrine. When we take the word "fair" in its original definition, it should be obvious from online comments that many GPL authors do not really find it "fair". It is however one defense that all of present similar AI models have in common, they are to use "fair use" doctrine. We will see that. Also worth mentioning: AI as such is not tied to this particular case of using GPL and other free software. This is just one small application of artificial intelligence technology overall, and there are many other applications which are totally outside this context. Jean Take action in Free Software Foundation campaigns: https://www.fsf.org/campaigns In support of Richard M. Stallman https://stallmansupport.org/ ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 17:57 ` Jean Louis @ 2021-07-24 18:15 ` Eli Zaretskii 0 siblings, 0 replies; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 18:15 UTC (permalink / raw) To: Jean Louis; +Cc: mullikine, emacs-tangents, stefan, rms > Date: Sat, 24 Jul 2021 20:57:42 +0300 > From: Jean Louis <bugs@gnu.support> > Cc: rms@gnu.org, stefan@marxist.se, mullikine@gmail.com, > emacs-tangents@gnu.org > > * Eli Zaretskii <eliz@gnu.org> [2021-07-24 19:46]: > > > > I have never stated it is incompatible with free software, I have > > > asked how is GPL licensing compliance solved. > > > > No, you said we shouldn't use this. > > Sorry for misunderstandings. To avoid such misunderstandings, I suggest to tone down your language when you are talking about licensing issues associated with some technologies or products, so that what you write couldn't be interpreted as saying that there are legal problems which prevent our use of those technologies and products. > Question was directed to author of Pen.el and there was no clear > answer neither from you, so I found myself from online research that > at least in US for now it is based on "fair use" doctrine. > > When we take the word "fair" in its original definition, it should be > obvious from online comments that many GPL authors do not really find > it "fair". It is however one defense that all of present similar AI > models have in common, they are to use "fair use" doctrine. We will > see that. That is a separate issue, which is IMO completely unrelated. Emacs is Free Software, and is distributed under GPL, so for Emacs it is OK to allow users to use other GPL code out there in their programs. That there are producers of proprietary software who use pieces of GPL code in their proprietary products without complying with GPL is completely unrelated to what the Emacs project can do with this technology. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 6:49 ` Eli Zaretskii 2021-07-24 7:33 ` Jean Louis @ 2021-07-24 7:41 ` Philip Kaludercic 2021-07-24 7:59 ` Eli Zaretskii 1 sibling, 1 reply; 75+ messages in thread From: Philip Kaludercic @ 2021-07-24 7:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms, bugs Eli Zaretskii <eliz@gnu.org> writes: >> From: Richard Stallman <rms@gnu.org> >> Date: Fri, 23 Jul 2021 21:14:23 -0400 >> Cc: stefan@marxist.se, eliz@gnu.org, mullikine@gmail.com, >> emacs-tangents@gnu.org >> >> > > That's not what happens with these services: they don't _copy_ code >> > > from other software (that won't work, because the probability of the >> > > variables being called by other names is 100%, and thus such code, if >> > > pasted into your program, will not compile). What they do, they >> > > extract ideas and algorithms from those other places, and express them >> > > in terms of your variables and your data types. So licenses are not >> > > relevant here. >> >> > According to online reviews chunks of code is copied even verbatim and >> > people find from where. Even if modified, it still requires licensing >> > compliance. >> >> From what I have read, it seems that the behavior of copilot runs on a >> spectrum from the first description to the second description. I >> expect that in many cases, nothing copyrightable has been copied, but >> in some cases copilot does copy a substantial amount from a >> copyrighted work. > > It cannot be a verbatim copy, because at least the variables, and > sometimes also the data types, need to be renamed. Whether the result > is still under the original copyright cannot be established without > actually comparing the two versions of the code. So any general > flat rejection of the idea of these services on these grounds is not > serious, IMO. Not necessarily, if it generates a pure, top-level function. 
Someone could type something like "Sort list of postcodes" and it generates a Radix Sort function. And if this is part of some code that was copied a lot, the model might tend to generate this verbatim even more likely. Or that is at least my understanding. -- Philip Kaludercic ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 7:41 ` Philip Kaludercic @ 2021-07-24 7:59 ` Eli Zaretskii 2021-07-24 9:31 ` Philip Kaludercic 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 7:59 UTC (permalink / raw) To: Philip Kaludercic; +Cc: stefan, emacs-tangents, mullikine, rms, bugs > From: Philip Kaludercic <philipk@posteo.net> > Cc: rms@gnu.org, mullikine@gmail.com, emacs-tangents@gnu.org, > stefan@marxist.se, bugs@gnu.support > Date: Sat, 24 Jul 2021 07:41:21 +0000 > > > It cannot be a verbatim copy, because at least the variables, and > > sometimes also the data types, need to be renamed. Whether the result > > is still under the original copyright cannot be established without > > actually comparing the two versions of the code. So any general > > flat rejection of the idea of these services on these grounds is not > > serious, IMO. > > Not necessarily, if it generates a pure, top-level function. Someone > could type something like "Sort list of postcodes" and it generates a > Radix Sort function. And if this is part of some code that was copied a > lot, the model might tend to generate this verbatim even more likely. A sort function must state at least the data type before it can be compiled. And if you are talking about pseudo-code that is data-type agnostic, then that's an algorithm, and is not copyrightable, AFAIK. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 7:59 ` Eli Zaretskii @ 2021-07-24 9:31 ` Philip Kaludercic 2021-07-24 11:19 ` Eli Zaretskii 0 siblings, 1 reply; 75+ messages in thread From: Philip Kaludercic @ 2021-07-24 9:31 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms, bugs Eli Zaretskii <eliz@gnu.org> writes: >> From: Philip Kaludercic <philipk@posteo.net> >> Cc: rms@gnu.org, mullikine@gmail.com, emacs-tangents@gnu.org, >> stefan@marxist.se, bugs@gnu.support >> Date: Sat, 24 Jul 2021 07:41:21 +0000 >> >> > It cannot be a verbatim copy, because at least the variables, and >> > sometimes also the data types, need to be renamed. Whether the result >> > is still under the original copyright cannot be established without >> > actually comparing the two versions of the code. So any general >> > flat rejection of the idea of these services on these grounds is not >> > serious, IMO. >> >> Not necessarily, if it generates a pure, top-level function. Someone >> could type something like "Sort list of postcodes" and it generates a >> Radix Sort function. And if this is part of some code that was copied a >> lot, the model might tend to generate this verbatim even more likely. > > A sort function must state at least the data type before it can be > compiled. And if you are talking about pseudo-code that is data-type > agnostic, then that's an algorithm, and is not copyrightable, AFAIK. No, I was thinking about concrete code, that depending on the language might even just rely on the standard library, especially if the language has generics. Seeing how often SO code has been found in random repositories[0], I don't think it is improbable that the trained models might notice these patterns. [0] For example https://programming.guide/worlds-most-copied-so-snippet.html -- Philip Kaludercic ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 9:31 ` Philip Kaludercic @ 2021-07-24 11:19 ` Eli Zaretskii 2021-07-24 14:16 ` Philip Kaludercic 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 11:19 UTC (permalink / raw) To: Philip Kaludercic; +Cc: stefan, emacs-tangents, mullikine, rms, bugs > From: Philip Kaludercic <philipk@posteo.net> > Cc: rms@gnu.org, mullikine@gmail.com, emacs-tangents@gnu.org, > stefan@marxist.se, bugs@gnu.support > Date: Sat, 24 Jul 2021 09:31:38 +0000 > > >> Not necessarily, if it generates a pure, top-level function. Someone > >> could type something like "Sort list of postcodes" and it generates a > >> Radix Sort function. And if this is part of some code that was copied a > >> lot, the model might tend to generate this verbatim even more likely. > > > > A sort function must state at least the data type before it can be > > compiled. And if you are talking about pseudo-code that is data-type > > agnostic, then that's an algorithm, and is not copyrightable, AFAIK. > > No, I was thinking about concrete code, that depending on the language > might even just rely on the standard library, especially if the language > has generics. Seeing how often SO code has been found in random > repositories[0], I don't think it is improbable that the trained models > might notice these patterns. Sorry, I don't understand what you have in mind. Can you show an example of useful code that could be copied verbatim into a program without at least some renaming, without breaking the program? ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 11:19 ` Eli Zaretskii @ 2021-07-24 14:16 ` Philip Kaludercic 2021-07-24 14:37 ` Eli Zaretskii 0 siblings, 1 reply; 75+ messages in thread From: Philip Kaludercic @ 2021-07-24 14:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms, bugs Eli Zaretskii <eliz@gnu.org> writes: >> > A sort function must state at least the data type before it can be >> > compiled. And if you are talking about pseudo-code that is data-type >> > agnostic, then that's an algorithm, and is not copyrightable, AFAIK. >> >> No, I was thinking about concrete code, that depending on the language >> might even just rely on the standard library, especially if the language >> has generics. Seeing how often SO code has been found in random >> repositories[0], I don't think it is improbable that the trained models >> might notice these patterns. > > Sorry, I don't understand what you have in mind. Can you show an > example of useful code that could be copied verbatim into a program > without at least some renaming, without breaking the program? To take the example from the article I mentioned above

    public static String humanReadableByteCount(long bytes, boolean si) {
        int unit = si ? 1000 : 1024;
        if (bytes < unit) return bytes + " B";
        int exp = (int) (Math.log(bytes) / Math.log(unit));
        String pre = (si ? "kMGTPE" : "KMGTPE").charAt(exp-1) + (si ? "" : "i");
        return String.format("%.1f %sB", bytes / Math.pow(unit, exp), pre);
    }

can be copied into a Java program, and assuming that there is no other method called humanReadableByteCount in the same class, it should compile and run without renaming or re-typing. CoPilot might generate this from a comment like,

    // Convert a byte count to a human-readable string

since it is mentioned over 6000 times on GitHub (and this method even has a bug, as the article explains -- but that is a totally different issue).
-- Philip Kaludercic ^ permalink raw reply [flat|nested] 75+ messages in thread
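The claim that the snippet compiles and runs unchanged is easy to try out. In the sketch below the method body is copied from the message as-is; the wrapper class `ByteCountDemo`, the `main` method, and the sample inputs are assumptions added for illustration. The last call is chosen to expose the rounding problem the message alludes to, as far as it can be reconstructed from the snippet itself.

```java
import java.util.Locale;

class ByteCountDemo {

    // Copied verbatim from the message: no identifier or type was changed.
    public static String humanReadableByteCount(long bytes, boolean si) {
        int unit = si ? 1000 : 1024;
        if (bytes < unit) return bytes + " B";
        int exp = (int) (Math.log(bytes) / Math.log(unit));
        String pre = (si ? "kMGTPE" : "KMGTPE").charAt(exp-1) + (si ? "" : "i");
        return String.format("%.1f %sB", bytes / Math.pow(unit, exp), pre);
    }

    public static void main(String[] args) {
        // String.format uses the default locale; pin it so the decimal separator is '.'.
        Locale.setDefault(Locale.ROOT);

        System.out.println(humanReadableByteCount(512, false));     // 512 B
        System.out.println(humanReadableByteCount(1024, false));    // 1.0 KiB
        // 999,999 bytes is still in the "k" range, but %.1f rounds 999.999 up,
        // so the result reads "1000.0 kB" rather than "1.0 MB".
        System.out.println(humanReadableByteCount(999_999, true));  // 1000.0 kB
    }
}
```

Nothing outside `java.lang` and `java.util` is needed, which is what makes a snippet like this so easy to paste from one project into another.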
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 14:16 ` Philip Kaludercic @ 2021-07-24 14:37 ` Eli Zaretskii 2021-07-24 14:49 ` Philip Kaludercic 0 siblings, 1 reply; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 14:37 UTC (permalink / raw) To: Philip Kaludercic; +Cc: stefan, emacs-tangents, mullikine, rms, bugs > From: Philip Kaludercic <philipk@posteo.net> > Cc: rms@gnu.org, mullikine@gmail.com, emacs-tangents@gnu.org, > stefan@marxist.se, bugs@gnu.support > Date: Sat, 24 Jul 2021 14:16:55 +0000 > > > Sorry, I don't understand what you have in mind. Can you show an > > example of useful code that could be copied verbatim into a program > > without at least some renaming, without breaking the program? > > To take the example from the article I mentioned above > > public static String humanReadableByteCount(long bytes, boolean si) { > int unit = si ? 1000 : 1024; > if (bytes < unit) return bytes + " B"; > int exp = (int) (Math.log(bytes) / Math.log(unit)); > String pre = (si ? "kMGTPE" : "KMGTPE").charAt(exp-1) + (si ? "" : "i"); > return String.format("%.1f %sB", bytes / Math.pow(unit, exp), pre); > } > > can be copied into a Java program, and assuming that there is no other > method called humanReadableByteCount in the same class, it should > compile and run without renaming or re-typing. How would one know it's 'long' and not some other data type? > CoPilot might generate this from a comment like, > > // Convert a byte count to a human-readable string > > since it is mentioned over 6000 times on GitHub (and this method even > has a bug, as the article explains -- but that is a totally different > issue). That's not how AI works: it doesn't just count the number of times something is mentioned. That usually leads to unsatisfactory results. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 14:37 ` Eli Zaretskii @ 2021-07-24 14:49 ` Philip Kaludercic 2021-07-24 15:13 ` Eli Zaretskii 0 siblings, 1 reply; 75+ messages in thread From: Philip Kaludercic @ 2021-07-24 14:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stefan, emacs-tangents, mullikine, rms, bugs Eli Zaretskii <eliz@gnu.org> writes: >> From: Philip Kaludercic <philipk@posteo.net> >> Cc: rms@gnu.org, mullikine@gmail.com, emacs-tangents@gnu.org, >> stefan@marxist.se, bugs@gnu.support >> Date: Sat, 24 Jul 2021 14:16:55 +0000 >> >> > Sorry, I don't understand what you have in mind. Can you show an >> > example of useful code that could be copied verbatim into a program >> > without at least some renaming, without breaking the program? >> >> To take the example from the article I mentioned above >> >> public static String humanReadableByteCount(long bytes, boolean si) { >> int unit = si ? 1000 : 1024; >> if (bytes < unit) return bytes + " B"; >> int exp = (int) (Math.log(bytes) / Math.log(unit)); >> String pre = (si ? "kMGTPE" : "KMGTPE").charAt(exp-1) + (si ? "" : "i"); >> return String.format("%.1f %sB", bytes / Math.pow(unit, exp), pre); >> } >> >> can be copied into a Java program, and assuming that there is no other >> method called humanReadableByteCount in the same class, it should >> compile and run without renaming or re-typing. > > How would one know it's 'long' and not some other data type? I am not sure what you mean? "long" makes sense here because Java will automatically up-cast any other type to fit. >> CoPilot might generate this from a comment like, >> >> // Convert a byte count to a human-readable string >> >> since it is mentioned over 6000 times on GitHub (and this method even >> has a bug, as the article explains -- but that is a totally different >> issue). > > That's not how AI works: it doesn't just count the number of times > something is mentioned. That usually leads to unsatisfactory results. 
Of course, that would be oversimplifying. At the same time, if the training samples have common patterns, a model is more likely to reproduce that behaviour. But since these are neural networks we are talking about, it is hard to determine causality to begin with, which probably makes the whole situation even more difficult (speaking as a non-lawyer). -- Philip Kaludercic ^ permalink raw reply [flat|nested] 75+ messages in thread
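The point about `long` can be illustrated with Java's widening primitive conversion. This is a minimal sketch, with a hypothetical `describe` method standing in for any method that declares a `long` parameter: arguments of the narrower integral types are widened automatically, so callers need not rename or re-type anything. It holds only for `byte`, `short`, `char` and `int`; a `float` or `double` argument would be rejected by the compiler.

```java
class WideningDemo {

    // Stand-in for any method that declares a long parameter.
    static String describe(long bytes) {
        return bytes + " B";
    }

    public static void main(String[] args) {
        byte b = 100;
        short s = 20_000;
        char c = 'A';            // widens to its code point, 65
        int i = 2_000_000_000;

        // Each call compiles without a cast: the argument is widened to long.
        System.out.println(describe(b));   // 100 B
        System.out.println(describe(s));   // 20000 B
        System.out.println(describe(c));   // 65 B
        System.out.println(describe(i));   // 2000000000 B

        // describe(1.5) would not compile: double to long is a narrowing conversion.
    }
}
```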
* Re: Help building Pen.el (GPT for emacs) 2021-07-24 14:49 ` Philip Kaludercic @ 2021-07-24 15:13 ` Eli Zaretskii 0 siblings, 0 replies; 75+ messages in thread From: Eli Zaretskii @ 2021-07-24 15:13 UTC (permalink / raw) To: Philip Kaludercic; +Cc: stefan, emacs-tangents, mullikine, rms, bugs > From: Philip Kaludercic <philipk@posteo.net> > Cc: rms@gnu.org, mullikine@gmail.com, emacs-tangents@gnu.org, > stefan@marxist.se, bugs@gnu.support > Date: Sat, 24 Jul 2021 14:49:02 +0000 > > > How would one know it's 'long' and not some other data type? > > I am not sure what you mean? "long" makes sense here because Java will > automatically up-cast any other type to fit. So you came up with perhaps the single example that exists in the whole world where the issues I mentioned _might_ not matter, and even that only under some assumptions. A feature that aspires to be generally useful cannot possibly depend on such problematic assumptions. > >> since it is mentioned over 6000 times on GitHub (and this method even > >> has a bug, as the article explains -- but that is a totally different > >> issue). > > > > That's not how AI works: it doesn't just count the number of times > > something is mentioned. That usually leads to unsatisfactory results. > > Of course, that would be oversimplifying. At the same time, if the > training samples have common patterns, a model is more likely to > reproduce that behaviour. No, that's not it: a single example repeated in identical form many times doesn't reinforce the learned pattern. You need many similar, but different code samples, and most probably in different languages. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-23 6:51 ` Shane Mulligan 2021-07-23 10:12 ` Jean Louis @ 2021-07-25 1:06 ` Richard Stallman 1 sibling, 0 replies; 75+ messages in thread From: Richard Stallman @ 2021-07-25 1:06 UTC (permalink / raw) To: Shane Mulligan; +Cc: eliz, stefan, emacs-tangents, bugs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > GPT is potentially the best thing to happen to emacs in a very long time. If GPT-3 were released as a free program, we might want to use it. Perhaps it would be very useful. Please correct me if I am mistaken, but I think GPT-3 is an unreleased program which people can use only via SaaSS. SaaSS stands for Service as a Software Substitute. It means that a "service" accepts your data, does a specific computing job, and sends you back the results. Using such a service is morally mostly equivalent to running a nonfree program -- so we cannot suggest that anyone DO that. See https://gnu.org/philosophy/who-does-that-server-really-serve.html for more explanation of this issue. > The way this will work is you will download > the free GPT model, such as GPT-j, GPT-neo or > GPT-neox and then you will have an offline and > private alternative to many things previously > you would go online for. Are you saying there is a free replacement for GPT-3 and we can run these free models with it on our own computers? That could be good news, because we could actually use it. > It will bring back power from the corporations and save it to your > computer, That sounds exciting but it is not concrete enough to think about. open source and transparent, What follows is a side issue, but it's an important side issue. "Open source" is the slogan of a campaign we don't advocate. It is partly similar to the free software movement but discards the moral foundation: the idea of freedom. 
We don't use the slogan "open source" because we want to advocate freedom, not forget it. See https://gnu.org/philosophy/open-source-misses-the-point.html for more explanation of the difference between free software and open source. See also https://thebaffler.com/salvos/the-meme-hustler for Evgeny Morozov's article on the same point. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 21:02 ` Shane Mulligan 2021-07-18 5:38 ` Jean Louis 2021-07-18 5:38 ` Jean Louis @ 2021-07-18 6:42 ` Eli Zaretskii 2 siblings, 0 replies; 75+ messages in thread From: Eli Zaretskii @ 2021-07-18 6:42 UTC (permalink / raw) To: Shane Mulligan; +Cc: emacs-devel > From: Shane Mulligan <mullikine@gmail.com> > Date: Sun, 18 Jul 2021 09:02:17 +1200 > Cc: rms@gnu.org, Stefan Kangas <stefan@marxist.se>, Emacs developers <emacs-devel@gnu.org> > > The following is why emacs needs open-source prompts -- ones that don't learn from you or are sold to you > > - Ones that you write for yourself. > - An open-source prompts melpa at the very least! > > As I tried to describe before, it's a fundamentally new way of programming. An extension of Donald Knuth's > literate programming becoming imaginary programming, but being hijacked by microsoft. > > Microsft GPT is an attack on the innermost workings of emacs -- the text stream. So embracing the > OpenSource alternatives from EleutherAI is crucial. > > I have said enough. I leave you with this article. > > https://venturebeat.com/2021/07/16/openai-disbands-its-robotics-research-team/ We have a special mailing list for such Emacs "tangents": emacs-tangents@gnu.org. Please post this stuff there, and reserve posting to this list for stuff that directly pertains to Emacs development. For example, with the issues you raise here, if you have specific suggestions to provide such capabilities in Emacs, describing that would be appropriate. By contrast, articles about AI-related developments out there, and who purchased whom and for how much, is not appropriate. Please help us keeping the signal-to-noise ratio of this list as high as possible. TIA ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 9:27 ` Shane Mulligan 2021-07-17 21:02 ` Shane Mulligan @ 2021-07-17 21:35 ` Juri Linkov 1 sibling, 0 replies; 75+ messages in thread From: Juri Linkov @ 2021-07-17 21:35 UTC (permalink / raw) To: Shane Mulligan; +Cc: Eli Zaretskii, Stefan Kangas, rms, Emacs developers > "any number of useful features where it could help." > Name an emacs package and I can explain how GPT will affect that package. > For `dired-git-info-mode`, for instance, a model connected to GPT can > explain what files are for. Can it help to write git commit messages from diffs? Has anyone tried to train a model on the existing git commit logs? This would be a killer feature. ^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: Help building Pen.el (GPT for emacs) 2021-07-17 2:36 ` Shane Mulligan 2021-07-17 9:01 ` Eli Zaretskii @ 2021-07-17 23:53 ` Richard Stallman 1 sibling, 0 replies; 75+ messages in thread From: Richard Stallman @ 2021-07-17 23:53 UTC (permalink / raw) To: Shane Mulligan; +Cc: stefan, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I think the end-goal should be to have a close collaboration with > EleutherAI, who already have an open-source alternative to the Copilot > model. It's called GPT-j. Perhaps that is a good path, but we need to know more. You've described this as an "open-source alternative". Most of the time, when developers describe their work as "open-source", it is because they don't share our ethical principles and criteria. For instance, most open-source programs are free/libre, but there are exceptions. So we need to look at what they are developing to see what software it consists of, and whether it consists entirely of free software that can be included in a version of the GNU system. See https://gnu.org/philosophy/open-source-misses-the-point.html. > The problem is that there are very few people within EleutherAI using emacs > and few people who can help. Assuming they are heading for the right goal, it would be great for you to offer your help, and I'm sure they will find plenty of people here who would like to do that. If they could make their system keep track of licenses from which text was obtained, they could implement something to avoid one of the pitfalls of Copilot: that it gives you a substantial amount of code that is substantially similar to one copyright work and thus leads you unknowing into infringement. 
-- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 75+ messages in thread
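The suggestion above, that a system could keep track of the licenses of the text it was built from, might look roughly like the following. This is a deliberately naive sketch and not something EleutherAI, Copilot, or Pen.el is known to implement: the class `LicenseIndex`, its methods, the source URL, and the license string are all hypothetical, and it recognizes only exact matches after whitespace normalization, whereas a usable tool would need fuzzy matching against a very large corpus.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical index from known snippets to the license they were published under. */
class LicenseIndex {

    /** Where a snippet came from and under which license (values are illustrative). */
    static final class Provenance {
        final String source;
        final String license;

        Provenance(String source, String license) {
            this.source = source;
            this.license = license;
        }
    }

    private final Map<String, Provenance> byNormalizedText = new HashMap<>();

    /** Collapses runs of whitespace so that reindented copies still match. */
    static String normalize(String code) {
        return code.trim().replaceAll("\\s+", " ");
    }

    void register(String code, String source, String license) {
        byNormalizedText.put(normalize(code), new Provenance(source, license));
    }

    /** Returns the recorded provenance if the generated text matches a known snippet. */
    Optional<Provenance> lookup(String generated) {
        return Optional.ofNullable(byNormalizedText.get(normalize(generated)));
    }

    public static void main(String[] args) {
        LicenseIndex index = new LicenseIndex();
        index.register("int unit = si ? 1000 : 1024;",
                       "https://example.org/some-project", "GPL-3.0-or-later");

        // Same code, different indentation: still recognized.
        Optional<Provenance> hit = index.lookup("  int unit  = si ? 1000 : 1024;  ");
        System.out.println(hit.map(p -> p.license + " from " + p.source)
                              .orElse("no known source"));
    }
}
```

With such a lookup in place, a front end could at least warn the user and show the license before inserting generated code that matches a known work.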