From: Shane Mulligan
Newsgroups: gmane.emacs.tangents
Subject: Re: Help building Pen.el (GPT for emacs)
Date: Fri, 23 Jul 2021 18:51:19 +1200
To: Jean Louis
Cc: Eli Zaretskii, emacs-tangents@gnu.org, Stefan Kangas, rms@gnu.org
Hi Jean and GNU friends,

GPT is potentially the best thing to happen to emacs in a very long time. It will take power back from the corporations and return it to your computer: open source, transparent, and offline.

Please consider including a collaborative, open source prompts repository in the next version of emacs.

So far I have yet to see anything like it, but I see commercial products everywhere that already claim full domain over this new type of code.
I am trying to build up relationships in my project Pen.el with others who value open source; gptprompts.org, for example. This is to create a catalogue for pen.el. One thing we have just introduced is a field to specify a licence for each prompt. However, I must say that prompts are more like functions. *Soft prompts* are very granular prompts, as they have been reduced to a minimal number of characters using optimisation:

https://arxiv.org/abs/2104.08691

Therefore, in my opinion there must be support for prompting at the syntax level of emacs. It is also clear now that, since a prompt looks more like binary code, a new type of function definition and a new type of programming is emerging.

A prompt function is a function defined by a version of a Language Model (LM) and a prompt (input). As in Haskell, every function may be reduced to one that takes a single input and returns a single output, but in practice most prompt functions will be parameterised and have an arity greater than one.
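
To make that concrete, here is a minimal sketch of what such a parameterised prompt function could look like in emacs lisp. This is not Pen.el's actual API; =my-lm-complete= is a hypothetical stand-in for whatever backend actually queries the model:

  ;; Hypothetical sketch, not Pen.el code: a prompt function pairs a
  ;; fixed model version with a prompt template.
  (defun my-lm-complete (model prompt)
    "Placeholder backend: send PROMPT to MODEL, return the completion."
    ;; A real backend would query a local GPT-Neo/GPT-J server or an API.
    (format "[completion of %S from %s]" prompt model))

  (defun my-translate (text from-lang to-lang)
    "A prompt function of arity 3, closed over one model version."
    (my-lm-complete
     "gpt-j-6b"
     (format "Translate the following %s text into %s:\n%s\nTranslation:"
             from-lang to-lang text)))

  ;; (my-translate "Bonjour le monde" "French" "English")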

I am building a collaborative imaginary programming environment in emacs. This is an editing environment where people can integrate LMs into emacs, extending emacs with prompt functions. The power of this is profound and beyond belief.
I have coined the term "prompt functions", so don't expect to be able to find it online if you go searching.

Here is a new corporation which is creating a prompt engineering environment. However, they do not have their own operating system to integrate prompting into; that's why emacs is potentially years ahead. A prompt is merely a function with a language model as a parameter. Without integration, it's quite useless.

https://gpt3demo.com/apps/mantiumai
I think a prompts database -- something like Datomic or other RDF-like, immutable storage -- must be added to the GNU organisation to store selected prompts and generations, and a GPL or EleutherAI GPT model should ultimately be integrated into core emacs via some low-level syntax, through a partnership with EleutherAI. I would expect in the future to download emacs along with an open-source GPT model, and to be able to create prompt functions as easily as creating macros.
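
To illustrate "as easily as creating macros", one could imagine a defining form along these lines. This is purely a sketch of the idea; =my-defprompt= is hypothetical and reuses the =my-lm-complete= stand-in from above, not anything shipped by Pen.el or GNU ELPA:

  ;; Hypothetical defining form for prompt functions, analogous to `defmacro'.
  (defmacro my-defprompt (name args template &optional model)
    "Define NAME as a prompt function taking ARGS, built from TEMPLATE."
    `(defun ,name ,args
       (my-lm-complete (or ,model "gpt-neo-2.7b")
                       (format ,template ,@args))))

  (my-defprompt my-summarise (text)
    "Summarise the following text in one sentence:\n%s\nSummary:")
  ;; `my-summarise' is now callable like any other elisp function.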

A 1:1 prompt:function database of sorts is a good starting point in my opinion, but remembering the generations is also important. The scale, however, is immense; this is why a p2p database that can remember immutably is important, in my opinion. If this seems too grand in scale, then at the very least consider a GNU prompts repository.
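
As a sketch of what one record in such a store could look like (the shape below is my own assumption, not an existing schema), each generation could be kept as an append-only, content-addressed entry:

  ;; Hypothetical record shape for an append-only prompts/generations store.
  (defun my-prompt-record (prompt params model output)
    "Return an immutable, content-addressed record of one generation."
    (let ((body (list :prompt prompt
                      :params params
                      :model model
                      :output output
                      :time (format-time-string "%FT%T%z"))))
      (cons (secure-hash 'sha256 (prin1-to-string body)) body)))

  ;; (my-prompt-record "Translate %s into French" '("hello") "gpt-j-6b" "bonjour")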

> Sounds like a replacement for a programmer's mind.
Yes, it is. It trivialises the implementation and requires that programmers now be more imaginative; they will be supported by the language model. Rather than writing an implementation, a function is defined by its input types, a Language Model, and the version of that model.

> Where is definition of the abbreviation NLP?
NLP stands for Natural Language Processing. Until recently, code was not considered part of that domain, but the truth is NLP algorithms are extremely useful for code generation, code search and code understanding.

> What is definition of the abbreviation LM?
LM stands for Language Model. It is a statistical model of language, as opposed to one based on formal grammars. Emacs lisp functions and macros do not have a syntax for stochastic/probabilistic programming.
> Good, but is there a video to show what it really does?
Here is an online catalogue of GPT tools. Pen.el is among the developer tools.
https://gpt3demo.com/category/developer-tools

=Pen.el= and emacs have the potential to do all the things for all of the products in =gpt3demo.com=.

I would like to demonstrate Pen.el with this particular video, which I have created to demonstrate a new type of programming -- collaborative, within a language model.
https://mullikine.github.io/posts/caching-and-saving-results-of-prompt-functions-in-pen-el/
https://asciinema.org/a/MhOU0eMnJsRpXf2Ak9YStPlz8
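
In essence, the caching shown there can be thought of as memoisation keyed on the model and the fully rendered prompt. A rough sketch, using the hypothetical =my-lm-complete= from earlier rather than the actual code in the demo:

  ;; Hypothetical memoisation of prompt-function results.
  (defvar my-prompt-cache (make-hash-table :test #'equal)
    "Maps (MODEL . PROMPT) keys to previously returned completions.")

  (defun my-lm-complete-cached (model prompt)
    "Like `my-lm-complete', but reuse a cached result when one exists."
    (let ((key (cons model prompt)))
      (or (gethash key my-prompt-cache)
          (puthash key (my-lm-complete model prompt) my-prompt-cache))))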

> Do you mean "exemplary" or "examplary", is it spelling mistake?
I am building a DSL for encoding prompt design patterns to generate prompt functions for emacs.
http://github.com/semiosis/pen.el/blob/master/src/pen-examplary.el
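
To give a flavour of what an example-oriented task description might look like, here is a hypothetical shape expressed as plain elisp data; the real DSL in pen-examplary.el may differ:

  ;; Hypothetical example-oriented task description: the task is specified
  ;; by a few input->output examples plus a template, not an implementation.
  (defvar my-spelling-prompt
    '(:task "correct the spelling of a word"
      :examples (("recieve" . "receive")
                 ("seperate" . "separate"))
      :template "Misspelt: %s\nCorrect:")
    "An example-based description a tool could compile into a prompt function.")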

> Pen.el creates functions 1:1 for a prompt to an emacs lisp function.

What this means is that a prompt may be parameterized to define a relation (i.e. a function), and therefore code, and I have chosen to create one parameterized function per prompt.
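
Once a prompt is compiled into such a function, wiring it into ordinary emacs workflows is straightforward. A sketch, reusing the hypothetical =my-summarise= defined in the earlier example, of exposing one prompt function as a command over the region:

  ;; Hypothetical glue: expose a prompt function as an ordinary command.
  (defun my-summarise-region (beg end)
    "Replace the region with a one-sentence summary from the LM."
    (interactive "r")
    (let ((summary (my-summarise (buffer-substring-no-properties beg end))))
      (delete-region beg end)
      (insert summary)))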

The prompt text, once associated with an LM, becomes a type of query (i.e. code), so prompts should not be discounted as being anything less than that, and they qualify for the GPL3 license.

> I understand that it is kind of fetching information, but that does not solve licensing issues, it sounds like licensing hell.
This is exactly why a GPL or GPL-compatible LM is absolutely crucial and needs to be integrated; otherwise all imaginary code will be violating and harvesting open source for the foreseeable future, as there is no alternative.

Sincerely,
Shane

.

Shane Mulligan

How to contact me:
🇦🇺 00 61 421 641 250
🇳🇿 00 64 21 1462 759
mullikine@gmail.com



On Tue, Jul 20, 2021 at 5:04 AM Jean Louis <bugs@gnu.support> wrote:
* Shane Mulligan <mullikine@gmail.com> [2021-07-18 11:01]:
> Pen.el stands for Prompt Engineering in emacs.
> Prompt Engineering is the art of describing what you would
> like a language model (transformer) to do. It is a new type of programming,
> example oriented;
> like literate programming, but manifested automatically.

Sounds like a replacement for a programmer's mind.

> A transformer takes some text (called a prompt) and continues
> it. However, the continuation is the superset of all NLP tasks,

Where is definition of the abbreviation NLP?

> as the generation can also be a classification, for instance. Those
> NLP tasks extend beyond world languages and into programming
> languages (whatever has been 'indexed' or 'learned') from these
> large LMs.

What is definition of the abbreviation LM?

> Pen.el is an editing environment for designing 'prompts' to LMs. It
> is better than anything that exists, even at OpenAI or at
> Microsoft. I have been working on it and preparing for this for a
> long time.

Good, but is there a video to show what it really does?

> These prompts are example-based tasks. There are a number of design
> patterns which Pen.el is seeking to encode into a domain-specific
> language called 'examplary' for example-oriented programming.

Do you mean "exemplary" or "examplary", is it spelling mistake?

I have to ask as your description is still pretty abstract without
particular example.

> Pen.el creates functions 1:1 for a prompt to an emacs lisp function.
The above does not tell me anything.

> Emacs is Grammarly, Google Translate, Copilot, Stackoveflow and
> infinitely many other services all rolled into one and allows you to
> have a private parallel to all these services that is completely
> private and open source -- that is if you have downloaded the
> EleutherAI model locally.

I understand that it is kind of fetching information, but that does
not solve licensing issues, it sounds like licensing hell.

> ** Response to Jean Louis
> - And I do not think it should be in GNU ELPA due to above reasons.
>
> I am glad I have forewarned you guys. This is my current goal. Help
> in my project would be appreciated. I cannot do it alone and I
> cannot convince all of you.

Why don't you tell about licensing issues? Taking code without proper licensing compliance is IMHO, not an option. It sounds as problem
generator.

> > Why don't you simply make an Emacs package as .tar as described in Emacs
> Lisp manual?

> Thank you for taking a look at my emacs package. It's not ready yet
> for Melpa merge. I hope that I will be able to find some help in
> order to prepare it, but the rules are very strict and this may not
> happen.

I did not say to put it in Melpa. Package you can make for yourself
and users so that users can M-x package-install-file

That is really not related to any online Emacs package repository. It
is way how to install Emacs packages no matter where one gets it.

> > How does that solves the licensing problems?
> The current EleutherAI model which competes with GPT-3 is GPT-Neo.
> It is MIT licensed.

That is good.

But the code that is generated and injected requires proper
contribution.

> Also the data it has been trained on is MIT licensed.

Yes, and then the program should also solve the proper contributions
automatically. You cannot just say "MIT licensed", this has to be
proven, source has to be found and proper attributions applied.

Why don't you implement proper licensing?

Please find ONE license that you are using from code that is being
used as database for generation of future code and provide link to
it. Then show how is license complied to.

> The current EleutherAI model which competes with Codex is GPT-j.
> It is licensed with Apache-2.0 License

That is good, but I am referring to the generated code.

--
Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/