From: Andrew Hyatt
Newsgroups: gmane.emacs.devel
Subject: Re: LLM Experiments, Part 1: Corrections
Date: Mon, 22 Jan 2024 20:52:18 -0400
To: "T.V Raman"
Cc: Sergey Kostyaev, emacs-devel@gnu.org
References: <2BA188C7-3886-49F6-A916-6220BD9BA77D@gmail.com>
In-Reply-To: (T. V. Raman's message of "Mon, 22 Jan 2024 14:06:00 -0800")

On 22 January 2024 14:06, "T.V Raman" wrote:

> Some more related thoughts below, mostly thinking aloud:
>
> 1. From using gptel and ellama against the same model, I see
>    different style responses, and that kind of inconsistency would
>    be good to get a handle on; LLMs are difficult enough to figure
>    out re what they're doing without this additional variation.

Is this keeping the prompt and temperature constant? There's
inconsistency, though, even when keeping everything constant, due to
the randomness of the LLM. I often get very different results. For
example, to make the demo I shared, I had to run it about 5 times,
because it would either do things too well (no need to demo
corrections) or not well enough (for example, it wouldn't follow my
orders to put everything in one paragraph).

> 2. Package LLM has the laudable goal of bridging between models
>    and front-ends, and this is going to be vital.
>
> 3. (1,2) above lead to the following question:
>
> 4. Can we write down a list of common configuration vars --- here
>    common across the model axis. Make it a union of all such
>    params.
I think the list of common model-and-prompt configuration should
already be in the llm package, but we will probably need to keep
expanding it.

> 5. Next, write down a list of all configurable params on the UI
>    side.

This will change quite a bit depending on the task. It's unclear how
much should be configurable. For example, in the demo, I have ediff so
the user can see and evaluate the diff. But maybe that should be
configurable, so that a user who wants to see just a diff output
instead can do so? When I was thinking about a state machine, I was
thinking that parts of it might be overridable by the user: for
example, "have the user check the results of the operation" is a state
for which the user can just define their own function. I suspect we'll
have a better idea of this after a few more demos.

> 6. When stable, define a single data-structure in elisp that acts
>    as the bridge between the front-end emacs UI and the LLM module.

If I understand you correctly, this would be the configuration you
listed in your points (4) and (5)?

> 7. Finally, factor out the settings of that structure and make it
>    possible to create "profiles" so that one can predictably
>    experiment across front-ends and models.

I like this idea, thanks!
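
To make points 4 through 7 a bit more concrete, here is a rough
sketch of what such a structure might look like. Everything in it is
hypothetical: the `my-llm-` names, the slots, and the two example
profiles are made up for illustration, and none of it is part of the
llm package. It only shows one struct holding the model-side params
and the UI-side state functions, with a user-overridable "review"
state and named profiles.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch only; no name below exists in the llm package.
(require 'cl-lib)

(cl-defstruct (my-llm-profile (:constructor my-llm-profile-create))
  "Settings bridging a front-end and a model (points 4 to 6)."
  name
  ;; Point 4: model-side settings, as a plist such as (:temperature 0.2).
  (model-params nil)
  ;; Point 5: UI-side settings, as an alist of (STATE . FUNCTION)
  ;; letting the user override individual states of the flow.
  (ui-handlers nil))

(defun my-llm-review-accept (_old new)
  "Default review state: accept NEW without asking the user."
  new)

(defun my-llm-review-keep-old (old _new)
  "Alternative review state: reject the change and keep OLD."
  old)

(defvar my-llm-default-handlers
  '((review . my-llm-review-accept))
  "Function used for each state when a profile does not override it.")

(defun my-llm-handler (profile state)
  "Return PROFILE's function for STATE, falling back to the default."
  (or (alist-get state (my-llm-profile-ui-handlers profile))
      (alist-get state my-llm-default-handlers)))

(defun my-llm-run-correction (profile old new)
  "Run the review state of PROFILE on OLD and the proposed NEW text."
  (funcall (my-llm-handler profile 'review) old new))

;; Point 7: named profiles, so the same settings can be reused when
;; comparing front-ends and models.
(defvar my-llm-profiles
  (list (my-llm-profile-create
         :name "careful"
         :model-params '(:temperature 0.0)
         :ui-handlers '((review . my-llm-review-keep-old)))
        (my-llm-profile-create
         :name "quick"
         :model-params '(:temperature 0.7))))

(defun my-llm-find-profile (name)
  "Return the profile called NAME, or nil if there is none."
  (cl-find name my-llm-profiles
           :key #'my-llm-profile-name :test #'string=))
```

With this, (my-llm-run-correction (my-llm-find-profile "quick")
"a" "b") goes through the default review function, while the
"careful" profile substitutes its own. A real review function would
presumably open ediff or show a diff instead of just returning a
string.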