From: Eli Zaretskii
Newsgroups: gmane.emacs.tangents
List-Id: Emacs news and miscellaneous discussions outside the scope of other Emacs mailing lists
Subject: Re: Help building Pen.el (GPT for emacs)
Date: Sat, 24 Jul 2021 18:13:34 +0300
Message-ID: <83bl6rzqw1.fsf@gnu.org>
In-Reply-To: <871r7n22e9.fsf@posteo.net> (message from Philip Kaludercic on Sat, 24 Jul 2021 14:49:02 +0000)
To: Philip Kaludercic
Cc: stefan@marxist.se, emacs-tangents@gnu.org, mullikine@gmail.com, rms@gnu.org, bugs@gnu.support

> From: Philip Kaludercic
> Cc: rms@gnu.org, mullikine@gmail.com, emacs-tangents@gnu.org,
>  stefan@marxist.se, bugs@gnu.support
> Date: Sat, 24 Jul 2021 14:49:02 +0000
>
> > How would one know it's 'long' and not some other data type?
>
> I am not sure what you mean?  "long" makes sense here because Java will
> automatically up-cast any other type to fit.

So you came up with perhaps the single example that exists in the
whole world where the issues I mentioned _might_ not matter, and even
that only under some assumptions.  A feature that aspires to be
generally useful cannot possibly depend on such problematic
assumptions.

> >> since it is mentioned over 6000 times on GitHub (and this method even
> >> has a bug, as the article explains -- but that is a totally different
> >> issue).
> >
> > That's not how AI works: it doesn't just count the number of times
> > something is mentioned.  That usually leads to unsatisfactory results.
>
> Of course, that would be oversimplifying.  At the same time, if the
> training samples have common patterns, a model is more likely to
> reproduce that behaviour.

No, that's not it: a single example repeated in identical form many
times doesn't reinforce the learned pattern.  You need many similar,
but different code samples, and most probably in different languages.