From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Dmitry Gutov Newsgroups: gmane.emacs.devel Subject: Re: What's missing in ELisp that makes people want to use cl-lib? Date: Tue, 14 Nov 2023 13:47:32 +0200 Message-ID: <2580712a-8411-18e7-4cd7-d3a17ed4c50e@gutov.dev> References: <8734xetjkk.fsf@yahoo.com> <87cywhsrcf.fsf@yahoo.com> <87cywgx1z0.fsf@web.de> <83wmuowwp3.fsf@gnu.org> <8334xcwank.fsf@gnu.org> <320999cc-6c83-2315-0044-cc0403400af3@gutov.dev> <9ab5d2bd-a648-cae0-a4a7-ae86be10af0f@gutov.dev> <87r0kuqxbf.fsf@gmail.com> <54e115a2-fc36-3056-a030-0dbf32416ddb@gutov.dev> <43f290b0-4119-597b-c89a-0fb4c7db1665@gutov.dev> <6ec0607f-3047-91d3-6a55-40e06fa002fa@gutov.dev> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="39346"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Cc: =?UTF-8?Q?Gerd_M=c3=b6llmann?= , Eli Zaretskii , michael_heerdegen@web.de, emacs-devel@gnu.org To: =?UTF-8?B?Sm/Do28gVMOhdm9yYQ==?= Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Tue Nov 14 12:48:54 2023 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1r2ruX-0009tw-Gd for ged-emacs-devel@m.gmane-mx.org; Tue, 14 Nov 2023 12:48:54 +0100 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2rte-00078l-7u; Tue, 14 Nov 2023 06:47:58 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2rtP-00078U-41 for emacs-devel@gnu.org; Tue, 14 Nov 2023 06:47:44 -0500 Original-Received: from wout4-smtp.messagingengine.com ([64.147.123.20]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2rtM-0006oE-AV; Tue, 14 Nov 2023 06:47:42 -0500 Original-Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.west.internal (Postfix) with ESMTP id 22AC73202029; Tue, 14 Nov 2023 06:47:37 -0500 (EST) Original-Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Tue, 14 Nov 2023 06:47:37 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc :cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to; s=fm2; t= 1699962456; x=1700048856; bh=/iNOXc7nxdb8bJmFHiaF82Lwm7IHLhkDHPn HEpU02kw=; b=pYLY9sNcQ1cMxM7FVKMSbcDvHQUssVhFCImx81MFzSTJ0pQ/s+d InoByCAyp4bM4DetifbW3f3IKTSzW7Dt36SRKCYJIOquYiS1Jy41D2L+yvV1fq29 njlTfKsjf39sx7u8LEs8i/hD6XwguZ1Utr5PNzuytx2+L8nm/Ep6oBv0j8VimkCW B/ThdNuVA2DhCxvCiJFABv+yGUo/5hJCXiafreRyLfhFMloaBKjnzt9OOLlvgidb /ASHChp1f1GLFxcYw1e7Ah/+kzeK6w4HWVxC0IrGG4vVBwwikNB6QNCKtyggYO1l xmapM82/sgv+Pocj95YO9H1vedaI5wI15bA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t= 1699962456; x=1700048856; bh=/iNOXc7nxdb8bJmFHiaF82Lwm7IHLhkDHPn HEpU02kw=; b=3N+tAH4f/TPjm5Ri1aiToFoXuoQ5cD1LIQ55TczEOdpxQNCCCJY 8qsw3WpPeJ/aPYcu5kQaaM/YWXUMmhLPabDyK/rH/FCBH9zefI6eVp5U/pFBJOyh 0e85QQbFtW3nvlUpj0JSyYk+CAYNoHJHuil3zNwV9vZkfCbiJ3f5957I6o5liNzN CNmxo3JqFR5lV0WJgkVpMoEStR3vB8xCFGNVxDngULlnO8EnRb9kV3RlWIt8gq1y MWU8Hvm09m078qW2DMwZr/scDYCqxr15OkCU3nkA4GWh4R7wjO87Zo3cKT92dd2B NsgAwp4yhrcCx4SLvw83Xp0KpuBp9yAC4Qg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrudefvddgfeduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepkfffgggfuffvvehfhfgjtgfgsehtkeertddtfeejnecuhfhrohhmpeffmhhi thhrhicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdruggvvheqnecuggftrfgrth htvghrnhepueffveeiffeugffgveejvdegteeuhfdugfehleelfeejtdelteethfdtieeg vddunecuffhomhgrihhnpehgihhthhhusgdrtghomhenucevlhhushhtvghrufhiiigvpe dtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegumhhithhrhiesghhuthhovhdruggvvh X-ME-Proxy: Feedback-ID: i0e71465a:Fastmail Original-Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 14 Nov 2023 06:47:34 -0500 (EST) Content-Language: en-US In-Reply-To: Received-SPF: pass client-ip=64.147.123.20; envelope-from=dmitry@gutov.dev; helo=wout4-smtp.messagingengine.com X-Spam_score_int: -67 X-Spam_score: -6.8 X-Spam_bar: ------ X-Spam_report: (-6.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-3.971, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.devel:312724 Archived-At: On 14/11/2023 12:34, João Távora wrote: > [Replying to both of your mails here] > > On Tue, Nov 14, 2023 Dmitry Gutov > wrote: > > > And if the list has just 1 element (or zero?), the gap would be even > > larger. > > > > Only seeing a 1.6x difference here is a nice surprise, actually. > > These two statements together seem contradictory.  Why would > you be happy to see a particular factor for a given > list length if you know the factor is unbounded in general > for small lists? I don't think it's unbounded, just high. And 12-element list is a nice size for this particular comparison. Programs using tiny list would in their majority be tiny as well, so it wouldn't matter. Though of course there are exceptions: compilers, for example. > And it's only in the small lists case.  If I pass my own predicate to > both cl-set-difference and your best seq-difference-3 > > (defun myequal (a b) >   (equal a b)) > > (setq cc (make-list 12 "blabla")) > (setq list2 (make-list 12 "shooveedoowaa")) > > (when nil >   ;; (0.106080265 4 0.03849865399999963) >   (benchmark-run 10000 (cl-set-difference cc list2 :test #'myequal)) >   ;; (0.5504704210000001 39 0.394207416) >   (benchmark-run 10000 (seq-difference-3 cc list2 #'myequal)) > >    ) > > I get the 5.5x slowdown again (in one experiment, not shown, Right. This is still for the small lists case. Here we suffer the dynamic dispatch at least twice, along with funcall indirection. Again, would be great to get these ratios down, but the main scenarios where I *did* worry about the performance of a sequence primitive, have always been large lists. > I got a whopping 200x slowdown, but I can't reproduce it right now > from Emacs -Q, maybe I had some method on some crucial seq.el generic) That would be helpful to know as well. > > > This is all interesting, until one ponders what happens if an existing > > > seq.el user somewhere has: > > > > > > (cl-defmethod seq-contains-p ((seq my-voodoo-seq) > > >                                (elt (eql :secret-voodoo)) &optional > _tesfn) > > >    (invoke-voodoo-priests seq)) > > > > > > making use of seq.el's support for abstract polymorphic sequences. > > > > > > With seq.el 2.24 a seq-difference operation would consider this user's > > > method, with seq.el 2.24.dmitry (i.e. your fast seq-difference-3) it > > > simply won't.  This user's code is clearly broken. > > > > > > But was the user allowed to do that in the first place?  If not, > > > why is seq-contains-p a public generic function? > > > > It was allowed, and yes, indeed, it's a breaking change. So we should at > > least examine the existing public code out there and see whether such > > overrides exist. > > Not so easy, I'm afraid.  seq-contains-p is not the only such generic, > it's just one I happened to spot first.  seq-do is a generic and in that > group mentioned in the sparse seq.el documentation about mandatory > customization, meaning presumably it is essential for custom sequences. > > See https://github.com/search?q=%22%28cl-defmethod+seq-do%22&type=code > for > a list of custom sequences that use it. > > Your seq-difference-3 doesn't call seq-do like the original seq-difference > does, so the set difference operations for those custom sequences might > well be broken.  What's even more problematic is that it skips seq-do > entirely for certain predicates and not entirely for certain other > predicates. The way I understand this, is any new sequence type has to implement seq-do. As soon as that happens, a lot of (probably all) sequence functions in seq.el start working on that type. But the author is also free to provide specialized implementations for individual functions (usually for better performance) -- and those implementations don't have to use seq-do. Indeed, in most cases the optimization would involve cutting out the overhead that seq-do brings in the general case. > It's still bizarre.  Here's how I think: if an application is customizing > the entry point directly, why even call the entry point?  OK, so what if > a library is customizing the entry point?   It's presumably because that > library provides a data type for applications to instantiate and use.  But > if the library does that, all calling guarantees of other generic > functions are lost for, say, user method for subtypes of that data type > (or :around, or :after, etc).  So it's also nonsensical.  The only think > where it _could_ make a shred of sense is in a kind of "private library" > where > the application controls both the data type and knows exactly the shortcuts > taken.  But then "private library" is an oxymoron. An application shouldn't be customizing separate entry points. Separate libs (providing new data types) can do that. E.g. if we have a lib providing lazy sequences, it can provide its own set of implementations for seq.el. And when doing that, make sure the resulting behavior is consistent between functions. I haven't though much about the interaction with :around, :after, etc. Maybe they'll be fine, or maybe it would be better to avoid it. Look fine on the superficial level, though. > Oh, don't get me wrong: cl-lib.el's implementation details are not pretty > and not easy, likely typical library code (though i've seen much much > worse).  What's fundamentally easier about optimizing cl-lib is that the > contract it offers is much, much more well specified. > > Possibly because it comes as a result of careful design in committees of > very capable programmers in the 90's, before the CL winter, that were > already on to this class of difficulties and wanted to make a common > interface. > > The interface they made for sequences is not fully generic, but there > _are_ a lot of different implementations for that interface, each CL > compiler has  one.  Most of the tricks, like what I did to optimize > cl-{n}set-difference  are standard stuff in the CL world.   There are even > reference LOOP implementations (likely much more performant than ours, > though possibly not compatible with some non-standard edge cases our > cl-loop has of which I know a couple). I do believe it's helpful to have it around. > > > The case in the CL world against generic functions' performance > > > is often not that the dispatch is slow, but that > > > employing them too liberally makes framework optimization > > > hard, because you restrict yourself from optimization > > > opportunities. > > > > I don't know if that's a major issue: just like cl-some has different > > branches, seq-some could have three different definitions. The > > approaches to optimization seem transferable, more or less. > > Seq-some also calls into a lot of user customizable code, so it'll > suffer from the same class of problems.  Say, are you going to skip > the seq-do generic sometimes there as well? I believe so. > > I believe the actual value it provides is a list of implementations that > > are quite compact and as such easy to fix/extend/modify. And if cl-lib > > is handy for someone with CL experience, I'll guess that seq.el rings > > more familiar to the Clojure users. > > Maybe in the superficial user interface, because of names etc.  Or > are you saying seq.el interface is extracted from Clojure's standard much > like cl-lib is from CL standard?  (Do Clojure generic functions work like > ours?)  Then by all means we should rush to study that contract closely, > to know what we can and can't do as optimizers. It looks very much inspired, both in naming and in the implementation approach. But it's not a carbon copy (much farther from it than in cl-lib's example), and our VM is also quite different from JVM in performance characteristics. If someone were to write up a direct comparison, that would be great, of course.