From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Dmitry Gutov Newsgroups: gmane.emacs.devel Subject: Re: What's missing in ELisp that makes people want to use cl-lib? Date: Mon, 13 Nov 2023 04:43:29 +0200 Message-ID: <54e115a2-fc36-3056-a030-0dbf32416ddb@gutov.dev> References: <87il6bt4z0.fsf@yahoo.com> <8734xetjkk.fsf@yahoo.com> <87cywhsrcf.fsf@yahoo.com> <87cywgx1z0.fsf@web.de> <83wmuowwp3.fsf@gnu.org> <8334xcwank.fsf@gnu.org> <320999cc-6c83-2315-0044-cc0403400af3@gutov.dev> <9ab5d2bd-a648-cae0-a4a7-ae86be10af0f@gutov.dev> <87r0kuqxbf.fsf@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="3092"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Cc: =?UTF-8?Q?Gerd_M=c3=b6llmann?= , Eli Zaretskii , michael_heerdegen@web.de, emacs-devel@gnu.org To: =?UTF-8?B?Sm/Do28gVMOhdm9yYQ==?= Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Mon Nov 13 03:44:38 2023 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1r2MwH-0000a9-2j for ged-emacs-devel@m.gmane-mx.org; Mon, 13 Nov 2023 03:44:38 +0100 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2MvM-000881-2X; Sun, 12 Nov 2023 21:43:40 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2MvK-00087Z-JH for emacs-devel@gnu.org; Sun, 12 Nov 2023 21:43:38 -0500 Original-Received: from wout4-smtp.messagingengine.com ([64.147.123.20]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2MvI-0003sh-8V; Sun, 12 Nov 2023 21:43:38 -0500 Original-Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.west.internal (Postfix) with ESMTP id 4DEF232000E5; Sun, 12 Nov 2023 21:43:33 -0500 (EST) Original-Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Sun, 12 Nov 2023 21:43:33 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc :cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to; s=fm2; t= 1699843412; x=1699929812; bh=ihaKSBuJtYeiBtXlf7JHDfuJDlF/KX6RBFT b8GYb7bU=; b=DH02iHsr6Yw3spFqS0isUJ4GjWvjPB8WZZpr/nWyar8uwPV+e/1 ji8I7SZod4GgcneRfG6zeSNhL3LspZ7Fcaxm74Dxu8XuzTyRfXbueTcJ6kSpg6Kq xiK2XHD0IaC4Zbpmz6+mbmhfl1JiNDAWe2ybU+mMrWAyQntbWT3WDBxCNAA8/h+D Q6+z56GHs4o2imu701Mgcb3I/dzRULBu2mxzc+dRZqwz5Yl2i0WOSba6lNslJLip fwR9ylfcUElCLJWdZUF+3+mITXCRzuozPI1xcNRM+gf3IIeCaaN6Pjl3+X0WaJk7 9T+wm7r92CrSFTz1Ojpy0TE7s9RjAoRLx9Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm3; t= 1699843412; x=1699929812; bh=ihaKSBuJtYeiBtXlf7JHDfuJDlF/KX6RBFT b8GYb7bU=; b=aTbTY7JzNgzootu42vZlIL7OOg1HzkfNMcmJp4rxKNZFhK3Nsip g4eXGsurwxmHVq60NLggtjkOu+rPj+EtzqY/tGQwSTCK+z3kl2xmGBOl6CFvwywh OvCv2y8VJWpfOjWQfQQBygu5R7u9O96mbuc60FRls2uxcXByh75Yy+Cy64pL0Q6q KNI1ZFvNnmYy9Mg+UTmRM5V2brFOfSkuuYE23tZItXEOwXjLIwpONT8I7AUJYS2L afyLaqwp2V2t2qSkhUnR8dKZZYlK4Fe4lJbngWZPMGPVbsb9nUGY0YYlHVZApHzp QKEAXUXrxYEeOIpsJUO0FGykaQT+tsM+GmQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedruddvledggeelucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepkfffgggfuffvvehfhfgjtgfgsehtkeertddtfeejnecuhfhrohhmpeffmhhi thhrhicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdruggvvheqnecuggftrfgrth htvghrnhepvdetffektdfftdelledtgedtuddufffhvdeilefgfedujefhheeiueeugfeh geeunecuffhomhgrihhnpehgnhhurdhorhhgnecuvehluhhsthgvrhfuihiivgeptdenuc frrghrrghmpehmrghilhhfrhhomhepughmihhtrhihsehguhhtohhvrdguvghv X-ME-Proxy: Feedback-ID: i0e71465a:Fastmail Original-Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 12 Nov 2023 21:43:31 -0500 (EST) Content-Language: en-US In-Reply-To: <87r0kuqxbf.fsf@gmail.com> Received-SPF: pass client-ip=64.147.123.20; envelope-from=dmitry@gutov.dev; helo=wout4-smtp.messagingengine.com X-Spam_score_int: -63 X-Spam_score: -6.4 X-Spam_bar: ------ X-Spam_report: (-6.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-2.553, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H5=-1, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.devel:312681 Archived-At: On 13/11/2023 03:03, João Távora wrote: > Dmitry Gutov writes: > >>> More importantly, there's an important takeaway here. Your >>> results seem to show that regardless of the alternative (cl-lib >>> or hand-rolled) the solution with Emacs's current recommended >>> sequence processing library is nearly 6 times slower in a >>> real-world use-case. >> >> Note that it's still the difference for the case where the "business >> logic" (the filtering predicate or whatever) doesn't do >> anything. > > OK. So can you provide an even more realistic case? Probably not. Not sure which big users of cl-lib we can find in the core: comp.el uses it, but mostly for the 'loop' macro. Anyway, my point was these benchmarks are very good for improving each of the libs, but not necessarily for making the ultimate choice between them (if we really had to). >> Although certain order-of-magnitude differences are worrying. > > You can say that again... I optimized cl-nset-difference considerably > more. In the process also found your handrolled version can be sped > up considerably. Have a look at this: > > (defun joaot/handrolled-nset-difference (list1 list2) > (if (or (null list1) (null list2)) list1 > (let ((res nil)) > (while (consp list1) > (if (funcall #'member (car list1) list2) > (setf list1 (cdr list1)) > (cl-shiftf list1 (cdr list1) res list1))) > res))) > > (defun dmitry/set-difference (list1 list2) > (delq 'wrong > (mapcar > (lambda (c) (if (member c list2) > 'wrong > c)) > list1))) > > (setq cc (all-completions "" obarray)) > (setq list2 (make-list 12 "shooveedoowaa")) > > (when nil > ;; FASTEST (0.074594068 0 0.0) > (benchmark-run 10 (setq cc (joaot/handrolled-nset-difference cc list2))) > > ;; FASTEST (0.070370948 0 0.0) > (benchmark-run 10 (setq cc (cl-nset-difference cc list2 :test #'equal))) > > ;; 1.8x SLOWER (0.138797637 1 0.06212087500000507) > (benchmark-run 10 (cl-set-difference cc list2 :test #'equal)) > > ;; 3.2x SLOWER (0.22628817199999998 2 0.13694317399999534) > (benchmark-run 10 (dmitry/set-difference cc list2)) > > ;; 18x SLOWER (1.29373404 12 0.7763814810000014) > (benchmark-run 10 (seq-difference cc list2 #'equal)) > > ) All right, time to roll out the big guns. For your attention, ladies and gentlemen, a version in pure, unadulterated Elisp, extracted from my unused patch of two weeks ago (https://debbugs.gnu.org/66806#17): (defun dmitry/set-difference-nocons (list1 list2) (let (ref) (while (member (car list1) list2) (setq list1 (cdr list1))) (setq ref list1) (while ref (if (member (cadr ref) list2) (setcdr ref (cddr ref)) (setq ref (cdr ref)))) list1)) And the benchmarks (we're so fast, 100 iterations for stable numbers): (when nil ;; (0.38175291299999997 0 0.0) 1.1X SLOWER (benchmark-run 100 (setq cc (joaot/handrolled-nset-difference cc list2))) ;; (0.9393577359999999 16 0.5063504589999965) NO COMMENTS (benchmark-run 100 (dmitry/set-difference cc list2)) ;; (0.345876673 0 0.0) FASTEST (of course) (benchmark-run 100 (dmitry/set-difference-nocons cc list2)) ) Anyway, it's pretty cool to have cl-nset-difference so "down to the metal". Although I'm not sure if we should be worried that the funcall overhead makes basically no difference (e.g. if I inline the funcall in joaot/handrolled-nset-difference). I think it's been said that our funcalls are relatively slow. > All my work, including the docstring overhauls Alan requested and some > new tests, now in branch feature/cl-lib-improvements. Nice. >>> Maybe seq.el can be made faster too? Who knows, but it seems >>> difficult without breaking at least some of its defgeneric-based >>> contract. >> >> I was wondering whether you tried looking into seq.el's performance >> problems. It being slower on shorter lists is quite expected: if the >> type dispatch has non-negligible overhead, that should be most >> noticeable when the rest of the work is small. >> >> The case with longer lists and other data structures should be >> possible to improve, though. As long as the type dispatch only happens >> once per sequence, and not for each element. > > Maybe it's possible. But there are two things here: first, you need > non-destructive versions of things in seq.el, because consing is always > a killer. Second, the generic function interface means the typical > shortcuts I applied in cl-lib.el are difficult. Maybe even impossible > without breaking the current contract? I don't know the contract well > enough to tell. At any rate seems like non-trivial work, but I'm happy > if someone can give it a shot. I hope someone will. And agree about destructive versions. > Another thing I noticed is that recently cl-lib.el started depending on > seql.el, in its implementation. Given what I've been seeing, this tells > me there's more low-hanging fruit to optimize in cl-lib.el. Stefan moved a bunch of code there in 2019 (0e4dd67aae8b1003). Good example to try and see if this actually made anything slower.