From: João Távora
Newsgroups: gmane.emacs.devel
Subject: Re: What's missing in ELisp that makes people want to use cl-lib?
Date: Sun, 12 Nov 2023 14:34:27 +0000
References: <87bkc4jpja.fsf@dataswamp.org> <12da6bcb-1818-7fbe-12af-8d4607724332@gutov.dev> <87il6bt4z0.fsf@yahoo.com> <8734xetjkk.fsf@yahoo.com> <87cywhsrcf.fsf@yahoo.com> <87cywgx1z0.fsf@web.de> <83wmuowwp3.fsf@gnu.org> <8334xcwank.fsf@gnu.org> <83msvjv63v.fsf@gnu.org> <838r73usg9.fsf@gnu.org>
To: Eli Zaretskii
Cc: dmitry@gutov.dev, gerd.moellmann@gmail.com, michael_heerdegen@web.de, emacs-devel@gnu.org
On Sun, Nov 12, 2023 at 11:22 AM Eli Zaretskii wrote:

> > 1. I posted some micro-benchmarks about cl-some and seq-some.
> > 2. Gerd wrote "Joao showed that it's slow.".
> > 3. You replied, directly below, "No, he didn't.".
> >
> > Now you say I misquoted you. Hope to have rectified that.
> >
> > So you _do_ think this shows seq.el is slower than cl-lib.el?
>
> I believe I explained the issue in another message. It has to do with
> the (significant in this case) difference between "slower" and "slow".
> I responded to "slow", whereas you said I disagreed with "slower".

That's clear now, thanks. I read that email as well. I agree that these things can never be taken in absolute terms, and that one has to consider specific jobs. However, I think it is even more important to note that for any statement about the efficiency -- relative or absolute -- of a library to be useful, that statement must be falsifiable. For statements like "micro-benchmark X doesn't matter in the common case" to be useful, we need to characterize the "common case" well, because if we don't, they are not falsifiable.
So I hope that in this discussion we can eventually come up with some characterization of this "common case", so that we can design experiments around it.

I think performance, though not the only consideration in this discussion, is an important one. IME, sequence-processing ability and efficiency play a very significant role in determining the performance of certain Lisp programs (including Common Lisp ones, of course), especially when garbage generation is involved. In certain Common Lisp implementations with more advanced multi-generational garbage collectors, the results are sometimes counter-intuitive: generating lots of fast-to-scavenge garbage can end up being better than generating long-lived, slow-to-scavenge garbage. But in Elisp, at least, it seems garbage is always bad.

Destructive versions of sequence-processing functions are very useful in that respect. cl-lib.el has them, but many of them seem to be missing low-hanging optimization opportunities. I worked on that in another patch, and it seems to have provided a good speedup (in micro-benchmarks). Let's see what Dmitry comes up with when he takes them to a real-world case.

> Sometimes much slower, sometimes slightly slower (and in at least
> one case faster).

Anyway, I took the "cl-some vs seq-some" measurements again, after applying the simple patch that I attach after my sig. The summary:

Small lists:    cl-some fastest; seq-some 5.7x slower, garbage collects
Big lists:      cl-some fastest; seq-some 1.6x slower
Small vectors:  cl-some fastest; seq-some 5.13x slower, garbage collects
Big vectors:    cl-some fastest; seq-some 1.27x slower

The "in one case faster" is gone. Also interesting is the garbage-generation footprint of the seq.el functions in their current form. I'm not sure we can easily dismiss the 5x "small seqs/tight loop" slowdown that seq.el exhibits in this micro-benchmark, because, as you can see in the benchmark detail, there's a lot of garbage generation involved.
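(As an aside on the destructive-variants point above, here is a minimal sketch -- my own illustration, not part of any patch discussed in this message -- of the garbage difference between the consing and the destructive cl-lib calls:)

```elisp
(require 'cl-lib)

;; cl-remove-if conses a fresh result list on every call, so a tight
;; loop over it generates garbage on each iteration.
(cl-remove-if #'cl-evenp (list 1 2 3 4 5))  ; => (1 3 5), input untouched

;; cl-delete-if may splice out cells of the input list in place; only
;; the return value should be used afterwards, but no new cells are
;; consed.
(cl-delete-if #'cl-evenp (list 1 2 3 4 5))  ; => (1 3 5), input recycled
```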
So even if the tight loop were to go away, that garbage would pile up. I also think that tight sequence-processing loops are not all that uncommon in Elisp programs. Finally, a program may have no single tight loop but still call many different sequence-processing functions: if it selects slow abstractions for all of them, the net effect is the same as a tight loop. But, as I wrote above, it is best if we find real-world applications on which to benchmark the two approaches. Maybe someone else reading this can help find one or two.

João

First, the benchmarks in full:

(require 'cl-lib)

(defun bench-seq-some (seq) (seq-some #'identity seq))
(defun bench-cl-some (seq) (cl-some #'identity seq))
(defun bench-cl-loop-list (l)
  ;; checks for some types of improper lists
  (cl-loop for e in l thereis (identity e)))
(defun bench-cl-loop-vec (v)
  (cl-loop for e across v thereis (identity e)))

(when nil
  ;; Small lists
  (let ((l (list nil nil nil nil t)))
    ;; FASTEST (0.23516409500000002 0 0.0)
    (benchmark-run 1000000 (bench-cl-some l)))
  (let ((l (list nil nil nil nil t)))
    ;; 5.7x SLOWER (1.338184149 16 0.8307866220000051)
    (benchmark-run 1000000 (bench-seq-some l)))
  (let ((l (list nil nil nil nil t)))
    ;; 1.14x SLOWER (0.26885113699999996 0 0.0)
    (benchmark-run 1000000 (bench-cl-loop-list l)))

  ;; Big lists
  (let ((l (make-list 10000000 nil)))
    ;; FASTEST (0.266716895 0 0.0)
    (benchmark-run 1 (bench-cl-some l)))
  (let ((l (make-list 10000000 nil)))
    ;; 1.6x SLOWER (0.428996694 0 0.0)
    (benchmark-run 1 (bench-seq-some l)))
  (let ((l (make-list 10000000 nil)))
    ;; 1.05x SLOWER (0.279309231 0 0.0)
    (benchmark-run 1 (bench-cl-loop-list l)))

  ;; Small vectors
  (let ((v (vector nil nil nil nil t)))
    ;; FASTEST (0.257238335 0 0.0)
    (benchmark-run 1000000 (bench-cl-some v)))
  (let ((v (vector nil nil nil nil t)))
    ;; 5.13x SLOWER (1.317641304 16 0.8448574659999935)
    (benchmark-run 1000000 (bench-seq-some v)))
  (let ((v (vector nil nil nil nil t)))
    ;; 1.14x SLOWER (0.29413928100000003 0 0.0)
    (benchmark-run 1000000
      (bench-cl-loop-vec v)))

  ;; Big vectors
  (let ((v (make-vector 10000000 nil)))
    ;; FASTEST (0.316211001 0 0.0)
    (benchmark-run 1 (bench-cl-some v)))
  (let ((v (make-vector 10000000 nil)))
    ;; 1.27x SLOWER (0.40362057500000004 0 0.0)
    (benchmark-run 1 (bench-seq-some v)))
  (let ((v (make-vector 10000000 nil)))
    ;; 1.11x SLOWER
    (benchmark-run 1 (bench-cl-loop-vec v))))

And here's the patch I used to make cl-some faster for the common case of a single, possibly non-list, sequence being passed.

diff --git a/lisp/emacs-lisp/cl-extra.el b/lisp/emacs-lisp/cl-extra.el
index 2ca2d03170a..7c09328eda5 100644
--- a/lisp/emacs-lisp/cl-extra.el
+++ b/lisp/emacs-lisp/cl-extra.el
@@ -206,16 +206,26 @@ cl-some
 non-nil value.
 
 \n(fn PREDICATE SEQ...)"
-  (if (or cl-rest (nlistp cl-seq))
-      (catch 'cl-some
-        (apply #'cl-map nil
-               (lambda (&rest cl-x)
-                 (let ((cl-res (apply cl-pred cl-x)))
-                   (if cl-res (throw 'cl-some cl-res))))
-               cl-seq cl-rest) nil)
-    (let ((cl-x nil))
-      (while (and cl-seq (not (setq cl-x (funcall cl-pred (pop cl-seq))))))
-      cl-x)))
+  (cond
+   (cl-rest
+    (catch 'cl-some
+      (apply #'cl-map nil
+             (lambda (&rest cl-x)
+               (let ((cl-res (apply cl-pred cl-x)))
+                 (if cl-res (throw 'cl-some cl-res))))
+             cl-seq cl-rest) nil))
+   ((nlistp cl-seq)
+    (let ((cl-x nil)
+          (l (length cl-seq))
+          (i 0))
+      (while (and (< i l)
+                  (prog1 (not (setq cl-x (funcall cl-pred
+                                                  (aref cl-seq i))))
+                    (cl-incf i))))
+      cl-x))
+   (t (let ((cl-x nil))
+        (while (and cl-seq (not (setq cl-x (funcall cl-pred (pop cl-seq))))))
+        cl-x))))
 
 ;;;###autoload
 (defun cl-every (cl-pred cl-seq &rest cl-rest)
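For completeness, here is a quick sanity check one might run against the patched cl-some (my own sketch, not part of the patch), exercising all three branches: the list fast path, the new non-list fast path, and the multi-sequence cl-map fallback:

```elisp
(require 'cl-lib)

;; List branch (single list argument).
(cl-assert (eq t (cl-some #'identity '(nil nil t))))
(cl-assert (null (cl-some #'identity '(nil nil))))

;; Non-list branch (single vector argument).
(cl-assert (eq t (cl-some #'identity [nil nil t])))
(cl-assert (null (cl-some #'identity [nil nil])))

;; Multi-sequence branch (falls back to cl-map); first call is
;; (+ 1 10) => 11, which is non-nil, so 11 is returned.
(cl-assert (= 11 (cl-some #'+ '(1 2 3) [10 20 30])))
```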