From: Pip Cet <pipcet@protonmail.com>
Subject: Re: New "make benchmark" target
Date: Mon, 30 Dec 2024 21:34:55 +0000
To: Andrea Corallo
Cc: Eli Zaretskii, stefankangas@gmail.com, mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
Message-ID: <871pxorh30.fsf@protonmail.com>
"Andrea Corallo" writes:

>> Benchmarking is hard, and I wouldn't have provided this very verbose
>> example if I hadn't seen "paradoxical" results that can only be
>> explained by such mechanisms.  We need to move away from average run
>> times either way, and that requires code changes.
>
> I'm not sure I understand what you mean; if we prefer something like
> the geometric mean in elisp-benchmarks, we can change to that, it
> should be easy.

In such situations (machines that don't allow reasonable benchmarks,
which has become the standard situation for me), I've usually found it
necessary to store a bucket histogram (or the full history) across many
benchmark runs; that makes the different throttling levels clearly
visible as separate peaks.  If we must use a single number, we want the
fastest actual run; in practice, discard the lowest few percentiles to
account for possible rare timing errors.  (I've put rough sketches of
this and the other ideas in a P.S. below.)

> I'm open to patches to elisp-benchmarks (and to its hypothetical copy
> in emacs-core).  My opinion is that something can potentially be
> improved in

What's the best way to report the need for such improvements?  I'm
currently aware of four "bugs" we should definitely fix; one of them,
ideally, before merging.

> it (why not), but I personally ATM don't understand the need for ERT.

Let's focus on the basics right now: people know how to write ERT
tests.  We have hundreds of them.  Some of them could be benchmarks,
and we want to make that as easy as possible.  ERT provides a way to do
that, in the same file if we want to: just add a tag.

It also provides a way to locate and properly identify resources (five
"bugs" now: reusing test A as input for test B means we don't have
separation of tests in elisp-benchmarks, and separation is something we
should strive for).

It also allows a third class of tests: stress tests, which we want to
execute more often than once per test run.  These identify occasional
failures in code that needs to be run very often to establish stability
(think bug#75105: (cl-random 1.0e+INF) produces an incorrect result
once every 8 million runs).  IIRC, right now ERT uses ad-hoc loops for
such tests, but it'd be nicer to expose the repetition count in the
framework (I'm not going to run the non-expensive test suite on FreeDOS
if that means waiting for a million iterations on an emulated machine).

(I also think we should introduce an ert-how structure that describes
how a test is to be run: do we want to inhibit GC or allow it?  Run
some warm-up test runs or not?  What's the expected time, and when
should we time out?  We can't run the complete matrix for all tests, so
we need some hints in the test, and the lack of a test declaration in
elisp-benchmarks hurts us there.)

Pip
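
P.S. A few rough, untested sketches of what I mean above; the helper
names are illustrative, nothing here exists in Emacs yet.  First, the
bucket histogram and the "fastest run, minus a few percentiles" number,
built on the stock `benchmark-run' macro from benchmark.el:

(require 'benchmark)
(require 'cl-lib)

(defun my/bench-histogram (fn runs bucket-width)
  "Call FN RUNS times; return a sorted alist of (BUCKET . COUNT).
Distinct peaks in the result correspond to distinct machine states,
e.g. different CPU throttling levels."
  (let ((buckets (make-hash-table :test #'eql))
        result)
    (dotimes (_ runs)
      (let* ((elapsed (car (benchmark-run 1 (funcall fn))))
             (bucket (* bucket-width (floor elapsed bucket-width))))
        (puthash bucket (1+ (gethash bucket buckets 0)) buckets)))
    (maphash (lambda (k v) (push (cons k v) result)) buckets)
    (sort result (lambda (a b) (< (car a) (car b))))))

(defun my/bench-fastest (fn runs discard-pct)
  "Time FN RUNS times; return a near-minimum run time.
Discards the fastest DISCARD-PCT percent of samples to guard against
rare timing errors, then returns the next-fastest sample."
  (let ((times (sort (cl-loop repeat runs
                              collect (car (benchmark-run 1 (funcall fn))))
                     #'<)))
    (nth (floor (* discard-pct runs) 100) times)))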
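
Second, the "just add a tag" point: a correctness test and a benchmark
for the same function can live in the same file, distinguished only by
an ERT tag (the `benchmark' tag is an assumed convention, not an ERT
builtin):

(require 'ert)
(require 'benchmark)

(defun my/fib (n)
  (if (< n 2) n (+ (my/fib (- n 1)) (my/fib (- n 2)))))

(ert-deftest my/fib-test ()
  (should (= 75025 (my/fib 25))))

(ert-deftest my/fib-benchmark ()
  :tags '(benchmark)
  (message "fib 25 x10: %.6fs" (car (benchmark-run 10 (my/fib 25))))
  (should t))

;; Select only the benchmarks with a standard ERT tag selector:
;;   (ert-run-tests-batch '(tag benchmark))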
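
Third, the stress-test case.  Today this has to be an ad-hoc loop like
the one below (the predicate is my guess at a check for the bug#75105
failure mode); the hard-coded iteration count is precisely what the
framework should own instead:

(require 'cl-lib)
(require 'ert)

(ert-deftest my/cl-random-inf-stress ()
  :tags '(stress)
  ;; The failure shows up roughly once in 8 million runs.
  (dotimes (_ 8000000)
    (let ((x (cl-random 1.0e+INF)))
      ;; Assumption: a correct result is a finite non-negative float.
      (unless (and (<= 0 x) (< x 1.0e+INF))
        (ert-fail (format "bad cl-random result: %S" x))))))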
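
Finally, one possible shape for the (purely hypothetical) ert-how
declaration; none of this exists in ERT today:

(require 'cl-lib)

(cl-defstruct ert-how
  (inhibit-gc nil)     ; bind gc-cons-threshold high during the run?
  (warmup-runs 0)      ; un-timed runs before measurement starts
  (repetitions 1)      ; how many times a stress test should loop
  (expected-time nil)  ; rough expected duration, in seconds
  (timeout nil))       ; give up after this many seconds

;; A benchmark might then declare, e.g.:
;;   (make-ert-how :inhibit-gc t :warmup-runs 3
;;                 :expected-time 0.5 :timeout 30)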