From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrea Corallo
Newsgroups: gmane.emacs.devel
Subject: Re: New "make benchmark" target
Date: Mon, 30 Dec 2024 13:26:28 -0500
Message-ID:
References: <87h679kftn.fsf@protonmail.com> <87y107g0xc.fsf@protonmail.com>
 <87frm51jkr.fsf@protonmail.com> <861pxpp88q.fsf@gnu.org>
 <87frm5z06l.fsf@protonmail.com> <86msgdnqmv.fsf@gnu.org>
 <87wmfhxjce.fsf@protonmail.com> <86jzbhnmzg.fsf@gnu.org>
 <87o70txew4.fsf@protonmail.com>
Mime-Version: 1.0
Content-Type: text/plain
User-Agent: Gnus/5.13 (Gnus v5.13)
Cc: Eli Zaretskii , stefankangas@gmail.com, mattiase@acm.org,
 eggert@cs.ucla.edu, emacs-devel@gnu.org
To: Pip Cet
In-Reply-To: <87o70txew4.fsf@protonmail.com> (Pip Cet's message of "Mon, 30 Dec 2024 17:25:44 +0000")

Pip Cet writes:

> "Eli Zaretskii" writes:
>
> Top-posted TL;DR: let's call Andrea's code "make elisp-benchmarks" and
> include it now?  That would preserve the Git history and importantly
> (to me) reserve the name for now.
>
>>> Date: Mon, 30 Dec 2024 15:49:30 +0000
>>> From: Pip Cet
>>> Cc: acorallo@gnu.org, stefankangas@gmail.com, mattiase@acm.org,
>>>  eggert@cs.ucla.edu, emacs-devel@gnu.org, joaotavora@gmail.com
>>>
>>> >> https://lists.gnu.org/archive/html/emacs-devel/2024-12/msg00595.html
>>> >
>>> > Thanks, but AFAICT this just says that you intended to use/extend ERT
>>> > to run this benchmark suite, but doesn't explain why you think using
>>> > ERT would be an advantage worthy of keeping.
>>>
>>> I think some advantages are stated in that email: the ERT tagging
>>> mechanism is more general, works, and can be extended (I describe one
>>> such extension).
>>> All that isn't currently true for elisp-benchmarks.
>>
>> Unlike the rest of the test suite, where we need a way to be able to
>> run individual tests, a benchmark suite is much more likely to run as
>> a whole, because benchmarking a single kind of jobs in Emacs is much
>> less useful than producing a benchmark of a representative sample of
>> jobs.  So I'm not sure this particular aspect is such a serious
>
> Not my experience.  Running the entire suite is much more likely not to
> produce usable data due to such issues as CPU thermal management (for
> example: the first few tests are run at full clock speed and heat up
> the system so much that thermal throttling is activated; the next few
> tests are run at a reduced rate while the fan is running; eventually we
> run out of amperes that we're allowed to drain the battery by and
> reduce clock speed even further; this results in reduced temperature,
> so the fan speed is reduced, which means we will eventually decide to
> try a higher clock speed again, which will work for a while only before
> repeating the cycle.  The whole thing will appear regular enough we
> won't notice the data is bad, but it will be, until we rerun the test
> on the same system in a different room and get wildly different
> results).
>
> A single-second test run in a loop produces the occasional mid-stream
> result which is actually useful (and promptly lost to the averaging
> mechanism of elisp-benchmarks).

Yes, elisp-benchmarks runs all the selected benchmarks at each
iteration, so that a single one cannot take advantage of the initial
cool CPU state.  If unstable throttling on a specific system is a
problem, it will show up as the computed error for that test.  If a
system is throttling, the right (and only) thing to do is to measure
it; in my experience that is what benchmarks do.
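To be concrete about what I mean by "computed error": it is just the
standard error of the mean across iterations.  A minimal Elisp sketch
(the function name is hypothetical, this is not the actual
elisp-benchmarks code):

```elisp
;; Sketch only; `my-bench-mean-and-error' is a hypothetical name, not
;; the elisp-benchmarks implementation.
(defun my-bench-mean-and-error (times)
  "Return (MEAN . STDERR) for TIMES, per-iteration run times of one test.
Unstable throttling inflates STDERR instead of silently skewing MEAN."
  (let* ((n (float (length times)))
         (mean (/ (apply #'+ times) n))
         ;; Sample variance (divide by n-1).
         (var (/ (apply #'+ (mapcar (lambda (x) (expt (- x mean) 2))
                                    times))
                 (1- n))))
    (cons mean (sqrt (/ var n)))))

;; (my-bench-mean-and-error '(1.0 2.0 3.0)) ;; => (2.0 . ~0.577)
```

A test whose timings swing with the throttling cycle will report a
large error relative to its mean, which is the signal to distrust that
number.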
That said, typically Eli is right: the typical use of a benchmark suite
is to run it as a whole and look at the total results, and this indeed
accounts for average throttling as well.

> Benchmarking is hard, and I wouldn't have provided this very verbose
> example if I hadn't seen "paradoxical" results that can only be
> explained by such mechanisms.  We need to move away from average run
> times either way, and that requires code changes.

I'm not sure I understand what you mean: if we prefer something like
the geometric mean in elisp-benchmarks, we can change to that; it
should be easy.

I'm open to patches to elisp-benchmarks (and to its hypothetical copy
in emacs-core).  My opinion is that something can potentially be
improved in it (why not?), but I personally ATM don't understand the
need for ERT.
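To make the geometric-mean idea concrete, here is a minimal Elisp
sketch (a hypothetical helper, not part of elisp-benchmarks today):

```elisp
;; Hypothetical helper, not part of elisp-benchmarks: aggregate run
;; times with the geometric mean, which is less dominated by the
;; occasional throttled (slow) iteration than the arithmetic mean.
(defun my-bench-geometric-mean (times)
  "Return the geometric mean of TIMES, a list of positive run times."
  (exp (/ (apply #'+ (mapcar #'log times))
          (float (length times)))))

;; (my-bench-geometric-mean '(1.0 1.0 4.0))
;; => ~1.587, vs 2.0 for the arithmetic mean of the same data
```

Swapping this in where the averaging is done should be a small, local
change.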