From: Andrea Corallo
Newsgroups: gmane.emacs.devel
Subject: Re: New "make benchmark" target
Date: Mon, 06 Jan 2025 13:41:55 -0500
References: <87h679kftn.fsf@protonmail.com> <87frm5z06l.fsf@protonmail.com>
 <86msgdnqmv.fsf@gnu.org> <87wmfhxjce.fsf@protonmail.com>
 <86jzbhnmzg.fsf@gnu.org> <87o70txew4.fsf@protonmail.com>
 <871pxorh30.fsf@protonmail.com> <86wmfgm3a5.fsf@gnu.org>
 <87pll2fsj7.fsf@protonmail.com> <86ikqs57bc.fsf@gnu.org>
User-Agent: Gnus/5.13 (Gnus v5.13)
To: Eli Zaretskii
Cc: pipcet@protonmail.com, stefankangas@gmail.com, mattiase@acm.org,
 eggert@cs.ucla.edu, emacs-devel@gnu.org
In-Reply-To: <86ikqs57bc.fsf@gnu.org> (Eli Zaretskii's message of
 "Mon, 06 Jan 2025 16:46:15 +0200")

Eli Zaretskii writes:

>> From: Andrea Corallo
>> Cc: Eli Zaretskii , stefankangas@gmail.com,
>>   mattiase@acm.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> Date: Mon, 06 Jan 2025 06:23:22 -0500
>>
>> Pip Cet writes:
>>
>> > In particular, as you (Andrea) correctly pointed out, it is sometimes
>> > appropriate to use an average run time (or, non-equivalently, an
>> > average speed) for reporting test results; the assumptions needed for
>> > this are very significant and need to be spelled out explicitly.  The
>> > vast majority of "make benchmark" uses which I think should happen
>> > cannot meet these stringent requirements.
>> >
>> > To put things simply, it is better to discard outliers (test runs
>> > which take significantly longer than the rest).  Averaging doesn't do
>> > that: it simply ruins your entire test run if there is a significant
>> > outlier.  IOW, running the benchmarks with a large repetition count is
>> > very likely to result in useful data being discarded, and a useless
>> > result.
>>
>> As mentioned, I disagree with having some logic put in place to
>> arbitrarily decide which value is worth to be considered and which
>> value should be discarded.  If a system is producing noisy measures
>> this has to be reported as error of the measure.  Those numbers are
>> there for some real reason and have to be accounted.
>
> Without too deep understanding of the underlying issue: IME, if some
> sample can include outliers, it is always better to use robust
> estimators, rather than attempt to detect and discard outliers.
> That's because detection of outliers can decide that a valid
> measurement is an outlier, and then the estimation becomes biased.

100% agreed

> In practical terms, for estimating the mean, I can suggest to use the
> sample median instead of the sample average.  The median is very
> robust to outliers, and only slightly less efficient (i.e., converges
> a bit slower) than the sample average.

In my experience benchmarks typically use the geo-mean; there is quite
a bit of information around on why that is, e.g. [1].  The use of the
arithmetic mean in elisp-benchmarks is a youthful mistake (which I'm
responsible for) that I think should be fixed.

Andrea

[1]
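
P.S.  Just to illustrate (toy helpers, nothing from elisp-benchmarks):
with five hypothetical timings where one run happens to hit GC or swap,
the arithmetic mean gets dragged up while the median barely moves.  A
geo-mean helper is included too, though it is normally used to
aggregate normalized scores across different benchmarks rather than
repeated runs of a single one.

(defun my/arith-mean (xs)
  "Arithmetic mean of the list of numbers XS."
  (/ (apply #'+ xs) (float (length xs))))

(defun my/median (xs)
  "Median of the list of numbers XS."
  (let* ((s (sort (copy-sequence xs) #'<))
         (n (length s)))
    (if (zerop (% n 2))
        (/ (+ (nth (1- (/ n 2)) s) (nth (/ n 2) s)) 2.0)
      (nth (/ n 2) s))))

(defun my/geo-mean (xs)
  "Geometric mean of the list of positive numbers XS."
  (exp (my/arith-mean (mapcar #'log xs))))

;; Five hypothetical timings in seconds; pretend the last run hit GC/swap.
(let ((runs '(1.02 0.98 1.01 0.99 5.00)))
  (list (my/arith-mean runs)   ; => 1.8   (dragged up by the outlier)
        (my/median runs)       ; => 1.01  (barely affected)
        (my/geo-mean runs)))   ; => ~1.38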