Hi Ludo’,

On 1/11/22 20:10, Ludovic Courtès wrote:
> Hi,
>
> Felix Gruber skribis:
>
>> * gnu/packages/maths.scm (ceres): Update to 2.0.0.
>> [inputs]: Use simplified format.
>> (ceres-solver-benchmarks)[phases]: Add schur_eliminator_benchmark.
>> Replace autodiff_cost_function_benchmark with new autodiff_benchmarks.
>
> Applied, thanks!
>
> Since you’re looking at benchmarks, I’d be curious to see how those you
> added compare when passing ‘--tune’:
>
> https://hpc.guix.info/blog/2022/01/tuning-packages-for-a-cpu-micro-architecture/

Unfortunately, I'm getting mixed results for the benchmarks. In most
cases I saw slight (<10%) improvements in runtime, but there are also
some benchmarks that ran slower with the --tune flag.

I'm wondering whether the compiler flags set by the --tune option are
actually picked up by the custom 'build phase of the
ceres-solver-benchmarks package. I haven't had the time to look into it
more closely, as I'm currently in the middle of moving to another
country.

Anyway, I've attached the results of benchmark runs generated with guix
commit 7f779286df7e8636d901f4734501902cc934a72f, once untuned and once
tuned for Broadwell CPUs. The laptop I ran the tests on has a quad-core
AMD Ryzen 7 PRO 2700U CPU at 2200 MHz.

In the attachments you will find:

* a script run_benchmarks.sh used to run the benchmarks in tuned and
  untuned guix shells,
* text files ending in `-tuned` or `-untuned` which contain the results
  of those benchmark runs,
* a script compare.sh which calls a Python script compare-results.py to
  generate files ending in `-diff` that contain the relative change
  between untuned and tuned benchmarks (negative time and CPU
  percentages mean the tuned benchmark was faster, while for the number
  of iterations, positive percentages mean the tuned benchmark ran more
  iterations); a rough sketch of the comparison logic follows below the
  signature.

Best regards,
Felix
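
P.S. For reference, here is a minimal sketch of the kind of comparison
compare-results.py performs. It assumes the two runs were saved with
Google Benchmark's JSON output (--benchmark_format=json); the actual
attached script may parse the plain console output instead, and all
file and function names below are illustrative, not the exact code I
attached.

#!/usr/bin/env python3
"""Sketch: relative change between an untuned and a tuned benchmark run."""

import json
import sys


def load_results(path):
    """Map benchmark name -> (real_time, cpu_time, iterations)."""
    with open(path) as f:
        data = json.load(f)
    return {b["name"]: (b["real_time"], b["cpu_time"], b["iterations"])
            for b in data["benchmarks"]}


def relative_change(untuned, tuned):
    """Percentage change from untuned to tuned (negative = tuned is smaller)."""
    return 100.0 * (tuned - untuned) / untuned


def main(untuned_file, tuned_file):
    untuned = load_results(untuned_file)
    tuned = load_results(tuned_file)
    print(f"{'benchmark':<50} {'time %':>8} {'cpu %':>8} {'iters %':>8}")
    # Only compare benchmarks present in both runs.
    for name in sorted(untuned.keys() & tuned.keys()):
        ut, uc, ui = untuned[name]
        tt, tc, ti = tuned[name]
        print(f"{name:<50} "
              f"{relative_change(ut, tt):>7.1f}% "
              f"{relative_change(uc, tc):>7.1f}% "
              f"{relative_change(ui, ti):>7.1f}%")


if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])

Invoked as, e.g., `./compare-results.py foo-untuned foo-tuned`, this
would print one line per benchmark with the relative change in wall
time, CPU time, and iteration count.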