From: Dave Love <fx@gnu.org>
To: "Ludovic Courtès" <ludovic.courtes@inria.fr>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: Optionally using more advanced CPU features
Date: Wed, 23 Aug 2017 14:59:23 +0100
Message-ID: <87wp5ufkqs.fsf@albion.it.manchester.ac.uk>
In-Reply-To: <87a82s9cw3.fsf@gnu.org> ("Ludovic Courtès"'s message of "Tue, 22 Aug 2017 11:21:00 +0200")

ludovic.courtes@inria.fr (Ludovic Courtès) writes:

> Hi,
>
> Ricardo Wurmus <rekado@elephly.net> skribis:
>
>> I was wondering how we should go about optionally building software for
>> more advanced CPU features.  Currently, we build software for the lowest
>> common feature set among x86_64 CPUs.  That’s good for portability, but
>> not so good for performance.
>>
>> Enabling CPU features often happens through configure flags, but
>> expressing support at that level in our package definitions seems bad.
>> How can we make it possible for users to build their software for
>> different CPUs?
>
> To some extent, I think this is a compiler/OS/upstream issue.  By that I
> mean that the best way to achieve use of extra CPU features is by using
> the “IFUNC” feature of GNU ld.so, which is what libc does (it has
> variants of strcmp etc. tweaked for various CPU extensions like SSE, and
> the right one gets picked up at load time.)  Software like GMP, Nettle,
> or MPlayer also does this kind of selection at run time, but using
> custom mechanisms.

That may be the best way to handle it, but it's not widely available,
and isn't possible generally (as far as I know), e.g. for Fortran code.
See also below.  This issue surfaced again recently in Fedora.
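
For reference, the IFUNC mechanism mentioned in the quoted text can be
sketched in C roughly as follows.  This is only an illustration, assuming
GCC or Clang on a glibc-based x86_64 system; the names sum, sum_generic,
sum_avx2 and resolve_sum are hypothetical and not taken from any of the
packages discussed.  On other platforms the sketch falls back to a plain
function.

```c
#include <stddef.h>

/* Baseline version, built for the lowest common feature set. */
static long sum_generic(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

#if defined(__x86_64__) && defined(__GNUC__) && defined(__gnu_linux__)
/* Same loop, but the compiler may use AVX2 inside this function only. */
__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

typedef long (*sum_fn)(const int *, size_t);

/* The resolver is run by the dynamic loader when the symbol is bound;
   it returns the implementation to use on the running CPU. */
static sum_fn resolve_sum(void)
{
    __builtin_cpu_init();   /* needed because this runs very early */
    return __builtin_cpu_supports("avx2") ? sum_avx2 : sum_generic;
}

/* Callers see one ordinary symbol; the loader binds it via the resolver. */
long sum(const int *v, size_t n) __attribute__((ifunc("resolve_sum")));
#else
/* Fallback where IFUNC is not assumed to be available. */
long sum(const int *v, size_t n)
{
    return sum_generic(v, n);
}
#endif
```

Callers simply call sum(); which variant runs is decided once, at load
time, which is why the approach needs support from both the toolchain and
the loader.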

In cases that don't dispatch on cpuid (or whatever), I think the
relevant missing OS/tool support is SIMD-specific hwcaps in the loader.
Hwcaps seem to be essentially undocumented, but there is, or has been,
support for instruction set capabilities on some architectures, just not
x86_64 apparently.  (An ancient example was for missing instructions on
some SPARC systems which greatly affected crypto operations in ssh et
al.)
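
A "custom mechanism" that dispatches on cpuid without loader support
might look like the sketch below.  It assumes GCC or Clang on x86_64 and
uses the compiler builtin __builtin_cpu_supports; the function names
(dot, dot_generic, dot_avx2, select_dot) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Portable baseline: compiled for the lowest common feature set. */
static double dot_generic(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Same loop; the compiler may use AVX2 instructions in this function. */
__attribute__((target("avx2")))
static double dot_avx2(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
#endif

typedef double (*dot_fn)(const double *, const double *, size_t);

/* Choose an implementation from what the running CPU reports. */
static dot_fn select_dot(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return dot_avx2;
#endif
    return dot_generic;
}

/* Public entry point: selects on first call, then reuses the choice.
   The caching here is not thread-safe; a real library would take care
   of that. */
double dot(const double *a, const double *b, size_t n)
{
    static dot_fn impl;     /* NULL until the first call */
    if (impl == NULL)
        impl = select_dot();
    assert(impl != NULL);
    return impl(a, b, n);
}
```

The selection logic lives in the library itself, so this works with any
loader, at the cost of each project maintaining its own dispatch code.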

>> We can cross-compile for other architectures on the command line with
>> “--target” and “--system”; can we allow for compilation with special CPU
>> features across the graph with “--features”?  Build system abstractions
>> or package definitions would then be changed to recognize these features
>> and modify the corresponding flags as needed.
>
> I’ve considered this, but designing this would be tricky, and not quite
> right IMO.
>
> There’s probably scientific software out there that can benefit from
> using the latest SSE/AVX/whatever extension, and yet doesn’t use any of
> the tricks above.  When we find such a piece of software, I think we
> should investigate and (1) see whether it actually benefits from those
> ISA extensions, and (2) see whether it would be feasible to just use
> ‘target_clones’ or similar on the hot spots.
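
For illustration, applying ‘target_clones’ to a hot spot could look like
the following sketch.  It assumes GCC (or a recent Clang) on x86_64
GNU/Linux; the MULTIVERSIONED macro and the daxpy function are
hypothetical names chosen for the example, and the attribute is compiled
out on other platforms.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) && defined(__GNUC__) && defined(__gnu_linux__)
/* Ask the compiler for an AVX2 clone plus a baseline clone; it also
   generates the run-time selection between them. */
#define MULTIVERSIONED __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSIONED
#endif

/* y[i] += alpha * x[i]: a simple loop that can benefit from wider SIMD. */
MULTIVERSIONED
void daxpy(size_t n, double alpha, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] += alpha * x[i];
}
```

The source stays a single portable function; only the annotated hot
spots are compiled several times.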

One example which has been investigated, and where you can't just use
‘target_clones’, is BLIS.  You need it for vaguely competitive AVX-512
linear algebra.  (OpenBLAS is
basically fine for previous Intel and AMD SIMD.)  See, e.g.,
<https://github.com/xianyi/OpenBLAS/issues/991#issuecomment-273631173>
et seq.  I don't know if there's any good reason to, but if you want
ATLAS you have the same issue -- along with extra issues building it.

Related, I argue, as on the Fedora list, that libraries like BLAS (and
LAPACK) should be handled the way they are in Debian, with shared
libraries built compatibly with the reference BLAS.  They should be
selectable at run
time, typically according to compute node type by flipping the ld.so
search path; you should be able to substitute BLIS or a GPU
implementation for OpenBLAS.  That likely applies in other cases, but
I'm most familiar with the linear algebra ones.

[By the way, you do have to be careful with ISA-specific libraries on
heterogeneous systems if you use checkpoint-restart, as you probably
should on an HPC cluster -- you need to restart on compatible hardware.]
