From: Dave Love <fx@gnu.org>
To: "Ludovic Courtès" <ludovic.courtes@inria.fr>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: Optionally using more advanced CPU features
Date: Fri, 01 Sep 2017 11:46:16 +0100
Message-ID: <87inh2zog7.fsf@albion.it.manchester.ac.uk>
In-Reply-To: <87k21nerqa.fsf@inria.fr> ("Ludovic Courtès"'s message of "Mon, 28 Aug 2017 15:48:00 +0200")
Ludovic Courtès <ludovic.courtes@inria.fr> writes:
>> That may be the best way to handle it, but it's not widely available,
>> and isn't possible generally (as far as I know), e.g. for Fortran code.
>> See also below. This issue surfaced again recently in Fedora.
>
> Right. Do you have examples of Fortran packages in mind?
Not much off-hand because, shall we say, there's a shortage of the sort
of profiling information that's necessary for system performance
engineering and procurement. It's not in Guix, but cp2k is a (mainly)
Fortran program that is, or was, used as a performance regression test
for GCC. I only know about its profile for cases where time in MPI or
fftw is most relevant. However, two of its kernels, ELPA and libsmm (as
libxsmm), have low-level optimized versions for x86_64, but only Fortran
implementations for other architectures as far as I know.
Otherwise, BLAS/LAPACK for any micro-architectures that don't have
support in free optimized variants like OpenBLAS.
>> In cases that don't dispatch on cpuid (or whatever), I think the
>> relevant missing OS/tool support is SIMD-specific hwcaps in the loader.
>> Hwcaps seem to be essentially undocumented, but there is, or has been,
>> support for instruction set capabilities on some architectures, just not
>> x86_64 apparently. (An ancient example was for missing instructions on
>> some SPARC systems which greatly affected crypto operations in ssh et
>> al.)
>
> But that sounds similar to IFUNC in that application code would need to
> actually use hwcap info to select the right implementation at load time,
> right?
As far as I know, it's a loader feature. See "Hardware capabilities" in
ld.so(8).
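To make the loader-side idea concrete, here is a rough Python model of
capability-based selection: read the CPU flags, then pick the most
demanding variant whose requirements are all met. It is only a sketch.
The variant names, the required-flag sets and the "generic" fallback are
made up for illustration, and it is not a description of what glibc's
loader actually does.

```python
# Illustrative model only: the variant names and preference order below are
# assumptions for this sketch, not the real loader's behaviour.

def cpu_flags(cpuinfo_text):
    """Return the feature flags from /proc/cpuinfo-style text (x86 Linux)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical variants, most demanding first; each needs all listed flags.
VARIANTS = [
    ("avx512", {"avx512f"}),
    ("avx2", {"avx2", "fma"}),
    ("sse2", {"sse2"}),
]


def select_variant(flags, variants=VARIANTS, default="generic"):
    """Pick the first variant whose required flags are all present."""
    for name, required in variants:
        if required <= flags:
            return name
    return default


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc unavailable: fall back to generic
    print(select_variant(cpu_flags(text)))
```

The point of doing this in the loader rather than in the application is
that the library itself needs no dispatch code; each variant is simply a
separate build installed under a capability-specific directory.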
>>> There’s probably scientific software out there that can benefit from
>>> using the latest SSE/AVX/whatever extension, and yet doesn’t use any of
>>> the tricks above. When we find such a piece of software, I think we
>>> should investigate and (1) see whether it actually benefits from those
>>> ISA extensions, and (2) see whether it would be feasible to just use
>>> ‘target_clones’ or similar on the hot spots.
>>
>> One example which has been investigated, and you can't, is BLIS. You
>
> (Why “you can’t?” It’s free software AFAICS on
> <https://github.com/flame/blis/tree/master>.)
Well, you could embark on some sort of (GCC-specific?) re-write, but it
would be better to work on <https://github.com/flame/blis/issues/129>.
I don't think there's anywhere you can just attach GCC attributes, and
certainly no magic will happen for currently-unsupported architectures.
>> need it for vaguely competitive avx512 linear algebra. (OpenBLAS is
>> basically fine for previous Intel and AMD SIMD.) See, e.g.,
>> <https://github.com/xianyi/OpenBLAS/issues/991#issuecomment-273631173>
>> et seq. I don't know if there's any good reason to, but if you want
>> ATLAS you have the same issue -- along with extra issues building it.
>
> ATLAS is a problem because it does build-time ISA selection (and maybe
> profile-guided optimization?).
Yes, that's what I meant. (I can't remember to what extent you can just
specify the architecture and build it without the parameter sweep.)
> I sympathize with the idea of having several ABI-compatible BLAS
> implementations for the reasons you give. That somewhat conflicts with
> the idea of reproducibility, but after all we can have our cake and eat
> it too: the user can decide to have LD_LIBRARY_PATH point to an
> alternate ABI-compatible BLAS, or they can keep using the one that
> appears in RUNPATH.
>
> Thoughts?
Right, about the cake -- as with other packaging systems -- and
LD_LIBRARY_PATH/LD_PRELOAD are important for debugging and measurement
anyway. [I know too much about computing and experimental science to
believe in reproducibility as it's normally talked about, though
facilities for reproducible builds and environment components are good.]
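As a sketch of the user-side override (Python, for illustration):
prepend a directory holding an alternate ABI-compatible BLAS to
LD_LIBRARY_PATH for one process only. The function name and the paths in
the comment are hypothetical. It assumes the binary records its default
BLAS in RUNPATH, which the loader searches after LD_LIBRARY_PATH.

```python
import os


def env_with_alternate_blas(libdir, base_env=None):
    """Return a copy of the environment with libdir first in LD_LIBRARY_PATH.

    The input mapping is never modified, so the default (RUNPATH) behaviour
    stays available by simply not using the returned environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = libdir + ":" + current if current else libdir
    return env


# Hypothetical usage; both paths are placeholders:
#   import subprocess
#   subprocess.run(["./my-solver"],
#                  env=env_with_alternate_blas("/path/to/alternate-blas/lib"))
```

Keeping the override per-process makes it easy to compare a measurement
run against the unmodified, reproducible default.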