From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:470:142:3::10]:36940) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j56YK-0005ML-7N for guix-patches@gnu.org; Fri, 21 Feb 2020 06:33:05 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j56YI-0004jc-IT for guix-patches@gnu.org; Fri, 21 Feb 2020 06:33:03 -0500
Received: from debbugs.gnu.org ([209.51.188.43]:40485) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1j56YI-0004jW-FM for guix-patches@gnu.org; Fri, 21 Feb 2020 06:33:02 -0500
Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1j56YI-0001fe-BW for guix-patches@gnu.org; Fri, 21 Feb 2020 06:33:02 -0500
Subject: [bug#39588] gnu: Add mpich, scalapack-mpich, mumps-mpich, pt-scotch-mpich, python-mpi4py-mpich
From: Ludovic Courtès
References: <87blq2rclk.fsf@inria.fr> <87o8tx3z2q.fsf@gnu.org> <87k14j6amk.fsf@inria.fr> <878skxd2es.fsf@gnu.org> <875zg0jpjl.fsf@inria.fr>
Date: Fri, 21 Feb 2020 12:32:44 +0100
In-Reply-To: <875zg0jpjl.fsf@inria.fr> ("Maurice Brémond"'s message of "Fri, 21 Feb 2020 09:46:38 +0100")
Message-ID: <87ftf4npk3.fsf@gnu.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Errors-To: guix-patches-bounces+kyle=kyleam.com@gnu.org
Sender: "Guix-patches"
To: Maurice Brémond
Cc: 39588@debbugs.gnu.org, zimoun

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hi,

I actually managed to reproduce it with a minimal test case (attached):

--8<---------------cut here---------------start------------->8---
$ guix build -f mpich-test.scm
substitute: updating substitutes from 'https://ci.guix.gnu.org'...
100.0%
The following derivation will be built:
   /gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv
building /gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv...
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
Invalid error code (-2) (error ring index 127 invalid)
INTERNAL ERROR: invalid error code fffffffe (Ring Index out of range) in MPID_nem_tcp_init:373
Invalid error code (-2) (error ring index 127 invalid)
INTERNAL ERROR: invalid error code fffffffe (Ring Index out of range) in MPID_nem_tcp_init:373
Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(586)..............:
MPID_Init(224).....................: channel initialization failed
MPIDI_CH3_Init(105)................:
MPID_nem_init(324).................:
MPID_nem_tcp_init(175).............:
MPID_nem_tcp_get_business_card(401):
MPID_nem_tcp_init(373).............: gethostbyname failed, localhost (errno 0)
Backtrace:
           1 (primitive-load "/gnu/store/iykxzg1n018sigd4c23kx1c4ngz?")
In guix/build/utils.scm:
   652:6  0 (invoke _ . _)

guix/build/utils.scm:652:6: In procedure invoke:
Throw to key `srfi-34' with args `(#)'.
builder for `/gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv' failed with exit code 1
build of /gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv failed
View build log at '/var/log/guix/drvs/rg/r7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv.bz2'.
guix build: error: build of `/gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv' failed
--8<---------------cut here---------------end--------------->8---

The same program outside the container works just fine:

--8<---------------cut here---------------start------------->8---
$ guix environment --ad-hoc mpich -- mpiexec -np 2 "/gnu/store/8i1dci1wxd6c0q6a2cz4kgb8adfk8rrz-mpi-init"
np = 2, rank = 0
np = 2, rank = 1
--8<---------------cut here---------------end--------------->8---

‘MPL_get_sockaddr’ uses ‘getaddrinfo’ for host name lookup.
Interestingly, ‘getaddrinfo’ fails in the build environment when passed
the flags that ‘MPL_get_sockaddr’ uses:

--8<---------------cut here---------------start------------->8---
(computed-file "getaddrinfo"
               #~(pk #$output
                     (getaddrinfo "localhost" #f
                                  (logior AI_ADDRCONFIG AI_V4MAPPED)
                                  AF_INET
                                  SOCK_STREAM
                                  IPPROTO_TCP)))
--8<---------------cut here---------------end--------------->8---

However, if you comment out AF_INET, SOCK_STREAM, and IPPROTO_TCP, it works.

Now we need to see why the ‘ai_family’ hint is causing trouble in glibc,
and perhaps in parallel try to work around it in MPICH…

Ludo’.

PS: I’ll be mostly away from keyboard in the coming days.
--=-=-=
Content-Type: text/plain
Content-Disposition: inline; filename=mpich.scm
Content-Description: the test

(use-modules (guix) (gnu))

(define code
  (plain-file "mpi.c" "
#include <stdio.h>
#include <assert.h>
#include <mpi.h>

int main (int argc, char *argv[]) {
  int err, np, rank;

  err = MPI_Init (&argc, &argv);
  assert (err == 0);

  err = MPI_Comm_size(MPI_COMM_WORLD, &np);
  assert (err == 0);

  err = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  assert (err == 0);

  printf (\"np = %i, rank = %i\\n\", np, rank);
  return 0;
}
"))

(define toolchain (specification->package "gcc-toolchain"))
(define mpich (specification->package "mpich"))

(computed-file "mpi-init"
               (with-imported-modules '((guix build utils))
                 #~(begin
                     (use-modules (guix build utils))

                     (setenv "PATH"
                             (string-append #$(file-append toolchain "/bin")
                                            ":" #$(file-append mpich "/bin")))
                     (setenv "CPATH" #$(file-append mpich "/include"))
                     (setenv "LIBRARY_PATH"
                             (string-append #$(file-append mpich "/lib") ":"
                                            #$(file-append toolchain "/lib")))
                     (invoke "mpicc" "-o" #$output "-Wall" "-g" #$code)

                     ;; Run the MPI code in the build environment.
                     (invoke "mpiexec" "-np" "2" #$output))))

--=-=-=--