From: Ludovic Courtès
To: Maxim Cournoyer
Cc: Ricardo Wurmus, 51536@debbugs.gnu.org
Subject: bug#51536: openblas builds not reproducible on different x86_64 machines
Date: Thu, 03 Feb 2022 00:13:33 +0100
Message-ID: <87czk4rheq.fsf@gnu.org>
In-Reply-To: <87h7cw7ewb.fsf@gmail.com> (Maxim Cournoyer's message of "Sun, 31 Oct 2021 23:07:00 -0400")
References: <87h7cw7ewb.fsf@gmail.com>
List-Id: Bug reports for GNU Guix

Hi!

Maxim Cournoyer writes:

> Our OpenBLAS package uses DYNAMIC_ARCH=1 to provide optimizations for
> all supported targets, at least of x86 and x86_64.  In theory that seems
> OK, but in practice the builds differ depending on the host CPU.

What follows is the log of an investigation that didn’t find the root
cause, but perhaps it’ll give us ideas…

Right now the build results of ci.guix and bordeaux.guix differ:

--8<---------------cut here---------------start------------->8---
$ guix describe
Generation 202	Jan 30 2022 23:57:03	(current)
  guix 43dd34c
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: 43dd34c7777a212c99a97da7a2c237158faa9a1b
ludo@ribbon ~/src/guix$ guix challenge openblas
/gnu/store/ras6dprsw3wm3swk23jjp8ww5dwxj333-openblas-0.3.18 contents differ:
  no local build for '/gnu/store/ras6dprsw3wm3swk23jjp8ww5dwxj333-openblas-0.3.18'
  https://ci.guix.gnu.org/nar/lzip/ras6dprsw3wm3swk23jjp8ww5dwxj333-openblas-0.3.18: 0m1jlc26yrwxn8gxwpj8452kw4g84ywclh0hnab93873ifz87s5c
  https://bordeaux.guix.gnu.org/nar/lzip/ras6dprsw3wm3swk23jjp8ww5dwxj333-openblas-0.3.18: 1d0m9v3kpsqzplpl1law2lfhm6rrbhkkqsvh19dlg9wx45vbbvjb
  differing file:
    /lib/libopenblasp-r0.3.18.so

1 store items were analyzed:
  - 0 (0.0%) were identical
  - 1 (100.0%) differed
  - 0 (0.0%) were inconclusive
--8<---------------cut here---------------end--------------->8---

To get an idea, I thought we could compare the two build logs:

  https://ci.guix.gnu.org/log/ras6dprsw3wm3swk23jjp8ww5dwxj333-openblas-0.3.18
  https://bordeaux.guix.gnu.org/build/3fab433c-e7d3-498d-86f8-4bcd5da9c4db

(Protip: I found the second one via .)

The “ar -ru ../libopenblasp-r0.3.18.a …” invocations are apparently the
same in both cases, which rules out the simple case of unsorted .o
files.

The .so on ci.guix is slightly bigger:

--8<---------------cut here---------------start------------->8---
$ wget -qO - https://ci.guix.gnu.org/nar/lzip/ras6dprsw3wm3swk23jjp8ww5dwxj333-openblas-0.3.18 | lzip -d | guix archive -x /tmp/o1
$ wget -qO - https://bordeaux.guix.gnu.org/nar/lzip/ras6dprsw3wm3swk23jjp8ww5dwxj333-openblas-0.3.18 | lzip -d | guix archive -x /tmp/o2
$ ls -l /tmp/{o1,o2}/lib/libopenblasp-r0.3.18.so
-r-xr-xr-x 1 ludo users 40538768 Jan  1  1970 /tmp/o1/lib/libopenblasp-r0.3.18.so
-r-xr-xr-x 1 ludo users 40436368 Jan  1  1970 /tmp/o2/lib/libopenblasp-r0.3.18.so
--8<---------------cut here---------------end--------------->8---

Both have the same symbols though, and in the same order:

--8<---------------cut here---------------start------------->8---
$ diff -u <(objdump -T /tmp/o1/lib/libopenblasp-r0.3.18.so |cut -c 60- ) <(objdump -T /tmp/o2/lib/libopenblasp-r0.3.18.so |cut -c60- )
$ echo $?
0
--8<---------------cut here---------------end--------------->8---

… which suggests they include code optimized for the same
micro-architectures, since symbol names include the name of the
micro-architecture:

--8<---------------cut here---------------start------------->8---
$ objdump -T /tmp/o1/lib/libopenblasp-r0.3.18.so |cut -c 60-|tail -10
csymm3m_RU
cgemv_c_BARCELONA
csymv_U_HASWELL
dtrmm_iltncopy_CORE2
LAPACKE_dsytrs2
openblas_num_threads_env
csycon_rook_
csytri_rook_
--8<---------------cut here---------------end--------------->8---

Some of the offsets differ though:

--8<---------------cut here---------------start------------->8---
$ diff -u <(objdump -T /tmp/o1/lib/libopenblasp-r0.3.18.so ) <(objdump -T /tmp/o2/lib/libopenblasp-r0.3.18.so )
--- /dev/fd/63	2022-02-03 00:10:17.308357982 +0100
+++ /dev/fd/62	2022-02-03 00:10:17.276357923 +0100
@@ -1,5 +1,5 @@
 
-/tmp/o1/lib/libopenblasp-r0.3.18.so: file format elf64-x86-64
+/tmp/o2/lib/libopenblasp-r0.3.18.so: file format elf64-x86-64
 
 DYNAMIC SYMBOL TABLE:
 0000000000000000 DF *UND* 0000000000000000 GLIBC_2.3.2 pthread_cond_signal
@@ -91,57 +91,57 @@
 00000000013edb70 g DF .text 00000000000001be Base zgemm3m_incopyb_BULLDOZER
 0000000000e6d200 g DF .text 0000000000002b06 Base strsm_kernel_RT_BOBCAT
 0000000000512c00 g DF .text 0000000000000a0a Base zsymv_U_PRESCOTT
-00000000023c7530 g DF .text 0000000000000201 Base LAPACKE_dpttrs_work
+00000000023ae930 g DF .text 0000000000000201 Base LAPACKE_dpttrs_work
 0000000000692000 g DF .text 0000000000000b89 Base srot_k_PENRYN
 000000000179caa0 g DF .text 0000000000000200 Base dgemm_beta_HASWELL
 0000000000a44690 g DF .text 00000000000004b4 Base dtrsm_iutucopy_OPTERON
-000000000231cfc0 g DF .text 000000000000021d Base LAPACKE_sstein_work
-0000000002327800 g DF .text 000000000000014b Base LAPACKE_ssytrd
-0000000001ad9100 g DF .text 00000000000002aa Base chemm_outcopy_SKYLAKEX
+00000000023043c0 g DF .text 000000000000021d Base LAPACKE_sstein_work
+000000000230ec00 g DF .text 000000000000014b Base LAPACKE_ssytrd
+0000000001acc900 g DF .text 00000000000002aa Base chemm_outcopy_SKYLAKEX
 00000000017d6c10 g DF .text 0000000000000c38 Base cgemv_n_HASWELL
-0000000002327b70 g DF .text 0000000000000143 Base LAPACKE_ssytrf
+000000000230ef70 g DF .text 0000000000000143 Base LAPACKE_ssytrf
 000000000018f010 g DF .text 000000000000025c Base cblas_stbmv
 0000000000195a20 g DF .text 000000000000003b Base cblas_idamin
-0000000002328d40 g DF .text 0000000000000101 Base LAPACKE_ssytri
+0000000002310140 g DF .text 0000000000000101 Base LAPACKE_ssytri
 000000000077be00 g DF .text 0000000000000e65 Base ztrsm_kernel_RN_PENRYN
 0000000001583f20 g DF .text 0000000000001c22 Base dtrmm_iltucopy_STEAMROLLER
-00000000021bf830 g DF .text 0000000000000527 Base ztbcon_
-0000000001a70630 g DF .text 00000000000001c7 Base dsymm_oltcopy_SKYLAKEX
-000000000245a910 g DF .text 000000000000001b Base LAPACKE_zpp_nancheck
+00000000021a6c30 g DF .text 0000000000000527 Base ztbcon_
+0000000001a640c0 g DF .text 000000000000066d Base dsymm_oltcopy_SKYLAKEX
+0000000002441d10 g DF .text 000000000000001b Base LAPACKE_zpp_nancheck
 000000000108ee20 g DF .text 000000000000014d Base zgemm3m_oncopyb_ATOM
-0000000002409df0 g DF .text 000000000000035c Base LAPACKE_zgtsvx_work
-0000000001e7d120 g DF .text 0000000000001743 Base dlatrs_
-0000000001e948a0 g DF .text 00000000000001d1 Base drscl_
+00000000023f11f0 g DF .text 000000000000035c Base LAPACKE_zgtsvx_work
+0000000001e64520 g DF .text 0000000000001743 Base dlatrs_
+0000000001e7bca0 g DF .text 00000000000001d1 Base drscl_
 00000000019ac700 g DF .text 00000000000004bd Base zhemm3m_iucopyb_ZEN
 00000000003c0f30 g DF .text 000000000000001e Base support_avx512_bf16
-0000000002329ac0 g DF .text 0000000000000107 Base LAPACKE_ssytrs
+0000000002310ec0 g DF .text 0000000000000107 Base LAPACKE_ssytrs
 0000000000f94890 g DF .text 00000000000002d3 Base ztrmm_oltncopy_BOBCAT
--8<---------------cut here---------------end--------------->8---

On #guix-hpc Ricardo mentioned encountering this reproducibility issue
earlier.

Ludo’.
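
A possible follow-up to the `objdump -T` comparison in this message: most of the differing lines only show a shifted address, but in the excerpt the size column of dsymm_oltcopy_SKYLAKEX also changes (0x1c7 versus 0x66d). The sketch below is one way to list symbols whose size differs between two dumps, which might help narrow down which kernels were compiled differently. It is not part of the original investigation. The helper names `parse_symbols` and `size_changes` are hypothetical, and the parser assumes the whitespace-separated column layout seen in the excerpt (address first, symbol name last, size as the last 16-digit hexadecimal field in between).

```python
import re

# A 16-digit hexadecimal field, as printed by objdump for 64-bit ELF files.
HEX16 = re.compile(r"^[0-9a-f]{16}$")


def parse_symbols(text):
    """Map symbol name -> (address, size) from `objdump -T` output text.

    Assumption: each symbol line starts with the address, ends with the
    symbol name, and carries the size as the last 16-digit hex field in
    between. Header and blank lines are skipped.
    """
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or not HEX16.match(fields[0]):
            continue
        sizes = [f for f in fields[1:-1] if HEX16.match(f)]
        if not sizes:
            continue
        symbols[fields[-1]] = (int(fields[0], 16), int(sizes[-1], 16))
    return symbols


def size_changes(old, new):
    """Return sorted (name, old_size, new_size) for symbols whose size differs."""
    return sorted(
        (name, old[name][1], new[name][1])
        for name in old.keys() & new.keys()
        if old[name][1] != new[name][1]
    )


# Sample lines copied from the diff in this message (o1 = ci.guix, o2 = bordeaux.guix).
O1 = """\
0000000001ad9100 g DF .text 00000000000002aa Base chemm_outcopy_SKYLAKEX
0000000001a70630 g DF .text 00000000000001c7 Base dsymm_oltcopy_SKYLAKEX
00000000017d6c10 g DF .text 0000000000000c38 Base cgemv_n_HASWELL
"""
O2 = """\
0000000001acc900 g DF .text 00000000000002aa Base chemm_outcopy_SKYLAKEX
0000000001a640c0 g DF .text 000000000000066d Base dsymm_oltcopy_SKYLAKEX
00000000017d6c10 g DF .text 0000000000000c38 Base cgemv_n_HASWELL
"""

for name, old_size, new_size in size_changes(parse_symbols(O1), parse_symbols(O2)):
    print(f"{name}: {old_size:#x} -> {new_size:#x}")
```

On the three sample lines this prints `dsymm_oltcopy_SKYLAKEX: 0x1c7 -> 0x66d`. To run it on the real libraries, the saved output of `objdump -T` for /tmp/o1 and /tmp/o2 could be read from files and passed to `parse_symbols` in place of the sample strings.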