From: ludo@gnu.org (Ludovic Courtès)
Subject: Challenge substitute servers!
Date: Tue, 20 Oct 2015 01:09:33 +0200
Message-ID: <87zize1mmq.fsf@gnu.org>
List-Id: "Development of GNU Guix and the GNU System distribution."
To: guix-devel

Hello!

I’m happy to announce the new ‘guix challenge’ command!
(Documentation below.)

The goal of the command is, ideally, to report malicious or corrupt
substitute servers.  In practice though, many package build processes
are still non-deterministic, so chances are that, when a discrepancy is
reported by ‘guix challenge’, it is due to a non-deterministic build
process, which is something we must fix.

Here’s what it reports on my machine (compared to hydra.gnunet.org,
which is one of the build machines behind hydra.gnu.org):

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix challenge --substitute-urls=http://hydra.gnunet.org
finding garbage collector roots...
determining live/dead paths...
updating list of substitutes from 'http://hydra.gnunet.org'... 100.0%
/gnu/store/1c96zwwg6lnh6v9n3m0q30pg0x2m0v5c-openssl-1.0.2d contents differ:
  local hash: 0725l22r5jnzazaacncwsvp9kgf42266ayyp814v7djxs7nk963q
  http://hydra.gnunet.org/nar/1c96zwwg6lnh6v9n3m0q30pg0x2m0v5c-openssl-1.0.2d: 1zy4fmaaqcnjrzzajkdn3f5gmjk754b43qkq47llbyak9z0qjyim
/gnu/store/3b7sjkz1ps719fkl53mcxzmjysd95i5c-emacs-deferred-0.3.2 contents differ:
  local hash: 0hh4d5plz4ph93yqcqvyy8dfbdidwdmy93kikjgnkglxr0lw8qss
  http://hydra.gnunet.org/nar/3b7sjkz1ps719fkl53mcxzmjysd95i5c-emacs-deferred-0.3.2: 0d0k5sl6zwbwb4yj0ns118bq6nxqaa8d4kr189w8lb7sma4ji5y1
/gnu/store/5zavfkmrax6z93q85q5nifbwkfz4704m-git-2.5.0 contents differ:
  local hash: 00p3bmryhjxrhpn2gxs2fy0a15lnip05l97205pgbk5ra395hyha
  http://hydra.gnunet.org/nar/5zavfkmrax6z93q85q5nifbwkfz4704m-git-2.5.0: 069nb85bv4d4a6slrwjdy8v1cn4cwspm3kdbmyb81d6zckj3nq9f
/gnu/store/76yq0qvcbjjljk8my6x06ayssph573xx-pius-2.1.1 contents differ:
  local hash: 0k4v3m9z1zp8xzzizb7d8kjj72f9172xv078sq4wl73vnq9ig3ax
  http://hydra.gnunet.org/nar/76yq0qvcbjjljk8my6x06ayssph573xx-pius-2.1.1: 1cy25x1a4fzq5rk0pmvc8xhwyffnqz95h2bpvqsz2mpvlbccy0gs
/gnu/store/avrrq8sl1ihq0vrhjrilszgmg7ifhqdw-guile-ssh-0.8.0 contents differ:
  local hash: 1qxjnirrd24djxwrh1wmsdg3qhhigymaqg673nri0d5dn87dihmc
  http://hydra.gnunet.org/nar/avrrq8sl1ihq0vrhjrilszgmg7ifhqdw-guile-ssh-0.8.0: 1js93xgwdrbq85mzl655dc7f6hb742bmvv46jij7b73npfzzjml9
/gnu/store/bc081jj8rx640ism3gm6wlpb7jamg080-dmd-0.2.01 contents differ:
  local hash: 1ngmjzscwva6w0wy3ahmq4r6svdpa8pq7f5kz273qld3k2xrg54v
  http://hydra.gnunet.org/nar/bc081jj8rx640ism3gm6wlpb7jamg080-dmd-0.2.01: 0hfcg2qvdfrzyfgwn1p1j7k0vv93xk5z4y9kr22gyakf9f9jjfix
/gnu/store/cmb889dhvnx9kk6gdvqjm14w4j6pqalq-guix-0.8.3.abbe2c6 contents differ:
  local hash: 0p3mr40xk9q8bw26h8gkz2nx6n0csrmdh05zb1x85i0zs9pfd2f8
  http://hydra.gnunet.org/nar/cmb889dhvnx9kk6gdvqjm14w4j6pqalq-guix-0.8.3.abbe2c6: 1aca8zkvcff38y3s9khf565mlcz8xgl8p6c93jbrin87k8z2islq
/gnu/store/fkl97li2g7gqyw5bq09q2r0hkxla54lq-ath9k-htc-firmware-1.4.0 contents differ:
  local hash: 0r4lysb0skx7dpyh1qsk2mamkwwd4a7yldr1yy2dpr7bc0nk4z8n
  http://hydra.gnunet.org/nar/fkl97li2g7gqyw5bq09q2r0hkxla54lq-ath9k-htc-firmware-1.4.0: 1aia3539xvzaff4s11ga35f7iahch6rbryqzh5adlmx3gffpy2y7
/gnu/store/j0qc3ghc7ajja6k9c35y8ssxjzxrsy95-emacs-debbugs-0.7 contents differ:
  local hash: 0wfczpznl9pdalf4sp23dfp63pvrk9jsw7znvw8axj24wv9rbk4c
  http://hydra.gnunet.org/nar/j0qc3ghc7ajja6k9c35y8ssxjzxrsy95-emacs-debbugs-0.7: 19bagzab1d6xabhy3av863ab06zkcxya4lc7q62za74p7qpj112g
/gnu/store/n5q0gbn01w5m2ic9rc2xq4kj2faglg6s-emacs-typo-1.1 contents differ:
  local hash: 1yly4y92vphdsnqnmdqwsz019fy11jkbz31sl6ysly9j3fmnhq2m
  http://hydra.gnunet.org/nar/n5q0gbn01w5m2ic9rc2xq4kj2faglg6s-emacs-typo-1.1: 1mj1h56pwf5cqbn397crlgsp7hdwjvrbqp4czha429y7pharjapl
/gnu/store/q6xnjg9fd7lfhh7rq2l98grlyq7nbcf0-emacs-butler-0.2.4 contents differ:
  local hash: 14lb6cdfngjjlxrnipq961hbgnhwp47ap904a9mm0dj4q7pj23n2
  http://hydra.gnunet.org/nar/q6xnjg9fd7lfhh7rq2l98grlyq7nbcf0-emacs-butler-0.2.4: 0vk41961frfvm5nrkmljhli3ibsz2lv0s09s6m29aiymixx6sw9g
--8<---------------cut here---------------end--------------->8---

If we diff as explained in the doc below, we see that the Git
discrepancies are due to timestamps in Perl’s POD files as well as
‘tclIndex’ files whose entries are sorted by inode number.  For the
Emacs modes, the problem is the autogenerated autoloads files, which
include some sort of timestamp as well.

You’re welcome to help fix these issues!

Comments welcome!

Ludo’.

6.12 Invoking ‘guix challenge’
==============================

Do the binaries provided by this server really correspond to the source
code it claims to build?  Is this package’s build process deterministic?
These are the questions the ‘guix challenge’ command attempts to answer.

   The former is obviously an important question: Before using a
substitute server (*note Substitutes::), you’d rather _verify_ that it
provides the right binaries, and thus _challenge_ it.  The latter is
what enables the former: If package builds are deterministic, then
independent builds of the package should yield the exact same result,
bit for bit; if a server provides a binary different from the one
obtained locally, it may be either corrupt or malicious.

   We know that the hash that shows up in ‘/gnu/store’ file names is the
hash of all the inputs of the process that built that file or directory
(compilers, libraries, build scripts, etc.; *note Introduction::).
Assuming deterministic build processes, one store file name should map
to exactly one build output.  ‘guix challenge’ checks whether there is,
indeed, a single mapping by comparing the build outputs of several
independent builds of any given store item.

   The command’s output looks like this:

     $ guix challenge --substitute-urls="http://hydra.gnu.org http://guix.example.org"
     updating list of substitutes from 'http://hydra.gnu.org'... 100.0%
     updating list of substitutes from 'http://guix.example.org'... 100.0%
     /gnu/store/…-openssl-1.0.2d contents differ:
       local hash: 0725l22r5jnzazaacncwsvp9kgf42266ayyp814v7djxs7nk963q
       http://hydra.gnu.org/nar/…-openssl-1.0.2d: 0725l22r5jnzazaacncwsvp9kgf42266ayyp814v7djxs7nk963q
       http://guix.example.org/nar/…-openssl-1.0.2d: 1zy4fmaaqcnjrzzajkdn3f5gmjk754b43qkq47llbyak9z0qjyim
     /gnu/store/…-git-2.5.0 contents differ:
       local hash: 00p3bmryhjxrhpn2gxs2fy0a15lnip05l97205pgbk5ra395hyha
       http://hydra.gnu.org/nar/…-git-2.5.0: 069nb85bv4d4a6slrwjdy8v1cn4cwspm3kdbmyb81d6zckj3nq9f
       http://guix.example.org/nar/…-git-2.5.0: 0mdqa9w1p6cmli6976v4wi0sw9r4p5prkj7lzfd1877wk11c9c73
     /gnu/store/…-pius-2.1.1 contents differ:
       local hash: 0k4v3m9z1zp8xzzizb7d8kjj72f9172xv078sq4wl73vnq9ig3ax
       http://hydra.gnu.org/nar/…-pius-2.1.1: 0k4v3m9z1zp8xzzizb7d8kjj72f9172xv078sq4wl73vnq9ig3ax
       http://guix.example.org/nar/…-pius-2.1.1: 1cy25x1a4fzq5rk0pmvc8xhwyffnqz95h2bpvqsz2mpvlbccy0gs

In this example, ‘guix challenge’ first scans the store to determine the
set of locally-built derivations (as opposed to store items that were
downloaded from a substitute server) and then queries all the substitute
servers.  It then reports those store items for which the servers
obtained a result different from the local build.

   As an example, ‘guix.example.org’ always gets a different answer.
Conversely, ‘hydra.gnu.org’ agrees with local builds, except in the case
of Git.  This might indicate that the build process of Git is
non-deterministic, meaning that its output varies as a function of
various things that Guix does not fully control, in spite of building
packages in isolated environments (*note Features::).  The most common
sources of non-determinism include the addition of timestamps in build
results, the inclusion of random numbers, and directory listings sorted
by inode number.  See , for more information.

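   The reading of the sample report above (one server disagreeing on
every item, the other only on Git) can be automated.  What follows is a
minimal sketch, not part of Guix: the function names ‘parse_report’ and
‘disagreements’ are hypothetical, and the parser assumes the textual
layout shown in the sample output, which may differ in other versions of
the command.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Sample report copied from the manual excerpt; the elided store
# prefixes ("…") are kept exactly as they appear there.
SAMPLE = """\
updating list of substitutes from 'http://hydra.gnu.org'... 100.0%
updating list of substitutes from 'http://guix.example.org'... 100.0%
/gnu/store/…-openssl-1.0.2d contents differ:
  local hash: 0725l22r5jnzazaacncwsvp9kgf42266ayyp814v7djxs7nk963q
  http://hydra.gnu.org/nar/…-openssl-1.0.2d: 0725l22r5jnzazaacncwsvp9kgf42266ayyp814v7djxs7nk963q
  http://guix.example.org/nar/…-openssl-1.0.2d: 1zy4fmaaqcnjrzzajkdn3f5gmjk754b43qkq47llbyak9z0qjyim
/gnu/store/…-git-2.5.0 contents differ:
  local hash: 00p3bmryhjxrhpn2gxs2fy0a15lnip05l97205pgbk5ra395hyha
  http://hydra.gnu.org/nar/…-git-2.5.0: 069nb85bv4d4a6slrwjdy8v1cn4cwspm3kdbmyb81d6zckj3nq9f
  http://guix.example.org/nar/…-git-2.5.0: 0mdqa9w1p6cmli6976v4wi0sw9r4p5prkj7lzfd1877wk11c9c73
/gnu/store/…-pius-2.1.1 contents differ:
  local hash: 0k4v3m9z1zp8xzzizb7d8kjj72f9172xv078sq4wl73vnq9ig3ax
  http://hydra.gnu.org/nar/…-pius-2.1.1: 0k4v3m9z1zp8xzzizb7d8kjj72f9172xv078sq4wl73vnq9ig3ax
  http://guix.example.org/nar/…-pius-2.1.1: 1cy25x1a4fzq5rk0pmvc8xhwyffnqz95h2bpvqsz2mpvlbccy0gs
"""

HEADER_SUFFIX = " contents differ:"
LOCAL_PREFIX = "local hash: "


def parse_report(text):
    """Return {store_item: {"local": hash, "servers": {host: hash}}}."""
    report = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(HEADER_SUFFIX):
            # A new store item starts here.
            current = line[: -len(HEADER_SUFFIX)]
            report[current] = {"local": None, "servers": {}}
        elif current is None:
            # Progress lines printed before the first item are skipped.
            continue
        elif line.startswith(LOCAL_PREFIX):
            report[current]["local"] = line[len(LOCAL_PREFIX):]
        elif line.startswith(("http://", "https://")):
            # "<nar URL>: <hash>"; split on the last ": " only.
            url, _, digest = line.rpartition(": ")
            report[current]["servers"][urlsplit(url).netloc] = digest
    return report


def disagreements(report):
    """Map each server host to the store items where it differs from the local build."""
    result = defaultdict(list)
    for item, entry in report.items():
        for host, digest in entry["servers"].items():
            if digest != entry["local"]:
                result[host].append(item)
    return dict(result)


if __name__ == "__main__":
    for host, items in disagreements(parse_report(SAMPLE)).items():
        print(host, "differs on", len(items), "item(s):", ", ".join(items))
```

   A server that differs on a single item, as ‘hydra.gnu.org’ does here,
points towards a non-deterministic build of that item; a server that
differs on everything deserves closer scrutiny.  The sketch only
summarizes the report; it cannot tell the two causes apart.
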
   To find out what’s wrong with this Git binary, we can do something
along these lines (*note Invoking guix archive::):

     $ wget -q -O - http://hydra.gnu.org/nar/…-git-2.5.0 \
        | guix archive -x /tmp/git
     $ diff -ur /gnu/store/…-git-2.5.0 /tmp/git

   The ‘diff’ command shows the difference between the files resulting
from the local build, and the files resulting from the build on
‘hydra.gnu.org’ (*note Comparing and Merging Files: (diffutils)Overview.).
The ‘diff’ command works great for text files.  When binary files
differ, a better option is Diffoscope (http://diffoscope.org/), a tool
that helps visualize differences for all kinds of files.

   Once you’ve done that work, you can tell whether the differences are
due to a non-deterministic build process or to a malicious server.  We
try hard to remove sources of non-determinism in packages to make it
easier to verify substitutes, but of course, this is a process, one that
involves not just Guix but a large part of the free software community.
In the meantime, ‘guix challenge’ is one tool to help address the
problem.

   If you are writing packages for Guix, you are encouraged to check
whether ‘hydra.gnu.org’ and other substitute servers obtain the same
build result as you did with:

     $ guix challenge PACKAGE

... where PACKAGE is a package specification such as ‘guile-2.0’ or
‘glibc:debug’.

   The general syntax is:

     guix challenge OPTIONS [PACKAGES…]

   The one option that matters is:

‘--substitute-urls=URLS’
     Consider URLS the whitespace-separated list of substitute source
     URLs to compare to.
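
   Before reading a full recursive diff of the two trees obtained in the
manual comparison step (the local store item and the archive unpacked
from the server), it can help to list only the files that actually
differ.  The sketch below is one possible way to do that with the Python
standard library; it is not part of Guix, and the function names
‘file_digests’ and ‘differing_files’ are hypothetical.  Note that the
per-file SHA-256 digests it computes are unrelated to the hashes printed
by ‘guix challenge’, which cover the whole archive.

```python
import hashlib
import os


def file_digests(root):
    """Map each file path (relative to root) to a digest of its contents.

    Symbolic links to files are recorded by their target instead of being
    followed; symbolic links to directories are not inspected in this sketch.
    """
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                digests[rel] = "link:" + os.readlink(path)
            else:
                with open(path, "rb") as handle:
                    digests[rel] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def differing_files(local_root, remote_root):
    """Return sorted relative paths missing on one side or differing in content."""
    local = file_digests(local_root)
    remote = file_digests(remote_root)
    return sorted(
        path
        for path in local.keys() | remote.keys()
        if local.get(path) != remote.get(path)
    )
```

   Calling ‘differing_files’ on the local store item and on ‘/tmp/git’
would return the short list of files worth passing to ‘diff’ or
Diffoscope.  Timestamps embedded in generated documentation, for
instance, would show up as a handful of differing files rather than as
pages of diff output.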