From: Björn Höfling
To: Pjotr Prins
Cc: guix-devel@gnu.org
Subject: Re: Treating tests as special case
Date: Thu, 5 Apr 2018 08:21:15 +0200
Message-ID: <20180405082115.60e604a6@alma-ubu>
In-Reply-To: <20180405052439.GA30291@thebird.nl>

On Thu, 5 Apr 2018 07:24:39 +0200
Pjotr Prins wrote:

> Last night I was watching Rich Hickey's on Specs and deployment. It is
> a very interesting talk in many ways, recommended. He talks about
> tests at 1:02 into the talk:
>
> https://www.youtube.com/watch?v=oyLBGkS5ICk
>
> and he gave me a new insight which rang immediately true. He said:
> what is the point of running tests everywhere? If two people test the
> same thing, what is the added value of that? (I paraphrase)
>
> With Guix a reproducibly building package generates the same Hash on
> all dependencies.
> Running the same tests every time on that makes no
> sense.
>
> And this hooks in with my main peeve about building from source. The
> building takes long enough. Testing takes incredibly long with many
> packages (especially language related) and are usually single core
> (unlike the build). It is also bad for our carbon foot print. Assuming
> everyone uses Guix on the planet, is that where we want to end up?
>
> Burning down the house.
>
> Like we pull substitutes we could pull a list of hashes of test cases
> that are known to work (on Hydra or elsewhere). This is much lighter
> than storing substitutes, so when the binaries get removed we can
> still retain the test hashes and have fast builds. Also true for guix
> repo itself.
>
> I know there are two 'inputs' I am not accounting for: (1) hardware
> variants and (2) the Linux kernel. But, honestly, I do not think we
> are in the business of testing those. We can assume these work. If
> not, any issues will be found in other ways (typically a segfault ;).
> Our tests are generally meaningless when it comes to (1) and (2). And
> packages that build differently on different platforms, like openblas,
> we should opt out on.
>
> I think this would be a cool innovation (in more ways than one).
>
> Pj.

Hi Pjotr,

great ideas!

Last night I did a

    guix pull && guix package -i git

We have substitutes, right? Yeah, but someone had updated git, I hadn't
configured berlin.guixsd.org on my new machine yet, and hydra didn't
have any substitutes (build wasn't started yet?).

Building git was relatively fast, but all the tests took ages. And it
was just git. It should work. The git maintainers ran the tests. Marius
ran the tests when he updated it in commit 5c151862c. And that should
be enough testing. Let's skip the tests.

On the other hand, if I create a new package definition and forget to
run the tests. If upstream is too sloppy, did not run the tests and had
no continuous integration.
Who will run the tests then?

What if I build my package with different sources?

And you mentioned different environment conditions like machine and
kernel.

We still have "only" 70-90% reproducibility. The non-reproducible
remainder should keep tests enabled. And the question "is my package
reproducible?" is not trivial to answer, and is not computable.

We saw tests that failed in only 2% of the runs and passed in the other
98%. If we ran those tests "just once", we could not tell that there is
a problem (assuming the problem really is in the software, not just the
tests).

There could also be practical problems with that: If everyone writes
their software nicely with autoconfigure and we just have a
"make && make test && make install", it's easy to skip the tests. But
for more complicated things we have to find a way to tell the build
system how to skip tests.

Björn
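
To put a rough number on the 2% case mentioned in the message: the following is a minimal sketch, assuming each test run fails independently with a fixed probability of 0.02. The independence assumption and the function names `prob_all_pass` and `runs_needed` are illustrative and do not come from the thread.

```python
import math


def prob_all_pass(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent test runs all pass."""
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, confidence: float) -> int:
    """Smallest number of runs for which at least one failure
    shows up with probability >= `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    rate = 0.02  # assumed per-run failure probability (the 2% from the message)
    for n in (1, 10, 50, 100):
        print(f"{n:3d} runs: all pass with probability {prob_all_pass(rate, n):.3f}")
    print("runs for a 95% chance of seeing a failure:", runs_needed(rate, 0.95))
```

Under these assumptions a single run passes 98% of the time, so a test result recorded "just once" would almost always look clean, and it would take about 149 runs to have a 95% chance of seeing the failure at all.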