From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:2:4a6f::]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) by ms0.migadu.com with LMTPS id YBNINUiIY2BY4AAAgWs5BA (envelope-from ) for ; Tue, 30 Mar 2021 22:21:28 +0200 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id cM/rLkiIY2DoTwAAB5/wlQ (envelope-from ) for ; Tue, 30 Mar 2021 20:21:28 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 35468250FB for ; Tue, 30 Mar 2021 22:21:28 +0200 (CEST) Received: from localhost ([::1]:38648 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lRKrf-0005SK-DZ for larch@yhetil.org; Tue, 30 Mar 2021 16:21:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41418) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lRKrH-0005S9-2j for guix-devel@gnu.org; Tue, 30 Mar 2021 16:21:03 -0400 Received: from imta-37.everyone.net ([216.200.145.37]:45296 helo=imta-38.everyone.net) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lRKr8-0001Rm-Lf for guix-devel@gnu.org; Tue, 30 Mar 2021 16:21:02 -0400 Received: from pps.filterd (localhost.localdomain [127.0.0.1]) by imta-38.everyone.net (8.16.0.43/8.16.0.43) with SMTP id 12UKHlTu006343; Tue, 30 Mar 2021 13:20:49 -0700 X-Eon-Originating-Account: BV0F3Xwcb6c51IMGVhv1zEsY_CP1O4QfLSs0bPLR2uQ X-Eon-Dm: m0117124.ppops.net Received: by m0117124.mta.everyone.net (EON-AUTHRELAY2 - 5a81d0f2) id m0117124.6062c7b8.dda0; Tue, 30 Mar 2021 13:20:41 -0700 X-Eon-Sig: AQMHrIJgY4gZDTwEGwIAAAAD,560e93c7cdf3036d041327b53f9f2012 X-Eip: KjrP3b8TWCWJ52frcCj7WF4rjJA3ecxZOLcVJql0OBU Date: Tue, 30 Mar 2021 22:20:30 +0200 From: Bengt Richter To: ilmu Subject: Re: Deep vs Shallow trace: Removing the tradeoff? Message-ID: <20210330202030.GA2811@LionPure> References: <01A29E7C-0E60-4518-A775-92B6F6A8B816@lepiller.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) X-Proofpoint-ORIG-GUID: fqI0j_JI5C05A6mnd0SZeV-GCXXCyclM X-Proofpoint-GUID: fqI0j_JI5C05A6mnd0SZeV-GCXXCyclM X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.369, 18.0.761 definitions=2021-03-30_09:2021-03-30, 2021-03-30 signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 impostorscore=0 mlxscore=0 malwarescore=0 suspectscore=0 phishscore=0 lowpriorityscore=0 adultscore=0 clxscore=1034 priorityscore=1501 bulkscore=0 mlxlogscore=999 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2103300000 definitions=main-2103300148 Received-SPF: pass client-ip=216.200.145.37; envelope-from=bokr@oz.net; helo=imta-38.everyone.net X-Spam_score_int: -16 X-Spam_score: -1.7 X-Spam_bar: - X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: guix-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Bengt Richter Cc: "guix-devel@gnu.org" Errors-To: guix-devel-bounces+larch=yhetil.org@gnu.org Sender: "Guix-devel" X-Migadu-Flow: FLOW_IN ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1617135688; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post; bh=9yrvrXmeanmROqJTTiNgCkrnE/pYynGTW20UeCcRVrg=; b=gQXCJ21vT/kV3JIhPJYL1HNqooX9OFMsKu749UmmQCpQfp08uuDIEVb7ziTaPPsIdz+/C2 n2A/3LO78I5xgKw3xfIEYMjsXXkLxORrsM9Qcf/ba/ux5k0E5YEn5sBaSCe4iWUeeAyIIp XvLEM65tmQrC07X92LDqQIemtv2WQOy0Ccg/9ObvkaN+VH5iIqIERbq6WwH2F1Ynx/4ZwD S4M5sa44BMPE0DZoKjujcRvYUNkC2JcbS5MwtnVHjTZMLoR7SZ/65iw4oAJ1w1Gk4pSngU fY//tlFq14X4JhpKGpKRdPpO03GQ9pq6NkW3YRoRKyM0nfpJNVXMzyBoV5iy7A== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1617135688; a=rsa-sha256; cv=none; b=Ei5luaR1ogI2HpTohVs6kU2m/WAg/ayJIFpA9DpcF7w3QV+HsR8IHQWnAy9KFJHJPR34uj nVZFIwrvNJ7KOjWKsKuv4XW5geAKD0rY6gs9igUlnDBrkuXUMgyg7OEx2Gy1Ql544AP5CT 14NhpcMy9rLJyPbMLqJRDAVziM0e9QPOU+HtKFtMnK6xQxInW78n1oWyAoKjHDcIP/Glu9 gnO/Iw2S1/iumEZ/cItgQ1dSHTRtUi2+2GL2VjDGMVuMqDCuv7cIwsqgp0Un+wtIByy/6u XRMXStTY84UtSojT9F/QMB1HVkQhl41FmHuLwZ4NHeTQH0JZGkBFQAsl5/bjZw== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=none; dmarc=none; spf=pass (aspmx1.migadu.com: domain of guix-devel-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-devel-bounces@gnu.org X-Migadu-Spam-Score: -0.92 Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=none; spf=pass (aspmx1.migadu.com: domain of guix-devel-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-devel-bounces@gnu.org X-Migadu-Queue-Id: 35468250FB X-Spam-Score: -0.92 X-Migadu-Scanner: scn0.migadu.com X-TUID: RgyG4/fNTMgD Hi ilmu, On +2021-03-28 23:16:23 +0000, ilmu wrote: > First of all, thanks for engaging with me, sorry for not putting more work into the initial email. Note that I have attached inline images from the paper, the paper is also attached to this email. > An interesting paper.. I have long been wondering if reproducibility is too exclusively focused on the source transformation chain and the tools implementing that. E.g., for programs that execute like side-effect-free functions, why not an early-cutoff based on a functionally well-tested upgrade written in a different language, with a totally different source? If you source the following in a subshell cat ./early-cutoff.txt --8<---------------cut here---------------start------------->8--- export HELLO_TO=World strace -qqxs120 -e write \ guile -c '(let* ((msg (string-append "Hello " (getenv "HELLO_TO") "\n"))) (display msg))' |& tr , $'\n'|grep '^ *"' strace -qqxs120 -e write \ echo "Hello $HELLO_TO" |& tr , $'\n'|grep '^ *"' --8<---------------cut here---------------end--------------->8--- Like, (. early-cutoff.txt) --8<---------------cut here---------------start------------->8--- "Hello World\n" "Hello World\n" --8<---------------cut here---------------end--------------->8--- It informally shows that in a particular context, --8<---------------cut here---------------start------------->8--- guile -c '(let* ((msg (string-append "Hello " (getenv "HELLO_TO") "\n"))) (display msg))' --8<---------------cut here---------------end--------------->8--- can substitute for --8<---------------cut here---------------start------------->8--- echo "Hello $HELLO_TO" --8<---------------cut here---------------end--------------->8--- So, what kinds of build sequences could be cut off early if you ignore how you produced the code (i.e. just compile whatever source you like with whatever tools you like) so long as the resultant code passed the functionality tests? > > What I understand from the above is that there is not so much of a difference between Bazel and Nix/Guix. The "shallow" tracing is a deep tracing. It just records something different. > > [classification.png] > > The framework they present is that a build system can be classified based on how it decides whether a rebuild is necessary and how it schedules the necessary rebuilds (i.e. how it sorts the work to be done). They consider these orthogonal strategies a complete description of "what a build system is". > > [constructive_traces.png] > > The constructive trace allows you to keep intermediate artifacts but creates the extra overhead of managing these. > > [deep_constructive_traces.png] > > Since a deep constructive trace only looks at the terminal dependencies you have something of a merkle-dag. What I was suggesting is to be able to do "shallow builds" while still supporting "early cutoff". For reference: > > [early_cutoff.png] > > Early cutoff is a very desirable property that nix does not have (and I assume therefore that neither does guix). > > [shallow_builds.png] > > Basically, if we have a proof that the immediate dependencies are complete then we don't need to fetch anything deeper in the dependency tree. > > > In Nix/Guix, we recursively compute the hash of packages from their inputs, receipe and other stuff. This hash is unrelated to the checksum of the result and changes as soon as one element changes. It essentially records the whole source dependency graph. > > From what you write, Bazel records the checksum of inputs and receipe, so it essentially encodes the whole binary dependency graph. It's not "shallow", as a real change down the graph can have repercussions higher up. > > Actually Eelco described something similar for Nix in his thesis, the intensionnal model. As you noticed, different receipes or inputs can lead to the same output (bitwise). The idea is to declare the expected checksum of the result, and use this to compute the store item name. As long as you apply changes that do not affect the output, different receipes (updates) can be used to generate the same item. So if you update a package, it might affect its output checksum, so you need to change it. Dependents might not be affected, in which case you can block update propagation there. > > The question being how to monitor that packages need to be rebuilt or have the same hash? > > I think the pictures above should clarify, I should have looked at the paper again before sending my previous email, I read it quite a long time ago.. The property I was trying to recover in the nix-style build systems is this idea of an early cutoff. The strategy for achieving that is to use constructive traces with depth 1 (what I was calling "shallow traces") and then recursive succinct arguments of knowledge for retaining the shallow build property that you get from the "infinitely deep traces". > > > I don't really understand why it must be zero-knowledge. My understanding is that to have a zero-knowledge proof, you first need a proof of the claim, then find a clever way to proove you have the proof, without disclosing it. Not sure it's helpful here. If you have a proof that you don't need to rebuild, why not share that proof to everyone in the first place? > > Sure, you only need the succinctness property. I should have been more precise, what I am mainly after is resolving the tradeoff between shallow and deep traces. > > I hope that this discussion can at least be helpful for people that are not familiar with these ideas, even if - as it seems from your response - the idea does not have merit. > > Kind regards, > - ilmu -- Regards, Bengt Richter