From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mark H Weaver Subject: Re: Trustworthiness of build farms (was Re: CDN performance) Date: Fri, 28 Dec 2018 02:12:01 -0500 Message-ID: <87y38auntv.fsf@netris.org> References: <87y38dyq8v.fsf@ITSx01.pdp10.guru> Mime-Version: 1.0 Content-Type: text/plain Return-path: Received: from eggs.gnu.org ([208.118.235.92]:57958) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gcmWe-0002mc-G9 for guix-devel@gnu.org; Fri, 28 Dec 2018 02:25:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gcmKU-0002n7-FJ for guix-devel@gnu.org; Fri, 28 Dec 2018 02:13:14 -0500 Received: from world.peace.net ([64.112.178.59]:42222) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gcmKU-0002mN-Bw for guix-devel@gnu.org; Fri, 28 Dec 2018 02:13:10 -0500 In-Reply-To: <87y38dyq8v.fsf@ITSx01.pdp10.guru> (Jeremiah's message of "Wed, 26 Dec 2018 02:29:52 +0000") List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: "Guix-devel" To: Jeremiah@pdp10.guru Cc: guix-devel@gnu.org Hi Jeremiah, Jeremiah@pdp10.guru writes: >> To truly solve that problem, we need bug-free compilers. > Impossible for all but the simplest of languages as the complexity of > implementing a compiler/assembler/interpreter is ln(c)+a but the > complexity of implementing a bug-free compiler/assembler/interpreter > is (e^(c))! - a. Where a is the complexity cost of supporting it's > host architecture. Where are you getting those complexity expressions from? Can you cite references to back them up? If not, can you explain how you arrived at them? What is 'c'? >> In practice, this requires provably correct compilers. > Which in practice turn out *NOT* to be bug free nor complete in regards > to the standard specification. If you're referring to the bugs found in CompCert, the ones I know about were actually bugs in the unverified parts of the toolchain. In the past, its frontend was unverified, and several bugs were found there. Even today, it produces assembly code, and depends on an unverified assembler and linker. Bugs can also exist in the specifications themselves, of course. However, in the areas covered by the proofs, the results are quite dramatic. For example, in the paper "Finding and Understanding Bugs in C Compilers" by Xuejun Yang, Yang Chen, Eric Eide, and John Regehr, https://web.stanford.edu/class/cs343/resources/finding-bugs-compilers.pdf the authors state on page 6: The striking thing about our CompCert results is that the middle- end bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible benefits for compiler users > Now, don't get me wrong; provably correct > compilers are a correct step in the right direction but the real > solution is to first generate simplified languages that don't have > undefined behavior and human model first behavior. I'm not sure what "and human model first behavior" means, but if you mean that the semantics of languages should strive to match what a human would naturally expect, avoiding surprising or unintuitive behavior, I certainly agree. I consider Standard ML, and some subsets of Scheme and Lisp, to be such languages. >> Does that make sense? > Absolutely, certainly something possible to do; but extremely human > effort intensive and I don't see anyone willing to throw 2+ years of > human effort at the problem outside of non-free Businesses like > CompCert. If I understand correctly, what you don't expect to happen has already been done. CakeML is free software, and formally proven correct all the way down to the machine code. Moreover, it implements a language with an exceptionally clear semantics and no undefined behavior. Anyway, you've made it fairly clear that you're not interested in this line of work, and that's fine. I appreciate the work you're doing nonetheless. Regards, Mark