From mboxrd@z Thu Jan 1 00:00:00 1970
From: Leo Famulari
Subject: bug#22533: Non-determinism in python-3 ".pyc" bytecode
Date: Tue, 2 Feb 2016 00:15:44 -0500
Message-ID: <20160202051544.GA11744@jasmine>
To: 22533@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
List-Id: Bug reports for GNU Guix

While preparing a package for borg [0], I found that the built output
was not reproducible.

The problem is that the bytecode compiler [1] for Python 3.4.3 (our
current version) encodes the mtime of the corresponding Python source
file in the output. This is described in PEP-3147 [2], and the
responsible Python code is referenced below [3].

I tested a few of our existing python-3 packages: python-ccm,
python-pysam, and python-scripttest all exhibit the same problem.

We fixed this in python-2 with the patch
python-2.7-source-date-epoch.patch, but I don't know how to write the
equivalent patch for python-3. Can somebody write it?

I asked about this on #debian-reproducible and they said that it wasn't
an issue for Debian, since they don't ship bytecode but generate it at
install time. Of course, that doesn't really apply to Guix, where the
bytecode is part of the build output.

I used diffoscope-34 to inspect the build outputs to find this, and you
can see the report here:
https://famulari.name/misc/7c55c9e97f668234ddea50299d986f14/borg-diffoscope-report.html

The difference first shows up in the file
...-borg-0.30.0/lib/python3.4/site-packages/__pycache__/site.cpython-34.pyc.
The first 4 bytes are the "magic number" described in PEP-3147 (a 2-byte
version tag followed by CR LF), which identifies the version of the
bytecode format. The next 4 bytes are the problematic timestamp, as
described in PEP-3147.

[0] http://borgbackup.github.io/
[1] https://docs.python.org/3/library/py_compile.html
[2] https://www.python.org/dev/peps/pep-3147/
[3] Check out the Guix git commit 4efc8eb27502c, and from there:
    $ tar xf $(./pre-inst-env guix build --source python-3)
    $ sed -n 139,140p Python-3.4.3/Lib/py_compile.py
            bytecode = importlib._bootstrap._code_to_bytecode(
                code, source_stats['mtime'], source_stats['size'])
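
To make the header layout concrete, here is a minimal sketch that packs
and parses the start of a ".pyc" file. It assumes the 12-byte header
that Python 3.4 writes (4 bytes of magic, then the 4-byte mtime and the
4-byte source size passed to _code_to_bytecode in [3]); Python 3.7 and
later add a flags field, so it does not apply there unchanged. The names
parse_pyc_header and build_pyc_header are hypothetical helpers of mine,
not part of Python, and the sample values are only illustrative.

```python
import struct

# Assumed layout of a Python 3.4 .pyc header, little-endian, no padding:
#   2-byte magic number, 2 bytes CR LF, 4-byte mtime, 4-byte source size.
PYC_HEADER = struct.Struct("<H2sII")


def parse_pyc_header(data):
    """Return (magic_number, mtime, source_size) from the start of a .pyc."""
    magic_number, crlf, mtime, source_size = PYC_HEADER.unpack_from(data)
    if crlf != b"\r\n":
        raise ValueError("not a .pyc header: missing CR LF after magic number")
    return magic_number, mtime, source_size


def build_pyc_header(magic_number, mtime, source_size):
    """Hypothetical helper: pack a header with the layout assumed above."""
    return PYC_HEADER.pack(
        magic_number, b"\r\n", mtime & 0xFFFFFFFF, source_size & 0xFFFFFFFF
    )


# Same magic number and source size, but mtimes one second apart
# (illustrative values): the two headers differ only inside bytes 4..7.
first = build_pyc_header(3310, 1454390144, 2048)
second = build_pyc_header(3310, 1454390145, 2048)
differing = [i for i in range(len(first)) if first[i] != second[i]]
print(differing)
```

Running parse_pyc_header on the first 12 bytes of the two site.cpython-34.pyc
files from the report should, under the same assumption, return the same
magic number and size but different mtimes.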