From mboxrd@z Thu Jan 1 00:00:00 1970
From: Leo Famulari
Subject: bug#22533: Non-determinism in python-3 ".pyc" bytecode
Date: Tue, 2 Feb 2016 03:54:39 -0500
Message-ID: <20160202085439.GA14802@jasmine>
References: <20160202051544.GA11744@jasmine>
In-Reply-To: <20160202051544.GA11744@jasmine>
List-Id: Bug reports for GNU Guix
To: 22533@debbugs.gnu.org

On Tue, Feb 02, 2016 at 12:15:44AM -0500, Leo Famulari wrote:
> While preparing a package for borg [0], I found that the built output
> was not reproducible. The problem is that the bytecode compiler [1] for
> Python 3.4.3 (our current version) encodes the mtime of the
> corresponding Python source file in the output. This is described in
> PEP-3147 [2], and the responsible Python code is referenced below [3].
>
> I tested a few of our existing python-3 packages: python-ccm,
> python-pysam, and python-scripttest all exhibit the same problem.
>
> We fixed this in python-2 with the patch
> python-2.7-source-date-epoch.patch, but I don't know how to write this
> patch for python-3.

mark_weaver suggested setting the timestamps of the source files before
building. I think this is a better option if it doesn't break anything.
It would allow the bytecode "staleness" check to work as expected while
keeping the output consistent.

>
> Can somebody write this patch?
>
> I asked about this on #debian-reproducible and they said that it wasn't
> an issue for Debian since they don't ship bytecode, but instead generate
> it at install time. Of course, that doesn't really apply to Guix.
>
> I used diffoscope-34 to inspect the build outputs to find this, and you
> can see the report here:
> https://famulari.name/misc/7c55c9e97f668234ddea50299d986f14/borg-diffoscope-report.html
>
> It's first demonstrated in the file
> ...-borg-0.30.0/lib/python3.4/site-packages/__pycache__/site.cpython-34.pyc.
>
> The first 2 bytes are the "magic numbers" described in PEP-3147, which
> specify the version of the bytecode format. The next 2 bytes are the
> problematic timestamp, as described in the PEP-3147.
>
> [0]
> http://borgbackup.github.io/
>
> [1]
> https://docs.python.org/3/library/py_compile.html
>
> [2]
> https://www.python.org/dev/peps/pep-3147/
>
> [3] Check out the Guix git commit 4efc8eb27502c, and from there:
> $ tar xf $(./pre-inst-env guix build --source python-3)
> $ sed -n 139,140p Python-3.4.3/Lib/py_compile.py
>         bytecode = importlib._bootstrap._code_to_bytecode(
>             code, source_stats['mtime'], source_stats['size'])
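
For illustration, here is a rough sketch of the timestamp idea above. It is
not the Guix build phase and not a tested patch. The helper names
(pin_mtimes, compile_timestamped, pyc_mtime, demo) and the value of
FIXED_MTIME are made up for this example. It pins the mtime of each source
file to a fixed value, byte-compiles the file twice, and reads the mtime
field back out of the .pyc header. On Python 3.7 and later the header has
an extra 4-byte flags field and other invalidation modes exist, so the
sketch asks for the timestamp mode explicitly there.

```python
import os
import py_compile
import struct
import sys
import tempfile

FIXED_MTIME = 1  # hypothetical fixed timestamp, in seconds since the epoch


def pin_mtimes(root, mtime=FIXED_MTIME):
    """Set atime and mtime of every .py file under root to a fixed value."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".py"):
                os.utime(os.path.join(dirpath, name), (mtime, mtime))


def compile_timestamped(source, cfile):
    """Byte-compile source to cfile using mtime-based invalidation."""
    kwargs = {}
    if sys.version_info >= (3, 7):
        # Newer Pythons have hash-based modes too; force the mtime-based one.
        kwargs["invalidation_mode"] = py_compile.PycInvalidationMode.TIMESTAMP
    return py_compile.compile(source, cfile=cfile, doraise=True, **kwargs)


def pyc_mtime(cfile):
    """Return the source mtime recorded in a timestamp-based .pyc header."""
    # Header layout: 4-byte magic, then (3.7+ only) 4-byte flags,
    # then the 4-byte little-endian source mtime.
    offset = 8 if sys.version_info >= (3, 7) else 4
    with open(cfile, "rb") as f:
        header = f.read(offset + 4)
    return struct.unpack("<I", header[offset:offset + 4])[0]


def demo():
    """Compile one pinned source file twice; return (mtime, identical)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        with open(src, "w") as f:
            f.write("x = 1\n")
        pin_mtimes(tmp)
        first = os.path.join(tmp, "first.pyc")
        second = os.path.join(tmp, "second.pyc")
        compile_timestamped(src, first)
        compile_timestamped(src, second)
        with open(first, "rb") as a, open(second, "rb") as b:
            identical = a.read() == b.read()
        return pyc_mtime(first), identical
```

Calling demo() returns the mtime stored in the header and whether the two
compiled files are byte-for-byte identical. Because the source mtime is
pinned rather than removed from the bytecode, the usual staleness check
still has a real value to compare against.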