From: Ricardo Wurmus
Subject: bug#22533: Python bytecode reproducibility
Date: Sun, 04 Mar 2018 13:46:07 +0100
Message-ID: <874llw101c.fsf@elephly.net>
References: <20160202051544.GA11744@jasmine> <87bmqfu44s.fsf@fastmail.com> <87606c23bq.fsf@elephly.net>
List-Id: Bug reports for GNU Guix
To: Gábor Boskovits
Cc: 22533@debbugs.gnu.org

Hi Gábor,

> Nix had this issue, it seems they have a python 3.5 solution, which
> should be easy to adopt: https://github.com/NixOS/nixpkgs/issues/22570.
> WDYT?

Here’s the patch for Nix:

    https://patch-diff.githubusercontent.com/raw/NixOS/nixpkgs/pull/22585.diff

Here are the relevant changes to the Python packages:

* Python 3.4

  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap.py" --replace "source_mtime = int(source_stats['mtime'])" "source_mtime = 1"

* Python 3.5

  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"

* Python 3.6

  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"

For all packages they set these environment variables:

- set PYTHONHASHSEED=0 (for hashes of str, bytes and datetime objects)

- set DETERMINISTIC_BUILD; for conditional patching of the timestamp for
  package builds.  The timestamp is not patched in ad-hoc environments,
  because that would mess with Python’s ability to determine whether to
  compile source files.

They also rebuild all bytecode (with the exception of lib2to3 because it
is Python 2 code) three times, once for each optimization level.

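To illustrate why the timestamp needs pinning at all, here is a small
sketch of my own (it is not part of the Nix patch).  It byte-compiles
the same source file with different mtimes and compares the results.
It assumes Python 3.7 or later, where the timestamp-based .pyc header
is 16 bytes with the mtime at offset 8; on the 3.4 to 3.6 versions
discussed here the header is shorter and the mtime sits at offset 4.
The file name example.py and both helper names are made up for the
example.

```python
import os
import py_compile
import struct
import tempfile


def compile_with_mtime(source_path, pyc_path, mtime):
    """Set the source file's mtime, byte-compile it, and return the .pyc bytes."""
    os.utime(source_path, (mtime, mtime))
    py_compile.compile(
        source_path,
        cfile=pyc_path,
        doraise=True,
        # Force the timestamp-based header even if SOURCE_DATE_EPOCH is set.
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    with open(pyc_path, "rb") as f:
        return f.read()


def embedded_mtime(pyc_bytes):
    """Read the source mtime from a timestamp-based .pyc (Python 3.7+ layout)."""
    # Header: 4-byte magic, 4-byte flags, 4-byte mtime, 4-byte source size.
    return struct.unpack("<I", pyc_bytes[8:12])[0]


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")  # hypothetical module
    with open(src, "w") as f:
        f.write("x = 1\n")

    a = compile_with_mtime(src, os.path.join(tmp, "a.pyc"), 1000000000)
    b = compile_with_mtime(src, os.path.join(tmp, "b.pyc"), 1500000000)
    c = compile_with_mtime(src, os.path.join(tmp, "c.pyc"), 1000000000)

# The recorded mtime follows the source file, so a and b differ in the header.
print(embedded_mtime(a), embedded_mtime(b))
# With the same mtime the two compilations should come out byte-identical.
print(a == b, a == c)
```

The source text is the same in all three runs; only the mtime stored in
the header changes.  As far as I can tell that is the field the
substitutions above force to 1.
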
--8<---------------cut here---------------start------------->8---
+    # Determinism: rebuild all bytecode
+    # We exclude lib2to3 because that's Python 2 code which fails
+    # We rebuild three times, once for each optimization level
+    find $out -name "*.py" | $out/bin/python -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -O -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -OO -m compileall -q -f -x "lib2to3" -i -
--8<---------------cut here---------------end--------------->8---

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
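
P.S. For reference, the three-pass rebuild can also be expressed in
Python itself.  This is only a sketch under my own assumptions, not
what Nix does: it calls compileall.compile_dir with the optimize
argument instead of piping a file list into "python -O -m compileall
-i -", and the helper name rebuild_bytecode and the sample files are
hypothetical.

```python
import compileall
import importlib.util
import os
import re
import tempfile


def rebuild_bytecode(root, exclude="lib2to3"):
    """Force-recompile every .py file under root at optimization levels 0, 1 and 2."""
    skip = re.compile(re.escape(exclude))  # paths matching this are not compiled
    ok = True
    for level in (0, 1, 2):  # roughly: python, python -O, python -OO
        ok = compileall.compile_dir(
            root, force=True, quiet=1, rx=skip, optimize=level
        ) and ok
    return bool(ok)


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "lib2to3"))
    kept = os.path.join(root, "example.py")  # hypothetical module
    skipped = os.path.join(root, "lib2to3", "legacy.py")  # stands in for lib2to3
    for path in (kept, skipped):
        with open(path, "w") as f:
            f.write("x = 1\n")

    succeeded = rebuild_bytecode(root)
    # One cache file per optimization level is expected for example.py.
    kept_caches = [
        os.path.exists(importlib.util.cache_from_source(kept, optimization=opt))
        for opt in ("", 1, 2)
    ]
    # Nothing under lib2to3 should have been compiled.
    skipped_cached = os.path.exists(os.path.join(root, "lib2to3", "__pycache__"))

print(succeeded, kept_caches, skipped_cached)
```

Each level writes its own file under __pycache__ (plain, opt-1 and
opt-2), which is why one pass per level is needed to cover all three.
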