unofficial mirror of bug-guix@gnu.org 
* bug#22533: Non-determinism in python-3 ".pyc" bytecode
@ 2016-02-02  5:15 Leo Famulari
  2016-02-02  8:54 ` Leo Famulari
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Leo Famulari @ 2016-02-02  5:15 UTC (permalink / raw)
  To: 22533

While preparing a package for borg [0], I found that the built output
was not reproducible. The problem is that the bytecode compiler [1] for
Python 3.4.3 (our current version) encodes the mtime of the
corresponding Python source file in the output. This is described in
PEP-3147 [2], and the responsible Python code is referenced below [3].

I tested a few of our existing python-3 packages: python-ccm,
python-pysam, and python-scripttest all exhibit the same problem.

We fixed this in python-2 with the patch
python-2.7-source-date-epoch.patch, but I don't know how to write this
patch for python-3.

Can somebody write this patch?

I asked about this on #debian-reproducible and they said that it wasn't
an issue for Debian since they don't ship bytecode, but instead generate
it at install time. Of course, that doesn't really apply to Guix.

I used diffoscope-34 to inspect the build outputs to find this, and you
can see the report here:
https://famulari.name/misc/7c55c9e97f668234ddea50299d986f14/borg-diffoscope-report.html

It's first demonstrated in the file
...-borg-0.30.0/lib/python3.4/site-packages/__pycache__/site.cpython-34.pyc.

The first 4 bytes are the "magic number" described in PEP-3147, which
specifies the version of the bytecode format. The next 4 bytes are the
problematic timestamp, followed by 4 bytes holding the source size.
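
A minimal sketch of how that header can be inspected, assuming the
12-byte layout used by CPython 3.3 through 3.6 (4-byte magic number,
4-byte mtime, 4-byte source size, the latter two little-endian). The
function name and the sample bytes are made up for illustration:

```python
import struct

def parse_legacy_pyc_header(data):
    """Split the 12-byte header assumed for CPython 3.3 through 3.6."""
    if len(data) < 12:
        raise ValueError("need at least 12 header bytes")
    magic = data[:4]
    # mtime and source size are little-endian unsigned 32-bit integers.
    mtime, source_size = struct.unpack("<II", data[4:12])
    return magic, mtime, source_size

# Synthetic header for illustration only: an example magic number,
# an mtime of 1454390144 and a source size of 42 bytes.
example = b"\xee\x0c\r\n" + struct.pack("<II", 1454390144, 42)
print(parse_legacy_pyc_header(example))
```

Running this on the first 12 bytes of two builds of the same ".pyc"
should show whether only the mtime field differs.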

[0]
http://borgbackup.github.io/

[1]
https://docs.python.org/3/library/py_compile.html

[2]
https://www.python.org/dev/peps/pep-3147/

[3] Check out the Guix git commit 4efc8eb27502c, and from there:
$ tar xf $(./pre-inst-env guix build --source python-3)
$ sed -n 139,140p Python-3.4.3/Lib/py_compile.py
    bytecode = importlib._bootstrap._code_to_bytecode(
            code, source_stats['mtime'], source_stats['size'])


* bug#22533: Non-determinism in python-3 ".pyc" bytecode
  2016-02-02  5:15 bug#22533: Non-determinism in python-3 ".pyc" bytecode Leo Famulari
@ 2016-02-02  8:54 ` Leo Famulari
  2016-02-02 20:41 ` Ludovic Courtès
  2017-05-26 13:41 ` bug#22533: Python bytecode reproducibility Marius Bakke
  2 siblings, 0 replies; 29+ messages in thread
From: Leo Famulari @ 2016-02-02  8:54 UTC (permalink / raw)
  To: 22533

On Tue, Feb 02, 2016 at 12:15:44AM -0500, Leo Famulari wrote:
> While preparing a package for borg [0], I found that the built output
> was not reproducible. The problem is that the bytecode compiler [1] for
> Python 3.4.3 (our current version) encodes the mtime of the
> corresponding Python source file in the output. This is described in
> PEP-3147 [2], and the responsible Python code is referenced below [3].
> 
> I tested a few of our existing python-3 packages: python-ccm,
> python-pysam, and python-scripttest all exhibit the same problem.
> 
> We fixed this in python-2 with the patch
> python-2.7-source-date-epoch.patch, but I don't know how to write this
> patch for python-3.

mark_weaver suggested setting the timestamps of the source files before
building. I think this is a better option if it doesn't break anything.
It would allow the bytecode "staleness" check to work as expected while
keeping the output consistent.
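
A rough sketch of that idea, not of anything Guix actually does: walk
the source tree and pin the mtime of every ".py" file to a fixed value
before byte-compiling. The helper name and the default of 1 are
assumptions chosen to match the SOURCE_DATE_EPOCH value used elsewhere
in this thread:

```python
import os

def reset_source_mtimes(root, timestamp=1):
    """Set atime and mtime of every .py file under root to `timestamp`."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                os.utime(path, (timestamp, timestamp))
                changed.append(path)
    return changed
```

Because the source file and the recorded timestamp then agree, the
staleness check should keep accepting the cached bytecode.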



* bug#22533: Non-determinism in python-3 ".pyc" bytecode
  2016-02-02  5:15 bug#22533: Non-determinism in python-3 ".pyc" bytecode Leo Famulari
  2016-02-02  8:54 ` Leo Famulari
@ 2016-02-02 20:41 ` Ludovic Courtès
  2016-02-04 23:17   ` Leo Famulari
  2017-05-26 13:41 ` bug#22533: Python bytecode reproducibility Marius Bakke
  2 siblings, 1 reply; 29+ messages in thread
From: Ludovic Courtès @ 2016-02-02 20:41 UTC (permalink / raw)
  To: Leo Famulari; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 231 bytes --]

Leo Famulari <leo@famulari.name> skribis:

> We fixed this in python-2 with the patch
> python-2.7-source-date-epoch.patch, but I don't know how to write this
> patch for python-3.

I would imagine something like this (untested):


[-- Attachment #2: Type: text/x-patch, Size: 653 bytes --]

--- Python-3.4.3/Lib/importlib/_bootstrap.py	2016-02-02 21:38:48.655809055 +0100
+++ Python-3.4.3/Lib/importlib/_bootstrap.py.new	2016-02-02 21:38:43.659769251 +0100
@@ -667,7 +667,10 @@ def _code_to_bytecode(code, mtime=0, sou
     """Compile a code object into bytecode for writing out to a byte-compiled
     file."""
     data = bytearray(MAGIC_NUMBER)
-    data.extend(_w_long(mtime))
+    if 'SOURCE_DATE_EPOCH' in _os.environ:
+        data.extend(_w_long(string.atoi(_os.environ['SOURCE_DATE_EPOCH'])))
+    else:
+        data.extend(_w_long(mtime))
     data.extend(_w_long(source_size))
     data.extend(marshal.dumps(code))
     return data

[-- Attachment #3: Type: text/plain, Size: 618 bytes --]


Could you give it a try and refine as needed?  :-)
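
For reference, a standalone Python 3 sketch of the logic in the patch
above. As far as I know `string.atoi` no longer exists in Python 3, so
this uses `int()` instead; the function name and the `environ`
parameter are hypothetical and only there to make the snippet
testable outside of importlib:

```python
import os
import struct

def header_mtime_field(source_mtime, environ=os.environ):
    """Return the 4-byte little-endian timestamp field for a .pyc header."""
    epoch = environ.get("SOURCE_DATE_EPOCH")
    # Prefer SOURCE_DATE_EPOCH when it is set and non-empty.
    value = int(epoch) if epoch else int(source_mtime)
    # Mask to 32 bits, since the header field cannot hold more.
    return struct.pack("<I", value & 0xFFFFFFFF)

print(header_mtime_field(1454390144, {"SOURCE_DATE_EPOCH": "1"}))
```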

> I asked about this on #debian-reproducible and they said that it wasn't
> an issue for Debian since they don't ship bytecode, but instead generate
> it at install time. Of course, that doesn't really apply to Guix.

I’d recommend trying #reproducible-builds on OFTC, which is more
generic.  Also, in some cases, it’s useful to look at
<git://git.debian.org/git/reproducible/notes.git>, which contains notes
about non-reproducible packages (currently partly Debian-specific, but
we need to lobby to make it more generic.  ;-))

Thanks,
Ludo’.


* bug#22533: Non-determinism in python-3 ".pyc" bytecode
  2016-02-02 20:41 ` Ludovic Courtès
@ 2016-02-04 23:17   ` Leo Famulari
  2016-03-29 23:11     ` Cyril Roelandt
  2016-03-29 23:13     ` Cyril Roelandt
  0 siblings, 2 replies; 29+ messages in thread
From: Leo Famulari @ 2016-02-04 23:17 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 1096 bytes --]

On Tue, Feb 02, 2016 at 09:41:19PM +0100, Ludovic Courtès wrote:
> Could you give it a try and refine as needed?  :-)

I altered your example as shown in the attached patch. It causes some
tests related to timestamps to fail, so I disabled them in a very crude
way. The final patch should address those tests more carefully.

But, the patch doesn't seem to have the desired effect so I'm asking for
help!

Here is how I tested the patch:

I built python-3 with it, ran `export SOURCE_DATE_EPOCH=1`, and entered
the resulting Python shell. There I manually defined the '_w_long'
function used by the patched function. Then:

print (_w_long(locale.atoi(os.getenv('SOURCE_DATE_EPOCH'))))
b'\x01\x00\x00\x00'

But, when I leave the Python shell and issue `python3 -m compileall
helloworld.py`, the timestamps are present in the compiled bytecode. I
can watch the clock "tick" by doing this repeatedly:

$ touch helloworld.py && rm -r __pycache__ && \
python3 -m compileall helloworld.py &&  \
hexdump __pycache__/helloworld.cpython-34.pyc | head -n1

I'm not much of a Python programmer, so I'm stumped.
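
The same experiment can be scripted. This is only a sketch: it assumes
Python 3.7 or later, where `py_compile.compile` accepts an
`invalidation_mode` argument and the timestamp sits at byte offset 8
rather than 4 as in 3.4.3. The helper name and file names are invented
for the example:

```python
import os
import py_compile
import tempfile

def compile_with_mtime(source, cfile, mtime):
    """Set the source mtime, compile in timestamp mode, return the .pyc bytes."""
    os.utime(source, (mtime, mtime))
    py_compile.compile(
        source, cfile=cfile, doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP)
    with open(cfile, "rb") as f:
        return f.read()

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "helloworld.py")
with open(source, "w") as f:
    f.write("print('hello, world')\n")

first = compile_with_mtime(source, os.path.join(workdir, "a.pyc"), 1000)
second = compile_with_mtime(source, os.path.join(workdir, "b.pyc"), 2000)

# List the byte offsets at which the two outputs differ.
differing = [i for i, (x, y) in enumerate(zip(first, second)) if x != y]
print(differing)
```

If the only differing offsets fall inside the timestamp field, the
source mtime is the sole source of non-determinism for that file.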

[-- Attachment #2: 0001-SOURCE_DATE_EPOCH.patch --]
[-- Type: text/x-diff, Size: 3447 bytes --]

From d34a71e4ec4501cb53acd3e15633bc1a05665be9 Mon Sep 17 00:00:00 2001
Message-Id: <d34a71e4ec4501cb53acd3e15633bc1a05665be9.1454625404.git.leo@famulari.name>
From: Leo Famulari <leo@famulari.name>
Date: Wed, 3 Feb 2016 20:44:02 -0500
Subject: [PATCH 1/1] SOURCE_DATE_EPOCH

---
 .../patches/python-3.4.3-source-date-epoch.patch    | 21 +++++++++++++++++++++
 gnu/packages/python.scm                             | 14 +++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 gnu/packages/patches/python-3.4.3-source-date-epoch.patch

diff --git a/gnu/packages/patches/python-3.4.3-source-date-epoch.patch b/gnu/packages/patches/python-3.4.3-source-date-epoch.patch
new file mode 100644
index 0000000..403b2df
--- /dev/null
+++ b/gnu/packages/patches/python-3.4.3-source-date-epoch.patch
@@ -0,0 +1,21 @@
+diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
+index 5b91c05..a87d178 100644
+--- Lib/importlib/_bootstrap.py
++++ Lib/importlib/_bootstrap.py
+@@ -666,8 +666,15 @@ def _compile_bytecode(data, name=None, bytecode_path=None, source_path=None):
+ def _code_to_bytecode(code, mtime=0, source_size=0):
+     """Compile a code object into bytecode for writing out to a byte-compiled
+     file."""
++    """os and locale are required for the SOURCE_DATE_EPOCH
++    deterministic timestamp conditional."""
++    import os
++    import locale
+     data = bytearray(MAGIC_NUMBER)
+-    data.extend(_w_long(mtime))
++    if os.getenv('SOURCE_DATE_EPOCH'):
++        data.extend(_w_long(locale.atoi(os.getenv('SOURCE_DATE_EPOCH'))))
++    else:
++        data.extend(_w_long(mtime))
+     data.extend(_w_long(source_size))
+     data.extend(marshal.dumps(code))
+     return data
diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
index 48f65b5..cd366f5 100644
--- a/gnu/packages/python.scm
+++ b/gnu/packages/python.scm
@@ -173,6 +173,17 @@
              ;; gnu-build-system.scm.
              (setenv "SOURCE_DATE_EPOCH" "1")
              #t))
+          (add-before 'configure 'disable-timestamp-tests
+            (lambda _
+              ;; Filter for existing files, since this only affects
+              ;; Python-3 if the SOURCE_DATE_EPOCH patch is applied.
+              (substitute* (filter file-exists?
+                                   '("Lib/test/test_importlib/test_abc.py"))
+                           (("test_code_bad_timestamp") "disable_test_code_bad_timestamp"))
+              (substitute* (filter file-exists?
+                                   '("Lib/test/test_importlib/source/test_file_loader.py"))
+                           (("test_old_timestamp") "disable_test_old_timestamp"))
+              ))
           (add-before 'configure 'do-not-record-configure-flags
             (lambda* (#:key configure-flags #:allow-other-keys)
               ;; Remove configure flags from the installed '_sysconfigdata.py'
@@ -268,7 +279,8 @@ data types.")
                               ;; XXX Try removing this patch for python > 3.4.3
                               "python-disable-ssl-test.patch"
                               "python-3-deterministic-build-info.patch"
-                              "python-3-search-paths.patch")))
+                              "python-3-search-paths.patch"
+                              "python-3.4.3-source-date-epoch.patch")))
               (patch-flags '("-p0"))
               (sha256
                (base32
-- 
2.6.3



* bug#22533: Non-determinism in python-3 ".pyc" bytecode
  2016-02-04 23:17   ` Leo Famulari
@ 2016-03-29 23:11     ` Cyril Roelandt
  2016-03-29 23:13     ` Cyril Roelandt
  1 sibling, 0 replies; 29+ messages in thread
From: Cyril Roelandt @ 2016-03-29 23:11 UTC (permalink / raw)
  To: 22533

[-- Attachment #1: Type: text/plain, Size: 209 bytes --]

Here is a version of the patch that works with the upstream Python, but
that I cannot get to work with our Guix recipe.

Could you test it and tell me what you think? I intend to push this to
CPython.

Cyril.

[-- Attachment #2: upstream.patch --]
[-- Type: text/x-diff; name="upstream.patch", Size: 1213 bytes --]

diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
index c4ee41a..d9885c9 100644
--- Lib/importlib/_bootstrap.py
+++ Lib/importlib/_bootstrap.py
@@ -1443,7 +1443,8 @@ class SourceLoader(_LoaderBasics):
         Implementing this method allows the loader to read bytecode files.
         Raises IOError when the path cannot be handled.
         """
-        return {'mtime': self.path_mtime(path)}
+        return {'mtime': float(_os.environ.get(b'SOURCE_DATE_EPOCH',
+                                               st.st_mtime))}
 
     def _cache_bytecode(self, source_path, cache_path, data):
         """Optional method which writes data (bytes) to a file path (a str).
@@ -1580,7 +1581,10 @@ class SourceFileLoader(FileLoader, SourceLoader):
     def path_stats(self, path):
         """Return the metadata for the path."""
         st = _path_stat(path)
-        return {'mtime': st.st_mtime, 'size': st.st_size}
+        return {
+            'mtime':  float(_os.environ.get(b'SOURCE_DATE_EPOCH', st.st_mtime)),
+            'size': st.st_size
+        }
 
     def _cache_bytecode(self, source_path, bytecode_path, data):
         # Adapt between the two APIs


* bug#22533: Non-determinism in python-3 ".pyc" bytecode
  2016-02-04 23:17   ` Leo Famulari
  2016-03-29 23:11     ` Cyril Roelandt
@ 2016-03-29 23:13     ` Cyril Roelandt
  2016-04-06  8:29       ` Ludovic Courtès
  1 sibling, 1 reply; 29+ messages in thread
From: Cyril Roelandt @ 2016-03-29 23:13 UTC (permalink / raw)
  To: Leo Famulari, Ludovic Courtès; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 209 bytes --]

Here is a version of the patch that works with the upstream Python, but
that I cannot get to work with our Guix recipe.

Could you test it and tell me what you think? I intend to push this to
CPython.

Cyril.

[-- Attachment #2: upstream.patch --]
[-- Type: text/x-diff, Size: 1187 bytes --]

diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
index c4ee41a..d9885c9 100644
--- Lib/importlib/_bootstrap.py
+++ Lib/importlib/_bootstrap.py
@@ -1443,7 +1443,8 @@ class SourceLoader(_LoaderBasics):
         Implementing this method allows the loader to read bytecode files.
         Raises IOError when the path cannot be handled.
         """
-        return {'mtime': self.path_mtime(path)}
+        return {'mtime': float(_os.environ.get(b'SOURCE_DATE_EPOCH',
+                                               st.st_mtime))}
 
     def _cache_bytecode(self, source_path, cache_path, data):
         """Optional method which writes data (bytes) to a file path (a str).
@@ -1580,7 +1581,10 @@ class SourceFileLoader(FileLoader, SourceLoader):
     def path_stats(self, path):
         """Return the metadata for the path."""
         st = _path_stat(path)
-        return {'mtime': st.st_mtime, 'size': st.st_size}
+        return {
+            'mtime':  float(_os.environ.get(b'SOURCE_DATE_EPOCH', st.st_mtime)),
+            'size': st.st_size
+        }
 
     def _cache_bytecode(self, source_path, bytecode_path, data):
         # Adapt between the two APIs


* bug#22533: Non-determinism in python-3 ".pyc" bytecode
  2016-03-29 23:13     ` Cyril Roelandt
@ 2016-04-06  8:29       ` Ludovic Courtès
  0 siblings, 0 replies; 29+ messages in thread
From: Ludovic Courtès @ 2016-04-06  8:29 UTC (permalink / raw)
  To: Cyril Roelandt; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 255 bytes --]

Cyril Roelandt <tipecaml@gmail.com> skribis:

> Here is a version of the patch that works with the upstream Python, but
> that I cannot get to work with our Guix recipe.

At first sight the patch LGTM.  How does it not work for you? :-)

I applied this:


[-- Attachment #2: Type: text/x-patch, Size: 1557 bytes --]

diff --git a/gnu/packages/patches/python-3-deterministic-build-info.patch b/gnu/packages/patches/python-3-deterministic-build-info.patch
index 22c372a..bdf9f20 100644
--- a/gnu/packages/patches/python-3-deterministic-build-info.patch
+++ b/gnu/packages/patches/python-3-deterministic-build-info.patch
@@ -15,3 +15,28 @@ We cannot pass it in CPPFLAGS due to whitespace in the DATE string.
  #ifndef DATE
  #ifdef __DATE__
  #define DATE __DATE__
+
+--- Lib/importlib/_bootstrap.py
++++ Lib/importlib/_bootstrap.py
+@@ -1443,7 +1443,8 @@ class SourceLoader(_LoaderBasics):
+         Implementing this method allows the loader to read bytecode files.
+         Raises IOError when the path cannot be handled.
+         """
+-        return {'mtime': self.path_mtime(path)}
++        return {'mtime': float(_os.environ.get(b'SOURCE_DATE_EPOCH',
++                                               st.st_mtime))}
+ 
+     def _cache_bytecode(self, source_path, cache_path, data):
+         """Optional method which writes data (bytes) to a file path (a str).
+@@ -1580,7 +1581,10 @@ class SourceFileLoader(FileLoader, SourceLoader):
+     def path_stats(self, path):
+         """Return the metadata for the path."""
+         st = _path_stat(path)
+-        return {'mtime': st.st_mtime, 'size': st.st_size}
++        return {
++            'mtime':  float(_os.environ.get(b'SOURCE_DATE_EPOCH', st.st_mtime)),
++            'size': st.st_size
++        }
+ 
+     def _cache_bytecode(self, source_path, bytecode_path, data):
+         # Adapt between the two APIs

[-- Attachment #3: Type: text/plain, Size: 7418 bytes --]


… and that leads to these test failures:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build python@3 --rounds=2 -K

[...]

======================================================================
FAIL: test_bad_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 452, in test_bad_marshal
    self._test_bad_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 342, in _test_bad_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_no_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 441, in test_no_marshal
    self._test_no_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 322, in _test_no_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_non_code_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 445, in test_non_code_marshal
    self._test_non_code_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 331, in _test_non_code_marshal
    self.import_(file_path, '_temp')
AssertionError: ImportError not raised

======================================================================
FAIL: test_old_timestamp (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 471, in test_old_timestamp
    self.assertEqual(bytecode_file.read(4), source_timestamp)
AssertionError: b'\x01\x00\x00\x00' != b'\x7f\xc7\x04W'

======================================================================
FAIL: test_bad_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 452, in test_bad_marshal
    self._test_bad_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 342, in _test_bad_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_no_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 441, in test_no_marshal
    self._test_no_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 322, in _test_no_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_non_code_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 445, in test_non_code_marshal
    self._test_non_code_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 331, in _test_non_code_marshal
    self.import_(file_path, '_temp')
AssertionError: ImportError not raised

======================================================================
FAIL: test_old_timestamp (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 471, in test_old_timestamp
    self.assertEqual(bytecode_file.read(4), source_timestamp)
AssertionError: b'\x01\x00\x00\x00' != b'\x7f\xc7\x04W'

----------------------------------------------------------------------
Ran 951 tests in 1.102s

FAILED (failures=8, skipped=19, expected failures=1)
Makefile:958: recipe for target 'test' failed
--8<---------------cut here---------------end--------------->8---

‘test_old_timestamp’ clearly needs to be adjusted to account for the
change.  The others have to do with the bytecode loader, so it’s
probably a similar story.  Could you look into it?

Perhaps you tested with SOURCE_DATE_EPOCH unset?

Thanks for working on this, it’s an important bug to fix!

Ludo’.


* bug#22533: Python bytecode reproducibility
  2016-02-02  5:15 bug#22533: Non-determinism in python-3 ".pyc" bytecode Leo Famulari
  2016-02-02  8:54 ` Leo Famulari
  2016-02-02 20:41 ` Ludovic Courtès
@ 2017-05-26 13:41 ` Marius Bakke
  2018-03-03 22:37   ` Ricardo Wurmus
  2 siblings, 1 reply; 29+ messages in thread
From: Marius Bakke @ 2017-05-26 13:41 UTC (permalink / raw)
  To: 22533

[-- Attachment #1: Type: text/plain, Size: 476 bytes --]

Hello!

I stumbled across this bug after re-discovering that Python bytecode is
not reproducible (through "glib"). Just sharing some notes.

Nix recently made an effort to fix this. AFAICT the ".pyc" files are
still a problem, but at least they got the interpreters building
reproducibly:

https://github.com/NixOS/nixpkgs/issues/22570
https://github.com/NixOS/nixpkgs/pull/22585

It would be great to revive this longstanding bug!

*walks away slowly before anyone notices*

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]


* bug#22533: Python bytecode reproducibility
  2017-05-26 13:41 ` bug#22533: Python bytecode reproducibility Marius Bakke
@ 2018-03-03 22:37   ` Ricardo Wurmus
  2018-03-04  9:21     ` Gábor Boskovits
  2018-03-05  9:25     ` Ludovic Courtès
  0 siblings, 2 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-03 22:37 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 22533

Hi Guix,

Marius Bakke <mbakke@fastmail.com> writes:

> It would be great to revive this longstanding bug!

Indeed.

Here’s another attempt.  As far as I understand, the timestamp in the
pyc files only affects the header.

Up until Python 3.6 (incl) the header looks like this:

  magic | timestamp | size

Since Python 3.7 the header may either contain a timestamp or a hash:

  magic | 00000000000000000000000000000000 | timestamp | size
  magic | 00000000000000000000000000000001 | hash (8 bytes, in place of timestamp and size)

This means we likely won’t have this problem any more with Python 3.7.
For Python 3.6 I guess we could add a final build phase that overwrites
the timestamp in the *binary*.  This needs to happen before any of the
compiled files are wrapped up in a wheel.

Should we just wait for Python 3.7 which is expected to be released in
June 2018?  We’d still have to deal with this problem in Python 2,
though.

Is it a bad idea to override the timestamps in the generated binaries?
I think that we could avoid the recency check then, which was an
obstacle to resetting the timestamps of the source files.
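
A sketch of what such a phase could do to one file, assuming the
Python <= 3.6 layout described above with the timestamp stored as a
little-endian 32-bit integer at bytes 4 to 7. The function name is
hypothetical and this has not been tried in a Guix build:

```python
import struct

def overwrite_legacy_pyc_mtime(data, timestamp=1):
    """Return a copy of a Python <= 3.6 .pyc image with the mtime field replaced."""
    if len(data) < 12:
        raise ValueError("not a complete legacy .pyc header")
    patched = bytearray(data)
    # Bytes 4..7 hold the source mtime as a little-endian 32-bit integer.
    patched[4:8] = struct.pack("<I", timestamp & 0xFFFFFFFF)
    return bytes(patched)
```

One caveat: the loader still compares this field with the mtime of the
source file, so the source's mtime would presumably have to be set to
the same value for the cached bytecode to be used.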

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net


* bug#22533: Python bytecode reproducibility
  2018-03-03 22:37   ` Ricardo Wurmus
@ 2018-03-04  9:21     ` Gábor Boskovits
  2018-03-04 12:46       ` Ricardo Wurmus
  2018-03-05  9:25     ` Ludovic Courtès
  1 sibling, 1 reply; 29+ messages in thread
From: Gábor Boskovits @ 2018-03-04  9:21 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 1516 bytes --]

2018-03-03 23:37 GMT+01:00 Ricardo Wurmus <rekado@elephly.net>:

> Hi Guix,
>
> Marius Bakke <mbakke@fastmail.com> writes:
>
> > It would be great to revive this longstanding bug!
>
> Indeed.
>
> Here’s another attempt.  As far as I understand, the timestamp in the
> pyc files only affects the header.
>
> Up until Python 3.6 (incl) the header looks like this:
>
>   magic | timestamp | size
>
> Since Python 3.7 the header may either contain a timestamp or a hash:
>
>   magic | 00000000000000000000000000000000 | timestamp | size
>   magic | 00000000000000000000000000000001 | hash (8 bytes, in place of timestamp and size)
>
> This means we likely won’t have this problem any more with Python 3.7.
> For Python 3.6 I guess we could add a final build phase that overwrites
> the timestamp in the *binary*.  This needs to happen before any of the
> compiled files are wrapped up in a wheel.
>
> Should we just wait for Python 3.7 which is expected to be released in
> June 2018?  We’d still have to deal with this problem in Python 2,
> though.
>
> Is it a bad idea to override the timestamps in the generated binaries?
> I think that we could avoid the recency check then, which was an
> obstacle to resetting the timestamps of the source files.

> --
> Ricardo
>
> GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
> https://elephly.net
>
>
Nix had this issue too; it seems they have a Python 3.5 solution that
should be easy to adopt: https://github.com/NixOS/nixpkgs/issues/22570.
WDYT?

[-- Attachment #2: Type: text/html, Size: 2278 bytes --]


* bug#22533: Python bytecode reproducibility
  2018-03-04  9:21     ` Gábor Boskovits
@ 2018-03-04 12:46       ` Ricardo Wurmus
  2018-03-04 15:30         ` Gábor Boskovits
  2018-03-04 19:18         ` Ricardo Wurmus
  0 siblings, 2 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-04 12:46 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533


Hi Gábor,

> Nix had this issue, it seems they have a python 3.5 solution, which
> should be easy to adopt: https://github.com/NixOS/nixpkgs/issues/22570.
> WDYT?

Here’s the patch for Nix:

  https://patch-diff.githubusercontent.com/raw/NixOS/nixpkgs/pull/22585.diff

Here are the relevant changes to the Python packages:

* Python 3.4

  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap.py" --replace "source_mtime = int(source_stats['mtime'])" "source_mtime = 1"

* Python 3.5

  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"

* Python 3.6
  substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
  substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"
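
To illustrate what the py_compile.py substitution does: the patched
expression only tests whether DETERMINISTIC_BUILD is present in the
environment, not what value it has.  A minimal sketch, where
`effective_mtime` is a hypothetical helper name and not part of CPython:

```python
import os


def effective_mtime(source_stats, environ=os.environ):
    """Mirror the patched expression from Lib/py_compile.py.

    `effective_mtime` is a hypothetical name used for illustration only.
    """
    # Presence of the variable is what counts; even an empty value enables it.
    return 1 if 'DETERMINISTIC_BUILD' in environ else source_stats['mtime']


# With the variable set, every source file appears to have mtime 1.
print(effective_mtime({'mtime': 1520207866}, {'DETERMINISTIC_BUILD': '1'}))
# Without it, the real mtime is used.
print(effective_mtime({'mtime': 1520207866}, {}))
```

Since the check is on presence only, the variable has to be unset (not
merely set to an empty string) to get the stock behaviour back.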


For all packages they set these environment variables:

  - set PYTHONHASHSEED=0 (for hashes of str, bytes and datetime objects)

  - set DETERMINISTIC_BUILD; for conditional patching of the timestamp
    for package builds.  The timestamp is not patched in ad-hoc
    environments, because that would mess with Python’s ability to
    determine whether to compile source files.
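
The effect of PYTHONHASHSEED=0 can be checked with a small Python sketch
(an illustration, not part of the Nix patch).  It starts two fresh
interpreters with the seed fixed and compares the hash of the same
string; the helper name `str_hash_in_fresh_interpreter` is hypothetical.

```python
import os
import subprocess
import sys


def str_hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a newly started interpreter."""
    env = dict(os.environ)
    env["PYTHONHASHSEED"] = str(seed)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env, check=True, stdout=subprocess.PIPE, universal_newlines=True)
    return int(result.stdout)


first = str_hash_in_fresh_interpreter("guix", 0)
second = str_hash_in_fresh_interpreter("guix", 0)
# With the seed fixed, both processes compute the same hash.
print(first == second)
```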

They also rebuild all bytecode (with the exception of lib2to3 because it
is Python 2 code) three times, once for each optimization level.

--8<---------------cut here---------------start------------->8---
+    # Determinism: rebuild all bytecode
+    # We exclude lib2to3 because that's Python 2 code which fails
+    # We rebuild three times, once for each optimization level
+    find $out -name "*.py" | $out/bin/python -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -O -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -OO -m compileall -q -f -x "lib2to3" -i -
--8<---------------cut here---------------end--------------->8---
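
In Python terms, the three compileall runs above correspond roughly to
the sketch below.  It is an approximation under stated assumptions:
`rebuild_bytecode` is a hypothetical helper, and it walks a directory
with `compileall.compile_dir` instead of reading a file list from stdin
as `-i -` does.

```python
import compileall
import re


def rebuild_bytecode(root, exclude="lib2to3"):
    """Force-recompile all .py files under root at optimization levels 0, 1, 2."""
    skip = re.compile(exclude)  # paths matching this pattern are skipped
    ok = True
    for level in (0, 1, 2):  # like running python, python -O, python -OO
        result = compileall.compile_dir(
            root, force=True, quiet=1, rx=skip, optimize=level)
        ok = ok and bool(result)
    return ok
```

Calling `rebuild_bytecode("/path/to/output")` (path is a placeholder)
leaves up to three .pyc files per module in each __pycache__ directory.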

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net

^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-04 12:46       ` Ricardo Wurmus
@ 2018-03-04 15:30         ` Gábor Boskovits
  2018-03-04 19:18         ` Ricardo Wurmus
  1 sibling, 0 replies; 29+ messages in thread
From: Gábor Boskovits @ 2018-03-04 15:30 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 3002 bytes --]

2018-03-04 13:46 GMT+01:00 Ricardo Wurmus <rekado@elephly.net>:

>
> Hi Gábor,
>
> > Nix had this issue, it seems they have a python 3.5 solution, which
> > should be easy to adopt: https://github.com/NixOS/nixpkgs/issues/22570.
> > WDYT?
>
> Here’s the patch for Nix:
>
>   https://patch-diff.githubusercontent.com/raw/NixOS/nixpkgs/pull/22585.diff
>
> Here are the relevant changes to the Python packages:
>
> * Python 3.4
>
>   substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
>   substituteInPlace "Lib/importlib/_bootstrap.py" --replace "source_mtime = int(source_stats['mtime'])" "source_mtime = 1"
>
> * Python 3.5
>
>   substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
>   substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"
>
> * Python 3.6
>
>   substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
>   substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"
>
>
>
Nice, thanks for the summary.
Can we adopt this as is?
Do we need the 3.4 and 3.5 fixes, or is the 3.6 one enough?


> For all packages they set these environment variables:
>
>   - set PYTHONHASHSEED=0 (for hashes of str, bytes and datetime objects)
>
>   - set DETERMINISTIC_BUILD; for conditional patching of the timestamp
>     for package builds.  The timestamp is not patched in ad-hoc
>     environments, because that would mess with Python’s ability to
>     determine whether to compile source files.
>
>
Should we set these in python-build-system? What about the Python bootstrap?
I guess we use gnu-build-system there, so bootstrap packages might need to
set these explicitly?


> They also rebuild all bytecode (with the exception of lib2to3 because it
> is Python 2 code) three times, once for each optimization level.
>
> --8<---------------cut here---------------start------------->8---
> +    # Determinism: rebuild all bytecode
> +    # We exclude lib2to3 because that's Python 2 code which fails
> +    # We rebuild three times, once for each optimization level
> +    find $out -name "*.py" | $out/bin/python -m compileall -q -f -x "lib2to3" -i -
> +    find $out -name "*.py" | $out/bin/python -O -m compileall -q -f -x "lib2to3" -i -
> +    find $out -name "*.py" | $out/bin/python -OO -m compileall -q -f -x "lib2to3" -i -
> --8<---------------cut here---------------end--------------->8---
>
>
Do we also have to do this, or should we settle on one optimization
level? Which one?



^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-04 12:46       ` Ricardo Wurmus
  2018-03-04 15:30         ` Gábor Boskovits
@ 2018-03-04 19:18         ` Ricardo Wurmus
  2018-03-05  0:02           ` Ricardo Wurmus
                             ` (2 more replies)
  1 sibling, 3 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-04 19:18 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 36 bytes --]

I have applied this patch locally:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 1.diff --]
[-- Type: text/x-patch, Size: 2279 bytes --]

diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
index 5f701701a..0d1ecc3c6 100644
--- a/gnu/packages/python.scm
+++ b/gnu/packages/python.scm
@@ -359,8 +359,42 @@ data types.")
                               "Lib/ctypes/test/test_win32.py" ; fails on aarch64
                               "Lib/test/test_fcntl.py")) ; fails on aarch64
                   #t))))
-    (arguments (substitute-keyword-arguments (package-arguments python-2)
-                 ((#:tests? _) #t)))
+    (arguments
+     (substitute-keyword-arguments (package-arguments python-2)
+       ((#:tests? _) #t)
+       ((#:phases phases)
+        `(modify-phases ,phases
+           (add-after 'unpack 'patch-timestamp-for-pyc-files
+             (lambda _
+               ;; We set DETERMINISTIC_BUILD to only override the mtime when
+               ;; building with Guix, lest we break auto-compilation in
+               ;; environments.
+               (setenv "DETERMINISTIC_BUILD" "1")
+               (substitute* "Lib/py_compile.py"
+                 (("source_stats\\['mtime'\\]")
+                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
+
+               ;; Use deterministic hashes for strings, bytes, and datetime
+               ;; objects.
+               (setenv "PYTHONHASHSEED" "0")
+
+               ;; Reset mtime when validating bytecode header.
+               (substitute* "Lib/importlib/_bootstrap_external.py"
+                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
+                  "source_mtime = 1"))
+               #t))
+           (add-after 'unpack 'disable-timestamp-tests
+             (lambda _
+               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
+                 (("test_bad_marshal")
+                  "disable_test_bad_marshal")
+                 (("test_no_marshal")
+                  "disable_test_no_marshal")
+                 (("test_non_code_marshal")
+                  "disable_test_non_code_marshal"))
+               #t))
+           (add-before 'check 'allow-non-deterministic-compilation
+             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
     (native-search-paths
      (list (search-path-specification
             (variable "PYTHONPATH")

[-- Attachment #3: Type: text/plain, Size: 389 bytes --]


It allows me to build python-six and python-sip reproducibly.  It does
not fix problems with Python 2, and I haven’t yet tested if it causes
any new problems.

It’s a little worrying that I had to disable three more tests that I
think shouldn’t have failed.

What do you think?

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-04 19:18         ` Ricardo Wurmus
@ 2018-03-05  0:02           ` Ricardo Wurmus
  2018-03-05  0:05             ` Ricardo Wurmus
  2018-03-05 22:06             ` Ricardo Wurmus
  2018-03-05 23:21           ` Marius Bakke
  2018-03-08 10:39           ` Gábor Boskovits
  2 siblings, 2 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-05  0:02 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533


Ricardo Wurmus <rekado@elephly.net> writes:

> I have applied this patch locally:
>
> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
> index 5f701701a..0d1ecc3c6 100644
> --- a/gnu/packages/python.scm
> +++ b/gnu/packages/python.scm
> @@ -359,8 +359,42 @@ data types.")
>                                "Lib/ctypes/test/test_win32.py" ; fails on aarch64
>                                "Lib/test/test_fcntl.py")) ; fails on aarch64
>                    #t))))
> -    (arguments (substitute-keyword-arguments (package-arguments python-2)
> -                 ((#:tests? _) #t)))
> +    (arguments
> +     (substitute-keyword-arguments (package-arguments python-2)
> +       ((#:tests? _) #t)
> +       ((#:phases phases)
> +        `(modify-phases ,phases
> +           (add-after 'unpack 'patch-timestamp-for-pyc-files
> +             (lambda _
> +               ;; We set DETERMINISTIC_BUILD to only override the mtime when
> +               ;; building with Guix, lest we break auto-compilation in
> +               ;; environments.
> +               (setenv "DETERMINISTIC_BUILD" "1")
> +               (substitute* "Lib/py_compile.py"
> +                 (("source_stats\\['mtime'\\]")
> +                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
> +
> +               ;; Use deterministic hashes for strings, bytes, and datetime
> +               ;; objects.
> +               (setenv "PYTHONHASHSEED" "0")
> +
> +               ;; Reset mtime when validating bytecode header.
> +               (substitute* "Lib/importlib/_bootstrap_external.py"
> +                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
> +                  "source_mtime = 1"))
> +               #t))
> +           (add-after 'unpack 'disable-timestamp-tests
> +             (lambda _
> +               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
> +                 (("test_bad_marshal")
> +                  "disable_test_bad_marshal")
> +                 (("test_no_marshal")
> +                  "disable_test_no_marshal")
> +                 (("test_non_code_marshal")
> +                  "disable_test_non_code_marshal"))
> +               #t))
> +           (add-before 'check 'allow-non-deterministic-compilation
> +             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
>      (native-search-paths
>       (list (search-path-specification
>              (variable "PYTHONPATH")
>
> It allows me to build python-six and python-sip reproducibly.  It does
> not fix problems with Python 2, and I haven’t yet tested if it causes
> any new problems.

I tested importing modules in an ad-hoc environment — no problems.

Unfortunately, this doesn’t fix all reproducibility problems with numpy:

--8<---------------cut here---------------start------------->8---
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc differ
--8<---------------cut here---------------end--------------->8---

But the successes with simpler Python packages are promising.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net

^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-05  0:02           ` Ricardo Wurmus
@ 2018-03-05  0:05             ` Ricardo Wurmus
  2018-03-05 15:36               ` Gábor Boskovits
  2018-03-05 22:02               ` Ricardo Wurmus
  2018-03-05 22:06             ` Ricardo Wurmus
  1 sibling, 2 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-05  0:05 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533


Ricardo Wurmus <rekado@elephly.net> writes:

> Unfortunately, this doesn’t fix all reproducibility problems with numpy:
>
> --8<---------------cut here---------------start------------->8---
> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc differ
> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc differ
> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc differ
> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc differ
> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc differ
> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc differ
> --8<---------------cut here---------------end--------------->8---

Here’s what diffoscope says:

--8<---------------cut here---------------start------------->8---
diffoscope /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0{-check,}/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
--- /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
+++ /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
@@ -1,8 +1,8 @@
-00000000: 330d 0d0a fa87 9c5a 2601 0000 e300 0000  3......Z&.......
+00000000: 330d 0d0a c485 9c5a 2601 0000 e300 0000  3......Z&.......
 00000010: 0000 0000 0000 0000 0001 0000 0040 0000  .............@..
 00000020: 0073 2000 0000 6400 5a00 6400 5a01 6400  .s ...d.Z.d.Z.d.
 00000030: 5a02 6401 5a03 6402 5a04 6504 731c 6502  Z.d.Z.d.Z.e.s.e.
 00000040: 5a01 6403 5300 2904 7a06 312e 3134 2e30  Z.d.S.).z.1.14.0
 00000050: da28 3639 3134 6262 3431 6630 6662 3363  .(6914bb41f0fb3c
 00000060: 3162 6135 3030 6261 6534 6537 6436 3731  1ba500bae4e7d671
 00000070: 6461 3935 3336 3738 3666 544e 2905 da0d  da9536786fTN)...
--8<---------------cut here---------------end--------------->8---

In other words: this is the timestamp field of the pyc file.
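
To read the header directly, here is a minimal Python sketch.  It
assumes the 12-byte layout used by CPython 3.3 through 3.6 (magic
number, CR LF, source mtime, source size, all little-endian);
`parse_pyc_header` is a hypothetical helper name.

```python
import struct


def parse_pyc_header(data):
    """Decode the 12-byte .pyc header used by CPython 3.3 through 3.6.

    Returns (magic_number, source_mtime, source_size).
    """
    # "<" selects little-endian with no padding: 2 + 2 + 4 + 4 = 12 bytes.
    magic, _crlf, mtime, source_size = struct.unpack("<H2sII", data[:12])
    return magic, mtime, source_size


# The first 12 bytes of each file, copied from the hexdump above.
check_build = bytes.fromhex("330d0d0a" "fa879c5a" "26010000")
store_build = bytes.fromhex("330d0d0a" "c4859c5a" "26010000")

for header in (check_build, store_build):
    print(parse_pyc_header(header))
```

Both headers carry the same magic number (3379) and source size (294
bytes); only the mtime differs, by 566 seconds.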

Maybe this can be avoided by setting DETERMINISTIC_BUILD in the
python-build-system?

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net

^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-03 22:37   ` Ricardo Wurmus
  2018-03-04  9:21     ` Gábor Boskovits
@ 2018-03-05  9:25     ` Ludovic Courtès
  1 sibling, 0 replies; 29+ messages in thread
From: Ludovic Courtès @ 2018-03-05  9:25 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 22533

Hello!

Ricardo Wurmus <rekado@elephly.net> skribis:

> Is it a bad idea to override the timestamps in the generated binaries?
> I think that we could avoid the recency check then, which was an
> obstacle to resetting the timestamps of the source files.

I think it’s good if we can fix Python itself to honor SOURCE_DATE_EPOCH
for its timestamps, but it’s also OK to patch timestamps in generated
binaries.

We do that already in gzip headers, with ‘reset-gzip-timestamp’.

Thanks for tackling this!

Ludo’.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-05  0:05             ` Ricardo Wurmus
@ 2018-03-05 15:36               ` Gábor Boskovits
  2018-03-05 20:33                 ` Gábor Boskovits
  2018-03-05 22:02               ` Ricardo Wurmus
  1 sibling, 1 reply; 29+ messages in thread
From: Gábor Boskovits @ 2018-03-05 15:36 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 4143 bytes --]

2018-03-05 1:05 GMT+01:00 Ricardo Wurmus <rekado@elephly.net>:

>
> Ricardo Wurmus <rekado@elephly.net> writes:
>
> In other words: this is the timestamp field of the pyc file.
>
> Maybe this can be avoided by setting DETERMINISTIC_BUILD in the
> python-build-system?
>
>
It seems that the deterministic build patch already landed upstream
https://github.com/python/cpython/pull/5200, so we might consider
applying the upstream patches. WDYT?



^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-05 15:36               ` Gábor Boskovits
@ 2018-03-05 20:33                 ` Gábor Boskovits
  2018-03-05 21:46                   ` Ricardo Wurmus
  0 siblings, 1 reply; 29+ messages in thread
From: Gábor Boskovits @ 2018-03-05 20:33 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 22533

[-- Attachment #1: Type: text/plain, Size: 4476 bytes --]

2018-03-05 16:36 GMT+01:00 Gábor Boskovits <boskovits@gmail.com>:

> 2018-03-05 1:05 GMT+01:00 Ricardo Wurmus <rekado@elephly.net>:
>
>>
>> Ricardo Wurmus <rekado@elephly.net> writes:
>>
>> In other words: this is the timestamp field of the pyc file.
>>
>> Maybe this can be avoided by setting DETERMINISTIC_BUILD in the
>> python-build-system?
>>
>>
> It seems that the deterministic build patch already landed upstream
> https://github.com/python/cpython/pull/5200, so we might consider
> applying the upstream patches. WDYT?
>

And also this: https://github.com/python/cpython/pull/4575.
I'm now having a look at this approach. However, this second one
seems quite invasive...



^ permalink raw reply	[flat|nested] 29+ messages in thread

* bug#22533: Python bytecode reproducibility
  2018-03-05 20:33                 ` Gábor Boskovits
@ 2018-03-05 21:46                   ` Ricardo Wurmus
  0 siblings, 0 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-05 21:46 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533


Gábor Boskovits <boskovits@gmail.com> writes:

> 2018-03-05 16:36 GMT+01:00 Gábor Boskovits <boskovits@gmail.com>:
>
>> 2018-03-05 1:05 GMT+01:00 Ricardo Wurmus <rekado@elephly.net>:
>>
>>>
>>> Ricardo Wurmus <rekado@elephly.net> writes:
>>>
>>> In other words: this is the timestamp field of the pyc file.
>>>
>>> Maybe this can be avoided by setting DETERMINISTIC_BUILD in the
>>> python-build-system?
>>>
>>>
>> It seems that the deterministic build patch already landed upstream
>> https://github.com/python/cpython/pull/5200, so we might consider
>> applying the upstream patches. WDYT?
>>
>
> And also this: https://github.com/python/cpython/pull/4575.
> I'm now having a look at this approach. However this second one
> seems quite invasive...

These patches are for what will become Python 3.7.  Python 3.6 does not
have support for “invalidation_mode”, so at least the first patch would
not work for us.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net


* bug#22533: Python bytecode reproducibility
  2018-03-05  0:05             ` Ricardo Wurmus
  2018-03-05 15:36               ` Gábor Boskovits
@ 2018-03-05 22:02               ` Ricardo Wurmus
  1 sibling, 0 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-05 22:02 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533


Ricardo Wurmus <rekado@elephly.net> writes:

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Unfortunately, this doesn’t fix all reproducibility problems with numpy:
>>
>> --8<---------------cut here---------------start------------->8---
>> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc differ
>> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc differ
>> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc differ
>> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc differ
>> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc differ
>> Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc differ
>> --8<---------------cut here---------------end--------------->8---
>
> Here’s what diffoscope says:
>
> --8<---------------cut here---------------start------------->8---
> diffoscope /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0{-check,}/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
> --- /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
> +++ /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
> @@ -1,8 +1,8 @@
> -00000000: 330d 0d0a fa87 9c5a 2601 0000 e300 0000  3......Z&.......
> +00000000: 330d 0d0a c485 9c5a 2601 0000 e300 0000  3......Z&.......
>  00000010: 0000 0000 0000 0000 0001 0000 0040 0000  .............@..
>  00000020: 0073 2000 0000 6400 5a00 6400 5a01 6400  .s ...d.Z.d.Z.d.
>  00000030: 5a02 6401 5a03 6402 5a04 6504 731c 6502  Z.d.Z.d.Z.e.s.e.
>  00000040: 5a01 6403 5300 2904 7a06 312e 3134 2e30  Z.d.S.).z.1.14.0
>  00000050: da28 3639 3134 6262 3431 6630 6662 3363  .(6914bb41f0fb3c
>  00000060: 3162 6135 3030 6261 6534 6537 6436 3731  1ba500bae4e7d671
>  00000070: 6461 3935 3336 3738 3666 544e 2905 da0d  da9536786fTN)...
> --8<---------------cut here---------------end--------------->8---
>
> In other words: this is the timestamp field of the pyc file.
>
> Maybe this can be avoided by setting DETERMINISTIC_BUILD in the
> python-build-system?

It cannot.

So, something’s still missing from my patch.  Does anyone see what might
be missing?
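
For reference, the two differing header lines in the hexdump above can be
decoded with a few lines of Python.  This is only a sketch: it assumes the
12-byte header layout used by Python 3.3 through 3.6 (magic number, CR LF,
source mtime, source size, all little-endian), and decode_pyc_header is a
made-up helper name, not something from Guix or CPython.  The two decoded
mtimes are 566 seconds apart, which fits two separate builds.

```python
import struct
from datetime import datetime, timezone


def decode_pyc_header(header: bytes) -> dict:
    """Decode the 12-byte header of a timestamp-based .pyc file.

    Assumed layout (Python 3.3 to 3.6): 2-byte magic number, the bytes
    CR LF, 4-byte source mtime, 4-byte source size, all little-endian.
    """
    magic, _crlf, mtime, size = struct.unpack("<H2sII", header[:12])
    return {
        "magic": magic,
        "mtime": mtime,
        "mtime_utc": datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat(),
        "source_size": size,
    }


# First 12 bytes of version.cpython-36.pyc, copied from the diffoscope output.
check_build = bytes.fromhex("330d0d0a" "fa879c5a" "26010000")  # "-check" output
first_build = bytes.fromhex("330d0d0a" "c4859c5a" "26010000")  # original output

for name, header in (("check", check_build), ("first", first_build)):
    print(name, decode_pyc_header(header))
```

Only the mtime field differs between the two headers; the magic number and
the recorded source size are identical.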

-- 
Ricardo


* bug#22533: Python bytecode reproducibility
  2018-03-05  0:02           ` Ricardo Wurmus
  2018-03-05  0:05             ` Ricardo Wurmus
@ 2018-03-05 22:06             ` Ricardo Wurmus
  1 sibling, 0 replies; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-05 22:06 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533


Ricardo Wurmus <rekado@elephly.net> writes:

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> I have applied this patch locally:
>>
>> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
>> index 5f701701a..0d1ecc3c6 100644
>> --- a/gnu/packages/python.scm
>> +++ b/gnu/packages/python.scm
>> @@ -359,8 +359,42 @@ data types.")
>>                                "Lib/ctypes/test/test_win32.py" ; fails on aarch64
>>                                "Lib/test/test_fcntl.py")) ; fails on aarch64
>>                    #t))))
>> -    (arguments (substitute-keyword-arguments (package-arguments python-2)
>> -                 ((#:tests? _) #t)))
>> +    (arguments
>> +     (substitute-keyword-arguments (package-arguments python-2)
>> +       ((#:tests? _) #t)
>> +       ((#:phases phases)
>> +        `(modify-phases ,phases
>> +           (add-after 'unpack 'patch-timestamp-for-pyc-files
>> +             (lambda _
>> +               ;; We set DETERMINISTIC_BUILD to only override the mtime when
>> +               ;; building with Guix, lest we break auto-compilation in
>> +               ;; environments.
>> +               (setenv "DETERMINISTIC_BUILD" "1")
>> +               (substitute* "Lib/py_compile.py"
>> +                 (("source_stats\\['mtime'\\]")
>> +                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
>> +
>> +               ;; Use deterministic hashes for strings, bytes, and datetime
>> +               ;; objects.
>> +               (setenv "PYTHONHASHSEED" "0")
>> +
>> +               ;; Reset mtime when validating bytecode header.
>> +               (substitute* "Lib/importlib/_bootstrap_external.py"
>> +                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
>> +                  "source_mtime = 1"))
>> +               #t))
>> +           (add-after 'unpack 'disable-timestamp-tests
>> +             (lambda _
>> +               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
>> +                 (("test_bad_marshal")
>> +                  "disable_test_bad_marshal")
>> +                 (("test_no_marshal")
>> +                  "disable_test_no_marshal")
>> +                 (("test_non_code_marshal")
>> +                  "disable_test_non_code_marshal"))
>> +               #t))
>> +           (add-before 'check 'allow-non-deterministic-compilation
>> +             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
>>      (native-search-paths
>>       (list (search-path-specification
>>              (variable "PYTHONPATH")
>>
>> It allows me to build python-six and python-sip reproducibly.  It does
>> not fix problems with Python 2, and I haven’t yet tested if it causes
>> any new problems.

I should also note that Python 3 itself still contains pyc files with
timestamps.  This could be the reason why in Nix all pyc files are
rebuilt (more than once).

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net


* bug#22533: Python bytecode reproducibility
  2018-03-04 19:18         ` Ricardo Wurmus
  2018-03-05  0:02           ` Ricardo Wurmus
@ 2018-03-05 23:21           ` Marius Bakke
  2018-03-06 13:28             ` Ricardo Wurmus
  2018-03-08 10:39           ` Gábor Boskovits
  2 siblings, 1 reply; 29+ messages in thread
From: Marius Bakke @ 2018-03-05 23:21 UTC (permalink / raw)
  To: Ricardo Wurmus, Gábor Boskovits; +Cc: 22533


Ricardo Wurmus <rekado@elephly.net> writes:

> I have applied this patch locally:
>
> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
> index 5f701701a..0d1ecc3c6 100644
> --- a/gnu/packages/python.scm
> +++ b/gnu/packages/python.scm
> @@ -359,8 +359,42 @@ data types.")
>                                "Lib/ctypes/test/test_win32.py" ; fails on aarch64
>                                "Lib/test/test_fcntl.py")) ; fails on aarch64
>                    #t))))
> -    (arguments (substitute-keyword-arguments (package-arguments python-2)
> -                 ((#:tests? _) #t)))
> +    (arguments
> +     (substitute-keyword-arguments (package-arguments python-2)
> +       ((#:tests? _) #t)
> +       ((#:phases phases)
> +        `(modify-phases ,phases
> +           (add-after 'unpack 'patch-timestamp-for-pyc-files
> +             (lambda _
> +               ;; We set DETERMINISTIC_BUILD to only override the mtime when
> +               ;; building with Guix, lest we break auto-compilation in
> +               ;; environments.
> +               (setenv "DETERMINISTIC_BUILD" "1")
> +               (substitute* "Lib/py_compile.py"
> +                 (("source_stats\\['mtime'\\]")
> +                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
> +
> +               ;; Use deterministic hashes for strings, bytes, and datetime
> +               ;; objects.
> +               (setenv "PYTHONHASHSEED" "0")
> +
> +               ;; Reset mtime when validating bytecode header.
> +               (substitute* "Lib/importlib/_bootstrap_external.py"
> +                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
> +                  "source_mtime = 1"))
> +               #t))
> +           (add-after 'unpack 'disable-timestamp-tests
> +             (lambda _
> +               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
> +                 (("test_bad_marshal")
> +                  "disable_test_bad_marshal")
> +                 (("test_no_marshal")
> +                  "disable_test_no_marshal")
> +                 (("test_non_code_marshal")
> +                  "disable_test_non_code_marshal"))
> +               #t))
> +           (add-before 'check 'allow-non-deterministic-compilation
> +             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
>      (native-search-paths
>       (list (search-path-specification
>              (variable "PYTHONPATH")
>
> It allows me to build python-six and python-sip reproducibly.  It does
> not fix problems with Python 2, and I haven’t yet tested if it causes
> any new problems.
>
> It’s a little worrying that I had to disable three more tests that I
> think shouldn’t have failed.

Woow, nice work!  I can't tell what's going on with the tests, they do
some bytecode manipulation stuff.  Maybe it does not expect the low
timestamp somehow?

https://github.com/python/cpython/blob/374c6e178a7599aae46c857b17c6c8bc19dfe4c2/Lib/test/test_importlib/source/test_file_loader.py#L457-L484

I guess we'll do at least one 'core-updates' before 3.7 is released, so
it makes sense to include this.  It should also give us some experience
that might be relevant for 2.7, since it probably won't get the upstream
reproducibility patch that relies on 3.7 features.

The only remark I have is: is introducing a new variable necessary?
SOURCE_DATE_EPOCH implies that the user wants a deterministic build;
the upstream patch doesn't actually honor it outside of making the
hashing method deterministic.  So, I think it might be enough to just
test for SOURCE_DATE_EPOCH instead of DETERMINISTIC_BUILD.  The former
is also already set in the build environment.

However, I just noticed that you unset DETERMINISTIC_BUILD before the
'check' phase.  Did it break more things?

I suppose we'll have to set PYTHONHASHSEED somewhere in
python-build-system as well.  Did you check if that makes a difference
for numpy?  Perhaps it's enough to set it if we add an auto-compilation
step?



* bug#22533: Python bytecode reproducibility
  2018-03-05 23:21           ` Marius Bakke
@ 2018-03-06 13:28             ` Ricardo Wurmus
  2018-03-06 14:43               ` Ricardo Wurmus
  0 siblings, 1 reply; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-06 13:28 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 22533


Marius Bakke <mbakke@fastmail.com> writes:

> The only remark I have is: is introducing a new variable necessary?
> SOURCE_DATE_EPOCH implies that the user wants a deterministic build;
> the upstream patch doesn't actually honor it outside of making the
> hashing method deterministic.  So, I think it might be enough to just
> test for SOURCE_DATE_EPOCH instead of DETERMINISTIC_BUILD.  The former
> is also already set in the build environment.

> However, I just noticed that you unset DETERMINISTIC_BUILD before the
> 'check' phase.  Did it break more things?

Yes, it broke a bunch of tests that are all about recompiling files when
they are considered stale.

> I suppose we'll have to set PYTHONHASHSEED somewhere in
> python-build-system as well.  Did you check if that makes a difference
> for numpy?  Perhaps it's enough to set it if we add an auto-compilation
> step?

Right, I’m going to test this with numpy now.  Thanks for the hint!

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net


* bug#22533: Python bytecode reproducibility
  2018-03-06 13:28             ` Ricardo Wurmus
@ 2018-03-06 14:43               ` Ricardo Wurmus
  2018-03-06 14:57                 ` Gábor Boskovits
  0 siblings, 1 reply; 29+ messages in thread
From: Ricardo Wurmus @ 2018-03-06 14:43 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 22533


Ricardo Wurmus <rekado@elephly.net> writes:

> Marius Bakke <mbakke@fastmail.com> writes:
>
>> I suppose we'll have to set PYTHONHASHSEED somewhere in
>> python-build-system as well.  Did you check if that makes a difference
>> for numpy?  Perhaps it's enough to set it if we add an auto-compilation
>> step?
>
> Right, I’m going to test this with numpy now.  Thanks for the hint!

It did help with one file, which is now built reproducibly, namely

  lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc

This leaves five files in numpy that shouldn’t be but unfortunately are
different.
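
As an illustration of what PYTHONHASHSEED changes, here is a small sketch
that asks two fresh interpreters for the hash of the same string.  The
helper name hash_in_fresh_interpreter is hypothetical, and this only shows
that a fixed seed makes str hashes repeatable across processes.  Whether
that is what made utils.cpython-36.pyc reproducible is an assumption on my
part, not something verified here.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a new interpreter run with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout)


# With a fixed seed, two separate processes agree on the hash value.
first = hash_in_fresh_interpreter("numpy", 0)
second = hash_in_fresh_interpreter("numpy", 0)
print(first == second)
```

Without a fixed seed, str and bytes hashes are randomized per process, so
anything whose order depends on them can change from one build to the next.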

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net


* bug#22533: Python bytecode reproducibility
  2018-03-06 14:43               ` Ricardo Wurmus
@ 2018-03-06 14:57                 ` Gábor Boskovits
  0 siblings, 0 replies; 29+ messages in thread
From: Gábor Boskovits @ 2018-03-06 14:57 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 22533


2018-03-06 15:43 GMT+01:00 Ricardo Wurmus <rekado@elephly.net>:

>
> Ricardo Wurmus <rekado@elephly.net> writes:
>
> > Marius Bakke <mbakke@fastmail.com> writes:
> >
> >> I suppose we'll have to set PYTHONHASHSEED somewhere in
> >> python-build-system as well.  Did you check if that makes a difference
> >> for numpy?  Perhaps it's enough to set it if we add an auto-compilation
> >> step?
> >
> > Right, I’m going to test this with numpy now.  Thanks for the hint!
>
> It did help with one file, which is now built reproducibly, namely
>
>   lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc
>
> This leaves five files in numpy that shouldn’t be but unfortunately are
> different.
>
>
Unfortunately, backporting the upstream version is not straightforward at
all: there are too many changes.  I will have a look at those test
failures instead.


> --
> Ricardo
>
> GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
> https://elephly.net
>
>
>



* bug#22533: Python bytecode reproducibility
  2018-03-04 19:18         ` Ricardo Wurmus
  2018-03-05  0:02           ` Ricardo Wurmus
  2018-03-05 23:21           ` Marius Bakke
@ 2018-03-08 10:39           ` Gábor Boskovits
  2019-01-14 13:40             ` Ricardo Wurmus
  2 siblings, 1 reply; 29+ messages in thread
From: Gábor Boskovits @ 2018-03-08 10:39 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: 22533


2018-03-04 20:18 GMT+01:00 Ricardo Wurmus <rekado@elephly.net>:

> I have applied this patch locally:
>
>
> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
> index 5f701701a..0d1ecc3c6 100644
> --- a/gnu/packages/python.scm
> +++ b/gnu/packages/python.scm
> @@ -359,8 +359,42 @@ data types.")
>                                "Lib/ctypes/test/test_win32.py" ; fails on aarch64
>                                "Lib/test/test_fcntl.py")) ; fails on aarch64
>                    #t))))
> -    (arguments (substitute-keyword-arguments (package-arguments python-2)
> -                 ((#:tests? _) #t)))
> +    (arguments
> +     (substitute-keyword-arguments (package-arguments python-2)
> +       ((#:tests? _) #t)
> +       ((#:phases phases)
> +        `(modify-phases ,phases
> +           (add-after 'unpack 'patch-timestamp-for-pyc-files
> +             (lambda _
> +               ;; We set DETERMINISTIC_BUILD to only override the mtime when
> +               ;; building with Guix, lest we break auto-compilation in
> +               ;; environments.
> +               (setenv "DETERMINISTIC_BUILD" "1")
> +               (substitute* "Lib/py_compile.py"
> +                 (("source_stats\\['mtime'\\]")
> +                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
> +
> +               ;; Use deterministic hashes for strings, bytes, and datetime
> +               ;; objects.
> +               (setenv "PYTHONHASHSEED" "0")
> +
> +               ;; Reset mtime when validating bytecode header.
> +               (substitute* "Lib/importlib/_bootstrap_external.py"
> +                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
> +                  "source_mtime = 1"))
> +               #t))
> +           (add-after 'unpack 'disable-timestamp-tests
> +             (lambda _
> +               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
> +                 (("test_bad_marshal")
> +                  "disable_test_bad_marshal")
> +                 (("test_no_marshal")
> +                  "disable_test_no_marshal")
> +                 (("test_non_code_marshal")
> +                  "disable_test_non_code_marshal"))
> +               #t))
> +           (add-before 'check 'allow-non-deterministic-compilation
> +             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
>      (native-search-paths
>       (list (search-path-specification
>              (variable "PYTHONPATH")
>
>
> It allows me to build python-six and python-sip reproducibly.  It does
> not fix problems with Python 2, and I haven’t yet tested if it causes
> any new problems.
>
> It’s a little worrying that I had to disable three more tests that I
> think shouldn’t have failed.
>
>
OK, I've checked the test issue again.  If we change the
_bootstrap_external.py substitution to:
"source_mtime = 1 if 'DETERMINISTIC_BUILD' in _os.environ else int(source_stats['mtime'])"
the tests no longer fail.  WDYT?
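
Spelled out as a standalone function, the expression proposed above behaves
as sketched here.  The function name effective_source_mtime and the explicit
environ parameter are illustrative assumptions; the real substitution edits
one line in place and reads _os.environ directly.

```python
import os


def effective_source_mtime(source_stats, environ=os.environ):
    """Return the mtime to compare against (or record in) the .pyc header.

    When DETERMINISTIC_BUILD is present in the environment the mtime is
    pinned to 1; otherwise the real source mtime is used, so staleness
    checks keep working outside of a build.
    """
    if "DETERMINISTIC_BUILD" in environ:
        return 1
    return int(source_stats["mtime"])


stats = {"mtime": 1520207866.25, "size": 294}
print(effective_source_mtime(stats, environ={}))
print(effective_source_mtime(stats, environ={"DETERMINISTIC_BUILD": "1"}))
```

The point of the conditional is that the test suite, which runs with
DETERMINISTIC_BUILD unset, still sees real mtimes and can exercise the
recompile-when-stale logic.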



> What do you think?
>
> --
> Ricardo
>
> GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
> https://elephly.net
>
>



* bug#22533: Python bytecode reproducibility
  2018-03-08 10:39           ` Gábor Boskovits
@ 2019-01-14 13:40             ` Ricardo Wurmus
  2019-02-03 21:22               ` Ricardo Wurmus
  0 siblings, 1 reply; 29+ messages in thread
From: Ricardo Wurmus @ 2019-01-14 13:40 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533


Now that we’re using Python 3.7 and this version supports hash-based pyc
files, is this still an issue?  Do we need to do anything to enable
hash-based pyc compilation?

See:
  https://docs.python.org/3/whatsnew/3.7.html#pep-552-hash-based-pyc-files
  https://www.python.org/dev/peps/pep-0552/
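
To make the question concrete, here is a minimal sketch (Python 3.7 or
later) that compiles the same source twice with different mtimes, once per
invalidation mode.  The helper compile_twice and the file names are made up
for illustration.  As far as I can tell from the py_compile documentation,
since 3.7.2 the default mode becomes CHECKED_HASH whenever SOURCE_DATE_EPOCH
is set, which would mean nothing extra is needed in a Guix build; treat that
as an assumption to verify rather than a finding.

```python
import os
import py_compile
import tempfile


def compile_twice(mode):
    """Compile one source file twice, changing its mtime in between.

    Returns the raw contents of both resulting .pyc files.
    """
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "example.py")
        with open(source, "w") as f:
            f.write("answer = 42\n")
        outputs = []
        for mtime in (1_000_000_000, 1_500_000_000):
            os.utime(source, (mtime, mtime))
            target = os.path.join(tmp, f"example-{mtime}.pyc")
            py_compile.compile(source, cfile=target, doraise=True,
                               invalidation_mode=mode)
            with open(target, "rb") as f:
                outputs.append(f.read())
        return outputs


hashed = compile_twice(py_compile.PycInvalidationMode.CHECKED_HASH)
stamped = compile_twice(py_compile.PycInvalidationMode.TIMESTAMP)
print("hash-based outputs identical:", hashed[0] == hashed[1])
print("timestamp outputs identical:", stamped[0] == stamped[1])
```

In hash-based mode the header stores a hash of the source instead of its
mtime, so touching the file does not change the compiled output.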

-- 
Ricardo


* bug#22533: Python bytecode reproducibility
  2019-01-14 13:40             ` Ricardo Wurmus
@ 2019-02-03 21:22               ` Ricardo Wurmus
  2019-02-04 22:39                 ` Ludovic Courtès
  0 siblings, 1 reply; 29+ messages in thread
From: Ricardo Wurmus @ 2019-02-03 21:22 UTC (permalink / raw)
  To: Gábor Boskovits; +Cc: 22533-done


Ricardo Wurmus <rekado@elephly.net> writes:

> Now that we’re using Python 3.7 and this version supports hash-based pyc
> files, is this still an issue?  Do we need to do anything to enable
> hash-based pyc compilation?
>
> See:
>   https://docs.python.org/3/whatsnew/3.7.html#pep-552-hash-based-pyc-files
>   https://www.python.org/dev/peps/pep-0552/

It looks like this is no longer a problem.  I built borg just now and
the pyc files are reproducible.

(The man pages include a date stamp, though, which I’m trying to patch
now.)
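
One way to repeat this check on any package is to compare the .pyc files of
two independent build outputs byte for byte.  The sketch below is not how I
ran the comparison; differing_pyc_files is a hypothetical helper, and it
only looks at files that exist in the first tree.

```python
import hashlib
from pathlib import Path


def _digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def differing_pyc_files(tree_a, tree_b):
    """Return relative paths of .pyc files in tree_a that are missing from
    tree_b or whose contents differ there."""
    tree_a, tree_b = Path(tree_a), Path(tree_b)
    differing = []
    for pyc in sorted(tree_a.rglob("*.pyc")):
        relative = pyc.relative_to(tree_a)
        other = tree_b / relative
        if not other.is_file() or _digest(pyc) != _digest(other):
            differing.append(str(relative))
    return differing
```

Calling it on a store item and its "-check" counterpart should return an
empty list when the bytecode is reproducible.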

--
Ricardo


* bug#22533: Python bytecode reproducibility
  2019-02-03 21:22               ` Ricardo Wurmus
@ 2019-02-04 22:39                 ` Ludovic Courtès
  0 siblings, 0 replies; 29+ messages in thread
From: Ludovic Courtès @ 2019-02-04 22:39 UTC (permalink / raw)
  To: 22533

Ricardo Wurmus <rekado@elephly.net> skribis:

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Now that we’re using Python 3.7 and this version supports hash-based pyc
>> files, is this still an issue?  Do we need to do anything to enable
>> hash-based pyc compilation?
>>
>> See:
>>   https://docs.python.org/3/whatsnew/3.7.html#pep-552-hash-based-pyc-files
>>   https://www.python.org/dev/peps/pep-0552/
>
> It looks like this is no longer a problem.  I built borg just now and
> the pyc files are reproducible.

Yay! \o/

Ludo'.


Thread overview: 29+ messages
2016-02-02  5:15 bug#22533: Non-determinism in python-3 ".pyc" bytecode Leo Famulari
2016-02-02  8:54 ` Leo Famulari
2016-02-02 20:41 ` Ludovic Courtès
2016-02-04 23:17   ` Leo Famulari
2016-03-29 23:11     ` Cyril Roelandt
2016-03-29 23:13     ` Cyril Roelandt
2016-04-06  8:29       ` Ludovic Courtès
2017-05-26 13:41 ` bug#22533: Python bytecode reproducibility Marius Bakke
2018-03-03 22:37   ` Ricardo Wurmus
2018-03-04  9:21     ` Gábor Boskovits
2018-03-04 12:46       ` Ricardo Wurmus
2018-03-04 15:30         ` Gábor Boskovits
2018-03-04 19:18         ` Ricardo Wurmus
2018-03-05  0:02           ` Ricardo Wurmus
2018-03-05  0:05             ` Ricardo Wurmus
2018-03-05 15:36               ` Gábor Boskovits
2018-03-05 20:33                 ` Gábor Boskovits
2018-03-05 21:46                   ` Ricardo Wurmus
2018-03-05 22:02               ` Ricardo Wurmus
2018-03-05 22:06             ` Ricardo Wurmus
2018-03-05 23:21           ` Marius Bakke
2018-03-06 13:28             ` Ricardo Wurmus
2018-03-06 14:43               ` Ricardo Wurmus
2018-03-06 14:57                 ` Gábor Boskovits
2018-03-08 10:39           ` Gábor Boskovits
2019-01-14 13:40             ` Ricardo Wurmus
2019-02-03 21:22               ` Ricardo Wurmus
2019-02-04 22:39                 ` Ludovic Courtès
2018-03-05  9:25     ` Ludovic Courtès
