* bug#25598: R packages are not bit-reproducible
@ 2017-02-01 9:55 Ludovic Courtès
2017-02-01 11:08 ` Ricardo Wurmus
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Ludovic Courtès @ 2017-02-01 9:55 UTC
To: 25598
R packages build non-deterministically:
https://www.gnu.org/software/guix/packages/reproducibility.html
--8<---------------cut here---------------start------------->8---
$ wget -q -O - https://mirror.hydra.gnu.org/nar/imiwif0wn7dxcc7f4zdq09y1l1132pqj-r-zoo-1.7-14 | bunzip2 | guix archive -x one
$ wget -q -O - https://bayfront.guixsd.org/nar/gzip/imiwif0wn7dxcc7f4zdq09y1l1132pqj-r-zoo-1.7-14 | gunzip | guix archive -x two
$ diff -ru one two
diff -ru one/site-library/zoo/DESCRIPTION two/site-library/zoo/DESCRIPTION
--- one/site-library/zoo/DESCRIPTION 2017-02-01 10:49:49.700423133 +0100
+++ two/site-library/zoo/DESCRIPTION 2017-02-01 10:49:57.224462007 +0100
@@ -28,4 +28,4 @@
 Maintainer: Achim Zeileis <Achim.Zeileis@R-project.org>
 Repository: CRAN
 Date/Publication: 2016-12-19 09:38:14
-Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-15 03:12:57 UTC; unix
+Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-23 21:48:44 UTC; unix
Binary files one/site-library/zoo/Meta/package.rds and two/site-library/zoo/Meta/package.rds differ
--8<---------------cut here---------------end--------------->8---
First there’s a timestamp in ‘DESCRIPTION’ (this is discussed at
<https://bugs.debian.org/782764>).
The .rds differences seem less trivial but there’s apparently a fix at
<https://bugs.debian.org/774031>.
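To check that the DESCRIPTION difference really is only the build time, the two "Built:" lines from the diff can be compared with the time field masked out. This is a minimal sketch: `strip_built_time` is a hypothetical helper, not part of Guix or R, and it assumes the ';'-separated field layout seen in the diff above.

```shell
# The two "Built:" lines, copied from the diff above.
one='Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-15 03:12:57 UTC; unix'
two='Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-23 21:48:44 UTC; unix'

# Hypothetical helper: replace the third ';'-separated field (the build
# time) with a fixed placeholder so that the rest can be compared.
strip_built_time() {
  printf '%s\n' "$1" |
    sed 's/^\(Built: [^;]*; [^;]*; \)[^;]*\(; .*\)$/\1<time>\2/'
}

if [ "$(strip_built_time "$one")" = "$(strip_built_time "$two")" ]; then
  echo "only the build time differs"
else
  echo "something other than the build time differs"
fi
```

The same masking could be applied to whole DESCRIPTION files before running diff, which leaves only the .rds differences to explain.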
Ludo’.
* bug#25598: R packages are not bit-reproducible
2017-02-01 9:55 bug#25598: R packages are not bit-reproducible Ludovic Courtès
@ 2017-02-01 11:08 ` Ricardo Wurmus
2017-02-01 13:00 ` Ludovic Courtès
[not found] ` <87bmuafcgg.fsf@gnu.org>
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Ricardo Wurmus @ 2017-02-01 11:08 UTC
To: Ludovic Courtès; +Cc: 25598
[-- Attachment #1: Type: text/plain, Size: 163 bytes --]
It looks like R 3.3.2 already includes the fixes but they need to be
explicitly requested when installing packages.
Attached is a patch that seems to fix this.
[-- Attachment #2: 0001-build-r-build-system-Use-deterministic-built-date.patch --]
[-- Type: text/x-patch, Size: 1340 bytes --]
From fa42971cb7099e3b370565de5d3f454faecf0369 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
Date: Wed, 1 Feb 2017 11:42:34 +0100
Subject: [PATCH] build: r-build-system: Use deterministic built date.
Fixes <http://bugs.gnu.org/25598>.
* guix/build/r-build-system.scm (install): Pass "--built-timestamp"
option to make build deterministic.
---
guix/build/r-build-system.scm | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/guix/build/r-build-system.scm b/guix/build/r-build-system.scm
index 3fc13eb83..24aa73d4f 100644
--- a/guix/build/r-build-system.scm
+++ b/guix/build/r-build-system.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2015 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2015, 2017 Ricardo Wurmus <rekado@elephly.net>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -84,6 +84,7 @@
          (params (append configure-flags
                          (list "--install-tests"
                                (string-append "--library=" site-library)
+                               "--built-timestamp=1970-01-01"
                                ".")))
          (site-path (string-append site-library ":"
                                    (generate-site-path inputs))))
--
2.11.0
* bug#25598: R packages are not bit-reproducible
2017-02-01 11:08 ` Ricardo Wurmus
@ 2017-02-01 13:00 ` Ludovic Courtès
0 siblings, 0 replies; 9+ messages in thread
From: Ludovic Courtès @ 2017-02-01 13:00 UTC
To: Ricardo Wurmus; +Cc: 25598
Hi!
Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:
> From fa42971cb7099e3b370565de5d3f454faecf0369 Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
> Date: Wed, 1 Feb 2017 11:42:34 +0100
> Subject: [PATCH] build: r-build-system: Use deterministic built date.
>
> Fixes <http://bugs.gnu.org/25598>.
>
> * guix/build/r-build-system.scm (install): Pass "--built-timestamp"
> option to make build deterministic.
Great. I think it’s fine for master: it triggers a rebuild of 276 packages,
but they don’t take long to build.
Does that also help with the .rds discrepancies?
Thank you for the super-fast reply!
Ludo’.
[parent not found: <87bmuafcgg.fsf@gnu.org>]
* bug#25598: [PATCH] More reproducibility fixes for R.
[not found] ` <87bmuafcgg.fsf@gnu.org>
@ 2017-03-08 11:53 ` Ricardo Wurmus
0 siblings, 0 replies; 9+ messages in thread
From: Ricardo Wurmus @ 2017-03-08 11:53 UTC
To: Ludovic Courtès; +Cc: Guix-devel, 25598
Ludovic Courtès <ludo@gnu.org> writes:
> Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:
>
>> attached are more reproducibility fixes for R. Unfortunately, it seems
>> that files of type “rdb”, “rdx”, and “rds” are still not reproducible.
>> This leaves us with the following files in R that are currently not
>> reproducible:
>
> Could it be that --built-timestamp is not honored for R modules within
> R?
With these two patches the flag *should* be honoured. I don’t
understand yet where the rds differences come from, but I’ll
investigate this now.
> Do the Debian patches mentioned in #25598 help?
R 3.3.2 already includes the patches that were posted on Debian bug
#774031. The patch at #782764 is the equivalent of our change to the
r-build-system to pass down the flag to R packages.
>> From e8cd2114b824ab6fed671c2214956ee22deeaedf Mon Sep 17 00:00:00 2001
>> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
>> Date: Thu, 9 Feb 2017 14:34:57 +0100
>> Subject: [PATCH 1/2] gnu: r: Fix syntax for INSTALL_OPTS.
>>
>> This is a follow-up to commit 4621acfd8272fa93d0530faa5f015b26a194b587.
>>
>> * gnu/packages/statistics.scm (r)[arguments]: Ensure that
>> "--built-timestamp" appears on the same line as the other INSTALL_OPTS.
>
> So the previous attempt had no effect, right?
Yeah, it was not effective and I failed to use “guix build --check”
properly (without grafts), so I thought everything was fine already.
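For reference, a check that leaves grafts out of the picture could look like the sketch below. It assumes the package under test is r-zoo (the one from the original report). `--check` and `--no-grafts` are standard `guix build` options, but the wrapper around them is only illustrative.

```shell
# Package to re-check; "r-zoo" is an assumption taken from the original report.
pkg=r-zoo

# --check rebuilds an item that is already in the store and compares the
# result with the stored one; --no-grafts makes sure the ungrafted package
# is what gets rebuilt and compared.
check_cmd="guix build --check --no-grafts $pkg"

if command -v guix >/dev/null 2>&1; then
  $check_cmd || echo "check failed: $pkg did not rebuild bit-for-bit"
else
  echo "guix not found; would run: $check_cmd"
fi
```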
>> From 95b939f662a29b3cc6973a2fba286f32faf010c1 Mon Sep 17 00:00:00 2001
>> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
>> Date: Thu, 9 Feb 2017 15:40:02 +0100
>> Subject: [PATCH 2/2] gnu: r: Fix more reproducibility problems.
>>
>> * gnu/packages/statistics.scm (r)[arguments]: Patch locations in the
>> build system that need special treatment for reproducibility.
>
> LGTM, thanks!
I pushed both to master.
--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
[parent not found: <idj4m03whdy.fsf@bimsb-sys02.mdc-berlin.net>]
* bug#25598: [PATCH] More reproducibility fixes for R.
[not found] ` <idj4m03whdy.fsf@bimsb-sys02.mdc-berlin.net>
@ 2017-02-10 12:38 ` Ludovic Courtès
2017-03-08 17:56 ` Ricardo Wurmus
1 sibling, 0 replies; 9+ messages in thread
From: Ludovic Courtès @ 2017-02-10 12:38 UTC
To: Ricardo Wurmus; +Cc: Guix-devel, 25598
Hi!
Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:
> attached are more reproducibility fixes for R. Unfortunately, it seems
> that files of type “rdb”, “rdx”, and “rds” are still not reproducible.
> This leaves us with the following files in R that are currently not
> reproducible:
Could it be that --built-timestamp is not honored for R modules within R?
Do the Debian patches mentioned in #25598 help?
> From e8cd2114b824ab6fed671c2214956ee22deeaedf Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
> Date: Thu, 9 Feb 2017 14:34:57 +0100
> Subject: [PATCH 1/2] gnu: r: Fix syntax for INSTALL_OPTS.
>
> This is a follow-up to commit 4621acfd8272fa93d0530faa5f015b26a194b587.
>
> * gnu/packages/statistics.scm (r)[arguments]: Ensure that
> "--built-timestamp" appears on the same line as the other INSTALL_OPTS.
So the previous attempt had no effect, right?
LGTM.
> From 95b939f662a29b3cc6973a2fba286f32faf010c1 Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
> Date: Thu, 9 Feb 2017 15:40:02 +0100
> Subject: [PATCH 2/2] gnu: r: Fix more reproducibility problems.
>
> * gnu/packages/statistics.scm (r)[arguments]: Patch locations in the
> build system that need special treatment for reproducibility.
LGTM, thanks!
Ludo’.
* bug#25598: [PATCH] More reproducibility fixes for R.
[not found] ` <idj4m03whdy.fsf@bimsb-sys02.mdc-berlin.net>
2017-02-10 12:38 ` Ludovic Courtès
@ 2017-03-08 17:56 ` Ricardo Wurmus
1 sibling, 0 replies; 9+ messages in thread
From: Ricardo Wurmus @ 2017-03-08 17:56 UTC
To: Guix-devel; +Cc: 25598
Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> writes:
> attached are more reproducibility fixes for R. Unfortunately, it seems
> that files of type “rdb”, “rdx”, and “rds” are still not reproducible.
> This leaves us with the following files in R that are currently not
> reproducible:
[…]
> /lib/R/library/boot/help/paths.rds
> /lib/R/library/class/help/paths.rds
> /lib/R/library/cluster/help/paths.rds
> /lib/R/library/codetools/help/paths.rds
> /lib/R/library/foreign/help/paths.rds
> /lib/R/library/KernSmooth/help/paths.rds
> /lib/R/library/lattice/help/paths.rds
> /lib/R/library/MASS/help/paths.rds
> /lib/R/library/Matrix/help/paths.rds
> /lib/R/library/mgcv/help/paths.rds
> /lib/R/library/nlme/help/paths.rds
> /lib/R/library/nnet/help/paths.rds
> /lib/R/library/rpart/help/paths.rds
> /lib/R/library/spatial/help/paths.rds
> /lib/R/library/survival/help/paths.rds
[…]
>
> I’ll try to figure out if there’s something we can do to make them
> reproducible (there’s a Debian bug report with relevant information). I
> had originally assumed that 3.3.2 already included fixes for this.
The paths.rds files contain temporary paths like this:
/tmp/guix-build-r-3.3.2.drv-0/RtmpCmeE9W/R.INSTALL43fb733deccc/survival/
These paths contain the random strings produced by “mkdtemp”. This
happens in “src/main/sysutils.c”.
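The effect is easy to reproduce outside of R. The sketch below uses the `mktemp -d` command as a stand-in for the C-level “mkdtemp” call; the "Rtmp" prefix only mimics the names seen above. It shows that two directories created from the same template get different random suffixes, so any file that records such a path differs between builds.

```shell
# Scratch parent directory so the example cleans up after itself.
parent=$(mktemp -d)

# Same template twice; the XXXXXX part is filled with random characters.
a=$(mktemp -d "$parent/RtmpXXXXXX")
b=$(mktemp -d "$parent/RtmpXXXXXX")

# Print only the last path component of each directory.
printf '%s\n%s\n' "${a##*/}" "${b##*/}"

rm -rf "$parent"
```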
I don’t know if we need these files. All of them are part of the
recommended packages. I don’t know if these are also built by Debian.
I patched the package in a previous commit to override the built
timestamp, and it does seem to have an effect on the DESCRIPTION file,
but it does not affect the .rd* files. More investigation required.
--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
* bug#25598: [PATCH] gnu: r: Fix remaining reproducibility problems.
2017-02-01 9:55 bug#25598: R packages are not bit-reproducible Ludovic Courtès
` (2 preceding siblings ...)
[not found] ` <idj4m03whdy.fsf@bimsb-sys02.mdc-berlin.net>
@ 2017-03-16 7:54 ` Ricardo Wurmus
2017-03-16 9:00 ` Ludovic Courtès
3 siblings, 1 reply; 9+ messages in thread
From: Ricardo Wurmus @ 2017-03-16 7:54 UTC
To: 25598
Fixes <https://bugs.gnu.org/25598>.
* gnu/packages/statistics.scm (r)[arguments]: Add remaining reproducibility
fixes to "build-reproducibly" phase.
---
gnu/packages/statistics.scm | 35 ++++++++++++++++++++++++++++++++++-
1 file changed, 34 insertions(+), 1 deletion(-)
diff --git a/gnu/packages/statistics.scm b/gnu/packages/statistics.scm
index 656895273..2a20abd86 100644
--- a/gnu/packages/statistics.scm
+++ b/gnu/packages/statistics.scm
@@ -134,11 +134,44 @@ be output in text, PostScript, PDF or HTML.")
              #t))
          (add-after 'unpack 'build-reproducibly
            (lambda _
-             ;; Ensure that gzipped files are reproducible
+             ;; The documentation contains time stamps to demonstrate
+             ;; documentation generation in different phases.
+             (substitute* "src/library/tools/man/Rd2HTML.Rd"
+               (("\\\\%Y-\\\\%m-\\\\%d at \\\\%H:\\\\%M:\\\\%S")
+                "(removed for reproducibility)"))
+
+             ;; Remove timestamp from tracing environment. This fixes
+             ;; reproducibility of "methods.rd{b,x}".
+             (substitute* "src/library/methods/R/trace.R"
+               (("dateCreated = Sys.time\\(\\)")
+                "dateCreated = as.POSIXct(\"1970-1-1 00:00:00\", tz = \"UTC\")"))
+
+             ;; Ensure that gzipped files are reproducible.
              (substitute* '("src/library/grDevices/Makefile.in"
                             "doc/manual/Makefile.in")
                (("R_GZIPCMD\\)" line)
                 (string-append line " -n")))
+
+             ;; The "srcfile" procedure in "src/library/base/R/srcfile.R"
+             ;; queries the mtime of a given file and records it in an object.
+             ;; This is acceptable at runtime to detect stale source files,
+             ;; but it destroys reproducibility at build time.
+             ;;
+             ;; Instead of disabling this feature, which may have unexpected
+             ;; consequences, we reset the mtime of generated files before
+             ;; passing them to the "srcfile" procedure.
+             (substitute* "src/library/Makefile.in"
+               (("@\\(cd base && \\$\\(MAKE\\) mkdesc\\)" line)
+                (string-append line "\n find $(top_builddir)/library/tools | xargs touch -d '1970-01-01'; \n"))
+               (("@\\$\\(MAKE\\) Rdobjects" line)
+                (string-append "@find $(srcdir)/tools | xargs touch -d '1970-01-01'; \n "
+                               line)))
+             (substitute* "src/library/tools/Makefile.in"
+               (("@\\$\\(INSTALL_DATA\\) all.R \\$\\(top_builddir\\)/library/\\$\\(pkg\\)/R/\\$\\(pkg\\)" line)
+                (string-append
+                 line
+                 "\n find $(srcdir)/$(pkg) $(top_builddir)/library/$(pkg) | xargs touch -d \"1970-01-01\"; \n")))
+
              ;; This library is installed using "install_package_description",
              ;; so we need to pass the "builtStamp" argument.
              (substitute* "src/library/tools/Makefile.in"
--
2.12.0
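The " -n" appended to the R_GZIPCMD invocations in the patch above matters because gzip, by default, records the input file's name and modification time in the archive header. The following sketch is independent of the R build. It compresses the same content under two different mtimes, with and without `-n`, and compares the results. The file names and timestamps are arbitrary choices for illustration.

```shell
work=$(mktemp -d)
printf 'same content\n' > "$work/f"

# First "build": the file has one mtime (touch -t takes [[CC]YY]MMDDhhmm[.SS]).
touch -t 201701150312.57 "$work/f"
gzip -c "$work/f" > "$work/plain1.gz"
gzip -n -c "$work/f" > "$work/n1.gz"

# Second "build": identical content, different mtime.
touch -t 201701232148.44 "$work/f"
gzip -c "$work/f" > "$work/plain2.gz"
gzip -n -c "$work/f" > "$work/n2.gz"

# Compare the two archives of each kind byte for byte.
if cmp -s "$work/plain1.gz" "$work/plain2.gz"; then plain=same; else plain=differ; fi
if cmp -s "$work/n1.gz" "$work/n2.gz"; then stripped=same; else stripped=differ; fi
echo "without -n: $plain; with -n: $stripped"

rm -rf "$work"
```

With GNU gzip the archives made without `-n` would be expected to differ in their header, while the `-n` pair should be byte-identical. Other gzip implementations may not record the mtime at all.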
end of thread, other threads:[~2017-03-17 9:18 UTC | newest]
Thread overview: 9+ messages
2017-02-01 9:55 bug#25598: R packages are not bit-reproducible Ludovic Courtès
2017-02-01 11:08 ` Ricardo Wurmus
2017-02-01 13:00 ` Ludovic Courtès
[not found] ` <87bmuafcgg.fsf@gnu.org>
2017-03-08 11:53 ` bug#25598: [PATCH] More reproducibility fixes for R Ricardo Wurmus
[not found] ` <idj4m03whdy.fsf@bimsb-sys02.mdc-berlin.net>
2017-02-10 12:38 ` Ludovic Courtès
2017-03-08 17:56 ` Ricardo Wurmus
2017-03-16 7:54 ` bug#25598: [PATCH] gnu: r: Fix remaining reproducibility problems Ricardo Wurmus
2017-03-16 9:00 ` Ludovic Courtès
2017-03-17 9:17 ` Ricardo Wurmus