unofficial mirror of bug-guix@gnu.org 
* bug#45675: Zip-based archives store timestamps
From: Miguel Ángel Arruga Vivas @ 2021-01-05 13:10 UTC
  To: 45675

A procedure like reset-gzip-timestamp, adapted to zip-based archives
such as Smalltalk's STAR or Java's JAR binary formats, would be useful
for reproducibility, since some or all of their contents are usually
generated at build time.

In the latest Zip specification[1], which only seems to be encumbered
regarding encryption, there is one header ID that could be used for
timestamps (0x0020), but third-party header IDs are also allowed,
including a "commonly used" 0x5455 timestamp.

[1] https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.8.TXT
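
For reference, the places where an entry can carry a timestamp can be
inspected with a few lines of Guile.  This is only a sketch and not part
of any patch: the procedure names are made up for illustration, the
offsets are those of the local file header described in [1], and only
the first entry of the archive is examined.

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 binary-ports))

(define (u16 bv offset)
  ;; Zip stores all multi-byte integers in little-endian order.
  (bytevector-u16-ref bv offset (endianness little)))

(define (dos-date+time->list date time)
  "Decode the MS-DOS DATE and TIME fields into
(year month day hour minute second)."
  (list (+ 1980 (logand (ash date -9) #x7f))
        (logand (ash date -5) #x0f)
        (logand date #x1f)
        (logand (ash time -11) #x1f)
        (logand (ash time -5) #x3f)
        (* 2 (logand time #x1f))))

(define (extra-field-ids bv start len)
  "Return the header IDs of the extra field records found in the LEN
bytes of BV beginning at START."
  ;; Each record is a 2-byte ID, a 2-byte data size, then the data.
  (let ((end (+ start len)))
    (let loop ((pos start) (ids '()))
      (if (> (+ pos 4) end)
          (reverse ids)
          (loop (+ pos 4 (u16 bv (+ pos 2)))
                (cons (u16 bv pos) ids))))))

(define (first-entry-timestamps file)
  "Return two values for the first entry of FILE: its decoded MS-DOS
timestamp and the list of its extra field header IDs."
  (let ((bv (call-with-input-file file get-bytevector-all #:binary #t)))
    (unless (and (bytevector? bv)
                 (>= (bytevector-length bv) 30)
                 (= (bytevector-u32-ref bv 0 (endianness little))
                    #x04034b50))
      (error "no local file header at the start of" file))
    (let ((name-len (u16 bv 26))
          (extra-len (u16 bv 28)))
      (values (dos-date+time->list (u16 bv 12) (u16 bv 10))
              (extra-field-ids bv (+ 30 name-len) extra-len)))))
```

Calling (first-entry-timestamps "foo.jar") returns both values; an ID of
#x5455 in the second one means the entry carries the "extended
timestamp" record in addition to the MS-DOS date and time fields.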





* bug#45675: Zip-based archives store timestamps
From: Julien Lepiller @ 2021-01-05 15:17 UTC
  To: Miguel Ángel Arruga Vivas, 45675


For Java packages, we have a strip-jar-timestamps phase in the ant-build-system.


* bug#45675: Zip-based archives store timestamps
From: Miguel Ángel Arruga Vivas @ 2021-01-06 22:34 UTC
  To: Julien Lepiller; +Cc: 45675


Hi,

Julien Lepiller <julien@lepiller.eu> writes:

> For java packages, we have a strip-jar-timestamps phase in the ant-build-system.

Thanks for the pointer.  Do you think it would be worth extracting that
into (guix build utils), as the attached (WIP) patch does?  It rebuilds
the world and replaces all of the "old usages", so I'm still waiting for
the build to reach ant-bootstrap...

Happy hacking!
Miguel

[-- Attachment #2: wip.patch --]
[-- Type: text/x-patch, Size: 11032 bytes --]

From dd2e78badad805cff8be940411994533aed8b059 Mon Sep 17 00:00:00 2001
From: Miguel Ángel Arruga Vivas <rosen644835@gmail.com>
Date: Wed, 6 Jan 2021 23:29:36 +0100
Subject: [PATCH] wip-build-utils: Extract reset-zip-timestamp and use it
 everywhere.

---
 gnu/packages/java.scm           | 73 ++++++++-------------------------
 guix/build/ant-build-system.scm | 32 ++++-----------
 guix/build/utils.scm            | 48 ++++++++++++++++++++++
 3 files changed, 73 insertions(+), 80 deletions(-)

diff --git a/gnu/packages/java.scm b/gnu/packages/java.scm
index 758f8f1859..82d18bf62a 100644
--- a/gnu/packages/java.scm
+++ b/gnu/packages/java.scm
@@ -411,28 +411,11 @@ JNI.")
          (add-after 'build 'strip-jar-timestamps ;based on ant-build-system
            (lambda* (#:key outputs #:allow-other-keys)
              (define (repack-archive jar)
-               (let* ((dir (mkdtemp! "jar-contents.XXXXXX"))
-                      (manifest (string-append dir "/META-INF/MANIFESTS.MF")))
-                 (with-directory-excursion dir
-                   (invoke "unzip" jar))
-                 (delete-file jar)
-                 ;; XXX: copied from (gnu build install)
-                 (for-each (lambda (file)
-                             (let ((s (lstat file)))
-                               (unless (eq? (stat:type s) 'symlink)
-                                 (utime file  0 0 0 0))))
-                           (find-files dir #:directories? #t))
-                 ;; It is important that the manifest appears first.
-                 (with-directory-excursion dir
-                   (let* ((files (find-files "." ".*" #:directories? #t))
-                          ;; To ensure that the reference scanner can
-                          ;; detect all store references in the jars
-                          ;; we disable compression with the "-0" option.
-                          (command (if (file-exists? manifest)
-                                       `("zip" "-0" "-X" ,jar ,manifest
-                                         ,@files)
-                                       `("zip" "-0" "-X" ,jar ,@files))))
-                     (apply invoke command)))))
+               (let ((mktempdir (lambda ()
+                                  (mkdtemp! "jar-contents.XXXXXX"))))
+                 (reset-zip-timestamp jar mktempdir
+                                      #:first-file "/META-INF/MANIFEST.MF"
+                                      #:compression-level "-0")))
              (for-each repack-archive
                     (find-files
                      (string-append (assoc-ref %outputs "out") "/lib")
@@ -1962,21 +1945,10 @@ new Date();"))
          (add-after 'install 'strip-zip-timestamps
            (lambda* (#:key outputs #:allow-other-keys)
              (use-modules (guix build syscalls))
-             (for-each (lambda (zip)
-                         (let ((dir (mkdtemp! "zip-contents.XXXXXX")))
-                           (with-directory-excursion dir
-                             (invoke "unzip" zip))
-                           (delete-file zip)
-                           (for-each (lambda (file)
-                                       (let ((s (lstat file)))
-                                         (unless (eq? (stat:type s) 'symlink)
-                                           (format #t "reset ~a~%" file)
-                                           (utime file 0 0 0 0))))
-                             (find-files dir #:directories? #t))
-                           (with-directory-excursion dir
-                             (let ((files (find-files "." ".*" #:directories? #t)))
-                               (apply invoke "zip" "-0" "-X" zip files)))))
-               (find-files (assoc-ref outputs "doc") ".*.zip$"))
+             (let ((mktempdir (lambda () (mkdtemp! "zip-contents.XXXXXX"))))
+               (for-each (lambda (zip)
+                           (reset-zip-timestamp zip mktempdir))
+                        (find-files (assoc-ref outputs "doc") ".*.zip$")))
              #t)))))
     (inputs
      `(("alsa-lib" ,alsa-lib)
@@ -2197,25 +2169,14 @@ new Date();"))
              (use-modules (guix build syscalls)
                           (ice-9 binary-ports)
                           (rnrs bytevectors))
-             (letrec ((repack-archive
-                    (lambda (archive)
-                      (let ((dir (mkdtemp! "zip-contents.XXXXXX")))
-                        (with-directory-excursion dir
-                          (invoke "unzip" archive))
-                        (delete-file archive)
-                        (for-each (compose repack-archive canonicalize-path)
-                                  (find-files dir "(ct.sym|.*.jar)$"))
-                        (let ((reset-file-timestamp
-                               (lambda (file)
-                                 (let ((s (lstat file)))
-                                   (unless (eq? (stat:type s) 'symlink)
-                                     (format #t "reset ~a~%" file)
-                                     (utime file 0 0 0 0))))))
-                          (for-each reset-file-timestamp
-                                    (find-files dir #:directories? #t)))
-                        (with-directory-excursion dir
-                          (let ((files (find-files "." ".*" #:directories? #t)))
-                            (apply invoke "zip" "-0" "-X" archive files)))))))
+             (let* ((mktempdir (lambda ()
+                                 (mkdtemp! "zip-contents.XXXXXX")))
+                    (repack-archive
+                     (lambda (archive)
+                       (reset-zip-timestamp archive mktempdir
+                                            #:compression-level "-0"
+                                            #:recursion-regexp
+                                            "(ct.sym|.*.jar)$"))))
                (for-each repack-archive
                          (find-files (assoc-ref outputs "doc") ".*.zip$"))
                (for-each repack-archive
diff --git a/guix/build/ant-build-system.scm b/guix/build/ant-build-system.scm
index fae1b47ec5..d6c8b71abc 100644
--- a/guix/build/ant-build-system.scm
+++ b/guix/build/ant-build-system.scm
@@ -201,35 +201,19 @@ dependencies of this jar file."
 repack them.  This is necessary to ensure that archives are reproducible."
   (define (repack-archive jar)
     (format #t "repacking ~a\n" jar)
-    (let* ((dir (mkdtemp! "jar-contents.XXXXXX"))
-           (manifest (string-append dir "/META-INF/MANIFEST.MF")))
-      (with-directory-excursion dir
-        (invoke "jar" "xf" jar))
-      (delete-file jar)
-      ;; XXX: copied from (gnu build install)
-      (for-each (lambda (file)
-                  (let ((s (lstat file)))
-                    (unless (eq? (stat:type s) 'symlink)
-                      (utime file 0 0 0 0))))
-                (find-files dir #:directories? #t))
-
+    (let ((manifest "/META-INF/MANIFEST.MF")
+          (mktmpdir (lambda () (mkdtemp! "jar-contents.XXXXXX"))))
       ;; The jar tool will always set the timestamp on the manifest file
       ;; and the containing directory to the current time, even when we
       ;; reuse an existing manifest file.  To avoid this we use "zip"
       ;; instead of "jar".  It is important that the manifest appears
       ;; first.
-      (with-directory-excursion dir
-        (let* ((files (find-files "." ".*" #:directories? #t))
-               ;; To ensure that the reference scanner can detect all
-               ;; store references in the jars we disable compression
-               ;; with the "-0" option.
-               (command (if (file-exists? manifest)
-                            `("zip" "-0" "-X" ,jar ,manifest ,@files)
-                            `("zip" "-0" "-X" ,jar ,@files))))
-          (apply invoke command)))
-      (utime jar 0 0)
-      #t))
-
+      (reset-zip-timestamp jar mktmpdir
+                           #:first-file manifest
+                           ;; To ensure that the reference scanner can detect
+                           ;; all store references in the jars we disable
+                           ;; compression with the "-0" option.
+                           #:compression-level "-0")))
   (for-each (match-lambda
               ((output . directory)
                (for-each repack-archive
diff --git a/guix/build/utils.scm b/guix/build/utils.scm
index 419c10195b..3f82d87732 100644
--- a/guix/build/utils.scm
+++ b/guix/build/utils.scm
@@ -56,7 +56,9 @@
             elf-file?
             ar-file?
             gzip-file?
+            zip-file?
             reset-gzip-timestamp
+            reset-zip-timestamp
             with-directory-excursion
             mkdir-p
             install-file
@@ -282,6 +284,52 @@ preserve FILE's modification time."
      (lambda ()
        (chdir init)))))
 
+(define %zip-magic-bytes
+  ;; Magic bytes of zip file.  Beware, it's a small header so there could be
+  ;; false positives.
+  #vu8(#x50 #x4b))
+
+(define zip-file?
+  (file-header-match %zip-magic-bytes))
+
+(define* (reset-zip-timestamp zip-file tmp-dir-generator
+                              #:key (first-file #f)
+                              (compression-level "-6")
+                              (recursion-regexp #f))
+  "Reset the timestamps inside ZIP-FILE, regenerating it with the
+COMPRESSION-LEVEL provided, and optionally placing FIRST-FILE at the
+beginning of the archive when it exists.
+
+TMP-DIR-GENERATOR must return a different directory each time it is called
+when RECURSION-REGEXP is provided."
+  (let* ((dir (tmp-dir-generator))
+         (first-path (and first-file (string-append dir first-file))))
+    (with-directory-excursion dir
+      (invoke "unzip" zip-file))
+    (delete-file zip-file)
+    (when recursion-regexp
+      (for-each (lambda (file)
+                  (reset-zip-timestamp (canonicalize-path file)
+                                       tmp-dir-generator
+                                       #:first-file first-file
+                                       #:compression-level compression-level
+                                       #:recursion-regexp recursion-regexp))
+                (find-files dir recursion-regexp)))
+    (for-each (lambda (file)
+                (let ((s (lstat file)))
+                  (unless (eq? (stat:type s) 'symlink)
+                    (utime file 0 0 0 0))))
+              (find-files dir #:directories? #t))
+
+    (with-directory-excursion dir
+      (let* ((files (find-files "." ".*" #:directories? #t))
+             (call-zip `("zip" ,compression-level "-X" ,zip-file))
+             (command (if (and first-path (file-exists? first-path))
+                          `(,@call-zip ,first-path ,@files)
+                          `(,@call-zip ,@files))))
+        (apply invoke command)))
+    (utime zip-file 0 0)))
+
 (define (mkdir-p dir)
   "Create directory DIR and all its ancestors."
   (define absolute?
-- 
2.30.0

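Regarding the false positives mentioned in the comment on
%zip-magic-bytes: a stricter check is possible by comparing four bytes
instead of two.  The following is a minimal sketch, independent of
file-header-match; the name strict-zip-file? is hypothetical, and only
the two signatures that normally open a non-spanned archive are
considered.  Archives with data prepended (self-extracting ones, for
instance) would not match.

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 binary-ports))

(define %local-file-header-signature
  ;; "PK\x03\x04": start of the first entry of a non-empty archive.
  #vu8(#x50 #x4b #x03 #x04))

(define %end-of-central-directory-signature
  ;; "PK\x05\x06": the only record present in an empty archive.
  #vu8(#x50 #x4b #x05 #x06))

(define (strict-zip-file? file)
  "Return #t if FILE starts with one of the two signatures above."
  (let ((header (call-with-input-file file
                  (lambda (port) (get-bytevector-n port 4))
                  #:binary #t)))
    ;; get-bytevector-n returns the EOF object for an empty file and a
    ;; shorter bytevector for a file of fewer than four bytes.
    (and (bytevector? header)
         (or (bytevector=? header %local-file-header-signature)
             (bytevector=? header %end-of-central-directory-signature)))))
```

The trade-off is that a four-byte comparison rejects more unrelated
files starting with "PK", at the cost of a second signature to keep in
sync with the specification.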


* bug#45675: Zip-based archives store timestamps
From: Julien Lepiller @ 2021-01-06 23:10 UTC
  To: Miguel Ángel Arruga Vivas; +Cc: 45675


This sounds like a good idea indeed.
