unofficial mirror of help-guix@gnu.org 
* Output of guix build --check foo is not part of store deduplication
@ 2018-08-09  9:45 Björn Höfling
  2018-08-10  9:16 ` Chris Marusich
  2018-08-24 22:08 ` Ludovic Courtès
From: Björn Höfling @ 2018-08-09  9:45 UTC
  To: Guix-Help

Is there any reason why the output of 'guix build --check ...' is not
part of deduplication? I will explain my problem:

When checking for (un)reproducibility, we use something like:

guix build --check -K foo

That will build the package foo again and produce a store output:

/gnu/store/hash..-foo-1.0.0-check

You can then use diffoscope to view the difference between the old and
the new '-check' output.

Usually, the store gets deduplicated, i.e. if files bar and baz have
the same content, they become hard links to the same inode on disk.
That's cool for saving space if, for example, some package gets updated
because of a changed dependency but there is really little or no change
to the output files.
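
To make the idea concrete, here is a minimal Python sketch of
content-based deduplication with hard links. It only illustrates the
principle and is not the daemon's actual code; the function names and
the links directory are hypothetical.

```python
import hashlib
import os
import tempfile


def deduplicate(paths, links_dir):
    """Replace files with identical content by hard links to one shared copy."""
    for path in paths:
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        shared = os.path.join(links_dir, digest)
        if not os.path.exists(shared):
            # First time this content is seen: register it under its hash.
            os.link(path, shared)
        elif not os.path.samefile(path, shared):
            # Content already registered: swap the file for a hard link.
            tmp = path + ".tmp"
            os.link(shared, tmp)
            os.replace(tmp, path)


def demo():
    """Create two identical files, deduplicate them, report the result."""
    with tempfile.TemporaryDirectory() as root:
        links_dir = os.path.join(root, "links")  # hypothetical name
        os.mkdir(links_dir)
        bar = os.path.join(root, "bar")
        baz = os.path.join(root, "baz")
        for p in (bar, baz):
            with open(p, "wb") as f:
                f.write(b"same content\n")
        before = os.path.samefile(bar, baz)
        deduplicate([bar, baz], links_dir)
        after = os.path.samefile(bar, baz)
        return before, after, os.stat(bar).st_nlink
```

After the call, bar and baz are two names for one inode, so the link
count of that inode goes up with every additional identical file.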

But the '-check' files are somehow not part of that deduplication, even
if you enforce deduplication with guix gc --optimize. You can see it
like this:

ls -l  /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2-check/share/man/man3/shishi_asreq.3.gz /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2/share/man/man3/shishi_asreq.3.gz 
-r--r--r--  1 root root 624 Jan  1  1970 /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2-check/share/man/man3/shishi_asreq.3.gz
-r--r--r-- 11 root root 624 Jan  1  1970 /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2/share/man/man3/shishi_asreq.3.gz

ls -i  /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2-check/share/man/man3/shishi_asreq.3.gz /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2/share/man/man3/shishi_asreq.3.gz 
46161304 /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2-check/share/man/man3/shishi_asreq.3.gz
45141642 /gnu/store/zlxarsbwwkasy69cyv34jvzi7bgmajxz-shishi-1.0.2/share/man/man3/shishi_asreq.3.gz

The '-check' output has a link count of only 1, while the actual output
has 11 links, because I already have so many store items/generations of
that package around. The inode numbers differ.
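
The same check can be scripted. The following Python sketch compares
inode and link count of two paths; link_report and the file names in
demo are made up for illustration, with temporary files standing in for
the two store items.

```python
import os
import tempfile


def link_report(a, b):
    """Return whether two paths share an inode, plus their link counts."""
    sa, sb = os.stat(a), os.stat(b)
    return {
        "same_inode": (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino),
        "links": (sa.st_nlink, sb.st_nlink),
    }


def demo():
    with tempfile.TemporaryDirectory() as root:
        out = os.path.join(root, "out")      # stands in for the normal output
        check = os.path.join(root, "check")  # stands in for the '-check' output
        for p in (out, check):
            with open(p, "wb") as f:
                f.write(b"identical bytes\n")
        # Give 'out' a second name, as deduplication would.
        os.link(out, os.path.join(root, "other-generation"))
        return link_report(out, check)
```

Identical content with "same_inode" false and a link count of 1 on one
side is exactly the situation shown by the ls output above.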

If you now run diffoscope on them, it calls stat on both files, and we
get diffs like:

│ │   --- /gnu/store/h63cx6akyrv3m73lky585ba10qq3mydc-libchop-0.5.2/share/info/libchop.info.gz
│ │ ├── +++ /gnu/store/h63cx6akyrv3m73lky585ba10qq3mydc-libchop-0.5.2-check/share/info/libchop.info.gz
│ │ │ ├── /gnu/store/as7vb5xx7vqdwmmqj9543470r49b4c0c-coreutils-8.28/bin/stat {}
│ │ │ │ @@ -1,8 +1,8 @@
│ │ │ │  
│ │ │ │    Size: 29524          Blocks: 64         IO Block: 4096   regular file
│ │ │ │ -Links: 3
│ │ │ │ +Links: 1


This is annoying because it hides the actual reproducibility problem.
Is there any reason for that?


At least, I found a very guixy way around it:

There's a patch by Eelco to filter those Links out:

https://github.com/edolstra/diffoscope/commit/367f77bba8df0dbc89e63c9f66f05736adf5ec59

(with copy/paste errors):

 diffoscope/comparators/directory.py
@@ -47,14 +47,18 @@ def cmdline(self):
    FILE_RE = re.compile(r'^\s*File:.*$')
    DEVICE_RE = re.compile(r'Device: [0-9a-f]+h/[0-9]+d')
+   LINKS_RE = re.compile(r'Links: [0-9]+')
    ACCESS_TIME_RE = re.compile(r'^Access: [0-9]{4}-[0-9]{2}-[0-9]{2}.*$')
    CHANGE_TIME_RE = re.compile(r'^Change: [0-9]{4}-[0-9]{2}-[0-9]{2}.*$')
    def filter(self, line):
        line = line.decode('utf-8')
        line = Stat.FILE_RE.sub('', line)
        line = Stat.DEVICE_RE.sub('', line)
        line = Stat.INODE_RE.sub('', line)
+       line = Stat.LINKS_RE.sub('', line)
        line = Stat.ACCESS_TIME_RE.sub('', line)
        line = Stat.CHANGE_TIME_RE.sub('', line)
        return line.encode('utf-8')
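
As a standalone illustration of what the patched filter does, here is a
small Python sketch that applies the same substitutions outside of
diffoscope. The function name filter_stat_line is made up, and the
INODE_RE pattern is an assumption, since its definition is not part of
the excerpt above.

```python
import re

FILE_RE = re.compile(r'^\s*File:.*$')
DEVICE_RE = re.compile(r'Device: [0-9a-f]+h/[0-9]+d')
INODE_RE = re.compile(r'Inode: [0-9]+')  # assumed, not shown in the excerpt
LINKS_RE = re.compile(r'Links: [0-9]+')  # the pattern added by the patch
ACCESS_TIME_RE = re.compile(r'^Access: [0-9]{4}-[0-9]{2}-[0-9]{2}.*$')
CHANGE_TIME_RE = re.compile(r'^Change: [0-9]{4}-[0-9]{2}-[0-9]{2}.*$')

PATTERNS = (FILE_RE, DEVICE_RE, INODE_RE, LINKS_RE,
            ACCESS_TIME_RE, CHANGE_TIME_RE)


def filter_stat_line(line: bytes) -> bytes:
    """Blank out the stat fields that differ between otherwise equal files."""
    text = line.decode('utf-8')
    for pattern in PATTERNS:
        text = pattern.sub('', text)
    return text.encode('utf-8')
```

With the Links pattern in place, two stat lines that differ only in
device, inode, or link count filter to the same bytes, so they no
longer show up as a difference.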


So, I did:

guix build -S diffoscope

to get the source tarball, unpacked the sources, applied the patch, and
repacked them. Then:

guix package -i diffoscope --with-source=diffoscope-96.tar.gz

and now have a Links-free version of diffoscope in my profile. (Had I
thought about that earlier, I would have done it in a separate profile
and not in my main one!)

Björn




