From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [bug#49149] [PATCH v2 5/7] pack: Prevent duplicate files in tar archives.
To: 49149@debbugs.gnu.org
Cc: Maxim Cournoyer
From: Maxim Cournoyer
Date: Thu, 24 Jun 2021 00:40:47 -0400
Message-Id: <20210624044049.17906-5-maxim.cournoyer@gmail.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210624044049.17906-1-maxim.cournoyer@gmail.com>
References: <20210624044049.17906-1-maxim.cournoyer@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Tar translates duplicate files in the archive into hard links. These can
cause problems, as not every tool supports them; for example, dpkg doesn't.

* gnu/system/file-systems.scm (reduce-directories): New procedure.
(file-prefix?): Lift the restriction on file prefix.
The procedure can be useful for comparing relative file names. Adjust doc.
(file-name-depth): New procedure, extracted from ...
(btrfs-store-subvolume-file-name): ... here.
* guix/scripts/pack.scm (self-contained-tarball/builder): Use
reduce-directories.
* tests/file-systems.scm ("reduce-directories"): New test.
---
 gnu/system/file-systems.scm | 56 +++++++++++++++++++++++++------------
 guix/scripts/pack.scm       |  6 ++--
 tests/file-systems.scm      |  7 ++++-
 3 files changed, 48 insertions(+), 21 deletions(-)

diff --git a/gnu/system/file-systems.scm b/gnu/system/file-systems.scm
index 464e87cb18..fb87bfc85b 100644
--- a/gnu/system/file-systems.scm
+++ b/gnu/system/file-systems.scm
@@ -55,6 +55,7 @@
             file-system-dependencies
             file-system-location
 
+            reduce-directories
             file-system-type-predicate
             btrfs-subvolume?
             btrfs-store-subvolume-file-name
@@ -231,8 +232,8 @@
   (char-set-complement (char-set #\/)))
 
 (define (file-prefix? file1 file2)
-  "Return #t if FILE1 denotes the name of a file that is a parent of FILE2,
-where both FILE1 and FILE2 are absolute file name. For example:
+  "Return #t if FILE1 denotes the name of a file that is a parent of FILE2.
+For example:
 
     (file-prefix? \"/gnu\" \"/gnu/store\")
     => #t
@@ -240,19 +241,41 @@ where both FILE1 and FILE2 are absolute file name. For example:
     (file-prefix? \"/gn\" \"/gnu/store\")
     => #f
 "
-  (and (string-prefix? "/" file1)
-       (string-prefix? "/" file2)
-       (let loop ((file1 (string-tokenize file1 %not-slash))
-                  (file2 (string-tokenize file2 %not-slash)))
-         (match file1
-           (()
-            #t)
-           ((head1 tail1 ...)
-            (match file2
-              ((head2 tail2 ...)
-               (and (string=? head1 head2) (loop tail1 tail2)))
-              (()
-               #f)))))))
+  (let loop ((file1 (string-tokenize file1 %not-slash))
+             (file2 (string-tokenize file2 %not-slash)))
+    (match file1
+      (()
+       #t)
+      ((head1 tail1 ...)
+       (match file2
+         ((head2 tail2 ...)
+          (and (string=? head1 head2) (loop tail1 tail2)))
+         (()
+          #f))))))
+
+(define (file-name-depth file-name)
+  (length (string-tokenize file-name %not-slash)))
+
+(define (reduce-directories file-names)
+  "Eliminate entries in FILE-NAMES that are children of other entries in
+FILE-NAMES. This is for example useful when passing a list of files to GNU
+tar, which would otherwise descend into each directory passed and archive the
+duplicate files as hard links, which can be undesirable."
+  (let* ((file-names/sorted
+          ;; Ascending sort by file hierarchy depth, then lexicographically
+          ;; by file name.
+          (stable-sort (delete-duplicates file-names)
+                       (lambda (f1 f2)
+                         (let ((depth1 (file-name-depth f1))
+                               (depth2 (file-name-depth f2)))
+                           (if (= depth1 depth2)
+                               (string< f1 f2)
+                               (< depth1 depth2)))))))
+    (reverse (fold (lambda (file-name results)
+                     (if (find (cut file-prefix? <> file-name) results)
+                         results        ;parent found -- skipping
+                         (cons file-name results)))
+                   '()
+                   file-names/sorted))))
 
 (define* (file-system-device->string device #:key uuid-type)
   "Return the string representations of the DEVICE field of a
@@ -624,9 +647,6 @@ store is located, else #f."
         s
         (string-append "/" s)))
 
-  (define (file-name-depth file-name)
-    (length (string-tokenize file-name %not-slash)))
-
   (and-let* ((btrfs-subvolume-fs (filter btrfs-subvolume? file-systems))
              (btrfs-subvolume-fs*
               (sort btrfs-subvolume-fs
diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index ad432f2b63..84f2f14343 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -230,13 +230,15 @@ its source property."
          `((guix build pack)
            (guix build utils)
            (guix build union)
-           (gnu build install))
+           (gnu build install)
+           (gnu system file-systems))
          #:select? import-module?)
       #~(begin
           (use-modules (guix build pack)
                        (guix build utils)
                        ((guix build union) #:select (relative-file-name))
                        (gnu build install)
+                       ((gnu system file-systems) #:select (reduce-directories))
                        (srfi srfi-1)
                        (srfi srfi-26)
                        (ice-9 match))
@@ -303,7 +305,7 @@ its source property."
                              ,(string-append "." (%store-directory))
 
-                             ,@(delete-duplicates
+                             ,@(reduce-directories
                                 (filter-map
                                  (match-lambda
                                    (('directory directory)
                                     (string-append "." directory))
diff --git a/tests/file-systems.scm b/tests/file-systems.scm
index 7f7c373884..80acb6d5b9 100644
--- a/tests/file-systems.scm
+++ b/tests/file-systems.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2015, 2017 Ludovic Courtès
-;;; Copyright © 2020 Maxim Cournoyer
+;;; Copyright © 2020, 2021 Maxim Cournoyer
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -50,6 +50,11 @@
                                   (device "/foo")
                                   (flags '(bind-mount read-only)))))))))
 
+(test-equal "reduce-directories"
+  '("./opt/gnu/" "./opt/gnuism" "a/b/c")
+  (reduce-directories '("./opt/gnu/etc" "./opt/gnu/" "./opt/gnu/bin"
+                        "./opt/gnu/lib/debug" "./opt/gnuism" "a/b/c" "a/b/c")))
+
 (test-assert "does not pull (guix config)"
   ;; This module is meant both for the host side and "build side", so make
   ;; sure it doesn't pull in (guix config), which depends on the user's
-- 
2.32.0
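
For experimenting with the behaviour outside a Guix checkout, the two procedures from this patch can be pasted into a standalone Guile script. The following is only a sketch under that assumption: the procedure bodies are copied from the patch (in slightly condensed form), the `use-modules' form and the `display' calls are additions for illustration, and the last example is an observation about component-wise comparison rather than something the patch states.

```scheme
;; Standalone sketch; assumes a recent Guile, no Guix modules required.
(use-modules (srfi srfi-1)              ;fold, find, delete-duplicates
             (srfi srfi-26)             ;cut
             (ice-9 match))

(define %not-slash
  (char-set-complement (char-set #\/)))

(define (file-prefix? file1 file2)
  ;; Compare file names component by component, so that "/gn" is not
  ;; considered a parent of "/gnu/store".
  (let loop ((file1 (string-tokenize file1 %not-slash))
             (file2 (string-tokenize file2 %not-slash)))
    (match file1
      (() #t)
      ((head1 tail1 ...)
       (match file2
         ((head2 tail2 ...)
          (and (string=? head1 head2) (loop tail1 tail2)))
         (() #f))))))

(define (file-name-depth file-name)
  (length (string-tokenize file-name %not-slash)))

(define (reduce-directories file-names)
  ;; Sort parents before children, then keep only the entries for which
  ;; no already-kept entry is a parent.
  (let ((file-names/sorted
         (stable-sort (delete-duplicates file-names)
                      (lambda (f1 f2)
                        (let ((depth1 (file-name-depth f1))
                              (depth2 (file-name-depth f2)))
                          (if (= depth1 depth2)
                              (string< f1 f2)
                              (< depth1 depth2)))))))
    (reverse (fold (lambda (file-name results)
                     (if (find (cut file-prefix? <> file-name) results)
                         results        ;parent found -- skipping
                         (cons file-name results)))
                   '()
                   file-names/sorted))))

;; Relative file names are now accepted.
(display (file-prefix? "./opt/gnu/" "./opt/gnu/bin"))   ;prints #t
(newline)
;; "gnu" and "gnuism" are different components.
(display (file-prefix? "./opt/gnu/" "./opt/gnuism"))    ;prints #f
(newline)
;; Same input as the new test in tests/file-systems.scm.
(write (reduce-directories
        '("./opt/gnu/etc" "./opt/gnu/" "./opt/gnu/bin"
          "./opt/gnu/lib/debug" "./opt/gnuism" "a/b/c" "a/b/c")))
(newline)                      ;prints ("./opt/gnu/" "./opt/gnuism" "a/b/c")
;; Leading slashes vanish during tokenization, so an absolute and a
;; relative name with the same components compare as parent and child.
(display (file-prefix? "/gnu" "gnu/store"))             ;prints #t
(newline)
```

The last call suggests that callers are expected to pass either all-absolute or all-relative names, as the pack builder does by prefixing every directory with ".".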