From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms0.migadu.com with LMTPS id a9bZAgYu0GAyRQAAgWs5BA (envelope-from ) for ; Mon, 21 Jun 2021 08:13:26 +0200 Received: from aspmx1.migadu.com ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id 4Zp4OQUu0GBRaQAAB5/wlQ (envelope-from ) for ; Mon, 21 Jun 2021 06:13:25 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 6AACD26CCC for ; Mon, 21 Jun 2021 08:13:21 +0200 (CEST) Received: from localhost ([::1]:38608 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lvDBQ-00081p-Cn for larch@yhetil.org; Mon, 21 Jun 2021 02:13:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47944) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lvDBA-0007w9-Pw for guix-patches@gnu.org; Mon, 21 Jun 2021 02:13:04 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:51897) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lvDBA-0001tp-HG for guix-patches@gnu.org; Mon, 21 Jun 2021 02:13:04 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1lvDBA-0006cw-DJ for guix-patches@gnu.org; Mon, 21 Jun 2021 02:13:04 -0400 X-Loop: help-debbugs@gnu.org Subject: [bug#49149] [PATCH 5/7] pack: Prevent duplicate files in tar archives. Resent-From: Maxim Cournoyer Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Mon, 21 Jun 2021 06:13:04 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 49149 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: 49149@debbugs.gnu.org Cc: Maxim Cournoyer Received: via spool by 49149-submit@debbugs.gnu.org id=B49149.162425597225399 (code B ref 49149); Mon, 21 Jun 2021 06:13:04 +0000 Received: (at 49149) by debbugs.gnu.org; 21 Jun 2021 06:12:52 +0000 Received: from localhost ([127.0.0.1]:35200 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lvDAx-0006bU-RU for submit@debbugs.gnu.org; Mon, 21 Jun 2021 02:12:52 -0400 Received: from mail-qv1-f41.google.com ([209.85.219.41]:36371) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lvDAw-0006aY-B0 for 49149@debbugs.gnu.org; Mon, 21 Jun 2021 02:12:50 -0400 Received: by mail-qv1-f41.google.com with SMTP id im10so6893973qvb.3 for <49149@debbugs.gnu.org>; Sun, 20 Jun 2021 23:12:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9sYSuoxh/TXX0vpZJIGqEGuZnOV29nWz3iCoA1cYbX4=; b=jIeO8o2AdhRHLnGOzTu6AGjp0/kTxZ4lzWCmSNYQ3X/7LL+3v0WSaq9x6FZ3T8x7x5 5o/TImIr9ZXSRqnW3UTXZ9Vw4yf8RysFt1kGT8a9b5bLXE9gtz5+EytNTFJZpY6HLnHN JCObUfKTG2lrO4Ai3FjR+TUI7089/LRAHVUF5WecUf7sBfBmcQTV2qneAG6NDNUYdACh Iz4TyNCvAcd8IzpipIp/HYbUgYGQP9O0eCbtEvEsplyJlyMKAEhGDjW/Bc575fYy/g0J 4mvCHxrax2zZo6WS/O4x5v/ar8Uu31YaFitZWaBTq249/6A19Ka+/F+c+twJpEPjgmzC t5rw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9sYSuoxh/TXX0vpZJIGqEGuZnOV29nWz3iCoA1cYbX4=; b=KEo0e1WysN2ez6pFhzgTi3PT4zTnpR5C9EghqVjpdE9KYFouoDS+SVKrxaiT1lsBHn RMyMtK+30qhS4eQO4m4ZHTeVgF9le2QI81U01YaFig+ZIvtBQnSv3lvmNGx3fFeVngfI FUXW2D3xDO2OMYAosgvDK+EFdNJ3I47t/Z77ERckZIoQz1jPbMEGPymK/skGOeHSSfN3 DIG28Pe0xbUrlhzHF0zszTTEWaZFQFFF+Cp9S8IGNFvUXt5ckBKI/Cu2apX/q06Yc7XM i0IwSROnKSrU/oX1r1ksZeORYEqIKYftHnmbU/nvLVWGKm9u3Zhx7HqThusntkkCap+K Prxg== X-Gm-Message-State: AOAM533w14lXECziEQjjPi1RUeJdpRpPRdqSZvHExOSmT9V4QH8+9guS TfNuFoQgzG6qELFPegeAwwst3+fKtE02RA== X-Google-Smtp-Source: ABdhPJwKELnPF5QJFXzuk1HcuLong5Lm4ltkdpl7qGX0ryWnzz/b0dHEZr6hdC7un+BNsGY+lURLhg== X-Received: by 2002:a05:6214:b6c:: with SMTP id ey12mr18374648qvb.48.1624255964788; Sun, 20 Jun 2021 23:12:44 -0700 (PDT) Received: from localhost.localdomain (dsl-148-219.b2b2c.ca. [66.158.148.219]) by smtp.gmail.com with ESMTPSA id i11sm8478663qke.74.2021.06.20.23.12.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 20 Jun 2021 23:12:44 -0700 (PDT) From: Maxim Cournoyer Date: Mon, 21 Jun 2021 02:12:03 -0400 Message-Id: <20210621061205.31878-6-maxim.cournoyer@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210621061205.31878-1-maxim.cournoyer@gmail.com> References: <20210621061205.31878-1-maxim.cournoyer@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: "Guix-patches" X-Migadu-Flow: FLOW_IN ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1624256001; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding:resent-cc: resent-from:resent-sender:resent-message-id:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post:dkim-signature; bh=9sYSuoxh/TXX0vpZJIGqEGuZnOV29nWz3iCoA1cYbX4=; b=NxJ1TthoobyP1SLe/iW8kVxk/37+b32DdCIcsHXodL+sqrHnvgQssdilmHHUaL8SHw2Ose 9uMVWkvb3cjKWwj2yPQeOATdZnjb0zgHMXYmW22/fU9RDroUiOzpkbaF3Z9/31YdLkVxic QTPcMET3XGWCI4zndWCZJsB//FfTDdr5HJn6GITbe16cCA16d8QRxswOfFU8lw3Ljd5HUf 9IRDtqQcfAj2aIHuqWtAJK8JrFRHV68fIrGyR/VmX/wQtEQxZRP3CDzK2ikzBqk4q9d+Ep 2E/hONhIAS1F4l1v0ehnmHRB+pqhBp9oI8sttdTwiictuhIA3p1tG/PqRjQneg== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1624256001; a=rsa-sha256; cv=none; b=a7PpAoO3zqmrU3VUI6eMdBaLT/URP2VnY5XMYrsWHpPnAunAUZ7svIEf2kbu+5nQ82FTJd KEyk4tvzhqeo36GLPy2mFydRukBS+c6VMiyj8SHZpugc9DknRFeyNlsj3abe/zzOrFwDAD brYMMhnBi+kVBEwkAtZDql8Pm+WQ7JpYGOU+x8Nlznp0qLjV667VSD1q2W1PgPR68l4WZM L5caDbSar3kDMpAVLUaulbubdCgUrXEtMrCgSkMcDHhkXeVVY6Z/x0gCVo/EGGpNhQEWi7 eDpzhr892MLLKi56GZoA8oy3EpB+03OQBo+zJTQ1DzSIZ9AwY3MnyjsMi4aBsA== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=fail ("headers rsa verify failed") header.d=gmail.com header.s=20161025 header.b=jIeO8o2A; dmarc=fail reason="SPF not aligned (relaxed)" header.from=gmail.com (policy=none); spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Migadu-Spam-Score: -1.32 Authentication-Results: aspmx1.migadu.com; dkim=fail ("headers rsa verify failed") header.d=gmail.com header.s=20161025 header.b=jIeO8o2A; dmarc=fail reason="SPF not aligned (relaxed)" header.from=gmail.com (policy=none); spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Migadu-Queue-Id: 6AACD26CCC X-Spam-Score: -1.32 X-Migadu-Scanner: scn1.migadu.com X-TUID: URAAW5aVWJR7 Tar translate duplicate files in the archive into hard links. These can cause problems, as not every tool support them; for example dpkg doesn't. * gnu/system/file-systems.scm (reduce-directories): New procedure. (file-prefix?): Lift the restriction on file prefix. The procedure can be useful for comparing relative file names. Adjust doc. (file-name-depth): New procedure, extracted from ... (btrfs-store-subvolume-file-name): ... here. * guix/scripts/pack.scm (self-contained-tarball/builder): Use reduce-directories. * tests/file-systems.scm ("reduce-directories"): New test. --- gnu/system/file-systems.scm | 56 +++++++++++++++++++++++++------------ guix/scripts/pack.scm | 6 ++-- tests/file-systems.scm | 7 ++++- 3 files changed, 48 insertions(+), 21 deletions(-) diff --git a/gnu/system/file-systems.scm b/gnu/system/file-systems.scm index 464e87cb18..fb87bfc85b 100644 --- a/gnu/system/file-systems.scm +++ b/gnu/system/file-systems.scm @@ -55,6 +55,7 @@ file-system-dependencies file-system-location + reduce-directories file-system-type-predicate btrfs-subvolume? btrfs-store-subvolume-file-name @@ -231,8 +232,8 @@ (char-set-complement (char-set #\/))) (define (file-prefix? file1 file2) - "Return #t if FILE1 denotes the name of a file that is a parent of FILE2, -where both FILE1 and FILE2 are absolute file name. For example: + "Return #t if FILE1 denotes the name of a file that is a parent of FILE2. +For example: (file-prefix? \"/gnu\" \"/gnu/store\") => #t @@ -240,19 +241,41 @@ where both FILE1 and FILE2 are absolute file name. For example: (file-prefix? \"/gn\" \"/gnu/store\") => #f " - (and (string-prefix? "/" file1) - (string-prefix? "/" file2) - (let loop ((file1 (string-tokenize file1 %not-slash)) - (file2 (string-tokenize file2 %not-slash))) - (match file1 - (() - #t) - ((head1 tail1 ...) - (match file2 - ((head2 tail2 ...) - (and (string=? head1 head2) (loop tail1 tail2))) - (() - #f))))))) + (let loop ((file1 (string-tokenize file1 %not-slash)) + (file2 (string-tokenize file2 %not-slash))) + (match file1 + (() + #t) + ((head1 tail1 ...) + (match file2 + ((head2 tail2 ...) + (and (string=? head1 head2) (loop tail1 tail2))) + (() + #f)))))) + +(define (file-name-depth file-name) + (length (string-tokenize file-name %not-slash))) + +(define (reduce-directories file-names) + "Eliminate entries in FILE-NAMES that are children of other entries in +FILE-NAMES. This is for example useful when passing a list of files to GNU +tar, which would otherwise descend into each directory passed and archive the +duplicate files as hard links, which can be undesirable." + (let* ((file-names/sorted + ;; Ascending sort by file hierarchy depth, then by file name length. + (stable-sort (delete-duplicates file-names) + (lambda (f1 f2) + (let ((depth1 (file-name-depth f1)) + (depth2 (file-name-depth f2))) + (if (= depth1 depth2) + (string< f1 f2) + (< depth1 depth2))))))) + (reverse (fold (lambda (file-name results) + (if (find (cut file-prefix? <> file-name) results) + results ;parent found -- skipping + (cons file-name results))) + '() + file-names/sorted)))) (define* (file-system-device->string device #:key uuid-type) "Return the string representations of the DEVICE field of a @@ -624,9 +647,6 @@ store is located, else #f." s (string-append "/" s))) - (define (file-name-depth file-name) - (length (string-tokenize file-name %not-slash))) - (and-let* ((btrfs-subvolume-fs (filter btrfs-subvolume? file-systems)) (btrfs-subvolume-fs* (sort btrfs-subvolume-fs diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm index 9d4bb9f497..8a108b7a1a 100644 --- a/guix/scripts/pack.scm +++ b/guix/scripts/pack.scm @@ -225,13 +225,15 @@ dependencies are registered." `((guix build pack) (guix build utils) (guix build union) - (gnu build install)) + (gnu build install) + (gnu system file-systems)) #:select? import-module?) #~(begin (use-modules (guix build pack) (guix build utils) ((guix build union) #:select (relative-file-name)) (gnu build install) + ((gnu system file-systems) #:select (reduce-directories)) (srfi srfi-1) (srfi srfi-26) (ice-9 match)) @@ -298,7 +300,7 @@ dependencies are registered." ,(string-append "." (%store-directory)) - ,@(delete-duplicates + ,@(reduce-directories (filter-map (match-lambda (('directory directory) (string-append "." directory)) diff --git a/tests/file-systems.scm b/tests/file-systems.scm index 7f7c373884..80acb6d5b9 100644 --- a/tests/file-systems.scm +++ b/tests/file-systems.scm @@ -1,6 +1,6 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2015, 2017 Ludovic Courtès -;;; Copyright © 2020 Maxim Cournoyer +;;; Copyright © 2020, 2021 Maxim Cournoyer ;;; ;;; This file is part of GNU Guix. ;;; @@ -50,6 +50,11 @@ (device "/foo") (flags '(bind-mount read-only))))))))) +(test-equal "reduce-directories" + '("./opt/gnu/" "./opt/gnuism" "a/b/c") + (reduce-directories '("./opt/gnu/etc" "./opt/gnu/" "./opt/gnu/bin" + "./opt/gnu/lib/debug" "./opt/gnuism" "a/b/c" "a/b/c"))) + (test-assert "does not pull (guix config)" ;; This module is meant both for the host side and "build side", so make ;; sure it doesn't pull in (guix config), which depends on the user's -- 2.32.0