To: Efraim Flashner
Cc: guix-devel@gnu.org
Subject: Re: Are
 'guix gc' stats exaggerated?
References: <87bk4stjpi.fsf@lease-up.com>
Date: Fri, 31 May 2024 15:03:47 -0700
Message-ID: <87a5k5oczg.fsf@lease-up.com>
MIME-Version: 1.0
Content-Type: text/plain
Reply-to: Felix Lechner
From: Felix Lechner via "Development of GNU Guix and the GNU System distribution."

Hi Efraim,

On Tue, May 28 2024, Efraim Flashner wrote:

> As your store grows larger the inherent deduplication from the
> guix-daemon approaches a 3:1 file deduplication ratio.

Thank you for your explanations and your data about btrfs!

Btrfs compression is a well-understood feature, although even its
developers acknowledge that the benefit is hard to quantify. It probably
makes more sense to focus on the Guix daemon here. I hope you don't mind
a few clarifying questions.

Why, please, does the benefit of de-duplication approach a fixed ratio
of 3:1? Does the benefit not depend on the number of copies in the
store, which can vary by any number? (It sounds like the answer may have
something to do with store size.)

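To make the question concrete, here is a minimal sketch of how a de-duplication ratio could be measured for any directory tree. It is not how the Guix daemon computes anything; the function name dedup_ratio and the choice to compare apparent bytes (every path counted) against physical bytes (every inode counted once) are assumptions made for illustration only.

```python
import os
import stat


def dedup_ratio(root):
    """Return (apparent_bytes, physical_bytes, ratio) for regular files under root.

    apparent_bytes counts every path; physical_bytes counts each inode once.
    Hypothetical helper, for illustration only.
    """
    apparent = 0
    unique = {}  # (device, inode) -> size in bytes
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other non-regular files
            apparent += st.st_size
            unique[(st.st_dev, st.st_ino)] = st.st_size
    physical = sum(unique.values())
    ratio = apparent / physical if physical else 1.0
    return apparent, physical, ratio
```

Under this definition, three hard-linked names for one 100-byte file give 300 apparent bytes over 100 physical bytes, a ratio of 3.0. Adding one unrelated 100-byte file lowers the ratio to 2.0. In such a measurement the ratio follows the mix of copies in the tree, which is the dependence I am asking about.
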
Further, why is the removal of hardlinks counted as saving space even
when their inode reference count, which is widely available [1], is
greater than one?

Finally, barring a better solution, should our output numbers be divided
by three to bring them closer to the expected result for users?

Thanks!

Kind regards,
Felix

[1] https://en.wikipedia.org/wiki/Hard_link#Reference_counting
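
As a rough illustration of the reference-count check mentioned above, the sketch below compares a naive total (the size of every deleted path) with the bytes released once link counts are taken into account. The name freed_bytes is hypothetical and this is not the code behind 'guix gc'. It assumes that an inode's data is released only when every one of its links is among the deleted paths, and it ignores block rounding and files that are still held open.

```python
import os
from collections import defaultdict


def freed_bytes(paths):
    """Return (naive_bytes, freed_bytes) for deleting the given paths.

    naive_bytes adds up the size of every path. freed_bytes counts an inode
    only if all of its hard links (st_nlink) are in the deleted set.
    Hypothetical helper, for illustration only.
    """
    naive = 0
    links_seen = defaultdict(int)  # (device, inode) -> links found in paths
    info = {}                      # (device, inode) -> (size, total link count)
    for path in set(os.path.abspath(p) for p in paths):
        st = os.lstat(path)
        key = (st.st_dev, st.st_ino)
        naive += st.st_size
        links_seen[key] += 1
        info[key] = (st.st_size, st.st_nlink)
    freed = sum(size for key, (size, nlink) in info.items()
                if links_seen[key] == nlink)
    return naive, freed
```

For a 100-byte file with two names plus a separate 50-byte file, deleting one of the two names and the separate file gives a naive total of 150 bytes but only 50 bytes freed, because the remaining name still references the 100-byte inode.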