From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp1 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id aDpeNmskYV99SAAA0tVLHw (envelope-from ) for ; Tue, 15 Sep 2020 20:30:35 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp1 with LMTPS id CD2ZMGskYV94TAAAbx9fmQ (envelope-from ) for ; Tue, 15 Sep 2020 20:30:35 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id C58A994053C for ; Tue, 15 Sep 2020 20:30:34 +0000 (UTC) Received: from localhost ([::1]:35670 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kIHay-0008Im-VM for larch@yhetil.org; Tue, 15 Sep 2020 16:30:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:33982) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kIHaU-0008H5-9a for guix-patches@gnu.org; Tue, 15 Sep 2020 16:30:04 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:49110) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kIHaT-0003Ni-Vk for guix-patches@gnu.org; Tue, 15 Sep 2020 16:30:01 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1kIHaT-0002Wt-Qv for guix-patches@gnu.org; Tue, 15 Sep 2020 16:30:01 -0400 X-Loop: help-debbugs@gnu.org Subject: [bug#42023] [PATCH] Retry deduplication on ENOENT Resent-From: Caleb Ristvedt Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Tue, 15 Sep 2020 20:30:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 42023 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: fixed patch To: bug-guix@gnu.org Cc: Ludovic =?UTF-8?Q?Court=C3=A8s?= Received: via spool by submit@debbugs.gnu.org id=B.16002017869679 (code B ref -1); Tue, 15 Sep 2020 20:30:01 +0000 Received: (at submit) by debbugs.gnu.org; 15 Sep 2020 20:29:46 +0000 Received: from localhost ([127.0.0.1]:60656 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIHaD-0002W3-Ag for submit@debbugs.gnu.org; Tue, 15 Sep 2020 16:29:45 -0400 Received: from lists.gnu.org ([209.51.188.17]:58562) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIHa9-0002Vt-VT for submit@debbugs.gnu.org; Tue, 15 Sep 2020 16:29:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:33938) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kIHa8-0008Fu-Dl for bug-guix@gnu.org; Tue, 15 Sep 2020 16:29:40 -0400 Received: from mail-io1-xd42.google.com ([2607:f8b0:4864:20::d42]:33706) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kIHa4-0003Ik-Id for bug-guix@gnu.org; Tue, 15 Sep 2020 16:29:39 -0400 Received: by mail-io1-xd42.google.com with SMTP id r25so5700406ioj.0 for ; Tue, 15 Sep 2020 13:29:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cune-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:in-reply-to:references:user-agent:date :message-id:mime-version; bh=wVSL8s2Pxv/+/fZSnLK9LkpBL+wkUSNx9wtZHsCtxGU=; b=tDCEKOKL2M9z23e24ak687m5rGv6oh33uOm2TuFDD6IF8gKL5kiHPpOOum9rjy6c6p YTeCIFKNMN7ukcLlubBxaIh+4FOl/85aRRpAlhRr5FVujB5Peti1I/054VMfILEKr109 3ZhuZhWtECIyRzDVoFUGHWkofV1gMWs3vqJKYi0ma/Ll53x4IKPlRJXwnyvpZuxDb5Tp eW815GCtoDa1fW0f5swuUoB/7wBgFEheaWw3PChkyXVvjo6McZ283ZirZWpXTdUtjq43 Rbue3nXGyofYZzqy0y0WUbyRRziZd7quNmaO7onJKl5Dqo3KVK247IH/GUJhJjZf8S1L 0G3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references :user-agent:date:message-id:mime-version; bh=wVSL8s2Pxv/+/fZSnLK9LkpBL+wkUSNx9wtZHsCtxGU=; b=savYWPJQ3I9m+kOg8FbotZyE3q4HnbfHuFwlK7j1lUZQeSRenvIpbC4Ng27iwwYt/y XM08piyoJiqVAwk3W3qFPVDIM9ACWT+XzQtGM5TCfwZhUMD6/CnhdX00MSj/lHJFWg7e UZKeqlfiCPFWWQ1gqfhfO2yy5rtCWNCmAzq8mPFtbLF/3kUPhO4RjU9FU2kBeaaQJwDE EGGfsJytcrxEBSVZiQyjYNEJ3uGnGYi72r9djUvOfY+tbv0aMM0JmEIwwGwnE3cK+zsF 6MHK1zYXOM5yAK8rXeIkJ+/8GYF3zmL4jj81xHqcX/a2CmBLvhLscg0G7eHPFgveKDLY 7IoA== X-Gm-Message-State: AOAM530QQ5K3kfcebUcHz7wYadLcvYPMqAvI4Jr80NPgudbZS8MiqueV Z319kHfM7kipHlU6BdX4O4Q3yA== X-Google-Smtp-Source: ABdhPJygC8j6PDipAbHImvF1GEoq57bOQXveMHWcO/PN3dMs4cB8Mu30bZZSnyptYBdgTaKBXvxHnA== X-Received: by 2002:a05:6638:168c:: with SMTP id f12mr19475580jat.16.1600201774841; Tue, 15 Sep 2020 13:29:34 -0700 (PDT) Received: from GuixPotato ([208.89.170.4]) by smtp.gmail.com with ESMTPSA id d19sm8288278iod.38.2020.09.15.13.29.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Sep 2020 13:29:34 -0700 (PDT) From: Caleb Ristvedt In-Reply-To: <877dswpweh.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Mon, 14 Sep 2020 10:58:30 +0200") References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> <87v9jg4aiz.fsf@cune.org> <87eeq3vbat.fsf@gnu.org> <87tuxu2sz6.fsf@gnu.org> <87bljka234.fsf@cune.org> <877dswpweh.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) Date: Tue, 15 Sep 2020 15:29:32 -0500 Message-ID: <87sgbikclv.fsf_-_@cune.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Received-SPF: pass client-ip=2607:f8b0:4864:20::d42; envelope-from=caleb.ristvedt@cune.org; helo=mail-io1-xd42.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.4 (-) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-Spam-Score: -2.4 (--) X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: "Guix-patches" X-Scanner: scn0 Authentication-Results: aspmx1.migadu.com; dkim=fail (rsa verify failed) header.d=cune-org.20150623.gappssmtp.com header.s=20150623 header.b=tDCEKOKL; dmarc=none; spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Spam-Score: -0.01 X-TUID: n4UcMOSOLzmb --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Continued where left off from 42023: Ludovic Court=C3=A8s writes: >> From b992a3aaac7e3b30222e0bf1df09093f18e25e6a Mon Sep 17 00:00:00 2001 >> From: Caleb Ristvedt >> Date: Sat, 8 Aug 2020 11:25:57 -0500 >> Subject: [PATCH 3/6] deduplication: retry on ENOENT. >> >> It's possible for the garbage collector to remove the "canonical" link a= fter >> it's been detected as existing by 'deduplicate'. This would cause an EN= OENT >> error when replace-with-link attempts to create the temporary link. This >> changes it so that it will properly handle that by retrying. > > Would that ENOENT cause an error, or just a missed deduplication > opportunity? Depends on how we handle it. Previously the error would be uncaught entirely; simply skipping deduplication of that file would work, though it would be suboptimal. > >> * guix/store/deduplication.scm (replace-with-link): renamed to >> canonicalize-with-link, now also handles the case where the target link >> doesn't exist yet, and retries on ENOENT. >> (deduplicate): modified to use canonicalize-with-link. > > There=E2=80=99s an issue with this patch. I gave it a spin (offloading a= few > builds) and it got stuck in a infinite loop: > > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31br= hrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwp= s-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq= 31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31br= hrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwp= s-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq= 31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31br= hrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwp= s-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq= 31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) > I believe I can explain this. In 'deduplicate' we currently treat anything that isn't a directory as a hardlinkable thing. This includes symlinks (although it's implementation-defined whether symlinks can be hardlinked to - we use CAN_LINK_SYMLINK to test this in nix/libstore/optimise-store.cc). This means that at present we unconditionally attempt to deduplicate symlinks (which happens to work on linux). However, 'file-exists?' uses stat, not lstat, to check for file existence. Thus, if there is a dangling symlink, 'file-exists?' will return #f when passed it, but of course attempting to call link() to create it will fail with EEXIST. Attached is a modified patch that tests for file existence with lstat instead. I expect that will fix the problem. We should probably still add a test in 'deduplicate' for whether symlinks can be hardlinked to. Tangent: I was curious why libwps-0.4.so would be a dangling symlink, and it turns out that it's actually a relative symlink, so when accessing it via /gnu/store/...-libwps-0.4.12/lib/libwps-0.4.so it isn't dangling, but when accessing it via /gnu/store/.links/0k63r... it is. > I think we should work on reducing the complexity of that code (e.g., > there are several layers of system-error handling). I've since flattened it down into just one layer of catch'es. It adds a bit of redundancy, but might make it clearer. - reepca --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001-deduplication-retry-on-ENOENT.patch >From 12f5848e79b0ede95babebea240264b32e39812c Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Sat, 8 Aug 2020 11:25:57 -0500 Subject: [PATCH] deduplication: retry on ENOENT. It's possible for the garbage collector to remove the "canonical" link after it's been detected as existing by 'deduplicate'. This would cause an ENOENT error when replace-with-link attempts to create the temporary link. This changes it so that it will properly handle that by retrying. * guix/store/deduplication.scm (replace-with-link): renamed to canonicalize-with-link, now also handles the case where the target link doesn't exist yet, and retries on ENOENT. Also modified to support canonicalizing symbolic links, though it is the caller's responsibility to ensure that the system supports hardlinking to a symbolic link (on Linux it does). (deduplicate): modified to use canonicalize-with-link. --- guix/store/deduplication.scm | 125 ++++++++++++++++++++--------------- 1 file changed, 70 insertions(+), 55 deletions(-) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index 0655ceb890..42116ed47d 100644 --- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -115,26 +115,63 @@ store." ;; things link to this" (EMLINK), "this link already exists" (EEXIST), and ;; "can't fit more stuff in this directory" (ENOSPC). -(define* (replace-with-link target to-replace - #:key (swap-directory (dirname target)) - (store (%store-directory))) +(define* (canonicalize-with-link target to-replace + #:key (swap-directory (dirname target)) + (store (%store-directory))) "Atomically replace the file TO-REPLACE with a link to TARGET. Use SWAP-DIRECTORY as the directory to store temporary hard links. Upon ENOSPC and EMLINK, TO-REPLACE is left unchanged. Note: TARGET, TO-REPLACE, and SWAP-DIRECTORY must be on the same file system." - (define temp-link + (define (file-or-symlink-exists? filename) + ;; Like 'file-exists?', but doesn't follow symlinks. (catch 'system-error (lambda () - (get-temp-link target swap-directory)) + (lstat filename) + #t) (lambda args - ;; We get ENOSPC when we can't fit an additional entry in - ;; SWAP-DIRECTORY. If it's EMLINK, then TARGET has reached its - ;; maximum number of links. - (if (memv (system-error-errno args) `(,ENOSPC ,EMLINK)) + (if (= (system-error-errno args) ENOENT) #f (apply throw args))))) + (define temp-link + (let retry () + (if (file-or-symlink-exists? target) + (catch 'system-error + (lambda () + (get-temp-link target swap-directory)) + (lambda args + (let ((errno (system-error-errno args))) + (cond + ((= errno ENOENT) + ;; either SWAP-DIRECTORY has missing directory + ;; components or TARGET was deleted - this is a + ;; fundamental ambiguity to the errno produced by + ;; link() + (if (file-exists? swap-directory) + ;; we must assume link failed because target doesn't + ;; exist, so create it. + (retry) + (apply throw args))) + ((memv errno `(,ENOSPC ,EMLINK)) + ;; We get ENOSPC when we can't fit an additional entry + ;; in SWAP-DIRECTORY. If it's EMLINK, then TARGET has + ;; reached its maximum number of links. In both cases, + ;; skip deduplication. + #f) + (else (apply throw args)))))) + (catch 'system-error + (lambda () + (link to-replace target) + #f) + (lambda args + (cond + ((= (system-error-errno args) EEXIST) + (retry)) + ((memv (system-error-errno args) `(,ENOSPC ,EMLINK)) + #f) + (else (apply throw args)))))))) + ;; If we couldn't create TEMP-LINK, that's OK: just don't do the ;; replacement, which means TO-REPLACE won't be deduplicated. (when temp-link @@ -155,49 +192,27 @@ under STORE." (define links-directory (string-append store "/.links")) - (mkdir-p links-directory) - (let loop ((path path) - (type (stat:type (lstat path))) - (hash hash)) - (if (eq? 'directory type) - ;; Can't hardlink directories, so hardlink their atoms. - (for-each (match-lambda - ((file . properties) - (unless (member file '("." "..")) - (let* ((file (string-append path "/" file)) - (type (match (assoc-ref properties 'type) - ((or 'unknown #f) - (stat:type (lstat file))) - (type type)))) - (loop file type - (and (not (eq? 'directory type)) - (nar-sha256 file))))))) - (scandir* path)) - (let ((link-file (string-append links-directory "/" - (bytevector->nix-base32-string hash)))) - (if (file-exists? link-file) - (replace-with-link link-file path - #:swap-directory links-directory - #:store store) - (catch 'system-error - (lambda () - (link path link-file)) - (lambda args - (let ((errno (system-error-errno args))) - (cond ((= errno EEXIST) - ;; Someone else put an entry for PATH in - ;; LINKS-DIRECTORY before we could. Let's use it. - (replace-with-link path link-file - #:swap-directory - links-directory - #:store store)) - ((= errno ENOSPC) - ;; There's not enough room in the directory index for - ;; more entries in .links, but that's fine: we can - ;; just stop. - #f) - ((= errno EMLINK) - ;; PATH has reached the maximum number of links, but - ;; that's OK: we just can't deduplicate it more. - #f) - (else (apply throw args))))))))))) + (mkdir-p links-directory) + (let loop ((path path) + (type (stat:type (lstat path))) + (hash hash)) + (if (eq? 'directory type) + ;; Can't hardlink directories, so hardlink their atoms. + (for-each (match-lambda + ((file . properties) + (unless (member file '("." "..")) + (let* ((file (string-append path "/" file)) + (type (match (assoc-ref properties 'type) + ((or 'unknown #f) + (stat:type (lstat file))) + (type type)))) + (loop file type + (and (not (eq? 'directory type)) + (nar-sha256 file))))))) + (scandir* path)) + (let ((link-file (string-append links-directory "/" + (bytevector->nix-base32-string + hash)))) + (canonicalize-with-link link-file path + #:swap-directory links-directory + #:store store))))) -- 2.28.0 --=-=-=--