From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp11.migadu.com ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms5.migadu.com with LMTPS id SOgvImI+lmL3NgAAbAwnHQ (envelope-from ) for ; Tue, 31 May 2022 18:12:18 +0200 Received: from aspmx1.migadu.com ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp11.migadu.com with LMTPS id eBYnImI+lmKSOQAA9RJhRA (envelope-from ) for ; Tue, 31 May 2022 18:12:18 +0200 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id CFFD93A035 for ; Tue, 31 May 2022 18:12:17 +0200 (CEST) Received: from localhost ([::1]:35798 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nw4Tg-0006J7-Jn for larch@yhetil.org; Tue, 31 May 2022 12:12:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43048) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nw4RY-0006DX-0i for bug-guix@gnu.org; Tue, 31 May 2022 12:10:04 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:54836) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nw4RX-0002Xj-M4 for bug-guix@gnu.org; Tue, 31 May 2022 12:10:03 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1nw4RX-0004Ok-B4 for bug-guix@gnu.org; Tue, 31 May 2022 12:10:03 -0400 X-Loop: help-debbugs@gnu.org Subject: bug#55499: [PATCH 2/3] profiles: Do not repeat entries in 'manifest' file. Resent-From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Tue, 31 May 2022 16:10:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 55499 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: 55499@debbugs.gnu.org Cc: Ludovic =?UTF-8?Q?Court=C3=A8s?= Received: via spool by 55499-submit@debbugs.gnu.org id=B55499.165401339716867 (code B ref 55499); Tue, 31 May 2022 16:10:03 +0000 Received: (at 55499) by debbugs.gnu.org; 31 May 2022 16:09:57 +0000 Received: from localhost ([127.0.0.1]:48729 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1nw4RR-0004Nv-4h for submit@debbugs.gnu.org; Tue, 31 May 2022 12:09:57 -0400 Received: from eggs.gnu.org ([209.51.188.92]:49678) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1nw4RJ-0004NA-Sc for 55499@debbugs.gnu.org; Tue, 31 May 2022 12:09:52 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:41050) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nw4RE-0002MJ-MG; Tue, 31 May 2022 12:09:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=sjABqRdlHF6jZWwr0em6ySoXDazNqLZd5o7Q+B4UjUg=; b=oeURduUKgE224HqIkW1t 5ztMgWgStRRCp9wHL7qlJPMd6jP2zSmYCBQvCqwnewAeboHWQvlV3aEAVt2PHZvmxDBWHwme3BOLF OewO8rzIhi42Yo5UiyYkSLidbwl/SZZnsWqKQzBTlS0uFR5x3mcPFi5PTBLBKTT0O+LT9NwjKmuFg lB0Pg+PY9yiL+CGS7jotOGptzbxYqwNRC7UhibQzCQrlJot2VffmbpFV/K/7aFQyGxwxo6nyyh8Jj hnIshF+TfFwH94ximAJTdalHOyp71cHYRF1osCYm6gmnvqsBxv2p4FdTSaSS1GretDs7Qo3N58a5o zbE+DrKVmHcM6A==; Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=52502 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nw4RE-00067m-0p; Tue, 31 May 2022 12:09:44 -0400 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Date: Tue, 31 May 2022 18:09:15 +0200 Message-Id: <20220531160916.21508-3-ludo@gnu.org> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531160916.21508-1-ludo@gnu.org> References: <87sfp7kkim.fsf@elephly.net> <20220531160916.21508-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-guix@gnu.org List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guix-bounces+larch=yhetil.org@gnu.org Sender: "bug-Guix" X-Migadu-Flow: FLOW_IN X-Migadu-To: larch@yhetil.org X-Migadu-Country: US ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1654013538; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding:resent-cc: resent-from:resent-sender:resent-message-id:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post:dkim-signature; bh=sjABqRdlHF6jZWwr0em6ySoXDazNqLZd5o7Q+B4UjUg=; b=ffP9DW0YUMKXgU2TtlMUqxwpLUEAcigBSpzPUaOTZYlo4oB9QDXsVYcU/pmKZDCE6mXtg5 xctTeXRkUSS4MlHM7SurV3l9fe4UOTj7n1PpI39Rreh6UBpqwR/bqKat0ominS2+gUYDYd HYK2ntemVe52ix4AsUFLixESZhZU6NIs6l96k/AJDRJDj2yTk9nJ4EJxeh21sq2EGeYNFx nuwBzrxtcJ2o7pPR6tPmTsecbHwebTY5y1m9L7r+hrL7O9AZTdTIzLbg0GbHLTu68PIEXC aWgFeLf/PEBUmx8OLpdVCJzRv8TbDsUb/ve75/okyanFZ6aSwh0iqeAm16vdbg== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1654013538; a=rsa-sha256; cv=none; b=rQ2M71I1cIoYt4clWPr6s2MHZQ0D3FwfmwKRVnTo+68GF+X0yoJRZHC5hQUDa9OP5GDHX5 fmtpGOsTTD7LfkHfPiVu9Ub/Pt2Y0e3qqJC9JU3MjFjrjKd8hOKavfeBD7ZAVkyh0/5jbG iw7VWUxh9mcwHKMDxbDhQxlcVJqd+K6OJtWTT6OkggXmgAyHJVYnQmp0tzXLnPsAyGn5x8 vmNFwmpP7aG+Ex4LB/pXvmwU/Oi1ZM0DBE24NnhcXoaMQxZFr1aSGIsP6xRxlA1PSIG2vZ K4iwZEMg6nDq3Ww2JM7MjjyjpxGT6PuLw9SFAVgLuZvEzTLu2G/3uYPndE2QgA== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=fail ("headers rsa verify failed") header.d=gnu.org header.s=fencepost-gnu-org header.b=oeURduUK; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of "bug-guix-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="bug-guix-bounces+larch=yhetil.org@gnu.org" X-Migadu-Spam-Score: -3.24 Authentication-Results: aspmx1.migadu.com; dkim=fail ("headers rsa verify failed") header.d=gnu.org header.s=fencepost-gnu-org header.b=oeURduUK; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of "bug-guix-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="bug-guix-bounces+larch=yhetil.org@gnu.org" X-Migadu-Queue-Id: CFFD93A035 X-Spam-Score: -3.24 X-Migadu-Scanner: scn0.migadu.com X-TUID: TRvbquAkujL4 Fixes . Reported by Ricardo Wurmus . With this change, the manifest file created for: guix install r r-seurat r-cistopic r-monocle3 r-cicero-monocle3 r-assertthat goes from 5.6M to 192K. Likewise, on this profile, wall-clock time of: GUIX_PROFILING=gc guix package -I goes from 0.7s to 0.1s, with heap usage going from 55M to 9M. * guix/profiles.scm (manifest->gexp)[entry->gexp]: Turn into a monadic procedure. Return a 'repeated' sexp if ENTRY was already visited before. Adjust caller accordingly. Bump manifest version. (sexp->manifest)[sexp->manifest-entry]: Turn into a monadic procedure. Add case for 'repeated' nodes. Add each entry to the current state vhash. Add clause for version 4 manifests. * tests/profiles.scm ("deduplication of repeated entries"): New test. * guix/build/profiles.scm (manifest-sexp->inputs+search-paths): Expect version 4. Add clause for 'repeated' nodes. --- guix/build/profiles.scm | 4 +- guix/profiles.scm | 127 ++++++++++++++++++++++++++++------------ tests/profiles.scm | 42 +++++++++++++ 3 files changed, 134 insertions(+), 39 deletions(-) diff --git a/guix/build/profiles.scm b/guix/build/profiles.scm index f9875ca92e..c4460f624b 100644 --- a/guix/build/profiles.scm +++ b/guix/build/profiles.scm @@ -150,7 +150,7 @@ (define (manifest-sexp->inputs+search-paths manifest) values: the list of store items of its manifest entries, and the list of search path specifications." (match manifest ;this must match 'manifest->gexp' - (('manifest ('version 3) + (('manifest ('version 4) ('packages (entries ...))) (let loop ((entries entries) (inputs '()) @@ -162,6 +162,8 @@ (define (manifest-sexp->inputs+search-paths manifest) (loop (append rest deps) ;breadth-first traversal (cons item inputs) (append paths search-paths))) + ((('repeated name version item) . rest) + (loop rest inputs search-paths)) (() (values (reverse inputs) (delete-duplicates diff --git a/guix/profiles.scm b/guix/profiles.scm index bf50c00a1e..44ff37e75b 100644 --- a/guix/profiles.scm +++ b/guix/profiles.scm @@ -455,31 +455,53 @@ (define (inferior->entry) (define (manifest->gexp manifest) "Return a representation of MANIFEST as a gexp." (define (entry->gexp entry) - (match entry - (($ name version output (? string? path) - (deps ...) (search-paths ...) _ (properties ...)) - #~(#$name #$version #$output #$path - (propagated-inputs #$(map entry->gexp deps)) - (search-paths #$(map search-path-specification->sexp - search-paths)) - #$@(if (null? properties) - #~() - #~((properties . #$properties))))) - (($ name version output package - (deps ...) (search-paths ...) _ (properties ...)) - #~(#$name #$version #$output - (ungexp package (or output "out")) - (propagated-inputs #$(map entry->gexp deps)) - (search-paths #$(map search-path-specification->sexp - search-paths)) - #$@(if (null? properties) - #~() - #~((properties . #$properties))))))) + ;; Maintain in state monad a vhash of visited entries, indexed by their + ;; item, usually package objects (we cannot use the entry itself as an + ;; index since identical entries are usually not 'eq?'). Use that vhash + ;; to avoid repeating duplicate entries. This is particularly useful in + ;; the presence of propagated inputs, where we could otherwise end up + ;; repeating large trees. + (mlet %state-monad ((visited (current-state))) + (if (match (vhash-assq (manifest-entry-item entry) visited) + ((_ . previous-entry) + (manifest-entry=? previous-entry entry)) + (#f #f)) + (return #~(repeated #$(manifest-entry-name entry) + #$(manifest-entry-version entry) + #$(manifest-entry-item entry))) + (mbegin %state-monad + (set-current-state (vhash-consq (manifest-entry-item entry) + entry visited)) + (mlet %state-monad ((deps (mapm %state-monad entry->gexp + (manifest-entry-dependencies entry)))) + (return + (match entry + (($ name version output (? string? path) + (_deps ...) (search-paths ...) _ (properties ...)) + #~(#$name #$version #$output #$path + (propagated-inputs #$deps) + (search-paths #$(map search-path-specification->sexp + search-paths)) + #$@(if (null? properties) + #~() + #~((properties . #$properties))))) + (($ name version output package + (_deps ...) (search-paths ...) _ (properties ...)) + #~(#$name #$version #$output + (ungexp package (or output "out")) + (propagated-inputs #$deps) + (search-paths #$(map search-path-specification->sexp + search-paths)) + #$@(if (null? properties) + #~() + #~((properties . #$properties)))))))))))) (match manifest (($ (entries ...)) - #~(manifest (version 3) - (packages #$(map entry->gexp entries)))))) + #~(manifest (version 4) + (packages #$(run-with-state + (mapm %state-monad entry->gexp entries) + vlist-null)))))) (define (find-package name version) "Return a package from the distro matching NAME and possibly VERSION. This @@ -522,25 +544,44 @@ (define (infer-dependency item parent) (define* (sexp->manifest-entry sexp #:optional (parent (delay #f))) (match sexp + (('repeated name version path) + ;; This entry is the same as another one encountered earlier; look it + ;; up and return it. + (mlet %state-monad ((visited (current-state)) + (key -> (list name version path))) + (match (vhash-assoc key visited) + (#f + (raise (formatted-message + (G_ "invalid repeated entry in profile: ~s") + sexp))) + ((_ . entry) + (return entry))))) ((name version output path ('propagated-inputs deps) ('search-paths search-paths) extra-stuff ...) - ;; For each of DEPS, keep a promise pointing to ENTRY. - (letrec* ((deps* (map (cut sexp->manifest-entry <> (delay entry)) - deps)) - (entry (manifest-entry - (name name) - (version version) - (output output) - (item path) - (dependencies deps*) - (search-paths (map sexp->search-path-specification - search-paths)) - (parent parent) - (properties (or (assoc-ref extra-stuff 'properties) - '()))))) - entry)))) + (mlet* %state-monad + ((entry -> #f) + (deps* (mapm %state-monad + (cut sexp->manifest-entry <> (delay entry)) + deps)) + (visited (current-state)) + (key -> (list name version path))) + (set! entry ;XXX: emulate 'letrec*' + (manifest-entry + (name name) + (version version) + (output output) + (item path) + (dependencies deps*) + (search-paths (map sexp->search-path-specification + search-paths)) + (parent parent) + (properties (or (assoc-ref extra-stuff 'properties) + '())))) + (mbegin %state-monad + (set-current-state (vhash-cons key entry visited)) + (return entry)))))) (match sexp (('manifest ('version 0) @@ -608,7 +649,17 @@ (define* (sexp->manifest-entry sexp #:optional (parent (delay #f))) ;; Version 3 represents DEPS as full-blown manifest entries. (('manifest ('version 3 minor-version ...) ('packages (entries ...))) - (manifest (map sexp->manifest-entry entries))) + (manifest (run-with-state + (mapm %state-monad sexp->manifest-entry entries) + vlist-null))) + + ;; Version 4 deduplicates repeated entries, as can happen with deep + ;; propagated input trees. + (('manifest ('version 4 minor-version ...) + ('packages (entries ...))) + (manifest (run-with-state + (mapm %state-monad sexp->manifest-entry entries) + vlist-null))) (_ (raise (condition (&message (message "unsupported manifest format"))))))) diff --git a/tests/profiles.scm b/tests/profiles.scm index 7e51d37ab9..3838d971c9 100644 --- a/tests/profiles.scm +++ b/tests/profiles.scm @@ -586,6 +586,48 @@ (define (entry->sexp entry) #:locales? #f))) (return #f))))) +(test-assertm "deduplication of repeated entries" + ;; Make sure the 'manifest' file does not duplicate identical entries. + ;; See . + (mlet* %store-monad ((p0 -> (dummy-package "p0" + (build-system trivial-build-system) + (arguments + `(#:guile ,%bootstrap-guile + #:builder (mkdir (assoc-ref %outputs "out")))) + (propagated-inputs + `(("guile" ,%bootstrap-guile))))) + (p1 -> (package + (inherit p0) + (name "p1"))) + (drv (profile-derivation (packages->manifest + (list p0 p1)) + #:hooks '() + #:locales? #f))) + (mbegin %store-monad + (built-derivations (list drv)) + (let ((file (string-append (derivation->output-path drv) + "/manifest")) + (manifest (profile-manifest (derivation->output-path drv)))) + (define (contains-repeated? sexp) + (match sexp + (('repeated _ ...) #t) + ((lst ...) (any contains-repeated? sexp)) + (_ #f))) + + (return (and (contains-repeated? (call-with-input-file file read)) + + ;; MANIFEST has two entries for %BOOTSTRAP-GUILE since + ;; it's propagated both from P0 and from P1. When + ;; reading a 'repeated' node, 'read-manifest' should + ;; reuse the previously-read entry so the two + ;; %BOOTSTRAP-GUILE entries must be 'eq?'. + (match (manifest-entries manifest) + (((= manifest-entry-dependencies (dep0)) + (= manifest-entry-dependencies (dep1))) + (and (string=? (manifest-entry-name dep0) + (package-name %bootstrap-guile)) + (eq? dep0 dep1)))))))))) + (test-assertm "no collision" ;; Here we have an entry that is "lowered" (its 'item' field is a store file ;; name) and another entry (its 'item' field is a package) that is -- 2.36.1