From: Konrad Hinsen
To: Ricardo Wurmus
Cc: gwl-devel@gnu.org
Subject: Re: Managing data files in workflows
In-Reply-To: <87r1k2ti7k.fsf@elephly.net>
References: <87r1k2ti7k.fsf@elephly.net>
Date: Fri, 26 Mar 2021 13:30:43 +0100

Hi Ricardo,

Ricardo Wurmus writes:

> This works for me correctly:

Thanks for looking into this! For me, your change makes no difference. Nor should it, because in my setup the "data" directory already exists. I still get an error message about the already existing file.

Maybe it's time to switch to the development version of GWL!

> It skips the process because the output file exists and the daring
> assumption we make is that outputs are reproducible.
>
> I would like to make these assumptions explicit in a future version, but
> I’m not sure how. An idea is to add keyword arguments to “file” that
> allows us to provide a content hash, or merely a flag to declare a file
> as volatile and thus in need of recomputation.

Declaring a file explicitly as volatile or reproducible sounds good. I am less convinced about adding a hash, except for inputs external to the workflow.
In my example, the file I download changes on the server once per week, so I'd mark it as volatile. I'd then expect it to be re-downloaded at every execution of the workflow. But I am also OK with doing this manually, i.e. deleting the file if I want it to be replaced. Old make habits never die ;-)

> I also wanted to have IPFS and git-annex support, but before I embark on
> this I want to understand exactly how this should behave and what the UI
> should be. E.g. having an input that is declared as “IPFS-file” would
> cause that input file to be fetched automatically without having to
> specify a process that downloads it first. (Something similar could be
> implemented for web resources as in your example.)

Indeed. An extended version of "guix download" for workflows.

However, what I had in mind with my question is the management of intermediate results in my workflow, especially in its development phase. If I change my workflow file, or a script that it calls, I'd want only the affected steps to be recomputed. That's not much of an issue for my current test case, but I have bigger dreams for the future ;-)

Cheers,
  Konrad.
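
For illustration only, the two behaviours discussed above (always recomputing a volatile file, and recomputing a step only when the workflow file or a script it calls has changed) could be sketched roughly as follows. This is plain Python, not GWL code, and it uses no GWL API: the function names `fingerprint`, `needs_rerun` and `record_run`, the `volatile` flag and the JSON state file are all hypothetical choices made for this sketch.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(paths):
    """Hash the names and contents of a step's dependencies
    (workflow file, scripts, input files)."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in paths):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(Path(path).read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def _load_state(state_file):
    """Read the recorded fingerprints, or start empty (hypothetical format)."""
    state_path = Path(state_file)
    return json.loads(state_path.read_text()) if state_path.exists() else {}


def needs_rerun(output, dependencies, state_file, volatile=False):
    """Return True if the step producing `output` should be recomputed."""
    # A volatile output is never trusted; a missing one must be produced.
    if volatile or not Path(output).exists():
        return True
    # Otherwise rerun only if the dependencies differ from the last run.
    recorded = _load_state(state_file).get(str(output))
    return recorded != fingerprint(dependencies)


def record_run(output, dependencies, state_file):
    """Remember the dependency fingerprint after a successful run."""
    state = _load_state(state_file)
    state[str(output)] = fingerprint(dependencies)
    Path(state_file).write_text(json.dumps(state, indent=2))
```

A driver would call `needs_rerun` before each step and `record_run` after it succeeds. Deleting an output file by hand still forces recomputation, so the make-style habit keeps working.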