From: Konrad Hinsen
To: gwl-devel@gnu.org
Subject: Managing data files in workflows
Date: Thu, 25 Mar 2021 10:57:27 +0100

Hi everyone,

Coming from make-like workflow systems, I wonder how data files are best
managed in a GWL workflow. GWL is clearly less file-centric than make (which
is a Good Thing in my opinion), but on a first reading of the manual, it
doesn't seem to care about files at all, except for auto-connect.

A simple example:

==================================================
process download
  packages "wget"
  outputs
    file "data/weekly-incidence.csv"
  # {
    wget -O {{outputs}} http://www.sentiweb.fr/datasets/incidence-PAY-3.csv
  }

workflow influenza-incidence
  processes download
==================================================

This works fine the first time, but the second time it fails because the
output file of the process already exists. This doesn't look very useful.
The two behaviors I do see as potentially useful are

1) Always replace the file.
2) Don't run the process if the output file already exists
   (as make would do by default).

I can handle this in my bash code of course, but that becomes lengthy even
for this trivial case:

==================================================
process download
  packages "wget"
  outputs
    file "data/weekly-incidence.csv"
  # {
    rm -f {{outputs}}
    wget -O {{outputs}} http://www.sentiweb.fr/datasets/incidence-PAY-3.csv
  }
==================================================

==================================================
process download
  packages "wget"
  outputs
    file "data/weekly-incidence.csv"
  # {
    test -f {{outputs}} || wget -O {{outputs}} http://www.sentiweb.fr/datasets/incidence-PAY-3.csv
  }
==================================================

Is there a better solution?

Cheers,
  Konrad.
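
For reference, the two behaviors (always replace, or skip when the output exists) could be factored into one small shell function instead of being repeated in every process. This is only a sketch under assumptions: `produce` is a hypothetical helper, not something GWL provides, and it assumes the wrapped command accepts the path to write as its last argument. It writes to a temporary file first, so a failed download does not leave a partial file that the "skip" policy would later take for a finished one.

```shell
# Hypothetical helper, not part of GWL.
# Usage: produce MODE OUTPUT COMMAND [ARGS...]
#   MODE "replace": always regenerate OUTPUT
#   MODE "skip":    keep an existing OUTPUT, as make would by default
# COMMAND is run with the path it must write appended as its last argument.
# Note: plain POSIX sh has no local variables, so mode/out/tmp/status
# are globals.
produce() {
    mode=$1
    out=$2
    shift 2

    case $mode in
        replace) rm -f "$out" ;;
        skip)    if [ -f "$out" ]; then return 0; fi ;;
        *)       echo "produce: unknown mode '$mode'" >&2; return 2 ;;
    esac

    # Make sure the target directory exists before writing.
    mkdir -p "$(dirname "$out")"

    # Write to a temporary name and rename only on success.
    tmp="$out.tmp.$$"
    if "$@" "$tmp"; then
        mv "$tmp" "$out"
    else
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
}
```

With this in place, the body of the download process would shrink to a single call such as `produce skip {{outputs}} wget http://www.sentiweb.fr/datasets/incidence-PAY-3.csv -O` (GNU wget accepts the `-O` option after the URL). Whether such a helper can be shared conveniently between GWL processes is a separate question.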