From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp1.migadu.com ([2001:41d0:303:e224::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms8.migadu.com with LMTPS id oPx8BPo+A2aVUwEA62LTzQ:P1 (envelope-from ) for ; Tue, 26 Mar 2024 22:32:42 +0100 Received: from aspmx1.migadu.com ([2001:41d0:303:e224::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp1.migadu.com with LMTPS id oPx8BPo+A2aVUwEA62LTzQ (envelope-from ) for ; Tue, 26 Mar 2024 22:32:42 +0100 X-Envelope-To: larch@yhetil.org Authentication-Results: aspmx1.migadu.com; dkim=pass header.d=elephly.net header.s=zoho header.b=DqhkOlC9; dmarc=none; spf=pass (aspmx1.migadu.com: domain of "gwl-devel-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="gwl-devel-bounces+larch=yhetil.org@gnu.org"; arc=pass ("zohomail.com:s=zohoarc:i=1") ARC-Seal: i=2; s=key1; d=yhetil.org; t=1711488762; a=rsa-sha256; cv=pass; b=eY/qI02hD+ourBFryrOa3En3x+8scGQf0b66ipa7JVFwuzHL77DxMi+8Qez+zfxMPndzYk hVuHZGLZfX5ZlSJ/41k3vf/7qXAa/oi6AXupmFCt0Cpj+t92BJdR/L9/Ig1IjQihWLEX3K SGO3sgyIlmsWEs843qgcKSuKnjGvb0rXVFA1UYJwrbjJnZZdtr/NDksAZqUjKurE3dippa 7A/SmZ/GswMK+VOVhtKlbUL3nMIdQC1RMi5DCzf7oJSnksVyxF9v9sV9n53YEoaIat1BCk sxvcOQYVKiDHCOy+iPStvIXhcKUUcXEi4kGYqxgSbqCnIrFUSit+QCU/NhLhSw== ARC-Authentication-Results: i=2; aspmx1.migadu.com; dkim=pass header.d=elephly.net header.s=zoho header.b=DqhkOlC9; dmarc=none; spf=pass (aspmx1.migadu.com: domain of "gwl-devel-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="gwl-devel-bounces+larch=yhetil.org@gnu.org"; arc=pass ("zohomail.com:s=zohoarc:i=1") ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1711488762; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post:dkim-signature; bh=+XwyYBwKhCk6CS7EKP1AztshavIRuriwjB3eCfY8Gtw=; b=RakU5gCLzS7xWAXx0UbwcXUIIH+RjvbwayPyg60+7rMvj/lLzRvIUAfO5TahcuT7i9dSXp 6I7A5eMYNIRwdXVKGGPXB+nbFh1sznz9S4j6gdE1oNpeMyzJraP6A4tLrJD6G6ghwM5inG 9bB/n4VX+/2uWIuydYbnSUASCz2sQ9CY/SOAAUeg/VY6dKh+P+7T3CIz9DXH3ar5LBmqWc YgxHmYEDVZCgIaGHdSRKbo1qF43mJLFeLo8yJQ6e1vzniLqcpRK8rmoVyGfXc9+r0phoCI CuElRkQAzFiL4CmhHU9NNPE1ByGKntxWcQvlTkQQqMcps7F0z/wA1s8JgObkPg== Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id C749A14C68 for ; Tue, 26 Mar 2024 22:32:41 +0100 (CET) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rpEP4-0003Ca-F5; Tue, 26 Mar 2024 17:32:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rpEP1-0003CO-UX for gwl-devel@gnu.org; Tue, 26 Mar 2024 17:32:15 -0400 Received: from sender4-of-o51.zoho.com ([136.143.188.51]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rpEOz-0001xH-NO for gwl-devel@gnu.org; Tue, 26 Mar 2024 17:32:15 -0400 ARC-Seal: i=1; a=rsa-sha256; t=1711488727; cv=none; d=zohomail.com; s=zohoarc; b=jSc/VI1ZKD+Egjk/OzeIbzBU055KR6BCJAiLZLr8+Ec5T6ponDOdF24+WAw5upvtzd9j/LNDIWtPJ8OwPQJamTP/Sp/CKsKMii2lMxQtmIXsDFdASzw1Z+co6HEOeYHxO/BiDWUSQWRCYv4T9m/Bd2Zq0LbuzHv4K1Pg4VGiLqA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1711488727; h=Content-Type:Content-Transfer-Encoding:Date:Date:From:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:Subject:To:To:Message-Id:Reply-To:Cc; bh=+XwyYBwKhCk6CS7EKP1AztshavIRuriwjB3eCfY8Gtw=; 
b=aE7XoUGBprm2K4lYCoCAU32rI0XSke9CBy5jMre5+8/ShX6VlDu4D00PJp04bAlhieho5fIiiNEaTvATDYybPciDSpdjFGKcNFmLBT+R9p8xU/FxVpizKyVHQwh3Szw7XzO3512gIiWErIn9gZrMcWS5Wdu86o1/Rweqhj4UgXU= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=elephly.net; spf=pass smtp.mailfrom=rekado@elephly.net; dmarc=pass header.from= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1711488727; s=zoho; d=elephly.net; i=rekado@elephly.net; h=References:From:From:To:To:Subject:Subject:Date:Date:In-reply-to:Message-ID:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-Id:Reply-To:Cc; bh=+XwyYBwKhCk6CS7EKP1AztshavIRuriwjB3eCfY8Gtw=; b=DqhkOlC9jugOEwSPc2zjuwP8v7rYX0ZW40vvNQEo3RCEG5nj1q9Gn6t9t/5o+Fst Ki2DBq2K/3lFwQGGBYOehzawJdCJBzSYWLSucHRr9knIxLC9yK81hq3kD9c8ZynNy4O 8f9iioJ8FfiKXkVknvPe1VR9KsJV6UZ9AmiYUK+E= Received: from localhost (196-110-142-46.pool.kielnet.net [46.142.110.196]) by mx.zohomail.com with SMTPS id 1711488724834472.1946498388429; Tue, 26 Mar 2024 14:32:04 -0700 (PDT) References: <2010bdb88116d64da3650b06e58979518b2c7277.camel@ist.tugraz.at> <877chvehuu.fsf@elephly.net> User-agent: mu4e 1.10.8; emacs 29.1 From: Ricardo Wurmus To: Liliana Marie Prikler , gwl-devel@gnu.org Subject: Re: Processing large amounts of files Date: Tue, 26 Mar 2024 22:30:45 +0100 In-reply-to: <877chvehuu.fsf@elephly.net> Message-ID: <87v858brxq.fsf@elephly.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-ZohoMailClient: External Received-SPF: pass client-ip=136.143.188.51; envelope-from=rekado@elephly.net; helo=sender4-of-o51.zoho.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: 
gwl-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gwl-devel-bounces+larch=yhetil.org@gnu.org Sender: gwl-devel-bounces+larch=yhetil.org@gnu.org X-Migadu-Flow: FLOW_IN X-Migadu-Country: US X-Spam-Score: -8.11 X-Migadu-Queue-Id: C749A14C68 X-Migadu-Scanner: mx12.migadu.com X-Migadu-Spam-Score: -8.11 X-TUID: goBnJGE0kzxg

Ricardo Wurmus writes:

> Liliana Marie Prikler writes:
>
>> For comparison:
>> time cat /tmp/meow/{0..7769}
>> […]
>>
>> real 0m0,144s
>> user 0m0,049s
>> sys 0m0,094s
>>
>> It takes GWL 6 times longer to compute the workflow than to create the
>> inputs in Guile, and 600 times longer than to actually execute the
>> shell command. I think there is room for improvement :)
>
> GWL checks if all input files exist before running the command. Part of
> the difference you see here (takes about 2 seconds on my laptop) is GWL
> running FILE-EXISTS? on 7769 files. This happens in prepare-inputs; its
> purpose:
>
> "Ensure that all files in the INPUTS-MAP alist exist and are linked to
> the expected locations. Pick unspecified inputs from the environment.
> Return either the INPUTS-MAP alist with any additionally used input
> file names added, or raise a condition containing the list of missing
> files."
>
> Another significant delay is introduced by the cache mechanism, which
> computes a unique prefix based on the contents of all input files. It's
> not unexpected that this will take a little while, but it's not great
> either.

With commit f4442e409cf05d0c7cc4d6a251626d22efaffe8c it's a little
faster. We used a whole lot of alists, and this becomes slow when there
are thousands of inputs. We're now using hash tables.

-- 
Ricardo
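
To illustrate why alists hurt with thousands of inputs, here is a minimal
sketch in Python rather than Guile. It is not GWL's code: a list of
(key, value) pairs stands in for an alist, a dict stands in for a hash
table, and every function name and the file names are made up for the
example. It counts the key comparisons needed to look up each input once.

```python
def alist_ref(alist, key):
    """Return (value, comparisons) for a linear scan over (key, value) pairs."""
    comparisons = 0
    for k, v in alist:
        comparisons += 1
        if k == key:
            return v, comparisons
    return None, comparisons


def total_alist_comparisons(names):
    """Look up every name once in an alist built from the same names."""
    alist = [(name, index) for index, name in enumerate(names)]
    return sum(alist_ref(alist, name)[1] for name in names)


def total_table_lookups(names):
    """Same lookups against a hash table: one probe per name on average."""
    table = {name: index for index, name in enumerate(names)}
    return sum(1 for name in names if name in table)


names = [f"/tmp/meow/{i}" for i in range(1000)]  # hypothetical input names
print(total_alist_comparisons(names))  # 500500, i.e. n * (n + 1) / 2
print(total_table_lookups(names))      # 1000
```

By the same formula, looking up each of 7770 inputs once in a single
alist would take 7770 * 7771 / 2 = 30,190,335 comparisons, against 7770
hash table probes. How much of that applies to GWL depends on how often
each alist was actually searched, which this sketch does not measure.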