From: Spencer Skylar Chan
To: Kyle, Ricardo Wurmus, Simon Tournier
Cc: guix-devel@gnu.org
Subject: Re: Google Summer of Code 2023 Inquiry
Date: Mon, 3 Apr 2023 20:41:53 -0400
Message-ID: <09755392-de37-c039-6b60-46310f6f4314@terpmail.umd.edu>
References: <6d30ee7b-f1f0-9199-fea8-efd434c8611c@terpmail.umd.edu> <86sfeb9zx8.fsf@gmail.com> <87ttycir7r.fsf@elephly.net> <42aa5844-0769-e122-efd7-8a152070c71c@terpmail.umd.edu>

Hi Kyle,

On 3/31/23 11:15, Kyle wrote:
> I would expect most software versions to not be in Guix. Simon had mentioned that this is mostly what the guix-past repository is for. However, some packages might be buried on some branch or some commit in some Guix-related git repository. It may be helpful to facilitate their discovery and extraction for conda import.
>
> Git has a newish binary file format for caching searches across commits. Maybe it would be helpful to figure out how to parse this format (it's documented) and index the data further using Xapian or a graph data structure (or tree sitter?) with the relevant metadata needed to find and efficiently extract Scheme code and its dependencies?

If the format is documented then this is possible, although I'm not super familiar with these kinds of data structures.

> You make an interesting point about compilation errors.
> It may be more productive to help researchers test for working satisfiable configurations as a more relaxed alternative to specifying the exact software version. Maybe some "nearby" or newer version is packaged and that is enough to successfully run a test suite? I'm imagining something between git bisect and Guix's own package solver.

Yes, we could have a variant of the solver that's more relaxed. It could output multiple solutions so the user can inspect them and pick the best one.

> It might also be productive to add infrastructure to help scientists more conveniently track and study their recent packaging experiments. Guix will only become more useful the more packages are already available. Work which makes packaging more approachable by more people benefits everyone. Perhaps you can think of other ideas in this direction?

I'm not sure how "packaging experiments" are different from packaging software the usual way. I think making the importers easier to use and debug would help, although that sounds outside the scope of the projects.

Finally, would these projects be considered large or medium for the purposes of GSoC?

Thanks,
Skylar

> On March 30, 2023 7:22:14 PM EDT, Spencer Skylar Chan wrote:
>> Hi Kyle,
>>
>> On 3/24/23 14:59, Kyle wrote:
>>> I am a bit worried that your proposed project is too focused on replacing Python with Guile. I think the project would benefit more from making Python users more comfortable productively using Guix tools in concert with the tools they are already comfortable with.
>>
>> Yes, I agree with you. Replacing Python with Guile is a much more ambitious task and is not the highest priority here.
>>
>>> I'm wondering if you might consider modifying your project goals toward exploring how GWL might be enhanced so that it could better complement more expressive language-specific workflow tools like Snakemake. I am also personally interested in exploring such facilities from the targets workflow system in R as well.
>>> Alternatively, perhaps you could focus on extending the GWL with more features?
>>
>> I would also be interested in extending GWL with more features; I will follow up on this on the GWL mailing list.
>>
>>> I agree that establishing an achievable scope within a short timeline is crucial. The conda env importer idea would be quite an ambitious undertaking by itself and would lead you towards thinking about some pretty interesting and impactful problems.
>>
>> While it's a challenging project, it could be broken into smaller steps:
>>
>> 1. import packages by exact matching names only, without versioning
>> 2. extend `guix import` to have `guix import conda` to help with package names that do not match exactly, and to accelerate adoption of Conda packages not in Guix
>> 3. match software version numbers when translating Conda packages to Guix
>>
>> What's currently undefined is the error handling:
>> - if a Conda package does not exist in Guix
>> - if the dependency graph is not solvable
>> - if compiling the environment fails (due to mismatching dependency versions)
>>
>> I believe there are many satisfactory stopping points for successful completion within the timeline of the summer, which I hope to present with my proposal soon.
>>
>> Thanks,
>> Skylar
>>
>>>
>>> On March 22, 2023 5:44:52 PM EDT, Spencer Skylar Chan wrote:
>>>
>>> Hi Ricardo,
>>>
>>> On 3/22/23 14:19, Ricardo Wurmus wrote:
>>>
>>>
>>> - Translating Snakemake to Guix Workflow Language (GWL)
>>>
>>>
>>> Ricardo, maybe you would have some suggestions. :-)
>>>
>>>
>>> Oh, this looks interesting. Could you please elaborate on the idea?
>>>
>>> My idea is to take as input a Snakemake workflow file and eventually output an equivalent GWL workflow file.
>>>
>>> Currently, Snakemake workflows can be exported to CWL (Common Workflow Language):
>>>
>>> https://snakemake.readthedocs.io/en/stable/executing/interoperability.html
>>>
>>> One approach could be to add CWL import/export capabilities to GWL.
>>> Then Snakemake/GWL conversion would be a 2-step process, using CWL as an intermediate step:
>>>
>>> 1. Snakemake -> CWL
>>> 2. CWL -> GWL
>>>
>>> However, CWL is not as expressive as Snakemake. There may be some details that are lost from Snakemake workflows.
>>>
>>> So a 1-step Snakemake/GWL transpiler could be interesting, as both Snakemake/GWL use a domain-specific language inside a general purpose language (Python/Guile respectively). There may be a possibility to achieve more "accurate" translations between workflows.
>>>
>>> Is this topic something that could fit into a summer project?
>>>
>>
>
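
For illustration only, the first step of the Conda importer discussed in this thread (exact name matching, no versioning) together with the "package does not exist in Guix" error case might be sketched roughly as below. This is a minimal sketch under stated assumptions, not an existing Guix or Conda API: the function names `parse_conda_dependencies` and `match_exact`, the `ImportReport` class, and the set of available Guix package names are all hypothetical, and a real importer would query Guix itself rather than use a hard-coded set.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ImportReport:
    """Outcome of step 1: which Conda names were found under the same name."""
    matched: dict = field(default_factory=dict)  # conda name -> guix name
    missing: list = field(default_factory=list)  # conda names with no exact match


def parse_conda_dependencies(lines):
    """Extract bare, lower-cased package names from dependency specs.

    Assumes specs such as 'name', 'name=1.2', 'name=1.2=build' or 'name>=1.2';
    the version and build parts are ignored (step 1 does no versioning).
    """
    names = []
    for line in lines:
        spec = line.strip()
        if not spec or spec.startswith("#"):
            continue  # skip blank lines and comments
        name = re.split(r"[=<>! ]", spec, maxsplit=1)[0]
        names.append(name.lower())
    return names


def match_exact(conda_names, guix_names):
    """Match by exact name only; report unmatched packages instead of failing."""
    available = set(guix_names)
    report = ImportReport()
    for name in conda_names:
        if name in available:
            report.matched[name] = name
        else:
            report.missing.append(name)
    return report


if __name__ == "__main__":
    deps = ["numpy=1.24.2", "samtools=1.16", "some-lab-tool=0.1"]
    # Hypothetical stand-in for the set of package names available in Guix.
    guix_packages = {"samtools", "python-numpy"}
    report = match_exact(parse_conda_dependencies(deps), guix_packages)
    print("matched:", report.matched)
    print("missing:", report.missing)
```

With the hypothetical package set above, `numpy` ends up in the missing list because the stand-in Guix name carries a `python-` prefix, which is the kind of near-miss that step 2 (`guix import conda`) is meant to address. Returning a report rather than raising on the first unmatched package is one possible answer to the open error-handling question; other choices are equally defensible.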