From: Simon Tournier
To: Spencer Skylar Chan, Kyle, Ricardo Wurmus
Cc: guix-devel@gnu.org
Subject: Re: Google Summer of Code 2023 Inquiry
Date: Tue, 04 Apr 2023 10:59:33 +0200
Message-ID: <86ttxwvx8q.fsf@gmail.com>

Hi,

On Mon, 03 Apr 2023 at 20:41, Spencer Skylar Chan wrote:

>> I would expect most software versions to not be in Guix. Simon had
>> mentioned that this is mostly what the guix-past repository is
>> for. However, some packages might be buried on some branch or some
>> commit in some Guix related git repository. It may be helpful to
>> facilitate their discovery and extraction for conda import.

Please note,

 1. The aim of the guix-past [1] channel is to keep previous versions
    of some packages working with recent Guix revisions.  The
    motivation for guix-past was the 10 Years Challenge [2], and it
    was then fed by a hackathon [3].

 2. There is no easy way to know which revision of Guix provides a
    specific version of a given package.  Discovering the mapping
    from package version to Guix revision is not straightforward with
    the current tools.
    I am aware of two directions: rely on an external server such as
    the Guix Data Service [4], or implement “guix git log” [5] (the
    code lives in the branch ‘wip-guix-log’).

1: https://gitlab.inria.fr/guix-hpc/guix-past
2: http://rescience.github.io/ten-years/
3: https://hpc.guix.info/blog/2020/07/reproducible-research-hackathon-experience-report/
4: https://data.guix.gnu.org/repository/1/branch/master/package/gmsh/output-history
5: https://guix.gnu.org/en/blog/2021/outreachy-guix-git-log-internship-wrap-up/

>> Git has a newish binary file format for caching searches across
>> commits. Maybe it would be helpful to figure out how to parse this
>> format (its documented) and index the data further using Xapian or a
>> graph data structure (or tree sitter?) with the relevant metadata
>> needed to find and efficiently extract scheme code and its
>> dependencies?

Months ago, I started to do that: index the package list using
Xapian.  Well, “started” is a strong word here, since I have not done
much.  My idea was (and still is!) to address two things at the same
time: a faster “guix search” [6] and the discovery of past versions.
Somehow, rework Arun’s patches [6].  From my point of view, it would be
possible to add Xapian as a dependency for Guix, therefore I think it
should use GUIX_EXTENSIONS_PATH.

6: https://issues.guix.gnu.org/39258#14

> If the format is documented then this is possible, although I'm not
> super familiar with these kinds of data structures.

As said, an entry point about how “guix search” works is the super long
discussion in #39258 [7]. :-)

7: https://issues.guix.gnu.org/39258

>> You make an interesting point about compilation errors. It may more
>> productive to help researchers test for working satisfiable
>> configurations as a more relaxed approach to having to specify the
>> exact software version.
>> Maybe some "nearby" or newer version is
>> packaged and that is enough to successfully run a test suite? I'm
>> imagining something between git bisect and Guix's own package
>> solver.
>
> Yes, we could have a variant of the solver that's more relaxed. It could
> output multiple solutions so the user can inspect them and pick the best
> one.

I do not know what you have in mind with “working satisfiable
configurations” or with “a variant of the solver”.  To my knowledge,
this implies some SAT solver.  Well, before going in this direction, I
would suggest reading some output of the Mancoosi project [8],
especially this part [9].  From my point of view, the direction
“working satisfiable configurations” or “a variant of the solver” would
break the reproducibility of a specific configuration in the general
case.  Part of the problem with computational environment
reproducibility is that package managers implement solvers for
installing packages.

That said, all the package versions that Guix can provide form a DAG
because they come from a Git history, or rather from the combination of
several Git histories when considering several channels.  Thus, a
specific version of a package is given by an interval in the graph.
Considering a list of packages, each at one specific version, we end up
with a list of intervals.  The “working satisfiable configuration” is
then the intersection of all the intervals of this list; note that the
result could also be the empty interval.

It is a graph problem.  Almost trivial when the graph is linear.  But it
requires some work when merges happen.  And note that merges bring in
branches that do not always fully build; for instance, part of
core-updates before it is merged.  To my knowledge, it is impossible to
detect this beforehand.
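
To make the interval idea concrete for the easy, linear case only, here
is a minimal Python sketch.  Everything in it is an assumption made for
illustration: commits are simply numbered 0, 1, 2, ... along one linear
history, the package names and ranges are invented, and `intersect` is
a hypothetical helper, not something Guix provides.  It says nothing
about merges or several channels, which is where the real work is.

```python
from typing import List, Optional, Tuple

# Assumption: a linear history whose commits are numbered 0, 1, 2, ...
# Each pinned package version is available on one closed interval
# (first_commit, last_commit).  All numbers below are invented.
Interval = Tuple[int, int]


def intersect(intervals: List[Interval]) -> Optional[Interval]:
    """Return the range of commits covered by every interval, or None."""
    if not intervals:
        return None
    start = max(lo for lo, _ in intervals)  # latest first commit
    end = min(hi for _, hi in intervals)    # earliest last commit
    return (start, end) if start <= end else None


# Hypothetical request: three packages, each at one specific version.
wanted = {
    "pkg-a@1.0": (120, 480),
    "pkg-b@2.3": (300, 520),
    "pkg-c@0.9": (250, 410),
}

common = intersect(list(wanted.values()))
print(common)  # (300, 410)
```

Any commit in the printed range would provide all three versions at
once; when the ranges do not overlap, the function returns None, which
corresponds to the empty interval mentioned above.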

We discussed this kind of topic when introducing “guix package
--export-channels”; it is a variant of this proposal, IMHO.

Last, considering all the version fields in Guix, I am not convinced it
is straightforward to guarantee some “nearby” or newer versions.  It can
only be heuristics working with more or less accuracy; see “guix
refresh” and all the updaters.

All in all, I am not convinced Guix should try to implement a way to
“specify the exact software version”, because it leads to the false
impression that version labels are enough for reproducing computational
environments, when that is far from being the case.

Well, I agree that Guix should only provide tools to build channels.scm
and manifest.scm files, both hinted by some inputs such as
requirements.txt, while strongly claiming that only the computational
environment generated by channels.scm+manifest.scm is reproducible.
All other computational environments, generated with inputs other than
channels.scm+manifest.scm, are not reproducible; this includes any
converter from whatever inputs to generated channels.scm+manifest.scm.

8: https://www.mancoosi.org/
9: https://www.mancoosi.org/edos/algorithmic/

> Finally, would these projects be considered large or medium for the
> purposes of GSOC?

Well, there are many ideas floating around. :-)  That’s because much
work still remains. ;-)

Many ideas discussed here are larger than GSoC.  Now, you should pick
one that interests you and for which you have an idea of how to
implement it.  Then try to draw up a schedule to see if you think it
would fit.  Please consider that implementing always takes longer than
initially planned; there are always unexpected tiny details blocking
the initial plan; devil, details and all that. ;-)

Cheers,
simon