From: ludovic.courtes@inria.fr (Ludovic Courtès)
Subject: Re: Software Heritage API
Date: Mon, 04 Sep 2017 16:47:07 +0200
To: Efraim Flashner
Cc: Guix-devel

Hi Efraim,

Efraim Flashner wrote:

> I've kept this tagged to take a look at it later. I checked the sha1sum
> of swig-3.0.10.tar.gz and it gave me a valid URL.
> https://archive.softwareheritage.org/api/1/content/c672b8535394cfb204c70de7c66e69fb20a95647/
> https://archive.softwareheritage.org/api/1/content/sha1:c672b8535394cfb204c70de7c66e69fb20a95647/
> https://archive.softwareheritage.org/api/1/content/sha256:2939aae39dec06095462f1b95ce1c958ac80d07b926e48871046d17c0094f44c/
>
> If you take a look at the page(s), '/raw' can only be appended to the
> sha1 (or blank) URLs to download the source, which currently returns
> 401.

Be aware that Software Heritage (SWH) stores only raw commits and not
tarballs (or not yet).  That means that you may be able to find the
“3.0.10” tag of SWIG, but not swig-3.0.10.tar.gz.  See:

  https://sympa.inria.fr/sympa/arc/swh-devel/2016-09/msg00000.html

> Currently our "magic mirrors" search hydra based on the hash; in order
> to check here also for the source we'd have to undo the base32 hash, and
> then either transform the sha256 hash to a sha1 hash, or use two API
> calls, the first to check for the source and the second to get and use
> the url to download it. A quick check online makes me think it's not
> possible to take a sha256 hash and get the sha1 hash of that file.

As it is now, SWH could only help us for Git checkouts, not for
tarballs.

There is no way to “convert” a SHA256 hash to SHA1 or similar, but
apparently SWH supports SHA256 anyway.

The second problem, though, is that the way we compute the hash of a
directory differs from the way they do:

  https://sympa.inria.fr/sympa/arc/swh-devel/2016-07/msg00018.html

Essentially, Guix computes the hash of the nar (“normalized archive”) of
the directory, whereas SWH computes a hash over the Git tree
representation.  AFAICS this cannot be overcome without manually
specifying the git-tree-hash in our ‘origin’ objects.

Ludo’.
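
To illustrate the "hash over the Git tree representation" mentioned
above, here is a minimal Python sketch of how Git derives the identifier
of a directory from its contents.  It is not code from SWH or Guix: the
function names and the in-memory dictionary layout are hypothetical, and
it assumes a directory made only of regular, non-executable files and
subdirectories (no symlinks, no executable bits).  Since a nar hash is
computed over a different serialization, the two identifiers cannot be
derived from one another.

```python
import hashlib


def git_object_id(kind: str, payload: bytes) -> bytes:
    """Return the raw 20-byte SHA-1 that Git assigns to an object.

    Git hashes a header of the form "<kind> <size>\\0" followed by the payload.
    """
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).digest()


def git_tree_id(directory: dict) -> bytes:
    """Hash a directory given as {name: bytes (file) or dict (subdirectory)}.

    Simplifying assumption: every file is a regular, non-executable file
    (mode 100644); symlinks and executables are not handled.
    """
    def sort_key(item):
        name, value = item
        # Git orders tree entries as if directory names ended in "/".
        return (name + "/" if isinstance(value, dict) else name).encode()

    payload = b""
    for name, value in sorted(directory.items(), key=sort_key):
        if isinstance(value, dict):
            mode, object_id = b"40000", git_tree_id(value)
        else:
            mode, object_id = b"100644", git_object_id("blob", value)
        # Each entry is "<mode> <name>\0" followed by the raw object id.
        payload += mode + b" " + name.encode() + b"\0" + object_id
    return git_object_id("tree", payload)


if __name__ == "__main__":
    # Hypothetical one-file checkout, used only to exercise the functions.
    print(git_tree_id({"README": b"hello world\n"}).hex())
```

An 'origin' that wanted to be looked up by this identifier would need to
record it separately, since it cannot be recovered from the nar hash.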