From: swedebugia
Subject: Re: Internet Archive APIs useful as fallback?
Date: Wed, 19 Dec 2018 18:18:53 +0100
To: Ludovic Courtès
Cc: guix-devel

On 2018-12-19 15:57, Ludovic Courtès wrote:
> Hi!
>
> swedebugia skribis:
>
>> I stumbled over these at clintons blog and thought I would share them
>> here if anybody is interested.
>>
>> APIs for content other than way-back machine:
>> https://blog.archive.org/2018/12/13/documentation-for-public-apis-at-the-internet-archive/
>>
>> APIs for the way-back machine:
>> https://archive.org/help/wayback_api.php
>
> We added support to retrieve Git checkouts (and some tarballs) from
> Software Heritage recently:
>
> https://issues.guix.info/issue/33432
>
> The Internet Archive is not in the business of archiving software, but
> it’d be interesting to see if it archives tarballs that people put on
> “random” web sites, in which case it could also be useful.
>
> Thoughts?

Thanks for the quick reply :)

Yes, thanks for working on SWH! I have not yet succeeded in downloading
from it, but I guess that has to do with the baking and all.

---

I tested 3 of the quicklisp packages (there are ~1000) and these 2 worked:

https://web.archive.org/web/20170313123155/http://beta.quicklisp.org/archive/cl-moneris/2011-04-18/cl-moneris-20110418-git.tgz
https://web.archive.org/web/20170330204246/http://beta.quicklisp.org/archive/cl-modlisp/2015-09-23/cl-modlisp-20150923-git.tgz

:D

I tested the pypi packages and none of the 8 I tried was available.

To my knowledge the wayback crawler archives everything now: tgz, pdf,
exe, you name it. It seems, though, that it does not archive these large
files on every pass.

I think the quicklisp example above is enough merit to add it as a
last resort after trying SWH.

Thoughts?

-- 
Cheers
Swedebugia
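
For illustration, here is a rough sketch of how the "last resort" lookup could work against the Wayback availability API linked above. It is written in Python only to keep it short; it is not Guix code. The function names are made up for this example, and the JSON layout (an "archived_snapshots" object with a "closest" entry) is what I understand the wayback_api help page to describe, so treat it as an assumption to verify.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as described on https://archive.org/help/wayback_api.php
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the availability-API query for an upstream tarball URL.

    `timestamp` (YYYYMMDDhhmmss, optional) asks for the snapshot
    closest to that date instead of the most recent one.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return the archived URL from a decoded API response, or None.

    Assumes the response looks like
    {"archived_snapshots": {"closest": {"available": ..., "status": ..., "url": ...}}}.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Only accept snapshots that were captured with a 200 status.
    if str(closest.get("status")) != "200":
        return None
    return closest.get("url")


def wayback_fallback(url, timeout=30):
    """Ask the Wayback Machine whether it holds a copy of `url`.

    Performs a network request; returns the snapshot URL or None.
    """
    with urllib.request.urlopen(availability_query(url), timeout=timeout) as reply:
        return closest_snapshot(json.load(reply))
```

A downloader would call wayback_fallback() only after the upstream URL and SWH have both failed, and would still have to check the hash of whatever it gets back, since an archived copy is not guaranteed to be the original file.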