From: swedebugia@riseup.net
Subject: Re: `guix lint' warn of GitHub autogenerated source tarballs
Date: Fri, 21 Dec 2018 13:00:07 -0800
Message-ID: <780fe5ff38aac96b9949148d2ffd73d2@riseup.net>
References: <87pntxwqx0.fsf@gnu.org> <08635A1A-EDA5-44B0-8C8A-532F16683154@flashner.co.il> <20181219192926.GB2581@macbook41> <87imzmmwno.fsf@gnu.org>
In-Reply-To: <87imzmmwno.fsf@gnu.org>
List-Id: "Development of GNU Guix and the GNU System distribution."
To: Ludovic Courtès
Cc: guix-devel@gnu.org

On 2018-12-21 21:50, Ludovic Courtès wrote:
> Hi!
>
> Efraim Flashner skribis:
>
>> Here's what I currently have. I don't think I've tried running the tests
>> I've written yet, and Ludo said there was a better way to check if the
>> download was a git-fetch or a url-fetch. As the logic is currently
>> written it'll flag any package hosted on github owned by 'archive' or
>> any package named 'archive' in addition to the ones we want.
>
> OK. I think you’re pretty much there anyway, so please don’t drop the
> ball. ;-)
>
> Some comments follow:
>
>> From 8a07c8aea1f23db48a9e69956ad15f79f0f70e35 Mon Sep 17 00:00:00 2001
>> From: Efraim Flashner
>> Date: Tue, 23 Oct 2018 12:01:53 +0300
>> Subject: [PATCH] lint: Add checker for unstable tarballs.
>>
>> * guix/scripts/lint.scm (check-source-unstable-tarball): New procedure.
>> (%checkers): Add it.
>> * tests/lint.scm ("source-unstable-tarball", source-unstable-tarball:
>> source #f", "source-unstable-tarball: valid", source-unstable-tarball:
>> not-github", source-unstable-tarball: git-fetch"): New tests.
>
> [...]
>
>> +(define (check-source-unstable-tarball package)
>> +  "Emit a warning if PACKAGE's source is an autogenerated tarball."
>> +  (define (github-tarball? origin)
>> +    (string-contains origin "github.com"))
>> +  (define (autogenerated-tarball? origin)
>> +    (string-contains origin "/archive/"))
>> +  (let ((origin (package-source package)))
>> +    (unless (not origin) ; check for '(source #f)'
>> +      (let ((uri (origin-uri origin))
>> +            (dl-method (origin-method origin)))
>> +        (unless (not (pk dl-method "url-fetch"))
>> +          (when (and (github-tarball? uri)
>> +                     (autogenerated-tarball? uri))
>> +            (emit-warning package
>> +                          (G_ "the source URI should not be an autogenerated tarball")
>> +                          'source)))))))
>
> You should use ‘origin-uris’ (plural), which always returns a list of
> URIs, and iterate on them (see ‘check-mirror-url’ as an example.)
>
> Also, when you have a URI, you can obtain just the host part and decode
> the path part like this:
>
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> (string->uri "https://github.com/foo/bar/archive/whatnot")
> $2 = #<<uri> scheme: https userinfo: #f host: "github.com" port: #f
> path: "/foo/bar/archive/whatnot" query: #f fragment: #f>
> scheme@(guile-user)> (uri-host $2)
> $3 = "github.com"
> scheme@(guile-user)> (split-and-decode-uri-path (uri-path $2))
> $4 = ("foo" "bar" "archive" "whatnot")
> --8<---------------cut here---------------end--------------->8---
>
> That way you should be able to get more accurate matching than with
> ‘string-contains’. Does that make sense?

This is super nice!
I did not know this. It makes URL parsing much easier :D

-- 
Cheers
Swedebugia
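
P.S. A rough sketch of how the host and path matching suggested above might
look in place of the ‘string-contains’ calls. This is only an illustration
under stated assumptions, not the code from the patch: the procedure name
‘autogenerated-github-tarball?’ is hypothetical, and it assumes that an
autogenerated tarball URL has the shape github.com/OWNER/REPO/archive/FILE.
It uses only (web uri) and (ice-9 match) from Guile.

```scheme
(use-modules (web uri)
             (ice-9 match))

;; Hypothetical helper (not part of the patch): return #t when URL looks
;; like github.com/OWNER/REPO/archive/..., #f otherwise.
(define (autogenerated-github-tarball? url)
  (let ((uri (string->uri url)))        ;#f when URL cannot be parsed
    (and uri
         ;; Compare the host exactly instead of searching the whole string.
         (equal? (uri-host uri) "github.com")
         (match (split-and-decode-uri-path (uri-path uri))
           ;; Only the third path component is compared with "archive", so
           ;; an owner or repository named "archive" is not flagged.
           ((owner repo "archive" . _) #t)
           (_ #f)))))

;; Autogenerated tarball: expected to be flagged.
(autogenerated-github-tarball?
 "https://github.com/foo/bar/archive/whatnot")

;; Owner named "archive", release asset: expected not to be flagged.
(autogenerated-github-tarball?
 "https://github.com/archive/bar/releases/download/v1/bar.tar.gz")
```

In the checker itself, a predicate like this would presumably be applied to
each element of the list returned by ‘origin-uris’, as Ludo suggests.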