From mboxrd@z Thu Jan 1 00:00:00 1970
References: <878ryd8we4.fsf@inria.fr>
User-agent: mu4e 1.6.6; emacs 27.2
From: Ricardo Wurmus
To: Ludovic Courtès
Cc: guix-devel@gnu.org
Subject: Re: Accuracy of importers?
Date: Thu, 28 Oct 2021 12:25:56 +0000
In-reply-to: <878ryd8we4.fsf@inria.fr>
Message-ID: <87y26dxqz9.fsf@elephly.net>
X-URL: https://elephly.net
X-PGP-Key: https://elephly.net/rekado.pubkey
X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
List-Id: "Development of GNU Guix and the GNU System distribution."
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

Ludovic Courtès writes:

> Hello Guix!
>
> As I’m preparing my PackagingCon talk and wondering how language package
> managers could make our lives easier, I thought it’d be interesting to
> know how well our importers are doing.
>
> My understanding is that most of them require manual intervention, i.e.,
> one has to tweak what ‘guix import’ produces, even if we ignore
> synopsis/description/license, to set the right inputs, etc.  If we were
> to estimate the fraction of imported packages for which manual changes
> are needed, what would it look like?
>
>   importer      fraction of imported packages needing changes
[…]
>   cran          5% (Ricardo? Simon? seems to almost always work?)

As Lars and Simon wrote: the importers work *really* well for both
CRAN and Bioconductor, so much so that I’m using them in the
background here:

https://git.elephly.net/gitweb.cgi?p=software/r-guix-install.git;a=blob;f=guix-install.R;h=2766aa1f2d248a8ed2a4eb4c3244b85574d326e2;hb=HEAD

The biggest annoyance is the missing “license:” prefix when
packaging things for gnu/packages/cran.scm or
gnu/packages/bioconductor.scm.  Descriptions need regular clean-up
work (e.g. to complete sentences), even though we’re using some
heuristics to fix the most common stylistic problems.  It’s really
not a big deal, though.

The biggest missing feature is recursive import of dependencies
hosted on GitHub or in Mercurial repositories (with “-r -a git” or
“-r -a hg”).  I.e. a package on GitHub that declares a dependency on
another package that’s also only hosted on GitHub will fail to import
that dependency.  This is pretty rare, but it happens with
experimental bioinfo software.

>   texlive       (Ricardo? Thiago? Marius?)

This one is not usable.  I’d even add “at all”.  I keep announcing
that one day I’ll replace it with a new importer, but that new
importer just isn’t ready yet.

> What about licensing info: which ones provide accurate licensing info?
> My guess:
>
>   gnu
>   pypi
>   cpan
>   cran

The CRAN importer is as accurate as upstream allows.  CRAN requires
a free license; Bioconductor requires a license declaration (there
have been very few cases where the license was not correct, but a
number of cases where the license was non-free, such as the
Artistic 1.0 license).  Bioconductor is sometimes sneaky: the R code
is free but a necessary library is not.

>   texlive

Pretty terrible.
The license declaration is generally too vague.  Licenses are often
declared without a version number, and sometimes it’s just some
generic “free” license.  A new importer based on texlive.tlpdb would
not improve this by much, because the upstream declarations are just
spotty and unreliable.

-- 
Ricardo

PS: attached is a rough WIP patch of what I had been using to
import new texlive stuff.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=texlive-import.diff

diff --git a/guix/import/texlive.scm b/guix/import/texlive.scm
index 18d8b95ee0..b94aa1cf40 100644
--- a/guix/import/texlive.scm
+++ b/guix/import/texlive.scm
@@ -19,10 +19,12 @@
 
 (define-module (guix import texlive)
   #:use-module (ice-9 match)
+  #:use-module (ice-9 rdelim)
   #:use-module (sxml simple)
   #:use-module (sxml xpath)
   #:use-module (srfi srfi-11)
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-2)
   #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-34)
   #:use-module (web uri)
@@ -125,9 +127,9 @@ (define (fetch-sxml name)
       (xml->sxml (http-fetch url)
                  #:trim-whitespace? #t))))
 
-(define (guix-name component name)
+(define (guix-name name)
   "Return a Guix package name for a given Texlive package NAME."
-  (string-append "texlive-" component "-"
+  (string-append "texlive-"
                  (string-map (match-lambda
                                (#\_ #\-)
                                (#\. #\-)
@@ -186,12 +188,123 @@ (define (sxml-value path)
                         ((lst ...) `(list ,@lst))
                         (license license)))))))
 
+(define tlpdb
+  (memoize
+   (lambda ()
+     (let ((file "/home/rekado/dev/gx/branches/master/texlive.tlpdb")
+           (fields
+            '((name . string)
+              (shortdesc . string)
+              (longdesc . string)
+              (catalogue-license . string)
+              (catalogue-ctan . string)
+              (srcfiles . list)
+              (runfiles . list)
+              (docfiles . list)
+              (depend . list)))
+           (record
+            (lambda* (key value alist #:optional (type 'string))
+              (let ((new
+                     (or (and=> (assoc-ref alist key)
+                                (lambda (existing)
+                                  (cond
+                                   ((eq? type 'string)
+                                    (string-append existing " " value))
+                                   ((eq? type 'list)
+                                    (cons value existing)))))
+                         (cond
+                          ((eq? type 'string)
+                           value)
+                          ((eq? type 'list)
+                           (list value))))))
+                (acons key new (alist-delete key alist))))))
+       (call-with-input-file file
+         (lambda (port)
+           (let loop ((all (list))
+                      (current (list))
+                      (last-property #false))
+             (let ((line (read-line port)))
+               (cond
+                ((eof-object? line) all)
+
+                ;; End of record.
+                ((string-null? line)
+                 (loop (cons (cons (assoc-ref current 'name) current)
+                             all)
+                       (list) #false))
+
+                ;; Continuation of a list
+                ((and (zero? (string-index line #\space)) last-property)
+                 ;; Erase optional second part of list values like
+                 ;; "details=Readme" for files
+                 (let ((plain-value (first
+                                     (string-split
+                                      (string-trim-both line) #\space))))
+                   (loop all (record last-property
+                                     plain-value
+                                     current
+                                     'list)
+                         last-property)))
+                (else
+                 (or (and-let* ((space (string-index line #\space))
+                                (key (string->symbol (string-take line space)))
+                                (value (string-drop line (1+ space)))
+                                (field-type (assoc-ref fields key)))
+                       ;; Erase second part of list keys like "size=29"
+                       (if (eq? field-type 'list)
+                           (loop all current key)
+                           (loop all (record key value current field-type) key)))
+                     (loop all current #false))))))))))))
+
+(define (files->directories files)
+  (map (cut string-join <> "/" 'suffix)
+       (delete-duplicates (map (lambda (file)
+                                 (drop-right (string-split file #\/) 1))
+                               files)
+                          equal?)))
+
+(define (tlpdb->package name)
+  (and-let* ((data (assoc-ref (tlpdb) name))
+             (dirs (files->directories
+                    (append (or (assoc-ref data 'docfiles) (list))
+                            (or (assoc-ref data 'runfiles) (list))
+                            (or (assoc-ref data 'srcfiles) (list))))))
+    (pk data)
+    ;; TODO
+    `(package
+       (name ,(guix-name name))
+       (version (number->string %texlive-revision))
+       (source (texlive-origin name version
+                               ',dirs
+                               (base32
+                                "TODO"
+                                #;
+                                ,(bytevector->nix-base32-string
+                                  (let-values (((port get-hash) (open-sha256-port)))
+                                    (write-file checkout port)
+                                    (force-output port)
+                                    (get-hash))))))
+       (build-system texlive-build-system)
+       (arguments ,`(,'quote (#:tex-directory "TODO")))
+       ,@(or (and=> (assoc-ref data 'depend)
+                    (lambda (inputs)
+                      `((propagated-inputs ,inputs))))
+             '())
+       ,@(or (and=> (assoc-ref data 'catalogue-ctan)
+                    (lambda (url)
+                      `((home-page ,(string-append "https://ctan.org" url)))))
+             '((home-page "https://www.tug.org/texlive/")))
+       (synopsis ,(assoc-ref data 'shortdesc))
+       (description ,(beautify-description
+                      (assoc-ref data 'longdesc)))
+       (license ,(string->license
+                  (assoc-ref data 'catalogue-license))))))
+
 (define texlive->guix-package
   (memoize
    (lambda* (package-name #:optional (component "latex"))
      "Fetch the metadata for PACKAGE-NAME from REPO and return the `package'
 s-expression corresponding to that package, or #f on failure."
-     (and=> (fetch-sxml package-name)
-            (cut sxml->package <> component)))))
+     (tlpdb->package package-name))))
 
 ;;; ctan.scm ends here

--=-=-=--