From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ricardo Wurmus Subject: bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case. Date: Mon, 27 May 2019 17:11:33 +0200 Message-ID: References: <87pnod7ot4.fsf@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Return-path: Received: from eggs.gnu.org ([209.51.188.92]:52511) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hVHID-0003wm-0H for bug-guix@gnu.org; Mon, 27 May 2019 11:12:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hVHIB-00068B-Pd for bug-guix@gnu.org; Mon, 27 May 2019 11:12:05 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:40251) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hVHIB-000681-NJ for bug-guix@gnu.org; Mon, 27 May 2019 11:12:03 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1hVHIA-0000MA-BD for bug-guix@gnu.org; Mon, 27 May 2019 11:12:03 -0400 Sender: "Debbugs-submit" Resent-Message-ID: In-Reply-To: <87pnod7ot4.fsf@gmail.com> List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guix-bounces+gcggb-bug-guix=m.gmane.org@gnu.org Sender: "bug-Guix" To: Maxim Cournoyer Cc: 24450@debbugs.gnu.org Hi Maxim, on to patch number 2! > From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001 > From: Maxim Cournoyer > Date: Thu, 28 Mar 2019 00:26:00 -0400 > Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements from > source. > > * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT. > (guess-requirements): Move the READ-REQUIREMENTS procedure to the top lev= el, > and rename it to PARSE-REQUIRES.TXT. Move the CLEAN-REQUIREMENT and COMM= ENT? > functions inside the READ-REQUIREMENTS procedure. > (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to prev= ent > parsing optional requirements. > > * tests/pypi.scm (test-requires-with-sections): New variable. > ("parse-requires.txt, with sections"): New test. > ("pypi->guix-package"): Mute tar output to stdout. The commit log does not match the changes. CLEAN-REQUIREMENT is now a top-level procedure, not a local procedure inside of READ-REQUIREMENTS as reported in the commit message. Which is correct? > + (call-with-input-file requires.txt > + (lambda (port) > + (let loop ((result '())) > + (let ((line (read-line port))) > + ;; Stop when a section is encountered, as sections contains op= tional Should be =E2=80=9Ccontain=E2=80=9D. > + ;; (extra) requirements. Non-optional requirements must appear > + ;; before any section is defined. > + (if (or (eof-object? line) (section-header? line)) > + (reverse result) > + (cond > + ((or (string-null? line) (comment? line)) > + (loop result)) > + (else > + (loop (cons (clean-requirement line) > + result)))))))))) > + I think it would be better to use =E2=80=9Cmatch=E2=80=9D here instead of n= ested =E2=80=9Clet=E2=80=9D, =E2=80=9Cif=E2=80=9D and =E2=80=9Ccond=E2=80=9D. At least you can drop the= =E2=80=9Cif=E2=80=9D and just use cond. The loop let and the inner let can be merged. > +(define (parse-requires.txt requires.txt) > + "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of > +requirement names." > + ;; This is a very incomplete parser, which job is to select the non-op= tional =E2=80=9Cwhich=E2=80=9D =E2=80=93> =E2=80=9Cwhose=E2=80=9D > + ;; dependencies and strip them out of any version information. > + ;; Alternatively, we could implement a PEG parser with the (ice-9 peg) > + ;; library and the requirements grammar defined by PEP-0508 > + ;; (https://www.python.org/dev/peps/pep-0508/). Let=E2=80=99s remove the sentence starting with =E2=80=9CAlternatively=E2= =80=A6=E2=80=9D. We could do that but we didn=E2=80=99t :) > + (define (section-header? line) > + ;; Return #t if the given LINE is a section header, #f otherwise. > + (let ((trimmed-line (string-trim line))) > + (and (not (string-null? trimmed-line)) > + (eq? (string-ref trimmed-line 0) #\[)))) > + How about using string-prefix? instead? This looks more complicated than it deserves. You can get rid of string-null? and eq? and string-ref and all that. Same here: > + (define (comment? line) > + ;; Return #t if the given LINE is a comment, #f otherwise. > + (eq? (string-ref (string-trim line) 0) #\#)) I=E2=80=99d just use string-prefix? here. > +(define (clean-requirement s) > + ;; Given a requirement LINE, as can be found in a setuptools requires.= txt > + ;; file, remove everything other than the actual name of the required > + ;; package, and return it. > + (string-take s (or (string-index s (lambda (chr) > + (member chr '(#\space #\> #\=3D #= \<)))) > + (string-length s)))) =E2=80=9Cstring-take=E2=80=9D with =E2=80=9Cstring-length=E2=80=9D is not v= ery elegant. The char predicate in string-index could better be a char set: --8<---------------cut here---------------start------------->8--- (define (clean-requirement s) (cond ((string-index s (char-set #\space #\> #\=3D #\<)) =3D> (cut string-take = s <>)) (else s))) --8<---------------cut here---------------end--------------->8--- > ("pypi->guix-package"): Mute tar output to stdout. Finally, I think it would be better to keep this separate because it=E2=80= =99s really orthogonal to the other changes in this patch. What do you think? --=20 Ricardo