From mboxrd@z Thu Jan 1 00:00:00 1970 From: Maxim Cournoyer Subject: bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case. Date: Mon, 10 Jun 2019 12:30:47 +0900 Message-ID: <87muiq5d7c.fsf@gmail.com> References: <87pnod7ot4.fsf@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Return-path: Received: from eggs.gnu.org ([2001:470:142:3::10]:42910) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1haB1T-00064j-U5 for bug-guix@gnu.org; Sun, 09 Jun 2019 23:31:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1haB1S-0002Zk-Hd for bug-guix@gnu.org; Sun, 09 Jun 2019 23:31:03 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:42238) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1haB1S-0002ZL-Aw for bug-guix@gnu.org; Sun, 09 Jun 2019 23:31:02 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1haB1S-0006SN-1a for bug-guix@gnu.org; Sun, 09 Jun 2019 23:31:02 -0400 Sender: "Debbugs-submit" Resent-Message-ID: In-Reply-To: (Ricardo Wurmus's message of "Mon, 27 May 2019 17:11:33 +0200") List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guix-bounces+gcggb-bug-guix=m.gmane.org@gnu.org Sender: "bug-Guix" To: Ricardo Wurmus Cc: 24450@debbugs.gnu.org Hello again! Ricardo Wurmus writes: > Hi Maxim, > > on to patch number 2! Yay! >> From 5f79b0502f62bd1dacc8ea143c1dbd9ef7cfc29d Mon Sep 17 00:00:00 2001 >> From: Maxim Cournoyer >> Date: Thu, 28 Mar 2019 00:26:00 -0400 >> Subject: [PATCH 2/9] import: pypi: Do not parse optional requirements fr= om >> source. >> >> * guix/import/pypi.scm: Export PARSE-REQUIRES.TXT. >> (guess-requirements): Move the READ-REQUIREMENTS procedure to the top le= vel, >> and rename it to PARSE-REQUIRES.TXT. Move the CLEAN-REQUIREMENT and COM= MENT? >> functions inside the READ-REQUIREMENTS procedure. >> (parse-requires.txt): Add a SECTION-HEADER? predicate, and use it to pre= vent >> parsing optional requirements. >> >> * tests/pypi.scm (test-requires-with-sections): New variable. >> ("parse-requires.txt, with sections"): New test. >> ("pypi->guix-package"): Mute tar output to stdout. > > The commit log does not match the changes. CLEAN-REQUIREMENT is now a > top-level procedure, not a local procedure inside of READ-REQUIREMENTS > as reported in the commit message. Which is correct? Fixed. >> + (call-with-input-file requires.txt >> + (lambda (port) >> + (let loop ((result '())) >> + (let ((line (read-line port))) >> + ;; Stop when a section is encountered, as sections contains o= ptional > > Should be =E2=80=9Ccontain=E2=80=9D. Fixed. >> + ;; (extra) requirements. Non-optional requirements must appe= ar >> + ;; before any section is defined. >> + (if (or (eof-object? line) (section-header? line)) >> + (reverse result) >> + (cond >> + ((or (string-null? line) (comment? line)) >> + (loop result)) >> + (else >> + (loop (cons (clean-requirement line) >> + result)))))))))) >> + > > I think it would be better to use =E2=80=9Cmatch=E2=80=9D here instead of= nested =E2=80=9Clet=E2=80=9D, > =E2=80=9Cif=E2=80=9D and =E2=80=9Ccond=E2=80=9D. At least you can drop t= he =E2=80=9Cif=E2=80=9D and just use cond. > > The loop let and the inner let can be merged. I'm not sure I understand; wouldn't merging the named let with the plain let mean adding an extra LINE argument to my LOOP procedure? I don't want that. Also, how could the above code be expressed using "match"? I'm using predicates which tests for (special) characters in a string; I don't see how the more primitive pattern language of "match" will enable me to do the same. >> +(define (parse-requires.txt requires.txt) >> + "Given REQUIRES.TXT, a Setuptools requires.txt file, return a list of >> +requirement names." >> + ;; This is a very incomplete parser, which job is to select the non-o= ptional > > =E2=80=9Cwhich=E2=80=9D =E2=80=93> =E2=80=9Cwhose=E2=80=9D Fixed, with due diligence reading on English grammar ;-) >> + ;; dependencies and strip them out of any version information. >> + ;; Alternatively, we could implement a PEG parser with the (ice-9 peg) >> + ;; library and the requirements grammar defined by PEP-0508 >> + ;; (https://www.python.org/dev/peps/pep-0508/). > > Let=E2=80=99s remove the sentence starting with =E2=80=9CAlternatively=E2= =80=A6=E2=80=9D. We could do > that but we didn=E2=80=99t :) Alright; done! >> + (define (section-header? line) >> + ;; Return #t if the given LINE is a section header, #f otherwise. >> + (let ((trimmed-line (string-trim line))) >> + (and (not (string-null? trimmed-line)) >> + (eq? (string-ref trimmed-line 0) #\[)))) >> + > > How about using string-prefix? instead? This looks more complicated > than it deserves. You can get rid of string-null? and eq? and string-ref > and all that. > > Same here: > >> + (define (comment? line) >> + ;; Return #t if the given LINE is a comment, #f otherwise. >> + (eq? (string-ref (string-trim line) 0) #\#)) > > I=E2=80=99d just use string-prefix? here. Neat! Adjusted, for both counts. >> +(define (clean-requirement s) >> + ;; Given a requirement LINE, as can be found in a setuptools requires= .txt >> + ;; file, remove everything other than the actual name of the required >> + ;; package, and return it. >> + (string-take s (or (string-index s (lambda (chr) >> + (member chr '(#\space #\> #\=3D = #\<)))) >> + (string-length s)))) > > =E2=80=9Cstring-take=E2=80=9D with =E2=80=9Cstring-length=E2=80=9D is not= very elegant. The char > predicate in string-index could better be a char set: > > (define (clean-requirement s) > (cond > ((string-index s (char-set #\space #\> #\=3D #\<)) =3D> (cut string-tak= e s <>)) > (else s))) That's nicer, thanks! >> ("pypi->guix-package"): Mute tar output to stdout. > > Finally, I think it would be better to keep this separate because it=E2= =80=99s > really orthogonal to the other changes in this patch. OK, done! > What do you think? All good points; I just don't understand the one about using a match and/or merging the regular "let" with the named "let". Thanks, Maxim