From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#21829: guix import hackage failures
Date: Sun, 15 Nov 2015 21:59:37 +0100
Message-ID: <87k2pjq8qu.fsf@gnu.org>
In-Reply-To: (Federico Beffa's message of "Sat, 14 Nov 2015 15:37:35 +0100")
To: Federico Beffa
Cc: 21829@debbugs.gnu.org

Federico Beffa wrote:

> On Fri, Nov 13, 2015 at 10:19 PM, Ludovic Courtès wrote:
>> Federico Beffa wrote:

[...]

>> In practice this discards CR even if it’s not followed by LF; that’s
>> probably a good enough approximation, but an XXX comment would be
>> welcome.
>
> This is intentional because, in my ignorance, I only know of uses of
> '\r' before or after '\n'.  Do you know of any other use in text files?

ISTR that some OSes (MacOS 9 and earlier?! who cares?! :-)) use(d) a
single CR instead of a single LF.

Again that’s fine in practice I guess, but I always think it’s good to
add a note when we make an approximation so we can notice later, just in
case.

> The attached patches fix the parsing of all but two of the failures
> reported by Paul.
> Two cabal files are still not imported correctly because they are buggy:
>
> * streaming-commons: indentation changes from 4 to 2. But this is
> explicitly forbidden. From [1]: "Field names may be indented, but all
> field values in the same section must use the same indentation."
>
> * fgl: uses braces to delimit the value of a field. As far as I
> understand this is not allowed by [1]: "To continue a field value,
> indent the next line relative to the field name." and "Flags,
> conditionals, library and executable sections use layout to indicate
> structure. ... As an alternative to using layout you can also use
> explicit braces {}. ". Thus I understand that braces may be used to
> delimit sections, not field values.

Fair enough!

> Obviously the official 'cabal' program is more permissive than the
> description in the documentation.

We’re more royalist than the king!  ;-)

> From d13f06383d07e0ad4096ff7eb715264463738b0c Mon Sep 17 00:00:00 2001
> From: Federico Beffa
> Date: Wed, 11 Nov 2015 10:39:38 +0100
> Subject: [PATCH 1/6] import: hackage: Add recognition of 'true' and 'false'
> symbols.
>
> * guix/import/cabal.scm (is-true, is-false, lex-true, lex-false): New procedures.
> (lex-word): Use them.
> (make-cabal-parser): Add TRUE and FALSE tokens.
> (eval): Add entries for 'true and 'false symbols.

LGTM.

> From 445f1b6197c0e266027ac033c52629d990137171 Mon Sep 17 00:00:00 2001
> From: Federico Beffa
> Date: Wed, 11 Nov 2015 11:22:42 +0100
> Subject: [PATCH 2/6] import: hackage: Imporve parsing of tests.
>
> * guix/import/cabal.scm (lex-word): Add support for tests with no spaces.
> (impl): Fix handling of operator "==".

LGTM, but I think it’d be great to add a test that illustrates the case
that this fixes (and to make sure it doesn’t come back later.)

> From f796d814821289a98e401a3e3df13334a2e8689b Mon Sep 17 00:00:00 2001
> From: Federico Beffa
> Date: Wed, 11 Nov 2015 15:31:46 +0100
> Subject: [PATCH 3/6] import: hackage: Make it resilient to missing final
> newline.
>
> * guix/import/cabal.scm (peek-next-line-indent): Check for missing final
> newline.

[...]

> +  (if (eof-object? (peek-char port))
> +      ;; If the file is missing the #\newline on the last line, add it and act
> +      ;; as if it were there. This is needed for propoer operation of
                                                    ^^^^^^^
Typo.

> +      ;; indentation based block recognition.
> +      (begin (unread-char #\newline port) (read-char port) 0)

Isn’t this equivalent to:

  0

?

Could you add a test for this one?

> From 225164d2355afd6f9455251d87cbd34b08f68cdb Mon Sep 17 00:00:00 2001
> From: Federico Beffa
> Date: Wed, 11 Nov 2015 16:20:45 +0100
> Subject: [PATCH 4/6] import: hackage: Make parsing of tests and fields more
> flexible.
>
> * guix/import/cabal.scm (is-test): Allow spaces between keyword and
> parentheses.
> (is-id): Add argument 'port'. Allow spaces between keyword and column.
> (lex-word): Adjust call to 'is-id'.

LGTM, and would be perfect with a test.  ;-)

> From 1b26410f4a7a920382750bffbf5381394acafdbc Mon Sep 17 00:00:00 2001
> From: Federico Beffa
> Date: Sat, 14 Nov 2015 15:00:36 +0100
> Subject: [PATCH 5/6] utils: Add 'canonical-newline-port'.
>
> * guix/utils.scm (canonical-newline-port): New procedure.
> * tests/utils.scm ("canonical-newline-port"): New test.

[...]

> +(test-equal "canonical-newline-port"
> +  "This is a journey"
> +  (let ((port (open-string-input-port
> +               "This is a journey\r\n")))
> +    (get-line (canonical-newline-port port))))

I would rather use ‘get-string-all’ and make sure the result is exactly:

  "This is a journey\n"

(Because ‘get-line’ could have been doing its own thing regardless of
the EOL style.)

A test with several lines, including lines with just \n would be nice.

> From c57be8cae9b3642beff1462acd32a0aee54ad7c6 Mon Sep 17 00:00:00 2001
> From: Federico Beffa
> Date: Sat, 14 Nov 2015 15:15:00 +0100
> Subject: [PATCH 6/6] import: hackage: Handle CRLF end of line style.
>
> * guix/import/hackage.scm (hackage-fetch, hackage->guix-package): Do it.

Rather “Use ‘canonical-newline-port’.” instead of “Do it.”

Thanks for all the work!

Ludo’.
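
For illustration, the stricter ‘canonical-newline-port’ test suggested
above might look roughly like the sketch below.  It assumes that the
patched (guix utils) module from patch 5/6 is on the load path and
exports ‘canonical-newline-port’; the test name and the second and
third input lines are made up for the example.

```scheme
(use-modules (srfi srfi-64)     ; test-begin, test-equal, test-end
             (rnrs io ports)    ; open-string-input-port, get-string-all
             (guix utils))      ; canonical-newline-port (assumed, patch 5/6)

(test-begin "canonical-newline-port")

;; Read the whole stream back so the comparison covers every
;; end-of-line sequence, not just the first line.  The input mixes
;; CRLF and bare LF endings; the expected string contains LF only.
(test-equal "canonical-newline-port, mixed EOL styles"
  "This is a journey\nsecond line\nthird line\n"
  (let ((port (open-string-input-port
               "This is a journey\r\nsecond line\nthird line\r\n")))
    (get-string-all (canonical-newline-port port))))

(test-end "canonical-newline-port")
```

Comparing against the complete string, trailing newline included, is
what separates the port's behavior from whatever ‘get-line’ would have
done on its own.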