From: ludo@gnu.org (Ludovic Courtès)
Subject: Re: hackage importer
Date: Tue, 31 Mar 2015 15:33:54 +0200
Message-ID: <87zj6t9tq5.fsf@gnu.org>
References: <87k2yiqqaw.fsf@gnu.org> <871tkbdhwc.fsf@gnu.org> <871tk7ykfj.fsf@gnu.org>
In-Reply-To: (Federico Beffa's message of "Sun, 29 Mar 2015 18:55:34 +0200")
To: Federico Beffa
Cc: Guix-devel

Federico Beffa skribis:

> On Sun, Mar 29, 2015 at 3:58 PM, Ludovic Courtès wrote:
>>> On Thu, Mar 26, 2015 at 2:09 PM, Ludovic Courtès wrote:
>>>> Could you post the actual backtrace you get (?) when running the program
>>>> with LC_ALL=C?
>>>
>>> It doesn't backtrace, the function just gives the wrong result.
>>
>> Hmm, OK. Still sounds like an encoding error.
>
> After changing the character that I mentioned in the previous email it
> works correctly with LC_ALL=C.

Well, OK. We may still be doing something wrong wrt. encoding/decoding,
but I’m not sure what.

>>> Before working further on improving the interface, I want first to
>>> understand what are the root causes of the errors (especially the one
>>> causing the backtrace) and fix them.
>
> The problems turned out to be related to:
> * the use of TABs in some .cabal files. I've now updated a couple of regexps.
> * The following odd indentation which confused the parsing (this is
>   the one which caused the backtrace):
>
>   build-depends:
>     base >= 4.3 && < 4.9
>     , bytestring
>     , filepath
>   ...
>
> I've now improved the algorithm which can now handle this odd indentation.

Nice. TABs and odd indentation probably make good additional test cases
to have in tests/cabal.scm.

> I've now tested the importer with ca. 40 packages and (I believe) they
> are all handled without errors.

Woohoo!

>> Would it be possible for ‘read-cabal’ to instead return a
>> tree-structured sexp like:
>>
>>   (if (os windows)
>>       (build-depends (Win32 >= 2 && < 3))
>>       (build-depends (unix >= 2.0 && < 2.8)))
>>
>> That would use a variant of ‘conditional->sexp-like’, essentially.
>>
>> (Of course, to achieve that the parser must keep track of indentation
>> levels and everything, as you already implemented; I’m just commenting
>> on the interface here.)
>>
>> Then, I could imagine:
>>
>>   (eval-cabal '(("name" "foo")
>>                 ("version" "1.0")
>>                 ("executable cabal" (if (os windows) ...))))
>>   => #
>>
>> This way the structure of the Cabal file would be preserved, only
>> converted to sexp form, which is easier to work with.
>>
>> Does that make sense?
>
> To be honest, I'm not sure I understand what you would like to achieve.

It’s really just about the architecture and layers of code.

> 'read-cabal' returns an object and, according to your proposal, you
> would like a function '(eval-cabal object)' returning a package. In
> the code that is exactly what '(hackage-module->sexp object)' does. Is
> it a matter of naming?
> (I've taken names from the python and perl
> build systems, but of course I can change them if desired.)

I think it’s a matter of separating concerns. In my mind there are
three distinct layers:

  1. Cabal parsing (what I call ‘read-cabal’, because it’s the
     equivalent of ‘read’);

  2. Cabal evaluation/instantiation for a certain set of flags, OS,
     etc. (what I call ‘eval-cabal’ because it’s the equivalent of
     ‘eval’);

  3. Conversion of Cabal packages to Guix package sexps.

My concern was about making sure these three phases were clearly visible
in the code.

To put it differently, #1 and #2 would conceptually be part of a Cabal
parsing/evaluation library, while #3 would be the only Guix-specific
part.

> Right now 'read-cabal' is fairly simple and easy to read and debug.
> Some complexity for the evaluation of conditionals is postponed and
> handled by the function '(dependencies-cond->sexp object)' which is
> used internally by '(hackage-module->sexp object)' to create the
> package.
>
> As far as I understand, you would like 'read-cabal' to directly
> evaluate conditionals.

No, precisely not. I’m saying the result of ‘read-cabal’ should include
an AST of conditionals; that AST would be evaluated by ‘eval-cabal’.

Anyway, I’ve probably used enough of your time by now. :-)

If this discussion gives you ideas on how to structure the code, that is
fine, but otherwise we can probably go with the architecture you
propose.

How does that sound?

Thanks,
Ludo’.
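
PS: To make the split between layers 1 and 2 a bit more concrete, here
is a minimal sketch, not a proposal for the actual code. It assumes the
parser hands back conditionals as sexps shaped like the ‘if’ example
quoted above. The names ‘eval-condition’ and ‘eval-conditional’, and the
representation of the environment as an alist such as ((os . linux)),
are made up for illustration; only the ‘os’ test and ‘not’/‘and’/‘or’
are handled.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: decide whether CONDITION holds in ENV, an alist
;; such as ((os . linux)).  Only a few condition forms are covered.
(define (eval-condition condition env)
  (match condition
    (('os name) (eq? name (assq-ref env 'os)))
    (('not c) (not (eval-condition c env)))
    (('and c1 c2) (and (eval-condition c1 env) (eval-condition c2 env)))
    (('or c1 c2) (or (eval-condition c1 env) (eval-condition c2 env)))))

;; Hypothetical evaluator: reduce TREE, which is either a plain field
;; such as (build-depends ...) or an 'if' form, to the list of fields
;; that apply in ENV.  The tree itself is never modified.
(define (eval-conditional tree env)
  (match tree
    (('if condition consequent alternate)
     (eval-conditional (if (eval-condition condition env)
                           consequent
                           alternate)
                       env))
    (('if condition consequent)
     (if (eval-condition condition env)
         (eval-conditional consequent env)
         '()))
    (field (list field))))

;; The conditional from the example above, as plain data.
(define tree
  '(if (os windows)
       (build-depends (Win32 >= 2 && < 3))
       (build-depends (unix >= 2.0 && < 2.8))))

(eval-conditional tree '((os . linux)))
;; => ((build-depends (unix >= 2.0 && < 2.8)))
```

The point is only that the parsed tree stays untouched data, so the same
tree can be evaluated for several OSes or flag sets, and nothing in
either procedure knows about Guix packages.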