From: Federico Beffa
Subject: Re: hackage importer
Date: Sun, 29 Mar 2015 18:55:34 +0200
To: Ludovic Courtès
Cc: Guix-devel

On Sun, Mar 29, 2015 at 3:58 PM, Ludovic Courtès wrote:
>> On Thu, Mar 26, 2015 at 2:09 PM, Ludovic Courtès wrote:
>>> Could you post the actual backtrace you get (?) when running the program
>>> with LC_ALL=C?
>>
>> It doesn't backtrace, the function just gives the wrong result.
>
> Hmm, OK.  Still sounds like an encoding error.

After changing the character that I mentioned in the previous email it
works correctly with LC_ALL=C.

>> Before working further on improving the interface, I want first to
>> understand what are the root causes of the errors (especially the one
>> causing the backtrace) and fix them.

The problems turned out to be related to:

* The use of TABs in some .cabal files. I've now updated a couple of
  regexps.
* The following odd indentation, which confused the parser (this is the
  one which caused the backtrace):

  build-depends: base >= 4.3 && < 4.9
               , bytestring
               , filepath
               ...
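For illustration only, here is a minimal Python sketch of one way to
cope with both issues (TABs and leading-comma continuation lines). It is
not the importer's Guile code; the function name parse_build_depends and
the rule "a continuation line is any line indented deeper than the field
name" are assumptions made for this example. Cabal comments and nested
conditionals are ignored.

```python
import re

# Matches a "build-depends:" field and captures its indentation and the
# text that follows the colon on the same line.
FIELD = re.compile(r"^(\s*)build-depends\s*:(.*)$", re.IGNORECASE)


def parse_build_depends(text):
    """Return the build-depends entries found in TEXT as a list of strings.

    Handles both trailing-comma and leading-comma layouts.
    """
    deps = []
    field_indent = None  # indentation of the field name while inside it
    for raw in text.splitlines():
        line = raw.expandtabs()  # treat TABs as ordinary indentation
        if not line.strip():
            continue  # blank lines do not end the field
        indent = len(line) - len(line.lstrip())
        match = FIELD.match(line)
        if match:
            field_indent = len(match.group(1))
            rest = match.group(2)
        elif field_indent is not None and indent > field_indent:
            rest = line  # deeper indentation: continuation of the field
        else:
            field_indent = None  # another field or section starts here
            continue
        # Splitting on commas works for leading and trailing commas alike;
        # empty pieces (e.g. before a leading comma) are dropped.
        deps.extend(p.strip() for p in rest.split(",") if p.strip())
    return deps


sample = (
    "library\n"
    "  build-depends:   base >= 4.3 && < 4.9\n"
    "                 , bytestring\n"
    "\t\t , filepath\n"
    "  exposed-modules: Foo\n"
)
print(parse_build_depends(sample))
# prints ['base >= 4.3 && < 4.9', 'bytestring', 'filepath']
```

The point of the sketch is that the comma position does not matter once
continuation lines are recognised by indentation rather than by how the
previous line ends.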
I've now improved the algorithm, which can now handle this odd
indentation.

I've now tested the importer with ca. 40 packages and (I believe) they
are all handled without errors.

> OK.  I would rather have ‘read-cabal’ take an input port (like Scheme’s
> ‘read’) and return the list above; this would be the least surprising,
> more idiomatic approach.  ‘strip-cabal’ (or
> ‘strip-insignificant-lines’?) would be an internal procedure used only
> by ‘read-cabal’.

That's no problem. The way it is right now makes it easier to test in
the REPL, but is in no way essential.

> Would it be possible for ‘read-cabal’ to instead return a
> tree-structured sexp like:
>
>   (if (os windows)
>       (build-depends (Win32 >= 2 && < 3))
>       (build-depends (unix >= 2.0 && < 2.8)))
>
> That would use a variant of ‘conditional->sexp-like’, essentially.
>
> (Of course to achieve that the parser must keep track of indentation
> levels and everything, as you already implemented; I’m just commenting
> on the interface here.)
>
> Then, if I could imagine:
>
>   (eval-cabal '(("name" "foo")
>                 ("version" "1.0"
>                 ("executable cabal" (if (os windows) ...)))
>   => #
>
> This way the structure of the Cabal file would be preserved, only
> converted to sexp form, which is easier to work with.
>
> Does that make sense?

To be honest, I'm not sure I understand what you would like to achieve.
'read-cabal' returns an object and, according to your proposal, you
would like a function '(eval-cabal object)' returning a package. In the
code that is exactly what '(hackage-module->sexp object)' does. Is it a
matter of naming? (I've taken names from the python and perl build
systems, but of course I can change them if desired.)

On the representation of the object: right now 'read-cabal' is fairly
simple and easy to read and debug.
Some complexity for the evaluation of conditionals is postponed and
handled by the function '(dependencies-cond->sexp object)', which is
used internally by '(hackage-module->sexp object)' to create the
package.

As far as I understand, you would like 'read-cabal' to directly evaluate
conditionals. To achieve that, essentially all of the functionality of
'(dependencies-cond->sexp object)' would have to be included in it,
making 'read-cabal' a substantially more complex function and
simplifying the work of "later" functions. So, as I see it, we would
just move the complexity from one function to another.

In addition, with the current approach, within
'(dependencies-cond->sexp object)':

(i) I can easily discard everything not related to dependencies before
    handling the conditionals of interest.
(ii) I have all the cabal file flags, even if they come after the
     conditional in the file.

If I'm completely missing the point, could you please be more verbose
in your explanation?

Thanks for your patience!

Regards,
Fede