unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: Federico Beffa <beffa@ieee.org>
Cc: Guix-devel <guix-devel@gnu.org>
Subject: Re: hackage importer
Date: Thu, 26 Mar 2015 14:09:55 +0100	[thread overview]
Message-ID: <871tkbdhwc.fsf@gnu.org> (raw)
In-Reply-To: <CAKrPhPM3Hk_vpvyxYdDX3XgPnhvydaLt0HWijPoUJ8LTO4DRkQ@mail.gmail.com> (Federico Beffa's message of "Sun, 22 Mar 2015 21:12:44 +0100")

Federico Beffa <beffa@ieee.org> skribis:

> I've created a test for hackage.  In doing so I've had an issue with the
> locale and would like an advice: When I run "guix import hackage
> package-name" from the shell, or when I invoke a REPL with Geiser
> everything works fine.  When I initially did run the test-suite with
>
> make check TESTS=tests/hackage.scm
>
> the conditionals conversion looked wrong.  I've found that this is
> related to the choice of locale.  To make the test work as desired I
> added '(setlocale LC_ALL "en_US.UTF-8")' to the file declaring the
> tests.

What’s the exact error you were getting?  Character classes in regexps
are locale sensitive, I think, which could be one reason things behave
differently.  There’s also the problem that, when running in the C
locale, ‘regexp-exec’ (and other C-implemented procedures) cannot be
passed any string that is not strictly ASCII.

Could you post the actual backtrace you get (?) when running the program
with LC_ALL=C?

[...]

>> I would expect the conversion of conditional expressions to sexps to be
>> done in the parsing phase above, such that ‘read-cabal’ returns an
>> object with some sort of an AST for those conditionals.
>>
>> Then this part would focus on the evaluation of those conditionals,
>> like:
>>
>>   ;; Evaluate the conditionals in CABAL according to FLAGS.  Return an
>>   ;; evaluated Cabal object.
>>   (eval-cabal cabal flags)
>>
>> WDYT?
>
> I think it's better to keep the parsing phase to a bare minimum and work
> with layers to keep each function as manageable and simple as possible.
>
> The way it works is as follows:
>
> 1. 'strip-cabal' reads the file, strips comment and empty lines and
> return a list.  Each element of the list is a line from the Cabal file.
>
> 2. 'read-cabal' takes the above list and does the parsing required to
> read the indentation based structure of the file. It returns a list
> composed of list pairs. The pair is composed by a list of keys and a
> list of values. For example, the following snippet
>
> name: foo
> version: 1.0
> ...
>
> is returned as
>
> ((("name")    ("foo"))
>  (("version") ("1.0"))
>  ...)

OK.  I would rather have ‘read-cabal’ take an input port (like Scheme’s
‘read’) and return the list above; this would be the least surprising,
more idiomatic approach.  ‘strip-cabal’ (or
‘strip-insignificant-lines’?) would be an internal procedure used only
by ‘read-cabal’.

WDYT?

> This is enough for all the information that we need to build a Guix
> package, but dependencies.
>
> Dependencies are listed in 'Library' and 'Executable cabal' sections of
> the Cabal file as in the following example snippet:
>
> executable cabal
>   build-depends:
>     HTTP       >= 4000.2.5 && < 4000.3
> ...
>
> and may include conditionals as in (continuing from above with the
> proper indentation)
>
>     if os(windows)
>       build-depends: Win32 >= 2 && < 3
>     else
>       build-depends: unix >= 2.0 && < 2.8
> ...
>
> Now, to make sense of the indentation based structure I need to keep a
> state indicating how many indentation levels we currently have and the
> name of the various sub-sections. For this reason I keep a one to one
> correspondence between indentation levels and section titles. That means
> that the above snipped is encoded as follows in my Cabal object:
>
> ((("executable cabal" "build-depends:")                  ("HTTP..."))
>  (("executable cabal" "if os(windows)" "build-depends:") ("Win32 ..."))
>  (("executable cabal" "else"           "build-depends:") ("unix ..."))
>  ...)
>
> If I split 'if' from the predicate 'os(windows)' then the level of
> indentation and the corresponding section header loose synchronization
> (different length).

Would it be possible for ‘read-cabal’ to instead return a
tree-structured sexp like:

  (if (os windows)
      (build-depends (Win32 >= 2 && < 3))
      (build-depends (unix >= 2.0 && < 2.8)))

That would use a variant of ‘conditional->sexp-like’, essentially.

(Of course the achieve that the parser must keep track of indentation
levels and everything, as you already implemented; I’m just commenting
on the interface here.)

Then, if I could imagine:

  (eval-cabal '(("name" "foo")
                ("version" "1.0"
                ("executable cabal" (if (os windows) ...)))
  => #<cabal-package name: "foo" dependencies: '(unix)>

This way the structure of the Cabal file would be preserved, only
converted to sexp form, which is easier to work with.

Does that make sense?

> For this reason I only convert the predicates from Cabal syntax to
> Scheme syntax when I take the list of dependencies and convert the list
> in the form expected in a guix package.
>
> I hope to have clarified the strategy.
>
>>> +(define (guix-name name)
>>
>> Rather ‘hackage-name->package-name’?
>
> OK
>
>>> +;; Split the comma separated list of dependencies coming from the cabal file
>>> +;; and return a list with inputs suitable for the GUIX package.  Currently the
>>> +;; version information is discarded.
>>
>> s/GUIX/Guix/
>
> OK
>
>>> +(define (split-dependencies ls)
>>> +  (define (split-at-comma d)
>>> +    (map
>>> +     (lambda (m)
>>> +       (let ((name (guix-name (match:substring m 1))))
>>> +         (list name (list 'unquote (string->symbol name)))))
>>> +     (list-matches dependencies-rx d)))
>>
>> I think it could use simply:
>>
>>   (map string-trim-both
>>        (string-tokenize d (char-set-complement (char-set #\,))))
>
> Actually, this wouldn't do.  On top of splitting at commas, the function also
> translates the name of the package from hackage-name to guix-name.

Right, but then it’s best to have two separate procedures and two
compose them:

  (map (compose hackage-name->package-name string-trim-both)
       (string-tokenize d (char-set-complement (char-set #\,))))

> From 231ea64519505f84e252a4b3ab14d3857a9374c2 Mon Sep 17 00:00:00 2001
> From: Federico Beffa <beffa@fbengineering.ch>
> Date: Sat, 7 Mar 2015 17:23:14 +0100
> Subject: [PATCH 1/5] import: Add hackage importer.
> 
> * guix/scripts/import.scm (importers): Add hackage.

[...]

> From ad79b3b0bc8b08ec98f49b665b32ce9259516713 Mon Sep 17 00:00:00 2001
> From: Federico Beffa <beffa@fbengineering.ch>
> Date: Sat, 7 Mar 2015 17:30:17 +0100
> Subject: [PATCH 2/5] import: Add hackage importer.
> 
> * guix/scripts/import/hackage.scm: New file.

These two should be one patch, along with the modification of
POTFILES.in, because these 3 things correspond to a single logical
change–the addition of a sub-command and associated infrastructure.  A
fourth part is also needed: the addition of a couple of paragraphs in
guix.texi.

> From 447c6b8f1ee6cc1d4edba44cabef1b28b75c7608 Mon Sep 17 00:00:00 2001
> From: Federico Beffa <beffa@fbengineering.ch>
> Date: Sun, 8 Mar 2015 07:48:38 +0100
> Subject: [PATCH 3/5] import: Add hackage importer.
> 
> * guix/import/hackage.scm: New file.

Please add tests/hackage.test as part of the same patch.

> +;; (use-modules (ice-9 match))
> +;; (use-modules (ice-9 regex))
> +;; (use-modules (ice-9 rdelim))
> +;; (use-modules (ice-9 receive))
> +;; (use-modules (ice-9 pretty-print))
> +;; (use-modules (srfi srfi-26))
> +;; (use-modules (srfi srfi-11))
> +;; (use-modules (srfi srfi-1))

Debugging leftover?

Nothing to add to what I wrote above.


[...]

> +(define test-cabal-1
> +  "name: foo
> +version: 1.0.0
> +homepage: http://test.org
> +synopsis: synopsis
> +description: description
> +license: BSD3
> +executable cabal
> +  build-depends:
> +    HTTP       >= 4000.2.5 && < 4000.3,
> +    mtl        >= 2.0      && < 3
> +")
> +
> +(define test-cond-1
> +  "(os(darwin) || !(flag(debug))) && flag(cips)")
> +
> +(define read-cabal
> +  (@@ (guix import hackage) read-cabal))
> +
> +(define strip-cabal
> +  (@@ (guix import hackage) strip-cabal))
> +
> +(define eval-tests
> +  (@@ (guix import hackage) eval-tests))
> +
> +(define eval-impl
> +  (@@ (guix import hackage) eval-impl))
> +
> +(define eval-flags
> +  (@@ (guix import hackage) eval-flags))
> +
> +(define conditional->sexp-like
> +  (@@ (guix import hackage) conditional->sexp-like))

I think it would be fine to export ‘read-cabal’ and ‘eval-cabal’ once
these two have the desired interface.

> +(test-begin "hackage")
> +
> +(test-assert "hackage->guix-package"
> +  ;; Replace network resources with sample data.
> +  (mock
> +   ((guix import hackage) hackage-fetch
> +    (lambda (name-version)
> +      (read-cabal
> +       (call-with-input-string test-cabal-1
> +         strip-cabal))))
> +    (match (hackage->guix-package "foo")
> +      (('package
> +         ('name "ghc-foo")
> +         ('version "1.0.0")
> +         ('source
> +          ('origin
> +            ('method 'url-fetch)
> +            ('uri ('string-append
> +                  "http://hackage.haskell.org/package/foo/foo-"
> +                  'version
> +                  ".tar.gz"))
> +            ('sha256
> +             ('base32
> +              (? string? hash)))))
> +         ('build-system 'haskell-build-system)
> +         ('inputs
> +          ('quasiquote
> +           (("ghc-http" ('unquote 'ghc-http))
> +            ("ghc-mtl" ('unquote 'ghc-mtl)))))
> +         ('home-page "http://test.org")
> +         ('synopsis (? string?))
> +         ('description (? string?))
> +         ('license 'bsd-3))
> +        #t)

Nice.

With the adjusted semantics, there could be lower-level tests:

  (test-equal "read-cabal"
    '(("name" "foo") ...)
    (call-with-input-string test-cabal-1 read-cabal))

and:

  (test-assert "eval-cabal, conditions"
    (let ((p (eval-cabal '(("name" "foo")
                           ("executable cabal" (if (os windows) ...))))))
      (and (cabal-package? p)
           (string=? "foo" (cabal-package-name p))
           (equal? '(unix) (cabal-package-dependencies p)))))

Looks like we’re almost there.  I hope the above makes sense!

Thank you,
Ludo’.

  reply	other threads:[~2015-03-26 13:10 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-13 17:59 hackage importer Federico Beffa
2015-03-15 14:38 ` Ludovic Courtès
2015-03-15 22:29   ` Federico Beffa
2015-03-22 20:12     ` Federico Beffa
2015-03-26 13:09       ` Ludovic Courtès [this message]
2015-03-28  8:53         ` Federico Beffa
2015-03-29 13:58           ` Ludovic Courtès
2015-03-29 16:55             ` Federico Beffa
2015-03-31 13:33               ` Ludovic Courtès
2015-04-03 13:01                 ` Federico Beffa
2015-04-05 18:24                   ` Ludovic Courtès
2015-04-26 11:38                 ` Federico Beffa
2015-05-02 12:48                   ` Ludovic Courtès
2015-06-01 15:20                     ` Federico Beffa
2015-06-05  7:30                       ` Ludovic Courtès
2015-06-05 15:19                         ` Federico Beffa
2015-06-09  7:38                           ` Ludovic Courtès
2015-06-09  8:38                             ` Federico Beffa
  -- strict thread matches above, loose matches on Subject: below --
2015-04-26 11:52 Federico Beffa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871tkbdhwc.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=beffa@ieee.org \
    --cc=guix-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).