all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw@netris.org>
Cc: guix-devel@gnu.org
Subject: Re: Optimizing union.scm
Date: Tue, 25 Mar 2014 18:18:07 +0100	[thread overview]
Message-ID: <87bnwugptc.fsf@gnu.org> (raw)
In-Reply-To: <87txamkbde.fsf@yeeloong.lan> (Mark H. Weaver's message of "Tue, 25 Mar 2014 03:04:13 -0400")

Mark H Weaver <mhw@netris.org> skribis:

> Here's a first draft of a new union.scm.

Thanks for being so fast!

> On my YeeLoong 8101B, when building my 155-package profile, it uses
> about 1/14 as much CPU time (25 seconds vs 6 minutes), and about 2/7 as
> much real time (2 minutes vs about 7.17 minutes).  It also handles
> libffi in core-updates properly.  A few notes:

That’s good news, and definitely an incentive to switch to this
implementation.

> * For now, I'm using a hash table, because it turned out to be rather
>   awkward to use a vhash for this.  I can make it purely functional
>   later if it's deemed important.

I would prefer it, but it must not be blocking.

> * I've not yet updated tests/union.scm, which tests internal procedures
>   in union.scm that no longer exist.

Right, the ‘tree-union’ and ‘delete-duplicate-leaves’ tests will have to
be removed.

However, could you add similar tests?  They will have to use the real
‘union-build’, though.  Maybe the ‘with-file-tree’ can be useful to
write the tests.

> * Although I can now run union-build in 2 minutes, another full minute
>   is spent in 'display-search-paths', plus another 30-40 seconds in
>   'guix package' before the profile build begins.

Ouch, you know where to look afterwards.  :-)

> In the end, the total time for "guix package -i" goes from around 8.5
> minutes to around 3.5 minutes: a significant improvement, but there's
> still more code for me to look at trying to optimize.

Excellent!

Some stylistic comments:

> (define (names-in-directory dirname)
>   (let ((dir (opendir dirname)))
>     (let loop ((filenames '()))
>       (match (readdir dir)
>         ((or "." "..")
>          (loop filenames))
>         ((? eof-object?)
>          (closedir dir)
>          filenames)
>         (name
>          (loop (cons name filenames)))))))

Rather use something like:

  (scandir directory (lambda (file)
                       (not (member file '("." "..")))))

‘scandir’ also has the advantage of being deterministic (it sorts
entries.)

> (define (directory? name)
>   (eq? 'directory (stat:type (stat name))))

Rather ‘file-is-directory?’.

> (define* (union-build output inputs
>                       #:key (log-port (current-error-port)))

Docstring.

>   (define (make-link input output)

Maybe ‘symlink*’?  (‘make-link’ sounds like it returns a newly-allocated
object.)

>     (format log-port "`~a' ~~> `~a'~%" input output)
>     (symlink input output))
>
>   (define (union output inputs)
>     (match inputs
>       ((input)
>        ;; There's only one input, so just make a link.
>        (make-link input output))
>       (_
>        (receive (dirs files) (partition directory? inputs)

Rather SRFI-11 let-values.

>          (cond
>           ((null? files)
>
>            ;; All inputs are directories.  Create a new directory
>            ;; where we will merge the input directories.
>            (mkdir output)
>
>            ;; Build a hash table mapping each name to a list of input
>            ;; directories containing that name.
>            (let ((table (make-hash-table)))
>
>              (define (add-to-table! name dir)
>                (let ((handle (hash-create-handle! table name '())))
>                  (set-cdr! handle (cons dir (cdr handle)))))

Rather ‘hash-set!’ (‘set-cdr!’ is doomed, you know ;-)).

>               (format (current-error-port)
>                       "warning: collision encountered: ~{~a ~}~%"
>                       files)
>
>               ;; TODO: Implement smarter strategies.
>               (format (current-error-port)
>                       "warning: arbitrarily choosing ~a~%"
>                       file)

Can we keep that as a separate, possibly inner ‘resolve-collisions’
procedure?  That makes it clear what to extend, when we want to
implement that (the original idea was that ‘union-build’ could be passed
a collision-resolution procedure.)

>               (make-link file output))))
>
>           (else
>            ;; The inputs are a mixture of files and directories
>            (error "union-build: collision between file and directories"
>                   `((files ,files) (dirs ,dirs)))))))))

Same for this one.

I like that it’s smaller that the current union.scm.  :-)

Could you please update it and tests/union.scm, and then I think we’ll
be all set?

Thank you!

Ludo’.

  reply	other threads:[~2014-03-25 17:19 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-12  7:12 Problems with handicapped 'bash' from glibc package Mark H Weaver
2014-02-12 13:14 ` Ludovic Courtès
2014-02-12 17:39   ` Mark H Weaver
2014-02-12 19:31     ` Andreas Enge
2014-02-12 20:33       ` Ludovic Courtès
2014-02-12 21:33         ` Mark H Weaver
2014-02-13  9:14           ` Andreas Enge
2014-03-23 16:19 ` Ludovic Courtès
2014-03-23 20:19   ` Mark H Weaver
2014-03-23 20:27     ` Ludovic Courtès
2014-03-24  3:31       ` Mark H Weaver
2014-03-28 13:48         ` Ludovic Courtès
2014-03-24  3:55       ` Optimizing union.scm Mark H Weaver
2014-03-24 13:45         ` Ludovic Courtès
2014-03-25  7:04           ` Mark H Weaver
2014-03-25 17:18             ` Ludovic Courtès [this message]
2014-03-25 22:30               ` Mark H Weaver
2014-03-25 22:58                 ` Ludovic Courtès
2014-03-27  7:09                   ` Mark H Weaver
2014-03-27  9:57                     ` Ludovic Courtès
2014-04-02 14:14             ` Optimizing ‘guix package’ Ludovic Courtès
2014-04-02 16:58               ` Mark H Weaver
2014-03-26 23:29   ` Problems with handicapped 'bash' from glibc package Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bnwugptc.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=guix-devel@gnu.org \
    --cc=mhw@netris.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.