unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Problems with handicapped 'bash' from glibc package
@ 2014-02-12  7:12 Mark H Weaver
  2014-02-12 13:14 ` Ludovic Courtès
  2014-03-23 16:19 ` Ludovic Courtès
  0 siblings, 2 replies; 23+ messages in thread
From: Mark H Weaver @ 2014-02-12  7:12 UTC (permalink / raw)
  To: guix-devel

Hello all,

The 'bash' in the glibc package is handicapped in at least two ways:

* It can't set the locale, because it looks for locales in
  /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales

* It can't look up anything from NSS, such as passwd data, because it
  tries to load the modules from
  /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18

There are two problems that need to be addressed, I think:

* Users could easily end up with this handicapped 'bash' as their
  primary bash, if they installed (or upgraded?) 'glibc' since the last
  time I installed 'bash'.  This happened to me, for example.

* Some (most?) programs in Guix that launch subprocesses with the shell
  use this handicapped one.  For example, every time I run 'w3m', it
  prints two warnings about 'sh' being unable to set the locale.

Any suggestions about how we should address these problems?

     Thanks,
       Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-02-12  7:12 Problems with handicapped 'bash' from glibc package Mark H Weaver
@ 2014-02-12 13:14 ` Ludovic Courtès
  2014-02-12 17:39   ` Mark H Weaver
  2014-03-23 16:19 ` Ludovic Courtès
  1 sibling, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2014-02-12 13:14 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> The 'bash' in the glibc package is handicapped in at least two ways:
>
> * It can't set the locale, because it looks for locales in
>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>
> * It can't look up anything from NSS, such as passwd data, because it
>   tries to load the modules from
>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>
> There are two problems that need to be addressed, I think:
>
> * Users could easily end up with this handicapped 'bash' as their
>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>   time I installed 'bash'.  This happened to me, for example.
>
> * Some (most?) programs in Guix that launch subprocesses with the shell
>   use this handicapped one.  For example, every time I run 'w3m', it
>   prints two warnings about 'sh' being unable to set the locale.
>
> Any suggestions about how we should address these problems?

Indeed, that’s a problem.

For the record, the handicaped bash comes from the removal of /bin/sh
[0].  It is used by ‘system’ and ‘popen’.

Looks like solving this would require either rewriting glibc references
in the static bash binary (tricky, especially since the glibc directory
names have different lengths currently), or building Bash directly in
the glibc-final derivation so that it refers to the right libc with all
its bells and whistles.

The latter sounds best, but it would require to sort of duplicate the
build recipe of Bash internally.

Another option would be to apply glibc-bootstrap-system.patch
unconditionally, but I’m not sure if it’s a good idea.

Thoughts?

Ludo’.

[0] http://lists.gnu.org/archive/html/bug-guix/2013-01/msg00041.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-02-12 13:14 ` Ludovic Courtès
@ 2014-02-12 17:39   ` Mark H Weaver
  2014-02-12 19:31     ` Andreas Enge
  0 siblings, 1 reply; 23+ messages in thread
From: Mark H Weaver @ 2014-02-12 17:39 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> The 'bash' in the glibc package is handicapped in at least two ways:
>>
>> * It can't set the locale, because it looks for locales in
>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>>
>> * It can't look up anything from NSS, such as passwd data, because it
>>   tries to load the modules from
>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>>
>> There are two problems that need to be addressed, I think:
>>
>> * Users could easily end up with this handicapped 'bash' as their
>>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>>   time I installed 'bash'.  This happened to me, for example.
>>
>> * Some (most?) programs in Guix that launch subprocesses with the shell
>>   use this handicapped one.  For example, every time I run 'w3m', it
>>   prints two warnings about 'sh' being unable to set the locale.
>>
>> Any suggestions about how we should address these problems?
>
> Indeed, that’s a problem.
>
> For the record, the handicaped bash comes from the removal of /bin/sh
> [0].  It is used by ‘system’ and ‘popen’.
>
> Looks like solving this would require either rewriting glibc references
> in the static bash binary (tricky, especially since the glibc directory
> names have different lengths currently), or building Bash directly in
> the glibc-final derivation so that it refers to the right libc with all
> its bells and whistles.
>
> The latter sounds best, but it would require to sort of duplicate the
> build recipe of Bash internally.

FWIW, I think this latter option is the least bad of the ones you
suggested, and I can't think of a better solution.

While we're on the subject of 'bash', should we be applying the patches
in http://ftp.gnu.org/gnu/bash/bash-4.2-patches/ ?

      Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-02-12 17:39   ` Mark H Weaver
@ 2014-02-12 19:31     ` Andreas Enge
  2014-02-12 20:33       ` Ludovic Courtès
  0 siblings, 1 reply; 23+ messages in thread
From: Andreas Enge @ 2014-02-12 19:31 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

On Wed, Feb 12, 2014 at 05:39:35PM +0000, Mark H Weaver wrote:
> While we're on the subject of 'bash', should we be applying the patches
> in http://ftp.gnu.org/gnu/bash/bash-4.2-patches/ ?

Is there a chance of convincing the bash people to release bash 4.2.1?
It seems absurd to off-load the work of applying 45 patches from the last
three years on the users.

Andreas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-02-12 19:31     ` Andreas Enge
@ 2014-02-12 20:33       ` Ludovic Courtès
  2014-02-12 21:33         ` Mark H Weaver
  0 siblings, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2014-02-12 20:33 UTC (permalink / raw)
  To: Andreas Enge; +Cc: guix-devel

Andreas Enge <andreas@enge.fr> skribis:

> On Wed, Feb 12, 2014 at 05:39:35PM +0000, Mark H Weaver wrote:
>> While we're on the subject of 'bash', should we be applying the patches
>> in http://ftp.gnu.org/gnu/bash/bash-4.2-patches/ ?
>
> Is there a chance of convincing the bash people to release bash 4.2.1?
> It seems absurd to off-load the work of applying 45 patches from the last
> three years on the users.

Yeah, that’s annoying.

Basically, these patches are the version control system of Bash
(really!).  So in my view, if the maintainer doesn’t publish a new
version, then it means that users can live without these patches.

I haven’t checked what the current patch set is about, though.

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-02-12 20:33       ` Ludovic Courtès
@ 2014-02-12 21:33         ` Mark H Weaver
  2014-02-13  9:14           ` Andreas Enge
  0 siblings, 1 reply; 23+ messages in thread
From: Mark H Weaver @ 2014-02-12 21:33 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Andreas Enge <andreas@enge.fr> skribis:
>
>> On Wed, Feb 12, 2014 at 05:39:35PM +0000, Mark H Weaver wrote:
>>> While we're on the subject of 'bash', should we be applying the patches
>>> in http://ftp.gnu.org/gnu/bash/bash-4.2-patches/ ?
>>
>> Is there a chance of convincing the bash people to release bash 4.2.1?
>> It seems absurd to off-load the work of applying 45 patches from the last
>> three years on the users.
>
> Yeah, that’s annoying.
>
> Basically, these patches are the version control system of Bash
> (really!).  So in my view, if the maintainer doesn’t publish a new
> version, then it means that users can live without these patches.
>
> I haven’t checked what the current patch set is about, though.

The patches are bug fixes.  Things like:

* bash42-003: When using the pattern replacement and pattern removal
  word expansions, bash miscalculates the possible match length in the
  presence of an unescaped left bracket without a closing right bracket,
  resulting in a failure to match the pattern.

* bash42-004: When used in contexts where word splitting and quote
  removal were not performed, such as pattern removal or pattern
  substitution, empty strings (either literal or resulting from quoted
  variables that were unset or null) were not matched correctly,
  resulting in failure.

* base42-007: When used in contexts where word splitting and quote
  removal were not performed, such as case statement word expansion,
  empty strings (either literal or resulting from quoted variables that
  were unset or null) were not expanded correctly, resulting in failure.

* base42-008: Bash-4.2 does not attempt to save the shell history on
  receipt of a terminating signal that is handled synchronously.
  Unfortunately, the `close' button on most X11 terminal emulators sends
  SIGHUP, which kills the shell.

* base42-009: Under certain circumstances, running `fc -l' two times in
  succession with a relative history offset at the end of the history
  will result in an incorrect calculation of the last history entry and
  a seg fault.

...etc.

I don't want to run a shell with known bugs.
I want these fixes on my system.

What do you think?

      Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-02-12 21:33         ` Mark H Weaver
@ 2014-02-13  9:14           ` Andreas Enge
  0 siblings, 0 replies; 23+ messages in thread
From: Andreas Enge @ 2014-02-13  9:14 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

On Wed, Feb 12, 2014 at 09:33:16PM +0000, Mark H Weaver wrote:
> The patches are bug fixes.  Things like:
> I don't want to run a shell with known bugs.
> I want these fixes on my system.
> 
> What do you think?

I think you should try to gently prod the bash maintainers to prepare a new
release; that would be the reasonable thing. One cannot expect users to
apply 45 patches. I just made a new release of gnu mpc at the request of the
glibc developers, fixing two bugs...

Andreas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-02-12  7:12 Problems with handicapped 'bash' from glibc package Mark H Weaver
  2014-02-12 13:14 ` Ludovic Courtès
@ 2014-03-23 16:19 ` Ludovic Courtès
  2014-03-23 20:19   ` Mark H Weaver
  2014-03-26 23:29   ` Problems with handicapped 'bash' from glibc package Ludovic Courtès
  1 sibling, 2 replies; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-23 16:19 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> The 'bash' in the glibc package is handicapped in at least two ways:
>
> * It can't set the locale, because it looks for locales in
>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>
> * It can't look up anything from NSS, such as passwd data, because it
>   tries to load the modules from
>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>
> There are two problems that need to be addressed, I think:
>
> * Users could easily end up with this handicapped 'bash' as their
>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>   time I installed 'bash'.  This happened to me, for example.

I realized that this particular problem is easily solved by moving
glibc’s bash away from $bindir, for instance to $libexecdir.

I’m trying it out locally, and plan to commit to core-updates if
everything works as expected (hopefully as the last core-updates
change.)

Thoughts?

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-03-23 16:19 ` Ludovic Courtès
@ 2014-03-23 20:19   ` Mark H Weaver
  2014-03-23 20:27     ` Ludovic Courtès
  2014-03-26 23:29   ` Problems with handicapped 'bash' from glibc package Ludovic Courtès
  1 sibling, 1 reply; 23+ messages in thread
From: Mark H Weaver @ 2014-03-23 20:19 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> The 'bash' in the glibc package is handicapped in at least two ways:
>>
>> * It can't set the locale, because it looks for locales in
>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>>
>> * It can't look up anything from NSS, such as passwd data, because it
>>   tries to load the modules from
>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>>
>> There are two problems that need to be addressed, I think:
>>
>> * Users could easily end up with this handicapped 'bash' as their
>>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>>   time I installed 'bash'.  This happened to me, for example.
>
> I realized that this particular problem is easily solved by moving
> glibc’s bash away from $bindir, for instance to $libexecdir.
>
> I’m trying it out locally, and plan to commit to core-updates if
> everything works as expected (hopefully as the last core-updates
> change.)
>
> Thoughts?

Hmm.  We need a more intelligent union.scm anyway, and that would solve
this problem and many others.  Therefore, I'd be inclined to avoid
making this change, and instead work on union.scm.  I want to optimize
it anyway, since it takes over 5 minutes to build my profile, which is a
bit painful.

I confess that I'm biased, because it would mean throwing away most of
the core-updates build outputs my YeeLoong 8101B has produced over the
last four days, and starting again from scratch :-(

So perhaps I should recuse myself from this decision.

     Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-03-23 20:19   ` Mark H Weaver
@ 2014-03-23 20:27     ` Ludovic Courtès
  2014-03-24  3:31       ` Mark H Weaver
  2014-03-24  3:55       ` Optimizing union.scm Mark H Weaver
  0 siblings, 2 replies; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-23 20:27 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> The 'bash' in the glibc package is handicapped in at least two ways:
>>>
>>> * It can't set the locale, because it looks for locales in
>>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>>>
>>> * It can't look up anything from NSS, such as passwd data, because it
>>>   tries to load the modules from
>>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>>>
>>> There are two problems that need to be addressed, I think:
>>>
>>> * Users could easily end up with this handicapped 'bash' as their
>>>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>>>   time I installed 'bash'.  This happened to me, for example.
>>
>> I realized that this particular problem is easily solved by moving
>> glibc’s bash away from $bindir, for instance to $libexecdir.
>>
>> I’m trying it out locally, and plan to commit to core-updates if
>> everything works as expected (hopefully as the last core-updates
>> change.)
>>
>> Thoughts?
>
> Hmm.  We need a more intelligent union.scm anyway,

Agreed, but that’s a separate issue.

Having it in $bindir also means that patch-shebangs can pick it up,
which is usually not what we want.

So I think it really makes sense to move it out of sight.

> and that would solve this problem and many others.  Therefore, I'd be
> inclined to avoid making this change, and instead work on union.scm.
> I want to optimize it anyway, since it takes over 5 minutes to build
> my profile, which is a bit painful.

Oh, this much?

I have 140 packages in my profile and it takes less than 30s to build
it; that’s an SSD though, so that probably makes a big difference.

> I confess that I'm biased, because it would mean throwing away most of
> the core-updates build outputs my YeeLoong 8101B has produced over the
> last four days, and starting again from scratch :-(

Yeah, I understand the frustration.  However, I really view core-updates
as the place where we can make this kind of change without having to pay
much attention to build time (though I reckon this one occurs close to
the intended merge date.)

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-03-23 20:27     ` Ludovic Courtès
@ 2014-03-24  3:31       ` Mark H Weaver
  2014-03-28 13:48         ` Ludovic Courtès
  2014-03-24  3:55       ` Optimizing union.scm Mark H Weaver
  1 sibling, 1 reply; 23+ messages in thread
From: Mark H Weaver @ 2014-03-24  3:31 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>>
>>> Mark H Weaver <mhw@netris.org> skribis:
>>>
>>>> The 'bash' in the glibc package is handicapped in at least two ways:
>>>>
>>>> * It can't set the locale, because it looks for locales in
>>>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>>>>
>>>> * It can't look up anything from NSS, such as passwd data, because it
>>>>   tries to load the modules from
>>>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>>>>
>>>> There are two problems that need to be addressed, I think:
>>>>
>>>> * Users could easily end up with this handicapped 'bash' as their
>>>>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>>>>   time I installed 'bash'.  This happened to me, for example.
>>>
>>> I realized that this particular problem is easily solved by moving
>>> glibc’s bash away from $bindir, for instance to $libexecdir.
>>>
>>> I’m trying it out locally, and plan to commit to core-updates if
>>> everything works as expected (hopefully as the last core-updates
>>> change.)
>>>
>>> Thoughts?
>>
>> Hmm.  We need a more intelligent union.scm anyway,
>
> Agreed, but that’s a separate issue.
>
> Having it in $bindir also means that patch-shebangs can pick it up,
> which is usually not what we want.
>
> So I think it really makes sense to move it out of sight.

Well, that's sweeping it under the rug, when actually this broken bash
should never be used for anything except perhaps during the bootstrap.

The fact is, we have broken 'system' and 'popen' functions in Guix.
For example:

--8<---------------cut here---------------start------------->8---
mhw:~$ cat foo.c
#include <stdlib.h>

int
main (int argc, char *argv[])
{
  return system ("echo Hello world");
}

mhw:~$ gcc -o foo foo.c
mhw:~$ ./foo
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
Hello world
mhw:~$ 
--8<---------------cut here---------------end--------------->8---

Our glibc should not use a broken bash.  It would also be nice if the
bash used by glibc was linked with the same glibc, so that we only need
one copy of glibc in memory.  So there's a circular dependency here.

I think we need a better way of dealing with circular dependencies in
general, but in the meantime, I think we should build the final glibc
and bash as two outputs from the same derivation, so that they can refer
to each other.

What do you think?

>> I confess that I'm biased, because it would mean throwing away most of
>> the core-updates build outputs my YeeLoong 8101B has produced over the
>> last four days, and starting again from scratch :-(
>
> Yeah, I understand the frustration.  However, I really view core-updates
> as the place where we can make this kind of change without having to pay
> much attention to build time (though I reckon this one occurs close to
> the intended merge date.)

Well, I certainly agree with this general view, and if I felt that this
was a proper fix to the glibc/bash problem, I'd of course agree that we
should do it.

     Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Optimizing union.scm
  2014-03-23 20:27     ` Ludovic Courtès
  2014-03-24  3:31       ` Mark H Weaver
@ 2014-03-24  3:55       ` Mark H Weaver
  2014-03-24 13:45         ` Ludovic Courtès
  1 sibling, 1 reply; 23+ messages in thread
From: Mark H Weaver @ 2014-03-24  3:55 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> I want to optimize it anyway, since it takes over 5 minutes to build
>> my profile, which is a bit painful.
>
> Oh, this much?
>
> I have 140 packages in my profile and it takes less than 30s to build
> it; that’s an SSD though, so that probably makes a big difference.

My profile has 155 packages.  I've looked over union.scm and can see
some extreme wastefulness, most notably in the use of 'others-have-it?'.
I guess this typically does N 'lstat' calls for every file or directory
in every package in the resulting profile that cannot be pruned (due to
being in a directory that's only in one package), where N is the number
of packages in the profile.

I'm fairly sure it is possible to replace those N 'lstat' calls with
something that requires 0 system calls and at most O(log N) time.  The
basic idea would be to iterate over all packages in a breadth-first
manner, as follows:

I think we should readdir the top-level directories of every package,
and merge them together into a single map structure (vhash?) that maps
filenames to sets of packages containing that filename.  We don't even
need to 'lstat' them at this point, all we need are the names.  After
we've read the directories of every package, we then iterate over the
map.  For any unique entries, we simply make a symlink in the new
profile.

For duplicates we'd do conflict resolution, which starts with an
'lstat'.  For directories, the default conflict resolution would simply
create a directory in the new profile and then recurse into that
subdirectory, considering only the packages that contained that
subdirectory.

My guess is that this would speed up profile creation by at least an
order of magnitude, maybe more.

What do you think?

      Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Optimizing union.scm
  2014-03-24  3:55       ` Optimizing union.scm Mark H Weaver
@ 2014-03-24 13:45         ` Ludovic Courtès
  2014-03-25  7:04           ` Mark H Weaver
  0 siblings, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-24 13:45 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> I want to optimize it anyway, since it takes over 5 minutes to build
>>> my profile, which is a bit painful.
>>
>> Oh, this much?
>>
>> I have 140 packages in my profile and it takes less than 30s to build
>> it; that’s an SSD though, so that probably makes a big difference.
>
> My profile has 155 packages.  I've looked over union.scm and can see
> some extreme wastefulness, most notably in the use of 'others-have-it?'.
> I guess this typically does N 'lstat' calls for every file or directory
> in every package in the resulting profile that cannot be pruned (due to
> being in a directory that's only in one package), where N is the number
> of packages in the profile.

We could use a memoizing version of ‘file-exists?’ in ‘union-build’ and
see what happens.

> I'm fairly sure it is possible to replace those N 'lstat' calls with
> something that requires 0 system calls and at most O(log N) time.  The
> basic idea would be to iterate over all packages in a breadth-first
> manner, as follows:
>
> I think we should readdir the top-level directories of every package,
> and merge them together into a single map structure (vhash?) that maps
> filenames to sets of packages containing that filename.  We don't even
> need to 'lstat' them at this point, all we need are the names.  After
> we've read the directories of every package, we then iterate over the
> map.  For any unique entries, we simply make a symlink in the new
> profile.
>
> For duplicates we'd do conflict resolution, which starts with an
> 'lstat'.  For directories, the default conflict resolution would simply
> create a directory in the new profile and then recurse into that
> subdirectory, considering only the packages that contained that
> subdirectory.
>
> My guess is that this would speed up profile creation by at least an
> order of magnitude, maybe more.
>
> What do you think?

I’m all for it!

It’s not obvious to me that we would gain this much (esp. an order of
magnitude), but if that’s the case, it’s great.  (I’m guessing that the
bottleneck is I/O, not CPU.)

My only concern is that that code should remain readable and tested.

So: patches welcome.  ;-)

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Optimizing union.scm
  2014-03-24 13:45         ` Ludovic Courtès
@ 2014-03-25  7:04           ` Mark H Weaver
  2014-03-25 17:18             ` Ludovic Courtès
  2014-04-02 14:14             ` Optimizing ‘guix package’ Ludovic Courtès
  0 siblings, 2 replies; 23+ messages in thread
From: Mark H Weaver @ 2014-03-25  7:04 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1141 bytes --]

Here's a first draft of a new union.scm.

On my YeeLoong 8101B, when building my 155-package profile, it uses
about 1/14 as much CPU time (25 seconds vs 6 minutes), and about 2/7 as
much real time (2 minutes vs about 7.17 minutes).  It also handles
libffi in core-updates properly.  A few notes:

* For now, I'm using a hash table, because it turned out to be rather
  awkward to use a vhash for this.  I can make it purely functional
  later if it's deemed important.

* I've not yet updated tests/union.scm, which tests internal procedures
  in union.scm that no longer exist.

* Although I can now run union-build in 2 minutes, another full minute
  is spent in 'display-search-paths', plus another 30-40 seconds in
  'guix package' before the profile build begins.

In the end, the total time for "guix package -i" goes from around 8.5
minutes to around 3.5 minutes: a significant improvement, but there's
still more code for me to look at trying to optimize.

Instead of a patch, I've simply attached the new union.scm.  The only
code they have in common is 'file=?', which is unchanged.

Comments and suggestions welcome,

      Mark


[-- Attachment #2: New optimized union.scm --]
[-- Type: text/plain, Size: 5396 bytes --]

;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2012, 2013, 2014 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2014 Mark H Weaver <mhw@netris.org>
;;;
;;; This file is part of GNU Guix.
;;;
;;; GNU Guix is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; GNU Guix is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.

(define-module (guix build union)
  #:use-module (ice-9 match)
  #:use-module (ice-9 format)
  #:use-module (ice-9 receive)
  #:use-module (srfi srfi-1)
  #:use-module (srfi srfi-26)
  #:use-module (rnrs bytevectors)
  #:use-module (rnrs io ports)
  #:export (union-build))

;;; Commentary:
;;;
;;; Build a directory that is the union of a set of directories, using
;;; symbolic links.
;;;
;;; Code:

(define (names-in-directory dirname)
  (let ((dir (opendir dirname)))
    (let loop ((filenames '()))
      (match (readdir dir)
        ((or "." "..")
         (loop filenames))
        ((? eof-object?)
         (closedir dir)
         filenames)
        (name
         (loop (cons name filenames)))))))

(define (directory? name)
  (eq? 'directory (stat:type (stat name))))

(define (file=? file1 file2)
  "Return #t if FILE1 and FILE2 are regular files and their contents are
identical, #f otherwise."
  (let ((st1 (stat file1))
        (st2 (stat file2)))
    (and (eq? (stat:type st1) 'regular)
         (eq? (stat:type st2) 'regular)
         (= (stat:size st1) (stat:size st2))
         (call-with-input-file file1
           (lambda (port1)
             (call-with-input-file file2
               (lambda (port2)
                 (define len 8192)
                 (define buf1 (make-bytevector len))
                 (define buf2 (make-bytevector len))
                 (let loop ()
                   (let ((n1 (get-bytevector-n! port1 buf1 0 len))
                         (n2 (get-bytevector-n! port2 buf2 0 len)))
                     (and (equal? n1 n2)
                          (or (eof-object? n1)
                              (loop))))))))))))

(define* (union-build output inputs
                      #:key (log-port (current-error-port)))

  (define (make-link input output)
    (format log-port "`~a' ~~> `~a'~%" input output)
    (symlink input output))

  (define (union output inputs)
    (match inputs
      ((input)
       ;; There's only one input, so just make a link.
       (make-link input output))
      (_
       (receive (dirs files) (partition directory? inputs)
         (cond
          ((null? files)

           ;; All inputs are directories.  Create a new directory
           ;; where we will merge the input directories.
           (mkdir output)

           ;; Build a hash table mapping each name to a list of input
           ;; directories containing that name.
           (let ((table (make-hash-table)))

             (define (add-to-table! name dir)
               (let ((handle (hash-create-handle! table name '())))
                 (set-cdr! handle (cons dir (cdr handle)))))

             ;; Populate the table.
             (for-each (lambda (dir)
                         (for-each (cut add-to-table! <> dir)
                                   (names-in-directory dir)))
                       dirs)

             ;; Now iterate over the table and recursively
             ;; perform a union for each entry.
             (hash-for-each (lambda (name dirs-with-name)
                              (union (string-append output "/" name)
                                     (map (cut string-append <> "/" name)
                                          (reverse dirs-with-name))))
                            table)))

          ((null? dirs)
           ;; All inputs are non-directories.  We assume they are files.
           (match files
             ((file (? (cut file=? <> file)) ...)
              ;; All files have the same contents, so there's no conflict.
              (make-link file output))

             ((file other-files ...)
              (format (current-error-port)
                      "warning: collision encountered: ~{~a ~}~%"
                      files)

              ;; TODO: Implement smarter strategies.
              (format (current-error-port)
                      "warning: arbitrarily choosing ~a~%"
                      file)

              (make-link file output))))

          (else
           ;; The inputs are a mixture of files and directories
           (error "union-build: collision between file and directories"
                  `((files ,files) (dirs ,dirs)))))))))

  (setvbuf (current-output-port) _IOLBF)
  (setvbuf (current-error-port) _IOLBF)
  (when (file-port? log-port)
    (setvbuf log-port _IOLBF))

  (union output (delete-duplicates inputs)))

;;; union.scm ends here

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Optimizing union.scm
  2014-03-25  7:04           ` Mark H Weaver
@ 2014-03-25 17:18             ` Ludovic Courtès
  2014-03-25 22:30               ` Mark H Weaver
  2014-04-02 14:14             ` Optimizing ‘guix package’ Ludovic Courtès
  1 sibling, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-25 17:18 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> Here's a first draft of a new union.scm.

Thanks for being so fast!

> On my YeeLoong 8101B, when building my 155-package profile, it uses
> about 1/14 as much CPU time (25 seconds vs 6 minutes), and about 2/7 as
> much real time (2 minutes vs about 7.17 minutes).  It also handles
> libffi in core-updates properly.  A few notes:

That’s good news, and definitely an incentive to switch to this
implementation.

> * For now, I'm using a hash table, because it turned out to be rather
>   awkward to use a vhash for this.  I can make it purely functional
>   later if it's deemed important.

I would prefer it, but it must not be blocking.

> * I've not yet updated tests/union.scm, which tests internal procedures
>   in union.scm that no longer exist.

Right, the ‘tree-union’ and ‘delete-duplicate-leaves’ tests will have to
be removed.

However, could you add similar tests?  They will have to use the real
‘union-build’, though.  Maybe the ‘with-file-tree’ can be useful to
write the tests.

> * Although I can now run union-build in 2 minutes, another full minute
>   is spent in 'display-search-paths', plus another 30-40 seconds in
>   'guix package' before the profile build begins.

Ouch, you know where to look afterwards.  :-)

> In the end, the total time for "guix package -i" goes from around 8.5
> minutes to around 3.5 minutes: a significant improvement, but there's
> still more code for me to look at trying to optimize.

Excellent!

Some stylistic comments:

> (define (names-in-directory dirname)
>   (let ((dir (opendir dirname)))
>     (let loop ((filenames '()))
>       (match (readdir dir)
>         ((or "." "..")
>          (loop filenames))
>         ((? eof-object?)
>          (closedir dir)
>          filenames)
>         (name
>          (loop (cons name filenames)))))))

Rather use something like:

  (scandir directory (lambda (file)
                       (not (member file '("." "..")))))

‘scandir’ also has the advantage of being deterministic (it sorts
entries.)

> (define (directory? name)
>   (eq? 'directory (stat:type (stat name))))

Rather ‘file-is-directory?’.

> (define* (union-build output inputs
>                       #:key (log-port (current-error-port)))

Docstring.

>   (define (make-link input output)

Maybe ‘symlink*’?  (‘make-link’ sounds like it returns a newly-allocated
object.)

>     (format log-port "`~a' ~~> `~a'~%" input output)
>     (symlink input output))
>
>   (define (union output inputs)
>     (match inputs
>       ((input)
>        ;; There's only one input, so just make a link.
>        (make-link input output))
>       (_
>        (receive (dirs files) (partition directory? inputs)

Rather SRFI-11 let-values.

>          (cond
>           ((null? files)
>
>            ;; All inputs are directories.  Create a new directory
>            ;; where we will merge the input directories.
>            (mkdir output)
>
>            ;; Build a hash table mapping each name to a list of input
>            ;; directories containing that name.
>            (let ((table (make-hash-table)))
>
>              (define (add-to-table! name dir)
>                (let ((handle (hash-create-handle! table name '())))
>                  (set-cdr! handle (cons dir (cdr handle)))))

Rather ‘hash-set!’ (‘set-cdr!’ is doomed, you know ;-)).

>               (format (current-error-port)
>                       "warning: collision encountered: ~{~a ~}~%"
>                       files)
>
>               ;; TODO: Implement smarter strategies.
>               (format (current-error-port)
>                       "warning: arbitrarily choosing ~a~%"
>                       file)

Can we keep that as a separate, possibly inner ‘resolve-collisions’
procedure?  That makes it clear what to extend, when we want to
implement that (the original idea was that ‘union-build’ could be passed
a collision-resolution procedure.)

>               (make-link file output))))
>
>           (else
>            ;; The inputs are a mixture of files and directories
>            (error "union-build: collision between file and directories"
>                   `((files ,files) (dirs ,dirs)))))))))

Same for this one.

I like that it’s smaller that the current union.scm.  :-)

Could you please update it and tests/union.scm, and then I think we’ll
be all set?

Thank you!

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Optimizing union.scm
  2014-03-25 17:18             ` Ludovic Courtès
@ 2014-03-25 22:30               ` Mark H Weaver
  2014-03-25 22:58                 ` Ludovic Courtès
  0 siblings, 1 reply; 23+ messages in thread
From: Mark H Weaver @ 2014-03-25 22:30 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> Here's a first draft of a new union.scm.
>
> Thanks for being so fast!
>
>> On my YeeLoong 8101B, when building my 155-package profile, it uses
>> about 1/14 as much CPU time (25 seconds vs 6 minutes), and about 2/7 as
>> much real time (2 minutes vs about 7.17 minutes).  It also handles
>> libffi in core-updates properly.  A few notes:
>
> That’s good news, and definitely an incentive to switch to this
> implementation.
>
>> * For now, I'm using a hash table, because it turned out to be rather
>>   awkward to use a vhash for this.  I can make it purely functional
>>   later if it's deemed important.
>
> I would prefer it, but it must not be blocking.

Okay, I'll work on it.

>> * I've not yet updated tests/union.scm, which tests internal procedures
>>   in union.scm that no longer exist.
>
> Right, the ‘tree-union’ and ‘delete-duplicate-leaves’ tests will have to
> be removed.
>
> However, could you add similar tests?  They will have to use the real
> ‘union-build’, though.  Maybe the ‘with-file-tree’ can be useful to
> write the tests.

Could you help with this?  I'm a bit overloaded right now.

[...]

> Some stylistic comments:
>
>> (define (names-in-directory dirname)
>>   (let ((dir (opendir dirname)))
>>     (let loop ((filenames '()))
>>       (match (readdir dir)
>>         ((or "." "..")
>>          (loop filenames))
>>         ((? eof-object?)
>>          (closedir dir)
>>          filenames)
>>         (name
>>          (loop (cons name filenames)))))))
>
> Rather use something like:
>
>   (scandir directory (lambda (file)
>                        (not (member file '("." "..")))))
>
> ‘scandir’ also has the advantage of being deterministic (it sorts
> entries.)

I looked at 'scandir', but it's very wasteful.  For one thing, it
unconditionally calls 'lstat' on every directory entry, which entails
over ten thousand unnecessary 'lstat' system calls, a lot of unnecessary
I/O to read the inodes, the use of a vhash to keep track of where it has
been to detect cycles (since it's based on 'file-system-fold'), etc.

>> (define (directory? name)
>>   (eq? 'directory (stat:type (stat name))))
>
> Rather ‘file-is-directory?’.

Okay.

>> (define* (union-build output inputs
>>                       #:key (log-port (current-error-port)))
>
> Docstring.

Oops, I removed yours by mistake.

>>   (define (make-link input output)
>
> Maybe ‘symlink*’?  (‘make-link’ sounds like it returns a newly-allocated
> object.)

Okay.

>>     (format log-port "`~a' ~~> `~a'~%" input output)
>>     (symlink input output))
>>
>>   (define (union output inputs)
>>     (match inputs
>>       ((input)
>>        ;; There's only one input, so just make a link.
>>        (make-link input output))
>>       (_
>>        (receive (dirs files) (partition directory? inputs)
>
> Rather SRFI-11 let-values.

Hmm.  In simple cases like this, I find 'receive' much more attractive
than 'let-values', since the latter would involve triple-nested parens,
which is a bit much even for me.

Can you explain why you prefer 'let-values' to 'receive' in this case?

Also, how do you feel about 'call-with-values'?  In an earlier draft of
this code, I used 'call-with-values' with 'match-lambda*' as the
consumer, which eliminated the need for the 'cond'.  Would you prefer
that?

>>          (cond
>>           ((null? files)
>>
>>            ;; All inputs are directories.  Create a new directory
>>            ;; where we will merge the input directories.
>>            (mkdir output)
>>
>>            ;; Build a hash table mapping each name to a list of input
>>            ;; directories containing that name.
>>            (let ((table (make-hash-table)))
>>
>>              (define (add-to-table! name dir)
>>                (let ((handle (hash-create-handle! table name '())))
>>                  (set-cdr! handle (cons dir (cdr handle)))))
>
> Rather ‘hash-set!’ (‘set-cdr!’ is doomed, you know ;-)).

That would double the number of hash table lookups.  Anyway, I plan to
eliminate the use of the hash table altogether, so the point is moot :)

>>               (format (current-error-port)
>>                       "warning: collision encountered: ~{~a ~}~%"
>>                       files)
>>
>>               ;; TODO: Implement smarter strategies.
>>               (format (current-error-port)
>>                       "warning: arbitrarily choosing ~a~%"
>>                       file)
>
> Can we keep that as a separate, possibly inner ‘resolve-collisions’
> procedure?  That makes it clear what to extend, when we want to
> implement that (the original idea was that ‘union-build’ could be passed
> a collision-resolution procedure.)
>
>>               (make-link file output))))
>>
>>           (else
>>            ;; The inputs are a mixture of files and directories
>>            (error "union-build: collision between file and directories"
>>                   `((files ,files) (dirs ,dirs)))))))))
>
> Same for this one.

Okay.

Thanks for the quick review :)

      Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Optimizing union.scm
  2014-03-25 22:30               ` Mark H Weaver
@ 2014-03-25 22:58                 ` Ludovic Courtès
  2014-03-27  7:09                   ` Mark H Weaver
  0 siblings, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-25 22:58 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:

[...]

>>> * I've not yet updated tests/union.scm, which tests internal procedures
>>>   in union.scm that no longer exist.
>>
>> Right, the ‘tree-union’ and ‘delete-duplicate-leaves’ tests will have to
>> be removed.
>>
>> However, could you add similar tests?  They will have to use the real
>> ‘union-build’, though.  Maybe the ‘with-file-tree’ can be useful to
>> write the tests.
>
> Could you help with this?  I'm a bit overloaded right now.

OK, I’ll add a few tests.

>> Some stylistic comments:
>>
>>> (define (names-in-directory dirname)
>>>   (let ((dir (opendir dirname)))
>>>     (let loop ((filenames '()))
>>>       (match (readdir dir)
>>>         ((or "." "..")
>>>          (loop filenames))
>>>         ((? eof-object?)
>>>          (closedir dir)
>>>          filenames)
>>>         (name
>>>          (loop (cons name filenames)))))))
>>
>> Rather use something like:
>>
>>   (scandir directory (lambda (file)
>>                        (not (member file '("." "..")))))
>>
>> ‘scandir’ also has the advantage of being deterministic (it sorts
>> entries.)
>
> I looked at 'scandir', but it's very wasteful.  For one thing, it
> unconditionally calls 'lstat' on every directory entry, which entails
> over ten thousand unnecessary 'lstat' system calls, a lot of unnecessary
> I/O to read the inodes, the use of a vhash to keep track of where it has
> been to detect cycles (since it's based on 'file-system-fold'), etc.

Oh right, makes sense.

Then just s/filenames/files/ and (sort files string<?) at the end.

>>>     (format log-port "`~a' ~~> `~a'~%" input output)
>>>     (symlink input output))
>>>
>>>   (define (union output inputs)
>>>     (match inputs
>>>       ((input)
>>>        ;; There's only one input, so just make a link.
>>>        (make-link input output))
>>>       (_
>>>        (receive (dirs files) (partition directory? inputs)
>>
>> Rather SRFI-11 let-values.
>
> Hmm.  In simple cases like this, I find 'receive' much more attractive
> than 'let-values', since the latter would involve triple-nested parens,
> which is a bit much even for me.
>
> Can you explain why you prefer 'let-values' to 'receive' in this case?

Because I’m used to it, but here it’s mostly for consistency.

> Also, how do you feel about 'call-with-values'?  In an earlier draft of
> this code, I used 'call-with-values' with 'match-lambda*' as the
> consumer, which eliminated the need for the 'cond'.  Would you prefer
> that?

Yeah I don’t feel too strongly, so whatever you find appropriate.  :-)

Thanks!

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-03-23 16:19 ` Ludovic Courtès
  2014-03-23 20:19   ` Mark H Weaver
@ 2014-03-26 23:29   ` Ludovic Courtès
  1 sibling, 0 replies; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-26 23:29 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1167 bytes --]

ludo@gnu.org (Ludovic Courtès) skribis:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> The 'bash' in the glibc package is handicapped in at least two ways:
>>
>> * It can't set the locale, because it looks for locales in
>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>>
>> * It can't look up anything from NSS, such as passwd data, because it
>>   tries to load the modules from
>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>>
>> There are two problems that need to be addressed, I think:
>>
>> * Users could easily end up with this handicapped 'bash' as their
>>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>>   time I installed 'bash'.  This happened to me, for example.
>
> I realized that this particular problem is easily solved by moving
> glibc’s bash away from $bindir, for instance to $libexecdir.

I gave up on this one for this time, because it started taking too much
time, and because new hacks were needed to make sure we wouldn’t keep
references to the bootstrap sh.

For reference, below is the unfinished patch.

Ludo’.


[-- Attachment #2: Type: text/x-patch, Size: 6217 bytes --]

commit 3e8578b7899454645072594fdc392c872e8947c0 (HEAD, refs/heads/wip-glibc-bash)
Author: Ludovic Courtès <ludo@gnu.org>
Date:   Mon Mar 24 00:53:16 2014 +0100

    gnu: glibc: Move stand-alone Bash to $libexecdir.
    
    * gnu/packages/base.scm (glibc)[arguments] <pre-configure phase>:
      Install 'static-bash' to $out/libexec/glibc-2.19 instead of $out/bin.
      Adjust system.c and iopopen.c substitutions accordingly.
      (gcc-final)[arguments] <phases>: Add 'libc-bash-first' phase.
    * gnu/packages/ncurses.scm (ncurses) <configure phase>: Search the
      'bash' executable under $libc/libexec.  Set $PATH.

	Modified   gnu/packages/base.scm
diff --git a/gnu/packages/base.scm b/gnu/packages/base.scm
index bf1ebfa..888c20a 100644
--- a/gnu/packages/base.scm
+++ b/gnu/packages/base.scm
@@ -431,8 +431,9 @@ library for working with executable and object formats is also included.")
       #:phases (alist-cons-before
                 'configure 'pre-configure
                 (lambda* (#:key inputs outputs #:allow-other-keys)
-                  (let* ((out  (assoc-ref outputs "out"))
-                         (bin  (string-append out "/bin")))
+                  (let* ((out     (assoc-ref outputs "out"))
+                         (libexec (string-append out "/libexec/" ,name
+                                                 "-" ,version)))
                     ;; Use `pwd', not `/bin/pwd'.
                     (substitute* "configure"
                       (("/bin/pwd") "pwd"))
@@ -454,27 +455,29 @@ library for working with executable and object formats is also included.")
 
                     ;; Copy a statically-linked Bash in the output, with
                     ;; no references to other store paths.
-                    (mkdir-p bin)
+                    (mkdir-p libexec)
                     (copy-file (string-append (assoc-ref inputs "static-bash")
                                               "/bin/bash")
-                               (string-append bin "/bash"))
-                    (remove-store-references (string-append bin "/bash"))
-                    (chmod (string-append bin "/bash") #o555)
-
-                    ;; Keep a symlink, for `patch-shebang' resolution.
-                    (with-directory-excursion bin
+                               (string-append libexec "/bash"))
+                    (remove-store-references (string-append libexec "/bash"))
+                    (chmod (string-append libexec "/bash") #o555)
+
+                    ;; Keep a symlink and augment $PATH, for `patch-shebang'
+                    ;; resolution for $bin/sotruss, etc.
+                    (setenv "PATH" (string-append libexec ":" (getenv "PATH")))
+                    (with-directory-excursion libexec
                       (symlink "bash" "sh"))
 
                     ;; Have `system' use that Bash.
                     (substitute* "sysdeps/posix/system.c"
                       (("#define[[:blank:]]+SHELL_PATH.*$")
-                       (format #f "#define SHELL_PATH \"~a/bin/bash\"\n"
-                               out)))
+                       (format #f "#define SHELL_PATH \"~a/bash\"\n"
+                               libexec)))
 
                     ;; Same for `popen'.
                     (substitute* "libio/iopopen.c"
                       (("/bin/sh")
-                       (string-append out "/bin/bash")))
+                       (string-append libexec "/bash")))
 
                     ;; Make sure we don't retain a reference to the
                     ;; bootstrap Perl.
@@ -1000,7 +1003,21 @@ exec ~a/bin/~a-~a -B~a/lib -Wl,-dynamic-linker -Wl,~a/~a \"$@\"~%"
                               flag))
                         ,flags)))
            ((#:phases phases)
-            `(alist-delete 'symlink-libgcc_eh ,phases)))))
+            `(alist-delete
+              'symlink-libgcc_eh
+              (alist-cons-before
+               'configure 'libc-bash-first
+               (lambda* (#:key inputs #:allow-other-keys)
+                 ;; Arrange so that the shebangs of 'mkheaders', 'fixinc.sh',
+                 ;; etc. refer to libc's bash rather than to the bootstrap
+                 ;; bash.
+                 (let* ((libc (assoc-ref inputs "libc"))
+                        (bash (car (find-files (string-append libc "/libexec")
+                                               "^bash$"))))
+                   (setenv "PATH"
+                           (string-append (dirname bash) ":"
+                                          (getenv "PATH")))))
+               ,phases))))))
 
     (inputs `(("gmp-source" ,(package-source gmp))
               ("mpfr-source" ,(package-source mpfr))
	Modified   gnu/packages/ncurses.scm
diff --git a/gnu/packages/ncurses.scm b/gnu/packages/ncurses.scm
index b8f6bc8..8e75cdb 100644
--- a/gnu/packages/ncurses.scm
+++ b/gnu/packages/ncurses.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2012, 2013 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2012, 2013, 2014 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2014 Mark H Weaver <mhw@netris.org>
 ;;;
 ;;; This file is part of GNU Guix.
@@ -35,8 +35,15 @@
             ;; it to point to libc's embedded Bash, to avoid retaining a
             ;; reference to the bootstrap Bash.
             (let* ((libc (assoc-ref inputs "libc"))
-                   (bash (string-append libc "/bin/bash"))
+                   (bash (car (find-files (string-append libc "/libexec")
+                                          "^bash$")))
                    (out  (assoc-ref outputs "out")))
+              ;; The post-installation 'patch-shebangs' phase patches
+              ;; according to what's in $PATH, so make sure libc's bash comes
+              ;; first.
+              (setenv "PATH"
+                      (string-append (dirname bash) ":" (getenv "PATH")))
+
               (format #t "configure flags: ~s~%" configure-flags)
               (zero? (apply system* bash "./configure"
                             (string-append "SHELL=" bash)

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: Optimizing union.scm
  2014-03-25 22:58                 ` Ludovic Courtès
@ 2014-03-27  7:09                   ` Mark H Weaver
  2014-03-27  9:57                     ` Ludovic Courtès
  0 siblings, 1 reply; 23+ messages in thread
From: Mark H Weaver @ 2014-03-27  7:09 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 139 bytes --]

Here's a second draft of union.scm, incorporating your suggestions.
I haven't yet replaced the use of the hash table, though.

     Mark



[-- Attachment #2: Optimized union.scm v2 --]
[-- Type: text/plain, Size: 5549 bytes --]

;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2012, 2013, 2014 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2014 Mark H Weaver <mhw@netris.org>
;;;
;;; This file is part of GNU Guix.
;;;
;;; GNU Guix is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; GNU Guix is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.

(define-module (guix build union)
  #:use-module (ice-9 match)
  #:use-module (ice-9 format)
  #:use-module (srfi srfi-1)
  #:use-module (srfi srfi-26)
  #:use-module (rnrs bytevectors)
  #:use-module (rnrs io ports)
  #:export (union-build))

;;; Commentary:
;;;
;;; Build a directory that is the union of a set of directories, using
;;; symbolic links.
;;;
;;; Code:

(define (files-in-directory dirname)
  (let ((dir (opendir dirname)))
    (let loop ((files '()))
      (match (readdir dir)
        ((or "." "..")
         (loop files))
        ((? eof-object?)
         (closedir dir)
         (sort files string<?))
        (file
         (loop (cons file files)))))))

(define (file-is-directory? file)
  (eq? 'directory (stat:type (stat file))))

(define (file=? file1 file2)
  "Return #t if FILE1 and FILE2 are regular files and their contents are
identical, #f otherwise."
  (let ((st1 (stat file1))
        (st2 (stat file2)))
    (and (eq? (stat:type st1) 'regular)
         (eq? (stat:type st2) 'regular)
         (= (stat:size st1) (stat:size st2))
         (call-with-input-file file1
           (lambda (port1)
             (call-with-input-file file2
               (lambda (port2)
                 (define len 8192)
                 (define buf1 (make-bytevector len))
                 (define buf2 (make-bytevector len))
                 (let loop ()
                   (let ((n1 (get-bytevector-n! port1 buf1 0 len))
                         (n2 (get-bytevector-n! port2 buf2 0 len)))
                     (and (equal? n1 n2)
                          (or (eof-object? n1)
                              (loop))))))))))))

(define* (union-build output inputs
                      #:key (log-port (current-error-port)))
  "Build in the OUTPUT directory a symlink tree that is the union of all
the INPUTS."

  (define (symlink* input output)
    (format log-port "`~a' ~~> `~a'~%" input output)
    (symlink input output))

  (define (resolve-collisions output dirs files)
    (cond ((null? dirs)
           ;; The inputs are all files.
           (format (current-error-port)
                   "warning: collision encountered: ~{~a ~}~%"
                   files)

           (let ((file (first files)))
             ;; TODO: Implement smarter strategies.
             (format (current-error-port)
                     "warning: arbitrarily choosing ~a~%"
                     file)

             (symlink* file output)))

          (else
           ;; The inputs are a mixture of files and directories
           (error "union-build: collision between file and directories"
                  `((files ,files) (dirs ,dirs))))))

  (define (union output inputs)
    (match inputs
      ((input)
       ;; There's only one input, so just make a link.
       (symlink* input output))
      (_
       (call-with-values (lambda () (partition file-is-directory? inputs))
         (match-lambda*
           ((dirs ())
            ;; All inputs are directories.  Create a new directory
            ;; where we will merge the input directories.
            (mkdir output)

            ;; Build a hash table mapping each file to a list of input
            ;; directories containing that file.
            (let ((table (make-hash-table)))

              (define (add-to-table! file dir)
                (hash-set! table file (cons dir (hash-ref table file '()))))

              ;; Populate the table.
              (for-each (lambda (dir)
                          (for-each (cut add-to-table! <> dir)
                                    (files-in-directory dir)))
                        dirs)

              ;; Now iterate over the table and recursively
              ;; perform a union for each entry.
              (hash-for-each (lambda (file dirs-with-file)
                               (union (string-append output "/" file)
                                      (map (cut string-append <> "/" file)
                                           (reverse dirs-with-file))))
                             table)))

           ((() (file (? (cut file=? <> file)) ...))
            ;; There are no directories, and all files have the same contents,
            ;; so there's no conflict.
            (symlink* file output))

           ((dirs files)
            (resolve-collisions output dirs files)))))))

  (setvbuf (current-output-port) _IOLBF)
  (setvbuf (current-error-port) _IOLBF)
  (when (file-port? log-port)
    (setvbuf log-port _IOLBF))

  (union output (delete-duplicates inputs)))

;;; union.scm ends here

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Optimizing union.scm
  2014-03-27  7:09                   ` Mark H Weaver
@ 2014-03-27  9:57                     ` Ludovic Courtès
  0 siblings, 0 replies; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-27  9:57 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> Here's a second draft of union.scm, incorporating your suggestions.
> I haven't yet replaced the use of the hash table, though.

Looks good to me!

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Problems with handicapped 'bash' from glibc package
  2014-03-24  3:31       ` Mark H Weaver
@ 2014-03-28 13:48         ` Ludovic Courtès
  0 siblings, 0 replies; 23+ messages in thread
From: Ludovic Courtès @ 2014-03-28 13:48 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> ludo@gnu.org (Ludovic Courtès) writes:
>>>
>>>> Mark H Weaver <mhw@netris.org> skribis:
>>>>
>>>>> The 'bash' in the glibc package is handicapped in at least two ways:
>>>>>
>>>>> * It can't set the locale, because it looks for locales in
>>>>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18-locales
>>>>>
>>>>> * It can't look up anything from NSS, such as passwd data, because it
>>>>>   tries to load the modules from
>>>>>   /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-intermediate-2.18
>>>>>
>>>>> There are two problems that need to be addressed, I think:
>>>>>
>>>>> * Users could easily end up with this handicapped 'bash' as their
>>>>>   primary bash, if they installed (or upgraded?) 'glibc' since the last
>>>>>   time I installed 'bash'.  This happened to me, for example.
>>>>
>>>> I realized that this particular problem is easily solved by moving
>>>> glibc’s bash away from $bindir, for instance to $libexecdir.
>>>>
>>>> I’m trying it out locally, and plan to commit to core-updates if
>>>> everything works as expected (hopefully as the last core-updates
>>>> change.)
>>>>
>>>> Thoughts?
>>>
>>> Hmm.  We need a more intelligent union.scm anyway,
>>
>> Agreed, but that’s a separate issue.
>>
>> Having it in $bindir also means that patch-shebangs can pick it up,
>> which is usually not what we want.
>>
>> So I think it really makes sense to move it out of sight.
>
> Well, that's sweeping it under the rug, when actually this broken bash
> should never be used for anything except perhaps during the bootstrap.
>
> The fact is, we have broken 'system' and 'popen' functions in Guix.
> For example:
>
> mhw:~$ cat foo.c
> #include <stdlib.h>
>
> int
> main (int argc, char *argv[])
> {
>   return system ("echo Hello world");
> }
>
> mhw:~$ gcc -o foo foo.c
> mhw:~$ ./foo
> sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
> Hello world

Yes, understood.

> Our glibc should not use a broken bash.  It would also be nice if the
> bash used by glibc was linked with the same glibc, so that we only need
> one copy of glibc in memory.  So there's a circular dependency here.
>
> I think we need a better way of dealing with circular dependencies in
> general, but in the meantime, I think we should build the final glibc
> and bash as two outputs from the same derivation, so that they can refer
> to each other.
>
> What do you think?

Again, I definitely agree.  We discussed ways to do it here:

  https://lists.gnu.org/archive/html/guix-devel/2014-02/msg00145.html

Someone just needs to do it.  :-)  We could schedule it for the next
core-updates round, WDYT?

Note that technically, it cannot be two separate outputs because outputs
currently cannot have circular references (and it wouldn’t buy us much
anyway.)

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Optimizing ‘guix package’
  2014-03-25  7:04           ` Mark H Weaver
  2014-03-25 17:18             ` Ludovic Courtès
@ 2014-04-02 14:14             ` Ludovic Courtès
  2014-04-02 16:58               ` Mark H Weaver
  1 sibling, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2014-04-02 14:14 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> * Although I can now run union-build in 2 minutes, another full minute
>   is spent in 'display-search-paths',

Commit 27c6845 optimizes that (50%!).

The problem is that currently this is an inherently inefficient
operation: search path information is not stored in the manifest, so
‘search-path-environment-variables’ has to look up the package
corresponding to each manifest entry to find out what the search path
variables are.  That involves traversing the module tree, which does a
lot of I/O (this is OKish on an SSD, but much less on a real hard disk.)

Perhaps eventually we could store search path info in manifest entries.

Ludo’.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Optimizing ‘guix package’
  2014-04-02 14:14             ` Optimizing ‘guix package’ Ludovic Courtès
@ 2014-04-02 16:58               ` Mark H Weaver
  0 siblings, 0 replies; 23+ messages in thread
From: Mark H Weaver @ 2014-04-02 16:58 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> * Although I can now run union-build in 2 minutes, another full minute
>>   is spent in 'display-search-paths',
>
> Commit 27c6845 optimizes that (50%!).

Oooh, this makes a significant improvement on my YeeLoong :)

> The problem is that currently this is an inherently inefficient
> operation: search path information is not stored in the manifest, so
> ‘search-path-environment-variables’ has to look up the package
> corresponding to each manifest entry to find out what the search path
> variables are.  That involves traversing the module tree, which does a
> lot of I/O (this is OKish on an SSD, but much less on a real hard disk.)
>
> Perhaps eventually we could store search path info in manifest entries.

Sounds good to me.  When we make union smarter about handling conflicts,
it may be that some hints from the individual packages should be
consulted, and perhaps those hints should go in the manifest as well.

    Thanks!
      Mark

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-04-02 16:59 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-12  7:12 Problems with handicapped 'bash' from glibc package Mark H Weaver
2014-02-12 13:14 ` Ludovic Courtès
2014-02-12 17:39   ` Mark H Weaver
2014-02-12 19:31     ` Andreas Enge
2014-02-12 20:33       ` Ludovic Courtès
2014-02-12 21:33         ` Mark H Weaver
2014-02-13  9:14           ` Andreas Enge
2014-03-23 16:19 ` Ludovic Courtès
2014-03-23 20:19   ` Mark H Weaver
2014-03-23 20:27     ` Ludovic Courtès
2014-03-24  3:31       ` Mark H Weaver
2014-03-28 13:48         ` Ludovic Courtès
2014-03-24  3:55       ` Optimizing union.scm Mark H Weaver
2014-03-24 13:45         ` Ludovic Courtès
2014-03-25  7:04           ` Mark H Weaver
2014-03-25 17:18             ` Ludovic Courtès
2014-03-25 22:30               ` Mark H Weaver
2014-03-25 22:58                 ` Ludovic Courtès
2014-03-27  7:09                   ` Mark H Weaver
2014-03-27  9:57                     ` Ludovic Courtès
2014-04-02 14:14             ` Optimizing ‘guix package’ Ludovic Courtès
2014-04-02 16:58               ` Mark H Weaver
2014-03-26 23:29   ` Problems with handicapped 'bash' from glibc package Ludovic Courtès

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).