hydra python2-numpy-1.9.1 failure

unofficial mirror of guix-devel@gnu.org 

* hydra python2-numpy-1.9.1 failure
@ 2014-12-10 18:06 Federico Beffa
  2014-12-10 18:33 ` Mark H Weaver
  0 siblings, 1 reply; 7+ messages in thread
From: Federico Beffa @ 2014-12-10 18:06 UTC
  To: Guix-devel

Hi,

I've noticed that on hydra, python2-numpy-1.9.1 for x86_64-linux (and
other architectures) fails its test phase (strictly speaking, it is the
module-local variant without documentation, called
python2-numpy-bootstrap-1.9.1, that fails):

http://hydra.gnu.org/build/172563

So, I've built it locally to investigate and on my machine it builds
without errors. The hash in the store is the same as on hydra

/gnu/store/wyzv0xw9fi0hks5x1kagy2nl2sykbr7l-python2-numpy-bootstrap-1.9.1

so the inputs should be the same.

I'm at commit 57c3f71632692d1e2e12e5d2db5c2cc4c6e075c9 (from today). The
only part of my system that is not up to date is the daemon, which is
still from the 0.7 release (back then I set up an init script to
start it automatically at boot).

Any hint at what could be going wrong on hydra?

Thanks,
Fede


* Re: hydra python2-numpy-1.9.1 failure
  2014-12-10 18:06 hydra python2-numpy-1.9.1 failure Federico Beffa
@ 2014-12-10 18:33 ` Mark H Weaver
  2014-12-10 21:19   ` Federico Beffa
  0 siblings, 1 reply; 7+ messages in thread
From: Mark H Weaver @ 2014-12-10 18:33 UTC
  To: Federico Beffa; +Cc: Guix-devel

Federico Beffa <beffa@ieee.org> writes:

> I've noticed that on hydra, python2-numpy-1.9.1 for x86_64-linux (and
> other architectures) fails its test phase (strictly speaking, it is the
> module-local variant without documentation, called
> python2-numpy-bootstrap-1.9.1, that fails):
>
> http://hydra.gnu.org/build/172563
>
> So, I've built it locally to investigate and on my machine it builds
> without errors. The hash in the store is the same as on hydra
>
> /gnu/store/wyzv0xw9fi0hks5x1kagy2nl2sykbr7l-python2-numpy-bootstrap-1.9.1
>
> so the inputs should be the same.
>
> I'm at commit 57c3f71632692d1e2e12e5d2db5c2cc4c6e075c9 (from today). The
> only part of my system that is not up to date is the daemon, which is
> still from the 0.7 release (back then I set up an init script to
> start it automatically at boot).
>
> Any hint at what could be going wrong on hydra?

It's not uncommon for some tests to fail occasionally in certain
packages.  Reasons I've seen include: timeouts set too short, race
conditions, randomized tests that fail for some values, and dependencies
on the kernel version and/or configuration.  We've had to debug these
problems on a case-by-case basis.  Sometimes we've disabled the
unreliable tests, or even disabled the entire test suite.

       Mark


* Re: hydra python2-numpy-1.9.1 failure
  2014-12-10 18:33 ` Mark H Weaver
@ 2014-12-10 21:19   ` Federico Beffa
  2014-12-11  4:19     ` Mark H Weaver
  2014-12-11 13:20     ` Ludovic Courtès
  0 siblings, 2 replies; 7+ messages in thread
From: Federico Beffa @ 2014-12-10 21:19 UTC
  To: Mark H Weaver; +Cc: Guix-devel

On Wed, Dec 10, 2014 at 7:33 PM, Mark H Weaver <mhw@netris.org> wrote:
> It's not uncommon for some tests to fail occasionally in certain
> packages.  Reasons I've seen include: timeouts set too short, race
> conditions, randomized tests that fail for some values, and dependencies
> on the kernel version and/or configuration.  We've had to debug these
> problems on a case-by-case basis.  Sometimes we've disabled the
> unreliable tests, or even disabled the entire test suite.

Thanks for the input! It looks like on hydra the failure is
reproducible. I would therefore tend to rule out randomized tests. Do
you happen to know the timeout for tests? The test fails after
17 seconds, which is not very long.
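
One way to tell a deterministic failure from an intermittent one is to
rerun the test command several times and count how often it fails. The
following is only a minimal sketch of that idea: `count_failures`,
`classify`, and the placeholder command are hypothetical names used for
illustration, not part of Guix or numpy, and the real test command would
have to be substituted.

```python
import subprocess
import sys


def count_failures(cmd, runs=5):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


def classify(failures, runs):
    """Summarize the outcome of repeated runs on one machine."""
    if failures == 0:
        return "passes"
    if failures == runs:
        return "deterministic failure"
    return "intermittent failure"


if __name__ == "__main__":
    # Placeholder command (an assumption); replace it with the real
    # test invocation of the package under investigation.
    cmd = [sys.executable, "-c", "import sys; sys.exit(0)"]
    runs = 5
    print(classify(count_failures(cmd, runs), runs))
```

A result of "deterministic failure" only describes the machine the loop
ran on; a test that depends on the environment can still pass elsewhere.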

Now that you mention race conditions, I remember having a hard time
with parallel builds in ATLAS; and numpy makes use of it. I may try
adding '#:parallel-tests? #f'.

Thanks,
Fede


* Re: hydra python2-numpy-1.9.1 failure
  2014-12-10 21:19   ` Federico Beffa
@ 2014-12-11  4:19     ` Mark H Weaver
  2014-12-11 13:20     ` Ludovic Courtès
  1 sibling, 0 replies; 7+ messages in thread
From: Mark H Weaver @ 2014-12-11  4:19 UTC
  To: Federico Beffa; +Cc: Guix-devel

Federico Beffa <beffa@ieee.org> writes:

> On Wed, Dec 10, 2014 at 7:33 PM, Mark H Weaver <mhw@netris.org> wrote:
>> It's not uncommon for some tests to fail occasionally in certain
>> packages.  Reasons I've seen include: timeouts set too short, race
>> conditions, randomized tests that fail for some values, and dependencies
>> on the kernel version and/or configuration.  We've had to debug these
>> problems on a case-by-case basis.  Sometimes we've disabled the
>> unreliable tests, or even disabled the entire test suite.
>
> Thanks for the input! It looks like on hydra the failure is
> reproducible. I would therefore tend to rule out randomized tests. Do
> you happen to know the timeout for tests? The test fails after
> 17 seconds, which is not very long.

Any such timeout would be built in to the test suite of that package.
IIRC, the only relevant timeout that guix-daemon imposes is this: if the
build is silent (no output) for 1 hour, the build is aborted.
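
A silence timeout of the kind described above can be sketched as
follows. This is not guix-daemon's implementation, only an illustration
of the idea: the child is killed when it produces no output for a given
number of seconds, and any output resets the clock.
`run_with_silence_timeout` is a hypothetical helper, and the sketch
assumes a POSIX system where `select` works on pipes.

```python
import os
import select
import subprocess
import sys


def run_with_silence_timeout(cmd, max_silent_seconds):
    """Run cmd; kill it if it prints nothing for max_silent_seconds.

    Returns the exit status, or None if the process was killed for
    being silent. Hypothetical helper for illustration; POSIX only.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    fd = proc.stdout.fileno()
    try:
        while True:
            # Wait until output is available or the silence limit expires.
            ready, _, _ = select.select([fd], [], [], max_silent_seconds)
            if not ready:
                proc.kill()
                proc.wait()
                return None
            chunk = os.read(fd, 4096)
            if not chunk:
                # End of file: the process closed its output.
                return proc.wait()
            # Output was seen, so the next select() starts a fresh wait.
    finally:
        proc.stdout.close()


if __name__ == "__main__":
    # A command that stays silent for longer than the allowed limit.
    quiet = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_silence_timeout(quiet, max_silent_seconds=1))
```

A limit like this measures silence rather than total run time, so a test
that fails after 17 seconds would not be caught by it.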

> Now that you mention race conditions, I remember having a hard time
> with parallel builds in ATLAS; and numpy makes use of it. I may try
> adding '#:parallel-tests? #f'.

It's worth a try.  We currently need that in over 20 packages.

      Mark


* Re: hydra python2-numpy-1.9.1 failure
  2014-12-10 21:19   ` Federico Beffa
  2014-12-11  4:19     ` Mark H Weaver
@ 2014-12-11 13:20     ` Ludovic Courtès
  2014-12-13 17:35       ` Federico Beffa
  1 sibling, 1 reply; 7+ messages in thread
From: Ludovic Courtès @ 2014-12-11 13:20 UTC
  To: Federico Beffa; +Cc: Guix-devel

Federico Beffa <beffa@ieee.org> skribis:

> Thanks for the input! It looks like on hydra the failure is
> reproducible.

It’s actually hard to know, because guix-daemon on hydra.gnu.org runs
with --cache-failures.

> Now that you mention race conditions, I remember having a hard time
> with parallel builds in ATLAS; and numpy makes use of it. I may try
> adding '#:parallel-tests? #f'.

Check if that flag has an effect at all.  For instance,
Automake-generated makefiles have supported parallel test suites for
not-too-long (a couple of years maybe) and even there it’s optional.
Hand-written build systems often run tests sequentially.

Ludo’.


* Re: hydra python2-numpy-1.9.1 failure
  2014-12-11 13:20     ` Ludovic Courtès
@ 2014-12-13 17:35       ` Federico Beffa
  2014-12-13 19:09         ` Mark H Weaver
  0 siblings, 1 reply; 7+ messages in thread
From: Federico Beffa @ 2014-12-13 17:35 UTC
  To: Ludovic Courtès; +Cc: Guix-devel

On Thu, Dec 11, 2014 at 2:20 PM, Ludovic Courtès <ludo@gnu.org> wrote:
> Check if that flag has an effect at all.  For instance,
> Automake-generated makefiles have supported parallel test suites for
> not-too-long (a couple of years maybe) and even there it’s optional.
> Hand-written build systems often run tests sequentially.

I think you are right that the flag does not have any effect here.

I've now identified the failing test. From the comment in the test
procedure, it appears to be related to memory usage:

https://github.com/numpy/numpy/issues/4442

Could you tell me the specs of the hydra (virtual?) machine running the
code? I may reduce the size of the test matrix, or skip the test
altogether.
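
Skipping a memory-hungry test only on machines that lack the memory
could look roughly like the sketch below. It is not numpy's actual test
code: the 4 GiB threshold, the helper names, and the placeholder test
body are all assumptions made for illustration, and the memory query
relies on `os.sysconf`, which is only available on POSIX systems.

```python
import os
import unittest


def physical_memory_bytes():
    """Return total physical RAM in bytes, or None if it is unknown."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing or does not know these names here.
        return None


# Assumed threshold, chosen only for illustration.
REQUIRED_BYTES = 4 * 1024**3


def enough_memory(required=REQUIRED_BYTES):
    """True only when the machine is known to have `required` bytes."""
    total = physical_memory_bytes()
    return total is not None and total >= required


class BigMatrixTest(unittest.TestCase):
    @unittest.skipUnless(enough_memory(), "not enough physical memory")
    def test_large_allocation(self):
        # Placeholder body: the real test would build the large matrix.
        self.assertEqual(len(bytearray(1024)), 1024)


if __name__ == "__main__":
    unittest.main()
```

Total physical memory is a coarse proxy, since the build environment may
have less memory available than the machine has installed.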

Regards,
Fede


* Re: hydra python2-numpy-1.9.1 failure
  2014-12-13 17:35       ` Federico Beffa
@ 2014-12-13 19:09         ` Mark H Weaver
  0 siblings, 0 replies; 7+ messages in thread
From: Mark H Weaver @ 2014-12-13 19:09 UTC
  To: Federico Beffa; +Cc: Guix-devel

Federico Beffa <beffa@ieee.org> writes:

> On Thu, Dec 11, 2014 at 2:20 PM, Ludovic Courtès <ludo@gnu.org> wrote:
>> Check if that flag has an effect at all.  For instance,
>> Automake-generated makefiles have supported parallel test suites for
>> not-too-long (a couple of years maybe) and even there it’s optional.
>> Hand-written build systems often run tests sequentially.
>
> I think you are right that the flag does not have any effect here.
>
> I've now identified the failing test. From the comment in the test
> procedure, it appears to be related to memory usage:
>
> https://github.com/numpy/numpy/issues/4442

Thanks for looking into it!

> Could you tell me the specs of the hydra (virtual?) machine running the
> code? I may reduce the size of the test matrix, or skip the test
> altogether.

The builds don't happen on hydra.gnu.org itself; they happen on its
build slaves, of which there are several.  Anyway, the question is not
relevant, because we want users to be able to build packages on their
own machines.  We'll just have to use our own judgment about what the
minimum requirements should be to build this package.

    Thanks!
      Mark
