unofficial mirror of bug-guix@gnu.org 
From: Leo Famulari <leo@famulari.name>
To: Florian Paul Schmidt <mista.tapas@gmx.net>
Cc: 22078@debbugs.gnu.org
Subject: bug#22078: failed builds due to exceeding max-silent-time not marked as failed in db
Date: Wed, 9 Dec 2015 14:57:20 -0500
Message-ID: <20151209195720.GA18503@jasmine>
In-Reply-To: <56621663.4080007@gmx.net>

On Fri, Dec 04, 2015 at 11:40:35PM +0100, Florian Paul Schmidt wrote:
> Attached is a first stab at fixing this. There are additional options to
> guix-daemon now:
> 
>       --cache-failures           cache build failures
>       --cache-hook-failures      cache build failures due to hook failures
>                                  (depends on cache-failures)
>       --cache-timeout-failures   cache build failures due to timeouts
>                                  (depends on cache-failures)

I see the value of this.

How would this change the semantics of the store? I think the current
failure caching system is based only on the hash of the dependency
graph. Is that correct? If so, that reinforces the determinism of the
system.

It seems impossible to make build hook and timeout failures
deterministic. Both depend on things that can't be observed by the
daemon or the builder. So if these failures are cached, that caching
should be enabled separately from the caching of ordinary
(deterministic) build failures.

I think that if we do decide to cache timeout failures, the values of
"--timeout" and "--max-silent-time" should be recorded along with the
failure. I have a slow machine that can't build certain packages with
the default "--max-silent-time" even when nothing is competing for the
system's resources. Of course, other factors affect this too, such as
other processes on the system, so it's still not deterministic.
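
To make the suggestion concrete, here is a rough sketch of what I mean by
recording the limits with the failure. It is not based on the daemon's
actual code: TimeoutFailureKey, TimeoutFailureCache and the in-memory
std::set are all made up for illustration, standing in for whatever the
FailedPaths table would need to store.

```cpp
#include <set>
#include <string>
#include <tuple>

// Hypothetical key: the failed output path plus the limits that were
// in force when the build timed out.
struct TimeoutFailureKey {
    std::string outputPath;
    unsigned int timeout;        // value of --timeout, in seconds
    unsigned int maxSilentTime;  // value of --max-silent-time, in seconds

    // Lexicographic ordering so the key can be stored in a std::set.
    bool operator<(const TimeoutFailureKey & other) const {
        return std::tie(outputPath, timeout, maxSilentTime)
             < std::tie(other.outputPath, other.timeout, other.maxSilentTime);
    }
};

// Hypothetical in-memory stand-in for a table of cached timeout failures.
class TimeoutFailureCache {
public:
    // Remember that `path` timed out under the given limits.
    void registerFailure(const std::string & path,
                         unsigned int timeout,
                         unsigned int maxSilentTime) {
        failures.insert(TimeoutFailureKey{path, timeout, maxSilentTime});
    }

    // A cached failure only counts if it was recorded under exactly
    // the same limits as the ones being asked about.
    bool hasFailed(const std::string & path,
                   unsigned int timeout,
                   unsigned int maxSilentTime) const {
        return failures.count(TimeoutFailureKey{path, timeout, maxSilentTime}) > 0;
    }

private:
    std::set<TimeoutFailureKey> failures;
};
```

With something like this, a failure registered with a max-silent-time of
3600 would not be treated as cached when the same path is requested with
7200, so raising the limit on a slow machine would trigger a fresh build
attempt. Matching the limits exactly is the simplest rule I could think
of; other rules are possible.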

> Patch compiles, but is as yet untested since the system I need it for has
> gone away for the time being.

Have you tested the patch yet?

> 
> Flo
> 
> On 12/02/2015 11:03 PM, Florian Paul Schmidt wrote:
> >-----BEGIN PGP SIGNED MESSAGE-----
> >Hash: SHA256
> >
> >
> >Hi,
> >
> >on my system building the derivation for the package tbb (version
> >4.3.2) does not complete due to exceeding the max-silent-time default
> >value of 3600 seconds (one hour).
> >
> >It seems that in this case the path is not marked as failed in the
> >sqlite3 db
> >
> >/var/guix/db/db.sqlite
> >
> >in the table FailedPaths. This is quite annoying since it seems that
> >several packages depend on it causing the derivation to be built
> >several times (each taking over an hour to fail).
> >
> >The guix daemon is running with the --cache-failures option and I
> >would expect the second run of
> >
> >for n in `guix package -A | cut -f1`; do guix build --no-substitutes
> >"$n" || true; done
> >
> >to be mostly a NOOP, since all failures from the first run should be
> >cached. And even in the first run I wouldn't expect failed
> >dependencies to be tried to build again. Contrary to this on this box
> >even the second run of this takes about half a day or so to complete ;)
> >
> >Flo
> >
> >P.S.: FYI: The thing that takes over an hour to run is
> >
> >./test_atomic.exe
> >
> >
> >- -- https://fps.io
> >-----BEGIN PGP SIGNATURE-----
> >Version: GnuPG v2
> >
> >iQEcBAEBCAAGBQJWX2qaAAoJEA5f4Coltk8ZnasH/jOg+E0Y/CDxw5SGgcJN0Q6K
> >TYo41AVz0u9tLJEVYW4ZW9Z7A3UL5OTB+03LwC1zT7iDtFzU6a7BzaW2N3gP+GGi
> >Tx+Rq0z7ZIHEF1t71YFtPOAIpuyxwl1yMnRo0kd8BVsrNu843ITI4w+kzGV4tcP1
> >l9uDf7c+WQ8MFhoMDUqjW5ufIb3zy6yKk1GDXw14xZ8laeiE8hrXFE2LFV4WCxzP
> >VMPDgHBlPF6pAKLYpWSpL2RtL/WxO9tYIYpQ16EW7GjOouCy2ObT+1CJ75kSIOie
> >DZ/RLUSxa39amDFwii5liR+ETgvz3FCoBAcyI5AP/76uMToub1z3S1PNt58EnsE=
> >=Hivd
> >-----END PGP SIGNATURE-----
> >
> >
> >
> 

> From 3e376f7d22a62c19491d830c34182f2f4828f0a3 Mon Sep 17 00:00:00 2001
> From: Florian Paul Schmidt <mista.tapas@gmx.net>
> Date: Fri, 4 Dec 2015 23:37:13 +0100
> Subject: [PATCH] guix-daemon: cache more failures if requested
> 
> ---
>  nix/libstore/build.cc         |  8 ++++++++
>  nix/libstore/globals.cc       |  4 ++++
>  nix/libstore/globals.hh       |  6 ++++++
>  nix/nix-daemon/guix-daemon.cc | 12 ++++++++++++
>  4 files changed, 30 insertions(+)
> 
> diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc
> index efe1ab2..48936f9 100644
> --- a/nix/libstore/build.cc
> +++ b/nix/libstore/build.cc
> @@ -1483,12 +1483,20 @@ void DerivationGoal::buildDone()
>              if (settings.printBuildTrace)
>                  printMsg(lvlError, format("@ build-failed %1% - timeout") % drvPath);
>              worker.timedOut = true;
> +
> +            if (settings.cacheFailure && settings.cacheTimeoutFailure)
> +                foreach (DerivationOutputs::iterator, i, drv.outputs)
> +                    worker.store.registerFailedPath(i->second.path);
>          }
>  
>          else if (hook && (!WIFEXITED(status) || WEXITSTATUS(status) != 100)) {
>              if (settings.printBuildTrace)
>                  printMsg(lvlError, format("@ hook-failed %1% - %2% %3%")
>                      % drvPath % status % e.msg());
> +
> +            if (settings.cacheFailure && settings.cacheHookFailure)
> +                foreach (DerivationOutputs::iterator, i, drv.outputs)
> +                    worker.store.registerFailedPath(i->second.path);
>          }
>  
>          else {
> diff --git a/nix/libstore/globals.cc b/nix/libstore/globals.cc
> index 07f23d4..7829c1c 100644
> --- a/nix/libstore/globals.cc
> +++ b/nix/libstore/globals.cc
> @@ -48,6 +48,8 @@ Settings::Settings()
>      compressLog = true;
>      maxLogSize = 0;
>      cacheFailure = false;
> +    cacheTimeoutFailure = false;
> +    cacheHookFailure = false;
>      pollInterval = 5;
>      checkRootReachability = false;
>      gcKeepOutputs = false;
> @@ -158,6 +160,8 @@ void Settings::update()
>      _get(compressLog, "build-compress-log");
>      _get(maxLogSize, "build-max-log-size");
>      _get(cacheFailure, "build-cache-failure");
> +    _get(cacheTimeoutFailure, "build-cache-timeout-failure");
> +    _get(cacheHookFailure, "build-cache-hook-failure");
>      _get(pollInterval, "build-poll-interval");
>      _get(checkRootReachability, "gc-check-reachability");
>      _get(gcKeepOutputs, "gc-keep-outputs");
> diff --git a/nix/libstore/globals.hh b/nix/libstore/globals.hh
> index c17e10d..bf8666a 100644
> --- a/nix/libstore/globals.hh
> +++ b/nix/libstore/globals.hh
> @@ -170,6 +170,12 @@ struct Settings {
>      /* Whether to cache build failures. */
>      bool cacheFailure;
>  
> +    /* Whether to cache timeout failures */
> +    bool cacheTimeoutFailure;
> +
> +    /* Whether to cache hook failures */
> +    bool cacheHookFailure;
> +  
>      /* How often (in seconds) to poll for locks. */
>      unsigned int pollInterval;
>  
> diff --git a/nix/nix-daemon/guix-daemon.cc b/nix/nix-daemon/guix-daemon.cc
> index 1934487..f613de9 100644
> --- a/nix/nix-daemon/guix-daemon.cc
> +++ b/nix/nix-daemon/guix-daemon.cc
> @@ -80,6 +80,8 @@ builds derivations on behalf of its clients.");
>  #define GUIX_OPT_NO_BUILD_HOOK 14
>  #define GUIX_OPT_GC_KEEP_OUTPUTS 15
>  #define GUIX_OPT_GC_KEEP_DERIVATIONS 16
> +#define GUIX_OPT_CACHE_TIMEOUT_FAILURES 17
> +#define GUIX_OPT_CACHE_HOOK_FAILURES 18
>  
>  static const struct argp_option options[] =
>    {
> @@ -104,6 +106,10 @@ static const struct argp_option options[] =
>        n_("do not use the 'build hook'") },
>      { "cache-failures", GUIX_OPT_CACHE_FAILURES, 0, 0,
>        n_("cache build failures") },
> +    { "cache-timeout-failures", GUIX_OPT_CACHE_TIMEOUT_FAILURES, 0, 0,
> +      n_("cache build failures due to timeouts (depends on cache-failures)") },
> +    { "cache-hook-failures", GUIX_OPT_CACHE_HOOK_FAILURES, 0, 0,
> +      n_("cache build failures due to hook failures (depends on cache-failures)") },
>      { "lose-logs", GUIX_OPT_LOSE_LOGS, 0, 0,
>        n_("do not keep build logs") },
>      { "disable-log-compression", GUIX_OPT_DISABLE_LOG_COMPRESSION, 0, 0,
> @@ -189,6 +195,12 @@ parse_opt (int key, char *arg, struct argp_state *state)
>      case GUIX_OPT_CACHE_FAILURES:
>        settings.cacheFailure = true;
>        break;
> +    case GUIX_OPT_CACHE_TIMEOUT_FAILURES:
> +      settings.cacheTimeoutFailure = true;
> +      break;
> +    case GUIX_OPT_CACHE_HOOK_FAILURES:
> +      settings.cacheHookFailure = true;
> +      break;
>      case GUIX_OPT_IMPERSONATE_LINUX_26:
>        settings.impersonateLinux26 = true;
>        break;
> -- 
> 2.5.0
> 
