unofficial mirror of guix-devel@gnu.org 
From: Phil Beadling <phil@beadling.co.uk>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: "guix-devel@gnu.org" <guix-devel@gnu.org>
Subject: Re: Parallel guix builds can trample?
Date: Mon, 17 Jan 2022 17:23:24 +0000
Message-ID: <CAOvsyQt0z-Q2AjHZEZu4r_dmMNq3VGN5Sgat5FYzRDBbnS+-Hw@mail.gmail.com>
In-Reply-To: <871r1dguq6.fsf@beadling.co.uk>


Hi Ricardo, all,



I think we’ve worked out what the issue is, and have a proposed workaround,
and perhaps a case for solving the problem in Guix itself (depending on
what you people think!).



The issue is that despite each build being performed in its own isolated
container, these containers are fed by the same per-user cached source
directory.  In the case where *different* versions of the *same* repo are
built at once, this results in a race condition.



In our case we have one Linux account that does a lot of automated Guix
builds for us.



For example, this account watches our source control and automatically
rebuilds all outstanding Pull Requests (PRs) on a repo after a separate
successful merge to our integration branch.  PRs are uniquely identified by
monotonically increasing PR numbers, e.g. PR-1, PR-2, PR-3 and so on.  Each
is a different branch of the same repo with slightly different candidate
changes in it.  They are automatically kept up to date with the integration
branch.



To do this, our watcher fires off dozens of guix builds (near)
simultaneously, each with its own local channel customized for the PR it is
building.  Doing them in parallel is important to make the system usably
responsive.



Each fired process does this:

   - Clone the channel containing the package into a local directory
   - Modify the commit id of the package to the new merged head of the PR
   - Modify the package version to some dummy version containing the PR
   number
   - Build the modified package using the local channel
   - Report the result (the build is effectively discarded; it is never
   used for anything)
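
The two "modify" steps above amount to a text substitution on the package
file.  A minimal sketch, assuming the package definition keeps
(version "...") and (commit "...") each on a single line; the function name
retarget_package and the sample values are hypothetical, and GNU sed is
assumed for the -i option:

```shell
#!/bin/sh
# Hypothetical helper: point a package file at a given PR head.
# Usage: retarget_package FILE VERSION COMMIT
# VERSION and COMMIT must not contain '|' or '&' (they are used in a sed
# replacement).
retarget_package() {
  # Replace the quoted string on every line that contains
  # (version "...") or (commit "...").
  sed -i -E \
    -e "s|\(version \"[^\"]*\"\)|(version \"$2\")|" \
    -e "s|\(commit \"[^\"]*\"\)|(commit \"$3\")|" \
    "$1"
}

# Demo on a throwaway file shaped like the package definition.
demo=$(mktemp)
cat > "$demo" <<'EOF'
(define-public my-package
  (package
    (name "my-package")
    (version "1.0")
    (source
      (git-checkout
        (url "ssh://same@repo:7999/same/repo.git")
        (commit "0000000")
EOF
retarget_package "$demo" "0.0.0-pr-1" "abc1234"
cat "$demo"
```

Only the version and commit lines change; the URL stays the same, which is
why every PR build ends up sharing one cached clone.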



What we think is happening is the following:



   - For each build that is kicked off in quick succession, the local cache
   of the repo has to be updated by *update-cached-checkout*
      -
      https://github.com/guix-mirror/guix/blob/9f526f5dad5f4af69d158c50369e182305147f3b/guix/git.scm#L476
      -
      https://github.com/guix-mirror/guix/blob/9f526f5dad5f4af69d158c50369e182305147f3b/guix/git.scm#L279
   - The problem is that every version uses the same cached repo: before
   one build has a chance to take a copy of the updated checkout, that
   checkout can be changed by a separate build process



Thus there is a race condition in this scenario.  We can provide a longer
test script to demo this if required – it’s quite straightforward to
reproduce with just a bash script, now that we know what is causing it.
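
To make the interleaving concrete, here is a small simulation of the race.
It involves no Guix at all: a single file stands in for the shared cached
checkout, and the names simulate_build, commit-A and commit-B are made up
for the illustration.  The sleeps force the ordering that happens by chance
in the real builds.

```shell
#!/bin/sh
# Simulation only: no Guix involved. A single file stands in for the
# per-user cached checkout that every build of the same repo shares.
work=$(mktemp -d)
shared="$work/cached-checkout"

# simulate_build COMMIT DELAY
# "Update" the shared checkout to COMMIT, wait DELAY seconds, then copy
# whatever the shared checkout holds at that moment as the build's input.
simulate_build() {
  echo "$1" > "$shared"
  sleep "$2"
  cp "$shared" "$work/input-$1"
}

simulate_build commit-A 2 &   # A updates the cache, then is slow to copy it
sleep 1
simulate_build commit-B 0 &   # B updates the same cache in the meantime
wait

saw_a=$(cat "$work/input-commit-A")
saw_b=$(cat "$work/input-commit-B")
echo "build for commit-A received: $saw_a"
echo "build for commit-B received: $saw_b"
rm -rf "$work"
```

With these delays the build for commit-A receives commit-B's content, which
matches the symptom of one PR's source turning up in another PR's build.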



Our workaround has been to change XDG_CACHE_HOME for each PR build we do.
This is a bit unsatisfactory because it affects processes beyond Guix and so
casts too wide a net, but it does resolve the problem for the time being.
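
A rough sketch of this kind of workaround, scoped as narrowly as the shell
allows: the variable is set only for the one command being run, not for the
calling shell.  The helper name with_private_cache is hypothetical, and the
guix invocation in the comment simply mirrors the one used in this thread.

```shell
#!/bin/sh
# with_private_cache COMMAND [ARGS...]
# Run COMMAND with XDG_CACHE_HOME pointing at a fresh temporary directory,
# then delete that directory. The variable is set only in the environment
# of COMMAND (and its children), not in the calling shell.
with_private_cache() {
  wpc_dir=$(mktemp -d) || return 1
  XDG_CACHE_HOME="$wpc_dir" "$@"
  wpc_status=$?
  rm -rf "$wpc_dir"
  return "$wpc_status"
}

# Intended use (not run here), one call per PR directory:
#   with_private_cache guix build -K -L pr-1 my-package
```

Anything the command itself spawns still sees the changed variable, and
every build starts from an empty cache, so each one pays for a fresh clone.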



Do people think this is enough of an issue to make a switch available in
Guix to prevent sharing of cached clones?  This would be easy enough to
implement – a crude solution would be that each cache directory name would
simply be generated using a SHA of a string which includes the PID or
similar to ensure a unique name, and because it is never going to be reused
it could be deleted immediately after the build.
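
As an illustration of the naming idea only (this is a shell sketch, not how
guix/git.scm names its cache directories today), hashing the repo URL
together with a per-process token gives a name that concurrent builds will
not share.  The function name unique_cache_name and the PIDs are made up,
and sha256sum from coreutils is assumed.

```shell
#!/bin/sh
# unique_cache_name URL PID
# Illustration only: derive a directory name from the repository URL plus
# a per-process token, so that builds with different tokens get different
# names.
unique_cache_name() {
  printf '%s\n%s' "$1" "$2" | sha256sum | cut -d ' ' -f 1
}

url="ssh://same@repo:7999/same/repo.git"
name_a=$(unique_cache_name "$url" 1001)   # 1001 and 1002 are made-up PIDs
name_b=$(unique_cache_name "$url" 1002)
echo "$name_a"
echo "$name_b"
```

The same URL yields two different 64-character names, one per process.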



This is unlikely to happen at the console, but as people script guix
build use-cases to fit their own problems (in particular building lots of
variations of a single piece of software), I can see it causing a
headache.  I think the manual should at least make it clear that you cannot
build 2 packages referencing the same repo at the same time as the same
user (unless I’ve missed this bit, I don’t think it’s made explicitly
clear?).  An even simpler change would be to introduce a lock file that
refuses the 2nd build, which would at least prevent the race condition and
preserve referential transparency; or, simpler still, just to print a
warning on stderr?
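
The lock-file variant could look roughly like this from the outside.  It is
a wrapper-script sketch rather than a patch to Guix: the helper name
with_repo_lock, the lock path and the exit status 75 are all arbitrary
choices made for the illustration.

```shell
#!/bin/sh
# with_repo_lock LOCKDIR COMMAND [ARGS...]
# Run COMMAND only if LOCKDIR can be created; mkdir is atomic, so at most
# one caller holds the lock. A second caller gets a warning on stderr and
# exit status 75 (an arbitrary choice).
with_repo_lock() {
  wrl_lock=$1
  shift
  if ! mkdir "$wrl_lock" 2>/dev/null; then
    echo "warning: another build is using this cached checkout; refusing to start" >&2
    return 75
  fi
  "$@"
  wrl_status=$?
  rmdir "$wrl_lock"
  return "$wrl_status"
}

# Intended use (not run here):
#   with_repo_lock /tmp/same-repo.lock guix build -L pr-1 my-package
```

This serialises builds of the same repo rather than letting them run in
parallel, so it trades throughput for correctness; a build that is killed
would also leave a stale lock behind.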



If people are amenable to adding a switch or other config option, we’d be
happy to look at writing the patch.


Any thoughts/comments/advice?


Cheers!
Phil.


On Wed, 12 Jan 2022 at 09:37, Phil <phil@beadling.co.uk> wrote:

> Hi - more details below.
>
> Ricardo Wurmus writes:
>
> >
> > How are you using Guix with this?  Do you generate Guix package
> > expressions?  Do you use “guix build --with-commit”?
> >
>
> The situation is like this - if we had a directory of clones of my
> channel:
>
> - pr-1
> - pr-2
> - pr-3
> - pr-4
> ... and so on
>
> Initially all the clones are taken from the master branch of my
> channel and are all identical - but we change the version and commit to
> match the head of each PR branch as per below.
>
> Each clone looks like this:
> - pr-1
>       - my-package.scm
> - pr-2
>       - my-package.scm
> and so on....
>
> Each my-package.scm has a package like below - the initial packages are all
> identical, but my system effectively seds the version and commit values
> like the below.  These values are never committed back to master; they
> are used only as local channels to build each PR, to test that each build
> still passes.
>
> (define-public my-package
>   (package
>     (name "my-package")
>     (version "this-is-different-for-each-pr")  ;; replace master version
>     (source
>       (git-checkout
>         (url "ssh://same@repo:7999/same/repo.git")
>         (commit "this-is-different-for-each-pr") ;; replace master version
> everything else remains the same in the package....
>
>
> At this point we have lots of local channels referencing different
> commits in the same package, ready to build - so I spawn them all
> simultaneously - the equivalent pseudo-shell that I will mock up today
> would be:
>
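> # Map each PR directory to the PID of its background build.
> declare -A PIDS
>
> for dir in pr-*; do
>   # Redirect stdout and stderr to a per-PR log; note the ampersand.
>   guix build -K -L "${dir}" my-package > "/tmp/${dir}.log" 2>&1 &
>   PIDS[${dir}]=$!
> done
>
> # Collect each build's exit status and report any failure.
> for dir in "${!PIDS[@]}"; do
>   if ! wait "${PIDS[${dir}]}"; then
>     echo "build of ${dir} failed"
>   fi
> done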
>
> What I'm seeing occasionally is that the logs and return code for, say,
> directory pr-1 are appearing in the guix build for pr-3 or pr-6 instead.
>
> We know this because the code is different enough in pr-1 that its logs
> are unique across all the PRs.  If the build fails, we can also check the
> source code kept by --keep-failed to show it doesn't match the commit id
> in the package used to build it.
>
> Hopefully that makes sense?  I can post the actual shell script once
> I've written the mock.
>


Thread overview: 12+ messages
2022-01-11 21:26 Parallel guix builds can trample? Phil
2022-01-11 23:16 ` Ricardo Wurmus
2022-01-12  7:00   ` Philip Beadling
2022-01-12  8:27     ` Ricardo Wurmus
2022-01-12  9:37       ` Phil
2022-01-17 17:23         ` Phil Beadling [this message]
2022-01-17 17:44           ` Maxime Devos
2022-01-18  9:28             ` Phil
2022-01-18  9:36               ` Maxime Devos
2022-01-18 14:59             ` Ludovic Courtès
2022-01-18 10:10       ` Phil
2022-01-18 12:53         ` Phil Beadling
