From: Mark H Weaver
Subject: bug#34157: Hydra: mozjs-60 builds on x86_64 and i686 seemingly get stuck
Date: Mon, 21 Jan 2019 21:54:43 -0500
Message-ID: <87sgxlv1td.fsf@netris.org>
References: <87zhruuiv6.fsf@netris.org> <20190121153947.GD11658@macbook41>
In-Reply-To: <20190121153947.GD11658@macbook41> (Efraim Flashner's message of "Mon, 21 Jan 2019 17:39:47 +0200")
To: Efraim Flashner
Cc: 34157@debbugs.gnu.org

Efraim Flashner writes:

> On Mon, Jan 21, 2019 at 10:31:46AM -0500, Mark H Weaver wrote:
>> Yesterday on Hydra, I found both Intel mozjs-60 builds seemingly stuck
>> while exporting the source checkout to hydra.gnunet.org. One had been
>> going for ~22.5 hours, and the other for ~12 hours. I forcefully killed
>> them and restarted them. Now I see the same thing has happened on the
>> second attempt.
>> Both builds have been seemingly stuck like this for
>> about 19 hours:
>>
>>   https://hydra.gnu.org/build/3342528
>>   https://hydra.gnu.org/build/3343511
>>
>> In both cases, the build logs are empty, and the hydra log ends with:
>>
>>   sending 1 store item to 'hydra.gnunet.org'...
>>   exporting path `/gnu/store/j2sz7dg35vkcz38sim71jll2ix1nk554-mozjs-60.2.3-2-checkout'
>>
>> Of course, it's possible that they're not really stuck, but that they're
>> merely taking a ridiculously long time to send the source checkout to
>> the build slave. My personal checkout of the mozilla-esr60 branch,
>> without the .hg directory, is about 2.1 gigabytes.
>>
>> What do you think?
>>
>>       Mark
>>
> 12 hours is far too long for it to tie up a build slave, sending code or
> not.

Those two builds are still occupying build slots. As I write this,
they've been running for over 30 hours.

I was curious whether the transfers were actually happening, even if
slowly, so I looked at 'netstat' output:

--8<---------------cut here---------------start------------->8---
root@20121227-hydra:~# netstat --inet --program | grep net.in.tum
tcp        0      0 20121227-hydra.gn:58007 hydra.net.in.tum.de:ssh ESTABLISHED 18774/guile
tcp        0      0 20121227-hydra.gn:42586 hydra.net.in.tum.de:ssh ESTABLISHED 10042/guile
tcp        0      0 20121227-hydra.gn:56413 hydra.net.in.tum.de:ssh ESTABLISHED 16236/guile
--8<---------------cut here---------------end--------------->8---

There are currently three builds allocated to hydra.gnunet.org
(a.k.a. hydra.net.in.tum), so it appears that all three ssh connections
are still active. However, even after repeating this command many
times, I've never seen a non-zero "Send-Q" value. This suggests that no
data is actually being sent, but that it's stuck waiting for something.

I'll leave these builds alone for now, in case Ludovic wants to
investigate further.

> Being silent that long doesn't trigger the auto-kill?

I guess that the usual timeouts do not apply to file transfers performed
before the actual build takes place.

      Thanks,
        Mark