* bug#36754: New linux-libre failed to build on armhf on Berlin
@ 2019-07-21 23:56 Mark H Weaver
2019-07-22 16:10 ` Ricardo Wurmus
2019-07-22 17:13 ` Mark H Weaver
0 siblings, 2 replies; 17+ messages in thread
From: Mark H Weaver @ 2019-07-21 23:56 UTC (permalink / raw)
To: 36754
In commit 1ad9c105c208caa9059924cbfbe4759c8101f6c9, I changed our
linux-libre packages to deblob the linux-libre source tarballs
ourselves, i.e. to run the deblobbing scripts provided by the
linux-libre project to produce linux-libre source tarballs from the
upstream linux tarballs:
https://git.savannah.gnu.org/cgit/guix.git/commit/?id=1ad9c105c208caa9059924cbfbe4759c8101f6c9
The following queries show that the updated packages built successfully
on x86_64, i686, and aarch64, but they all failed on armhf:
https://ci.guix.gnu.org/search?query=linux-libre-5.2.2
https://ci.guix.gnu.org/search?query=linux-libre-4.19.60
https://ci.guix.gnu.org/search?query=linux-libre-4.14.134
https://ci.guix.gnu.org/search?query=linux-libre-4.9.186
https://ci.guix.gnu.org/search?query=linux-libre-4.4.186
https://ci.guix.gnu.org/search?query=linux-libre-arm-veyron-5.2.2
https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-5.2.2
https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-4.19.60
https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-4.14.134
https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-5.2.2
https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-4.19.60
https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-4.14.134
Unfortunately, I'm unable to get *any* information about what went wrong
from Cuirass. None of the failed builds have associated log files, and
the build details page has no useful information either. For example:
https://ci.guix.gnu.org/build/1488517/details
My first guess was that something went wrong in the 'computed' origin
that runs the deblobbing script. However, that's apparently not the
case, because all of the updated 'linux-libre-headers' packages built
successfully on armhf, and those use the same source tarballs as the
main 'linux-libre' packages.
https://ci.guix.gnu.org/search?query=linux-libre-headers-5.2.2
https://ci.guix.gnu.org/search?query=linux-libre-headers-4.19.60
https://ci.guix.gnu.org/search?query=linux-libre-headers-4.14.134
Can someone help me find out what's going on here? Until then, I'm
sorry to say that armhf-linux users will be unable to update their
systems.
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: New linux-libre failed to build on armhf on Berlin
2019-07-21 23:56 bug#36754: New linux-libre failed to build on armhf on Berlin Mark H Weaver
@ 2019-07-22 16:10 ` Ricardo Wurmus
2019-07-22 17:13 ` Mark H Weaver
1 sibling, 0 replies; 17+ messages in thread
From: Ricardo Wurmus @ 2019-07-22 16:10 UTC (permalink / raw)
To: mhw; +Cc: 36754
Mark H Weaver <mhw@netris.org> writes:
> Unfortunately, I'm unable to get *any* information about what went wrong
> from Cuirass. None of the failed builds have associated log files, and
> the build details page has no useful information either. For example:
>
> https://ci.guix.gnu.org/build/1488517/details
On that page I see a link to the build log, but it appears to be
truncated:
https://ci.guix.gnu.org/log/33hv7mij9bqqgf5hqwrw14106z9zgav9-linux-libre-5.2.2
Maybe the build node died before the build could be completed?
--
Ricardo
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: New linux-libre failed to build on armhf on Berlin
2019-07-21 23:56 bug#36754: New linux-libre failed to build on armhf on Berlin Mark H Weaver
2019-07-22 16:10 ` Ricardo Wurmus
@ 2019-07-22 17:13 ` Mark H Weaver
2019-07-23 16:46 ` Marius Bakke
1 sibling, 1 reply; 17+ messages in thread
From: Mark H Weaver @ 2019-07-22 17:13 UTC (permalink / raw)
To: Ricardo Wurmus; +Cc: 36754
Hi Ricardo,
Interesting. I distinctly remember that there was no log file when I
looked last time. Hmm.
Anyway, it seems that now, all of the failed builds have either build
logs available or else information about which dependency failed. I
don't remember seeing any of this last time, but I'm glad to see it now.
A pattern has now emerged, but I don't know what it means. All of the
armhf kernel builds failed except for linux-libre-arm-veyron-5.2.2,
which succeeded:
https://ci.guix.gnu.org/build/1488502/details (arm-veyron-5.2.2)
Apart from this anomalous success, all of the armhf 5.2.2 and 4.19.60
builds have truncated log files:
https://ci.guix.gnu.org/build/1488517/details (5.2.2)
https://ci.guix.gnu.org/build/1488503/details (4.19.60)
https://ci.guix.gnu.org/build/1488513/details (arm-generic-5.2.2)
https://ci.guix.gnu.org/build/1488519/details (arm-generic-4.19.60)
https://ci.guix.gnu.org/build/1488504/details (arm-omap2plus-5.2.2)
https://ci.guix.gnu.org/build/1488501/details (arm-omap2plus-4.19.60)
This pattern seems too regular to be a coincidence. Can we find out
which build machines were used for these builds?
All of the 4.14.134 builds failed in the deblobbing step, due to timeout
(1 hour of silence) while packing the linux-libre tarball:
https://ci.guix.gnu.org/build/1488514/details (4.14.134)
https://ci.guix.gnu.org/build/1488515/details (arm-generic-4.14.134)
https://ci.guix.gnu.org/build/1488512/details (arm-omap2plus-4.14.134)
I'm not sure how to deal with this. This is a computed origin, not a
normal package, and so I don't see a way to configure a longer timeout.
Perhaps I should make the tarball packing and unpacking operations
verbose, to work around the issue. Of course that's our usual practice,
but I find it suboptimal because any warnings will be buried in a
mountain of uninteresting output.
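A minimal sketch of how a silence timeout of this kind behaves, and why
verbose output works around it. This is not Guix's actual
implementation; the helper name run_with_silence_timeout and its
return values are hypothetical, chosen only for illustration. The
child process is killed once it has printed nothing for max_silence
seconds, so a long but chatty step survives while a long, quiet one
(such as packing a large tarball without verbose output) does not.

```python
import queue
import subprocess
import sys
import threading


def run_with_silence_timeout(cmd, max_silence):
    """Run cmd and kill it if it prints nothing for max_silence seconds.

    Hypothetical helper: returns (status, lines), where status is
    "finished" or "silence-timeout".
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    lines_q = queue.Queue()

    def pump():
        # Forward each output line to the main thread as it arrives.
        for line in proc.stdout:
            lines_q.put(line)
        lines_q.put(None)  # end-of-output marker

    threading.Thread(target=pump, daemon=True).start()

    lines = []
    while True:
        try:
            # The clock restarts on every line, so only *silence* counts.
            line = lines_q.get(timeout=max_silence)
        except queue.Empty:
            proc.kill()
            proc.wait()
            return "silence-timeout", lines
        if line is None:
            proc.wait()
            return "finished", lines
        lines.append(line.rstrip("\n"))


if __name__ == "__main__":
    quiet = [sys.executable, "-c", "import time; time.sleep(3)"]
    print(run_with_silence_timeout(quiet, 0.5)[0])
```

With the 1 hour limit mentioned above, the same logic applies at a
larger scale: total run time is irrelevant, only the longest gap
between two lines of output matters.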
Thoughts? Anyway, thanks for looking into it.
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: New linux-libre failed to build on armhf on Berlin
2019-07-22 17:13 ` Mark H Weaver
@ 2019-07-23 16:46 ` Marius Bakke
2019-07-23 17:33 ` bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds (was: New linux-libre failed to build on armhf on Berlin) Mark H Weaver
0 siblings, 1 reply; 17+ messages in thread
From: Marius Bakke @ 2019-07-23 16:46 UTC (permalink / raw)
To: Mark H Weaver, Ricardo Wurmus; +Cc: 36754
Mark H Weaver <mhw@netris.org> writes:
> Hi Ricardo,
>
> Interesting. I distinctly remember that there was no log file when I
> looked last time. Hmm.
>
> Anyway, it seems that now, all of the failed builds have either build
> logs available or else information about which dependency failed. I
> don't remember seeing any of this last time, but I'm glad to see it now.
>
> A pattern has now emerged, but I don't know what it means. All of the
> armhf kernel builds failed except for linux-libre-arm-veyron-5.2.2,
> which succeeded:
>
> https://ci.guix.gnu.org/build/1488502/details (arm-veyron-5.2.2)
>
> Apart from this anomalous success, all of the armhf 5.2.2 and 4.19.60
> builds have truncated log files:
>
> https://ci.guix.gnu.org/build/1488517/details (5.2.2)
> https://ci.guix.gnu.org/build/1488503/details (4.19.60)
> https://ci.guix.gnu.org/build/1488513/details (arm-generic-5.2.2)
> https://ci.guix.gnu.org/build/1488519/details (arm-generic-4.19.60)
> https://ci.guix.gnu.org/build/1488504/details (arm-omap2plus-5.2.2)
> https://ci.guix.gnu.org/build/1488501/details (arm-omap2plus-4.19.60)
>
> This pattern seems too regular to be a coincidence. Can we find out
> which build machines were used for these builds?
I tried building 5.2.2 'interactively' on Berlin, and got an SSH error:
CC [M] net/openvswitch/vport-geneve.o
CC [M] net/openvswitch/vport-gre.o
LD [M] net/openvswitch/openvswitch.o
;;; [2019/07/23 05:14:53.501502, 0] read_from_channel_port: [GSSH ERROR] Error reading from the channel: #<input-output: channel (closed) 14c0e60>
Backtrace:
16 (apply-smob/1 #<catch-closure b79640>)
In ice-9/boot-9.scm:
705:2 15 (call-with-prompt _ _ #<procedure default-prompt-handle…>)
In ice-9/eval.scm:
619:8 14 (_ #(#(#<directory (guile-user) bfb140>)))
In guix/ui.scm:
1747:12 13 (run-guix-command _ . _)
In guix/scripts/offload.scm:
781:22 12 (guix-offload . _)
In ice-9/boot-9.scm:
829:9 11 (catch _ _ #<procedure 7f576678d910 at guix/ui.scm:703…> …)
829:9 10 (catch _ _ #<procedure 7f576678d928 at guix/ui.scm:826…> …)
In guix/scripts/offload.scm:
580:19 9 (process-request _ _ _ _ #:print-build-trace? _ # _ # _)
531:6 8 (call-with-timeout _ _ _)
361:2 7 (transfer-and-offload #<derivation /gnu/store/yfns7ga4…> …)
In ice-9/boot-9.scm:
829:9 6 (catch _ _ #<procedure dbdab0 at guix/scripts/offload.…> …)
In guix/scripts/offload.scm:
385:6 5 (_)
In guix/store.scm:
1203:15 4 (_ #<store-connection 256.99 19a0ba0> _ _)
692:11 3 (process-stderr #<store-connection 256.99 19a0ba0> _)
In guix/serialization.scm:
87:11 2 (read-int _)
73:12 1 (get-bytevector-n* #<input-output: channel (closed) 14…> …)
In unknown file:
0 (get-bytevector-n #<input-output: channel (closed) 14c…> …)
ERROR: In procedure get-bytevector-n:
Throw to key `guile-ssh-error' with args `("read_from_channel_port" "Error reading from the channel" #<input-output: channel (closed) 14c0e60> #f)'.
guix build: error: build of `/gnu/store/yfns7ga468vmv9jn72snk79b16p8mhfa-linux-libre-5.2.2.drv' failed
real 637m24.906s
user 0m6.661s
sys 0m0.897s
Unfortunately I failed to record which machine was used and don't know a
way to find out after the fact.
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds (was: New linux-libre failed to build on armhf on Berlin)
2019-07-23 16:46 ` Marius Bakke
@ 2019-07-23 17:33 ` Mark H Weaver
2019-07-23 17:49 ` Mark H Weaver
0 siblings, 1 reply; 17+ messages in thread
From: Mark H Weaver @ 2019-07-23 17:33 UTC (permalink / raw)
To: Marius Bakke; +Cc: 36754
retitle 36754 SSH connections to hydra-slave{1,2,3} fail during builds
thanks
Hi,
I've added Ludovic to the CC list, since he recently added
hydra-slave{1,2,3} to Berlin.
Marius wrote:
> I tried building 5.2.2 'interactively' on Berlin, and got an SSH error:
>
> CC [M] net/openvswitch/vport-geneve.o
> CC [M] net/openvswitch/vport-gre.o
> LD [M] net/openvswitch/openvswitch.o
> ;;; [2019/07/23 05:14:53.501502, 0] read_from_channel_port: [GSSH ERROR] Error reading from the channel: #<input-output: channel (closed) 14c0e60>
> Backtrace:
> 16 (apply-smob/1 #<catch-closure b79640>)
> In ice-9/boot-9.scm:
> 705:2 15 (call-with-prompt _ _ #<procedure default-prompt-handle…>)
> In ice-9/eval.scm:
> 619:8 14 (_ #(#(#<directory (guile-user) bfb140>)))
> In guix/ui.scm:
> 1747:12 13 (run-guix-command _ . _)
> In guix/scripts/offload.scm:
> 781:22 12 (guix-offload . _)
> In ice-9/boot-9.scm:
> 829:9 11 (catch _ _ #<procedure 7f576678d910 at guix/ui.scm:703…> …)
> 829:9 10 (catch _ _ #<procedure 7f576678d928 at guix/ui.scm:826…> …)
> In guix/scripts/offload.scm:
> 580:19 9 (process-request _ _ _ _ #:print-build-trace? _ # _ # _)
> 531:6 8 (call-with-timeout _ _ _)
> 361:2 7 (transfer-and-offload #<derivation /gnu/store/yfns7ga4…> …)
> In ice-9/boot-9.scm:
> 829:9 6 (catch _ _ #<procedure dbdab0 at guix/scripts/offload.…> …)
> In guix/scripts/offload.scm:
> 385:6 5 (_)
> In guix/store.scm:
> 1203:15 4 (_ #<store-connection 256.99 19a0ba0> _ _)
> 692:11 3 (process-stderr #<store-connection 256.99 19a0ba0> _)
> In guix/serialization.scm:
> 87:11 2 (read-int _)
> 73:12 1 (get-bytevector-n* #<input-output: channel (closed) 14…> …)
> In unknown file:
> 0 (get-bytevector-n #<input-output: channel (closed) 14c…> …)
>
> ERROR: In procedure get-bytevector-n:
> Throw to key `guile-ssh-error' with args `("read_from_channel_port" "Error reading from the channel" #<input-output: channel (closed) 14c0e60> #f)'.
> guix build: error: build of `/gnu/store/yfns7ga468vmv9jn72snk79b16p8mhfa-linux-libre-5.2.2.drv' failed
>
> real 637m24.906s
> user 0m6.661s
> sys 0m0.897s
Thank you, this is helpful.
> Unfortunately I failed to record which machine was used and don't know a
> way to find out after the fact.
I believe it was hydra-slave2, one of the three armhf machines that I
host which were formerly part of hydra.gnu.org's build farm and were
recently added to Berlin by Ludovic. I checked hydra-slave{1,2,3} for
build log files corresponding to the derivation above, and found that
all three of them have been attempted recently:
hydra-slave2 attempted to build it on July 23 08:07 UTC.
hydra-slave3 attempted to build it on July 22 16:40 UTC.
hydra-slave1 attempted to build it on July 22 04:44 UTC.
To be precise, each of those dates corresponds to the end of the build
attempt. All three build logs are truncated on the build machine as
well, with no error message at the end.
I now believe that these failures are related to the newly added armhf
build slaves, and that they have nothing to do with the recent changes
to our linux-libre packages.
Well, except for the silence timeout that sometimes happens on slower
machines while deblobbing linux-libre. That's a separate issue.
Thanks,
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds (was: New linux-libre failed to build on armhf on Berlin)
2019-07-23 17:33 ` bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds (was: New linux-libre failed to build on armhf on Berlin) Mark H Weaver
@ 2019-07-23 17:49 ` Mark H Weaver
2019-07-23 21:26 ` bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds Ludovic Courtès
0 siblings, 1 reply; 17+ messages in thread
From: Mark H Weaver @ 2019-07-23 17:49 UTC (permalink / raw)
To: Marius Bakke; +Cc: 36754
I wrote earlier:
> I now believe that these failures are related to the newly added armhf
> build slaves, and that they have nothing to do with the recent changes
> to our linux-libre packages.
I should mention that the armhf build slaves are on a private network,
and I use my public-facing internet server to forward TCP connections to
them, using the following entries in /etc/inetd.conf:
--8<---------------cut here---------------start------------->8---
# TCP-level forwards for SSH connections to build machines for the GNU
# Guix build farm:
7275 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.11 7275
7276 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.12 7276
7274 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.13 7274
--8<---------------cut here---------------end--------------->8---
It's possible that this arrangement is somehow part of the problem.
However, note that nothing has changed here in several years, and it
worked fine on hydra.gnu.org. The build slaves were running a *very*
old version of Guix though. It seems likely that the new Guile-SSH code
doesn't cope well with this setup.
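To make the forwarding arrangement easier to audit, here is a small
sketch that reads inetd.conf-style entries like the ones above and
lists which public port is relayed to which private host. It assumes
only the standard inetd.conf field order (service, socket type,
protocol, wait flag, user, server program, server arguments) and the
"nc -w SECONDS HOST PORT" argument shape used above; the names Forward
and parse_forwards are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Forward:
    listen_port: int   # port inetd listens on
    target_host: str   # private address nc connects to
    target_port: int
    nc_timeout: int    # value passed to nc's -w option


def parse_forwards(conf_text):
    """Extract nc-based TCP forwards from inetd.conf-style text."""
    forwards = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Fields: service, socket type, protocol, wait flag, user,
        # server program, then the server's argument vector.
        fields = line.split()
        service, argv = fields[0], fields[6:]
        # Only handle entries shaped like: /bin/nc -w SECONDS HOST PORT
        if not service.isdigit() or len(argv) != 5 or argv[1] != "-w":
            continue
        forwards.append(
            Forward(int(service), argv[3], int(argv[4]), int(argv[2])))
    return forwards


CONF = """\
# TCP-level forwards for SSH connections to build machines for the GNU
# Guix build farm:
7275 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.11 7275
7276 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.12 7276
7274 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.13 7274
"""

if __name__ == "__main__":
    for fwd in parse_forwards(CONF):
        print(f"{fwd.listen_port} -> {fwd.target_host}:{fwd.target_port} "
              f"(nc -w {fwd.nc_timeout})")
```

Each forwarded SSH session therefore passes through one nc process
started by inetd. How the -w value is applied differs between netcat
implementations, so it may be worth checking when looking for the
cause of the dropped connections.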
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-07-23 17:49 ` Mark H Weaver
@ 2019-07-23 21:26 ` Ludovic Courtès
2019-07-23 21:55 ` Ricardo Wurmus
2019-07-24 11:09 ` Mark H Weaver
0 siblings, 2 replies; 17+ messages in thread
From: Ludovic Courtès @ 2019-07-23 21:26 UTC (permalink / raw)
To: Mark H Weaver; +Cc: 36754
Hi Mark,
Mark H Weaver <mhw@netris.org> skribis:
> I wrote earlier:
>> I now believe that these failures are related to the newly added armhf
>> build slaves, and that they have nothing to do with the recent changes
>> to our linux-libre packages.
>
> I should mention that the armhf build slaves are on a private network,
> and I use my public-facing internet server to forward TCP connections to
> them, using the following entries in /etc/inetd.conf:
>
> # TCP-level forwards for SSH connections to build machines for the GNU
> # Guix build farm:
> 7275 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.11 7275
> 7276 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.12 7276
> 7274 stream tcp nowait nobody /bin/nc /bin/nc -w 10 172.19.189.13 7274
>
> It's possible that this arrangement is somehow part of the problem.
> However, note that nothing has changed here in several years, and it
> worked fine on hydra.gnu.org. The build slaves were running a *very*
> old version of Guix though. It seems likely that the new Guile-SSH code
> doesn't cope well with this setup.
I noticed that connections to the machines were unstable (using
OpenSSH’s client). That is, the connection would eventually “hang”,
apparently several times a day.
Currently we have an SSH tunnel set up on berlin to connect to each of
these machines via overdrive1.guixsd.org. This setup proved to be
robust in the past (we used it to connect to another build machine), so
I suspect something’s wrong on “your” end of the network. It’s hard to
tell exactly what, though.
Ideas?
If it’s causing build failures, I’m afraid we’ll have to comment out
those machines from berlin’s machines.scm until we’ve figured it out.
Thanks,
Ludo’.
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-07-23 21:26 ` bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds Ludovic Courtès
@ 2019-07-23 21:55 ` Ricardo Wurmus
2019-08-01 15:39 ` Ricardo Wurmus
2019-07-24 11:09 ` Mark H Weaver
1 sibling, 1 reply; 17+ messages in thread
From: Ricardo Wurmus @ 2019-07-23 21:55 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: 36754
Ludovic Courtès <ludo@gnu.org> writes:
> Currently we have an SSH tunnel set up on berlin to connect to each of
> these machines via overdrive1.guixsd.org. This setup proved to be
> robust in the past (we used it to connect to another build machine), so
> I suspect something’s wrong on “your” end of the network. It’s hard to
> tell exactly what, though.
FWIW by the end of this week we should have the firewall changes
implemented so we can do without the SSH tunnel.
--
Ricardo
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-07-23 21:26 ` bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds Ludovic Courtès
2019-07-23 21:55 ` Ricardo Wurmus
@ 2019-07-24 11:09 ` Mark H Weaver
2019-07-24 14:56 ` Ludovic Courtès
1 sibling, 1 reply; 17+ messages in thread
From: Mark H Weaver @ 2019-07-24 11:09 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: 36754
Hi Ludovic,
Ludovic Courtès <ludo@gnu.org> wrote:
> I noticed that connections to the machines were unstable (using
> OpenSSH’s client). That is, the connection would eventually “hang”,
> apparently several times a day.
>
> Currently we have an SSH tunnel set up on berlin to connect to each of
> these machines via overdrive1.guixsd.org. This setup proved to be
> robust in the past (we used it to connect to another build machine), so
> I suspect something’s wrong on “your” end of the network. It’s hard to
> tell exactly what, though.
>
> Ideas?
Okay, I'll look into it. I'm very busy with something else for the next
couple of days, but I'll try to get to it in the next week.
> If it’s causing build failures, I’m afraid we’ll have to comment out
> those machines from berlin’s machines.scm until we’ve figured it out.
Agreed.
Thanks,
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-07-24 11:09 ` Mark H Weaver
@ 2019-07-24 14:56 ` Ludovic Courtès
2019-08-01 14:09 ` Marius Bakke
0 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2019-07-24 14:56 UTC (permalink / raw)
To: Mark H Weaver; +Cc: 36754
Hello,
Mark H Weaver <mhw@netris.org> skribis:
> Ludovic Courtès <ludo@gnu.org> wrote:
>> I noticed that connections to the machines were unstable (using
>> OpenSSH’s client). That is, the connection would eventually “hang”,
>> apparently several times a day.
>>
>> Currently we have an SSH tunnel set up on berlin to connect to each of
>> these machines via overdrive1.guixsd.org. This setup proved to be
>> robust in the past (we used it to connect to another build machine), so
>> I suspect something’s wrong on “your” end of the network. It’s hard to
>> tell exactly what, though.
>>
>> Ideas?
>
> Okay, I'll look into it. I'm very busy with something else for the next
> couple of days, but I'll try to get to it in the next week.
OK!
>> If it’s causing build failures, I’m afraid we’ll have to comment out
>> those machines from berlin’s machines.scm until we’ve figured it out.
>
> Agreed.
I’ve commented them out now.
Thanks,
Ludo’.
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-07-24 14:56 ` Ludovic Courtès
@ 2019-08-01 14:09 ` Marius Bakke
2019-08-01 16:37 ` Mark H Weaver
0 siblings, 1 reply; 17+ messages in thread
From: Marius Bakke @ 2019-08-01 14:09 UTC (permalink / raw)
To: Ludovic Courtès, Mark H Weaver; +Cc: 36754
The truncated log files seem to occur for other builds as well, even
within the Berlin data center.
https://ci.guix.gnu.org/log/n3ra1b8ic6qhfinnhb80mrn7snsqws9d-geocode-glib-3.26.0
https://ci.guix.gnu.org/log/zqhqlib00i8f7f10g4c2dfzprw16h4xv-scintilla-4.2.0
https://ci.guix.gnu.org/log/718jmbq94mvdgnmjyqgxgy7zaj8xzxk3-htslib-1.9
All of these builds are for i686-linux.
Mark: are the armhf nodes still operational? I would like to re-enable
them again, since we desperately need the computing power with four huge
branches going concurrently at the moment.
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-07-23 21:55 ` Ricardo Wurmus
@ 2019-08-01 15:39 ` Ricardo Wurmus
0 siblings, 0 replies; 17+ messages in thread
From: Ricardo Wurmus @ 2019-08-01 15:39 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: 36754
Ricardo Wurmus <rekado@elephly.net> writes:
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Currently we have an SSH tunnel set up on berlin to connect to each of
>> these machines via overdrive1.guixsd.org. This setup proved to be
>> robust in the past (we used it to connect to another build machine), so
>> I suspect something’s wrong on “your” end of the network. It’s hard to
>> tell exactly what, though.
>
> FWIW by the end of this week we should have the firewall changes
> implemented so we can do without the SSH tunnel.
The firewall changes have been applied today.
--
Ricardo
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-08-01 14:09 ` Marius Bakke
@ 2019-08-01 16:37 ` Mark H Weaver
2019-08-01 21:06 ` Ricardo Wurmus
0 siblings, 1 reply; 17+ messages in thread
From: Mark H Weaver @ 2019-08-01 16:37 UTC (permalink / raw)
To: Marius Bakke; +Cc: 36754
Hi Marius,
Marius Bakke <mbakke@fastmail.com> wrote:
> The truncated log files seem to occur for other builds as well, even
> within the Berlin data center.
>
> https://ci.guix.gnu.org/log/n3ra1b8ic6qhfinnhb80mrn7snsqws9d-geocode-glib-3.26.0
> https://ci.guix.gnu.org/log/zqhqlib00i8f7f10g4c2dfzprw16h4xv-scintilla-4.2.0
> https://ci.guix.gnu.org/log/718jmbq94mvdgnmjyqgxgy7zaj8xzxk3-htslib-1.9
>
> All of these builds are for i686-linux.
Thanks, that's very useful information.
> Mark: are the armhf nodes still operational?
I assume so. They all respond to pings anyway, and I haven't touched
them since before they were disconnected from Berlin. (I would need to
boot up my other, more secure computer to try SSHing into them).
> I would like to re-enable them again, since we desperately need the
> computing power with four huge branches going concurrently at the
> moment.
I have no objection, but since Ludovic made the decision to disconnect
them, it would be good to hear from him first.
Thanks,
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-08-01 16:37 ` Mark H Weaver
@ 2019-08-01 21:06 ` Ricardo Wurmus
2019-08-07 14:30 ` Ricardo Wurmus
0 siblings, 1 reply; 17+ messages in thread
From: Ricardo Wurmus @ 2019-08-01 21:06 UTC (permalink / raw)
To: Mark H Weaver; +Cc: 36754
Mark H Weaver <mhw@netris.org> writes:
>> Mark: are the armhf nodes still operational?
>
> I assume so. They all respond to pings anyway, and I haven't touched
> them since before they were disconnected from Berlin. (I would need to
> boot up my other, more secure computer to try SSHing into them).
>
>> I would like to re-enable them again, since we desperately need the
>> computing power with four huge branches going concurrently at the
>> moment.
>
> I have no objection, but since Ludovic made the decision to disconnect
> them, it would be good to hear from him first.
Now that we should be able to SSH to them directly from Berlin we can
try connecting and perhaps upgrading the guix-daemon on these machines.
--
Ricardo
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-08-01 21:06 ` Ricardo Wurmus
@ 2019-08-07 14:30 ` Ricardo Wurmus
2019-08-16 10:25 ` Ludovic Courtès
0 siblings, 1 reply; 17+ messages in thread
From: Ricardo Wurmus @ 2019-08-07 14:30 UTC (permalink / raw)
To: Mark H Weaver; +Cc: 36754
Ricardo Wurmus <rekado@elephly.net> writes:
> Mark H Weaver <mhw@netris.org> writes:
>
>>> Mark: are the armhf nodes still operational?
>>
>> I assume so. They all respond to pings anyway, and I haven't touched
>> them since before they were disconnected from Berlin. (I would need to
>> boot up my other, more secure computer to try SSHing into them).
>>
>>> I would like to re-enable them again, since we desperately need the
>>> computing power with four huge branches going concurrently at the
>>> moment.
>>
>> I have no objection, but since Ludovic made the decision to disconnect
>> them, it would be good to hear from him first.
>
> Now that we should be able to SSH to them directly from Berlin we can
> try connecting and perhaps upgrading the guix-daemon on these machines.
I have removed the SSH tunnel configuration from /etc/guix/machines.scm
and re-enabled the machines.
Let’s see if this makes any difference. If not we should try to upgrade
Guix on these build machines.
--
Ricardo
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-08-07 14:30 ` Ricardo Wurmus
@ 2019-08-16 10:25 ` Ludovic Courtès
2019-09-12 8:41 ` Ludovic Courtès
0 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2019-08-16 10:25 UTC (permalink / raw)
To: Ricardo Wurmus; +Cc: 36754
Hi,
Ricardo Wurmus <rekado@elephly.net> skribis:
> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Mark H Weaver <mhw@netris.org> writes:
>>
>>>> Mark: are the armhf nodes still operational?
>>>
>>> I assume so. They all respond to pings anyway, and I haven't touched
>>> them since before they were disconnected from Berlin. (I would need to
>>> boot up my other, more secure computer to try SSHing into them).
>>>
>>>> I would like to re-enable them again, since we desperately need the
>>>> computing power with four huge branches going concurrently at the
>>>> moment.
>>>
>>> I have no objection, but since Ludovic made the decision to disconnect
>>> them, it would be good to hear from him first.
>>
>> Now that we should be able to SSH to them directly from Berlin we can
>> try connecting and perhaps upgrading the guix-daemon on these machines.
>
> I have removed the SSH tunnel configuration from /etc/guix/machines.scm
> and re-enabled the machines.
>
> Let’s see if this makes any difference.
Is it working well now?
> If not we should try to upgrade Guix on these build machines.
I think there’s a misunderstanding: these machines used to run a very
old Guix but I installed 1.0 from scratch before migrating them to
berlin.
Thanks,
Ludo’.
^ permalink raw reply [flat|nested] 17+ messages in thread
* bug#36754: SSH connections to hydra-slave{1, 2, 3} fail during builds
2019-08-16 10:25 ` Ludovic Courtès
@ 2019-09-12 8:41 ` Ludovic Courtès
0 siblings, 0 replies; 17+ messages in thread
From: Ludovic Courtès @ 2019-09-12 8:41 UTC (permalink / raw)
To: Ricardo Wurmus; +Cc: 36754-done
Hello,
AFAICS we no longer have connection issues to
hydra-slave{1,2,3}.netris.org so I’m closing this bug.
Ludo’.
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2019-09-12 8:42 UTC | newest]