unofficial mirror of bug-guix@gnu.org 
* bug#67041: [cuirass] cuirass-web crash
@ 2023-11-10 13:59 Maxim Cournoyer
  2023-11-16 16:09 ` Ludovic Courtès
  0 siblings, 1 reply; 4+ messages in thread
From: Maxim Cournoyer @ 2023-11-10 13:59 UTC
  To: 67041; +Cc: guix-sysadmin

Hi,

Today ci.guix.gnu.org was returning a 504 Gateway Time-out error.
/var/log/cuirass-web.log on berlin shows this backtrace:

--8<---------------cut here---------------start------------->8---
2023-11-10 13:11:30 Uncaught exception in task:
2023-11-10 13:11:30 GET /build/18305/details
2023-11-10 13:11:30 In fibers.scm:
2023-11-10 13:11:30     172:8  4 (_)
2023-11-10 13:11:30 In web/server/fiberized.scm:
2023-11-10 13:11:30    187:12  3 (socket-loop #<input-output: socket 44> #<<channel> get?>)
2023-11-10 13:11:30 In ice-9/suspendable-ports.scm:
2023-11-10 13:11:30    733:12  2 (_ #<input-output: socket 44> _)
2023-11-10 13:11:30 In unknown file:
2023-11-10 13:11:30            1 (accept #<input-output: socket 44> 526336)
2023-11-10 13:11:30 In ice-9/boot-9.scm:
2023-11-10 13:11:30   1685:16  0 (raise-exception _ #:continuable? _)
2023-11-10 13:11:30 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-11-10 13:11:30 In procedure accept: Too many open files
--8<---------------cut here---------------end--------------->8---

Restarting it with 'sudo herd restart cuirass-web' resolved that for
now.
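
"Too many open files" means the process hit its limit on open file
descriptors.  As a rough sketch for checking this on a running process,
assuming a Linux /proc file system, the helpers below count the entries in
/proc/PID/fd and read the soft limit from /proc/PID/limits.  The function
names are made up for this example and are not part of Cuirass or the
Shepherd.

```shell
#!/bin/sh
# Count the open file descriptors of a process by listing /proc/PID/fd.
# Linux-specific; inspecting another user's process requires privileges.
count_fds() {
    pid="$1"
    ls "/proc/$pid/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit that applies to a process.
# In /proc/PID/limits the soft limit is the fourth whitespace-separated field.
fd_soft_limit() {
    pid="$1"
    awk '/^Max open files/ { print $4 }' "/proc/$pid/limits"
}

# Example: inspect the current shell itself.
echo "open: $(count_fds $$), soft limit: $(fd_soft_limit $$)"
```

To inspect the web process instead, pass its PID.  Something like
"pgrep -o -f 'cuirass web'" may find it, but that command-line pattern is an
assumption and should be checked against the actual process list.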

-- 
Thanks,
Maxim





* bug#67041: [cuirass] cuirass-web crash
  2023-11-10 13:59 bug#67041: [cuirass] cuirass-web crash Maxim Cournoyer
@ 2023-11-16 16:09 ` Ludovic Courtès
  2023-11-16 22:26   ` Ludovic Courtès
  0 siblings, 1 reply; 4+ messages in thread
From: Ludovic Courtès @ 2023-11-16 16:09 UTC
  To: Maxim Cournoyer; +Cc: 67041, guix-sysadmin

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> 2023-11-10 13:11:30 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> 2023-11-10 13:11:30 In procedure accept: Too many open files

Apparently something causes ‘cuirass web’ to leak file descriptors; the
count is otherwise stable at around 50:

--8<---------------cut here---------------start------------->8---
2023-11-08 06:53:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 07:03:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 07:13:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 07:23:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 07:33:23 heap: 54.97 MiB; threads: 9; file descriptors: 50
2023-11-08 07:43:23 heap: 54.97 MiB; threads: 9; file descriptors: 50
2023-11-08 07:53:23 heap: 54.97 MiB; threads: 9; file descriptors: 50
2023-11-08 08:03:23 heap: 54.97 MiB; threads: 9; file descriptors: 50
2023-11-08 08:13:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 08:23:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 08:33:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 08:43:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 08:53:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 09:03:23 heap: 61.34 MiB; threads: 9; file descriptors: 51
2023-11-08 09:13:23 heap: 61.34 MiB; threads: 9; file descriptors: 154
2023-11-08 09:23:23 heap: 61.34 MiB; threads: 9; file descriptors: 232
2023-11-08 09:33:23 heap: 61.34 MiB; threads: 9; file descriptors: 282
2023-11-08 09:43:23 heap: 61.34 MiB; threads: 9; file descriptors: 385
2023-11-08 09:53:23 heap: 61.34 MiB; threads: 9; file descriptors: 489
2023-11-08 10:03:23 heap: 61.34 MiB; threads: 9; file descriptors: 608
2023-11-08 10:13:23 heap: 61.34 MiB; threads: 9; file descriptors: 665
2023-11-08 10:23:23 heap: 61.34 MiB; threads: 9; file descriptors: 706
2023-11-08 10:33:23 heap: 61.34 MiB; threads: 9; file descriptors: 760
2023-11-08 10:43:23 heap: 61.34 MiB; threads: 9; file descriptors: 802
2023-11-08 10:53:23 heap: 61.34 MiB; threads: 9; file descriptors: 865
2023-11-08 11:03:23 heap: 61.34 MiB; threads: 9; file descriptors: 969
2023-11-08 11:13:24 heap: 61.34 MiB; threads: 9; file descriptors: 0
2023-11-08 11:23:24 heap: 61.34 MiB; threads: 9; file descriptors: 0
--8<---------------cut here---------------end--------------->8---
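
To locate the start of such a leak in a longer log, one option is to print
the first sample whose descriptor count exceeds a threshold.  This is only a
sketch: the function name and the threshold of 60 are choices made for this
example (slightly above the stable value of about 50), and it assumes the
log lines keep the format shown above, with the count as the last field.

```shell
#!/bin/sh
# Print date, time and count of the first "file descriptors: N" sample
# whose count exceeds the threshold given as first argument.
# Reads the log on standard input.
first_fd_jump() {
    threshold="$1"
    awk -v max="$threshold" '
        /file descriptors: [0-9]+/ {
            n = $NF + 0                    # the count is the last field
            if (n > max) { print $1, $2, n; exit }
        }'
}

# Demo on three samples copied from the excerpt above.
first_fd_jump 60 <<'EOF'
2023-11-08 08:53:23 heap: 61.34 MiB; threads: 9; file descriptors: 50
2023-11-08 09:03:23 heap: 61.34 MiB; threads: 9; file descriptors: 51
2023-11-08 09:13:23 heap: 61.34 MiB; threads: 9; file descriptors: 154
EOF
# prints: 2023-11-08 09:13:23 154
```

On a real log file the call would be "first_fd_jump 60 < FILE".  The
timestamp it reports can then be matched against the request backtraces
logged around the same time.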

Looking at the logs, the FD leak may come from this:

--8<---------------cut here---------------start------------->8---
2023-11-08 09:03:35 GET /eval/903503
2023-11-08 09:03:35 In cuirass/http.scm:
2023-11-08 09:03:35   1074:25 11 (url-handler _ _ _)
2023-11-08 09:03:35     295:4 10 (evaluation-html-page #<<evaluation-summary> id: 90350?> ?)
2023-11-08 09:03:35 In cuirass/logging.scm:
2023-11-08 09:03:35    111:18  9 (call-with-time-logging "builds request for evaluation?" ?)
2023-11-08 09:03:35 In ice-9/boot-9.scm:
2023-11-08 09:03:35   1752:10  8 (with-exception-handler _ _ #:unwind? _ # _)
2023-11-08 09:03:35 In cuirass/utils.scm:
2023-11-08 09:03:35     99:24  7 (_)
2023-11-08 09:03:35 In cuirass/database.scm:
2023-11-08 09:03:35    1503:2  6 (_ _)
2023-11-08 09:03:35   1439:28  5 (proc _)
2023-11-08 09:03:35 In ice-9/boot-9.scm:
2023-11-08 09:03:35   1685:16  4 (raise-exception _ #:continuable? _)
2023-11-08 09:03:35 In cuirass/utils.scm:
2023-11-08 09:03:35     96:12  3 (_ #<&compound-exception components: (#<&error> #<&orig?>)
2023-11-08 09:03:35 In fibers/operations.scm:
2023-11-08 09:03:35    154:10  2 (perform-operation _)
2023-11-08 09:03:35 In fibers/scheduler.scm:
2023-11-08 09:03:35     357:6  1 (suspend-current-task _)
2023-11-08 09:03:35 In ice-9/boot-9.scm:
2023-11-08 09:03:35   1685:16  0 (raise-exception _ #:continuable? _)
2023-11-08 09:03:35 Attempt to suspend fiber within continuation barrier
--8<---------------cut here---------------end--------------->8---

Fortunately, this is easy to reproduce:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guile
GNU Guile 3.0.9
Copyright (C) 1995-2023 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,use(fibers)
scheme@(guile-user)> ,use(cuirass utils)
scheme@(guile-user)> (run-fibers
(lambda ()
  (define pool (make-resource-pool (iota 10)))
  (with-resource-from-pool pool x (pk 'x x) (throw 'doh!))))

;;; (x 0)
Uncaught exception in task:
In fibers.scm:
   186:20  9 (_)
   145:21  8 (_)
In ice-9/boot-9.scm:
  1752:10  7 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In cuirass/utils.scm:
    99:24  6 (_)
In current input:
     6:44  5 (_ _)
In ice-9/boot-9.scm:
  1685:16  4 (raise-exception _ #:continuable? _)
In cuirass/utils.scm:
    96:12  3 (_ #<&compound-exception components: (#<&error> #<&irritants irritants: ()> #<&exception-with-kind-and-args kind: doh! args: ()>)>)
In fibers/operations.scm:
   154:10  2 (perform-operation _)
In fibers/scheduler.scm:
    357:6  1 (suspend-current-task _)
In ice-9/boot-9.scm:
  1685:16  0 (raise-exception _ #:continuable? _)
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Attempt to suspend fiber within continuation barrier
--8<---------------cut here---------------end--------------->8---

To be continued…

Ludo’.





* bug#67041: [cuirass] cuirass-web crash
  2023-11-16 16:09 ` Ludovic Courtès
@ 2023-11-16 22:26   ` Ludovic Courtès
  2023-11-23 11:43     ` Ludovic Courtès
  0 siblings, 1 reply; 4+ messages in thread
From: Ludovic Courtès @ 2023-11-16 22:26 UTC
  To: Maxim Cournoyer; +Cc: 67041, guix-sysadmin

Ludovic Courtès <ludo@gnu.org> skribis:

> scheme@(guile-user)> (run-fibers
> (lambda ()
>   (define pool (make-resource-pool (iota 10)))
>   (with-resource-from-pool pool x (pk 'x x) (throw 'doh!))))
>
> ;;; (x 0)
> Uncaught exception in task:
> In fibers.scm:
>    186:20  9 (_)
>    145:21  8 (_)
> In ice-9/boot-9.scm:
>   1752:10  7 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
> In cuirass/utils.scm:
>     99:24  6 (_)
> In current input:
>      6:44  5 (_ _)
> In ice-9/boot-9.scm:
>   1685:16  4 (raise-exception _ #:continuable? _)
> In cuirass/utils.scm:
>     96:12  3 (_ #<&compound-exception components: (#<&error> #<&irritants irritants: ()> #<&exception-with-kind-and-args kind: doh! args: ()>)>)
> In fibers/operations.scm:
>    154:10  2 (perform-operation _)
> In fibers/scheduler.scm:
>     357:6  1 (suspend-current-task _)
> In ice-9/boot-9.scm:
>   1685:16  0 (raise-exception _ #:continuable? _)
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Attempt to suspend fiber within continuation barrier

This is fixed by Cuirass commit
7c697ad7f15c13264615d2b6c9165b21abaf61dd.

Ludo’.





* bug#67041: [cuirass] cuirass-web crash
  2023-11-16 22:26   ` Ludovic Courtès
@ 2023-11-23 11:43     ` Ludovic Courtès
  0 siblings, 0 replies; 4+ messages in thread
From: Ludovic Courtès @ 2023-11-23 11:43 UTC
  To: Maxim Cournoyer; +Cc: guix-sysadmin, 67041-done

Ludovic Courtès <ludo@gnu.org> skribis:

> This is fixed by Cuirass commit
> 7c697ad7f15c13264615d2b6c9165b21abaf61dd.

Included in the ‘cuirass’ package update in Guix commit
300e9ad43d1f7a10013aa0724ed3aeb7d93500c1, now deployed on berlin and its
x86 build nodes.

Ludo’.



