From: ludo@gnu.org (Ludovic Courtès)
To: 31925@debbugs.gnu.org, Andy Wingo <wingo@igalia.com>
Subject: bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27
Date: Wed, 04 Jul 2018 18:58:30 +0200
Message-ID: <87tvpfaqfd.fsf@gnu.org>
In-Reply-To: <874lhffpnq.fsf@gnu.org> ("Ludovic Courtès"'s message of "Wed, 04 Jul 2018 09:03:53 +0200")

(+Cc: Andy as the ultimate authority for all these things.  :-))

ludo@gnu.org (Ludovic Courtès) skribis:

> ;; Needs (ice-9 match), (ice-9 binary-ports), and (guix utils) for
> ;; ‘call-with-decompressed-port’.
> (let loop ((files files)
>            (n 0))
>   (match files
>     (() n)                                  ;all files processed
>     ((file . tail)
>      (call-with-input-file file
>        (lambda (port)
>          (call-with-decompressed-port 'gzip port
>            (lambda (port)
>              ;; Drain the decompressed stream.
>              (let loop ()
>                (unless (eof-object? (get-bytevector-n port 777))
>                  (loop)))))))
>      ;; (pk 'loop n file)
>      (display ".")
>      (loop tail (+ n 1)))))

One problem I’ve noticed is that the child process that
‘call-with-decompressed-port’ spawns would be stuck trying to get the
allocation lock:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f9fd8d5cb25 in __GI___pthread_mutex_lock (mutex=0x7f9fd91b3240 <GC_allocate_ml>) at ../nptl/pthread_mutex_lock.c:78
#2  0x00007f9fd8f8ef8f in GC_call_with_alloc_lock (fn=fn@entry=0x7f9fd92b0420 <do_copy_weak_entry>, client_data=client_data@entry=0x7ffe4b9a0d80) at misc.c:1929
#3  0x00007f9fd92b1270 in copy_weak_entry (dst=0x7ffe4b9a0d70, src=0x759ed0) at weak-set.c:124
#4  weak_set_remove_x (closure=0x8850c0, pred=0x7f9fd92b0440 <eq_predicate>, hash=3944337866184184181, set=0x70cf00) at weak-set.c:615
#5  scm_c_weak_set_remove_x (set=set@entry=#<weak-set 756df0>, raw_hash=<optimized out>, pred=pred@entry=0x7f9fd92b0440 <eq_predicate>, closure=closure@entry=0x8850c0) at weak-set.c:791
#6  0x00007f9fd92b13b0 in scm_weak_set_remove_x (set=#<weak-set 756df0>, obj=obj@entry=#<port 2 8850c0>) at weak-set.c:812
#7  0x00007f9fd926f72f in close_port (port=#<port 2 8850c0>, explicit=<optimized out>) at ports.c:884
#8  0x00007f9fd92ad307 in vm_regular_engine (thread=0x7f9fd91b3240 <GC_allocate_ml>, vp=0x7adf30, registers=0x0, resume=-657049556) at vm-engine.c:786
#9  0x00007f9fd92afb37 in scm_call_n (proc=<error reading variable: ERROR: Cannot access memory at address 0xd959b030>0x7f9fd959b030, argv=argv@entry=0x7ffe4b9a1018, nargs=nargs@entry=1) at vm.c:1257
#10 0x00007f9fd9233017 in scm_primitive_eval (exp=<optimized out>, exp@entry=<error reading variable: ERROR: Cannot access memory at address 0xd5677cf8>0x855280) at eval.c:662
#11 0x00007f9fd9233073 in scm_eval (exp=<error reading variable: ERROR: Cannot access memory at address 0xd5677cf8>0x855280, module_or_state=module_or_state@entry=<error reading variable: ERROR: Cannot access memory at address 0xd95580d8>0x83d140) at eval.c:696
#12 0x00007f9fd927e8d0 in scm_shell (argc=2, argv=0x7ffe4b9a1668) at script.c:454
#13 0x00007f9fd9249a9d in invoke_main_func (body_data=0x7ffe4b9a1510) at init.c:340
#14 0x00007f9fd922c28a in c_body (d=0x7ffe4b9a1450) at continuations.c:422
#15 0x00007f9fd92ad307 in vm_regular_engine (thread=0x7f9fd91b3240 <GC_allocate_ml>, vp=0x7adf30, registers=0x0, resume=-657049556) at vm-engine.c:786
#16 0x00007f9fd92afb37 in scm_call_n (proc=proc@entry=#<smob catch-closure 795120>, argv=argv@entry=0x0, nargs=nargs@entry=0) at vm.c:1257
#17 0x00007f9fd9231e69 in scm_call_0 (proc=proc@entry=#<smob catch-closure 795120>) at eval.c:481
#18 0x00007f9fd929e7b2 in catch (tag=tag@entry=#t, thunk=#<smob catch-closure 795120>, handler=<error reading variable: ERROR: Cannot access memory at address 0x400000000>0x7950c0, pre_unwind_handler=<error reading variable: ERROR: Cannot access memory at address 0x400000000>0x7950a0) at throw.c:137
#19 0x00007f9fd929ea95 in scm_catch_with_pre_unwind_handler (key=key@entry=#t, thunk=<optimized out>, handler=<optimized out>, pre_unwind_handler=<optimized out>) at throw.c:254
#20 0x00007f9fd929ec5f in scm_c_catch (tag=tag@entry=#t, body=body@entry=0x7f9fd922c280 <c_body>, body_data=body_data@entry=0x7ffe4b9a1450, handler=handler@entry=0x7f9fd922c510 <c_handler>, handler_data=handler_data@entry=0x7ffe4b9a1450, pre_unwind_handler=pre_unwind_handler@entry=0x7f9fd922c370 <pre_unwind_handler>, pre_unwind_handler_data=0x7a9bc0) at throw.c:377
#21 0x00007f9fd922c870 in scm_i_with_continuation_barrier (body=body@entry=0x7f9fd922c280 <c_body>, body_data=body_data@entry=0x7ffe4b9a1450, handler=handler@entry=0x7f9fd922c510 <c_handler>, handler_data=handler_data@entry=0x7ffe4b9a1450, pre_unwind_handler=pre_unwind_handler@entry=0x7f9fd922c370 <pre_unwind_handler>, pre_unwind_handler_data=0x7a9bc0) at continuations.c:360
#22 0x00007f9fd922c905 in scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:456
#23 0x00007f9fd929d3ec in with_guile (base=base@entry=0x7ffe4b9a14b8, data=data@entry=0x7ffe4b9a14e0) at threads.c:661
#24 0x00007f9fd8f8efb8 in GC_call_with_stack_base (fn=fn@entry=0x7f9fd929d3a0 <with_guile>, arg=arg@entry=0x7ffe4b9a14e0) at misc.c:1949
#25 0x00007f9fd929d708 in scm_i_with_guile (dynamic_state=<optimized out>, data=data@entry=0x7ffe4b9a14e0, func=func@entry=0x7f9fd9249a80 <invoke_main_func>) at threads.c:704
#26 scm_with_guile (func=func@entry=0x7f9fd9249a80 <invoke_main_func>, data=data@entry=0x7ffe4b9a1510) at threads.c:710
#27 0x00007f9fd9249c32 in scm_boot_guile (argc=argc@entry=2, argv=argv@entry=0x7ffe4b9a1668, main_func=main_func@entry=0x400cb0 <inner_main>, closure=closure@entry=0x0) at init.c:323
#28 0x0000000000400b70 in main (argc=2, argv=0x7ffe4b9a1668) at guile.c:101
(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7f9fd972eb80 (LWP 15573) "guile" __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
--8<---------------cut here---------------end--------------->8---

So it seems quite clear that the child inherited the alloc lock in a
locked state.  I suppose this can happen if one of the libgc threads
takes the alloc lock right when we call fork, right?

If that is correct, the fix would be to call fork within
‘GC_call_with_alloc_lock’.
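
Something along these lines, perhaps (a sketch only, not code from
Guile; ‘do_fork’ and ‘fork_with_alloc_lock’ are made-up names, and the
real change would presumably go in ‘scm_fork’ in libguile/posix.c):

--8<---------------cut here---------------start------------->8---
#include <unistd.h>
#include <stdint.h>
#include <gc/gc.h>

/* Callback matching libgc's GC_fn_type; it runs with the allocation
   lock held.  */
static void *
do_fork (void *unused)
{
  return (void *) (intptr_t) fork ();
}

static pid_t
fork_with_alloc_lock (void)
{
  /* Run fork(2) while holding the allocation lock, so the child
     cannot inherit it in a locked state.  */
  return (pid_t) (intptr_t) GC_call_with_alloc_lock (do_fork, NULL);
}
--8<---------------cut here---------------end--------------->8---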

How does that sound?

As a workaround on the Guix side, we might achieve the same effect by
calling ‘gc-disable’ right before ‘primitive-fork’.
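
At the libgc level that workaround boils down to something like this
(again just a sketch; ‘fork_without_gc’ is a made-up name, and
‘GC_disable’ is what Guile’s ‘gc-disable’ calls):

--8<---------------cut here---------------start------------->8---
#include <unistd.h>
#include <gc/gc.h>

static pid_t
fork_without_gc (void)
{
  pid_t pid;

  GC_disable ();  /* hopefully no collector activity holds the
                     allocation lock from here on */
  pid = fork ();
  GC_enable ();   /* executed in the parent and the child alike */
  return pid;
}
--8<---------------cut here---------------end--------------->8---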

Ludo’.

