From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ricardo Wurmus
Subject: Re: bug#27476: guix pull fails on powerful server
Date: Fri, 13 Oct 2017 22:29:09 +0200
Message-ID: <877evyu6zu.fsf@elephly.net>
References: <87h8vvp1q7.fsf@elephly.net> <87377esu1a.fsf@gnu.org>
 <87k20nz18u.fsf@igalia.com> <87a81jj5gg.fsf@gnu.org>
 <87bmlyzxj7.fsf@elephly.net> <87shf44ny0.fsf@elephly.net>
 <877ew0o5br.fsf@gnu.org>
In-reply-to: <877ew0o5br.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
To: Ludovic Courtès
Cc: Andy Wingo, help-guix@gnu.org, 27476@debbugs.gnu.org

Hi Ludo,

> Ricardo Wurmus skribis:
>
>> The following derivation will be built:
>>    /gnu/store/z5bhk17nxmdhvj0g4cy038p25mzh1gys-guix-latest.drv
>> copying and compiling to '/gnu/store/s3s7xlqa10mvf8v0ypxz8gzw3lcf1x5z-guix-latest' with Guile 2.2.2...
>> loading... 25.7% of 635 filesrandom seed for tests: 1506720257
>> loading... 99.8% of 635 files
>> compiling... 69.1% of 635 filesice-9/threads.scm:289:22: In procedure loop:
>> ice-9/threads.scm:289:22: Syntax error:
>> guix/scripts/graph.scm:103:10: return: return used outside of 'with-monad' in form (return (package-node-edges a))
>
> The program below crashes with completely surreal backtraces in less
> than a minute on my 4-thread laptop:
>
> --8<---------------cut here---------------start------------->8---
> (use-modules (ice-9 threads)
>              (srfi srfi-1)
>              (guix monads)
>              (guix store))
>
> (define threads
>   (unfold (lambda (x) (> x 100))
>           (lambda (x)
>             (call-with-new-thread
>              (lambda ()
>                (define monad
>                  (symbol-append 'foo-monad
>                                 (string->symbol (number->string x))))
>
>                (while #t
>                  (macroexpand
>                   `(begin
>                      (define-monad ,monad
>                        (bind +)
>                        (return -))
>                      (with-monad ,monad
>                        (return 3))
>                      (mapm ,monad + '(1 2 3))))))))
>           1+
>           0))
>
> (for-each join-thread threads)
> --8<---------------cut here---------------end--------------->8---
>
> Can you check if that also happens on your many-core machine?

It does not crash.  I left it running for more than an hour (without
compiling) and it printed things like this:

--8<---------------cut here---------------start------------->8---
…
GC Warning: Repeated allocation of very large block (appr. size 57528320):
	May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 57528320):
	May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 57528320):
	May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 57528320):
	May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 14385152):
	May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 14385152):
	May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 57528320):
	May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 28766208):
	May lead to memory leak and poor performance
…
--8<---------------cut here---------------end--------------->8---

That’s on the machine with 1.5T RAM and 192 cores.

Then I ran it again for 10 minutes after compiling it.  It did not
crash.

> The patch below seems to fix the problem: (guix monads) has shared state
> (hash tables) used both at expansion-time and run-time, and it wasn’t
> protected.
>
> My hypothesis is that this was causing random memory corruption.  The
> weird thing, though, is that the errors we were getting were not so
> random.  Also, the load phase of ‘guix pull’ is sequential.
>
> Could you test it and report back?

I’m trying the patch right now with “guix pull”.

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net