From: Mathieu Othacehe
To: 42548@debbugs.gnu.org
Subject: bug#42548: Cuirass 504 errors
Date: Tue, 04 Aug 2020 18:48:24 +0200
Message-ID: <87bljq8ihz.fsf@gnu.org>
In-Reply-To: <87mu3jheme.fsf@gnu.org> (Mathieu Othacehe's message of "Tue, 28 Jul 2020 16:56:57 +0200")
References: <87eeoy9rzk.fsf@gnu.org> <86zh7kzjys.fsf@gmail.com> <87sgdckscm.fsf@gnu.org> <86d04gyqfs.fsf@gmail.com> <87mu3jheme.fsf@gnu.org>
List-Id: Bug reports for GNU Guix

Hello,

> that suggests that we try to write something to a closed file.
>
> To be investigated :)

OK, I now have a better grasp of what's going on. The Cuirass web server
was receiving requests such as "/builds/1234)" which were not rejected
and, worse, triggered SQL queries such as "select * from Builds". As the
table is quite large, this caused some of the DB workers to hang. Once
all the DB workers were hanging, queries started to accumulate until the
open file descriptor limit (1024) was reached.

I have consolidated the validation of HTTP requests, and the Cuirass web
server has now been running for 48 hours, which has not happened in
months, I think. I also added some warnings to detect DB workers hanging
for more than 5 seconds.

The next step is to log all SQL queries using [1]. This should allow us
to spot this kind of issue more easily. Logging the duration of each
query should also help us optimize the queries.

I will wait a few more days before closing this issue.

Thanks,

Mathieu

[1]: https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/16
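
PS: To illustrate the two ideas above (rejecting malformed paths such as
"/builds/1234)" before they reach the database, and logging how long
each query takes), here is a minimal sketch. It is written in Python
purely for illustration and is not the Cuirass code, which is Guile. The
names parse_build_id, timed_query and run_query are hypothetical, and the
5-second threshold simply mirrors the warning mentioned above.

```python
import logging
import re
import time

# Hypothetical route pattern: "/builds/" followed by ASCII digits only.
BUILD_PATH = re.compile(r"/builds/([0-9]+)")

# Threshold borrowed from the 5-second warning described in the message.
SLOW_QUERY_SECONDS = 5.0


def parse_build_id(path):
    """Return the build id as an int, or None if the path is malformed."""
    match = BUILD_PATH.fullmatch(path)
    return int(match.group(1)) if match else None


def timed_query(run_query, sql, threshold=SLOW_QUERY_SECONDS,
                log=logging.getLogger("db")):
    """Call run_query(sql), then log the elapsed time.

    The message is logged at WARNING level when the query took longer
    than `threshold` seconds, and at DEBUG level otherwise.
    """
    start = time.monotonic()
    try:
        return run_query(sql)
    finally:
        elapsed = time.monotonic() - start
        level = logging.WARNING if elapsed > threshold else logging.DEBUG
        log.log(level, "query took %.3fs: %s", elapsed, sql)
```

With this, a request handler would answer with an error whenever
parse_build_id returns None instead of falling through to a query on the
whole table. Note that timed_query only reports once the query has
returned; spotting a worker that never returns would need a separate
watchdog.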