From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.2 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF shortcircuit=no
 autolearn=ham autolearn_force=no version=3.4.6
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
 by dcvr.yhbt.net (Postfix) with ESMTP id 9E7B11F4C1;
 Tue, 12 Nov 2024 21:54:32 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=80x24.org;
 s=selector1; t=1731448472;
 bh=ln0jnv2L9cs49IBl/nUfoe+rLySa9FshjbcdYkAW3No=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
 b=RnJn3vGEokAv9puNOvw7xfiniK6n9NMIWa9TkWSUb4LXzktIiqu2Gw/FIka8iyJ8r
 UDRKaHljzNj2uOIIfcPnRS1FNH9Gmh3rvLF3RNlslFkf+chh+3fiRjWZUA26vXMTSp
 wmPcFh2j76bRyM6naXRCvPkXe02HYumTHGP/D5uw=
Date: Tue, 12 Nov 2024 21:54:32 +0000
From: Eric Wong
To: Jonathan Corbet
Cc: meta@public-inbox.org
Subject: Re: Occasional public-inbox-httpd flakiness
Message-ID: <20241112215432.M457200@dcvr>
References: <875xp15n3o.fsf@trenco.lwn.net>
 <20241105232445.M291444@dcvr>
 <87zfm4qn6t.fsf@trenco.lwn.net>
 <20241112192002.M245700@dcvr>
 <87o72kp2kz.fsf@trenco.lwn.net>
 <20241112214116.M819944@dcvr>
 <87bjykp1kl.fsf@trenco.lwn.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <87bjykp1kl.fsf@trenco.lwn.net>
List-Id:

Jonathan Corbet wrote:
> Eric Wong writes:
> > Jonathan Corbet wrote:
> >> Eric Wong writes:
> >> > Jonathan Corbet wrote:
> >> >> Just to add a data point...the problem just recurred, and there are
> >> >> definitely cat-file processes running:
> >> >>
> >> >> $ ps ax | fgrep git
> >> >> 2640024 ?        S      0:40 /usr/bin/git --git-dir=repos/ALL.git -c core.abbrev=40 cat-file --batch
> >> >> 2735080 ?        S      0:18 /usr/bin/git --git-dir=repos/ALL.git -c core.abbrev=40 cat-file --batch
> >> >> 3184082 ?        S      0:03 /usr/bin/git --git-dir=repos/ALL.git -c core.abbrev=40 cat-file --batch
> >> >> 3723223 ?        Z      0:00 [git]
> >> >> 3723227 ?        Z      0:00 [git]
> >> >
> >> > Can you see if the worker process causing warnings is connected to defunct gits?
> >> > Should've been fixed in master a while ago, but there's a lot of changes :x
> >> > master should be fine as long as you're not using -cindex + coderepos yet.
> >>
> >> By "connected to" you mean "is the parent of"?
> >
> > Yes.  lsof +E should show how pipes are connecting processes.
> > Just wondering, those git zombies lingered until the restart, right?
> >
> > IOW, they didn't disappear after a few seconds if the -httpd
> > worker was busy with other things.  During heavy traffic you'll
> > inevitably see short-lived zombies as the -httpd may not reap
> > fast enough, but zombies shouldn't linger indefinitely.
>
> Looking back through the terminal history, it looks like the zombies
> hung out for a bit, but then went away.  There were a couple of zombies
> every time I looked, but the PIDs eventually changed.

OK.  Any idea how long the zombies lingered?  How much load does
the LWN -httpd instance see?

AFAIK, the main cause of zombies I've seen was from the blob
solver, but that requires coderepos, which AFAIK nobody else
ever used...

Also, it would likely be useful if you got an strace on the
worker PID that was causing warnings
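A rough sketch for answering the two questions above (which process is the parent of the defunct gits, and how long they have lingered), assuming a Linux procps `ps` that supports `-eo pid,ppid,stat,etime,comm`.  The function names are made up for illustration and are not part of public-inbox; the parser only groups state-Z rows by parent PID:

```python
import subprocess
from collections import defaultdict


def zombies_by_parent(ps_output):
    """Group defunct (state Z) processes by their parent PID.

    Expects the text printed by `ps -eo pid,ppid,stat,etime,comm`
    (assumed procps column order) and returns
    {ppid: [(pid, etime, comm), ...]}.
    """
    groups = defaultdict(list)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # blank or truncated row
        pid, ppid, stat, etime, comm = fields
        if stat.startswith("Z"):
            groups[int(ppid)].append((int(pid), etime, comm))
    return dict(groups)


def current_zombies():
    """Run ps once and report zombies on this machine (hypothetical helper)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,etime,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return zombies_by_parent(out)
```

Calling `current_zombies()` every few seconds and comparing the PIDs and elapsed times between runs would show whether the same zombies linger or are being reaped and replaced.  The parent PID it reports is the one to point `lsof +E` or `strace -p` at.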