From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 27 May 2023 08:34:51 +0000
To: guix-devel
From: Attila Lendvai
Subject: shepherd respawn frequency
List-Id: "Development of GNU Guix and the GNU System distribution."

dear guix,

the issue at hand:

i have a daemon that simply quits when one of its running conditions is
not satisfied. this can depend on unpredictable external factors, like
the temporary unreachability of a remote service.

shepherd respawns it immediately in RESPAWN-SERVICE, without any delay,
which leads to a kind of busy loop (i noticed this through the fan noise
of the machine). i know that there's a stopgap measure to disable such
services, but:

 1) some of these daemons struggle long enough before quitting that
    they do not trigger the default RESPAWN-LIMIT-HIT?
    stopgap measure

 2) i *do* want shepherd to keep restarting them indefinitely, but not
    immediately after their premature exit

proposed solution:

would the shepherd maintainers (looking at you Ludo :) accept a change
that introduces a new field into the service object, called
RESPAWN-DELAY, and issues a fiber sleep in RESPAWN-SERVICE when the field
is not #false and the daemon process quits unexpectedly?

in an initial commit i'd also turn the global variable called
RESPAWN-LIMIT into a field of the service object, and make it take its
default value from a properly named %RESPAWN-LIMIT global variable.

open questions:

 - what should be the default value of the respawn delay? i suggest 5
   seconds, and i'd argue against it being disabled by default:

     - premature exits happen more frequently at startup than in an
       already running process

     - an unwanted default respawn delay causes less headache than an
       unwanted busy loop.

 - if the respawn delay is set, then should respawn-limit be ignored?
   IOW, should the logic treat them as two independent variables, or
   should it not? and should there be some logic in how/where they take
   their defaults from?

   my pick: treat them as two independent variables, but when the user
   explicitly specifies a respawn delay for the service object, then
   there shouldn't be any respawn limit, unless the user also explicitly
   specifies it on the object.

   corollary: the handling of defaults should be implemented so that
   the fields of the service object hold #false as default value, in
   which case the logic takes the default value from a global variable
   in shepherd.

 - should i bother with detecting a first respawn in a given past
   period (of e.g. 1 minute?), and not apply any delay when this is the
   first respawn in that time window? this adds extra complexity, which
   may not be worth it. i'd go with a pass here.

 - after a cursory look, i don't understand the relationship between
   RESPAWNS and FAILURES. the former seems to be an endlessly growing
   list of timestamps, while the latter is a ring buffer of timestamps.
   it's not crucial for me to understand it, but i wonder if there's a
   bug lurking there that eats up the heap when a service keeps
   respawning without any delay?

i'm all ears for suggestions, and i'm also happy to hand over the
implementation to someone else, who already had plans to do it, and
knows the internals of shepherd better than me.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Looking back on a 30-year teaching career full of rewards and prizes,
somehow I can't completely believe that I spent my time on earth
institutionalized; I can't believe that centralized schooling is allowed
to exist at all as a gigantic indoctrination and sorting machine, robbing
people of their children. Did it really happen? Was this my life? God
help me.”
	-- John Taylor Gatto (1935–2018), Teacher of the Year, both in New York City and State, multiple times
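
ps: the defaulting rule picked in the open questions above can be
sketched roughly as follows. this is only an illustration in Python, not
shepherd code (shepherd is written in Guile Scheme). all names here
(ServiceConfig, effective_policy, respawn_limit_hit, the two DEFAULT_*
globals) are hypothetical, the 5 second delay is merely the value
suggested above, and the (5, 7) limit, meaning 5 respawns within 7
seconds, is an assumed placeholder for whatever %RESPAWN-LIMIT would hold.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical global defaults (assumptions, see the note above).
DEFAULT_RESPAWN_DELAY = 5.0      # seconds
DEFAULT_RESPAWN_LIMIT = (5, 7)   # (count, seconds)

Limit = Optional[Tuple[int, int]]


@dataclass
class ServiceConfig:
    # None plays the role of #false: "not specified by the user".
    respawn_delay: Optional[float] = None
    respawn_limit: Limit = None


def effective_policy(cfg: ServiceConfig) -> Tuple[float, Limit]:
    """Return (delay, limit); a limit of None means respawn indefinitely."""
    if cfg.respawn_delay is not None:
        # An explicit delay switches the limit off, unless the user
        # also specified a limit on the same service object.
        return cfg.respawn_delay, cfg.respawn_limit
    # No explicit delay: both values fall back to the globals,
    # except for a limit the user did specify.
    if cfg.respawn_limit is not None:
        return DEFAULT_RESPAWN_DELAY, cfg.respawn_limit
    return DEFAULT_RESPAWN_DELAY, DEFAULT_RESPAWN_LIMIT


def respawn_limit_hit(respawn_times: List[float], now: float, limit: Limit) -> bool:
    """True when at least `count` respawns fall within the last `seconds`."""
    if limit is None:
        return False
    count, seconds = limit
    recent = [t for t in respawn_times if now - t <= seconds]
    return len(recent) >= count
```

with this rule a service that sets only a delay is restarted forever, one
that sets nothing gets both the default delay and the default limit, and
the two fields never need to know about each other beyond that one check.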