From: Pjotr Prins
Subject: Re: System monitoring
Date: Sun, 29 Dec 2019 16:07:41 -0600
Message-ID: <20191229220741.udqedz5fpbuezmjj@thebird.nl>
References: <20191228170317.2zfdsoewk4cfkzyf@thebird.nl> <87k16enbhn.fsf@guixSD.i-did-not-set--mail-host-address--so-tickle-me>
In-Reply-To: <87k16enbhn.fsf@guixSD.i-did-not-set--mail-host-address--so-tickle-me>
To: Nicolò Balzarotti
Cc: Guix Devel, guix-sysadmin@gnu.org

On Sun, Dec 29, 2019 at 09:05:40PM +0100, Nicolò Balzarotti wrote:
> I think zabbix should work, but I've never used it. On the surface, it
> seems to have a steep learning curve, but this is just my impression.

The problem with these systems is that they target (complex)
deployments that have people watching these systems.

What I need is much simpler - I don't want to watch systems, but I
need a cursory idea of health of say 20-40 machines out there. I also
want something that can notify me if things go really wrong. For
example when backups fail. These are not massive requirements - just
something flexible!

I used to have scripts for that that would mail/text me.
But that was all a bit ad hoc and I got tired of maintaining them and
I got tired of repeating notifications ;)

What would be really cool is to be able to use logic programming. It
would allow questions like: What services showed interruptions in the
last month on low RAM machines that also ran guix < 1.0 and a specific
version of nginx?

This would mean storing the state of machines in a database that gets
updated by messages. It means a good message broker. It means that
every time you write a monitoring service, you'll have to write a
receiver to turn its output into a data structure that something like
miniKanren can solve. Key is to make *creating* such small
reporter/receiver tools really easy. Visualisations are less important
- though I am sure some people enjoy creating those.

I.e., what I have in mind is a different type of systems monitor: a
minimalistic system that is hackable, can work out of the box for Guix
systems, and is really easy to extend.

I think if we can prototype something in the coming months it would
make a great GSoC project to build out functionality.

Pj.
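
A rough sketch of the idea above, not a design: reporter messages are
folded into an in-memory state table, and the example question is
answered with an ordinary Python filter standing in for a logic engine
such as miniKanren. The message fields, the `Machine` record, the
`receive` and `interrupted_services` helpers, the 4096 MB threshold and
the sample hosts and versions are all made up for illustration; no
message broker or real database is involved.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Machine:
    """Last known state of one host (all field names are hypothetical)."""
    ram_mb: Optional[int] = None
    packages: dict = field(default_factory=dict)       # name -> version tuple
    interruptions: list = field(default_factory=list)  # (service, timestamp)


def receive(db, msg):
    """Receiver: fold one reporter message (a plain dict) into the state table."""
    m = db.setdefault(msg["host"], Machine())
    kind = msg["kind"]
    if kind == "ram":
        m.ram_mb = msg["ram_mb"]
    elif kind == "package":
        m.packages[msg["name"]] = tuple(msg["version"])
    elif kind == "interruption":
        m.interruptions.append((msg["service"], msg["at"]))
    else:
        raise ValueError(f"unknown message kind: {kind!r}")


def interrupted_services(db, now, max_ram_mb, guix_below, nginx_version, days=30):
    """(host, service) pairs interrupted in the last `days` days on low-RAM
    hosts that ran guix older than `guix_below` and exactly `nginx_version`."""
    since = now - timedelta(days=days)
    return sorted({
        (host, service)
        for host, m in db.items()
        if m.ram_mb is not None and m.ram_mb <= max_ram_mb
        and "guix" in m.packages and m.packages["guix"] < guix_below
        and m.packages.get("nginx") == nginx_version
        for service, at in m.interruptions
        if at >= since
    })


# Made-up sample messages, as small reporter tools might send them.
db = {}
now = datetime(2019, 12, 29)
messages = [
    {"host": "a", "kind": "ram", "ram_mb": 2048},
    {"host": "a", "kind": "package", "name": "guix", "version": (0, 16)},
    {"host": "a", "kind": "package", "name": "nginx", "version": (1, 17, 6)},
    {"host": "a", "kind": "interruption", "service": "backup",
     "at": datetime(2019, 12, 20)},
    {"host": "a", "kind": "interruption", "service": "nginx",
     "at": datetime(2019, 10, 1)},   # older than one month, filtered out
    {"host": "b", "kind": "ram", "ram_mb": 65536},   # not a low-RAM host
    {"host": "b", "kind": "package", "name": "guix", "version": (0, 16)},
    {"host": "b", "kind": "package", "name": "nginx", "version": (1, 17, 6)},
    {"host": "b", "kind": "interruption", "service": "backup",
     "at": datetime(2019, 12, 21)},
]
for msg in messages:
    receive(db, msg)

print(interrupted_services(db, now, 4096, (1, 0), (1, 17, 6)))
```

Each new reporter only needs one more branch in `receive`, which is the
part that would have to stay really easy; a logic engine would replace
the hand-written filter in `interrupted_services`.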