From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <164458773197.3086.16103597141743611268.git@grubix.eu>
 <164467277576.6467.13919733764427871872.git@grubix.eu>
 <874k54xfuw.fsf@tethera.net> <87v8xjx048.fsf@tethera.net>
 <878rudwnrt.fsf@tethera.net>
Subject: Re: Test suite timing issues?
From: Michael J Gruber
To: David Bremner, Tomi Ollila
Message-ID: <164491506123.3239.10549568704571641948.git@grubix.eu>
Date: Tue, 15 Feb 2022 09:51:01 +0100
User-Agent: alot/0.10
CC: notmuch@notmuchmail.org
List-Id: "Use and development of the notmuch mail system."
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Tomi Ollila venit, vidit, dixit 2022-02-14 23:53:54:
> On Mon, Feb 14 2022, David Bremner wrote:
>
> > Tomi Ollila writes:
> >
> >>
> >> Looked notmuch-new.c -- time_t (seconds since epoch) is used as timestamp
> >> comparisons (which would indicate the subsecond resolution most fs' provide
> >> is not used)...
> >>
> >> ... and if so, I wonder why some of our tests are not failing all the time
> >> for everyone...?
> >
> > Not claiming everything is fine, but there is code there targetted at
> > the failure mode you mentioned:
> >
> >     /* If the directory's mtime is the same as the wall-clock time
> >      * when we stat'ed the directory, we skip updating the mtime in
> >      * the database because a message could be delivered later in this
> >      * same second. This may lead to unnecessary re-scans, but it
> >      * avoids overlooking messages. */
> >     if (fs_mtime != stat_time)
> >         _filename_list_add (state->directory_mtimes, path)->mtime = fs_mtime;
>
> This sure had to be tested... :D
>
> so I outcommented the line as // if (fs_mtime != stat_time)
>
> and then build and run test -- a lot of failures...
>
> also
>
> ./test/T750-gzip.sh 2>&1 | grep -e PASS -e FAIL
>
> PASS Single new gzipped message
> PASS Single new gzipped message (full-scan)
> FAIL Multiple new messages, one gzipped
> FAIL Multiple new messages, one gzipped (full-scan)
> FAIL Renamed (gzipped) message
> PASS notmuch search with partially gzipped mail store
> FAIL notmuch search --output=files with partially gzipped mail store
> PASS show un-gzipped message
> PASS show un-gzipped message (format mbox)
> PASS show un-gzipped message (format raw)
> FAIL show gzipped message
> FAIL show gzipped message (mbox)
> FAIL show gzipped message (raw)
> PASS new doesn't run out of file descriptors with many gzipped files
>
> (above was "lucky" run, usually that 6th test, ...partially gzipped...
> test also FAILed (I'd guess second happened to change there)).
>
> then restored the fs_mtime != stat_time line -- then all of 750 passed.
>
> (finally, run that 750-gzip in a loop (dropped last, slow test), hundreds
> of times already -- no FAILures... (ecryptfs on ext4))
>
> Tomi
>
> > BTW, I have so far run the test suite 68 times in a row without failures
> > on a Debian s390x host. The file system is ext4, mounted relatime. It
> > would be interesting to know what file system is yielding the failures
> > Michael is seeing.

Thanks for the detailed analysis. It convinces me that everything is OK
on notmuch's side and that this very much boils down to fs and stat
issues.

If I remember correctly, I've seen this issue once on a release other
than epel-8, but otherwise only on epel-8. This is on a build
infrastructure where you may end up getting different hosts on each run,
primed from the same "chroot". I'll try to find out more.

Until now, we have been building notmuch release packages on Fedora
without running the tests during the release build, because there used
to be issues with the test suite (and, before that, with the separate
test corpus download).
With everything looking robust for a while, I'm switching tests on for
Fedora release builds now, and at the same time starting to package
those extra packages for Red Hat's enterprise platform. I want to avoid
spurious release build failures, of course - otoh having those tests run
is a good thing.

So I'll experiment a bit more, and if everything else fails, then
running the test suite with --full-scan is still better than not running
it at all. (I would have suggested a make variable, but it's really a
corner case, given the stat/mtime check.)

Michael
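
For illustration, here is a minimal Python sketch of why the stat/mtime
check matters with whole-second timestamps, and why --full-scan
sidesteps the problem. It is a toy model, not notmuch's code: the
function names are made up, and the rescan rule (scan a directory when
its stored mtime differs from the one on disk, or when a full scan is
forced) is an assumption about how such a cache typically behaves.

```python
def mtime_to_store(fs_mtime, stat_time, db_mtime, guard=True):
    """Return the directory mtime to record after a scan (hypothetical helper).

    With the guard on, a directory that was modified in the very second we
    stat'ed it keeps its old database value, so the next run scans it again.
    """
    if guard and fs_mtime == stat_time:
        return db_mtime
    return fs_mtime


def needs_rescan(db_mtime, fs_mtime, full_scan=False):
    """Assumed rule: scan when forced, or when the stored mtime differs."""
    return full_scan or db_mtime != fs_mtime


def late_delivery_is_seen(guard, full_scan=False):
    """Simulate a scan in second 100 followed by a delivery in second 100."""
    fs_mtime = 100   # directory last changed in second 100
    stat_time = 100  # and we stat'ed it within that same second
    db_mtime = mtime_to_store(fs_mtime, stat_time, db_mtime=0, guard=guard)
    # A new message arrives while the clock still reads second 100. With
    # whole-second timestamps the directory mtime stays at 100.
    fs_mtime = 100
    return needs_rescan(db_mtime, fs_mtime, full_scan)


if __name__ == "__main__":
    print("guard on:             ", late_delivery_is_seen(guard=True))
    print("guard off:            ", late_delivery_is_seen(guard=False))
    print("guard off, full scan: ", late_delivery_is_seen(guard=False, full_scan=True))
```

In this model the late message is picked up with the guard in place,
missed once the guard is removed, and picked up again when a full scan
is forced, which matches the pattern of failures Tomi saw after
commenting out the check.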