From: Rob Herring
Date: Mon, 11 Jul 2022 15:59:07 -0600
Subject: Re: lei missing mails
To: Eric Wong
Cc: meta@public-inbox.org
References: <20220629163033.GA14412@dcvr> <20220629172742.M978900@dcvr> <20220630085539.M324144@dcvr>
In-Reply-To: <20220630085539.M324144@dcvr>
Content-Type: text/plain; charset="UTF-8"

On Thu, Jun 30, 2022 at 2:55 AM Eric Wong wrote:
>
> Rob Herring wrote:
> > On Wed, Jun 29, 2022 at 11:27 AM Eric Wong wrote:
> > > Rob Herring wrote:
> > > > On Wed, Jun 29, 2022 at 10:30 AM Eric Wong wrote:
> > > > > Rob Herring wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I'm using lei with lore where I have 2 queries which overlap. Really,
> > > > > > one is a subset of the other. On those overlapping threads, I'm
> > > > > > finding that sometimes new messages are written to one mailbox and not
> > > > > > the other. (At least sometimes, the messages may be missing from all
> > > > > > mailboxes sometimes too. I'm not certain.) Using --remote-fudge-time
> > > > > > to force refetching seems to get the missing mails. I haven't found
> > > > > > anything strange in timestamps of the missing mails, but otherwise am
> > > > > > not sure how to debug this further. The queries are retrieving full
> > > > > > threads and the missing mails are in the threads, but not direct
> > > > > > matches to the queries. I realize that's not a lot of detail to go on.
> > > > > > Suggestions on debugging this further?
> > > > >
> > > > > Is this with 1.8 or 1.7?
> > > >
> > > > Commit 68b53c888911 actually. So post 1.8.
> > >
> > > OK, thanks for that info.
> > >
> > > > > I forgot to note in the release notes, but there were some
> > > > > SQLite usage-related fixes which could avoid missing messages.
> > > > >
> > > > > You'll need "lei daemon-kill" after upgrading to 1.8 to ensure
> > > > > the new code is running.
> > > >
> > > > It's possible I haven't done that since updating though I do vaguely
> > > > recall seeing something about needing to do that. Is there any way to
> > > > tell before I restart it?
> > >
> > > Not really, but it's pretty cheap to restart (assuming there's no
> > > long-running jobs).
> >
> > I've restarted and just hit this again.
>
> Ugh, sorry to hear that :<
>
> > > > > What might be interesting is to use the URLs lei prints and
> > > > > comparing the results w/o lei.
> >
> > $ lei up --all
> > # updating /home/rob/Mail/from-me
> > # updating /home/rob/Mail/missing-cc
> > # updating /home/rob/Mail/my-patches
> > # updating /home/rob/Mail/pci
> > # https://lore.kernel.org/all/ limiting to 2022-06-27 12:42 -0600 and newer
> > # https://lore.kernel.org/all/ limiting to 2022-06-27 9:50 -0600 and newer
> > # https://lore.kernel.org/all/ limiting to 2022-06-27 12:42 -0600 and newer
> > # /usr/bin/curl -Sf -s -d ''
> > https://lore.kernel.org/all/?x=m&t=1&q=(dt%3A20220529211430..+AND+(f%3Arobh%40kernel.org+OR+f%3Arobh%2Bdt%40kernel.org))+AND+dt%3A20220627184226..
> > # /home/rob/.local/share/lei/store 144/144
> > # /home/rob/.local/share/lei/store 3/3
> > # /usr/bin/curl -Sf -s -d ''
> > https://lore.kernel.org/all/?x=m&t=1&q=((dfn%3Adrivers+OR+dfn%3Aarch+OR+dfn%3ADocumentation%2F*+OR+dfn%3Ainclude+OR+dfn%3Ascripts)+AND+f%3Arobh%40kernel.org+AND+rt%3A1640812470..)+AND+dt%3A20220627155025..
> > # /usr/bin/curl -Sf -s -d ''
> > https://lore.kernel.org/all/?x=m&t=1&q=(l%3Alinux-pci+dfn%3Adrivers%2Fpci%2Fcontroller+dt%3A20220529211430..)+AND+dt%3A20220627184226..
> > # /home/rob/.local/share/lei/store 0/0
> > # /home/rob/.local/share/lei/store 362/362
> > # 0 written to /home/rob/Mail/missing-cc/ (0 matches)
> > # https://lore.kernel.org/all/ 72/72
> > # https://lore.kernel.org/all/ 4/4
> > # https://lore.kernel.org/all/ 131/?
> > # https://lore.kernel.org/all/ 184/?
> > # https://lore.kernel.org/all/ 412/?
> > # https://lore.kernel.org/all/ 603/?
> > # https://lore.kernel.org/all/ 853/?
> > # https://lore.kernel.org/all/ 1069/?
> > # https://lore.kernel.org/all/ 1442/?
> > # https://lore.kernel.org/all/ 1443/1443
> > # 1 written to /home/rob/Mail/pci/ (75 matches)
> > # 2 written to /home/rob/Mail/my-patches/ (148 matches)
> > # 7 written to /home/rob/Mail/from-me/ (1805 matches)
> >
> >
> > What I expected was 3 messages written to 'my-patches'.
> >
> > I think the problem is just simply that the new message missing
> > doesn't match the query, but is a reply to a match. So with a date
> > after the original match in the thread won't pick up anything. The 2nd
> > URL above indeed only has 2 results. I guess I just have to fetch a
> > wider window like a month every time? What's needed is a get any new
> > messages in existing threads. I don't suppose there's an efficient way
> > to do that?
>
> No, I don't think so. I think this is a separate issue in lei...
> "t=1" in the remote query expands threads in a time-agnostic
> way, so I don't think that's the problem (though I may be wrong...).

Based on what the web interface presents, it sure seems like 't=1' is
independent of the query. The results listed are only those that match
the query and date range on the match. For example, this query returns
3 matches:

https://lore.kernel.org/all/?x=m&t=1&q=((dfn%3Adrivers+OR+dfn%3Aarch+OR+dfn%3ADocumentation%2F*+OR+dfn%3Ainclude+OR+dfn%3Ascripts)+AND+f%3Arobh%40kernel.org+AND+rt%3A1641934905..)+AND+dt%3A20220630203819..

If I change 'dt' to 1 day earlier, I get 1 more match:

https://lore.kernel.org/all/?x=m&t=1&q=((dfn%3Adrivers+OR+dfn%3Aarch+OR+dfn%3ADocumentation%2F*+OR+dfn%3Ainclude+OR+dfn%3Ascripts)+AND+f%3Arobh%40kernel.org+AND+rt%3A1641934905..)+AND+dt%3A20220629203819..

That 4th match has a reply after 6/30, but the 1st query will not get
the reply. This is all reproducible without lei involved at all.

What seems to be needed is a 'thread date' which is the latest time
for any message in a thread that matches. Or perhaps some way to
separate the query from what's transferred. IOW, query for X, but only
send results newer than some date.

Rob
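
The difference between the observed behavior and the two alternatives
suggested above can be sketched with a toy in-memory model. This is
only an illustration under stated assumptions, not public-inbox or lei
code: the Msg type, its fields, the sample data, and all three function
names are hypothetical, and "matches" stands in for whatever the
non-date part of the query (f:, dfn:, ...) would select.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Msg:
    mid: str        # hypothetical message identifier
    thread: str     # hypothetical thread identifier
    date: datetime
    matches: bool   # True if the message itself satisfies the non-date query terms


def windowed_then_expand(msgs, since):
    """Observed behavior (as modeled here): the date limit applies to the
    direct match, and only threads with an in-window match are expanded."""
    hit = {m.thread for m in msgs if m.matches and m.date >= since}
    return [m for m in msgs if m.thread in hit]


def expand_then_window(msgs, since):
    """'Query for X, but only send results newer than some date':
    match without a date limit, expand threads, then filter by date."""
    hit = {m.thread for m in msgs if m.matches}
    return [m for m in msgs if m.thread in hit and m.date >= since]


def by_thread_date(msgs, since):
    """'Thread date': select matching threads whose newest message
    (match or not) is inside the window, and return the whole thread."""
    latest = {}
    for m in msgs:
        if m.thread not in latest or m.date > latest[m.thread]:
            latest[m.thread] = m.date
    matched = {m.thread for m in msgs if m.matches}
    fresh = {t for t in matched if latest[t] >= since}
    return [m for m in msgs if m.thread in fresh]


# Sample data (made up): a matching patch sent before the window, a
# non-matching reply to it inside the window, and an unrelated message.
MSGS = [
    Msg("patch", "t1", datetime(2022, 6, 29, 12, 0), True),
    Msg("reply", "t1", datetime(2022, 7, 1, 9, 0), False),
    Msg("other", "t2", datetime(2022, 7, 2, 9, 0), False),
]
SINCE = datetime(2022, 6, 30, 20, 38, 19)

if __name__ == "__main__":
    for fn in (windowed_then_expand, expand_then_window, by_thread_date):
        print(fn.__name__, [m.mid for m in fn(MSGS, SINCE)])
```

With this sample data the first function returns nothing, because the
only direct match predates the window, so the later reply is never
fetched. The other two both pick up the reply; the 'thread date'
variant also resends the older patch, which a client would have to
deduplicate.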