Date: Sun, 26 Feb 2023 14:17:50 +0200
From: Maxim Mikityanskiy
To: Eric Wong
Cc: meta@public-inbox.org, Kyle Meyer
Subject: Re: [PATCH] lei q: do not collapse threads with `-tt'
References: <20230214024232.M64373@dcvr>
In-Reply-To: <20230214024232.M64373@dcvr>

On Tue, Feb 14, 2023 at 02:42:32AM +0000, Eric Wong wrote:
> Maxim Mikityanskiy wrote:
> >   lei q --no-save -a -o /tmp/lei-test -I 'https://lore.kernel.org/all' \
> >   -tt 'a:syzbot AND rt:2023-01-01..2023-01-07'
>
> At first, I thought -a (--augment) was causing it...
>
> Sidenote: you also don't need to quote the query (I forget the exact
> rules, but I tried to keep quotes easier for phrase searches).
>
> > It looks as if the match works correctly, but the -tt option fails to
> > mark most of the matched emails as important, except a few that actually
> > got marked (I couldn't find a pattern here). It's also not consistent,
> > for example, after I removed /tmp/lei-test and restarted the lei q
> > command, I got many more important emails, almost in each thread, but
> > there were still threads without flagged emails.
>
> Yes, now it seems it's the collapsing optimization.
>
> > I'm checking the flags with mutt.
> >
> > Does anyone know what could be the reason for such behavior?
>
> I think the following patch fixes it.

Sorry for taking so long. I finally found a minute to test it, and
unfortunately I didn't see a difference. I queried for:

  a:syzbot AND rt:2023-02-01..2023-02-07

and I still saw a lot of threads without a single flag. I double-checked
that the patch was actually applied, killed lei-daemon, and removed the
mailbox directory, but it didn't help.

> (I accidentally sent you a private copy with invalid blobs since
> I had other unpublished changes)
>
> -----8<-------
> Subject: [PATCH] lei q: do not collapse threads with `-tt'
>
> While having Xapian collapse threads is an easy way to reduce
> the amount of deduplication work we need to do when writing
> out threads; we can't rely on it when using `lei q -tt` since
> that needs to flag all hits.
>
> Reported-by: Maxim Mikityanskiy
> Link: https://public-inbox.org/git/Y+pgBmj0jxR+cVkD@mail.gmail.com/
> ---
>  lib/PublicInbox/Search.pm | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
> index 2feb3e13..273cc57c 100644
> --- a/lib/PublicInbox/Search.pm
> +++ b/lib/PublicInbox/Search.pm
> @@ -460,8 +460,9 @@ sub _enquire_once { # retry_reopen callback
>  		$enquire->set_sort_by_relevance_then_value(TS, !$opts->{asc});
>  	}
>  
> -	# `mairix -t / --threads' or JMAP collapseThreads
> -	if ($opts->{threads} && has_threadid($self)) {
> +	# `lei q -t / --threads' or JMAP collapseThreads; but don't collapse
> +	# on `-tt' ({threads} > 1) which sets the Flagged|Important keyword
> +	if (($opts->{threads} // 0) == 1 && has_threadid($self)) {
>  		$enquire->set_collapse_key(THREADID);
>  	}
>  	$enquire->get_mset($opts->{offset} || 0, $opts->{limit} || 50);
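
For reference, the reasoning in the commit message can be sketched in a few lines of Python. This is only an illustration and not public-inbox code: the `Hit` tuple, the sample docids and thread IDs, and the helper names are all hypothetical. It assumes that collapsing on a key keeps a single best-ranked hit per key value, so with collapsing enabled only one hit per thread is available to be flagged. It does not model whatever else may be causing the missing flags reported above.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    docid: int      # hypothetical document number
    threadid: int   # hypothetical thread ID used as the collapse key


def collapse_by_thread(hits):
    """Keep only the first (best-ranked) hit for each thread ID."""
    seen = set()
    kept = []
    for hit in hits:
        if hit.threadid not in seen:
            seen.add(hit.threadid)
            kept.append(hit)
    return kept


def should_collapse(threads_opt: Optional[int], has_threadid: bool = True) -> bool:
    """Mirror of the patched condition: collapse only for a single -t."""
    return (threads_opt or 0) == 1 and has_threadid


def hits_to_flag(hits, threads_opt):
    """Hits still visible to the caller after optional collapsing."""
    if should_collapse(threads_opt):
        return collapse_by_thread(hits)
    return list(hits)


# Five matching messages spread over two threads (made-up data).
hits = [Hit(1, 100), Hit(2, 100), Hit(3, 200), Hit(4, 100), Hit(5, 200)]

print(len(hits_to_flag(hits, 1)))  # -t: one hit per thread survives
print(len(hits_to_flag(hits, 2)))  # -tt: every hit survives and can be flagged
```

With the old condition (`$opts->{threads}` merely being true), the `-tt` case would take the collapsing branch too, leaving only one hit per thread to flag.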