From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <e@80x24.org>
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 24ACA1F953;
	Tue,  9 Nov 2021 03:12:33 +0000 (UTC)
Date: Tue, 9 Nov 2021 03:12:33 +0000
From: Eric Wong <e@80x24.org>
To: Rob Herring <robh@kernel.org>
Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>,
	meta@public-inbox.org
Subject: Re: [PATCH] searchidx: index "diff --git a/... b/..." headers
Message-ID: <20211109031233.GA19089@dcvr>
References: <lorelei.part1.202111051304.mdtebsxahljcrxak@meerkat.local>
 <CAL_JsqJBh1O3H2-P07AHzVq0x89BoP_N6P=rT5up6=3QyF_B0Q@mail.gmail.com>
 <20211108202204.q5zg6bachnvbjlnx@meerkat.local>
 <CAL_Jsq+XtqOEF7p5zbO2O2YdHPr61+ahPgdDhH7_XMwyuDuc2w@mail.gmail.com>
 <20211108212714.GA13642@dcvr>
 <CAL_Jsq+aqDmpxUHsw844xS8f6WRX3gcvt7GQhf2XB7-Lb=Yx8Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <CAL_Jsq+aqDmpxUHsw844xS8f6WRX3gcvt7GQhf2XB7-Lb=Yx8Q@mail.gmail.com>
List-Id: <meta.public-inbox.org>

Rob Herring <robh@kernel.org> wrote:
> On Mon, Nov 8, 2021 at 3:27 PM Eric Wong <e@80x24.org> wrote:
> >
> > Rob Herring <robh@kernel.org> wrote:
> > > On Mon, Nov 8, 2021 at 2:22 PM Konstantin Ryabitsev
> > > > I think 's:patch AND nq:diff' is a good option here.
> > >
> > > Not even close really. That mainly finds my replies with 'diff' in
> > > them. I'm not sure why, but it misses most actual patches:
> > >
> > > https://lore.kernel.org/all/?q=s%3Apatch+nq%3Adiff+f%3Arobh%40kernel.org
> >
> > Actually, it looks like nq:diff never works.  The diff indexer
> > skips right over 'diff --git a/... b/...' lines :x
>
> Never works for 'diff' being a patch? Because it works very well
> finding all the other cases.

Yeah, the index_diff() code path ignored the "diff --git" phrase
before this patch.

> > The following should fix it, but reindexing is necessary.
> > ---------8<----------
> > Subject: [PATCH] searchidx: index "diff --git a/... b/..." headers
> >
> > While we do detailed indexing of git diffs, the header itself
> > was failing and queries like 'nq:diff' would not work.
>
> Any thoughts on supporting an 'is a patch' type query?

I think 's:patch' should be sufficient, don't think there's many
false-positives on that front, actually.

With this fix, nq:"diff --git" should also be working across
https://yhbt.net/lore/ in about 40 hours (whenever reindex finishes)

I'm not sure if there needs to be a specific term to index
patches on; maybe there is.  There's still a lot of Xapian
we're not using, yet...
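
For anyone wanting to try the queries discussed in this thread, here is a
minimal Python sketch of how a query string maps to the URL form quoted
earlier. It is only an illustration: search_url() is a hypothetical helper
and not part of public-inbox, and the base URL is simply copied from the
lore.kernel.org link quoted in this message.

```python
from urllib.parse import quote_plus

# Base URL copied from the search link quoted in this thread.
LORE_ALL = "https://lore.kernel.org/all/"


def search_url(query: str, base: str = LORE_ALL) -> str:
    """Build a search URL for `query`.

    Hypothetical helper for illustration only; it just form-encodes the
    query string and appends it as the `q` parameter.
    """
    return f"{base}?q={quote_plus(query)}"


if __name__ == "__main__":
    # The query from earlier in the thread.
    print(search_url("s:patch nq:diff f:robh@kernel.org"))
    # The phrase query that should work once reindexing finishes.
    print(search_url('s:patch nq:"diff --git"'))
```

The first call reproduces the URL quoted earlier in the thread; the second
only shows how the quoted phrase gets encoded and says nothing about what
the search will return.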