From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <e@80x24.org>
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level: 
X-Spam-ASN:  
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 24ACA1F953;
	Tue,  9 Nov 2021 03:12:33 +0000 (UTC)
Date: Tue, 9 Nov 2021 03:12:33 +0000
From: Eric Wong <e@80x24.org>
To: Rob Herring <robh@kernel.org>
Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>,
	meta@public-inbox.org
Subject: Re: [PATCH] searchidx: index "diff --git a/... b/..." headers
Message-ID: <20211109031233.GA19089@dcvr>
References: <lorelei.part1.202111051304.mdtebsxahljcrxak@meerkat.local>
 <CAL_JsqJBh1O3H2-P07AHzVq0x89BoP_N6P=rT5up6=3QyF_B0Q@mail.gmail.com>
 <20211108202204.q5zg6bachnvbjlnx@meerkat.local>
 <CAL_Jsq+XtqOEF7p5zbO2O2YdHPr61+ahPgdDhH7_XMwyuDuc2w@mail.gmail.com>
 <20211108212714.GA13642@dcvr>
 <CAL_Jsq+aqDmpxUHsw844xS8f6WRX3gcvt7GQhf2XB7-Lb=Yx8Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <CAL_Jsq+aqDmpxUHsw844xS8f6WRX3gcvt7GQhf2XB7-Lb=Yx8Q@mail.gmail.com>
List-Id: <meta.public-inbox.org>

Rob Herring <robh@kernel.org> wrote:
> On Mon, Nov 8, 2021 at 3:27 PM Eric Wong <e@80x24.org> wrote:
> >
> > Rob Herring <robh@kernel.org> wrote:
> > > On Mon, Nov 8, 2021 at 2:22 PM Konstantin Ryabitsev
> > > > I think 's:patch AND nq:diff' is a good option here.
> > >
> > > Not even close really. That mainly finds my replies with 'diff' in
> > > them. I'm not sure why, but it misses most actual patches:
> > >
> > > https://lore.kernel.org/all/?q=s%3Apatch+nq%3Adiff+f%3Arobh%40kernel.org
> >
> > Actually, it looks like nq:diff never works.  The diff indexer
> > skips right over 'diff --git a/... b/...' lines :x
> 
> Never works for 'diff' being a patch? Because it works very well
> finding all the other cases.

Yeah, the index_diff() code path ignored the "diff --git" phrase
before this patch.

> > The following should fix it, but reindexing is necessary.
> > ---------8<----------
> > Subject: [PATCH] searchidx: index "diff --git a/... b/..." headers
> >
> > While we do detailed indexing of git diffs, the header itself
> > was failing and queries like 'nq:diff' would not work.
> 
> Any thoughts on supporting an 'is a patch' type query?

I think 's:patch' should be sufficient; I don't think there are
many false positives on that front, actually.

With this fix, nq:"diff --git" should also be working across
https://yhbt.net/lore/ in about 40 hours (whenever the reindex
finishes).
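
For anyone scripting these searches, here is a minimal sketch of
how the query strings discussed above map onto the ?q= URL form
seen in the lore.kernel.org link earlier in this thread.  The
helper name search_url() is hypothetical and only illustrates the
percent-encoding; it is not part of public-inbox, and whether a
given query returns results depends on the instance having been
reindexed.

```python
from urllib.parse import urlencode


def search_url(base, query):
    """Append a percent-encoded ?q= search query to an inbox URL.

    search_url() is a hypothetical helper for illustration only.
    """
    return base + "?" + urlencode({"q": query})


# The combined subject/sender/non-quoted-text query from this thread.
print(search_url("https://lore.kernel.org/all/",
                 "s:patch nq:diff f:robh@kernel.org"))

# Phrase search on the diff header; the double quotes are part of
# the query and get percent-encoded along with the colon.
print(search_url("https://lore.kernel.org/all/",
                 'nq:"diff --git"'))
```

The first call reproduces the URL Rob posted; the second builds
the phrase form of the query, quotes included.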

I'm not sure if there needs to be a specific term to index
patches on; maybe there is.  There's still a lot of Xapian
we're not using, yet...