unofficial mirror of meta@public-inbox.org
* Troubleshooting threads missing from /all/
@ 2021-10-01 13:05 Konstantin Ryabitsev
  2021-10-01 19:58 ` Eric Wong
  2021-10-01 20:17 ` Troubleshooting threads missing from /all/ Konstantin Ryabitsev
  0 siblings, 2 replies; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-01 13:05 UTC (permalink / raw)
  To: meta

Hello:

I was told about the following problem today:

The following thread:
https://lore.kernel.org/regressions/87czop5j33.fsf@tynnyri.adurom.net/

Doesn't appear to show up in /all/:
https://lore.kernel.org/all/87czop5j33.fsf@tynnyri.adurom.net/

Any hints how to troubleshoot what's happening?

This is happening across all 3 mirror nodes, but it does the right thing on
yhbt.net, so it would appear that something went wonky when running
public-inbox-extindex specifically on lore systems.

http://yhbt.net/lore/all/87czop5j33.fsf@tynnyri.adurom.net/

Rerunning public-inbox-extindex doesn't seem to fix it.

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-01 13:05 Troubleshooting threads missing from /all/ Konstantin Ryabitsev
@ 2021-10-01 19:58 ` Eric Wong
  2021-10-01 20:08   ` Konstantin Ryabitsev
  2021-10-01 20:17 ` Troubleshooting threads missing from /all/ Konstantin Ryabitsev
  1 sibling, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-01 19:58 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> Hello:
> 
> I was told about the following problem today:
> 
> The following thread:
> https://lore.kernel.org/regressions/87czop5j33.fsf@tynnyri.adurom.net/
> 
> Doesn't appear to show up in /all/:
> https://lore.kernel.org/all/87czop5j33.fsf@tynnyri.adurom.net/
> 
> Any hints how to troubleshoot what's happening?
> 
> This is happening across all 3 mirror nodes, but it does the right thing on
> yhbt.net, so it would appear that something went wonky when running
> public-inbox-extindex specifically on lore systems.
> 
> http://yhbt.net/lore/all/87czop5j33.fsf@tynnyri.adurom.net/
> 
> Rerunning public-inbox-extindex doesn't seem to fix it.

Are you running "public-inbox-extindex --all"?
Or relying on "public-inbox-index -E ..."? (which is automatic
for /all/?).

I noticed a bug for the latter case yesterday, but don't think
it affects you unless you reindexed...

I also broke something on yhbt/lore from a few hours ago
(around 10/1 10:00..10:30 UTC), but don't think it affects
that thread since it's some hours older.

I'll take a closer look in a bit.

* Re: Troubleshooting threads missing from /all/
  2021-10-01 19:58 ` Eric Wong
@ 2021-10-01 20:08   ` Konstantin Ryabitsev
  2021-10-01 20:41     ` Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-01 20:08 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Fri, Oct 01, 2021 at 07:58:11PM +0000, Eric Wong wrote:
> Are you running "public-inbox-extindex --all"?
> Or relying on "public-inbox-index -E ..."? (which is automatic
> for /all/?).

After each repository update, we run:

    public-inbox-index --no-update-extindex

And at the end of each grokmirror run, we perform a single:

    public-inbox-extindex --all

I believe this is what you suggested a while back to minimize disk thrashing.
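
A minimal sketch of that two-stage flow, for illustration only: the `run`
helper, the DRY_RUN variable, and the inbox paths are hypothetical, and only
the two commands and their flags are taken from the description above.

```shell
#!/bin/sh
# Sketch: index each inbox without updating extindex, then run a
# single extindex pass at the end (assumed to mirror the flow above).

run() {
    # With DRY_RUN set (hypothetical knob), print the command
    # instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

index_inboxes() {
    # Per-inbox indexing; the external index is left untouched here.
    for inboxdir in "$@"; do
        run public-inbox-index --no-update-extindex "$inboxdir" || return 1
    done
    # One extindex pass covers every configured inbox.
    run public-inbox-extindex --all
}
```

Calling `index_inboxes` with DRY_RUN=1 and two inbox paths prints the three
commands in order without running them.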

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-01 13:05 Troubleshooting threads missing from /all/ Konstantin Ryabitsev
  2021-10-01 19:58 ` Eric Wong
@ 2021-10-01 20:17 ` Konstantin Ryabitsev
  2021-10-02 11:39   ` Eric Wong
  1 sibling, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-01 20:17 UTC (permalink / raw)
  To: meta

On Fri, Oct 01, 2021 at 09:05:27AM -0400, Konstantin Ryabitsev wrote:
> I was told about the following problem today:
> 
> The following thread:
> https://lore.kernel.org/regressions/87czop5j33.fsf@tynnyri.adurom.net/
> 
> Doesn't appear to show up in /all/:
> https://lore.kernel.org/all/87czop5j33.fsf@tynnyri.adurom.net/

One thing I noticed is that this returns a 300, not a 404:

  HTTP/1.1 300 Multiple Choices

However, this seems to happen for all unknown message-ids (e.g.
lore.kernel.org/all/bogus@bogus), so probably entirely unrelated.

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-01 20:08   ` Konstantin Ryabitsev
@ 2021-10-01 20:41     ` Eric Wong
  2021-10-01 20:49       ` Konstantin Ryabitsev
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-01 20:41 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Fri, Oct 01, 2021 at 07:58:11PM +0000, Eric Wong wrote:
> > Are you running "public-inbox-extindex --all"?
> > Or relying on "public-inbox-index -E ..."? (which is automatic
> > for /all/?).

>     public-inbox-extindex --all
> 
> I believe this is what you suggested a while back to minimize disk thrashing.

Yup, that's what I do, too.   I'm looking at another mirror I have
(off yhbt.net/lore/all) and it works correctly.  I wonder if
it's a deduplication problem, since I see <87czop5j33.fsf@tynnyri.adurom.net>
twice.   But deduplication problems shouldn't result in
completely missing messages...

Curious, what is the output of:

   lei inspect --dir /path/to/all mid:87czop5j33.fsf@tynnyri.adurom.net

for you?

This would be a good time to beef up lei inspect's
capabilities...

* Re: Troubleshooting threads missing from /all/
  2021-10-01 20:41     ` Eric Wong
@ 2021-10-01 20:49       ` Konstantin Ryabitsev
  2021-10-01 20:54         ` Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-01 20:49 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Fri, Oct 01, 2021 at 08:41:31PM +0000, Eric Wong wrote:
> Curious, what is the output of:
> 
>    lei inspect --dir /path/to/all mid:87czop5j33.fsf@tynnyri.adurom.net
> 
> for you?

Not much. :)

	lei inspect --dir /srv/public-inbox/extindex mid:87czop5j33.fsf@tynnyri.adurom.net
	48242 lei-inspect worker wq_worker: Can't call method "can" on an undefined value at /usr/local/share/perl5/PublicInbox/LeiInspect.pm line 156.

I seem to get the same error for a message-id that *is* present in /all/

The tree is at af774d3bb0d728f2f37c418b8c3e215f1d4d860f

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-01 20:49       ` Konstantin Ryabitsev
@ 2021-10-01 20:54         ` Eric Wong
  2021-10-01 20:58           ` Konstantin Ryabitsev
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-01 20:54 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Fri, Oct 01, 2021 at 08:41:31PM +0000, Eric Wong wrote:
> > Curious, what is the output of:
> > 
> >    lei inspect --dir /path/to/all mid:87czop5j33.fsf@tynnyri.adurom.net
> > 
> > for you?
> 
> Not much. :)
> 
> 	lei inspect --dir /srv/public-inbox/extindex mid:87czop5j33.fsf@tynnyri.adurom.net
> 	48242 lei-inspect worker wq_worker: Can't call method "can" on an undefined value at /usr/local/share/perl5/PublicInbox/LeiInspect.pm line 156.

Oops, inspect doesn't work well w/o initialization (it should).
Running "lei init" first should work around it, for now.

* Re: Troubleshooting threads missing from /all/
  2021-10-01 20:54         ` Eric Wong
@ 2021-10-01 20:58           ` Konstantin Ryabitsev
  2021-10-01 22:25             ` Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-01 20:58 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Fri, Oct 01, 2021 at 08:54:42PM +0000, Eric Wong wrote:
> Oops, inspect doesn't work well w/o initialization (it should).
> Running "lei init" first should work around it, for now.

Yes, that fixes the problem, but it still doesn't return much:

	{
	   "mid" : "87czop5j33.fsf@tynnyri.adurom.net"
	}

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-01 20:58           ` Konstantin Ryabitsev
@ 2021-10-01 22:25             ` Eric Wong
  2021-10-01 22:27               ` Eric Wong
  2021-10-01 23:11               ` Konstantin Ryabitsev
  0 siblings, 2 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-01 22:25 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

OK, so I'm also trying this on a CentOS 7.x VM, and
the lei part works as expected, so it seems specific to
extindex and not a problem with package differences in CentOS.

  export HOME=/tmp/trash # fresh lei/store instance
  M=87czop5j33.fsf@tynnyri.adurom.net
  lei import https://yhbt.net/lore/all/$M/t.mbox.gz
  lei q z:0.. | wc -l # should have all (11) msgs
  lei q m:$M -t | wc -l # should have the same msgs (11)

Can you confirm the above gives all 11 msgs for you?

I am running an -extindex --reindex on lore/all, though;
hopefully it doesn't break anything.

* Re: Troubleshooting threads missing from /all/
  2021-10-01 22:25             ` Eric Wong
@ 2021-10-01 22:27               ` Eric Wong
  2021-10-01 23:11               ` Konstantin Ryabitsev
  1 sibling, 0 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-01 22:27 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Can you confirm the above gives all 11 msgs for you?

erm, that's 10 msgs, I forget the JSON output always ends with a
"null" line
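
To avoid that off-by-one when counting, a small helper can subtract the
trailing line.  This is only a sketch: `count_msgs` is a made-up name, and it
assumes the piped JSON output has one message per line followed by a single
"null" line, as described above.

```shell
# count_msgs: read piped `lei q` JSON output on stdin and print the
# number of messages, assuming one message per line plus a final
# "null" line (so the count is lines minus one).
count_msgs() {
    lines=$(( $(wc -l) ))   # arithmetic expansion strips wc padding
    if [ "$lines" -gt 0 ]; then
        echo $((lines - 1))
    else
        echo 0
    fi
}
```

Usage would look like `lei q m:$M -t | count_msgs`.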

* Re: Troubleshooting threads missing from /all/
  2021-10-01 22:25             ` Eric Wong
  2021-10-01 22:27               ` Eric Wong
@ 2021-10-01 23:11               ` Konstantin Ryabitsev
  2021-10-01 23:46                 ` Eric Wong
  1 sibling, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-01 23:11 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Fri, Oct 01, 2021 at 10:25:09PM +0000, Eric Wong wrote:
>   export HOME=/tmp/trash # fresh lei/store instance
>   M=87czop5j33.fsf@tynnyri.adurom.net
>   lei import https://yhbt.net/lore/all/$M/t.mbox.gz
>   lei q z:0.. | wc -l # should have all (11) msgs
>   lei q m:$M -t | wc -l # should have the same msgs (11)

Yes, both are reporting 11.

> Can you confirm the above gives all 11 msgs for you?
> 
> I am running an -extindex --reindex on lore/all, though;
> hopefully it doesn't break anything.

I'll also be happy to provide the extindex, though that's a bit on the largish
side at almost 225G. :) Just let me know and I'll see if I can set up a
temporary VM with access for you.

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-01 23:11               ` Konstantin Ryabitsev
@ 2021-10-01 23:46                 ` Eric Wong
  2021-10-02  0:02                   ` Eric Wong
  2021-10-05  4:39                   ` Eric Wong
  0 siblings, 2 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-01 23:46 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Fri, Oct 01, 2021 at 10:25:09PM +0000, Eric Wong wrote:
> >   export HOME=/tmp/trash # fresh lei/store instance
> >   M=87czop5j33.fsf@tynnyri.adurom.net
> >   lei import https://yhbt.net/lore/all/$M/t.mbox.gz
> >   lei q z:0.. | wc -l # should have all (11) msgs
> >   lei q m:$M -t | wc -l # should have the same msgs (11)
> 
> Yes, both are reporting 11.

OK, good.

> I'll also be happy to provide the extindex, though that's a bit on the largish
> side at almost 225G. :) Just let me know and I'll see if I can set up a
> temporary VM with access for you.

Probably not needed.  I wonder if it's related to some
dedupe bugs I also noticed with lei.  I'll take a harder look
at it once I get some more calories+sleep in me...  I hit some
massive brain fade yesterday-ish working on yhbt :<

If you have extra space somewhere, you could try one of:

a) copy the old extindex somewhere and do extindex --reindex on it
b) just reindex in place (it /should/ work...)
c) start a new extindex from scratch...

And then see if inspect gives different results

I'm also running:

  public-inbox-extindex --no-fsync /tmp/foo lore/linux-wireless lore/regressions

To create a new extindex with just those two inboxes (and maybe
again with inbox order reversed).  But it's also taking a while...

* Re: Troubleshooting threads missing from /all/
  2021-10-01 23:46                 ` Eric Wong
@ 2021-10-02  0:02                   ` Eric Wong
  2021-10-05  4:39                   ` Eric Wong
  1 sibling, 0 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-02  0:02 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> If you have extra space somewhere, you could try one of:
> 
> a) copy the old extindex somewhere and do extindex --reindex on it
> b) just reindex in place (it /should/ work...)
> c) start a new extindex from scratch...

I might have to try c) myself.  Though it might be time/batch-related;
because my mirrors probably see bigger delays and larger
batches of indexing...

> I'm also running:
> 
>   public-inbox-extindex --no-fsync /tmp/foo lore/linux-wireless lore/regressions
> 
> To create a new extindex with just those two inboxes (and maybe
> again with inbox order reversed).  But it's also taking a while...

OK, 2 inboxes finished once I stopped some other jobs :x
Couldn't reproduce the problem with either order:

lei q --only /tmp/foo -t mid:87czop5j33.fsf@tynnyri.adurom.net |wc -l
lei q --only /tmp/foo-reversed -t mid:87czop5j33.fsf@tynnyri.adurom.net |wc -l

Both threaded the results properly and showed 11 lines.

* Re: Troubleshooting threads missing from /all/
  2021-10-01 20:17 ` Troubleshooting threads missing from /all/ Konstantin Ryabitsev
@ 2021-10-02 11:39   ` Eric Wong
  0 siblings, 0 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-02 11:39 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
>   HTTP/1.1 300 Multiple Choices
> 
> However, this seems to happen for all unknown message-ids (e.g.
> lore.kernel.org/all/bogus@bogus), so probably entirely unrelated.

Yeah, unrelated, it's always returned 300 since I wanted to highlight
the fact there's many hosts out there which understand Message-IDs.
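
Since an unknown Message-ID yields 300 rather than 404, a scripted check for
missing messages has to treat 300 as "not here".  A rough sketch, assuming
curl is available; both function names are made up for illustration:

```shell
# http_status: print only the HTTP status code for a URL
# (needs curl and network access).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# describe_status: map a status code to what it means for a
# Message-ID URL; the 300 case follows the behaviour described here.
describe_status() {
    case "$1" in
        200) echo "found" ;;
        300) echo "unknown Message-ID" ;;
        *)   echo "unexpected status $1" ;;
    esac
}
```

For example: `describe_status "$(http_status "$URL")"`, where URL is a
Message-ID URL under /all/.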

I just posted a few more patches, one which may help detect missing blobs:
https://public-inbox.org/meta/20211002111835.19220-5-e@80x24.org/

(I don't know if you log the output of extindex, I don't :x)

mid:87czop5j33.fsf@tynnyri.adurom.net
	a10c4087bee4dd10eee64903de80ca5c9c064db2
	ad29c5999a31edcbbed722b0d8299bf4f72bcae7

mid:d4405cdb-95ee-1dd9-6957-502269feb15e@leemhuis.info
	910d8e8beb3f5c692c4ba9522f6e22819cce69fb
	doesn't show up in l.k.o/all/ either...

So I wonder if there was a blob reachability problem during
indexing.

Btw, unlikely to matter, but are you using boost on either
regressions or linux-wireless?

* Re: Troubleshooting threads missing from /all/
  2021-10-01 23:46                 ` Eric Wong
  2021-10-02  0:02                   ` Eric Wong
@ 2021-10-05  4:39                   ` Eric Wong
  2021-10-05 18:03                     ` Konstantin Ryabitsev
  1 sibling, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-05  4:39 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> b) just reindex in place (it /should/ work...)

I reindexed live on yhbt/lore and it didn't break...

Btw, did you see my other questions about whether or not boost
was in use?

I need to work on some read-only "fsck"-type functionality
and limited reindexing.

* Re: Troubleshooting threads missing from /all/
  2021-10-05  4:39                   ` Eric Wong
@ 2021-10-05 18:03                     ` Konstantin Ryabitsev
  2021-10-06 10:18                       ` Eric Wong
  2021-10-07 21:33                       ` Eric Wong
  0 siblings, 2 replies; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-05 18:03 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Tue, Oct 05, 2021 at 04:39:54AM +0000, Eric Wong wrote:
> Eric Wong <e@80x24.org> wrote:
> > b) just reindex in place (it /should/ work...)
> 
> I reindexed live on yhbt/lore and it didn't break...
> 
> Btw, did you see my other questions about whether or not boost
> was in use?

Yes, but I was attending the Linux Security Summit last week, so my attention
was all over the place. We do use boost values there. Looking at that
particular message, it was sent to regressions and linux-wireless, which have
different boost values:

[publicinbox "regressions"]
  address = regressions@lists.linux.dev
  url = regressions
  inboxdir = /srv/public-inbox/lore.kernel.org/regressions
  indexlevel = full
  newsgroup = dev.linux.lists.regressions
  boost = 11
  listid = regressions.lists.linux.dev

[publicinbox "linux-wireless"]
  address = linux-wireless@vger.kernel.org
  url = linux-wireless
  inboxdir = /srv/public-inbox/lore.kernel.org/linux-wireless
  indexlevel = full
  newsgroup = org.kernel.vger.linux-wireless
  boost = 10
  listid = linux-wireless.vger.kernel.org

We give lists.linux.dev a higher boost value because we populate it straight
from watched mlmmj archive dirs and it's more likely to have "truest" headers.

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-05 18:03                     ` Konstantin Ryabitsev
@ 2021-10-06 10:18                       ` Eric Wong
  2021-10-07  8:36                         ` Eric Wong
  2021-10-07 21:33                         ` Eric Wong
  2021-10-07 21:33                       ` Eric Wong
  1 sibling, 2 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-06 10:18 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Tue, Oct 05, 2021 at 04:39:54AM +0000, Eric Wong wrote:
> > Eric Wong <e@80x24.org> wrote:
> > > b) just reindex in place (it /should/ work...)
> > 
> > I reindexed live on yhbt/lore and it didn't break...
> > 
> > Btw, did you see my other questions about whether or not boost
> > was in use?
> 
> Yes, but I was attending the Linux Security Summit last week, so my attention
> was all over the place. We do use boost values there. Looking at that
> particular message, it was sent to regressions and linux-wireless, which have
> different boost values:

* Re: Troubleshooting threads missing from /all/
  2021-10-06 10:18                       ` Eric Wong
@ 2021-10-07  8:36                         ` Eric Wong
  2021-10-07 13:57                           ` Konstantin Ryabitsev
  2021-10-07 21:33                         ` Eric Wong
  1 sibling, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-07  8:36 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Also, did you capture any error messages to stderr?
I suppose you would've told us if you did.

* Re: Troubleshooting threads missing from /all/
  2021-10-07  8:36                         ` Eric Wong
@ 2021-10-07 13:57                           ` Konstantin Ryabitsev
  2021-10-07 21:36                             ` Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-07 13:57 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Thu, Oct 07, 2021 at 08:36:52AM +0000, Eric Wong wrote:
> Also, did you capture any error messages to stderr?
> I suppose you would've told us if you did.

Yeah, I looked through any place that would have logged an error and I didn't
really see anything. I expect this would have happened during an extindex run,
but I didn't see any non-zero exits when I looked through the logs.

Regarding reindex -- is that something that would make sense to do
occasionally simply for potential improvements, e.g. similarly to how we
periodically repack repos with -f for better packs? Or would that be pointless
churn in the context of xapian?

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-05 18:03                     ` Konstantin Ryabitsev
  2021-10-06 10:18                       ` Eric Wong
@ 2021-10-07 21:33                       ` Eric Wong
  2021-10-08 17:33                         ` Konstantin Ryabitsev
  1 sibling, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-07 21:33 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

(resend, screwed up something with my MTA :x)

OK.  I tried reproducing the problem even with f28fdcd6d8d6
(content_hash: normalize whitespace before hashing addresses, 2021-10-02)
reverted, but haven't been able to...

So far I've found some gc and dedupe bugs, but something's still
eluding me.  And I also noticed and started fixing another bug
which may necessitate a full --reindex, anyways (at least for
non-ASCII subjects).

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> [publicinbox "regressions"]
>   address = regressions@lists.linux.dev
>   url = regressions
>   inboxdir = /srv/public-inbox/lore.kernel.org/regressions
>   indexlevel = full

Btw, "indexlevel = basic" ought to be sufficient if an inbox
is in extindex once bugs are ironed out.  full/medium is
of course helpful if messages are missing from extindex,
though...

* Re: Troubleshooting threads missing from /all/
  2021-10-06 10:18                       ` Eric Wong
  2021-10-07  8:36                         ` Eric Wong
@ 2021-10-07 21:33                         ` Eric Wong
  1 sibling, 0 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-07 21:33 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Also, did you capture any error messages to stderr?
I suppose you would've told us if you did.

(resend, MTA dropped this part:)

In particular, I just posted a patch to fix
"Can't bless non-reference value" messages, which could've been causing
some messages to fail indexing completely.

<20211007082932.6985-1-e@80x24.org>
(overidx: each_by_mid: account for messages being deleted)

Error reporting/handling needs some work... :x

* Re: Troubleshooting threads missing from /all/
  2021-10-07 13:57                           ` Konstantin Ryabitsev
@ 2021-10-07 21:36                             ` Eric Wong
  0 siblings, 0 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-07 21:36 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Thu, Oct 07, 2021 at 08:36:52AM +0000, Eric Wong wrote:
> > Also, did you capture any error messages to stderr?
> > I suppose you would've told us if you did.
> 
> Yeah, I looked through any place that would have logged an error and I didn't
> really see anything. I expect this would have happened during an extindex run,
> but I didn't see any non-zero exits when I looked through the logs.
> 
> Regarding reindex -- is that something that would make sense to do
> occasionally simply for potential improvements, e.g. similarly to how we
> periodically repack repos with -f for better packs? Or would that be pointless
> churn in the context of xapian?

Yes, I try to note when a reindex is necessary in commit messages.
It takes my system around 2 days to do, typically, but it should
run safely in parallel with everything else.

It should probably be done every release, but I suck at making those :<

* Re: Troubleshooting threads missing from /all/
  2021-10-07 21:33                       ` Eric Wong
@ 2021-10-08 17:33                         ` Konstantin Ryabitsev
  2021-10-08 21:34                           ` Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-08 17:33 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Thu, Oct 07, 2021 at 09:33:07PM +0000, Eric Wong wrote:
> Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > [publicinbox "regressions"]
> >   address = regressions@lists.linux.dev
> >   url = regressions
> >   inboxdir = /srv/public-inbox/lore.kernel.org/regressions
> >   indexlevel = full
> 
> Btw, "indexlevel = basic" ought to be sufficient if an inbox
> is in extindex once bugs are ironed out.  full/medium is
> of course helpful if messages are missing from extindex,
> though...

That would save tons of space for sure. How does that work, would the search
box still be available on per-list basis? Does it just use extindex and
additionally filter by list-id or some similar parameter?

To switch from the current setup where every list has its own full index, is
it sufficient to just set indexlevel = basic and delete the xapian db?

Thanks,
-K

* Re: Troubleshooting threads missing from /all/
  2021-10-08 17:33                         ` Konstantin Ryabitsev
@ 2021-10-08 21:34                           ` Eric Wong
  2021-10-16  9:43                             ` Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-08 21:34 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Thu, Oct 07, 2021 at 09:33:07PM +0000, Eric Wong wrote:
> > Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > > [publicinbox "regressions"]
> > >   address = regressions@lists.linux.dev
> > >   url = regressions
> > >   inboxdir = /srv/public-inbox/lore.kernel.org/regressions
> > >   indexlevel = full
> > 
> > Btw, "indexlevel = basic" ought to be sufficient if an inbox
> > is in extindex once bugs are ironed out.  full/medium is
> > of course helpful if messages are missing from extindex,
> > though...
> 
> That would save tons of space for sure. How does that work, would the search
> box still be available on per-list basis? Does it just use extindex and
> additionally filter by list-id or some similar parameter?

It filters by newsgroup; and falls back to inboxdir if no newsgroup
is configured

> To switch from the current setup where every list has its own full index, is
> it sufficient to just set indexlevel = basic and delete the xapian db?

Yes.  Though given the current situation with missing messages
from /all/, I'd wait until a reindex recovers the missing
messages (and probably a fast fsck checker).
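
For reference, the config half of that switch could be sketched as below.
The public-inbox config file uses git-config syntax, so `git config --file`
can edit it; the function name and its arguments are hypothetical, and the
sketch deliberately does not delete any Xapian data.

```shell
# set_indexlevel_basic CONFIG_FILE INBOX_NAME
# Sets indexlevel = basic in the [publicinbox "INBOX_NAME"] section.
# Illustrative only; removing the old per-inbox Xapian DB is left out.
set_indexlevel_basic() {
    git config --file "$1" "publicinbox.$2.indexlevel" basic
}
```

For example, `set_indexlevel_basic "$PI_CONFIG" regressions` would update the
"regressions" section shown earlier in this thread.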

* Re: Troubleshooting threads missing from /all/
  2021-10-08 21:34                           ` Eric Wong
@ 2021-10-16  9:43                             ` Eric Wong
  2021-10-17 18:04                               ` Konstantin Ryabitsev
  2021-10-18 14:03                               ` Konstantin Ryabitsev
  0 siblings, 2 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-16  9:43 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Yes.  Though given the current situation with missing messages
> from /all/, I'd wait until a reindex recovers the missing
> messages (and probably a fast fsck checker).

I think "public-inbox-extindex --reindex --all --fast" is
reasonably ready as an fsck checker.  I've been running it a
bunch in recent days/weeks and also found+fixed some other bugs
along the way.

With --fast, --reindex takes around 20 minutes for me with
"--batch-size=20m --no-fsync".  The first run may take longer
if it has stuff to do.  But running it repeatedly should not
cause it to complain about unseen/stale/mismatched messages
(likely the first run will).

So it's not /really/ fast, but compared to ~35 hours w/o --fast,
it's alright.   Either way, --reindex should be safe
with parallel -index and -extindex being run by cronjobs.
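
A sketch of how such a check run might be wrapped so that stderr is kept for
later inspection.  The flags are the ones mentioned above; the function name,
the DRY_RUN variable, and the log path are assumptions for illustration.

```shell
# fsck_extindex [LOGFILE]
# Runs the fast reindex described above and appends stderr to LOGFILE
# so complaints about unseen/stale/mismatched messages are not lost.
fsck_extindex() {
    log=${1:-extindex-reindex.log}   # hypothetical default log path
    set -- public-inbox-extindex --reindex --all --fast \
        --batch-size=20m --no-fsync
    if [ -n "${DRY_RUN:-}" ]; then
        # Hypothetical dry-run knob: show the command, do nothing.
        echo "$*"
        return 0
    fi
    "$@" 2>>"$log"
}
```

Running it twice and comparing the two logs would show whether the second
pass still complains.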

* Re: Troubleshooting threads missing from /all/
  2021-10-16  9:43                             ` Eric Wong
@ 2021-10-17 18:04                               ` Konstantin Ryabitsev
  2021-10-17 23:12                                 ` Eric Wong
  2021-10-18 14:03                               ` Konstantin Ryabitsev
  1 sibling, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-17 18:04 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sat, Oct 16, 2021 at 09:43:24AM +0000, Eric Wong wrote:
> Eric Wong <e@80x24.org> wrote:
> > Yes.  Though given the current situation with missing messages
> > from /all/, I'd wait until a reindex recovers the missing
> > messages (and probably a fast fsck checker).
> 
> I think "public-inbox-extindex --reindex --all --fast" is
> reasonably ready as an fsck checker.  I've been running it a
> bunch in recent days/weeks and also found+fixed some other bugs
> along the way.

Thanks, Eric! I've been out this week for some family time (it was
Thanksgiving in Canada), which is why I was staying conspicuously silent. :)
I'll give --reindex --fast a whirl in the next few days.

> With --fast, --reindex takes around 20 minutes for me with
> "--batch-size=20m --no-fsync".  The first run may take longer
> if it has stuff to do.  But running it repeatedly should not
> cause it to complain about unseen/stale/mismatched messages
> (likely the first run will).
> 
> So it's not /really/ fast, but compared to ~35 hours w/o --fast,
> it's alright.   Either way, --reindex should be safe
> with parallel -index and -extindex being run by cronjobs.

Absolutely, thanks for working on this.

-K

* Re: Troubleshooting threads missing from /all/
  2021-10-17 18:04                               ` Konstantin Ryabitsev
@ 2021-10-17 23:12                                 ` Eric Wong
  2021-10-18  5:25                                   ` Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-17 23:12 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Sat, Oct 16, 2021 at 09:43:24AM +0000, Eric Wong wrote:
> > Eric Wong <e@80x24.org> wrote:
> > > Yes.  Though given the current situation with missing messages
> > > from /all/, I'd wait until a reindex recovers the missing
> > > messages (and probably a fast fsck checker).
> > 
> > I think "public-inbox-extindex --reindex --all --fast" is
> > reasonably ready as an fsck checker.  I've been running it a
> > bunch in recent days/weeks and also found+fixed some other bugs
> > along the way.

Btw, I'm chasing a separate bug in v2 which causes recycled
Message-IDs to go missing sometimes from a v2 over.sqlite3;
which then causes -extindex to lose a message...

For example, patches v7 and v8 of
  "btrfs: consolidate device_list_mutex in prepare_sprout to its parent"
reused the same Message-ID, but hitting the v2 inbox directly
https://lore.kernel.org/linux-btrfs/6585e7d938e6600189c1bc7b61a7c76badef18dd.1633003671.git.anand.jain@oracle.com/
doesn't show it, anymore.  --reindex on the v2 inbox seems to
work, but not always...

> Thanks, Eric! I've been out this week for some family time (it was
> Thanksgiving in Canada), which is why I was staying conspicuously silent. :)
> I'll give --reindex --fast a whirl in the next few days.

No problem, but there are more bugs to fix :/


* Re: Troubleshooting threads missing from /all/
  2021-10-17 23:12                                 ` Eric Wong
@ 2021-10-18  5:25                                   ` Eric Wong
  2021-10-18 14:04                                     ` Konstantin Ryabitsev
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-18  5:25 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > On Sat, Oct 16, 2021 at 09:43:24AM +0000, Eric Wong wrote:
> > > Eric Wong <e@80x24.org> wrote:
> > > > Yes.  Though given the current situation with missing messages
> > > > from /all/, I'd wait until a reindex recovers the missing
> > > > messages (and probably a fast fsck checker).
> > > 
> > > I think "public-inbox-extindex --reindex --all --fast" is
> > > reasonably ready as an fsck checker.  I've been running it a
> > > bunch in recent days/weeks and also found+fixed some other bugs
> > > along the way.
> 
> Btw, I'm chasing a separate bug in v2 which causes recycled
> Message-IDs to go missing sometimes from a v2 over.sqlite3;
> which then causes -extindex to lose a message...

I just pushed out commit 325fbe26c3e7731e
(v2: mirrors don't clobber msgs w/ reused Message-IDs, 2021-10-18)

Now I'm reindexing all my v2 inboxes before running
"-extindex --all --reindex --fast".  Fortunately, v2 inboxes
are all "-L basic" so they're not too expensive to reindex.

Really hoping this is the last bug related to indexing for a
while...


* Re: Troubleshooting threads missing from /all/
  2021-10-16  9:43                             ` Eric Wong
  2021-10-17 18:04                               ` Konstantin Ryabitsev
@ 2021-10-18 14:03                               ` Konstantin Ryabitsev
  2021-10-24  0:03                                 ` Eric Wong
  1 sibling, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-18 14:03 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sat, Oct 16, 2021 at 09:43:24AM +0000, Eric Wong wrote:
> With --fast, --reindex takes around 20 minutes for me with
> "--batch-size=20m --no-fsync".  The first run may take longer
> if it has stuff to do.  But running it repeatedly should not
> cause it to complain about unseen/stale/mismatched messages
> (likely the first run will).

I just ran it across the 3 lore nodes and the results have been fairly
consistent:

- 16m for the initial run where it finds a few hundred things to fix
- 14m for the subsequent runs

I used the default flags for --reindex --all --fast, so it can perhaps be sped
up with larger memory use, but this is good enough for daily runs already.
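
For anyone wanting to script a similar timed daily run, a minimal Python
sketch follows.  The command name and the --reindex, --all, --fast,
--batch-size and --no-fsync flags are the ones quoted in this thread;
the wrapper functions (extindex_fsck_argv, run_timed) are hypothetical
helpers, and the sketch only prints the command unless run_timed() is
called explicitly.

```python
import shlex
import subprocess
import time


def extindex_fsck_argv(batch_size=None, no_fsync=False):
    """Build the argv for the fsck-style run discussed in this thread."""
    argv = ["public-inbox-extindex", "--reindex", "--all", "--fast"]
    if batch_size:
        # e.g. "20m", as in the quoted "--batch-size=20m"
        argv.append(f"--batch-size={batch_size}")
    if no_fsync:
        argv.append("--no-fsync")
    return argv


def run_timed(argv):
    """Run argv and return (exit status, elapsed seconds)."""
    start = time.monotonic()
    proc = subprocess.run(argv, check=False)
    return proc.returncode, time.monotonic() - start


if __name__ == "__main__":
    # Default flags, as used on the lore nodes above.
    print(shlex.join(extindex_fsck_argv()))
    # Larger batch size and no fsync, as in the quoted message.
    print(shlex.join(extindex_fsck_argv(batch_size="20m", no_fsync=True)))
    # To actually run and time it (requires public-inbox to be installed):
    #   status, seconds = run_timed(extindex_fsck_argv())
```

Keeping the argv construction separate from execution makes it easy to
log the exact command from a cronjob before running it.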

-K


* Re: Troubleshooting threads missing from /all/
  2021-10-18  5:25                                   ` Eric Wong
@ 2021-10-18 14:04                                     ` Konstantin Ryabitsev
  0 siblings, 0 replies; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-18 14:04 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Mon, Oct 18, 2021 at 05:25:26AM +0000, Eric Wong wrote:
> > Btw, I'm chasing a separate bug in v2 which causes recycled
> > Message-IDs to go missing sometimes from a v2 over.sqlite3;
> > which then causes -extindex to lose a message...
> 
> I just pushed out commit 325fbe26c3e7731e
> (v2: mirrors don't clobber msgs w/ reused Message-IDs, 2021-10-18)
> 
> Now I'm reindexing all my v2 inboxes before running
> "-extindex --all --reindex --fast".  Fortunately, v2 inboxes
> are all "-L basic" so they're not too expensive to reindex.

Okay, I guess I should plan the same, then. I'll see if I can pair this with
switching over to "basic" indexing for individual inboxes.

-K


* Re: Troubleshooting threads missing from /all/
  2021-10-18 14:03                               ` Konstantin Ryabitsev
@ 2021-10-24  0:03                                 ` Eric Wong
  2021-10-26 21:11                                   ` Konstantin Ryabitsev
  0 siblings, 1 reply; 33+ messages in thread
From: Eric Wong @ 2021-10-24  0:03 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Sat, Oct 16, 2021 at 09:43:24AM +0000, Eric Wong wrote:
> > With --fast, --reindex takes around 20 minutes for me with
> > "--batch-size=20m --no-fsync".  The first run may take longer
> > if it has stuff to do.  But running it repeatedly should not
> > cause it to complain about unseen/stale/mismatched messages
> > (likely the first run will).
> 
> I just ran it across the 3 lore nodes and the results have been fairly
> consistent:
> 
> - 16m for the initial run where it finds a few hundred things to fix
> - 14m for the subsequent runs
> 
> I used the default flags for --reindex --all --fast, so it can perhaps be sped
> up with larger memory use, but this is good enough for daily runs already.

Cool.  Just wondering if all is well on your end with daily runs.

--reindex --all --fast hasn't found any work to do the past week
or so on https://yhbt.net/lore/

I'm thinking about cutting a new release soonish and just
putting giant warnings around the not-yet-ready parts of
lei for now...


* Re: Troubleshooting threads missing from /all/
  2021-10-24  0:03                                 ` Eric Wong
@ 2021-10-26 21:11                                   ` Konstantin Ryabitsev
  2021-10-29 10:18                                     ` release updates [was: Troubleshooting threads missing from /all/] Eric Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Konstantin Ryabitsev @ 2021-10-26 21:11 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sun, Oct 24, 2021 at 12:03:17AM +0000, Eric Wong wrote:
> > I used the default flags for --reindex --all --fast, so it can perhaps be sped
> > up with larger memory use, but this is good enough for daily runs already.
> 
> Cool.  Just wondering if all is well on your end with daily runs.

Yes, so far nothing else has come up and the runs are just quietly succeeding.

>
> --reindex --all --fast hasn't found any work to do the past week
> or so on https://yhbt.net/lore/
> 
> I'm thinking about cutting a new release soonish and just
> putting giant warnings around the not-yet-ready parts of
> lei for now...

I think this may be a good plan, considering that lei is now getting more and
more attention from kernel devs and it would be convenient to be able to
provide a packaged version of it to install on popular distros.

Thanks,
-K


* release updates [was: Troubleshooting threads missing from /all/]
  2021-10-26 21:11                                   ` Konstantin Ryabitsev
@ 2021-10-29 10:18                                     ` Eric Wong
  0 siblings, 0 replies; 33+ messages in thread
From: Eric Wong @ 2021-10-29 10:18 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Sun, Oct 24, 2021 at 12:03:17AM +0000, Eric Wong wrote:
> > I'm thinking about cutting a new release soonish and just
> > putting giant warnings around the not-yet-ready parts of
> > lei for now...
> 
> I think this may be a good plan, considering that lei is now getting more and
> more attention from kernel devs and it would be convenient to be able to
> provide a packaged version of it to install on popular distros.

I keep finding new bugs every time I start documenting something :<
Some SIGPIPE fixes are being worked on, and one of the bugs is a
worrying segfault...


end of thread (last message: 2021-10-29 10:18 UTC)

Thread overview: 33+ messages
2021-10-01 13:05 Troubleshooting threads missing from /all/ Konstantin Ryabitsev
2021-10-01 19:58 ` Eric Wong
2021-10-01 20:08   ` Konstantin Ryabitsev
2021-10-01 20:41     ` Eric Wong
2021-10-01 20:49       ` Konstantin Ryabitsev
2021-10-01 20:54         ` Eric Wong
2021-10-01 20:58           ` Konstantin Ryabitsev
2021-10-01 22:25             ` Eric Wong
2021-10-01 22:27               ` Eric Wong
2021-10-01 23:11               ` Konstantin Ryabitsev
2021-10-01 23:46                 ` Eric Wong
2021-10-02  0:02                   ` Eric Wong
2021-10-05  4:39                   ` Eric Wong
2021-10-05 18:03                     ` Konstantin Ryabitsev
2021-10-06 10:18                       ` Eric Wong
2021-10-07  8:36                         ` Eric Wong
2021-10-07 13:57                           ` Konstantin Ryabitsev
2021-10-07 21:36                             ` Eric Wong
2021-10-07 21:33                         ` Eric Wong
2021-10-07 21:33                       ` Eric Wong
2021-10-08 17:33                         ` Konstantin Ryabitsev
2021-10-08 21:34                           ` Eric Wong
2021-10-16  9:43                             ` Eric Wong
2021-10-17 18:04                               ` Konstantin Ryabitsev
2021-10-17 23:12                                 ` Eric Wong
2021-10-18  5:25                                   ` Eric Wong
2021-10-18 14:04                                     ` Konstantin Ryabitsev
2021-10-18 14:03                               ` Konstantin Ryabitsev
2021-10-24  0:03                                 ` Eric Wong
2021-10-26 21:11                                   ` Konstantin Ryabitsev
2021-10-29 10:18                                     ` release updates [was: Troubleshooting threads missing from /all/] Eric Wong
2021-10-01 20:17 ` Troubleshooting threads missing from /all/ Konstantin Ryabitsev
2021-10-02 11:39   ` Eric Wong
