* public-inbox-convert hangs on systems using musl libc
@ 2022-12-21 11:28 Chris Brannon
2022-12-21 12:21 ` Eric Wong
0 siblings, 1 reply; 13+ messages in thread
From: Chris Brannon @ 2022-12-21 11:28 UTC (permalink / raw)
To: meta
I can reproduce it reliably on both Alpine Linux and the musl variant of
Void Linux. Just clone a mirror of public-inbox.org/meta and try and
convert it:
$ git clone --mirror https://public-inbox.org/meta
$ public-inbox-convert meta metanew
If I pass --no-index, the process is successful.
This is a deadlock of some sort, because during the hang, git cat-file
is blocked on write and other things are blocked on read.
That's as far as I have gotten with debugging it.
-- Chris
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc
2022-12-21 11:28 public-inbox-convert hangs on systems using musl libc Chris Brannon
@ 2022-12-21 12:21 ` Eric Wong
2022-12-21 13:46 ` Chris Brannon
0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2022-12-21 12:21 UTC (permalink / raw)
To: Chris Brannon; +Cc: meta
Chris Brannon <chris@the-brannons.com> wrote:
> I can reproduce it reliably on both Alpine Linux and the musl variant of
> Void Linux. Just clone a mirror of public-inbox.org/meta and try and
> convert it:
>
> $ git clone --mirror https://public-inbox.org/meta
> $ public-inbox-convert meta metanew
>
> If I pass --no-index, the process is successful.
>
> This is a deadlock of some sort, because during the hang, git cat-file
> is blocked on write and other things are blocked on read.
I've requested some packages w/ cfarm admins to test with.
Do you know which pipes are which? "lsof -p $PID +E" can help
with connectivity checking, as can script/dtas-graph in
https://80x24.org/dtas.git if you have Graph::Easy
What's curious is all the code paths should be independent of
stdio buffering, so I wouldn't think libc differences would
matter...
I also just did the above meta conversion on FreeBSD w/o problems.
> That's as far as I have gotten with debugging it.
Some shots in the dark:
1. force read pipe on our end to be non-blocking
diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 882a9a4a..b239ceb9 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -143,6 +143,7 @@ sub _bidi_pipe {
fcntl($out_w, 1031, 4096);
fcntl($in_r, 1031, 4096) if $batch eq '--batch-check';
}
+ $in_r->blocking(0);
$self->{$out} = $out_w;
$self->{$in} = $in_r;
}
2. Tweak $PIPE_BUFSIZ and/or MAX_INFLIGHT to smaller values. e.g.
diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 882a9a4a..ec40edd7 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -23,13 +23,12 @@ use Carp qw(croak carp);
use Digest::SHA ();
use PublicInbox::DS qw(dwaitpid);
our @EXPORT_OK = qw(git_unquote git_quote);
-our $PIPE_BUFSIZ = 65536; # Linux default
+our $PIPE_BUFSIZ = 4096; # Linux default
our $in_cleanup;
our $RDTIMEO = 60_000; # milliseconds
our $async_warn; # true in read-only daemons
-use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) /
- 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1
+use constant MAX_INFLIGHT => 4;
my %GIT_ESC = (
a => "\a",
MAX_INFLIGHT could go down to 1 outside of t/git.t, I think...
But my sleep deprived mind isn't finding anything that jumps out...
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc
2022-12-21 12:21 ` Eric Wong
@ 2022-12-21 13:46 ` Chris Brannon
2022-12-21 19:48 ` Eric Wong
0 siblings, 1 reply; 13+ messages in thread
From: Chris Brannon @ 2022-12-21 13:46 UTC (permalink / raw)
To: Eric Wong; +Cc: meta
[-- Attachment #1: Type: text/plain, Size: 1650 bytes --]
Eric Wong <e@80x24.org> writes:
> Do you know which pipes are which? "lsof -p $PID +E" can help
> with connectivity checking, as can script/dtas-graph in
> https://80x24.org/dtas.git if you have Graph::Easy
Yes. I'm attaching my lsof output and a typescript.
The processes of interest here are 4849 public-inbox-convert and 4879
git cat-file.
PID 4849's FD 11 is the write end of a pipe, with 4879's stdin as the
read end.
PID 4849's FD 12 is the read end of a pipe, with 4879's stdout as the
write end. At the point of the hang, 4849 is trying to write a SHA1 to
FD 11, while 4879 is writing an email message to its stdout.
> Some shots in the dark:
>
> 2. Tweak $PIPE_BUFSIZ and/or MAX_INFLIGHT to smaller values. e.g.
>
> diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
> index 882a9a4a..ec40edd7 100644
> --- a/lib/PublicInbox/Git.pm
> +++ b/lib/PublicInbox/Git.pm
> @@ -23,13 +23,12 @@ use Carp qw(croak carp);
> use Digest::SHA ();
> use PublicInbox::DS qw(dwaitpid);
> our @EXPORT_OK = qw(git_unquote git_quote);
> -our $PIPE_BUFSIZ = 65536; # Linux default
> +our $PIPE_BUFSIZ = 4096; # Linux default
> our $in_cleanup;
> our $RDTIMEO = 60_000; # milliseconds
> our $async_warn; # true in read-only daemons
>
> -use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) /
> - 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1
> +use constant MAX_INFLIGHT => 4;
>
> my %GIT_ESC = (
> a => "\a",
>
This right here seems to have fixed it, when testing locally.
PS. Thank you for that lsof command. I've never used lsof in that way;
I'll have to add that to my *nix debugging toolbelt.
-- Chris
[-- Attachment #2: lsofout.txt --]
[-- Type: text/plain, Size: 4867 bytes --]
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
screen 4028 chris 11u CHR 5,2 0t0 86 /dev/ptmx ->/dev/pts/7 2214,mksh,0u 2214,mksh,1u 2214,mksh,2u 4849,public-in,0u 4849,public-in,1u 4849,public-in,2u 4874,shard[0],0u 4874,shard[0],1u 4874,shard[0],2u 4879,git,2u
public-in 4849 chris cwd DIR 0,42 412 17211 /files/cmb-files/repos
public-in 4849 chris rtd DIR 0,26 28 34 /
public-in 4849 chris txt REG 0,26 14144 239824 /usr/bin/perl
public-in 4849 chris mem REG 0,26 2969912 227085 /usr/lib/libstdc++.so.6.0.30
public-in 4849 chris mem REG 0,26 100328 564900 /usr/lib/libz.so.1.2.13
public-in 4849 chris mem REG 0,26 30760 399826 /usr/lib/libuuid.so.1.3.0
public-in 4849 chris mem REG 0,26 121208 226756 /usr/lib/libgcc_s.so.1
public-in 4849 chris mem REG 0,26 2234600 563318 /usr/lib/libxapian.so.30.12.2
public-in 4849 chris mem REG 0,26 500184 219769 /usr/lib/perl5/vendor_perl/auto/Search/Xapian/Xapian.so
public-in 4849 chris mem REG 0,26 132928 239703 /usr/lib/perl5/core_perl/auto/Compress/Raw/Zlib/Zlib.so
public-in 4849 chris mem REG 0,26 1351488 347561 /usr/lib/libsqlite3.so.0.8.6
public-in 4849 chris mem REG 0,26 180112 239928 /usr/lib/perl5/vendor_perl/auto/DBD/SQLite/SQLite.so
public-in 4849 chris mem REG 0,26 138264 240017 /usr/lib/perl5/vendor_perl/auto/DBI/DBI.so
public-in 4849 chris mem REG 0,26 2310 4758 /usr/share/zoneinfo/PST8PDT
public-in 4849 chris mem REG 0,26 18168 239686 /usr/lib/perl5/core_perl/auto/MIME/Base64/Base64.so
public-in 4849 chris mem REG 0,26 657216 239665 /usr/lib/perl5/core_perl/auto/re/re.so
public-in 4849 chris mem REG 0,26 116640 239674 /usr/lib/perl5/core_perl/auto/Storable/Storable.so
public-in 4849 chris mem REG 0,26 46920 239702 /usr/lib/perl5/core_perl/auto/Encode/Encode.so
public-in 4849 chris mem REG 0,26 39608 218906 /usr/lib/perl5/vendor_perl/auto/Email/Address/XS/XS.so
public-in 4849 chris mem REG 0,26 46920 239675 /usr/lib/perl5/core_perl/auto/Socket/Socket.so
public-in 4849 chris mem REG 0,26 59128 239687 /usr/lib/perl5/core_perl/auto/List/Util/Util.so
public-in 4849 chris mem REG 0,26 46840 238715 /usr/lib/perl5/core_perl/auto/Digest/SHA/SHA.so
public-in 4849 chris mem REG 0,26 30456 239671 /usr/lib/perl5/core_perl/auto/Time/HiRes/HiRes.so
public-in 4849 chris mem REG 0,26 18168 239744 /usr/lib/perl5/core_perl/auto/Cwd/Cwd.so
public-in 4849 chris mem REG 0,26 30536 239695 /usr/lib/perl5/core_perl/auto/File/Glob/Glob.so
public-in 4849 chris mem REG 0,26 22264 239689 /usr/lib/perl5/core_perl/auto/IO/IO.so
public-in 4849 chris mem REG 0,26 108360 239682 /usr/lib/perl5/core_perl/auto/POSIX/POSIX.so
public-in 4849 chris mem REG 0,26 22344 239697 /usr/lib/perl5/core_perl/auto/Fcntl/Fcntl.so
public-in 4849 chris mem REG 0,26 3789544 239902 /usr/lib/libperl.so.5.36.0
public-in 4849 chris mem REG 0,26 4251720 228062 /usr/lib/libc.so
public-in 4849 chris 0u CHR 136,7 0t0 10 /dev/pts/7 4028,screen,11u
public-in 4849 chris 1u CHR 136,7 0t0 10 /dev/pts/7 4028,screen,11u
public-in 4849 chris 2u CHR 136,7 0t0 10 /dev/pts/7 4028,screen,11u
public-in 4849 chris 3w CHR 1,3 0t0 4 /dev/null
public-in 4849 chris 4uW REG 0,42 0 2074958 /files/cmb-files/repos/metanew/inbox.lock
public-in 4849 chris 5ur REG 0,42 24576 2074742 /files/cmb-files/repos/metanew/msgmap.sqlite3
public-in 4849 chris 6w FIFO 0,12 0t0 26905684 pipe 4874,shard[0],5r
public-in 4849 chris 7r FIFO 0,12 0t0 26905685 pipe 4874,shard[0],8w
public-in 4849 chris 8ur REG 0,42 69632 2074960 /files/cmb-files/repos/metanew/xap15/over.sqlite3
public-in 4849 chris 9u REG 0,49 461301 410306 /tmp/PerlIO_LaapMm (deleted)
public-in 4849 chris 10u REG 0,42 24576 2074986 /files/cmb-files/repos/metanew/mm_tmp-4849-i60n
public-in 4849 chris 11w FIFO 0,12 0t0 26905179 pipe 4879,git,0r
public-in 4849 chris 12r FIFO 0,12 0t0 26905180 pipe 4879,git,1w
public-in 4849 chris 13u REG 0,42 16928 2074743 /files/cmb-files/repos/metanew/msgmap.sqlite3-journal
public-in 4849 chris 14u REG 0,42 66176 2074961 /files/cmb-files/repos/metanew/xap15/over.sqlite3-journal
shard[0] 4874 chris 5r FIFO 0,12 0t0 26905684 pipe 4849,public-in,6w
shard[0] 4874 chris 8w FIFO 0,12 0t0 26905685 pipe 4849,public-in,7r
git 4879 chris 0r FIFO 0,12 0t0 26905179 pipe 4849,public-in,11w
git 4879 chris 1w FIFO 0,12 0t0 26905180 pipe 4849,public-in,12r
[-- Attachment #3: typescript --]
[-- Type: text/plain, Size: 610 bytes --]
Script started on 2022-12-21 05:12:15-08:00 [TERM="screen.linux" TTY="/dev/pts/10" COLUMNS="80" LINES="30"]
chris@beast:/home/chris $ doas strace -p 4849
\rdoas (chris@beast) password:
strace: Process 4849 attached
write(11, "4187293ba385ec204df84742b7197f4d"..., 41^Cstrace: Process 4849 detached
<detached ...>
Interrupt
chris@beast:/home/chris $ doas strace -p 4879
strace: Process 4879 attached
write(1, "Return-Path: <e@80x24.org>\nX-Spa"..., 3793^Cstrace: Process 4879 detached
<detached ...>
Interrupt
chris@beast:/home/chris $ exit
Script done on 2022-12-21 05:12:43-08:00 [COMMAND_EXIT_CODE="130"]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc
2022-12-21 13:46 ` Chris Brannon
@ 2022-12-21 19:48 ` Eric Wong
2022-12-21 20:46 ` Chris Brannon
0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2022-12-21 19:48 UTC (permalink / raw)
To: Chris Brannon; +Cc: meta
Chris Brannon <chris@the-brannons.com> wrote:
> Eric Wong <e@80x24.org> writes:
>
> > Do you know which pipes are which? "lsof -p $PID +E" can help
> > with connectivity checking, as can script/dtas-graph in
> > https://80x24.org/dtas.git if you have Graph::Easy
>
> Yes. I'm attaching my lsof output and a typescript.
>
> The processes of interest here are 4849 public-inbox-convert and 4879
> git cat-file.
> PID 4849's FD 11 is the write end of a pipe, with 4879's stdin as the
> read end.
> PID 4849's FD 12 is the read end of a pipe, with 4879's stdout as the
> write end. At the point of the hang, 4849 is trying to write a SHA1 to
> FD 11, while 4879 is writing an email message to its stdout.
OK. That is strange, because the current values are sized
conservatively for Linux (which has larger-than-required
PIPE_BUF size).
> > Some shots in the dark:
> >
> > 2. Tweak $PIPE_BUFSIZ and/or MAX_INFLIGHT to smaller values. e.g.
> >
> > diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
> > index 882a9a4a..ec40edd7 100644
> > --- a/lib/PublicInbox/Git.pm
> > +++ b/lib/PublicInbox/Git.pm
> > @@ -23,13 +23,12 @@ use Carp qw(croak carp);
> > use Digest::SHA ();
> > use PublicInbox::DS qw(dwaitpid);
> > our @EXPORT_OK = qw(git_unquote git_quote);
> > -our $PIPE_BUFSIZ = 65536; # Linux default
> > +our $PIPE_BUFSIZ = 4096; # Linux default
> > our $in_cleanup;
> > our $RDTIMEO = 60_000; # milliseconds
> > our $async_warn; # true in read-only daemons
> >
> > -use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) /
> > - 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1
> > +use constant MAX_INFLIGHT => 4;
>
> This right here seems to have fixed it, when testing locally.
Are you able to isolate whether $PIPE_BUFSIZ or MAX_INFLIGHT on
it's own fixes it?
And can you confirm the ->blocking(0) change had no effect on
it's own?
Capping MAX_INFLIGHT to a smaller value is probably fine (maybe
10 can work).
The old MAX_INFLIGHT value ((512 * 3)/65 = 23) is actually
extremely conservative for Linux, since the smallest possible
PIPE_BUF is 4096 (so (4096 * 3)/65 = 189), but giant values
don't help (and hurt interruptibility).
However, making $PIPE_BUFSIZ smaller would increase syscalls
made and hurt performance on systems with expensive syscalls.
So I want to keep $PIPE_BUFSIZ as big as reasonable. Setting
`$PIPE_BUFSIZ = 1024 ** 2' ought to work on a system with giant
pipes, even; but the default Linux pipe size is 65536 unless
it's low on memory.
I'm honestly at a loss on how to explain what went wrong for you
because the existing values are OS-independent.
I also do wonder if you've hit a kernel bug under low memory
conditions. I've certainly encountered problems with TCP
traffic hanging processes due to memory compaction.
> PS. Thank you for that lsof command. I've never used lsof in that way;
> I'll have to add that to my *nix debugging toolbelt.
np :>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc
2022-12-21 19:48 ` Eric Wong
@ 2022-12-21 20:46 ` Chris Brannon
2022-12-21 21:11 ` Eric Wong
0 siblings, 1 reply; 13+ messages in thread
From: Chris Brannon @ 2022-12-21 20:46 UTC (permalink / raw)
To: Eric Wong; +Cc: meta
Eric Wong <e@80x24.org> writes:
> Are you able to isolate whether $PIPE_BUFSIZ or MAX_INFLIGHT on
> it's own fixes it?
MAX_INFLIGHT is sufficient. In fact, $PIPE_BUFSIZ had no effect at all
when I tried just now.
> And can you confirm the ->blocking(0) change had no effect on
> it's own?
That was the first thing I tried, and it had no effect.
> Capping MAX_INFLIGHT to a smaller value is probably fine (maybe
> 10 can work).
I tried with 10 just now, and that was fine also.
> I also do wonder if you've hit a kernel bug under low memory
> conditions. I've certainly encountered problems with TCP
> traffic hanging processes due to memory compaction.
I doubt it. I looked during a deadlock, and I had 2.66 GiB available.
-- Chris
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc
2022-12-21 20:46 ` Chris Brannon
@ 2022-12-21 21:11 ` Eric Wong
2022-12-21 22:17 ` Chris Brannon
0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2022-12-21 21:11 UTC (permalink / raw)
To: Chris Brannon; +Cc: meta
Chris Brannon <chris@the-brannons.com> wrote:
> Eric Wong <e@80x24.org> writes:
>
> > Are you able to isolate whether $PIPE_BUFSIZ or MAX_INFLIGHT on
> > it's own fixes it?
>
> MAX_INFLIGHT is sufficient. In fact, $PIPE_BUFSIZ had no effect at all
> when I tried just now.
OK, good to know. Thanks so far for your other testing and
reports.
> > Capping MAX_INFLIGHT to a smaller value is probably fine (maybe
> > 10 can work).
>
> I tried with 10 just now, and that was fine also.
Oops, so POSIX::PIPE_BUF is actually 4096 on Linux.
(`perl -MPOSIX -E 'say POSIX::PIPE_BUF'` shows 4096 on the
Alpine cfarm machine)
But I'm still confused as to why the original code didn't
work on Linux since the pipe can't be smaller than 4096...
The following changes MAX_INFLIGHT to 23 for all platforms (and
explains things better). I also wonder if you can find a
threshold for when things start failing (e.g. hardcoding 24
should fail on FreeBSD)
diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 882a9a4a..a1af776b 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -28,8 +28,10 @@ our $in_cleanup;
our $RDTIMEO = 60_000; # milliseconds
our $async_warn; # true in read-only daemons
-use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) /
- 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1
+# 512: POSIX PIPE_BUF minimum (see pipe(7))
+# 3: @$inflight is flattened [ $OID, $cb, $arg ]
+# 65: SHA-256 hex size + "\n" in preparation for git using non-SHA1
+use constant MAX_INFLIGHT => 512 * 3 / 65;
my %GIT_ESC = (
a => "\a",
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc
2022-12-21 21:11 ` Eric Wong
@ 2022-12-21 22:17 ` Chris Brannon
2022-12-21 23:22 ` [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum Eric Wong
0 siblings, 1 reply; 13+ messages in thread
From: Chris Brannon @ 2022-12-21 22:17 UTC (permalink / raw)
To: Eric Wong; +Cc: meta
Eric Wong <e@80x24.org> writes:
> The following changes MAX_INFLIGHT to 23 for all platforms (and
> explains things better). I also wonder if you can find a
> threshold for when things start failing (e.g. hardcoding 24
> should fail on FreeBSD)
I did a binary search. On my Linux system with musl, deadlocks start at
MAX_INFLIGHT >= 115.
-- Chris
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum
2022-12-21 22:17 ` Chris Brannon
@ 2022-12-21 23:22 ` Eric Wong
2022-12-21 23:57 ` Chris Brannon
0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2022-12-21 23:22 UTC (permalink / raw)
To: Chris Brannon; +Cc: meta
Chris Brannon <chris@the-brannons.com> wrote:
> Eric Wong <e@80x24.org> writes:
>
> > The following changes MAX_INFLIGHT to 23 for all platforms (and
> > explains things better). I also wonder if you can find a
> > threshold for when things start failing (e.g. hardcoding 24
> > should fail on FreeBSD)
>
> I did a binary search. On my Linux system with musl, deadlocks start at
> MAX_INFLIGHT >= 115.
Thanks. I'll push the following out and intend to keep it at <= 23
going forward. However, it still worries me greatly that I have
no explanation to why MAX_INFLIGHT >= 115 fails. This stuff is
my bread and butter, really...
I'll wait on a cfarm admins to test.
Also, can I assume just running 'public-inbox-index' on any
freshly-cloned inbox also fails for you independently of
-convert?
------------8<----------
Subject: [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum
This ensures we get consistent pipelining behavior across
platforms. Furthermore, a smaller value is probably more
reasonable since "git cat-file" can usually outpace indexing and
lower values allow us to react to user interaction (e.g. Ctrl-C)
more quickly.
The previous value based on Linux PIPE_BUF (4096) allowed a
value of 189 which worked fine on non-musl Linux systems, but
failed on musl-based Void and Alpine Linux. Mysteriously, this
works on musl up to a value of 114 and starts locking up at 115.
The reason for this failure is currently unexplained and will
hopefully be discovered soon.
Regardless, capping the value to 23 based on the universal
PIPE_BUF minimum (512) seems reasonable, anyways.
Reported-by: Chris Brannon <chris@the-brannons.com>
Tested-by: Chris Brannon <chris@the-brannons.com>
Link: https://public-inbox.org/meta/87edssl7u0.fsf@the-brannons.com/T/
---
lib/PublicInbox/Git.pm | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 882a9a4a..a1af776b 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -28,8 +28,10 @@ our $in_cleanup;
our $RDTIMEO = 60_000; # milliseconds
our $async_warn; # true in read-only daemons
-use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) /
- 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1
+# 512: POSIX PIPE_BUF minimum (see pipe(7))
+# 3: @$inflight is flattened [ $OID, $cb, $arg ]
+# 65: SHA-256 hex size + "\n" in preparation for git using non-SHA1
+use constant MAX_INFLIGHT => 512 * 3 / 65;
my %GIT_ESC = (
a => "\a",
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum
2022-12-21 23:22 ` [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum Eric Wong
@ 2022-12-21 23:57 ` Chris Brannon
2023-01-04 3:49 ` [PATCH] git: fix asynchronous batching for deep pipelines Eric Wong
0 siblings, 1 reply; 13+ messages in thread
From: Chris Brannon @ 2022-12-21 23:57 UTC (permalink / raw)
To: Eric Wong; +Cc: meta
Eric Wong <e@80x24.org> writes:
> Also, can I assume just running 'public-inbox-index' on any
> freshly-cloned inbox also fails for you independently of
> -convert?
Yes it does. I double-checked to be sure.
Thanks for the fix! If you come up with something more permanent that
you want me to test, please ask.
-- Chris
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] git: fix asynchronous batching for deep pipelines
2022-12-21 23:57 ` Chris Brannon
@ 2023-01-04 3:49 ` Eric Wong
2023-01-05 1:08 ` Chris Brannon
0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2023-01-04 3:49 UTC (permalink / raw)
To: Chris Brannon; +Cc: meta
Chris Brannon <chris@the-brannons.com> wrote:
> Eric Wong <e@80x24.org> writes:
>
> > Also, can I assume just running 'public-inbox-index' on any
> > freshly-cloned inbox also fails for you independently of
> > -convert?
>
> Yes it does. I double-checked to be sure.
>
> Thanks for the fix! If you come up with something more permanent that
> you want me to test, please ask.
--------8<-------
Subject: [PATCH] git: fix asynchronous batching for deep pipelines
...By using non-blocking pipe writes. This avoids problems for
musl (and other libc) where getdelim(3) used by `git cat-file --batch*'
uses a smaller input buffer than glibc or FreeBSD libc.
My key mistake was our check against MAX_INFLIGHT is only useful
for the initial batch of requests. It is not useful for
subsequent requests since git will drain the pipe at
unpredictable rates due to libc differences.
To fix this problem, I initially tried to drain the read pipe
as long as readable data was pending. However, reading git
output without giving git more work would also limit parallelism
opportunities since we don't want git to sit idle, either. This
change ensures we keep both pipes reasonably full to reduce
stalls and maximize parallelism between git and public-inbox.
While the limit set a few weeks ago in commit
56e6e587745c (git: cap MAX_INFLIGHT value to POSIX minimum, 2022-12-21)
remains in place, any higher or lower limit will work. It may
be worth it to use an even lower limit to improve interactivity
w.r.t. Ctrl-C interrupts.
I've tested the pre-56e6e587745c and even higher values on an
Alpine VM in the GCC Farm <https://cfarm.tetaneutral.net>
Reported-by: Chris Brannon <chris@the-brannons.com>
Link: https://public-inbox.org/meta/87edssl7u0.fsf@the-brannons.com/T/
---
lib/PublicInbox/Git.pm | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index a1af776b..86b80a4e 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -145,6 +145,7 @@ sub _bidi_pipe {
fcntl($out_w, 1031, 4096);
fcntl($in_r, 1031, 4096) if $batch eq '--batch-check';
}
+ $out_w->blocking(0);
$self->{$out} = $out_w;
$self->{$in} = $in_r;
}
@@ -203,7 +204,9 @@ sub cat_async_retry ($$) {
for (my $i = 0; $i < @$inflight; $i += 3) {
$buf .= "$inflight->[$i]\n";
}
+ $self->{out}->blocking(1); # brand new pipe, should never block
print { $self->{out} } $buf or $self->fail("write error: $!");
+ $self->{out}->blocking(0);
my $req = shift @$inflight;
unshift(@$inflight, \$req); # \$ref to indicate retried
@@ -305,13 +308,27 @@ sub check_async_begin ($) {
$self->{inflight_c} = [];
}
+sub write_all {
+ my ($self, $out, $buf, $read_step, $inflight) = @_;
+ $read_step->($self, $inflight) while @$inflight >= MAX_INFLIGHT;
+ do {
+ my $w = syswrite($out, $buf);
+ if (defined $w) {
+ return if $w == length($buf);
+ warn "chop: $w";
+ substr($buf, 0, $w, ''); # sv_chop
+ } elsif ($! != EAGAIN) {
+ $self->fail("write: $!");
+ } else { warn "E: $!" }
+ $read_step->($self, $inflight);
+ } while (1);
+}
+
sub check_async ($$$$) {
my ($self, $oid, $cb, $arg) = @_;
my $inflight_c = $self->{inflight_c} // check_async_begin($self);
- while (scalar(@$inflight_c) >= MAX_INFLIGHT) {
- check_async_step($self, $inflight_c);
- }
- print { $self->{out_c} } $oid, "\n" or $self->fail("write error: $!");
+ write_all($self, $self->{out_c}, $oid."\n",
+ \&check_async_step, $inflight_c);
push(@$inflight_c, $oid, $cb, $arg);
}
@@ -496,10 +513,7 @@ sub cat_async_begin {
sub cat_async ($$$;$) {
my ($self, $oid, $cb, $arg) = @_;
my $inflight = $self->{inflight} // cat_async_begin($self);
- while (scalar(@$inflight) >= MAX_INFLIGHT) {
- cat_async_step($self, $inflight);
- }
- print { $self->{out} } $oid, "\n" or $self->fail("write error: $!");
+ write_all($self, $self->{out}, $oid."\n", \&cat_async_step, $inflight);
push(@$inflight, $oid, $cb, $arg);
}
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] git: fix asynchronous batching for deep pipelines
2023-01-04 3:49 ` [PATCH] git: fix asynchronous batching for deep pipelines Eric Wong
@ 2023-01-05 1:08 ` Chris Brannon
2023-01-05 1:44 ` [PATCH] git: write_all: remove leftover debug messages Eric Wong
0 siblings, 1 reply; 13+ messages in thread
From: Chris Brannon @ 2023-01-05 1:08 UTC (permalink / raw)
To: Eric Wong; +Cc: meta
Eric Wong <e@80x24.org> writes:
>
> --------8<-------
> Subject: [PATCH] git: fix asynchronous batching for deep pipelines
>
> ...By using non-blocking pipe writes. This avoids problems for
> musl (and other libc) where getdelim(3) used by `git cat-file --batch*'
> uses a smaller input buffer than glibc or FreeBSD libc.
>
> My key mistake was our check against MAX_INFLIGHT is only useful
> for the initial batch of requests. It is not useful for
> subsequent requests since git will drain the pipe at
> unpredictable rates due to libc differences.
>
> To fix this problem, I initially tried to drain the read pipe
> as long as readable data was pending. However, reading git
> output without giving git more work would also limit parallelism
> opportunities since we don't want git to sit idle, either. This
> change ensures we keep both pipes reasonably full to reduce
> stalls and maximize parallelism between git and public-inbox.
>
> While the limit set a few weeks ago in commit
> 56e6e587745c (git: cap MAX_INFLIGHT value to POSIX minimum, 2022-12-21)
> remains in place, any higher or lower limit will work. It may
> be worth it to use an even lower limit to improve interactivity
> w.r.t. Ctrl-C interrupts.
>
> I've tested the pre-56e6e587745c and even higher values on an
> Alpine VM in the GCC Farm <https://cfarm.tetaneutral.net>
>
> Reported-by: Chris Brannon <chris@the-brannons.com>
> Tested-by: Chris Brannon <chris@the-brannons.com>
> Link: https://public-inbox.org/meta/87edssl7u0.fsf@the-brannons.com/T/
*SNIP*
Thanks!
The conversion seems to have been successful, but it logged very many of
the following:
E: Resource temporarily unavailable at
/usr/share/perl5/vendor_perl/PublicInbox/Git.pm line 320.
I'm sorry, I didn't count how many. It seemed like multiple screenfuls
at least.
-- Chris
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] git: write_all: remove leftover debug messages
2023-01-05 1:08 ` Chris Brannon
@ 2023-01-05 1:44 ` Eric Wong
2023-01-05 7:32 ` Chris Brannon
0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2023-01-05 1:44 UTC (permalink / raw)
To: Chris Brannon; +Cc: meta
Chris Brannon <chris@the-brannons.com> wrote:
> The conversion seems to have been successful, but it logged very many of
> the following:
> E: Resource temporarily unavailable at
> /usr/share/perl5/vendor_perl/PublicInbox/Git.pm line 320.
>
> I'm sorry, I didn't count how many. It seemed like multiple screenfuls
> at least.
Oops, yeah. Just leftover debug messages that never triggered
on my FreeBSD and Debian systems. Will push this out:
---------8<----------
Subject: [PATCH] git: write_all: remove leftover debug messages
I used these messages during development to verify Alpine was
triggering the intended codepaths. They're no longer necessary
and just noise at this point.
Reported-by: Chris Brannon <chris@the-brannons.com>
Fixes: d4ba8828ab23 ("git: fix asynchronous batching for deep pipelines")
---
lib/PublicInbox/Git.pm | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 86b80a4e..c7f82ba2 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -315,11 +315,10 @@ sub write_all {
my $w = syswrite($out, $buf);
if (defined $w) {
return if $w == length($buf);
- warn "chop: $w";
substr($buf, 0, $w, ''); # sv_chop
} elsif ($! != EAGAIN) {
$self->fail("write: $!");
- } else { warn "E: $!" }
+ }
$read_step->($self, $inflight);
} while (1);
}
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] git: write_all: remove leftover debug messages
2023-01-05 1:44 ` [PATCH] git: write_all: remove leftover debug messages Eric Wong
@ 2023-01-05 7:32 ` Chris Brannon
0 siblings, 0 replies; 13+ messages in thread
From: Chris Brannon @ 2023-01-05 7:32 UTC (permalink / raw)
To: Eric Wong; +Cc: meta
Eric Wong <e@80x24.org> writes:
> Subject: [PATCH] git: write_all: remove leftover debug messages
>
> I used these messages during development to verify Alpine was
> triggering the intended codepaths. They're no longer necessary
> and just noise at this point.
>
> Reported-by: Chris Brannon <chris@the-brannons.com>
> Tested-by: Chris Brannon <chris@the-brannons.com>
> Fixes: d4ba8828ab23 ("git: fix asynchronous batching for deep pipelines")
> ---
> lib/PublicInbox/Git.pm | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
Thanks!
-- Chris
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-01-05 7:32 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-21 11:28 public-inbox-convert hangs on systems using musl libc Chris Brannon
2022-12-21 12:21 ` Eric Wong
2022-12-21 13:46 ` Chris Brannon
2022-12-21 19:48 ` Eric Wong
2022-12-21 20:46 ` Chris Brannon
2022-12-21 21:11 ` Eric Wong
2022-12-21 22:17 ` Chris Brannon
2022-12-21 23:22 ` [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum Eric Wong
2022-12-21 23:57 ` Chris Brannon
2023-01-04 3:49 ` [PATCH] git: fix asynchronous batching for deep pipelines Eric Wong
2023-01-05 1:08 ` Chris Brannon
2023-01-05 1:44 ` [PATCH] git: write_all: remove leftover debug messages Eric Wong
2023-01-05 7:32 ` Chris Brannon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).