* public-inbox-convert hangs on systems using musl libc @ 2022-12-21 11:28 Chris Brannon 2022-12-21 12:21 ` Eric Wong 0 siblings, 1 reply; 13+ messages in thread From: Chris Brannon @ 2022-12-21 11:28 UTC (permalink / raw) To: meta I can reproduce it reliably on both Alpine Linux and the musl variant of Void Linux. Just clone a mirror of public-inbox.org/meta and try and convert it: $ git clone --mirror https://public-inbox.org/meta $ public-inbox-convert meta metanew If I pass --no-index, the process is successful. This is a deadlock of some sort, because during the hang, git cat-file is blocked on write and other things are blocked on read. That's as far as I have gotten with debugging it. -- Chris ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc 2022-12-21 11:28 public-inbox-convert hangs on systems using musl libc Chris Brannon @ 2022-12-21 12:21 ` Eric Wong 2022-12-21 13:46 ` Chris Brannon 0 siblings, 1 reply; 13+ messages in thread From: Eric Wong @ 2022-12-21 12:21 UTC (permalink / raw) To: Chris Brannon; +Cc: meta Chris Brannon <chris@the-brannons.com> wrote: > I can reproduce it reliably on both Alpine Linux and the musl variant of > Void Linux. Just clone a mirror of public-inbox.org/meta and try and > convert it: > > $ git clone --mirror https://public-inbox.org/meta > $ public-inbox-convert meta metanew > > If I pass --no-index, the process is successful. > > This is a deadlock of some sort, because during the hang, git cat-file > is blocked on write and other things are blocked on read. I've requested some packages w/ cfarm admins to test with. Do you know which pipes are which? "lsof -p $PID +E" can help with connectivity checking, as can script/dtas-graph in https://80x24.org/dtas.git if you have Graph::Easy What's curious is all the code paths should be independent of stdio buffering, so I wouldn't think libc differences would matter... I also just did the above meta conversion on FreeBSD w/o problems. > That's as far as I have gotten with debugging it. Some shots in the dark: 1. force read pipe on our end to be non-blocking diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index 882a9a4a..b239ceb9 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -143,6 +143,7 @@ sub _bidi_pipe { fcntl($out_w, 1031, 4096); fcntl($in_r, 1031, 4096) if $batch eq '--batch-check'; } + $in_r->blocking(0); $self->{$out} = $out_w; $self->{$in} = $in_r; } 2. Tweak $PIPE_BUFSIZ and/or MAX_INFLIGHT to smaller values. e.g. diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index 882a9a4a..ec40edd7 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -23,13 +23,12 @@ use Carp qw(croak carp); use Digest::SHA (); use PublicInbox::DS qw(dwaitpid); our @EXPORT_OK = qw(git_unquote git_quote); -our $PIPE_BUFSIZ = 65536; # Linux default +our $PIPE_BUFSIZ = 4096; # Linux default our $in_cleanup; our $RDTIMEO = 60_000; # milliseconds our $async_warn; # true in read-only daemons -use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) / - 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1 +use constant MAX_INFLIGHT => 4; my %GIT_ESC = ( a => "\a", MAX_INFLIGHT could go down to 1 outside of t/git.t, I think... But my sleep deprived mind isn't finding anything that jumps out... ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc 2022-12-21 12:21 ` Eric Wong @ 2022-12-21 13:46 ` Chris Brannon 2022-12-21 19:48 ` Eric Wong 0 siblings, 1 reply; 13+ messages in thread From: Chris Brannon @ 2022-12-21 13:46 UTC (permalink / raw) To: Eric Wong; +Cc: meta [-- Attachment #1: Type: text/plain, Size: 1650 bytes --] Eric Wong <e@80x24.org> writes: > Do you know which pipes are which? "lsof -p $PID +E" can help > with connectivity checking, as can script/dtas-graph in > https://80x24.org/dtas.git if you have Graph::Easy Yes. I'm attaching my lsof output and a typescript. The processes of interest here are 4849 public-inbox-convert and 4879 git cat-file. PID 4849's FD 11 is the write end of a pipe, with 4879's stdin as the read end. PID 4849's FD 12 is the read end of a pipe, with 4879's stdout as the write end. At the point of the hang, 4849 is trying to write a SHA1 to FD 11, while 4879 is writing an email message to its stdout. > Some shots in the dark: > > 2. Tweak $PIPE_BUFSIZ and/or MAX_INFLIGHT to smaller values. e.g. > > diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm > index 882a9a4a..ec40edd7 100644 > --- a/lib/PublicInbox/Git.pm > +++ b/lib/PublicInbox/Git.pm > @@ -23,13 +23,12 @@ use Carp qw(croak carp); > use Digest::SHA (); > use PublicInbox::DS qw(dwaitpid); > our @EXPORT_OK = qw(git_unquote git_quote); > -our $PIPE_BUFSIZ = 65536; # Linux default > +our $PIPE_BUFSIZ = 4096; # Linux default > our $in_cleanup; > our $RDTIMEO = 60_000; # milliseconds > our $async_warn; # true in read-only daemons > > -use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) / > - 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1 > +use constant MAX_INFLIGHT => 4; > > my %GIT_ESC = ( > a => "\a", > This right here seems to have fixed it, when testing locally. PS. Thank you for that lsof command. I've never used lsof in that way; I'll have to add that to my *nix debugging toolbelt. -- Chris [-- Attachment #2: lsofout.txt --] [-- Type: text/plain, Size: 4867 bytes --] COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME screen 4028 chris 11u CHR 5,2 0t0 86 /dev/ptmx ->/dev/pts/7 2214,mksh,0u 2214,mksh,1u 2214,mksh,2u 4849,public-in,0u 4849,public-in,1u 4849,public-in,2u 4874,shard[0],0u 4874,shard[0],1u 4874,shard[0],2u 4879,git,2u public-in 4849 chris cwd DIR 0,42 412 17211 /files/cmb-files/repos public-in 4849 chris rtd DIR 0,26 28 34 / public-in 4849 chris txt REG 0,26 14144 239824 /usr/bin/perl public-in 4849 chris mem REG 0,26 2969912 227085 /usr/lib/libstdc++.so.6.0.30 public-in 4849 chris mem REG 0,26 100328 564900 /usr/lib/libz.so.1.2.13 public-in 4849 chris mem REG 0,26 30760 399826 /usr/lib/libuuid.so.1.3.0 public-in 4849 chris mem REG 0,26 121208 226756 /usr/lib/libgcc_s.so.1 public-in 4849 chris mem REG 0,26 2234600 563318 /usr/lib/libxapian.so.30.12.2 public-in 4849 chris mem REG 0,26 500184 219769 /usr/lib/perl5/vendor_perl/auto/Search/Xapian/Xapian.so public-in 4849 chris mem REG 0,26 132928 239703 /usr/lib/perl5/core_perl/auto/Compress/Raw/Zlib/Zlib.so public-in 4849 chris mem REG 0,26 1351488 347561 /usr/lib/libsqlite3.so.0.8.6 public-in 4849 chris mem REG 0,26 180112 239928 /usr/lib/perl5/vendor_perl/auto/DBD/SQLite/SQLite.so public-in 4849 chris mem REG 0,26 138264 240017 /usr/lib/perl5/vendor_perl/auto/DBI/DBI.so public-in 4849 chris mem REG 0,26 2310 4758 /usr/share/zoneinfo/PST8PDT public-in 4849 chris mem REG 0,26 18168 239686 /usr/lib/perl5/core_perl/auto/MIME/Base64/Base64.so public-in 4849 chris mem REG 0,26 657216 239665 /usr/lib/perl5/core_perl/auto/re/re.so public-in 4849 chris mem REG 0,26 116640 239674 /usr/lib/perl5/core_perl/auto/Storable/Storable.so public-in 4849 chris mem REG 0,26 46920 239702 /usr/lib/perl5/core_perl/auto/Encode/Encode.so public-in 4849 chris mem REG 0,26 39608 218906 /usr/lib/perl5/vendor_perl/auto/Email/Address/XS/XS.so public-in 4849 chris mem REG 0,26 46920 239675 /usr/lib/perl5/core_perl/auto/Socket/Socket.so public-in 4849 chris mem REG 0,26 59128 239687 /usr/lib/perl5/core_perl/auto/List/Util/Util.so public-in 4849 chris mem REG 0,26 46840 238715 /usr/lib/perl5/core_perl/auto/Digest/SHA/SHA.so public-in 4849 chris mem REG 0,26 30456 239671 /usr/lib/perl5/core_perl/auto/Time/HiRes/HiRes.so public-in 4849 chris mem REG 0,26 18168 239744 /usr/lib/perl5/core_perl/auto/Cwd/Cwd.so public-in 4849 chris mem REG 0,26 30536 239695 /usr/lib/perl5/core_perl/auto/File/Glob/Glob.so public-in 4849 chris mem REG 0,26 22264 239689 /usr/lib/perl5/core_perl/auto/IO/IO.so public-in 4849 chris mem REG 0,26 108360 239682 /usr/lib/perl5/core_perl/auto/POSIX/POSIX.so public-in 4849 chris mem REG 0,26 22344 239697 /usr/lib/perl5/core_perl/auto/Fcntl/Fcntl.so public-in 4849 chris mem REG 0,26 3789544 239902 /usr/lib/libperl.so.5.36.0 public-in 4849 chris mem REG 0,26 4251720 228062 /usr/lib/libc.so public-in 4849 chris 0u CHR 136,7 0t0 10 /dev/pts/7 4028,screen,11u public-in 4849 chris 1u CHR 136,7 0t0 10 /dev/pts/7 4028,screen,11u public-in 4849 chris 2u CHR 136,7 0t0 10 /dev/pts/7 4028,screen,11u public-in 4849 chris 3w CHR 1,3 0t0 4 /dev/null public-in 4849 chris 4uW REG 0,42 0 2074958 /files/cmb-files/repos/metanew/inbox.lock public-in 4849 chris 5ur REG 0,42 24576 2074742 /files/cmb-files/repos/metanew/msgmap.sqlite3 public-in 4849 chris 6w FIFO 0,12 0t0 26905684 pipe 4874,shard[0],5r public-in 4849 chris 7r FIFO 0,12 0t0 26905685 pipe 4874,shard[0],8w public-in 4849 chris 8ur REG 0,42 69632 2074960 /files/cmb-files/repos/metanew/xap15/over.sqlite3 public-in 4849 chris 9u REG 0,49 461301 410306 /tmp/PerlIO_LaapMm (deleted) public-in 4849 chris 10u REG 0,42 24576 2074986 /files/cmb-files/repos/metanew/mm_tmp-4849-i60n public-in 4849 chris 11w FIFO 0,12 0t0 26905179 pipe 4879,git,0r public-in 4849 chris 12r FIFO 0,12 0t0 26905180 pipe 4879,git,1w public-in 4849 chris 13u REG 0,42 16928 2074743 /files/cmb-files/repos/metanew/msgmap.sqlite3-journal public-in 4849 chris 14u REG 0,42 66176 2074961 /files/cmb-files/repos/metanew/xap15/over.sqlite3-journal shard[0] 4874 chris 5r FIFO 0,12 0t0 26905684 pipe 4849,public-in,6w shard[0] 4874 chris 8w FIFO 0,12 0t0 26905685 pipe 4849,public-in,7r git 4879 chris 0r FIFO 0,12 0t0 26905179 pipe 4849,public-in,11w git 4879 chris 1w FIFO 0,12 0t0 26905180 pipe 4849,public-in,12r [-- Attachment #3: typescript --] [-- Type: text/plain, Size: 610 bytes --] Script started on 2022-12-21 05:12:15-08:00 [TERM="screen.linux" TTY="/dev/pts/10" COLUMNS="80" LINES="30"] chris@beast:/home/chris $ doas strace -p 4849 \rdoas (chris@beast) password: strace: Process 4849 attached write(11, "4187293ba385ec204df84742b7197f4d"..., 41^Cstrace: Process 4849 detached <detached ...> Interrupt chris@beast:/home/chris $ doas strace -p 4879 strace: Process 4879 attached write(1, "Return-Path: <e@80x24.org>\nX-Spa"..., 3793^Cstrace: Process 4879 detached <detached ...> Interrupt chris@beast:/home/chris $ exit Script done on 2022-12-21 05:12:43-08:00 [COMMAND_EXIT_CODE="130"] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc 2022-12-21 13:46 ` Chris Brannon @ 2022-12-21 19:48 ` Eric Wong 2022-12-21 20:46 ` Chris Brannon 0 siblings, 1 reply; 13+ messages in thread From: Eric Wong @ 2022-12-21 19:48 UTC (permalink / raw) To: Chris Brannon; +Cc: meta Chris Brannon <chris@the-brannons.com> wrote: > Eric Wong <e@80x24.org> writes: > > > Do you know which pipes are which? "lsof -p $PID +E" can help > > with connectivity checking, as can script/dtas-graph in > > https://80x24.org/dtas.git if you have Graph::Easy > > Yes. I'm attaching my lsof output and a typescript. > > The processes of interest here are 4849 public-inbox-convert and 4879 > git cat-file. > PID 4849's FD 11 is the write end of a pipe, with 4879's stdin as the > read end. > PID 4849's FD 12 is the read end of a pipe, with 4879's stdout as the > write end. At the point of the hang, 4849 is trying to write a SHA1 to > FD 11, while 4879 is writing an email message to its stdout. OK. That is strange, because the current values are sized conservatively for Linux (which has larger-than-required PIPE_BUF size). > > Some shots in the dark: > > > > 2. Tweak $PIPE_BUFSIZ and/or MAX_INFLIGHT to smaller values. e.g. > > > > diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm > > index 882a9a4a..ec40edd7 100644 > > --- a/lib/PublicInbox/Git.pm > > +++ b/lib/PublicInbox/Git.pm > > @@ -23,13 +23,12 @@ use Carp qw(croak carp); > > use Digest::SHA (); > > use PublicInbox::DS qw(dwaitpid); > > our @EXPORT_OK = qw(git_unquote git_quote); > > -our $PIPE_BUFSIZ = 65536; # Linux default > > +our $PIPE_BUFSIZ = 4096; # Linux default > > our $in_cleanup; > > our $RDTIMEO = 60_000; # milliseconds > > our $async_warn; # true in read-only daemons > > > > -use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) / > > - 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1 > > +use constant MAX_INFLIGHT => 4; > > This right here seems to have fixed it, when testing locally. Are you able to isolate whether $PIPE_BUFSIZ or MAX_INFLIGHT on it's own fixes it? And can you confirm the ->blocking(0) change had no effect on it's own? Capping MAX_INFLIGHT to a smaller value is probably fine (maybe 10 can work). The old MAX_INFLIGHT value ((512 * 3)/65 = 23) is actually extremely conservative for Linux, since the smallest possible PIPE_BUF is 4096 (so (4096 * 3)/65 = 189), but giant values don't help (and hurt interruptibility). However, making $PIPE_BUFSIZ smaller would increase syscalls made and hurt performance on systems with expensive syscalls. So I want to keep $PIPE_BUFSIZ as big as reasonable. Setting `$PIPE_BUFSIZ = 1024 ** 2' ought to work on a system with giant pipes, even; but the default Linux pipe size is 65536 unless it's low on memory. I'm honestly at a loss on how to explain what went wrong for you because the existing values are OS-independent. I also do wonder if you've hit a kernel bug under low memory conditions. I've certainly encountered problems with TCP traffic hanging processes due to memory compaction. > PS. Thank you for that lsof command. I've never used lsof in that way; > I'll have to add that to my *nix debugging toolbelt. np :> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc 2022-12-21 19:48 ` Eric Wong @ 2022-12-21 20:46 ` Chris Brannon 2022-12-21 21:11 ` Eric Wong 0 siblings, 1 reply; 13+ messages in thread From: Chris Brannon @ 2022-12-21 20:46 UTC (permalink / raw) To: Eric Wong; +Cc: meta Eric Wong <e@80x24.org> writes: > Are you able to isolate whether $PIPE_BUFSIZ or MAX_INFLIGHT on > it's own fixes it? MAX_INFLIGHT is sufficient. In fact, $PIPE_BUFSIZ had no effect at all when I tried just now. > And can you confirm the ->blocking(0) change had no effect on > it's own? That was the first thing I tried, and it had no effect. > Capping MAX_INFLIGHT to a smaller value is probably fine (maybe > 10 can work). I tried with 10 just now, and that was fine also. > I also do wonder if you've hit a kernel bug under low memory > conditions. I've certainly encountered problems with TCP > traffic hanging processes due to memory compaction. I doubt it. I looked during a deadlock, and I had 2.66 GiB available. -- Chris ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc 2022-12-21 20:46 ` Chris Brannon @ 2022-12-21 21:11 ` Eric Wong 2022-12-21 22:17 ` Chris Brannon 0 siblings, 1 reply; 13+ messages in thread From: Eric Wong @ 2022-12-21 21:11 UTC (permalink / raw) To: Chris Brannon; +Cc: meta Chris Brannon <chris@the-brannons.com> wrote: > Eric Wong <e@80x24.org> writes: > > > Are you able to isolate whether $PIPE_BUFSIZ or MAX_INFLIGHT on > > it's own fixes it? > > MAX_INFLIGHT is sufficient. In fact, $PIPE_BUFSIZ had no effect at all > when I tried just now. OK, good to know. Thanks so far for your other testing and reports. > > Capping MAX_INFLIGHT to a smaller value is probably fine (maybe > > 10 can work). > > I tried with 10 just now, and that was fine also. Oops, so POSIX::PIPE_BUF is actually 4096 on Linux. (`perl -MPOSIX -E 'say POSIX::PIPE_BUF'` shows 4096 on the Alpine cfarm machine) But I'm still confused as to why the original code didn't work on Linux since the pipe can't be smaller than 4096... The following changes MAX_INFLIGHT to 23 for all platforms (and explains things better). I also wonder if you can find a threshold for when things start failing (e.g. hardcoding 24 should fail on FreeBSD) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index 882a9a4a..a1af776b 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -28,8 +28,10 @@ our $in_cleanup; our $RDTIMEO = 60_000; # milliseconds our $async_warn; # true in read-only daemons -use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) / - 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1 +# 512: POSIX PIPE_BUF minimum (see pipe(7)) +# 3: @$inflight is flattened [ $OID, $cb, $arg ] +# 65: SHA-256 hex size + "\n" in preparation for git using non-SHA1 +use constant MAX_INFLIGHT => 512 * 3 / 65; my %GIT_ESC = ( a => "\a", ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: public-inbox-convert hangs on systems using musl libc 2022-12-21 21:11 ` Eric Wong @ 2022-12-21 22:17 ` Chris Brannon 2022-12-21 23:22 ` [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum Eric Wong 0 siblings, 1 reply; 13+ messages in thread From: Chris Brannon @ 2022-12-21 22:17 UTC (permalink / raw) To: Eric Wong; +Cc: meta Eric Wong <e@80x24.org> writes: > The following changes MAX_INFLIGHT to 23 for all platforms (and > explains things better). I also wonder if you can find a > threshold for when things start failing (e.g. hardcoding 24 > should fail on FreeBSD) I did a binary search. On my Linux system with musl, deadlocks start at MAX_INFLIGHT >= 115. -- Chris ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum 2022-12-21 22:17 ` Chris Brannon @ 2022-12-21 23:22 ` Eric Wong 2022-12-21 23:57 ` Chris Brannon 0 siblings, 1 reply; 13+ messages in thread From: Eric Wong @ 2022-12-21 23:22 UTC (permalink / raw) To: Chris Brannon; +Cc: meta Chris Brannon <chris@the-brannons.com> wrote: > Eric Wong <e@80x24.org> writes: > > > The following changes MAX_INFLIGHT to 23 for all platforms (and > > explains things better). I also wonder if you can find a > > threshold for when things start failing (e.g. hardcoding 24 > > should fail on FreeBSD) > > I did a binary search. On my Linux system with musl, deadlocks start at > MAX_INFLIGHT >= 115. Thanks. I'll push the following out and intend to keep it at <= 23 going forward. However, it still worries me greatly that I have no explanation to why MAX_INFLIGHT >= 115 fails. This stuff is my bread and butter, really... I'll wait on a cfarm admins to test. Also, can I assume just running 'public-inbox-index' on any freshly-cloned inbox also fails for you independently of -convert? ------------8<---------- Subject: [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum This ensures we get consistent pipelining behavior across platforms. Furthermore, a smaller value is probably more reasonable since "git cat-file" can usually outpace indexing and lower values allow us to react to user interaction (e.g. Ctrl-C) more quickly. The previous value based on Linux PIPE_BUF (4096) allowed a value of 189 which worked fine on non-musl Linux systems, but failed on musl-based Void and Alpine Linux. Mysteriously, this works on musl up to a value of 114 and starts locking up at 115. The reason for this failure is currently unexplained and will hopefully be discovered soon. Regardless, capping the value to 23 based on the universal PIPE_BUF minimum (512) seems reasonable, anyways. Reported-by: Chris Brannon <chris@the-brannons.com> Tested-by: Chris Brannon <chris@the-brannons.com> Link: https://public-inbox.org/meta/87edssl7u0.fsf@the-brannons.com/T/ --- lib/PublicInbox/Git.pm | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index 882a9a4a..a1af776b 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -28,8 +28,10 @@ our $in_cleanup; our $RDTIMEO = 60_000; # milliseconds our $async_warn; # true in read-only daemons -use constant MAX_INFLIGHT => (POSIX::PIPE_BUF * 3) / - 65; # SHA-256 hex size + "\n" in preparation for git using non-SHA1 +# 512: POSIX PIPE_BUF minimum (see pipe(7)) +# 3: @$inflight is flattened [ $OID, $cb, $arg ] +# 65: SHA-256 hex size + "\n" in preparation for git using non-SHA1 +use constant MAX_INFLIGHT => 512 * 3 / 65; my %GIT_ESC = ( a => "\a", ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum 2022-12-21 23:22 ` [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum Eric Wong @ 2022-12-21 23:57 ` Chris Brannon 2023-01-04 3:49 ` [PATCH] git: fix asynchronous batching for deep pipelines Eric Wong 0 siblings, 1 reply; 13+ messages in thread From: Chris Brannon @ 2022-12-21 23:57 UTC (permalink / raw) To: Eric Wong; +Cc: meta Eric Wong <e@80x24.org> writes: > Also, can I assume just running 'public-inbox-index' on any > freshly-cloned inbox also fails for you independently of > -convert? Yes it does. I double-checked to be sure. Thanks for the fix! If you come up with something more permanent that you want me to test, please ask. -- Chris ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] git: fix asynchronous batching for deep pipelines 2022-12-21 23:57 ` Chris Brannon @ 2023-01-04 3:49 ` Eric Wong 2023-01-05 1:08 ` Chris Brannon 0 siblings, 1 reply; 13+ messages in thread From: Eric Wong @ 2023-01-04 3:49 UTC (permalink / raw) To: Chris Brannon; +Cc: meta Chris Brannon <chris@the-brannons.com> wrote: > Eric Wong <e@80x24.org> writes: > > > Also, can I assume just running 'public-inbox-index' on any > > freshly-cloned inbox also fails for you independently of > > -convert? > > Yes it does. I double-checked to be sure. > > Thanks for the fix! If you come up with something more permanent that > you want me to test, please ask. --------8<------- Subject: [PATCH] git: fix asynchronous batching for deep pipelines ...By using non-blocking pipe writes. This avoids problems for musl (and other libc) where getdelim(3) used by `git cat-file --batch*' uses a smaller input buffer than glibc or FreeBSD libc. My key mistake was our check against MAX_INFLIGHT is only useful for the initial batch of requests. It is not useful for subsequent requests since git will drain the pipe at unpredictable rates due to libc differences. To fix this problem, I initially tried to drain the read pipe as long as readable data was pending. However, reading git output without giving git more work would also limit parallelism opportunities since we don't want git to sit idle, either. This change ensures we keep both pipes reasonably full to reduce stalls and maximize parallelism between git and public-inbox. While the limit set a few weeks ago in commit 56e6e587745c (git: cap MAX_INFLIGHT value to POSIX minimum, 2022-12-21) remains in place, any higher or lower limit will work. It may be worth it to use an even lower limit to improve interactivity w.r.t. Ctrl-C interrupts. I've tested the pre-56e6e587745c and even higher values on an Alpine VM in the GCC Farm <https://cfarm.tetaneutral.net> Reported-by: Chris Brannon <chris@the-brannons.com> Link: https://public-inbox.org/meta/87edssl7u0.fsf@the-brannons.com/T/ --- lib/PublicInbox/Git.pm | 30 ++++++++++++++++++++++-------- 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index a1af776b..86b80a4e 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -145,6 +145,7 @@ sub _bidi_pipe { fcntl($out_w, 1031, 4096); fcntl($in_r, 1031, 4096) if $batch eq '--batch-check'; } + $out_w->blocking(0); $self->{$out} = $out_w; $self->{$in} = $in_r; } @@ -203,7 +204,9 @@ sub cat_async_retry ($$) { for (my $i = 0; $i < @$inflight; $i += 3) { $buf .= "$inflight->[$i]\n"; } + $self->{out}->blocking(1); # brand new pipe, should never block print { $self->{out} } $buf or $self->fail("write error: $!"); + $self->{out}->blocking(0); my $req = shift @$inflight; unshift(@$inflight, \$req); # \$ref to indicate retried @@ -305,13 +308,27 @@ sub check_async_begin ($) { $self->{inflight_c} = []; } +sub write_all { + my ($self, $out, $buf, $read_step, $inflight) = @_; + $read_step->($self, $inflight) while @$inflight >= MAX_INFLIGHT; + do { + my $w = syswrite($out, $buf); + if (defined $w) { + return if $w == length($buf); + warn "chop: $w"; + substr($buf, 0, $w, ''); # sv_chop + } elsif ($! != EAGAIN) { + $self->fail("write: $!"); + } else { warn "E: $!" } + $read_step->($self, $inflight); + } while (1); +} + sub check_async ($$$$) { my ($self, $oid, $cb, $arg) = @_; my $inflight_c = $self->{inflight_c} // check_async_begin($self); - while (scalar(@$inflight_c) >= MAX_INFLIGHT) { - check_async_step($self, $inflight_c); - } - print { $self->{out_c} } $oid, "\n" or $self->fail("write error: $!"); + write_all($self, $self->{out_c}, $oid."\n", + \&check_async_step, $inflight_c); push(@$inflight_c, $oid, $cb, $arg); } @@ -496,10 +513,7 @@ sub cat_async_begin { sub cat_async ($$$;$) { my ($self, $oid, $cb, $arg) = @_; my $inflight = $self->{inflight} // cat_async_begin($self); - while (scalar(@$inflight) >= MAX_INFLIGHT) { - cat_async_step($self, $inflight); - } - print { $self->{out} } $oid, "\n" or $self->fail("write error: $!"); + write_all($self, $self->{out}, $oid."\n", \&cat_async_step, $inflight); push(@$inflight, $oid, $cb, $arg); } ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] git: fix asynchronous batching for deep pipelines 2023-01-04 3:49 ` [PATCH] git: fix asynchronous batching for deep pipelines Eric Wong @ 2023-01-05 1:08 ` Chris Brannon 2023-01-05 1:44 ` [PATCH] git: write_all: remove leftover debug messages Eric Wong 0 siblings, 1 reply; 13+ messages in thread From: Chris Brannon @ 2023-01-05 1:08 UTC (permalink / raw) To: Eric Wong; +Cc: meta Eric Wong <e@80x24.org> writes: > > --------8<------- > Subject: [PATCH] git: fix asynchronous batching for deep pipelines > > ...By using non-blocking pipe writes. This avoids problems for > musl (and other libc) where getdelim(3) used by `git cat-file --batch*' > uses a smaller input buffer than glibc or FreeBSD libc. > > My key mistake was our check against MAX_INFLIGHT is only useful > for the initial batch of requests. It is not useful for > subsequent requests since git will drain the pipe at > unpredictable rates due to libc differences. > > To fix this problem, I initially tried to drain the read pipe > as long as readable data was pending. However, reading git > output without giving git more work would also limit parallelism > opportunities since we don't want git to sit idle, either. This > change ensures we keep both pipes reasonably full to reduce > stalls and maximize parallelism between git and public-inbox. > > While the limit set a few weeks ago in commit > 56e6e587745c (git: cap MAX_INFLIGHT value to POSIX minimum, 2022-12-21) > remains in place, any higher or lower limit will work. It may > be worth it to use an even lower limit to improve interactivity > w.r.t. Ctrl-C interrupts. > > I've tested the pre-56e6e587745c and even higher values on an > Alpine VM in the GCC Farm <https://cfarm.tetaneutral.net> > > Reported-by: Chris Brannon <chris@the-brannons.com> > Tested-by: Chris Brannon <chris@the-brannons.com> > Link: https://public-inbox.org/meta/87edssl7u0.fsf@the-brannons.com/T/ *SNIP* Thanks! The conversion seems to have been successful, but it logged very many of the following: E: Resource temporarily unavailable at /usr/share/perl5/vendor_perl/PublicInbox/Git.pm line 320. I'm sorry, I didn't count how many. It seemed like multiple screenfuls at least. -- Chris ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] git: write_all: remove leftover debug messages 2023-01-05 1:08 ` Chris Brannon @ 2023-01-05 1:44 ` Eric Wong 2023-01-05 7:32 ` Chris Brannon 0 siblings, 1 reply; 13+ messages in thread From: Eric Wong @ 2023-01-05 1:44 UTC (permalink / raw) To: Chris Brannon; +Cc: meta Chris Brannon <chris@the-brannons.com> wrote: > The conversion seems to have been successful, but it logged very many of > the following: > E: Resource temporarily unavailable at > /usr/share/perl5/vendor_perl/PublicInbox/Git.pm line 320. > > I'm sorry, I didn't count how many. It seemed like multiple screenfuls > at least. Oops, yeah. Just leftover debug messages that never triggered on my FreeBSD and Debian systems. Will push this out: ---------8<---------- Subject: [PATCH] git: write_all: remove leftover debug messages I used these messages during development to verify Alpine was triggering the intended codepaths. They're no longer necessary and just noise at this point. Reported-by: Chris Brannon <chris@the-brannons.com> Fixes: d4ba8828ab23 ("git: fix asynchronous batching for deep pipelines") --- lib/PublicInbox/Git.pm | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index 86b80a4e..c7f82ba2 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -315,11 +315,10 @@ sub write_all { my $w = syswrite($out, $buf); if (defined $w) { return if $w == length($buf); - warn "chop: $w"; substr($buf, 0, $w, ''); # sv_chop } elsif ($! != EAGAIN) { $self->fail("write: $!"); - } else { warn "E: $!" } + } $read_step->($self, $inflight); } while (1); } ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] git: write_all: remove leftover debug messages 2023-01-05 1:44 ` [PATCH] git: write_all: remove leftover debug messages Eric Wong @ 2023-01-05 7:32 ` Chris Brannon 0 siblings, 0 replies; 13+ messages in thread From: Chris Brannon @ 2023-01-05 7:32 UTC (permalink / raw) To: Eric Wong; +Cc: meta Eric Wong <e@80x24.org> writes: > Subject: [PATCH] git: write_all: remove leftover debug messages > > I used these messages during development to verify Alpine was > triggering the intended codepaths. They're no longer necessary > and just noise at this point. > > Reported-by: Chris Brannon <chris@the-brannons.com> > Tested-by: Chris Brannon <chris@the-brannons.com> > Fixes: d4ba8828ab23 ("git: fix asynchronous batching for deep pipelines") > --- > lib/PublicInbox/Git.pm | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) Thanks! -- Chris ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-01-05 7:32 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-12-21 11:28 public-inbox-convert hangs on systems using musl libc Chris Brannon 2022-12-21 12:21 ` Eric Wong 2022-12-21 13:46 ` Chris Brannon 2022-12-21 19:48 ` Eric Wong 2022-12-21 20:46 ` Chris Brannon 2022-12-21 21:11 ` Eric Wong 2022-12-21 22:17 ` Chris Brannon 2022-12-21 23:22 ` [PATCH] git: cap MAX_INFLIGHT value to POSIX minimum Eric Wong 2022-12-21 23:57 ` Chris Brannon 2023-01-04 3:49 ` [PATCH] git: fix asynchronous batching for deep pipelines Eric Wong 2023-01-05 1:08 ` Chris Brannon 2023-01-05 1:44 ` [PATCH] git: write_all: remove leftover debug messages Eric Wong 2023-01-05 7:32 ` Chris Brannon
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).