From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 09/15] spawn: drop IO layer support from redirects
Date: Thu, 30 Nov 2023 11:41:02 +0000
Message-ID: <20231130114109.2577708-10-e@80x24.org>
In-Reply-To: <20231130114109.2577708-1-e@80x24.org>
When setting up stdin for commands, the write_file API is now
convenient enough that special support for IO layers in process
spawning is no longer worth keeping.
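
For illustration, the stdin pattern above can be sketched with core
Perl only. This is a sketch under stated assumptions, not the
project's code: tmp_stdin_fh and qx_with_stdin are hypothetical
helpers standing in for PublicInbox::IO::write_file and popen_rd,
and `cat' is assumed to be in PATH.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use IO::Handle; # for $fh->flush
use POSIX ();

# Hypothetical helper (not part of public-inbox): write $$bref to an
# anonymous temp file and rewind it so a child can read it as stdin.
sub tmp_stdin_fh {
	my ($bref) = @_;
	open(my $fh, '+>', undef) or die "open: $!"; # anonymous temp file
	print $fh $$bref or die "print: $!";
	$fh->flush or die "flush: $!";
	sysseek($fh, 0, SEEK_SET) // die "sysseek: $!";
	$fh;
}

# Hypothetical helper: run @cmd with $fh as stdin, return its stdout.
sub qx_with_stdin {
	my ($fh, @cmd) = @_;
	pipe(my $r, my $w) or die "pipe: $!";
	my $pid = fork // die "fork: $!";
	if ($pid == 0) { # child
		close $r;
		open(STDIN, '<&', $fh) or die "dup stdin: $!";
		open(STDOUT, '>&', $w) or die "dup stdout: $!";
		exec(@cmd);
		warn "exec @cmd: $!\n";
		POSIX::_exit(127);
	}
	close $w;
	my $out = do { local $/; <$r> };
	waitpid($pid, 0);
	die "@cmd failed: \$?=$?\n" if $?;
	$out;
}

my $fh = tmp_stdin_fh(\"hello\nworld\n");
my $out = qx_with_stdin($fh, 'cat'); # child reads the temp file as fd 0
```

The caller owns the file handle, so the spawning code only needs to
accept a handle or fd number and does not need to know about layers.
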
When reading stdout of commands, we should probably be using
utf8_maybe everywhere since there'll always be legacy encodings
in git repos.
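
A minimal sketch of the "decode only if valid" idea, assuming plain
core utf8::decode. decode_if_utf8 is a hypothetical stand-in and is
not PublicInbox::Hval::utf8_maybe itself, whose details may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in: decode in place when the octets are
# well-formed UTF-8; utf8::decode returns false and leaves the
# scalar untouched otherwise.
sub decode_if_utf8 {
	utf8::decode($_[0]);
}

my $good = "caf\xc3\xa9"; # UTF-8 octets, 5 bytes
my $legacy = "caf\xe9"; # Latin-1 octets, not valid UTF-8
my $ok_good = decode_if_utf8($good); # true, $good is now 4 characters
my $ok_legacy = decode_if_utf8($legacy); # false, $legacy is unchanged
```

Unlike a strict :utf8 input layer, this never fails or warns on
legacy encodings; the caller simply keeps the original octets.
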
Reading regular files with :utf8 also results in worse memory
management since the file size cannot be used as a hint.
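
To illustrate the size hint: with a :utf8 layer, read() lengths count
characters rather than bytes, so the size from stat(2) does not map
onto the read length. A sketch assuming a raw handle and a decode
step afterwards; slurp_bytes is a hypothetical helper, not
PublicInbox::IO::read_all.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: slurp a regular file as raw bytes, using the
# st_size from stat(2) as the read length, then decode if possible.
sub slurp_bytes {
	my ($path) = @_;
	open(my $fh, '<:raw', $path) or die "open $path: $!";
	my $size = (stat($fh))[7] // die "stat $path: $!";
	my ($buf, $off) = ('', 0);
	while ($off < $size) {
		my $r = read($fh, $buf, $size - $off, $off) //
			die "read $path: $!";
		last if $r == 0; # file shrank while reading
		$off += $r;
	}
	utf8::decode($buf); # leaves $buf as bytes if not valid UTF-8
	$buf;
}

my ($tfh, $tmp) = tempfile(UNLINK => 1);
binmode $tfh;
print $tfh "caf\xc3\xa9\n" or die "print: $!";
close $tfh or die "close: $!";
my $bytes = -s $tmp; # size on disk, in bytes
my $str = slurp_bytes($tmp); # decoded characters
```
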
---
lib/PublicInbox/MailDiff.pm | 3 ++-
lib/PublicInbox/SearchIdx.pm | 5 ++++-
lib/PublicInbox/Spawn.pm | 32 +++++++++++---------------------
3 files changed, 17 insertions(+), 23 deletions(-)
diff --git a/lib/PublicInbox/MailDiff.pm b/lib/PublicInbox/MailDiff.pm
index e4e262ef..125360fe 100644
--- a/lib/PublicInbox/MailDiff.pm
+++ b/lib/PublicInbox/MailDiff.pm
@@ -65,6 +65,7 @@ sub next_smsg ($) {
sub emit_msg_diff {
my ($bref, $self) = @_; # bref is `git diff' output
require PublicInbox::Hval;
+ PublicInbox::Hval::utf8_maybe($$bref);
# will be escaped to `&#8226;' in HTML
$self->{ctx}->{ibx}->{obfuscate} and
@@ -81,7 +82,7 @@ sub do_diff {
my $dir = "$self->{tmp}/$n";
$self->dump_eml($dir, $eml);
my $cmd = [ qw(git diff --no-index --no-color -- a), $n ];
- my $opt = { -C => "$self->{tmp}", quiet => 1, 1 => [':utf8', \my $o] };
+ my $opt = { -C => "$self->{tmp}", quiet => 1 };
my $qsp = PublicInbox::Qspawn->new($cmd, undef, $opt);
$qsp->psgi_qx($self->{ctx}->{env}, undef, \&emit_msg_diff, $self);
}
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 17538027..86c435fd 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -355,8 +355,11 @@ sub index_body_text {
my $rd;
if ($$sref =~ /^(?:diff|---|\+\+\+) /ms) { # start patch-id in parallel
my $git = ($self->{ibx} // $self->{eidx} // $self)->git;
+ my $fh = PublicInbox::IO::write_file '+>:utf8', undef, $$sref;
+ $fh->flush or die "flush: $!";
+ sysseek($fh, 0, SEEK_SET);
$rd = popen_rd($git->cmd(qw(patch-id --stable)), undef,
- { 0 => [ ':utf8', $sref ] });
+ { 0 => $fh });
}
# split off quoted and unquoted blocks:
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index 9c680690..e6b12994 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -332,18 +332,6 @@ sub which ($) {
undef;
}
-sub scalar_redirect {
- my ($layer, $opt, $child_fd, $bref) = @_;
- open my $fh, '+>'.$layer, undef;
- $opt->{"fh.$child_fd"} = $fh;
- if ($child_fd == 0) {
- print $fh $$bref;
- $fh->flush or die "flush: $!";
- sysseek($fh, 0, SEEK_SET);
- }
- fileno($fh);
-}
-
sub spawn ($;$$) {
my ($cmd, $env, $opt) = @_;
my $f = which($cmd->[0]) // die "$cmd->[0]: command not found\n";
@@ -354,14 +342,18 @@ sub spawn ($;$$) {
}
for my $child_fd (0..2) {
my $pfd = $opt->{$child_fd};
- if ('ARRAY' eq ref($pfd)) {
- my ($layer, $bref) = @$pfd;
- $pfd = scalar_redirect($layer, $opt, $child_fd, $bref)
- } elsif ('SCALAR' eq ref($pfd)) {
- $pfd = scalar_redirect('', $opt, $child_fd, $pfd);
+ if ('SCALAR' eq ref($pfd)) {
+ open my $fh, '+>', undef;
+ $opt->{"fh.$child_fd"} = $fh; # for read_out_err
+ if ($child_fd == 0) {
+ print $fh $$pfd;
+ $fh->flush or die "flush: $!";
+ sysseek($fh, 0, SEEK_SET);
+ }
+ $pfd = fileno($fh);
} elsif (defined($pfd) && $pfd !~ /\A[0-9]+\z/) {
my $fd = fileno($pfd) //
- die "$pfd not an IO GLOB? $!";
+ croak "BUG: $pfd not an IO GLOB? $!";
$pfd = $fd;
}
$rdr[$child_fd] = $pfd // $child_fd;
@@ -399,9 +391,7 @@ sub read_out_err ($) {
for my $fd (1, 2) { # read stdout/stderr
my $fh = delete($opt->{"fh.$fd"}) // next;
seek($fh, 0, SEEK_SET);
- my $dst = $opt->{$fd};
- $dst = $opt->{$fd} = $dst->[1] if ref($dst) eq 'ARRAY';
- PublicInbox::IO::read_all $fh, 0, $dst
+ PublicInbox::IO::read_all $fh, undef, $opt->{$fd};
}
}