unofficial mirror of meta@public-inbox.org
* RFE: Long .onion URL breaks mobile view
@ 2021-08-16 16:36 Konstantin Ryabitsev
  2021-08-16 22:38 ` Eric Wong
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
  0 siblings, 2 replies; 21+ messages in thread
From: Konstantin Ryabitsev @ 2021-08-16 16:36 UTC (permalink / raw)
  To: meta

Hello:

Passing more observations from people testing out x-lore.kernel.org:

- the long .onion URL at the bottom of each page causes problems on mobile
  devices because it results in a very wide page with blank content on the
  right side.

I suggest that we don't need to provide the "git clone" part, but merely add a
link to the git repository:

<a href="http://7fh....onion/public-inbox.git">AGPL code for this site</a>

This should still serve the same purpose and not result in rendering issues on
mobile devices.

-K

* Re: RFE: Long .onion URL breaks mobile view
  2021-08-16 16:36 RFE: Long .onion URL breaks mobile view Konstantin Ryabitsev
@ 2021-08-16 22:38 ` Eric Wong
  2021-08-16 22:53   ` Eric Wong
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Wong @ 2021-08-16 22:38 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> Hello:
> 
> Passing more observations from people testing out x-lore.kernel.org:
> 
> - the long .onion URL at the bottom of each page causes problems on mobile
>   devices because it results in a very wide page with blank content on the
>   right side.

Agreed, this is important feedback.

> I suggest that we don't need to provide the "git clone" part, but merely add a
> link to the git repository:
> 
> <a href="http://7fh....onion/public-inbox.git">AGPL code for this site</a>
> 
> This should still serve the same purpose and not result in rendering issues on
> mobile devices.

I think making it obvious a link goes to an external domain is
important.  How about splitting it on long tokens?  Something
like:

	<a href="...">http://
		7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd
		.onion/public-inbox.git</a>

Is a 56-character token OK for mobile displays?  My own preference is to
wrap at 64 characters despite using 80-column displays.

I wish Tor could've figured out some better way to make .onion v3 URLs
shorter...

* Re: RFE: Long .onion URL breaks mobile view
  2021-08-16 22:38 ` Eric Wong
@ 2021-08-16 22:53   ` Eric Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-16 22:53 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> 	<a href="...">http://
> 		7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd
> 		.onion/public-inbox.git</a>

Actually, it's for "git clone" instructions, so something
shell-compatible might be required:

	u=7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd
	torsocks git clone http://$u.onion/public-inbox.git

No good options, really :<

* [PATCH 0/8] various WWW + extindex stuff
  2021-08-16 16:36 RFE: Long .onion URL breaks mobile view Konstantin Ryabitsev
  2021-08-16 22:38 ` Eric Wong
@ 2021-08-26 12:33 ` Eric Wong
  2021-08-26 12:33   ` [PATCH 1/8] get rid of unnecessary bytes::length usage Eric Wong
                     ` (8 more replies)
  1 sibling, 9 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta; +Cc: Konstantin Ryabitsev

This hopefully makes the long .onion URL more usable on small
displays; but I also got sidetracked into making our "use bytes"
stuff less scary based on the notice in the bytes(3perl)
manpage.

There are a couple of small extindex-related fixes to reconcile
the differences between the two ibxish object types for WWW.

Eric Wong (8):
  get rid of unnecessary bytes::length usage
  ds: use bytes::substr and bytes::length module-wide for now
  www_stream: sh-friendly .onion URLs wrapping
  www: avoid incorrect instructions for extindex
  www_text: fix example config snippet for extindex
  config: do not parse altid for extindex
  www_text: add coderepo config support for extindex
  move ->ids_after from mm to over

 lib/PublicInbox/Config.pm       |   2 +-
 lib/PublicInbox/DS.pm           |  21 +++---
 lib/PublicInbox/ExtSearch.pm    |   4 -
 lib/PublicInbox/HTTP.pm         |  17 ++---
 lib/PublicInbox/ManifestJsGz.pm |   3 +-
 lib/PublicInbox/Mbox.pm         |  10 +--
 lib/PublicInbox/Msgmap.pm       |  11 ---
 lib/PublicInbox/NNTP.pm         |   4 +-
 lib/PublicInbox/Over.pm         |  11 +++
 lib/PublicInbox/View.pm         |   5 +-
 lib/PublicInbox/ViewVCS.pm      |   5 +-
 lib/PublicInbox/WWW.pm          |  10 +--
 lib/PublicInbox/WwwAttach.pm    |   4 +-
 lib/PublicInbox/WwwHighlight.pm |   5 +-
 lib/PublicInbox/WwwListing.pm   |   4 +-
 lib/PublicInbox/WwwStatic.pm    |   4 +-
 lib/PublicInbox/WwwStream.pm    | 126 +++++++++++++++++++-------------
 lib/PublicInbox/WwwText.pm      | 103 ++++++++++++++++----------
 t/extindex-psgi.t               |  15 ++++
 t/psgi_search.t                 |   1 -
 t/search-thr-index.t            |   8 +-
 t/www_listing.t                 |  19 ++++-
 xt/cmp-msgstr.t                 |   2 +-
 23 files changed, 229 insertions(+), 165 deletions(-)


* [PATCH 1/8] get rid of unnecessary bytes::length usage
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 12:33   ` [PATCH 2/8] ds: use bytes::substr and bytes::length module-wide for now Eric Wong
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta

The only place where we could return wide characters with -httpd
was the raw $INBOX_DIR/description text, which is now converted
to octets.

All daemon (HTTP/NNTP/IMAP) sockets are opened in binary mode,
so length() and bytes::length() are equivalent on reads.  For
socket writes, any non-octet data would warn about wide characters
and we are strict in warnings with test_httpd.

All gzipped buffers are also octets, as is PublicInbox::Eml->body,
and anything from PerlIO objects ("git cat-file --batch" output,
filesystems), so bytes::length was unnecessary in all those places.
---
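Illustrative note, not part of the commit: a minimal sketch of the
claim above, assuming a hypothetical description string containing one
wide character.  It shows that length() and bytes::length() only
disagree on character strings, and that utf8::encode() (as now used in
get_description) yields octets for which plain length() is a byte
count.  The variable names are made up for this example.

```perl
use strict;
use warnings;
use bytes (); # loads bytes::length without enabling the pragma

# Hypothetical description containing one wide character (U+0100).
my $desc = "we're \x{100}ll clones\n";
my $chars = length($desc);           # counts characters
my $internal = bytes::length($desc); # counts bytes of the internal UTF-8 form

# Convert a copy to octets in place before sending it over a socket.
my $octets = $desc;
utf8::encode($octets);
my $wire = length($octets); # a byte count, usable for Content-Length

printf "chars=%d internal=%d wire=%d\n", $chars, $internal, $wire;
```

Once a buffer is known to hold octets, length() alone is enough, which
is presumably why the bytes::length calls could be dropped.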
 lib/PublicInbox/HTTP.pm         | 17 ++++++++---------
 lib/PublicInbox/ManifestJsGz.pm |  3 +--
 lib/PublicInbox/NNTP.pm         |  2 +-
 lib/PublicInbox/View.pm         |  5 ++---
 lib/PublicInbox/ViewVCS.pm      |  5 ++---
 lib/PublicInbox/WWW.pm          | 10 ++++------
 lib/PublicInbox/WwwAttach.pm    |  4 ++--
 lib/PublicInbox/WwwHighlight.pm |  5 ++---
 lib/PublicInbox/WwwListing.pm   |  4 ++--
 lib/PublicInbox/WwwStatic.pm    |  4 ++--
 lib/PublicInbox/WwwStream.pm    |  4 ++--
 lib/PublicInbox/WwwText.pm      |  5 ++---
 t/psgi_search.t                 |  1 -
 t/search-thr-index.t            |  8 ++++----
 t/www_listing.t                 | 19 +++++++++++++++----
 xt/cmp-msgstr.t                 |  2 +-
 16 files changed, 50 insertions(+), 48 deletions(-)

diff --git a/lib/PublicInbox/HTTP.pm b/lib/PublicInbox/HTTP.pm
index d0708c5b..b2c74cf3 100644
--- a/lib/PublicInbox/HTTP.pm
+++ b/lib/PublicInbox/HTTP.pm
@@ -21,7 +21,6 @@
 package PublicInbox::HTTP;
 use strict;
 use parent qw(PublicInbox::DS);
-use bytes (); # only for bytes::length
 use Fcntl qw(:seek);
 use Plack::HTTPParser qw(parse_http_request); # XS or pure Perl
 use Plack::Util;
@@ -89,7 +88,7 @@ sub event_step { # called by PublicInbox::DS
 
 	return read_input($self) if ref($self->{env});
 	my $rbuf = $self->{rbuf} // (\(my $x = ''));
-	$self->do_read($rbuf, 8192, bytes::length($$rbuf)) or return;
+	$self->do_read($rbuf, 8192, length($$rbuf)) or return;
 	rbuf_process($self, $rbuf);
 }
 
@@ -104,7 +103,7 @@ sub rbuf_process {
 	# (they are rarely-used and git (as of 2.7.2) does not use them)
 	if ($r == -1 || $env{HTTP_TRAILER} ||
 			# this length-check is necessary for PURE_PERL=1:
-			($r == -2 && bytes::length($$rbuf) > 0x4000)) {
+			($r == -2 && length($$rbuf) > 0x4000)) {
 		return quit($self, 400);
 	}
 	if ($r < 0) { # incomplete
@@ -121,7 +120,7 @@ sub rbuf_process {
 # IO::Handle::write returns boolean, this returns bytes written:
 sub xwrite ($$$) {
 	my ($fh, $rbuf, $max) = @_;
-	my $w = bytes::length($$rbuf);
+	my $w = length($$rbuf);
 	$w = $max if $w > $max;
 	$fh->write($$rbuf, $w) or return;
 	$w;
@@ -236,7 +235,7 @@ sub response_header_write {
 sub chunked_write ($$) {
 	my $self = $_[0];
 	return if $_[1] eq '';
-	msg_more($self, sprintf("%x\r\n", bytes::length($_[1])));
+	msg_more($self, sprintf("%x\r\n", length($_[1])));
 	msg_more($self, $_[1]);
 
 	# use $self->write(\"\n\n") if you care about real-time
@@ -411,12 +410,12 @@ sub read_input_chunked { # unlikely...
 			$$rbuf =~ s/\A\r\n//s and
 				return app_dispatch($self, $input, $rbuf);
 
-			return quit($self, 400) if bytes::length($$rbuf) > 2;
+			return quit($self, 400) if length($$rbuf) > 2;
 		}
 		if ($len == CHUNK_END) {
 			if ($$rbuf =~ s/\A\r\n//s) {
 				$len = CHUNK_START;
-			} elsif (bytes::length($$rbuf) > 2) {
+			} elsif (length($$rbuf) > 2) {
 				return quit($self, 400);
 			}
 		}
@@ -426,14 +425,14 @@ sub read_input_chunked { # unlikely...
 				if (($len + -s $input) > $MAX_REQUEST_BUFFER) {
 					return quit($self, 413);
 				}
-			} elsif (bytes::length($$rbuf) > CHUNK_MAX_HDR) {
+			} elsif (length($$rbuf) > CHUNK_MAX_HDR) {
 				return quit($self, 400);
 			}
 			# will break from loop since $len >= 0
 		}
 
 		if ($len < 0) { # chunk header is trickled, read more
-			$self->do_read($rbuf, 8192, bytes::length($$rbuf)) or
+			$self->do_read($rbuf, 8192, length($$rbuf)) or
 				return recv_err($self, $len);
 			# (implicit) goto chunk_start if $r > 0;
 		}
diff --git a/lib/PublicInbox/ManifestJsGz.pm b/lib/PublicInbox/ManifestJsGz.pm
index 7fee78dd..69d81fa1 100644
--- a/lib/PublicInbox/ManifestJsGz.pm
+++ b/lib/PublicInbox/ManifestJsGz.pm
@@ -6,7 +6,6 @@ package PublicInbox::ManifestJsGz;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::WwwListing);
-use bytes (); # length
 use PublicInbox::Config;
 use IO::Compress::Gzip qw(gzip);
 use HTTP::Date qw(time2str);
@@ -108,7 +107,7 @@ sub psgi_triple {
 	gzip(\$manifest => \(my $out));
 	[ 200, [ qw(Content-Type application/gzip),
 		 'Last-Modified', time2str($ctx->{-mtime}),
-		 'Content-Length', bytes::length($out) ], [ $out ] ]
+		 'Content-Length', length($out) ], [ $out ] ]
 }
 
 sub per_inbox {
diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index 13a68bb8..aea04c05 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -241,7 +241,7 @@ sub parse_time ($$;$) {
 		$gmt = 1;
 	}
 	my ($YYYY, $MM, $DD);
-	if (bytes::length($date) == 8) { # RFC 3977 allows YYYYMMDD
+	if (length($date) == 8) { # RFC 3977 allows YYYYMMDD
 		($YYYY, $MM, $DD) = unpack('A4A2A2', $date);
 	} else { # legacy clients send YYMMDD
 		my $YY;
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 17d38302..94ea6148 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -5,8 +5,7 @@
 # See Documentation/design_www.txt for this.
 package PublicInbox::View;
 use strict;
-use warnings;
-use bytes (); # only for bytes::length
+use v5.10.1;
 use List::Util qw(max);
 use PublicInbox::MsgTime qw(msg_datestamp);
 use PublicInbox::Hval qw(ascii_html obfuscate_addrs prurl mid_href
@@ -531,7 +530,7 @@ sub attach_link ($$$$;$) {
 	return unless $part->{bdy};
 
 	my $nl = $idx eq '1' ? '' : "\n"; # like join("\n", ...)
-	my $size = bytes::length($part->body);
+	my $size = length($part->body);
 
 	# hide attributes normally, unless we want to aid users in
 	# spotting MUA problems:
diff --git a/lib/PublicInbox/ViewVCS.pm b/lib/PublicInbox/ViewVCS.pm
index 702a075d..6365f045 100644
--- a/lib/PublicInbox/ViewVCS.pm
+++ b/lib/PublicInbox/ViewVCS.pm
@@ -15,8 +15,7 @@
 
 package PublicInbox::ViewVCS;
 use strict;
-use warnings;
-use bytes (); # only for bytes::length
+use v5.10.1;
 use PublicInbox::SolverGit;
 use PublicInbox::WwwStream qw(html_oneshot);
 use PublicInbox::Linkify;
@@ -49,7 +48,7 @@ sub stream_blob_parse_hdr { # {parse_hdr} for Qspawn
 	} elsif (index($$bref, "\0") >= 0) {
 		[200, [qw(Content-Type application/octet-stream), @cl] ];
 	} else {
-		my $n = bytes::length($$bref);
+		my $n = length($$bref);
 		if ($n >= $BIN_DETECT || $n == $size) {
 			return [200, [ 'Content-Type',
 				'text/plain; charset=UTF-8', @cl ] ];
diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 1afdece0..570e690e 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -11,10 +11,8 @@
 # - Must not rely on static content
 # - UTF-8 is only for user-content, 7-bit US-ASCII for us
 package PublicInbox::WWW;
-use 5.010_001;
 use strict;
-use warnings;
-use bytes (); # only for bytes::length
+use v5.10.1;
 use PublicInbox::Config;
 use PublicInbox::Hval;
 use URI::Escape qw(uri_unescape);
@@ -646,8 +644,7 @@ sub get_css ($$$) {
 		$css = PublicInbox::UserContent::sample($ctx->{ibx}, $env);
 	}
 	defined $css or return r404();
-	my $h = [ 'Content-Length', bytes::length($css),
-		'Content-Type', 'text/css' ];
+	my $h = [ 'Content-Length', length($css), 'Content-Type', 'text/css' ];
 	PublicInbox::GitHTTPBackend::cache_one_year($h);
 	[ 200, $h, [ $css ] ];
 }
@@ -656,7 +653,8 @@ sub get_description {
 	my ($ctx, $inbox) = @_;
 	invalid_inbox($ctx, $inbox) || do {
 		my $d = $ctx->{ibx}->description . "\n";
-		[ 200, [ 'Content-Length', bytes::length($d),
+		utf8::encode($d);
+		[ 200, [ 'Content-Length', length($d),
 			'Content-Type', 'text/plain' ], [ $d ] ];
 	};
 }
diff --git a/lib/PublicInbox/WwwAttach.pm b/lib/PublicInbox/WwwAttach.pm
index a6c68a3f..c17394af 100644
--- a/lib/PublicInbox/WwwAttach.pm
+++ b/lib/PublicInbox/WwwAttach.pm
@@ -4,8 +4,8 @@
 # For retrieving attachments from messages in the WWW interface
 package PublicInbox::WwwAttach; # internal package
 use strict;
+use v5.10.1;
 use parent qw(PublicInbox::GzipFilter);
-use bytes (); # only for bytes::length
 use PublicInbox::Eml;
 
 sub referer_match ($) {
@@ -50,7 +50,7 @@ sub get_attach_i { # ->each_part callback
 			$part = "Deep-linking prevented\n";
 		}
 	}
-	push @{$res->[1]}, 'Content-Length', bytes::length($part);
+	push @{$res->[1]}, 'Content-Length', length($part);
 	$res->[2]->[0] = $part;
 }
 
diff --git a/lib/PublicInbox/WwwHighlight.pm b/lib/PublicInbox/WwwHighlight.pm
index 6fed2fed..3593c2d4 100644
--- a/lib/PublicInbox/WwwHighlight.pm
+++ b/lib/PublicInbox/WwwHighlight.pm
@@ -20,8 +20,7 @@
 
 package PublicInbox::WwwHighlight;
 use strict;
-use warnings;
-use bytes (); # only for bytes::length
+use v5.10.1;
 use parent qw(PublicInbox::HlMod);
 use PublicInbox::Linkify qw();
 use PublicInbox::Hval qw(ascii_html);
@@ -69,7 +68,7 @@ sub call {
 	$l->linkify_2($$bref);
 
 	my $h = [ 'Content-Type', 'text/html; charset=UTF-8' ];
-	push @$h, 'Content-Length', bytes::length($$bref);
+	push @$h, 'Content-Length', length($$bref);
 
 	[ 200, $h, [ $$bref ] ]
 }
diff --git a/lib/PublicInbox/WwwListing.pm b/lib/PublicInbox/WwwListing.pm
index a31aa4ca..8b54d724 100644
--- a/lib/PublicInbox/WwwListing.pm
+++ b/lib/PublicInbox/WwwListing.pm
@@ -5,12 +5,12 @@
 # Used by PublicInbox::WWW
 package PublicInbox::WwwListing;
 use strict;
+use v5.10.1;
 use PublicInbox::Hval qw(prurl fmt_ts ascii_html);
 use PublicInbox::Linkify;
 use PublicInbox::GzipFilter qw(gzf_maybe);
 use PublicInbox::ConfigIter;
 use PublicInbox::WwwStream;
-use bytes (); # bytes::length
 
 sub ibx_entry {
 	my ($ctx, $ibx, $ce) = @_;
@@ -213,7 +213,7 @@ sub psgi_triple {
 	my $out = $gzf->zflush('</pre><hr><pre>'.
 			PublicInbox::WwwStream::code_footer($ctx->{env}) .
 			'</pre></body></html>');
-	$h->[3] = bytes::length($out);
+	$h->[3] = length($out);
 	[ $code, $h, [ $out ] ];
 }
 
diff --git a/lib/PublicInbox/WwwStatic.pm b/lib/PublicInbox/WwwStatic.pm
index 29e4819d..b3476ab8 100644
--- a/lib/PublicInbox/WwwStatic.pm
+++ b/lib/PublicInbox/WwwStatic.pm
@@ -9,8 +9,8 @@
 # functionality of nginx.
 package PublicInbox::WwwStatic;
 use strict;
+use v5.10.1;
 use parent qw(Exporter);
-use bytes ();
 use Fcntl qw(SEEK_SET O_RDONLY O_NONBLOCK);
 use POSIX qw(strftime);
 use HTTP::Date qw(time2str);
@@ -318,7 +318,7 @@ sub dir_response ($$$) {
 		"</head><body><pre>Index of $path_info_html</pre><hr><pre>\n");
 	$gzf->zmore(join("\n", @entries));
 	my $out = $gzf->zflush("</pre><hr></body></html>\n");
-	$h->[3] = bytes::length($out);
+	$h->[3] = length($out);
 	[ 200, $h, [ $out ] ]
 }
 
diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index 2f8212d4..adcb5fe2 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -7,9 +7,9 @@
 # See PublicInbox::GzipFilter parent class for more info.
 package PublicInbox::WwwStream;
 use strict;
+use v5.10.1;
 use parent qw(Exporter PublicInbox::GzipFilter);
 our @EXPORT_OK = qw(html_oneshot);
-use bytes (); # length
 use PublicInbox::Hval qw(ascii_html prurl ts2str);
 our $TOR_URL = 'https://www.torproject.org/';
 our $CODE_URL = [ qw(http://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/public-inbox.git
@@ -216,7 +216,7 @@ sub html_oneshot ($$;$) {
 	};
 	$ctx->zmore($$sref) if $sref;
 	my $bdy = $ctx->zflush(_html_end($ctx));
-	$res_hdr->[3] = bytes::length($bdy);
+	$res_hdr->[3] = length($bdy);
 	[ $code, $res_hdr, [ $bdy ] ]
 }
 
diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index 76a95a6b..db5060ea 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -4,8 +4,7 @@
 # used for displaying help texts and other non-mail content
 package PublicInbox::WwwText;
 use strict;
-use warnings;
-use bytes (); # only for bytes::length
+use v5.10.1;
 use PublicInbox::Linkify;
 use PublicInbox::WwwStream;
 use PublicInbox::Hval qw(ascii_html);
@@ -43,7 +42,7 @@ sub get_text {
 			$txt = $gzf->translate($txt);
 			$txt .= $gzf->zflush;
 		}
-		$hdr->[3] = bytes::length($txt);
+		$hdr->[3] = length($txt);
 		return [ $code, $hdr, [ $txt ] ]
 	}
 
diff --git a/t/psgi_search.t b/t/psgi_search.t
index 5bdd66ed..3da93eda 100644
--- a/t/psgi_search.t
+++ b/t/psgi_search.t
@@ -8,7 +8,6 @@ use IO::Uncompress::Gunzip qw(gunzip);
 use PublicInbox::Eml;
 use PublicInbox::Config;
 use PublicInbox::Inbox;
-use bytes (); # only for bytes::length
 my @mods = qw(DBD::SQLite Search::Xapian HTTP::Request::Common Plack::Test
 		URI::Escape Plack::Builder);
 require_mods(@mods);
diff --git a/t/search-thr-index.t b/t/search-thr-index.t
index fc1b666a..62745dbc 100644
--- a/t/search-thr-index.t
+++ b/t/search-thr-index.t
@@ -1,8 +1,8 @@
+#!perl -w
 # Copyright (C) 2017-2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict;
-use warnings;
-use bytes (); # only for bytes::length
+use v5.10.1;
 use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::MID qw(mids);
@@ -45,7 +45,7 @@ foreach (reverse split(/\n\n/, $data)) {
 	my $mime = PublicInbox::Eml->new(\$_);
 	$mime->header_set('From' => 'bw@g');
 	$mime->header_set('To' => 'git@vger.kernel.org');
-	my $bytes = bytes::length($mime->as_string);
+	my $bytes = length($mime->as_string);
 	my $mid = mids($mime->header_obj)->[0];
 	my $smsg = bless {
 		bytes => $bytes,
@@ -92,7 +92,7 @@ To: git@vger.kernel.org
 	my $tid0 = $dbh->selectrow_array(<<'', undef, $num);
 SELECT tid FROM over WHERE num = ? LIMIT 1
 
-	my $bytes = bytes::length($mime->as_string);
+	my $bytes = length($mime->as_string);
 	my $mid = mids($mime->header_obj)->[0];
 	my $smsg = bless {
 		bytes => $bytes,
diff --git a/t/www_listing.t b/t/www_listing.t
index 6b3b408f..7ea12eea 100644
--- a/t/www_listing.t
+++ b/t/www_listing.t
@@ -55,7 +55,7 @@ sub tiny_test {
 	ok(my $clone = $manifest->{'/alt'}, '/alt in manifest');
 	is($clone->{owner}, "lorelei \x{100}", 'owner set');
 	is($clone->{reference}, '/bare', 'reference detected');
-	is($clone->{description}, "we're all clones", 'description read');
+	is($clone->{description}, "we're \x{100}ll clones", 'description read');
 	ok(my $bare = $manifest->{'/bare'}, '/bare in manifest');
 	is($bare->{description}, 'Unnamed repository',
 		'missing $GIT_DIR/description fallback');
@@ -72,6 +72,10 @@ sub tiny_test {
 	ok(my $v2epoch1 = $manifest->{'/v2/git/1.git'}, 'v2 epoch 1 appeared');
 	like($v2epoch1->{description}, qr/ \[epoch 1\]\z/,
 		'epoch 1 in description');
+
+	$res = $http->get("http://$host:$port/alt/description");
+	is($res->{content}, "we're \xc4\x80ll clones\n", 'UTF-8 description')
+		or diag explain($res);
 }
 
 my $td;
@@ -91,9 +95,9 @@ SKIP: {
 		is(xsys(@clone, $alt, "$v2/git/$i.git"), 0, "clone epoch $i")
 	}
 	ok(open(my $fh, '>', "$v2/inbox.lock"), 'mock a v2 inbox');
-	open $fh, '>', "$alt/description" or die;
-	print $fh "we're all clones\n" or die;
-	close $fh or die;
+	open $fh, '>', "$alt/description" or xbail "open $alt/description $!";
+	print $fh "we're \xc4\x80ll clones\n" or xbail "print $!";
+	close $fh or xbail "write: $alt/description $!";
 	is(xsys('git', "--git-dir=$alt", qw(config gitweb.owner),
 		"lorelei \xc4\x80"), 0,
 		'set gitweb user');
@@ -178,6 +182,13 @@ manifest = \${site}/v2/manifest.js.gz
 	for (qw(v2/git/0.git v2/git/1.git v2/git/2.git)) {
 		ok(-d "$tmpdir/per-inbox/$_", "grok-pull created $_");
 	}
+	$td->kill;
+	$td->join;
+	is($?, 0, 'no error in exited process');
+	open $fh, '<', $err or BAIL_OUT("open $err failed: $!");
+	my $eout = do { local $/; <$fh> };
+	unlike($eout, qr/wide/i, 'no Wide character warnings');
+	unlike($eout, qr/uninitialized/i, 'no uninitialized warnings');
 }
 
 done_testing();
diff --git a/xt/cmp-msgstr.t b/xt/cmp-msgstr.t
index e0e8ed5a..900127c7 100644
--- a/xt/cmp-msgstr.t
+++ b/xt/cmp-msgstr.t
@@ -60,7 +60,7 @@ my $cmp = sub {
 				my $dig = $dig_cls->new;
 				$dig->add($part);
 				push @$cmp_arg, "M: ".$dig->hexdigest;
-				push @$cmp_arg, "B: ".bytes::length($part);
+				push @$cmp_arg, "B: ".length($part);
 			} else {
 				$part =~ s/\s+\z//s;
 				push @$cmp_arg, "X: ".$part;

* [PATCH 2/8] ds: use bytes::substr and bytes::length module-wide for now
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
  2021-08-26 12:33   ` [PATCH 1/8] get rid of unnecessary bytes::length usage Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 12:33   ` [PATCH 3/8] www_stream: sh-friendly .onion URLs wrapping Eric Wong
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta

The use of substr within IO::Handle->write may not be correct if
we have wide characters, so handle it ourselves.

bytes.pm usage is probably better fixed in PublicInbox::NNTP,
but the effort required is higher, so we'll just keep bytes in
DS for now.
---
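Illustrative note, not part of the commit: a rough sketch of the
"handle it ourselves" approach, assuming a hypothetical write_tail()
helper and an in-memory filehandle standing in for the tmpfile.  With
"use bytes" in scope, length and substr operate on bytes, so the
offset and the remaining length are computed in the same unit before
print is called.

```perl
use strict;
use warnings;

# Hypothetical helper shaped like tmpio(): write everything in $$bref
# from byte offset $off onward to $fh, returning the number of bytes.
sub write_tail {
	my ($fh, $bref, $off) = @_;
	use bytes; # lexically scoped: length and substr below count bytes
	my $len = length($$bref) - $off;
	print $fh substr($$bref, $off, $len) or return;
	$len;
}

my $buf = "\xc4\x80bc"; # four octets, as read from a binary-mode socket
open(my $fh, '>', \(my $out = '')) or die "open: $!";
my $n = write_tail($fh, \$buf, 2);
close($fh) or die "close: $!";
print "wrote $n byte(s): $out\n";
```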
 lib/PublicInbox/DS.pm | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 7a4dfed0..d804792b 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -23,7 +23,7 @@ package PublicInbox::DS;
 use strict;
 use v5.10.1;
 use parent qw(Exporter);
-use bytes;
+use bytes qw(length substr); # FIXME(?): needed for PublicInbox::NNTP
 use POSIX qw(WNOHANG sigprocmask SIG_SETMASK);
 use Fcntl qw(SEEK_SET :DEFAULT O_APPEND);
 use Time::HiRes qw(clock_gettime CLOCK_MONOTONIC);
@@ -499,13 +499,14 @@ sub drop {
 # n.b.: use ->write/->read for this buffer to allow compatibility with
 # PerlIO::mmap or PerlIO::scalar if needed
 sub tmpio ($$$) {
-    my ($self, $bref, $off) = @_;
-    my $fh = tmpfile('wbuf', $self->{sock}, O_APPEND) or
-        return drop($self, "tmpfile $!");
-    $fh->autoflush(1);
-    my $len = bytes::length($$bref) - $off;
-    $fh->write($$bref, $len, $off) or return drop($self, "write ($len): $!");
-    [ $fh, 0 ] # [1] = offset, [2] = length, not set by us
+	my ($self, $bref, $off) = @_;
+	my $fh = tmpfile('wbuf', $self->{sock}, O_APPEND) or
+		return drop($self, "tmpfile $!");
+	$fh->autoflush(1);
+	my $len = length($$bref) - $off;
+	print $fh substr($$bref, $off, $len) or
+		return drop($self, "write ($len): $!");
+	[ $fh, 0 ] # [1] = offset, [2] = length, not set by us
 }
 
 =head2 C<< $obj->write( $data ) >>
@@ -547,7 +548,7 @@ sub write {
         $bref->($self);
         return 1;
     } else {
-        my $to_write = bytes::length($$bref);
+        my $to_write = length($$bref);
         my $written = syswrite($sock, $$bref, $to_write);
 
         if (defined $written) {
@@ -582,7 +583,7 @@ sub msg_more ($$) {
 		!$sock->can('stop_SSL')) {
         my $n = send($sock, $_[1], MSG_MORE);
         if (defined $n) {
-            my $nlen = bytes::length($_[1]) - $n;
+            my $nlen = length($_[1]) - $n;
             return 1 if $nlen == 0; # all done!
             # queue up the unwritten substring:
             my $tmpio = tmpio($self, \($_[1]), $n) or return 0;

* [PATCH 3/8] www_stream: sh-friendly .onion URLs wrapping
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
  2021-08-26 12:33   ` [PATCH 1/8] get rid of unnecessary bytes::length usage Eric Wong
  2021-08-26 12:33   ` [PATCH 2/8] ds: use bytes::substr and bytes::length module-wide for now Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 12:33   ` [PATCH 4/8] www: avoid incorrect instructions for extindex Eric Wong
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta; +Cc: Konstantin Ryabitsev

The long v3 .onion URL was causing havoc on small mobile
displays, so extract "hostname" into a variable which can
still be used as a Bourne shell snippet.

While we're at it, include "torsocks" in the git command used
for .onion URLs since that's the (near)-universal wrapper for
Tor-ifying things (like git) which are dynamically linked to
libc.

Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210816163654.c6gfzuezhji4l6s7@nitro.local/
---
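Illustrative note, not part of the commit: a plain-text sketch of the
substitution described above, without the HTML anchor that
code_footer() emits.  The clone_snippet() name and the example host
label are hypothetical; the point is only that the long host label
moves into a shell variable while the remaining line stays short.

```perl
use strict;
use warnings;

# Hypothetical helper: plain-text variant of the code_footer() idea.
sub clone_snippet {
	my ($u) = @_;
	my $arg = $u;
	# Replace the .onion host label with a literal "$hostname" placeholder;
	# $2 keeps the original label after a successful substitution.
	if ($arg =~ s!\A(https?://)([^/.]+)\.onion/!$1\$hostname.onion/!i) {
		return "hostname=$2\ntorsocks git clone $arg\n";
	}
	"git clone $u\n";
}

print clone_snippet('http://examplehostlabel.onion/public-inbox.git');
print clone_snippet('https://public-inbox.org/public-inbox.git');
```

The two lines printed for the .onion case can be pasted into a Bourne
shell as-is, since the second line expands $hostname set by the first.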
 lib/PublicInbox/WwwStream.pm | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index adcb5fe2..57c23690 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -12,8 +12,10 @@ use parent qw(Exporter PublicInbox::GzipFilter);
 our @EXPORT_OK = qw(html_oneshot);
 use PublicInbox::Hval qw(ascii_html prurl ts2str);
 our $TOR_URL = 'https://www.torproject.org/';
-our $CODE_URL = [ qw(http://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/public-inbox.git
-	https://public-inbox.org/public-inbox.git) ];
+
+our $CODE_URL = [ qw(
+http://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/public-inbox.git
+https://public-inbox.org/public-inbox.git) ];
 
 sub base_url ($) {
 	my $ctx = shift;
@@ -107,7 +109,13 @@ EOF
 sub code_footer ($) {
 	my ($env) = @_;
 	my $u = prurl($env, $CODE_URL);
-	qq(AGPL code for this site: git clone <a\nhref="$u">$u</a>)
+	my $arg = $u;
+	if ($arg =~ s!\A(https?://)([^/\.]+)\.onion/!$1\$hostname\.onion/!i) {
+		"AGPL code for this site:\n\thostname=$2\n\t" .
+		qq(torsocks git clone <a\nhref="$u">$arg</a>)
+	} else {
+		qq(AGPL code for this site: git clone <a\nhref="$u">$u</a>)
+	}
 }
 
 sub _html_end {
@@ -125,6 +133,9 @@ EOF
 	my $max = $ibx->max_git_epoch;
 	my $dir = (split(m!/!, $http))[-1];
 	my %seen = ($http => 1);
+	# TODO: some of these URLs may be too long and we may need to
+	# do something like code_footer() above, but these are local
+	# admin-defined
 	if (defined($max)) { # v2
 		for my $i (0..$max) {
 			# old epochs my be deleted:

* [PATCH 4/8] www: avoid incorrect instructions for extindex
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
                     ` (2 preceding siblings ...)
  2021-08-26 12:33   ` [PATCH 3/8] www_stream: sh-friendly .onion URLs wrapping Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 12:33   ` [PATCH 5/8] www_text: fix example config snippet " Eric Wong
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta

There's no way to clone an extindex, since there's no git
storage associated with them.  So attempt to link to the
HTML listing of public-inboxes, instead.
---
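Illustrative note, not part of the commit: a small sketch of the
capability check this patch relies on, using two hypothetical
stand-in classes (Demo::Inbox and Demo::ExtSearch) rather than the
real PublicInbox ones.  Removing the stub method from one class lets
->can('cloneurl') tell the two object types apart, so clone
instructions are only generated where they make sense.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two "ibxish" object types.
package Demo::Inbox;
sub new { bless {}, shift }
sub cloneurl { [ 'https://example.com/inbox' ] }

package Demo::ExtSearch;
sub new { bless {}, shift } # deliberately has no cloneurl method

package main;

# Emit clone instructions only for objects which can be cloned.
sub mirror_note {
	my ($ibx) = @_;
	if ($ibx->can('cloneurl')) {
		return join('', map { "git clone --mirror $_\n" }
				@{$ibx->cloneurl});
	}
	return "extindex: nothing to clone\n";
}

print mirror_note(Demo::Inbox->new);
print mirror_note(Demo::ExtSearch->new);
```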
 lib/PublicInbox/ExtSearch.pm |   4 --
 lib/PublicInbox/WwwStream.pm | 111 +++++++++++++++++++----------------
 2 files changed, 62 insertions(+), 53 deletions(-)

diff --git a/lib/PublicInbox/ExtSearch.pm b/lib/PublicInbox/ExtSearch.pm
index 0b480c7e..bd301158 100644
--- a/lib/PublicInbox/ExtSearch.pm
+++ b/lib/PublicInbox/ExtSearch.pm
@@ -106,10 +106,6 @@ sub description {
 		'$EXTINDEX_DIR/description missing';
 }
 
-sub cloneurl { [] } # TODO
-
-sub nntp_url { [] }
-
 no warnings 'once';
 *base_url = \&PublicInbox::Inbox::base_url;
 *smsg_eml = \&PublicInbox::Inbox::smsg_eml;
diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index 57c23690..d8142824 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -120,52 +120,50 @@ sub code_footer ($) {
 
 sub _html_end {
 	my ($ctx) = @_;
-	my $urls = <<EOF;
-<a
-id=mirror>This inbox may be cloned and mirrored by anyone:</a>
-EOF
-
 	my $ibx = $ctx->{ibx};
 	my $desc = ascii_html($ibx->description);
-
-	my @urls;
+	my $s = "<a\nid=mirror>";
 	my $http = $ctx->{base_url};
-	my $max = $ibx->max_git_epoch;
 	my $dir = (split(m!/!, $http))[-1];
 	my %seen = ($http => 1);
-	# TODO: some of these URLs may be too long and we may need to
-	# do something like code_footer() above, but these are local
-	# admin-defined
-	if (defined($max)) { # v2
-		for my $i (0..$max) {
-			# old epochs my be deleted:
-			-d "$ibx->{inboxdir}/git/$i.git" or next;
-			my $url = "$http/$i";
-			$seen{$url} = 1;
-			push @urls, "$url $dir/git/$i.git";
+	if ($ibx->can('cloneurl')) { # PublicInbox::Inbox
+		$s .= "This inbox may be cloned and mirrored by anyone:</a>\n";
+		my @urls;
+		my $max = $ibx->max_git_epoch;
+		# TODO: some of these URLs may be too long and we may need to
+		# do something like code_footer() above, but these are local
+		# admin-defined
+		if (defined($max)) { # v2
+			for my $i (0..$max) {
+				# old epochs my be deleted:
+				-d "$ibx->{inboxdir}/git/$i.git" or next;
+				my $url = "$http/$i";
+				$seen{$url} = 1;
+				push @urls, "$url $dir/git/$i.git";
+			}
+			my $nr = scalar(@urls);
+			if ($nr > 1) {
+				$s .= "\n\t";
+				$s .= "# this inbox consists of $nr epochs:";
+				$urls[0] .= "\t# oldest";
+				$urls[-1] .= "\t# newest";
+			}
+		} else { # v1
+			push @urls, $http;
 		}
-		my $nr = scalar(@urls);
-		if ($nr > 1) {
-			$urls .= "\n\t# this inbox consists of $nr epochs:";
-			$urls[0] .= "\t# oldest";
-			$urls[-1] .= "\t# newest";
+		# FIXME: epoch splits can be different in other repositories,
+		# use the "cloneurl" file as-is for now:
+		for my $u (@{$ibx->cloneurl}) {
+			next if $seen{$u}++;
+			push @urls, ($u =~ /\Ahttps?:/ ?
+					qq(<a\nhref="$u">$u</a>) : $u);
 		}
-	} else { # v1
-		push @urls, $http;
-	}
-
-	# FIXME: epoch splits can be different in other repositories,
-	# use the "cloneurl" file as-is for now:
-	foreach my $u (@{$ibx->cloneurl}) {
-		next if $seen{$u}++;
-		push @urls, $u =~ /\Ahttps?:/ ? qq(<a\nhref="$u">$u</a>) : $u;
-	}
-
-	$urls .= "\n" . join('', map { "\tgit clone --mirror $_\n" } @urls);
-	if (my $addrs = $ibx->{address}) {
-		$addrs = join(' ', @$addrs) if ref($addrs) eq 'ARRAY';
-		my $v = defined $max ? '-V2' : '-V1';
-		$urls .= <<EOF;
+		$s .= "\n";
+		$s .= join('', map { "\tgit clone --mirror $_\n" } @urls);
+		if (my $addrs = $ibx->{address}) {
+			$addrs = join(' ', @$addrs) if ref($addrs) eq 'ARRAY';
+			my $v = defined $max ? '-V2' : '-V1';
+			$s .= <<EOF;
 
 	# If you have public-inbox 1.1+ installed, you may
 	# initialize and index your mirror using the following commands:
@@ -173,26 +171,41 @@ EOF
 		$addrs
 	public-inbox-index $dir
 EOF
+		}
+	} else { # PublicInbox::ExtSearch
+		$s .= <<EOM;
+This is an extindex which is an amalgamation of several public-inboxes</a>
+EOM
+		my $v = $ctx->{www}->{pi_cfg}->{lc('publicInbox.wwwListing')};
+		if (($v // '') =~ /\A(?:all|match=domain)\z/) {
+			my $upfx = ($ctx->{-upfx} // ''). '../';
+			$s .= <<EOM;
+A list of them is available in the <a\nhref="$upfx">listing</a>
+EOM
+		}
 	}
+
 	my $cfg_link = ($ctx->{-upfx} // '').'_/text/config/raw';
-	$urls .= <<EOF;
+	$s .= <<EOF;
 
 Example <a
 href="$cfg_link">config snippet</a> for mirrors.
 EOF
-	my @nntp = map { qq(<a\nhref="$_">$_</a>) } @{$ibx->nntp_url};
-	if (@nntp) {
-		$urls .= @nntp == 1 ? 'Newsgroup' : 'Newsgroups are';
-		$urls .= ' available over NNTP:';
-		$urls .= "\n\t" . join("\n\t", @nntp) . "\n";
+	if ($ibx->can('nntp_url')) {
+		my @nntp = map { qq(<a\nhref="$_">$_</a>) } @{$ibx->nntp_url};
+		if (@nntp) {
+			$s .= @nntp == 1 ? 'Newsgroup' : 'Newsgroups are';
+			$s .= ' available over NNTP:';
+			$s .= "\n\t" . join("\n\t", @nntp) . "\n";
+		}
 	}
-	if ($urls =~ m!\b[^:]+://\w+\.onion/!) {
-		$urls .= " note: .onion URLs require Tor: ";
-		$urls .= qq[<a\nhref="$TOR_URL">$TOR_URL</a>];
+	if ($s =~ m!\b[^:]+://\w+\.onion/!) {
+		$s .= " note: .onion URLs require Tor: ";
+		$s .= qq[<a\nhref="$TOR_URL">$TOR_URL</a>];
 	}
 	'<hr><pre>'.join("\n\n",
 		$desc,
-		$urls,
+		$s,
 		coderepos($ctx),
 		code_footer($ctx->{env})
 	).'</pre></body></html>';

^ permalink raw reply related	[flat|nested] 21+ messages in thread
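
For illustration, the Tor note in the footer code above is appended only when
the accumulated text matches a regex for .onion URLs.  Below is a minimal Perl
sketch of that check in isolation; `mentions_onion' and the sample URLs are
hypothetical names used only here and are not part of public-inbox.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the regex used in the footer code:
# returns 1 if the text contains something like scheme://host.onion/
sub mentions_onion {
	my ($text) = @_;
	$text =~ m!\b[^:]+://\w+\.onion/! ? 1 : 0;
}

for my $t ("git clone http://example.onion/test",
		"git clone https://example.com/test") {
	print mentions_onion($t), " $t\n";
}
```

Since \w+ does not match dots, a host with a subdomain (a.b.onion) would not
trigger the note under this pattern; whether that matters depends on the URLs
an admin configures.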

* [PATCH 5/8] www_text: fix example config snippet for extindex
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
                     ` (3 preceding siblings ...)
  2021-08-26 12:33   ` [PATCH 4/8] www: avoid incorrect instructions for extindex Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 12:33   ` [PATCH 6/8] config: do not parse altid " Eric Wong
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta

extindex doesn't use the same config keys as normal
"publicinbox" entries, so we need a separate function
to generate its example config snippet.
---
 lib/PublicInbox/WwwText.pm | 29 ++++++++++++++++++++++++++++-
 t/extindex-psgi.t          | 12 ++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index db5060ea..eb5e3ac7 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -214,10 +214,37 @@ EOF
 	1;
 }
 
+# n.b. this is a perfect candidate for memoization
+sub extindex_config ($$$) {
+	my ($ctx, $hdr, $txt) = @_;
+	my $ibx = $ctx->{ibx};
+	push @$hdr, 'Content-Disposition', 'inline; filename=extindex.config';
+	my $name = dq_escape($ibx->{name});
+	my $base_url = $ibx->base_url($ctx->{env});
+	$$txt .= <<EOS;
+; Example public-inbox config snippet for the external index (extindex) at:
+; $base_url
+; See public-inbox-config(5) manpage for more details:
+; https://public-inbox.org/public-inbox-config.html
+[extindex "$name"]
+	topdir = /path/to/extindex-topdir
+	url = https://example.com/$name/
+	url = http://example.onion/$name/
+EOS
+	for my $k (qw(infourl)) {
+		defined(my $v = $ibx->{$k}) or next;
+		$$txt .= "\t$k = $v\n";
+	}
+	# TODO: coderepo support for extindex
+	1;
+}
+
 sub _default_text ($$$$) {
 	my ($ctx, $key, $hdr, $txt) = @_;
 	return _colors_help($ctx, $txt) if $key eq 'color';
-	return inbox_config($ctx, $hdr, $txt) if $key eq 'config';
+	$key eq 'config' and return $ctx->{ibx}->can('cloneurl') ?
+			inbox_config($ctx, $hdr, $txt) :
+			extindex_config($ctx, $hdr, $txt);
 	return if $key ne 'help'; # TODO more keys?
 
 	my $ibx = $ctx->{ibx};
diff --git a/t/extindex-psgi.t b/t/extindex-psgi.t
index 6f62b5a0..b9acc979 100644
--- a/t/extindex-psgi.t
+++ b/t/extindex-psgi.t
@@ -40,6 +40,18 @@ my $client = sub {
 		'Host: header respected in Atom feed');
 	unlike($res->content, qr!http://bogus\.example\.com/!s,
 		'default URL ignored with different host header');
+
+	$res = $cb->(GET('/all/_/text/config/'));
+	is($res->code, 200, '/text/config HTML');
+	$res = $cb->(GET('/all/_/text/config/raw'));
+	is($res->code, 200, '/text/config raw');
+	my $f = "$tmpdir/extindex.config";
+	open my $fh, '>', $f or xbail $!;
+	print $fh $res->content or xbail $!;
+	close $fh or xbail $!;
+	my $cfg = PublicInbox::Config->git_config_dump($f);
+	is($?, 0, 'no errors from git-config parsing');
+	ok($cfg->{'extindex.all.topdir'}, 'extindex.topdir defined');
 };
 test_psgi(sub { $www->call(@_) }, $client);
 %$env = (%$env, TMPDIR => $tmpdir, PI_CONFIG => $pi_config);

^ permalink raw reply related	[flat|nested] 21+ messages in thread
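
The patch above picks between inbox_config and extindex_config by asking
whether the object responds to ->cloneurl.  A rough Perl sketch of that
dispatch style follows; the Demo::* packages and `config_kind' are
hypothetical stand-ins, not the real PublicInbox::Inbox or
PublicInbox::ExtSearch classes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an inbox object: it has a cloneurl method
package Demo::Inbox;
sub new { bless {}, shift }
sub cloneurl { [] }

# Hypothetical stand-in for an extindex object: no cloneurl method
package Demo::ExtSearch;
sub new { bless {}, shift }

package main;

# Choose a generator name based on what the object can do,
# rather than on its class name
sub config_kind {
	my ($ibx) = @_;
	$ibx->can('cloneurl') ? 'inbox_config' : 'extindex_config';
}

print config_kind(Demo::Inbox->new), "\n";
print config_kind(Demo::ExtSearch->new), "\n";
```

Checking capabilities with ->can avoids hard-coding class names, at the cost
of tying the dispatch to a method that happens to exist on only one side.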

* [PATCH 6/8] config: do not parse altid for extindex
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
                     ` (4 preceding siblings ...)
  2021-08-26 12:33   ` [PATCH 5/8] www_text: fix example config snippet " Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 12:33   ` [PATCH 7/8] www_text: add coderepo config support " Eric Wong
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta

There's currently no support for altid with extindex, and
there's likely no legacy precedent for using altid like there is
with single public-inboxes.
---
 lib/PublicInbox/Config.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 7aa1f6c8..b3e00ae0 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -525,7 +525,7 @@ sub _fill_ei ($$) {
 		my $v = get_1($self, $pfx, $k) // next;
 		$es->{$k} = $v;
 	}
-	for my $k (qw(altid coderepo hide url infourl)) {
+	for my $k (qw(coderepo hide url infourl)) {
 		my $v = $self->{"$pfx.$k"} // next;
 		$es->{$k} = _array($v);
 	}

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 7/8] www_text: add coderepo config support for extindex
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
                     ` (5 preceding siblings ...)
  2021-08-26 12:33   ` [PATCH 6/8] config: do not parse altid " Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 12:33   ` [PATCH 8/8] move ->ids_after from mm to over Eric Wong
  2021-08-26 13:27   ` [PATCH 0/8] various WWW + extindex stuff Konstantin Ryabitsev
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta

At least manually configured coderepos "just work"
for extindex, though it probably could be automatic
and inherited from the publicinbox configs.
---
 lib/PublicInbox/WwwText.pm | 75 +++++++++++++++++++-------------------
 1 file changed, 38 insertions(+), 37 deletions(-)

diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index eb5e3ac7..47310258 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -129,7 +129,42 @@ sub dq_escape ($) {
 	$name;
 }
 
-sub URI_PATH () { '^A-Za-z0-9\-\._~/' }
+sub _coderepo_config ($$) {
+	my ($ctx, $txt) = @_;
+	my $cr = $ctx->{ibx}->{coderepo} // return;
+	# note: this doesn't preserve cgitrc layout, since we parse cgitrc
+	# and drop the original structure
+	$$txt .= "\tcoderepo = $_\n" for @$cr;
+	$$txt .= <<'EOF';
+
+; `coderepo' entries allows blob reconstruction via patch emails if
+; the inbox is indexed with Xapian.  `@@ <from-range> <to-range> @@'
+; line number ranges in `[PATCH]' emails link to /$INBOX_NAME/$OID/s/,
+; an HTTP endpoint which reconstructs git blobs via git-apply(1).
+EOF
+	my $pi_cfg = $ctx->{www}->{pi_cfg};
+	for my $cr_name (@$cr) {
+		my $urls = $pi_cfg->get_all("coderepo.$cr_name.cgiturl");
+		my $path = "/path/to/$cr_name";
+		$cr_name = dq_escape($cr_name);
+
+		$$txt .= qq([coderepo "$cr_name"]\n);
+		if ($urls && scalar(@$urls)) {
+			$$txt .= "\t; ";
+			$$txt .= join(" ||\n\t;\t", map {;
+				my $dst = $path;
+				if ($path !~ m![a-z0-9_/\.\-]!i) {
+					$dst = '"'.dq_escape($dst).'"';
+				}
+				qq(git clone $_ $dst);
+			} @$urls);
+			$$txt .= "\n";
+		}
+		$$txt .= "\tdir = $path\n";
+		$$txt .= "\tcgiturl = https://example.com/";
+		$$txt .= uri_escape_utf8($cr_name, '^A-Za-z0-9\-\._~/')."\n";
+	}
+}
 
 # n.b. this is a perfect candidate for memoization
 sub inbox_config ($$$) {
@@ -176,41 +211,7 @@ EOF
 		$$txt .= "\t$k = $v\n";
 	}
 	$$txt .= "\tnntpmirror = $_\n" for (@{$ibx->nntp_url});
-
-	# note: this doesn't preserve cgitrc layout, since we parse cgitrc
-	# and drop the original structure
-	if (defined(my $cr = $ibx->{coderepo})) {
-		$$txt .= "\tcoderepo = $_\n" for @$cr;
-		$$txt .= <<'EOF';
-
-; `coderepo' entries allows blob reconstruction via patch emails if
-; the inbox is indexed with Xapian.  `@@ <from-range> <to-range> @@'
-; line number ranges in `[PATCH]' emails link to /$INBOX_NAME/$OID/s/,
-; an HTTP endpoint which reconstructs git blobs via git-apply(1).
-EOF
-		my $pi_cfg = $ctx->{www}->{pi_cfg};
-		for my $cr_name (@$cr) {
-			my $urls = $pi_cfg->get_all("coderepo.$cr_name.cgiturl");
-			my $path = "/path/to/$cr_name";
-			$cr_name = dq_escape($cr_name);
-
-			$$txt .= qq([coderepo "$cr_name"]\n);
-			if ($urls && scalar(@$urls)) {
-				$$txt .= "\t; ";
-				$$txt .= join(" ||\n\t;\t", map {;
-					my $dst = $path;
-					if ($path !~ m![a-z0-9_/\.\-]!i) {
-						$dst = '"'.dq_escape($dst).'"';
-					}
-					qq(git clone $_ $dst);
-				} @$urls);
-				$$txt .= "\n";
-			}
-			$$txt .= "\tdir = $path\n";
-			$$txt .= "\tcgiturl = https://example.com/";
-			$$txt .= uri_escape_utf8($cr_name, URI_PATH)."\n";
-		}
-	}
+	_coderepo_config($ctx, $txt);
 	1;
 }
 
@@ -235,7 +236,7 @@ EOS
 		defined(my $v = $ibx->{$k}) or next;
 		$$txt .= "\t$k = $v\n";
 	}
-	# TODO: coderepo support for extindex
+	_coderepo_config($ctx, $txt);
 	1;
 }
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 8/8] move ->ids_after from mm to over
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
                     ` (6 preceding siblings ...)
  2021-08-26 12:33   ` [PATCH 7/8] www_text: add coderepo config support " Eric Wong
@ 2021-08-26 12:33   ` Eric Wong
  2021-08-26 13:27   ` [PATCH 0/8] various WWW + extindex stuff Konstantin Ryabitsev
  8 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-26 12:33 UTC (permalink / raw)
  To: meta

Since we favor ->over in WWW and IMAP, move this method to
->over to reduce open files in common cases.

This fixes the /$EXTINDEX_NAME/all.mbox.gz endpoint for extindex
entries (which may get expensive...).
---
 lib/PublicInbox/Mbox.pm   | 10 ++++------
 lib/PublicInbox/Msgmap.pm | 11 -----------
 lib/PublicInbox/NNTP.pm   |  2 +-
 lib/PublicInbox/Over.pm   | 11 +++++++++++
 t/extindex-psgi.t         |  3 +++
 5 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 844099aa..f72af26b 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -161,19 +161,17 @@ sub all_ids_cb {
 			my $smsg = $ctx->{over}->get_art($num) or next;
 			return $smsg;
 		}
-		$ctx->{ids} = $ids = $ctx->{mm}->ids_after(\($ctx->{prev}));
+		$ctx->{ids} = $ids = $ctx->{over}->ids_after(\($ctx->{prev}));
 	} while (@$ids);
 }
 
 sub mbox_all_ids {
 	my ($ctx) = @_;
-	my $ibx = $ctx->{ibx};
 	my $prev = 0;
-	my $mm = $ctx->{mm} = $ibx->mm;
-	my $ids = $mm->ids_after(\$prev) or return
-		[404, [qw(Content-Type text/plain)], ["No results found\n"]];
-	$ctx->{over} = $ibx->over or
+	$ctx->{over} = $ctx->{ibx}->over or
 		return PublicInbox::WWW::need($ctx, 'Overview');
+	my $ids = $ctx->{over}->ids_after(\$prev) or return
+		[404, [qw(Content-Type text/plain)], ["No results found\n"]];
 	$ctx->{ids} = $ids;
 	$ctx->{prev} = $prev;
 	require PublicInbox::MboxGz;
diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 16a9a476..3887a9e6 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -189,17 +189,6 @@ CREATE TABLE IF NOT EXISTS meta (
 
 }
 
-# used by NNTP.pm
-sub ids_after {
-	my ($self, $num) = @_;
-	my $ids = $self->{dbh}->selectcol_arrayref(<<'', undef, $$num);
-SELECT num FROM msgmap WHERE num > ?
-ORDER BY num ASC LIMIT 1000
-
-	$$num = $ids->[-1] if @$ids;
-	$ids;
-}
-
 sub msg_range {
 	my ($self, $beg, $end, $cols) = @_;
 	$cols //= 'num,mid';
diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index aea04c05..ea9ce183 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -210,7 +210,7 @@ sub listgroup_range_i {
 
 sub listgroup_all_i {
 	my ($self, $num) = @_;
-	my $ary = $self->{ibx}->mm(1)->ids_after($num);
+	my $ary = $self->{ibx}->over(1)->ids_after($num);
 	scalar(@$ary) or return;
 	more($self, join("\r\n", @$ary));
 	1;
diff --git a/lib/PublicInbox/Over.pm b/lib/PublicInbox/Over.pm
index 58fdea0e..19da056a 100644
--- a/lib/PublicInbox/Over.pm
+++ b/lib/PublicInbox/Over.pm
@@ -371,4 +371,15 @@ SELECT COUNT(*) FROM xref3 WHERE oidbin = ?
 
 sub blob_exists { oidbin_exists($_[0], pack('H*', $_[1])) }
 
+# used by NNTP.pm
+sub ids_after {
+	my ($self, $num) = @_;
+	my $ids = dbh($self)->selectcol_arrayref(<<'', undef, $$num);
+SELECT num FROM over WHERE num > ?
+ORDER BY num ASC LIMIT 1000
+
+	$$num = $ids->[-1] if @$ids;
+	$ids;
+}
+
 1;
diff --git a/t/extindex-psgi.t b/t/extindex-psgi.t
index b9acc979..d4761641 100644
--- a/t/extindex-psgi.t
+++ b/t/extindex-psgi.t
@@ -52,6 +52,9 @@ my $client = sub {
 	my $cfg = PublicInbox::Config->git_config_dump($f);
 	is($?, 0, 'no errors from git-config parsing');
 	ok($cfg->{'extindex.all.topdir'}, 'extindex.topdir defined');
+
+	$res = $cb->(GET('/all/all.mbox.gz'));
+	is($res->code, 200, 'all.mbox.gz');
 };
 test_psgi(sub { $www->call(@_) }, $client);
 %$env = (%$env, TMPDIR => $tmpdir, PI_CONFIG => $pi_config);

^ permalink raw reply related	[flat|nested] 21+ messages in thread
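
The ids_after method moved in the patch above takes a reference to the last
seen article number, returns up to 1000 larger numbers, and advances the
cursor so the caller can loop until an empty batch comes back.  The sketch
below imitates that contract in plain Perl; it assumes an in-memory list
instead of SQLite, and `ids_after_demo' and its $limit parameter are
hypothetical.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the "num" column, sorted ascending
my @nums = (1..2500);

# Same calling convention as ids_after: $num is a scalar ref holding the
# last-seen number; return an arrayref of up to $limit larger numbers
# and update $$num to the last one returned.
sub ids_after_demo {
	my ($num, $limit) = @_;
	my @ids = grep { $_ > $$num } @nums;
	splice(@ids, $limit) if @ids > $limit;
	$$num = $ids[-1] if @ids;
	\@ids;
}

my ($prev, $batches, $total) = (0, 0, 0);
while (1) {
	my $ids = ids_after_demo(\$prev, 1000);
	last unless @$ids;
	$batches++;
	$total += @$ids;
}
print "batches=$batches total=$total last=$prev\n";
```

Paging by "num > last seen" keeps each query bounded no matter how large the
inbox is, which is presumably why the same pattern serves both NNTP LISTGROUP
and all.mbox.gz.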

* Re: [PATCH 0/8] various WWW + extindex stuff
  2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
                     ` (7 preceding siblings ...)
  2021-08-26 12:33   ` [PATCH 8/8] move ->ids_after from mm to over Eric Wong
@ 2021-08-26 13:27   ` Konstantin Ryabitsev
  2021-08-28 11:50     ` [PATCH 0/2] www: split out mirror to /text/ Eric Wong
  8 siblings, 1 reply; 21+ messages in thread
From: Konstantin Ryabitsev @ 2021-08-26 13:27 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Thu, Aug 26, 2021 at 12:33:30PM +0000, Eric Wong wrote:
> This hopefully makes the long .onion URL more usable on small
> displays;

It's still not quite fixing the problem (actually, for some reason it's now
looking worse in the narrow mobile view for me). May I suggest the following:

- The / mirror / link at the top should link to _/text/mirror
- Everything below "This inbox may be cloned and mirrored by anyone:" should
  move to that page, including cloning instructions for public-inbox itself
- "AGPL code for this site" will be just a link to _/text/mirror/#clone

I think this solves multiple problems:

1. Removes the long link from the thread listing, allowing for better
   display on narrow screens; mobile device users are unlikely to visit the
   mirroring instructions page, because mirroring requires an actual
   workstation.
2. On lists with a lot of epochs, removing mirroring instructions from every
   view shrinks the page footer considerably (e.g. see /lkml/ with 11
   epochs).
3. Allows adding more content to _/text/mirror page without any fear of
   impacting the thread listing.

What do you think?

-K

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 0/2] www: split out mirror to /text/
  2021-08-26 13:27   ` [PATCH 0/8] various WWW + extindex stuff Konstantin Ryabitsev
@ 2021-08-28 11:50     ` Eric Wong
  2021-08-28 11:50       ` [PATCH 1/2] www: move mirror instructions " Eric Wong
                         ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-28 11:50 UTC (permalink / raw)
  To: meta; +Cc: Konstantin Ryabitsev

Maybe this works better?  *shrug*

I was somewhat under the impression every page needed to
link/advertise code under the AGPL, but maybe just one HTML page
is fine...  We can't advertise our AGPL code to NNTP/IMAP users,
nor to people using Atom feed readers.

So maybe just burying the code link in the "mirror" link is
fine.  I could never be comfortable with promoting anything I
do, anyways.

Eric Wong (2):
  www: move mirror instructions to /text/
  www_stream: description header links to top $INBOX_URL

 lib/PublicInbox/WwwListing.pm |   4 +-
 lib/PublicInbox/WwwStream.pm  | 118 +++---------------------------
 lib/PublicInbox/WwwText.pm    | 134 +++++++++++++++++++++++++++++++++-
 t/psgi_mount.t                |  11 +--
 4 files changed, 145 insertions(+), 122 deletions(-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/2] www: move mirror instructions to /text/
  2021-08-28 11:50     ` [PATCH 0/2] www: split out mirror to /text/ Eric Wong
@ 2021-08-28 11:50       ` Eric Wong
  2021-08-28 11:50       ` [PATCH 2/2] www_stream: description header links to top $INBOX_URL Eric Wong
  2021-08-28 17:58       ` [PATCH 0/2] www: split out mirror to /text/ Konstantin Ryabitsev
  2 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-28 11:50 UTC (permalink / raw)
  To: meta; +Cc: Konstantin Ryabitsev

This makes the mirroring and code retrieval instructions less
obstructive.  Relying on WwwText means we only use our Linkify
module to make hrefs of full URLs, which puts relative and shortened
hrefs off-limits; hopefully this isn't too much of a problem.

coderepo information remains duplicated on every page since
(IMHO) coderepos are an important feature; but nobody besides me
has ever bothered to configure coderepos, so I suppose it's
fine...

Suggested-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20210826132747.6gxuwnhftyf7c6hp@nitro.local/
---
 lib/PublicInbox/WwwListing.pm |   4 +-
 lib/PublicInbox/WwwStream.pm  | 116 +++--------------------------
 lib/PublicInbox/WwwText.pm    | 134 +++++++++++++++++++++++++++++++++-
 t/psgi_mount.t                |  11 +--
 4 files changed, 143 insertions(+), 122 deletions(-)

diff --git a/lib/PublicInbox/WwwListing.pm b/lib/PublicInbox/WwwListing.pm
index ef9048b5..c3779619 100644
--- a/lib/PublicInbox/WwwListing.pm
+++ b/lib/PublicInbox/WwwListing.pm
@@ -226,9 +226,7 @@ sub psgi_triple {
 	} else {
 		$gzf->zmore('<pre>no inboxes, yet');
 	}
-	my $out = $gzf->zflush('</pre><hr><pre>'.
-			PublicInbox::WwwStream::code_footer($ctx->{env}) .
-			'</pre></body></html>');
+	my $out = $gzf->zflush('</pre></body></html>');
 	$h->[3] = length($out);
 	[ $code, $h, [ $out ] ];
 }
diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index d8142824..c960edc5 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -11,7 +11,6 @@ use v5.10.1;
 use parent qw(Exporter PublicInbox::GzipFilter);
 our @EXPORT_OK = qw(html_oneshot);
 use PublicInbox::Hval qw(ascii_html prurl ts2str);
-our $TOR_URL = 'https://www.torproject.org/';
 
 our $CODE_URL = [ qw(
 http://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/public-inbox.git
@@ -42,8 +41,6 @@ sub html_top ($) {
 	my $desc = ascii_html($ibx->description);
 	my $title = delete($ctx->{-title_html}) // $desc;
 	my $upfx = $ctx->{-upfx} || '';
-	my $help = $upfx.'_/text/help/';
-	my $color = $upfx.'_/text/color/';
 	my $atom = $ctx->{-atom} || $upfx.'new.atom';
 	my $top = "<b>$desc</b>";
 	if (my $t_max = $ctx->{-t_max}) {
@@ -54,9 +51,11 @@ sub html_top ($) {
 		$top = qq(<a\nhref="./">$top</a>);
 	}
 	my $code = $ibx->{coderepo} ? qq( / <a\nhref=#code>code</a>) : '';
-	my $links = qq(<a\nhref="$help">help</a> / ).
-			qq(<a\nhref="$color">color</a> / ).
-			qq(<a\nhref=#mirror>mirror</a>$code / ).
+	# id=mirror must exist for legacy bookmarks
+	my $links = qq(<a\nhref="${upfx}_/text/help/">help</a> / ).
+			qq(<a\nhref="${upfx}_/text/color/">color</a> / ).
+			qq(<a\nid=mirror) .
+			qq(\nhref="${upfx}_/text/mirror/">mirror</a>$code / ).
 			qq(<a\nhref="$atom">Atom feed</a>);
 	if ($ibx->isrch) {
 		my $q_val = delete($ctx->{-q_value_html}) // '';
@@ -106,109 +105,12 @@ EOF
 	@ret; # may be empty, this sub is called as an arg for join()
 }
 
-sub code_footer ($) {
-	my ($env) = @_;
-	my $u = prurl($env, $CODE_URL);
-	my $arg = $u;
-	if ($arg =~ s!\A(https?://)([^/\.]+)\.onion/!$1\$hostname\.onion/!i) {
-		"AGPL code for this site:\n\thostname=$2\n\t" .
-		qq(torsocks git clone <a\nhref="$u">$arg</a>)
-	} else {
-		qq(AGPL code for this site: git clone <a\nhref="$u">$u</a>)
-	}
-}
-
 sub _html_end {
 	my ($ctx) = @_;
-	my $ibx = $ctx->{ibx};
-	my $desc = ascii_html($ibx->description);
-	my $s = "<a\nid=mirror>";
-	my $http = $ctx->{base_url};
-	my $dir = (split(m!/!, $http))[-1];
-	my %seen = ($http => 1);
-	if ($ibx->can('cloneurl')) { # PublicInbox::Inbox
-		$s .= "This inbox may be cloned and mirrored by anyone:</a>\n";
-		my @urls;
-		my $max = $ibx->max_git_epoch;
-		# TODO: some of these URLs may be too long and we may need to
-		# do something like code_footer() above, but these are local
-		# admin-defined
-		if (defined($max)) { # v2
-			for my $i (0..$max) {
-				# old epochs my be deleted:
-				-d "$ibx->{inboxdir}/git/$i.git" or next;
-				my $url = "$http/$i";
-				$seen{$url} = 1;
-				push @urls, "$url $dir/git/$i.git";
-			}
-			my $nr = scalar(@urls);
-			if ($nr > 1) {
-				$s .= "\n\t";
-				$s .= "# this inbox consists of $nr epochs:";
-				$urls[0] .= "\t# oldest";
-				$urls[-1] .= "\t# newest";
-			}
-		} else { # v1
-			push @urls, $http;
-		}
-		# FIXME: epoch splits can be different in other repositories,
-		# use the "cloneurl" file as-is for now:
-		for my $u (@{$ibx->cloneurl}) {
-			next if $seen{$u}++;
-			push @urls, ($u =~ /\Ahttps?:/ ?
-					qq(<a\nhref="$u">$u</a>) : $u);
-		}
-		$s .= "\n";
-		$s .= join('', map { "\tgit clone --mirror $_\n" } @urls);
-		if (my $addrs = $ibx->{address}) {
-			$addrs = join(' ', @$addrs) if ref($addrs) eq 'ARRAY';
-			my $v = defined $max ? '-V2' : '-V1';
-			$s .= <<EOF;
-
-	# If you have public-inbox 1.1+ installed, you may
-	# initialize and index your mirror using the following commands:
-	public-inbox-init $v $ibx->{name} $dir/ $http \\
-		$addrs
-	public-inbox-index $dir
-EOF
-		}
-	} else { # PublicInbox::ExtSearch
-		$s .= <<EOM;
-This is an extindex which is an amalgamation of several public-inboxes</a>
-EOM
-		my $v = $ctx->{www}->{pi_cfg}->{lc('publicInbox.wwwListing')};
-		if (($v // '') =~ /\A(?:all|match=domain)\z/) {
-			my $upfx = ($ctx->{-upfx} // ''). '../';
-			$s .= <<EOM;
-A list of them is available in the <a\nhref="$upfx">listing</a>
-EOM
-		}
-	}
-
-	my $cfg_link = ($ctx->{-upfx} // '').'_/text/config/raw';
-	$s .= <<EOF;
-
-Example <a
-href="$cfg_link">config snippet</a> for mirrors.
-EOF
-	if ($ibx->can('nntp_url')) {
-		my @nntp = map { qq(<a\nhref="$_">$_</a>) } @{$ibx->nntp_url};
-		if (@nntp) {
-			$s .= @nntp == 1 ? 'Newsgroup' : 'Newsgroups are';
-			$s .= ' available over NNTP:';
-			$s .= "\n\t" . join("\n\t", @nntp) . "\n";
-		}
-	}
-	if ($s =~ m!\b[^:]+://\w+\.onion/!) {
-		$s .= " note: .onion URLs require Tor: ";
-		$s .= qq[<a\nhref="$TOR_URL">$TOR_URL</a>];
-	}
-	'<hr><pre>'.join("\n\n",
-		$desc,
-		$s,
-		coderepos($ctx),
-		code_footer($ctx->{env})
-	).'</pre></body></html>';
+	my @cr = coderepos($ctx);
+	scalar(@cr) ?
+		'<hr><pre>'.join("\n\n", @cr).'</pre></body></html>' :
+		'</body></html>';
 }
 
 # callback for HTTP.pm (and any other PSGI servers)
diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index 47310258..858fc2f7 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -7,7 +7,7 @@ use strict;
 use v5.10.1;
 use PublicInbox::Linkify;
 use PublicInbox::WwwStream;
-use PublicInbox::Hval qw(ascii_html);
+use PublicInbox::Hval qw(ascii_html prurl);
 use URI::Escape qw(uri_escape_utf8);
 use PublicInbox::GzipFilter qw(gzf_maybe);
 our $QP_URL = 'https://xapian.org/docs/queryparser.html';
@@ -23,7 +23,7 @@ sub get_text {
 	my ($ctx, $key) = @_;
 	my $code = 200;
 
-	$key = 'help' if !defined $key; # this 302s to _/text/help/
+	$key //= 'help'; # this 302s to _/text/help/
 
 	# get the raw text the same way we get mboxrds
 	my $raw = ($key =~ s!/raw\z!!);
@@ -240,12 +240,138 @@ EOS
 	1;
 }
 
+sub coderepos_raw ($$) {
+	my ($ctx, $top_url) = @_;
+	my $cr = $ctx->{ibx}->{coderepo} // return ();
+	my $cfg = $ctx->{www}->{pi_cfg};
+	my @ret;
+	for my $cr_name (@$cr) {
+		$ret[0] //= <<EOF;
+code repositories for project(s) associated with this inbox:
+EOF
+		my $urls = $cfg->get_all("coderepo.$cr_name.cgiturl");
+		if ($urls) {
+			for (@$urls) {
+				# relative or absolute URL?, prefix relative
+				# "foo.git" with appropriate number of "../"
+				my $u = m!\A(?:[a-z\+]+:)?//!i ? $_ :
+					$top_url.$_;
+				$ret[0] .= "\n\t" . prurl($ctx->{env}, $u);
+			}
+		} else {
+			$ret[0] .= qq[\n\t$cr_name.git (no URL configured)];
+		}
+	}
+	@ret; # may be empty, this sub is called as an arg for join()
+}
+
+sub _mirror_help ($$) {
+	my ($ctx, $txt) = @_;
+	my $ibx = $ctx->{ibx};
+	my $base_url = $ibx->base_url($ctx->{env});
+	chop $base_url; # no trailing slash for "git clone"
+	my $dir = (split(m!/!, $base_url))[-1];
+	my %seen = ($base_url => 1);
+	my $top_url = $base_url;
+	$top_url =~ s!/[^/]+\z!/!;
+	$$txt .= "public-inbox mirroring instructions\n\n";
+	if ($ibx->can('cloneurl')) { # PublicInbox::Inbox
+		$$txt .= "This inbox may be cloned and mirrored by anyone:\n";
+		my @urls;
+		my $max = $ibx->max_git_epoch;
+		# TODO: some of these URLs may be too long and we may need to
+		# do something like code_footer() above, but these are local
+		# admin-defined
+		if (defined($max)) { # v2
+			for my $i (0..$max) {
+				# old epochs may be deleted:
+				-d "$ibx->{inboxdir}/git/$i.git" or next;
+				my $url = "$base_url/$i";
+				$seen{$url} = 1;
+				push @urls, "$url $dir/git/$i.git";
+			}
+			my $nr = scalar(@urls);
+			if ($nr > 1) {
+				$$txt .= "\n\t";
+				$$txt .= "# this inbox consists of $nr epochs:";
+				$urls[0] .= " # oldest";
+				$urls[-1] .= " # newest";
+			}
+		} else { # v1
+			push @urls, $base_url;
+		}
+		# FIXME: epoch splits can be different in other repositories,
+		# use the "cloneurl" file as-is for now:
+		for my $u (@{$ibx->cloneurl}) {
+			next if $seen{$u}++;
+			push @urls, $u;
+		}
+		$$txt .= "\n";
+		$$txt .= join('', map { "\tgit clone --mirror $_\n" } @urls);
+		if (my $addrs = $ibx->{address}) {
+			$addrs = join(' ', @$addrs) if ref($addrs) eq 'ARRAY';
+			my $v = defined $max ? '-V2' : '-V1';
+			$$txt .= <<EOF;
+
+	# If you have public-inbox 1.1+ installed, you may
+	# initialize and index your mirror using the following commands:
+	public-inbox-init $v $ibx->{name} $dir/ $base_url \\
+		$addrs
+	public-inbox-index $dir
+EOF
+		}
+	} else { # PublicInbox::ExtSearch
+		$$txt .= <<EOM;
+This is an extindex which is an amalgamation of several public-inboxes.
+Each public-inbox needs to be mirrored individually.
+EOM
+		my $v = $ctx->{www}->{pi_cfg}->{lc('publicInbox.wwwListing')};
+		if (($v // '') =~ /\A(?:all|match=domain)\z/) {
+			$$txt .= <<EOM;
+A list of them is available at $top_url
+EOM
+		}
+	}
+	my $cfg_link = "$base_url/_/text/config/raw";
+	$$txt .= <<EOF;
+
+Example config snippet for mirrors: $cfg_link
+EOF
+	if ($ibx->can('nntp_url')) {
+		my $nntp = $ibx->nntp_url;
+		if (scalar @$nntp) {
+			$$txt .= "\n";
+			$$txt .= @$nntp == 1 ? 'Newsgroup' : 'Newsgroups are';
+			$$txt .= ' available over NNTP:';
+			$$txt .= "\n\t" . join("\n\t", @$nntp) . "\n";
+		}
+	}
+	if ($$txt =~ m!\b[^:]+://\w+\.onion/!) {
+		$$txt .= <<EOM
+
+note: .onion URLs require Tor: https://www.torproject.org/
+
+EOM
+	}
+	my $code_url = prurl($ctx->{env}, $PublicInbox::WwwStream::CODE_URL);
+	$$txt .= join("\n\n",
+		coderepos_raw($ctx, $top_url), # may be empty
+		"AGPL code for this site:\n\tgit clone $code_url");
+	1;
+}
+
 sub _default_text ($$$$) {
 	my ($ctx, $key, $hdr, $txt) = @_;
-	return _colors_help($ctx, $txt) if $key eq 'color';
-	$key eq 'config' and return $ctx->{ibx}->can('cloneurl') ?
+	if ($key eq 'mirror') {
+		return _mirror_help($ctx, $txt);
+	} elsif ($key eq 'color') {
+		return _colors_help($ctx, $txt);
+	} elsif ($key eq 'config') {
+		return $ctx->{ibx}->can('cloneurl') ?
 			inbox_config($ctx, $hdr, $txt) :
 			extindex_config($ctx, $hdr, $txt);
+	}
+
 	return if $key ne 'help'; # TODO more keys?
 
 	my $ibx = $ctx->{ibx};
diff --git a/t/psgi_mount.t b/t/psgi_mount.t
index e9547c15..7c5487f3 100644
--- a/t/psgi_mount.t
+++ b/t/psgi_mount.t
@@ -48,14 +48,9 @@ test_psgi($app, sub {
 	unlike($res->content, qr!\b\Qhttp://[^/]+/test/\E!,
 		'No URLs which are not mount-aware');
 
-	$res = $cb->(GET('/a/test/new.html'));
-	like($res->content, qr!git clone --mirror http://[^/]+/a/test\b!,
-		'clone URL in new.html is mount-aware');
-
-	$res = $cb->(GET('/a/test/blah%40example.com/'));
-	is($res->code, 200, 'OK with URLMap mount');
-	like($res->content, qr!git clone --mirror http://[^/]+/a/test\b!,
-		'clone URL in /$INBOX/$MESSAGE_ID/ is mount-aware');
+	$res = $cb->(GET('/a/test/_/text/mirror/'));
+	like($res->content, qr!git clone --mirror\s+.*?http://[^/]+/a/test\b!s,
+		'clone URL in /text/mirror is mount-aware');
 
 	$res = $cb->(GET('/a/test/blah%40example.com/raw'));
 	is($res->code, 200, 'OK with URLMap mount');

^ permalink raw reply related	[flat|nested] 21+ messages in thread
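
As a rough illustration of what _mirror_help in the patch above emits for a
v2 inbox, the sketch below builds one "git clone --mirror" line per epoch and
marks the oldest and newest.  `clone_lines' and the example.com URL are
hypothetical; the real code also skips epochs whose git directory no longer
exists, which this sketch leaves to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: given a base URL without a trailing slash and a
# list of epoch numbers, return the clone command lines for the page.
sub clone_lines {
	my ($base_url, @epochs) = @_;
	# the last path component doubles as the local directory name
	my $dir = (split(m!/!, $base_url))[-1];
	my @urls = map { "$base_url/$_ $dir/git/$_.git" } @epochs;
	if (@urls > 1) {
		$urls[0] .= ' # oldest';
		$urls[-1] .= ' # newest';
	}
	map { "\tgit clone --mirror $_\n" } @urls;
}

print clone_lines('https://example.com/test', 0, 1, 2);
```

Every epoch adds a full-length URL to the output, so an inbox with many
epochs or a long .onion hostname grows this block quickly; that is the
width and length problem the move to /text/mirror/ is meant to contain.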

* [PATCH 2/2] www_stream: description header links to top $INBOX_URL
  2021-08-28 11:50     ` [PATCH 0/2] www: split out mirror to /text/ Eric Wong
  2021-08-28 11:50       ` [PATCH 1/2] www: move mirror instructions " Eric Wong
@ 2021-08-28 11:50       ` Eric Wong
  2021-08-28 17:58       ` [PATCH 0/2] www: split out mirror to /text/ Konstantin Ryabitsev
  2 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-28 11:50 UTC (permalink / raw)
  To: meta

Making the inbox description link back to the most recent
per-inbox topics from text/ and $OID/s/ URLs seems useful,
rather than keeping the description up there.

Followup-to: 6c853f5256f3a324 ("www: improve navigation around contemporary threads")
---
 lib/PublicInbox/WwwStream.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index c960edc5..472316c2 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -49,6 +49,8 @@ sub html_top ($) {
 	# we had some kind of query, link to /$INBOX/?t=YYYYMMDDhhmmss
 	} elsif ($ctx->{qp}->{t}) {
 		$top = qq(<a\nhref="./">$top</a>);
+	} elsif (length($upfx)) {
+		$top = qq(<a\nhref="$upfx">$top</a>);
 	}
 	my $code = $ibx->{coderepo} ? qq( / <a\nhref=#code>code</a>) : '';
 	# id=mirror must exist for legacy bookmarks

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/2] www: split out mirror to /text/
  2021-08-28 11:50     ` [PATCH 0/2] www: split out mirror to /text/ Eric Wong
  2021-08-28 11:50       ` [PATCH 1/2] www: move mirror instructions " Eric Wong
  2021-08-28 11:50       ` [PATCH 2/2] www_stream: description header links to top $INBOX_URL Eric Wong
@ 2021-08-28 17:58       ` Konstantin Ryabitsev
  2021-08-30 23:44         ` [PATCH 0/3] www: more footer and mirroring instructions tweaks Eric Wong
  2 siblings, 1 reply; 21+ messages in thread
From: Konstantin Ryabitsev @ 2021-08-28 17:58 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sat, Aug 28, 2021 at 11:50:05AM +0000, Eric Wong wrote:
> Maybe this works better?  *shrug*
> 
> I was somewhat under the impression every page needed to
> link/advertise code under the AGPL, but maybe just one HTML page
> is fine...  We can't advertise our AGPL code to NNTP/IMAP users,
> nor to people using Atom feed readers.

I think this looks great, Eric. You went a bit further than I suggested,
though -- I do think you should keep the following at the bottom of each page:

<hr>
<a href="../../_/text/mirror/">Get full contents and AGPL source for this site.</a>

Or some similar wording. I do think something like that belongs on each view
just to stress that everything about the site is fully available for
replication.

-K

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 0/3] www: more footer and mirroring instructions tweaks
  2021-08-28 17:58       ` [PATCH 0/2] www: split out mirror to /text/ Konstantin Ryabitsev
@ 2021-08-30 23:44         ` Eric Wong
  2021-08-30 23:44           ` [PATCH 1/3] www_stream: extra link to mirroring information in the footer Eric Wong
                             ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-30 23:44 UTC (permalink / raw)
  To: meta; +Cc: Konstantin Ryabitsev

Some of the wording may still need tweaking; I'm favoring
the data aspect over the code aspect of mirroring since
AGPL probably scares some people.

Not really sure about 3/3, or whether including grokmirror
instructions is out-of-scope for this project.

Eric Wong (3):
  www_stream: extra link to mirroring information in the footer
  www_text/mirror: spell out "external index" and "public inbox"
  www_listing: add note about mirroring information

 lib/PublicInbox/WwwListing.pm |  5 ++++-
 lib/PublicInbox/WwwStream.pm  | 24 +++++++++++++++++++-----
 lib/PublicInbox/WwwText.pm    | 15 ++++++++++-----
 3 files changed, 33 insertions(+), 11 deletions(-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/3] www_stream: extra link to mirroring information in the footer
  2021-08-30 23:44         ` [PATCH 0/3] www: more footer and mirroring instructions tweaks Eric Wong
@ 2021-08-30 23:44           ` Eric Wong
  2021-08-30 23:44           ` [PATCH 2/3] www_text/mirror: spell out "external index" and "public inbox" Eric Wong
  2021-08-30 23:44           ` [PATCH 3/3] www_listing: add note about mirroring information Eric Wong
  2 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-30 23:44 UTC (permalink / raw)
  To: meta; +Cc: Konstantin Ryabitsev

This may be redundant with the "mirror" link at the top right,
but maybe people will miss one.  Properly capitalize the
"Code repositories" text while we're at it.

Link: https://public-inbox.org/20210828175827.rgzwqbn7brl56oej@nitro.local/
Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
---
 lib/PublicInbox/WwwStream.pm | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index 472316c2..a88ff972 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -89,7 +89,7 @@ sub coderepos ($) {
 	my @ret;
 	for my $cr_name (@$cr) {
 		$ret[0] //= <<EOF;
-<a id=code>code repositories for project(s) associated with this inbox:
+<a id=code>Code repositories for project(s) associated with this inbox:
 EOF
 		my $urls = $cfg->get_all("coderepo.$cr_name.cgiturl");
 		if ($urls) {
@@ -109,10 +109,24 @@ EOF
 
 sub _html_end {
 	my ($ctx) = @_;
-	my @cr = coderepos($ctx);
-	scalar(@cr) ?
-		'<hr><pre>'.join("\n\n", @cr).'</pre></body></html>' :
-		'</body></html>';
+	my $upfx = $ctx->{-upfx} || '';
+	my $m = "${upfx}_/text/mirror/";
+	my $x;
+	if ($ctx->{ibx}->can('cloneurl')) {
+		$x = <<EOF;
+This is a public inbox, see <a
+href="$m">mirroring instructions</a>
+on how to clone and mirror all data and code used for this inbox
+EOF
+	} else {
+		$x = <<EOF;
+This is an external index of several public inboxes,
+see <a href="$m">mirroring instructions</a> on how to clone and mirror
+all data and code used by this external index.
+EOF
+	}
+	chomp $x;
+	'<hr><pre>'.join("\n\n", coderepos($ctx), $x).'</pre></body></html>'
 }
 
 # callback for HTTP.pm (and any other PSGI servers)
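
A rough sketch of how the footer above gets assembled, runnable
without public-inbox installed. Demo::Inbox, Demo::ExtIndex and
footer_html() are hypothetical stand-ins, and the wording is shortened
compared to the patch. Only the ->can('cloneurl') check and the join
mirror the hunk.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two kinds of $ctx->{ibx} objects;
# only the presence of a cloneurl method matters for this sketch.
package Demo::Inbox { sub new { bless {}, shift } sub cloneurl { [] } }
package Demo::ExtIndex { sub new { bless {}, shift } }

# $upfx: relative prefix to the inbox root (assumed, as in the patch)
# @coderepos: pre-rendered code repository paragraphs, may be empty
sub footer_html {
	my ($ibx, $upfx, @coderepos) = @_;
	my $m = "${upfx}_/text/mirror/";
	my $what = $ibx->can('cloneurl') ?
		'a public inbox' : 'an external index';
	my $note = qq(This is $what, see <a\nhref="$m">mirroring instructions</a>);
	'<hr><pre>'.join("\n\n", @coderepos, $note).'</pre></body></html>';
}

print footer_html(Demo::Inbox->new, '../', 'Code repositories ...'), "\n";
print footer_html(Demo::ExtIndex->new, ''), "\n";
```

Since the note is always the last element passed to join, the
<hr><pre> wrapper is emitted even when there are no code repositories,
which differs from the old behavior of emitting only </body></html>.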

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/3] www_text/mirror: spell out "external index" and "public inbox"
  2021-08-30 23:44         ` [PATCH 0/3] www: more footer and mirroring instructions tweaks Eric Wong
  2021-08-30 23:44           ` [PATCH 1/3] www_stream: extra link to mirroring information in the footer Eric Wong
@ 2021-08-30 23:44           ` Eric Wong
  2021-08-30 23:44           ` [PATCH 3/3] www_listing: add note about mirroring information Eric Wong
  2 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-30 23:44 UTC (permalink / raw)
  To: meta

"extindex" and "public-inbox" are project-specific terms which
are probably unsuitable for folks who are seeing this for the
first time.

Use "public inbox" when referring to actual public inboxes,
since "public-inbox" is merely the name for this particular
implementation and others have adopted the same concept (IMHO
the concept is more important than any particular
implementation).
---
 lib/PublicInbox/WwwText.pm | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index 858fc2f7..fabe39f6 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -246,9 +246,13 @@ sub coderepos_raw ($$) {
 	my $cfg = $ctx->{www}->{pi_cfg};
 	my @ret;
 	for my $cr_name (@$cr) {
-		$ret[0] //= <<EOF;
-code repositories for project(s) associated with this inbox:
+		$ret[0] //= do {
+			my $thing = $ctx->{ibx}->can('cloneurl') ?
+				'public inbox' : 'external index';
+			<<EOF;
+Code repositories for project(s) associated with this $thing
 EOF
+		};
 		my $urls = $cfg->get_all("coderepo.$cr_name.cgiturl");
 		if ($urls) {
 			for (@$urls) {
@@ -276,7 +280,8 @@ sub _mirror_help ($$) {
 	$top_url =~ s!/[^/]+\z!/!;
 	$$txt .= "public-inbox mirroring instructions\n\n";
 	if ($ibx->can('cloneurl')) { # PublicInbox::Inbox
-		$$txt .= "This inbox may be cloned and mirrored by anyone:\n";
+		$$txt .=
+		  "This public inbox may be cloned and mirrored by anyone:\n";
 		my @urls;
 		my $max = $ibx->max_git_epoch;
 		# TODO: some of these URLs may be too long and we may need to
@@ -322,8 +327,8 @@ EOF
 		}
 	} else { # PublicInbox::ExtSearch
 		$$txt .= <<EOM;
-This is an extindex which is an amalgamation of several public-inboxes.
-Each public-inbox needs to be mirrored individually.
+This is an external index which is an amalgamation of several public inboxes.
+Each public inbox needs to be mirrored individually.
 EOM
 		my $v = $ctx->{www}->{pi_cfg}->{lc('publicInbox.wwwListing')};
 		if (($v // '') =~ /\A(?:all|match=domain)\z/) {
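
The `$ret[0] //= do { ... }' construct in the first hunk above builds
the header lazily, and only once. A minimal sketch of that idiom
follows; repo_lines() is a hypothetical helper, and the entries it
pushes are simplified compared to what coderepos_raw emits.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a header line followed by one line per
# name, or an empty list when there are no names at all.
sub repo_lines {
	my ($thing, @names) = @_;
	my @ret;
	for my $name (@names) {
		# the do block only runs while $ret[0] is still undef
		$ret[0] //= do {
			<<EOF;
Code repositories for project(s) associated with this $thing
EOF
		};
		push @ret, "\t$name\n";
	}
	@ret;
}

print repo_lines('external index', 'public-inbox.git');
```

Keeping the header inside the loop means nothing is emitted when the
list of names is empty, so callers do not need a separate check.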

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/3] www_listing: add note about mirroring information
  2021-08-30 23:44         ` [PATCH 0/3] www: more footer and mirroring instructions tweaks Eric Wong
  2021-08-30 23:44           ` [PATCH 1/3] www_stream: extra link to mirroring information in the footer Eric Wong
  2021-08-30 23:44           ` [PATCH 2/3] www_text/mirror: spell out "external index" and "public inbox" Eric Wong
@ 2021-08-30 23:44           ` Eric Wong
  2 siblings, 0 replies; 21+ messages in thread
From: Eric Wong @ 2021-08-30 23:44 UTC (permalink / raw)
  To: meta

Perhaps this can be expanded to include grokmirror information
in the future.  For now, just give a hint about the "mirror"
link for each inbox.
---
 lib/PublicInbox/WwwListing.pm | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/WwwListing.pm b/lib/PublicInbox/WwwListing.pm
index c3779619..a9290802 100644
--- a/lib/PublicInbox/WwwListing.pm
+++ b/lib/PublicInbox/WwwListing.pm
@@ -226,7 +226,10 @@ sub psgi_triple {
 	} else {
 		$gzf->zmore('<pre>no inboxes, yet');
 	}
-	my $out = $gzf->zflush('</pre></body></html>');
+	my $out = $gzf->zflush('</pre><hr><pre>'.
+qq(This is a listing of public inboxes, see the `mirror' link of each inbox
+for instructions on how to mirror all the data and code on this site.) .
+			'</pre></body></html>');
 	$h->[3] = length($out);
 	[ $code, $h, [ $out ] ];
 }

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-08-30 23:44 UTC | newest]

Thread overview: 21+ messages
2021-08-16 16:36 RFE: Long .onion URL breaks mobile view Konstantin Ryabitsev
2021-08-16 22:38 ` Eric Wong
2021-08-16 22:53   ` Eric Wong
2021-08-26 12:33 ` [PATCH 0/8] various WWW + extindex stuff Eric Wong
2021-08-26 12:33   ` [PATCH 1/8] get rid of unnecessary bytes::length usage Eric Wong
2021-08-26 12:33   ` [PATCH 2/8] ds: use bytes::substr and bytes::length module-wide for now Eric Wong
2021-08-26 12:33   ` [PATCH 3/8] www_stream: sh-friendly .onion URLs wrapping Eric Wong
2021-08-26 12:33   ` [PATCH 4/8] www: avoid incorrect instructions for extindex Eric Wong
2021-08-26 12:33   ` [PATCH 5/8] www_text: fix example config snippet " Eric Wong
2021-08-26 12:33   ` [PATCH 6/8] config: do not parse altid " Eric Wong
2021-08-26 12:33   ` [PATCH 7/8] www_text: add coderepo config support " Eric Wong
2021-08-26 12:33   ` [PATCH 8/8] move ->ids_after from mm to over Eric Wong
2021-08-26 13:27   ` [PATCH 0/8] various WWW + extindex stuff Konstantin Ryabitsev
2021-08-28 11:50     ` [PATCH 0/2] www: split out mirror to /text/ Eric Wong
2021-08-28 11:50       ` [PATCH 1/2] www: move mirror instructions " Eric Wong
2021-08-28 11:50       ` [PATCH 2/2] www_stream: description header links to top $INBOX_URL Eric Wong
2021-08-28 17:58       ` [PATCH 0/2] www: split out mirror to /text/ Konstantin Ryabitsev
2021-08-30 23:44         ` [PATCH 0/3] www: more footer and mirroring instructions tweaks Eric Wong
2021-08-30 23:44           ` [PATCH 1/3] www_stream: extra link to mirroring information in the footer Eric Wong
2021-08-30 23:44           ` [PATCH 2/3] www_text/mirror: spell out "external index" and "public inbox" Eric Wong
2021-08-30 23:44           ` [PATCH 3/3] www_listing: add note about mirroring information Eric Wong
