* [PATCH 0/7] improved thread views and 404 reductions
@ 2015-09-02 6:59 Eric Wong
2015-09-02 6:59 ` [PATCH 1/7] view: close possible race condition in thread view Eric Wong
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
The thread HTML view may now be flat (chronological, newest
first) to make active threads easier to follow. We also make
unknown Message-IDs more usable by not running SHA-1 on them,
since a hashed Message-ID cannot be looked up in other archives.
The Message-ID finder is also handy for cross-posts
and could probably link to multiple external sources such as
mid.gmane.org and other places.
Eric Wong (7):
view: close possible race condition in thread view
view: optional flat view for recent messages
view: account for missing In-Reply-To header
view: simplify parent anchoring code
view: pre-anchor entries for flat view
view: avoid links to unknown compressed Message-IDs
implement external Message-ID finder
lib/PublicInbox/ExtMsg.pm | 92 ++++++++++++++++++++++++
lib/PublicInbox/Hval.pm | 4 +-
lib/PublicInbox/View.pm | 180 ++++++++++++++++++++++++++++++----------------
lib/PublicInbox/WWW.pm | 18 +++--
public-inbox.cgi | 1 +
5 files changed, 226 insertions(+), 69 deletions(-)
* [PATCH 1/7] view: close possible race condition in thread view
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
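The Xapian index and git HEAD may be out of sync, so a message
which existed when we did the search may no longer be accessible
by the time we render it. Defer the 200 response until the first
message is actually loaded, and return the missing-thread response
if none could be loaded.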
---
lib/PublicInbox/View.pm | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 6aa199e..1eb12a9 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -144,7 +144,7 @@ sub emit_thread_html {
my $msgs = load_results($res);
my $nr = scalar @$msgs;
return missing_thread($cb) if $nr == 0;
- my $fh = $cb->([200,['Content-Type'=>'text/html; charset=UTF-8']]);
+ my $orig_cb = $cb;
my $th = thread_results($msgs);
my $state = {
ctx => $ctx,
@@ -155,18 +155,23 @@ sub emit_thread_html {
{
require PublicInbox::GitCatFile;
my $git = PublicInbox::GitCatFile->new($ctx->{git_dir});
- thread_entry($fh, $git, $state, $_, 0) for $th->rootset;
+ thread_entry(\$cb, $git, $state, $_, 0) for $th->rootset;
}
Email::Address->purge_cache;
+
+ # there could be a race due to a message being deleted in git
+ # but still being in the Xapian index:
+ return missing_thread($cb) if ($orig_cb eq $cb);
+
my $final_anchor = $state->{anchor_idx};
my $next = "<a\nid=\"s$final_anchor\">";
$next .= $final_anchor == 1 ? 'only message in' : 'end of';
$next .= " thread</a>, back to <a\nhref=\"../../\">index</a>\n";
$next .= "download thread: <a\nhref=\"../t.mbox.gz\">mbox.gz</a>";
$next .= " / follow: <a\nhref=\"../t.atom\">Atom feed</a>\n\n";
- $fh->write("<hr />" . PRE_WRAP . $next . $foot .
+ $cb->write("<hr />" . PRE_WRAP . $next . $foot .
"</pre></body></html>");
- $fh->close;
+ $cb->close;
}
sub index_walk {
@@ -536,14 +541,16 @@ sub anchor_for {
}
sub thread_html_head {
- my ($mime) = @_;
+ my ($cb, $mime) = @_;
+ $$cb = $$cb->([200, ['Content-Type'=> 'text/html; charset=UTF-8']]);
+
my $s = PublicInbox::Hval->new_oneline($mime->header('Subject'));
$s = $s->as_html;
- "<html><head><title>$s</title></head><body>";
+ $$cb->write("<html><head><title>$s</title></head><body>");
}
sub thread_entry {
- my ($fh, $git, $state, $node, $level) = @_;
+ my ($cb, $git, $state, $node, $level) = @_;
return unless $node;
if (my $mime = $node->message) {
@@ -552,13 +559,13 @@ sub thread_entry {
$mime = eval { Email::MIME->new($git->cat_file("HEAD:$path")) };
if ($mime) {
if ($state->{anchor_idx} == 0) {
- $fh->write(thread_html_head($mime));
+ thread_html_head($cb, $mime);
}
- index_entry($fh, $mime, $level, $state);
+ index_entry($$cb, $mime, $level, $state);
}
}
- thread_entry($fh, $git, $state, $node->child, $level + 1);
- thread_entry($fh, $git, $state, $node->next, $level);
+ thread_entry($cb, $git, $state, $node->child, $level + 1);
+ thread_entry($cb, $git, $state, $node->next, $level);
}
sub load_results {
--
EW
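The deferred-response idea in patch 1/7 can be illustrated outside of public-inbox. The following is only a minimal sketch: `SketchWriter`, `emit_entry` and `emit_thread` are hypothetical names, and the check uses `ref(...) eq 'CODE'` instead of the patch's `$orig_cb eq $cb` comparison. It shows that passing a reference to the responder lets the caller tell afterwards whether any header was sent.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a streaming writer; not public-inbox code.
package SketchWriter;
sub new { bless { buf => '' }, shift }
sub write { $_[0]{buf} .= $_[1] }
sub close { $_[0]{closed} = 1 }

package main;

# $cb is a reference to the responder. The first successful load
# sends the 200 header and swaps the responder for a writer object.
sub emit_entry {
	my ($cb, $body) = @_;
	return unless defined $body; # message vanished after the search
	if (ref($$cb) eq 'CODE') {
		$$cb = $$cb->([200, ['Content-Type' => 'text/html; charset=UTF-8']]);
	}
	$$cb->write($body);
}

sub emit_thread {
	my ($responder, @bodies) = @_;
	my $cb = $responder;
	emit_entry(\$cb, $_) for @bodies;

	# Nothing was loadable: the responder was never invoked,
	# so a 404 can still be sent.
	if (ref($cb) eq 'CODE') {
		return $responder->([404, ['Content-Type' => 'text/plain'],
					['Not found']]);
	}
	$cb->close;
	$cb;
}
```

Because the header is only sent once a message body exists, a thread whose messages all disappeared from git never commits to a 200 status.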
* [PATCH 2/7] view: optional flat view for recent messages
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
For still-active threads, it will likely be easier to follow
them chronologically, especially if we have links to parent
messages.
---
lib/PublicInbox/View.pm | 67 ++++++++++++++++++++++++++++++++++---------------
lib/PublicInbox/WWW.pm | 3 +++
2 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 1eb12a9..a3df319 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -81,7 +81,8 @@ sub index_entry {
$anchor = $seen->{$anchor_idx};
}
if ($srch) {
- $subj = "<a\nhref=\"${path}$href/t/#u\">$subj</a>";
+ my $t = $ctx->{flat} ? 'T' : 't';
+ $subj = "<a\nhref=\"${path}$href/$t/#u\">$subj</a>";
}
if ($root_anchor && $root_anchor eq $id) {
$subj = "<u\nid=\"u\">$subj</u>";
@@ -103,7 +104,9 @@ sub index_entry {
my ($fhref, $more_ref);
my $mhref = "${path}$href/";
- if ($level > 0) {
+
+ # show full messages at level == 0 in threaded view
+ if ($level > 0 || ($ctx->{flat} && $root_anchor ne $id)) {
$fhref = "${path}$href/f/";
$more_ref = \$more;
}
@@ -126,6 +129,15 @@ sub index_entry {
}
$rv .= " <a\nhref=\"$anchor\">parent</a>";
}
+ if ($srch) {
+ if ($ctx->{flat}) {
+ $rv .= " [<a\nhref=\"${path}$href/t/#u\">threaded</a>" .
+ "|<b>flat</b>]";
+ } else {
+ $rv .= " [<b>threaded</b>|" .
+ "<a\nhref=\"${path}$href/T/#u\">flat</a>]";
+ }
+ }
$fh->write($rv .= '</pre></td></tr></table>');
}
@@ -144,19 +156,24 @@ sub emit_thread_html {
my $msgs = load_results($res);
my $nr = scalar @$msgs;
return missing_thread($cb) if $nr == 0;
+ my $flat = $ctx->{flat};
my $orig_cb = $cb;
- my $th = thread_results($msgs);
my $state = {
ctx => $ctx,
seen => {},
root_anchor => anchor_for($mid),
anchor_idx => 0,
};
- {
- require PublicInbox::GitCatFile;
- my $git = PublicInbox::GitCatFile->new($ctx->{git_dir});
+
+ require PublicInbox::GitCatFile;
+ my $git = PublicInbox::GitCatFile->new($ctx->{git_dir});
+ if ($flat) {
+ __thread_entry(\$cb, $git, $state, $_, 0) for (@$msgs);
+ } else {
+ my $th = thread_results($msgs);
thread_entry(\$cb, $git, $state, $_, 0) for $th->rootset;
}
+ $git = undef;
Email::Address->purge_cache;
# there could be a race due to a message being deleted in git
@@ -166,10 +183,15 @@ sub emit_thread_html {
my $final_anchor = $state->{anchor_idx};
my $next = "<a\nid=\"s$final_anchor\">";
$next .= $final_anchor == 1 ? 'only message in' : 'end of';
- $next .= " thread</a>, back to <a\nhref=\"../../\">index</a>\n";
- $next .= "download thread: <a\nhref=\"../t.mbox.gz\">mbox.gz</a>";
- $next .= " / follow: <a\nhref=\"../t.atom\">Atom feed</a>\n\n";
- $cb->write("<hr />" . PRE_WRAP . $next . $foot .
+ $next .= " thread</a>, back to <a\nhref=\"../../\">index</a>";
+ if ($flat) {
+ $next .= " [<a\nhref=\"../t/#u\">threaded</a>|<b>flat</b>]";
+ } else {
+ $next .= " [<b>threaded</b>|<a\nhref=\"../T/#u\">flat</a>]";
+ }
+ $next .= "\ndownload thread: <a\nhref=\"../t.mbox.gz\">mbox.gz</a>";
+ $next .= " / follow: <a\nhref=\"../t.atom\">Atom feed</a>";
+ $cb->write("<hr />" . PRE_WRAP . $next . "\n\n". $foot .
"</pre></body></html>");
$cb->close;
}
@@ -549,20 +571,25 @@ sub thread_html_head {
$$cb->write("<html><head><title>$s</title></head><body>");
}
+sub __thread_entry {
+ my ($cb, $git, $state, $mime, $level) = @_;
+
+ # lazy load the full message from mini_mime:
+ my $path = mid2path(mid_clean($mime->header('Message-ID')));
+ $mime = eval { Email::MIME->new($git->cat_file("HEAD:$path")) };
+ if ($mime) {
+ if ($state->{anchor_idx} == 0) {
+ thread_html_head($cb, $mime);
+ }
+ index_entry($$cb, $mime, $level, $state);
+ }
+}
+
sub thread_entry {
my ($cb, $git, $state, $node, $level) = @_;
return unless $node;
if (my $mime = $node->message) {
-
- # lazy load the full message from mini_mime:
- my $path = mid2path(mid_clean($mime->header('Message-ID')));
- $mime = eval { Email::MIME->new($git->cat_file("HEAD:$path")) };
- if ($mime) {
- if ($state->{anchor_idx} == 0) {
- thread_html_head($cb, $mime);
- }
- index_entry($$cb, $mime, $level, $state);
- }
+ __thread_entry($cb, $git, $state, $mime, $level);
}
thread_entry($cb, $git, $state, $node->child, $level + 1);
thread_entry($cb, $git, $state, $node->next, $level);
diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index d666a1b..9ae7f7b 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -46,6 +46,9 @@ sub run {
invalid_list_mid(\%ctx, $1, $2) || get_thread_mbox(\%ctx, $sfx);
} elsif ($path_info =~ m!$LISTNAME_RE/$MID_RE/t\.atom\z!o) {
invalid_list_mid(\%ctx, $1, $2) || get_thread_atom(\%ctx);
+ } elsif ($path_info =~ m!$LISTNAME_RE/$MID_RE/T/\z!o) {
+ $ctx{flat} = 1;
+ invalid_list_mid(\%ctx, $1, $2) || get_thread(\%ctx);
# single-message pages
} elsif ($path_info =~ m!$LISTNAME_RE/$MID_RE/\z!o) {
--
EW
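The `/t/` versus `/T/` routing added in patch 2/7 can be sketched in isolation. This is a rough illustration only: the patterns below are hypothetical stand-ins, since the real `$LISTNAME_RE` and `$MID_RE` are not shown in the patch, and `route` is an invented helper that merges both endpoints into one match.

```perl
use strict;
use warnings;

# Hypothetical patterns; the real $LISTNAME_RE and $MID_RE may differ.
my $LISTNAME_RE = qr!\A/([\w\-]+)!;
my $MID_RE = qr!([^/]+)!;

# Returns a context hash for thread URLs, or undef for anything else.
sub route {
	my ($path_info) = @_;
	my %ctx;
	if ($path_info =~ m!$LISTNAME_RE/$MID_RE/([tT])/\z!) {
		@ctx{qw(listname mid)} = ($1, $2);
		$ctx{flat} = 1 if $3 eq 'T'; # upper-case T selects the flat view
		return \%ctx;
	}
	undef;
}
```

Keeping both views under the same Message-ID path means the "[threaded|flat]" toggle only has to swap one path component.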
* [PATCH 3/7] view: account for missing In-Reply-To header
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
Some mail clients do not generate In-Reply-To headers,
but do generate a proper References header. Fall back to the
last Message-ID in References when In-Reply-To is missing.
This matches the behavior of Mail::Thread as well
as our SearchIdx code to link threads in the Xapian DB.
---
lib/PublicInbox/View.pm | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index a3df319..d213124 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -46,6 +46,19 @@ sub feed_entry {
PRE_WRAP . multipart_text_as_html($mime, $full_pfx) . '</pre>';
}
+sub in_reply_to {
+ my ($header_obj) = @_;
+ my $irt = $header_obj->header('In-Reply-To');
+
+ return mid_clean($irt) if (defined $irt);
+
+ my $refs = $header_obj->header('References');
+ if ($refs && $refs =~ /<([^>]+)>\s*\z/s) {
+ return $1;
+ }
+ undef;
+}
+
# this is already inside a <pre>
sub index_entry {
my ($fh, $mime, $level, $state) = @_;
@@ -74,7 +87,8 @@ sub index_entry {
my $root_anchor = $state->{root_anchor};
my $path = $root_anchor ? '../../' : '';
my $href = $mid->as_href;
- my $irt = $header_obj->header('In-Reply-To');
+ my $irt = in_reply_to($header_obj);
+
my ($anchor_idx, $anchor);
if (defined $irt) {
$anchor_idx = anchor_for($irt);
@@ -463,7 +477,7 @@ sub _parent_headers_nosrch {
my ($header_obj) = @_;
my $rv = '';
- my $irt = $header_obj->header('In-Reply-To');
+ my $irt = in_reply_to($header_obj);
if (defined $irt) {
my $v = PublicInbox::Hval->new_msgid($irt);
my $html = $v->as_html;
@@ -476,7 +490,7 @@ sub _parent_headers_nosrch {
if ($refs) {
# avoid redundant URLs wasting bandwidth
my %seen;
- $seen{mid_clean($irt)} = 1 if defined $irt;
+ $seen{$irt} = 1 if defined $irt;
my @refs;
my @raw_refs = ($refs =~ /<([^>]+)>/g);
foreach my $ref (@raw_refs) {
@@ -526,7 +540,7 @@ sub html_footer {
my $idx = $standalone ? " <a\nhref=\"$upfx\">index</a>" : '';
if ($idx && $srch) {
my $next = thread_inline(\$idx, $ctx, $mime, $full_pfx);
- $irt = $mime->header('In-Reply-To');
+ $irt = in_reply_to($mime->header_obj);
if (defined $irt) {
$irt = PublicInbox::Hval->new_msgid($irt);
$irt = $irt->as_href;
--
EW
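The fallback in patch 3/7 can be tried standalone. This is a sketch under assumptions: headers are passed as a plain hash rather than a header object, and `mid_clean_sketch` is a simplified stand-in for `mid_clean`, whose real implementation is not shown in the patch.

```perl
use strict;
use warnings;

# Simplified stand-in for mid_clean: strips surrounding whitespace
# and angle brackets (assumption, for illustration only).
sub mid_clean_sketch {
	my ($mid) = @_;
	$mid =~ s/\A\s*<?//;
	$mid =~ s/>?\s*\z//;
	$mid;
}

# $hdr is a plain hash of header name => value (an assumption).
sub parent_mid {
	my ($hdr) = @_;
	my $irt = $hdr->{'In-Reply-To'};
	return mid_clean_sketch($irt) if defined $irt;

	my $refs = $hdr->{'References'};
	# the last <...> token in References is treated as the parent
	return $1 if $refs && $refs =~ /<([^>]+)>\s*\z/s;
	undef;
}
```

The regular expression is the one from the patch; anchoring it with `\z` is what selects the last reference rather than the first.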
* [PATCH 4/7] view: simplify parent anchoring code
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
This will make things easier for the next commit to pre-populate
the `$seen' hash for linking within the flat view of a thread.
---
lib/PublicInbox/View.pm | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index d213124..0331b62 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -74,7 +74,7 @@ sub index_entry {
my $mid_raw = $header_obj->header('Message-ID');
my $id = anchor_for($mid_raw);
my $seen = $state->{seen};
- $seen->{$id} = "#$id"; # save the anchor for later
+ $seen->{$id} = "#$id"; # save the anchor for children, later
my $mid = PublicInbox::Hval->new_msgid($mid_raw);
my $from = PublicInbox::Hval->new_oneline($mime->header('From'))->raw;
@@ -88,12 +88,8 @@ sub index_entry {
my $path = $root_anchor ? '../../' : '';
my $href = $mid->as_href;
my $irt = in_reply_to($header_obj);
+ my $parent_anchor = $seen->{anchor_for($irt)} if defined $irt;
- my ($anchor_idx, $anchor);
- if (defined $irt) {
- $anchor_idx = anchor_for($irt);
- $anchor = $seen->{$anchor_idx};
- }
if ($srch) {
my $t = $ctx->{flat} ? 'T' : 't';
$subj = "<a\nhref=\"${path}$href/$t/#u\">$subj</a>";
@@ -135,13 +131,12 @@ sub index_entry {
$rv .= html_footer($mime, 0, undef, $ctx);
if (defined $irt) {
- unless (defined $anchor) {
+ unless (defined $parent_anchor) {
my $v = PublicInbox::Hval->new_msgid($irt);
$v = $v->as_href;
- $anchor = "${path}$v/";
- $seen->{$anchor_idx} = $anchor;
+ $parent_anchor = "${path}$v/";
}
- $rv .= " <a\nhref=\"$anchor\">parent</a>";
+ $rv .= " <a\nhref=\"$parent_anchor\">parent</a>";
}
if ($srch) {
if ($ctx->{flat}) {
--
EW
* [PATCH 5/7] view: pre-anchor entries for flat view
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
This will allow users to navigate the flat view without making extra
HTTP requests.
---
lib/PublicInbox/View.pm | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 0331b62..98fc133 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -167,9 +167,10 @@ sub emit_thread_html {
return missing_thread($cb) if $nr == 0;
my $flat = $ctx->{flat};
my $orig_cb = $cb;
+ my $seen = {};
my $state = {
ctx => $ctx,
- seen => {},
+ seen => $seen,
root_anchor => anchor_for($mid),
anchor_idx => 0,
};
@@ -177,6 +178,7 @@ sub emit_thread_html {
require PublicInbox::GitCatFile;
my $git = PublicInbox::GitCatFile->new($ctx->{git_dir});
if ($flat) {
+ pre_anchor_entry($seen, $_) for (@$msgs);
__thread_entry(\$cb, $git, $state, $_, 0) for (@$msgs);
} else {
my $th = thread_results($msgs);
@@ -580,6 +582,12 @@ sub thread_html_head {
$$cb->write("<html><head><title>$s</title></head><body>");
}
+sub pre_anchor_entry {
+ my ($seen, $mime) = @_;
+ my $id = anchor_for($mime->header('Message-ID'));
+ $seen->{$id} = "#$id"; # save the anchor for children, later
+}
+
sub __thread_entry {
my ($cb, $git, $state, $mime, $level) = @_;
--
EW
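Why pre-anchoring matters in patch 5/7 is easiest to see with a newest-first list, where a reply is rendered before its parent. The sketch below is illustrative only: `anchor_for_sketch` is a hypothetical anchor scheme (the real `anchor_for` is not shown in the patch), and messages are plain hashes with `mid` and `parent` keys.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical anchor scheme; the real anchor_for may differ.
sub anchor_for_sketch { 'm' . sha1_hex($_[0]) }

# $msgs: array ref of { mid => ..., parent => ... } in display order.
# Returns one parent link (or undef) per message.
sub parent_links {
	my ($msgs) = @_;
	my %seen;

	# first pass: every message on the page gets an in-page anchor
	for my $m (@$msgs) {
		my $id = anchor_for_sketch($m->{mid});
		$seen{$id} = "#$id";
	}

	# second pass: parents on this page resolve to "#anchor",
	# unknown parents fall back to a separate URL
	map {
		my $p = $_->{parent};
		defined($p) ? ($seen{anchor_for_sketch($p)} // "../../$p/")
			: undef;
	} @$msgs;
}
```

Without the first pass, a reply shown before its parent would link to the parent's standalone page and cost an extra HTTP request.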
* [PATCH 6/7] view: avoid links to unknown compressed Message-IDs
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
Compressed Message-IDs are irreversible and may not be used
at other sites. So avoid compressing Message-IDs we do not
know about so users have a chance of finding the message in
other archives by doing a Message-ID lookup.
---
lib/PublicInbox/Hval.pm | 4 ++--
lib/PublicInbox/View.pm | 33 +++++++++++++++++++--------------
2 files changed, 21 insertions(+), 16 deletions(-)
diff --git a/lib/PublicInbox/Hval.pm b/lib/PublicInbox/Hval.pm
index 21efe40..0445e57 100644
--- a/lib/PublicInbox/Hval.pm
+++ b/lib/PublicInbox/Hval.pm
@@ -25,9 +25,9 @@ sub new {
}
sub new_msgid {
- my ($class, $msgid) = @_;
+ my ($class, $msgid, $no_compress) = @_;
$msgid = mid_clean($msgid);
- $class->new($msgid, mid_compress($msgid));
+ $class->new($msgid, $no_compress ? $msgid : mid_compress($msgid));
}
sub new_oneline {
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 98fc133..1528a87 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -132,7 +132,7 @@ sub index_entry {
if (defined $irt) {
unless (defined $parent_anchor) {
- my $v = PublicInbox::Hval->new_msgid($irt);
+ my $v = PublicInbox::Hval->new_msgid($irt, 1);
$v = $v->as_href;
$parent_anchor = "${path}$v/";
}
@@ -452,22 +452,25 @@ sub thread_inline {
if ($nr <= 1) {
$$dst .= "\n[no followups, yet]\n";
- return;
+ return (undef, in_reply_to($cur));
}
my $upfx = $full_pfx ? '' : '../';
$$dst .= "\n\n~$nr messages in thread: ".
"(<a\nhref=\"${upfx}t/#u\">expand</a>)\n";
my $subj = $srch->subject_path($cur->header('Subject'));
+ my $parent = in_reply_to($cur);
my $state = {
seen => { $subj => 1 },
srch => $srch,
cur => $mid,
+ parent_cmp => $parent ? mid_compress($parent) : '',
+ parent => $parent,
};
for (thread_results(load_results($res))->rootset) {
inline_dump($dst, $state, $upfx, $_, 0);
}
- $state->{next_msg};
+ ($state->{next_msg}, $state->{parent});
}
sub _parent_headers_nosrch {
@@ -476,7 +479,7 @@ sub _parent_headers_nosrch {
my $irt = in_reply_to($header_obj);
if (defined $irt) {
- my $v = PublicInbox::Hval->new_msgid($irt);
+ my $v = PublicInbox::Hval->new_msgid($irt, 1);
my $html = $v->as_html;
my $href = $v->as_href;
$rv .= "In-Reply-To: <";
@@ -493,7 +496,7 @@ sub _parent_headers_nosrch {
foreach my $ref (@raw_refs) {
next if $seen{$ref};
$seen{$ref} = 1;
- push @refs, linkify_ref($ref);
+ push @refs, linkify_ref_nosrch($ref);
}
if (@refs) {
@@ -536,12 +539,11 @@ sub html_footer {
my $upfx = $full_pfx ? '../' : '../../';
my $idx = $standalone ? " <a\nhref=\"$upfx\">index</a>" : '';
if ($idx && $srch) {
- my $next = thread_inline(\$idx, $ctx, $mime, $full_pfx);
- $irt = in_reply_to($mime->header_obj);
- if (defined $irt) {
- $irt = PublicInbox::Hval->new_msgid($irt);
- $irt = $irt->as_href;
- $irt = "<a\nhref=\"$upfx$irt/\">parent</a> ";
+ my ($next, $p) = thread_inline(\$idx, $ctx, $mime, $full_pfx);
+ if (defined $p) {
+ $p = PublicInbox::Hval->new_oneline($p);
+ $p = $p->as_href;
+ $irt = "<a\nhref=\"$upfx$p/\">parent</a> ";
} else {
$irt = ' ' x length('parent ');
}
@@ -557,8 +559,8 @@ sub html_footer {
"$irt<a\nhref=\"" . ascii_html($href) . '">reply</a>' . $idx;
}
-sub linkify_ref {
- my $v = PublicInbox::Hval->new_msgid($_[0]);
+sub linkify_ref_nosrch {
+ my $v = PublicInbox::Hval->new_msgid($_[0], 1);
my $html = $v->as_html;
my $href = $v->as_href;
"<<a\nhref=\"../$href/\">$html</a>>";
@@ -699,8 +701,11 @@ sub _inline_header {
sub inline_dump {
my ($dst, $state, $upfx, $node, $level) = @_;
return unless $node;
- return if $state->{stopped};
if (my $mime = $node->message) {
+ my $mid = mid_clean($mime->header('Message-ID'));
+ if ($mid eq $state->{parent_cmp}) {
+ $state->{parent} = $mid;
+ }
_inline_header($dst, $state, $upfx, $mime, $level);
}
inline_dump($dst, $state, $upfx, $node->child, $level+1);
--
EW
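The reason patch 6/7 skips compression for unknown Message-IDs can be shown with a small sketch. Everything here is an assumption made for illustration: the 40-character threshold, the helper names, and the rule that long Message-IDs are replaced by their SHA-1 hex digest; only the `$no_compress` switch itself comes from the patch.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

use constant MID_MAX_SKETCH => 40; # assumed threshold, illustration only

# Assumption: long Message-IDs are replaced by their SHA-1 hex digest.
sub mid_compress_sketch {
	my ($mid) = @_;
	length($mid) <= MID_MAX_SKETCH ? $mid : sha1_hex($mid);
}

# Mirrors the $no_compress switch added to Hval->new_msgid:
# returns the raw Message-ID and the form used in links.
sub msgid_pair {
	my ($mid, $no_compress) = @_;
	($mid, $no_compress ? $mid : mid_compress_sketch($mid));
}
```

A digest in a URL is only useful to a site that already stores the matching message, whereas the raw Message-ID can be pasted into any other archive's lookup.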
* [PATCH 7/7] implement external Message-ID finder
From: Eric Wong @ 2015-09-02 6:59 UTC
To: meta
Currently, this looks at other public-inbox configurations
served in the same process. In the future, it will generate
links to other Message-ID lookup endpoints.
---
lib/PublicInbox/ExtMsg.pm | 92 +++++++++++++++++++++++++++++++++++++++++++++++
lib/PublicInbox/View.pm | 14 ++++----
lib/PublicInbox/WWW.pm | 15 +++++---
public-inbox.cgi | 1 +
4 files changed, 110 insertions(+), 12 deletions(-)
create mode 100644 lib/PublicInbox/ExtMsg.pm
diff --git a/lib/PublicInbox/ExtMsg.pm b/lib/PublicInbox/ExtMsg.pm
new file mode 100644
index 0000000..1c0887c
--- /dev/null
+++ b/lib/PublicInbox/ExtMsg.pm
@@ -0,0 +1,92 @@
+# Copyright (C) 2015 all contributors <meta@public-inbox.org>
+# License: AGPLv3 or later (https://www.gnu.org/licenses/agpl-3.0.txt)
+package PublicInbox::ExtMsg;
+use strict;
+use warnings;
+use URI::Escape qw(uri_escape_utf8);
+use PublicInbox::Hval;
+use PublicInbox::MID qw/mid_compress mid2path/;
+
+sub ext_msg {
+ my ($ctx) = @_;
+ my $pi_config = $ctx->{pi_config};
+ my $listname = $ctx->{listname};
+ my $mid = $ctx->{mid};
+ my $cmid = mid_compress($mid);
+
+ eval { require PublicInbox::Search };
+ my $have_xap = $@ ? 0 : 1;
+ my @nox;
+
+ foreach my $k (keys %$pi_config) {
+ $k =~ /\Apublicinbox\.([A-Z0-9a-z-]+)\.url\z/ or next;
+ my $list = $1;
+ next if $list eq $listname;
+
+ my $git_dir = $pi_config->{"publicinbox.$list.mainrepo"};
+ defined $git_dir or next;
+
+ my $url = $pi_config->{"publicinbox.$list.url"};
+ defined $url or next;
+
+ $url =~ s!/+\z!!;
+
+ # try to find the URL with Xapian to avoid forking
+ if ($have_xap) {
+ my $doc_id = eval {
+ my $s = PublicInbox::Search->new($git_dir);
+ $s->find_unique_doc_id('mid', $cmid);
+ };
+ if ($@) {
+ # xapian not configured for this repo
+ } else {
+ # maybe we found it!
+ return r302($url, $cmid) if (defined $doc_id);
+
+ # no point in trying the fork fallback if we
+ # know Xapian is up-to-date but missing the
+ # message in the current repo
+ next;
+ }
+ }
+
+ # queue up for forking after we've tried Xapian on all of them
+ push @nox, { git_dir => $git_dir, url => $url };
+ }
+
+ # Xapian not installed or configured for some repos
+ my $path = "HEAD:" . mid2path($cmid);
+
+ foreach my $n (@nox) {
+ my @cmd = ('git', "--git-dir=$n->{git_dir}", 'cat-file',
+ '-t', $path);
+ my $pid = open my $fh, '-|';
+ defined $pid or die "fork failed: $!\n";
+
+ if ($pid == 0) {
+ open STDERR, '>', '/dev/null'; # ignore errors
+ exec @cmd or die "exec failed: $!\n";
+ } else {
+ my $type = eval { local $/; <$fh> };
+ close $fh;
+ if ($? == 0 && $type eq "blob\n") {
+ return r302($n->{url}, $cmid);
+ }
+ }
+ }
+
+ # Fall back to external repos
+
+ [404, ['Content-Type'=>'text/plain'], ['Not found']];
+}
+
+# Redirect to another public-inbox which is mapped by $pi_config
+sub r302 {
+ my ($url, $mid) = @_;
+ $url .= '/' . uri_escape_utf8($mid) . '/';
+ [ 302,
+ [ 'Location' => $url, 'Content-Type' => 'text/plain' ],
+ [ "Redirecting to\n$url\n" ] ]
+}
+
+1;
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 1528a87..e18895f 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -164,7 +164,7 @@ sub emit_thread_html {
my $res = $srch->get_thread($mid);
my $msgs = load_results($res);
my $nr = scalar @$msgs;
- return missing_thread($cb) if $nr == 0;
+ return missing_thread($cb, $ctx) if $nr == 0;
my $flat = $ctx->{flat};
my $orig_cb = $cb;
my $seen = {};
@@ -189,7 +189,7 @@ sub emit_thread_html {
# there could be a race due to a message being deleted in git
# but still being in the Xapian index:
- return missing_thread($cb) if ($orig_cb eq $cb);
+ return missing_thread($cb, $ctx) if ($orig_cb eq $cb);
my $final_anchor = $state->{anchor_idx};
my $next = "<a\nid=\"s$final_anchor\">";
@@ -637,12 +637,10 @@ sub thread_results {
}
sub missing_thread {
- my ($cb) = @_;
- my $title = 'Thread does not exist';
- $cb->([404, ['Content-Type' => 'text/html']])->write(<<EOF);
-<html><head><title>$title</title></head><body><pre>$title
-<a href="../../">Return to index</a></pre></body></html>
-EOF
+ my ($cb, $ctx) = @_;
+ require PublicInbox::ExtMsg;
+
+ $cb->(PublicInbox::ExtMsg::ext_msg($ctx))
}
sub _msg_date {
diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 9ae7f7b..16fd16a 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -88,7 +88,14 @@ sub preload {
# private functions below
-sub r404 { r(404, 'Not Found') }
+sub r404 {
+ my ($ctx) = @_;
+ if ($ctx && $ctx->{mid}) {
+ require PublicInbox::ExtMsg;
+ return PublicInbox::ExtMsg::ext_msg($ctx);
+ }
+ r(404, 'Not Found');
+}
# simple response for errors
sub r { [ $_[0], ['Content-Type' => 'text/plain'], [ join(' ', @_, "\n") ] ] }
@@ -151,7 +158,7 @@ sub mid2blob {
# /$LISTNAME/$MESSAGE_ID/raw -> raw mbox
sub get_mid_txt {
my ($ctx) = @_;
- my $x = mid2blob($ctx) or return r404();
+ my $x = mid2blob($ctx) or return r404($ctx);
require PublicInbox::Mbox;
PublicInbox::Mbox::emit1($x);
}
@@ -159,7 +166,7 @@ sub get_mid_txt {
# /$LISTNAME/$MESSAGE_ID/ -> HTML content (short quotes)
sub get_mid_html {
my ($ctx) = @_;
- my $x = mid2blob($ctx) or return r404();
+ my $x = mid2blob($ctx) or return r404($ctx);
require PublicInbox::View;
my $foot = footer($ctx);
@@ -173,7 +180,7 @@ sub get_mid_html {
# /$LISTNAME/$MESSAGE_ID/f/ -> HTML content (fullquotes)
sub get_full_html {
my ($ctx) = @_;
- my $x = mid2blob($ctx) or return r404();
+ my $x = mid2blob($ctx) or return r404($ctx);
require PublicInbox::View;
my $foot = footer($ctx);
diff --git a/public-inbox.cgi b/public-inbox.cgi
index 75d510c..1fcc04f 100755
--- a/public-inbox.cgi
+++ b/public-inbox.cgi
@@ -18,6 +18,7 @@ BEGIN {
%HTTP_CODES = (
200 => 'OK',
301 => 'Moved Permanently',
+ 302 => 'Found',
404 => 'Not Found',
405 => 'Method Not Allowed',
501 => 'Not Implemented',
--
EW
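The configuration scan at the top of `ext_msg` in patch 7/7 can be exercised on its own. This sketch assumes `$pi_config` behaves like a flat hash of "section.name.key" => value, as the patch's key lookups suggest; `other_inboxes` is a hypothetical helper name, and the `sort` is added only to make the order deterministic.

```perl
use strict;
use warnings;

# Collect every other configured inbox that has both a URL and a
# git directory, i.e. the candidates for a Message-ID lookup.
sub other_inboxes {
	my ($pi_config, $listname) = @_;
	my @found;
	foreach my $k (sort keys %$pi_config) {
		$k =~ /\Apublicinbox\.([A-Z0-9a-z-]+)\.url\z/ or next;
		my $list = $1;
		next if $list eq $listname; # already searched this one

		my $git_dir = $pi_config->{"publicinbox.$list.mainrepo"};
		defined $git_dir or next;

		(my $url = $pi_config->{$k}) =~ s!/+\z!!; # trim trailing slashes
		push @found, { list => $list, git_dir => $git_dir, url => $url };
	}
	\@found;
}
```

Each candidate would then be checked with Xapian where available and with `git cat-file` otherwise, as the patch does.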