From: Eric Wong <e@yhbt.net>
To: meta@public-inbox.org
Subject: [PATCH 1/2] viewdiff: assume diffstat and diff order are identical
Date: Wed,  6 May 2020 10:40:53 +0000
Message-ID: <20200506104054.3074-2-e@yhbt.net>
In-Reply-To: <20200506104054.3074-1-e@yhbt.net>

For non-malicious messages, we can assume the diffstat and actual
diff appear in the same order.  Thus we can store {-long_path} as
an arrayref and only compare the first element when we encounter
a truncated path.

This should make HTML rendering stable when there are basename
conflicts in a message, such as
https://lore.kernel.org/backports/1393202754-12919-13-git-send-email-hauke@hauke-m.de/

Users who create actual path names beginning with "..." can still
defeat this diffstat anchor linkification, but we won't waste CPU
cycles on that case.
---
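For illustration only, here is a minimal standalone sketch of the
ordering assumption described above.  It is not the ViewDiff.pm code:
note_diffstat_path(), match_diff_path(), and the {long_path} and
{anchors} keys are hypothetical names, and the to_attr() escaping and
-apfx prefix used by the real module are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: record one diffstat path.  Truncated paths
# (leading "...") are also queued in the order they appear.
sub note_diffstat_path {
	my ($ctx, $fn) = @_;
	push(@{$ctx->{long_path}}, $fn) if $fn =~ s!\A\.\.\./?!!;
	$ctx->{anchors}->{$fn} = 1;
}

# Hypothetical helper: map a full path from a "diff --git" header back
# to its diffstat entry.  Returns the diffstat key, or undef if none.
sub match_diff_path {
	my ($ctx, $pb) = @_;
	return $pb if delete $ctx->{anchors}->{$pb}; # exact match

	# only the oldest queued truncated path is considered, and it is
	# consumed whether or not it matches
	my $fn = shift(@{$ctx->{long_path}}) or return;
	$pb =~ /\Q$fn\E\z/s or return;
	return delete($ctx->{anchors}->{$fn}) ? $fn : undef;
}

my $ctx = {};
note_diffstat_path($ctx, $_) for ('.../a/b/Makefile', '.../b/Makefile');
for my $pb ('drivers/x/a/b/Makefile', 'drivers/y/b/Makefile') {
	my $hit = match_diff_path($ctx, $pb);
	print "$pb => ", defined($hit) ? $hit : '(no anchor)', "\n";
}
```

Both truncated entries here would match the end of the first diff
path, so walking a hash of them in arbitrary key order could pick
either one.  Consuming the queue from the front makes the choice
depend only on the order of the input.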
 lib/PublicInbox/ViewDiff.pm | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/lib/PublicInbox/ViewDiff.pm b/lib/PublicInbox/ViewDiff.pm
index 3d6058a9..34df8ad4 100644
--- a/lib/PublicInbox/ViewDiff.pm
+++ b/lib/PublicInbox/ViewDiff.pm
@@ -82,10 +82,8 @@ sub anchor0 ($$$$) {
 	$fn =~ s/{(?:.+) => (.+)}/$1/ or $fn =~ s/.* => (.+)/$1/;
 	$fn = git_unquote($fn);
 
-	# long filenames will require us to walk backwards in anchor1
-	if ($fn =~ s!\A\.\.\./?!!) {
-		$ctx->{-long_path}->{$fn} = qr/\Q$fn\E\z/s;
-	}
+	# long filenames will require us to check in anchor1()
+	push(@{$ctx->{-long_path}}, $fn) if $fn =~ s!\A\.\.\./?!!;
 
 	if (my $attr = to_attr($ctx->{-apfx}.$fn)) {
 		$ctx->{-anchors}->{$attr} = 1;
@@ -105,17 +103,14 @@ sub anchor1 ($$) {
 
 	my $ok = delete $ctx->{-anchors}->{$attr};
 
-	# unlikely, check the end of all long path names we captured:
+	# unlikely, check the end of long path names we captured,
+	# assume diffstat and diff output follow the same order,
+	# and ignore different ordering (could be malicious input)
 	unless ($ok) {
-		my $lp = $ctx->{-long_path} or return;
-		foreach my $fn (keys %$lp) {
-			$pb =~ $lp->{$fn} or next;
-
-			delete $lp->{$fn};
-			$attr = to_attr($ctx->{-apfx}.$fn) or return;
-			$ok = delete $ctx->{-anchors}->{$attr} or return;
-			last;
-		}
+		my $fn = shift(@{$ctx->{-long_path}}) or return;
+		$pb =~ /\Q$fn\E\z/s or return;
+		$attr = to_attr($ctx->{-apfx}.$fn) or return;
+		$ok = delete $ctx->{-anchors}->{$attr} or return;
 	}
 	$ok ? "<a\nhref=#i$attr\nid=$attr>diff</a> --git" : undef
 }

