unofficial mirror of meta@public-inbox.org
* [PATCH 0/4] imap: reduce impact of bot scanners
@ 2022-08-08 23:16 Eric Wong
  2022-08-08 23:16 ` [PATCH 1/4] imap: limit ibx_async_prefetch to idle git processes Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Eric Wong @ 2022-08-08 23:16 UTC (permalink / raw)
  To: meta

There seems to be a fair amount of bot traffic scanning the
IMAP(S) port on public-inbox.org using username+password logins
(we currently accept any combination of the two).

AUTH=ANONYMOUS traffic is more likely to be legit,
and it is supported by at least mutt and lei.
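
For reference, a rough client-side sketch of an AUTH=ANONYMOUS login
using Python's standard imaplib (Python is used only for illustration;
public-inbox itself is Perl). The helper names are made up, the trace
string "anonymous" is an arbitrary choice, and the host and mailbox in
the usage comment are taken from the README URL in this series:

```python
import imaplib


def anonymous_response(challenge: bytes) -> bytes:
    """Answer the server's SASL challenge for the ANONYMOUS mechanism.

    The mechanism only carries an optional trace string, so the exact
    value here is an arbitrary choice for this sketch.
    """
    return b"anonymous"


def open_anonymous(host: str, mailbox: str) -> imaplib.IMAP4_SSL:
    """Connect over IMAPS, log in anonymously, and open a mailbox read-only."""
    conn = imaplib.IMAP4_SSL(host)
    conn.authenticate("ANONYMOUS", anonymous_response)
    conn.select(mailbox, readonly=True)
    return conn


# Usage (needs network access, so it is left commented out):
# conn = open_anonymous("public-inbox.org",
#                       "inbox.comp.mail.public-inbox.meta.0")
# conn.logout()
```

A client that calls conn.login(user, password) instead would still be
served, just at the lower priority described in this series.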

To avoid breaking things for legitimate users using
username+passwords, I've decided to deprioritize, but
still allow traffic of clients using username+password
logins.

The initial prefetch change is good regardless, since
even legitimate AUTH=ANONYMOUS clients could've caused
fairness problems with the aggressive pipelining to
git-cat-file||Gcf2.

Eric Wong (4):
  imap: limit ibx_async_prefetch to idle git processes
  imap: only give AUTH=ANONYMOUS clients prefetch
  imap: prioritize AUTH=ANONYMOUS clients
  README: recommend AUTH=ANONYMOUS on IMAP URLs

 README                         |  6 +++---
 lib/PublicInbox/DS.pm          |  2 +-
 lib/PublicInbox/GitAsyncCat.pm |  9 ++++-----
 lib/PublicInbox/IMAP.pm        | 16 +++++++++++++---
 lib/PublicInbox/IMAPD.pm       |  7 +++++++
 5 files changed, 28 insertions(+), 12 deletions(-)


* [PATCH 1/4] imap: limit ibx_async_prefetch to idle git processes
  2022-08-08 23:16 [PATCH 0/4] imap: reduce impact of bot scanners Eric Wong
@ 2022-08-08 23:16 ` Eric Wong
  2022-08-08 23:16 ` [PATCH 2/4] imap: only give AUTH=ANONYMOUS clients prefetch Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2022-08-08 23:16 UTC (permalink / raw)
  To: meta

This improves fairness while having no measurable performance
impact for a single uncached IMAP client (mutt) opening a folder
for the first time.

I noticed this problem with the public-inbox.org IMAP server where
a few IMAP clients were unfairly monopolizing the -netd process.
---
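
Not part of the patch: a toy Python model of the rule this change
introduces, namely that a speculative prefetch is only accepted when
the shared git process has nothing in flight. GitCatModel and its
methods are hypothetical names for this sketch and do not mirror the
real PublicInbox::Git API:

```python
from collections import deque


class GitCatModel:
    """Hypothetical stand-in for one shared `git cat-file --batch` process."""

    def __init__(self):
        self.inflight = deque()  # (oid, callback) pairs awaiting a response

    def cat_async(self, oid, cb):
        """A regular request is always enqueued."""
        self.inflight.append((oid, cb))

    def prefetch(self, oid, cb):
        """A speculative request is enqueued only when the process is idle."""
        if self.inflight:
            return False  # busy: the caller falls back to its normal requeue
        self.inflight.append((oid, cb))
        return True

    def complete_one(self):
        """Deliver the oldest pending response to its callback."""
        oid, cb = self.inflight.popleft()
        cb(oid)
```

In this model a second prefetch is refused until the first response has
been delivered, so one client cannot stack speculative requests in front
of everybody else's regular ones.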
 lib/PublicInbox/GitAsyncCat.pm | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/GitAsyncCat.pm b/lib/PublicInbox/GitAsyncCat.pm
index cea3f539..6b7425f6 100644
--- a/lib/PublicInbox/GitAsyncCat.pm
+++ b/lib/PublicInbox/GitAsyncCat.pm
@@ -69,19 +69,18 @@ sub ibx_async_cat ($$$$) {
 }
 
 # this is safe to call inside $cb, but not guaranteed to enqueue
-# returns true if successful, undef if not.
+# returns true if successful, undef if not.  For fairness, we only
+# prefetch if there are no in-flight requests.
 sub ibx_async_prefetch {
 	my ($ibx, $oid, $cb, $arg) = @_;
 	my $git = $ibx->git;
 	if (!defined($ibx->{topdir}) && $GCF2C) {
-		if (!$GCF2C->{wbuf}) {
+		if (!@{$GCF2C->{inflight} // []}) {
 			$oid .= " $git->{git_dir}\n";
 			return $GCF2C->gcf2_async(\$oid, $cb, $arg); # true
 		}
 	} elsif ($git->{async_cat} && (my $inflight = $git->{inflight})) {
-		# we could use MAX_INFLIGHT here w/o the halving,
-		# but lets not allow one client to monopolize a git process
-		if (@$inflight < int(PublicInbox::Git::MAX_INFLIGHT/2)) {
+		if (!@$inflight) {
 			print { $git->{out} } $oid, "\n" or
 						$git->fail("write error: $!");
 			return push(@$inflight, $oid, $cb, $arg);


* [PATCH 2/4] imap: only give AUTH=ANONYMOUS clients prefetch
  2022-08-08 23:16 [PATCH 0/4] imap: reduce impact of bot scanners Eric Wong
  2022-08-08 23:16 ` [PATCH 1/4] imap: limit ibx_async_prefetch to idle git processes Eric Wong
@ 2022-08-08 23:16 ` Eric Wong
  2022-08-08 23:16 ` [PATCH 3/4] imap: prioritize AUTH=ANONYMOUS clients Eric Wong
  2022-08-08 23:16 ` [PATCH 4/4] README: recommend AUTH=ANONYMOUS on IMAP URLs Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2022-08-08 23:16 UTC (permalink / raw)
  To: meta

Looking at IMAP traffic on public-inbox.org, it seems there is a
fair amount of traffic coming from malicious clients assuming
the IMAP server is compromised and searching for private
information.  Since AUTH=ANONYMOUS clients are more likely to
be legitimate clients looking for publicly-archived mail,
give them priority.
---
 lib/PublicInbox/IMAP.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index bed633e5..4ef5252b 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -138,6 +138,7 @@ sub login_success ($$) {
 sub auth_challenge_ok ($) {
 	my ($self) = @_;
 	my $tag = delete($self->{-login_tag}) or return;
+	$self->{anon} = 1;
 	login_success($self, $tag);
 }
 
@@ -588,10 +589,9 @@ sub fetch_blob_cb { # called by git->cat_async via ibx_async_cat
 		$smsg->{blob} eq $oid or die "BUG: $smsg->{blob} != $oid";
 	}
 	my $pre;
-	if (!$self->{wbuf} && (my $nxt = $msgs->[0])) {
-		$pre = ibx_async_prefetch($ibx, $nxt->{blob},
+	($self->{anon} && !$self->{wbuf} && $msgs->[0]) and
+		$pre = ibx_async_prefetch($ibx, $msgs->[0]->{blob},
 					\&fetch_blob_cb, $fetch_arg);
-	}
 	fetch_run_ops($self, $smsg, $bref, $ops, $partial);
 	$pre ? $self->dflush : $self->requeue_once;
 }


* [PATCH 3/4] imap: prioritize AUTH=ANONYMOUS clients
  2022-08-08 23:16 [PATCH 0/4] imap: reduce impact of bot scanners Eric Wong
  2022-08-08 23:16 ` [PATCH 1/4] imap: limit ibx_async_prefetch to idle git processes Eric Wong
  2022-08-08 23:16 ` [PATCH 2/4] imap: only give AUTH=ANONYMOUS clients prefetch Eric Wong
@ 2022-08-08 23:16 ` Eric Wong
  2022-08-08 23:16 ` [PATCH 4/4] README: recommend AUTH=ANONYMOUS on IMAP URLs Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2022-08-08 23:16 UTC (permalink / raw)
  To: meta

...by deprioritizing clients using a username + password.

As IMAP provides AUTH=ANONYMOUS for designating anonymous
access, we'll rely on it as a heuristic for favoring "good"
clients.  Clients using a username + password seem to (more
often than not) be malicious and looking for info which doesn't
belong in public inboxes.

This copies the technique used by WWW + -httpd to deprioritize
expensive mbox.gz downloads.
---
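
Not part of the patch: a toy Python model of the two-level queue
described above. Anonymous clients requeue themselves directly in the
event loop, while username+password clients wait in a per-server queue
that only ever occupies one slot in the loop. Loop, Server and Client
are hypothetical names, and the assumption that each tick runs only
what was queued before the tick began is a simplification of the real
PublicInbox::DS loop:

```python
from collections import deque


class Loop:
    """Toy event loop: a tick runs whatever was queued before it began."""

    def __init__(self):
        self.queue = deque()

    def requeue(self, obj):
        self.queue.append(obj)

    def run_tick(self):
        pending, self.queue = self.queue, deque()
        for obj in pending:
            obj.event_step()


class Server:
    """Holds a single loop slot on behalf of all low-priority clients."""

    def __init__(self, loop):
        self.loop = loop
        self.authed_q = deque()

    def event_step(self):
        if not self.authed_q:
            return
        client = self.authed_q.popleft()
        if self.authed_q:  # more clients waiting: claim one slot next tick
            self.loop.requeue(self)
        client.event_step()


class Client:
    def __init__(self, name, server, anon, log):
        self.name, self.server, self.anon, self.log = name, server, anon, log

    def requeue(self):
        if self.anon:  # high priority: straight into the loop
            self.server.loop.requeue(self)
        else:  # low priority: wait behind the server's single slot
            self.server.authed_q.append(self)
            if len(self.server.authed_q) == 1:
                self.server.loop.requeue(self.server)

    def event_step(self):
        self.log.append(self.name)
```

With three password clients and two anonymous clients all requeued at
once, the first tick of this model runs one password client and both
anonymous clients; the remaining password clients then get one tick
each.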
 lib/PublicInbox/DS.pm    |  2 +-
 lib/PublicInbox/IMAP.pm  | 10 ++++++++++
 lib/PublicInbox/IMAPD.pm |  7 +++++++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 77e2e5e9..5e8a6a66 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -688,7 +688,7 @@ sub requeue_once {
 	# but only after all pending writes are done.
 	# autovivify wbuf.  wbuf may be populated by $cb,
 	# no need to rearm if so: (push returns new size of array)
-	requeue($self) if push(@{$self->{wbuf}}, \&long_step) == 1;
+	$self->requeue if push(@{$self->{wbuf}}, \&long_step) == 1;
 }
 
 sub long_response ($$;@) {
diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm
index 4ef5252b..605c5e51 100644
--- a/lib/PublicInbox/IMAP.pm
+++ b/lib/PublicInbox/IMAP.pm
@@ -575,6 +575,16 @@ sub fetch_run_ops {
 	$self->msg_more(")\r\n");
 }
 
+sub requeue { # overrides PublicInbox::DS::requeue
+	my ($self) = @_;
+	if ($self->{anon}) { # AUTH=ANONYMOUS gets high priority
+		$self->SUPER::requeue;
+	} else { # low priority
+		push(@{$self->{imapd}->{-authed_q}}, $self) == 1 and
+			PublicInbox::DS::requeue($self->{imapd});
+	}
+}
+
 sub fetch_blob_cb { # called by git->cat_async via ibx_async_cat
 	my ($bref, $oid, $type, $size, $fetch_arg) = @_;
 	my ($self, undef, $msgs, $range_info, $ops, $partial) = @$fetch_arg;
diff --git a/lib/PublicInbox/IMAPD.pm b/lib/PublicInbox/IMAPD.pm
index 5368ff04..dd0d2c53 100644
--- a/lib/PublicInbox/IMAPD.pm
+++ b/lib/PublicInbox/IMAPD.pm
@@ -87,4 +87,11 @@ sub idler_start {
 	$_[0]->{idler} //= PublicInbox::InboxIdle->new($_[0]->{pi_cfg});
 }
 
+sub event_step { # called via requeue for low-priority IMAP clients
+	my ($self) = @_;
+	my $imap = shift(@{$self->{-authed_q}}) // return;
+	PublicInbox::DS::requeue($self) if scalar(@{$self->{-authed_q}});
+	$imap->event_step; # PublicInbox::IMAP::event_step
+}
+
 1;


* [PATCH 4/4] README: recommend AUTH=ANONYMOUS on IMAP URLs
  2022-08-08 23:16 [PATCH 0/4] imap: reduce impact of bot scanners Eric Wong
                   ` (2 preceding siblings ...)
  2022-08-08 23:16 ` [PATCH 3/4] imap: prioritize AUTH=ANONYMOUS clients Eric Wong
@ 2022-08-08 23:16 ` Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2022-08-08 23:16 UTC (permalink / raw)
  To: meta

public-inbox-imapd now prioritizes AUTH=ANONYMOUS clients,
since AUTH=ANONYMOUS is a good heuristic for legitimate client traffic.
---
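
Not part of the patch: a small Python sketch of how a client might pick
the auth mechanism out of the URLs recommended below. split_imap_url is
a hypothetical helper, and treating ";" as the delimiter between the
user name and the AUTH parameter follows my reading of the IMAP URL
syntax rather than anything in public-inbox:

```python
from urllib.parse import unquote, urlsplit


def split_imap_url(url):
    """Return (user, auth_mechanism, host, mailbox) for an imap(s):// URL."""
    parts = urlsplit(url)
    userinfo = parts.username or ""
    # Assumption: an unescaped ";" starts the ";AUTH=<mechanism>" parameter.
    user, _, param = userinfo.partition(";")
    mech = unquote(param[5:]) if param.upper().startswith("AUTH=") else None
    mailbox = unquote(parts.path.lstrip("/"))
    return unquote(user) or None, mech, parts.hostname, mailbox
```

For the new imaps:// URL this yields no user name, the mechanism
ANONYMOUS, the host public-inbox.org and the mailbox
inbox.comp.mail.public-inbox.meta.0; for the old URL the mechanism is
simply absent.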
 README | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README b/README
index 364ef7e0..01089314 100644
--- a/README
+++ b/README
@@ -117,16 +117,16 @@ on git@vger.kernel.org).
 The archives are readable via IMAP, NNTP or HTTP:
 
 	nntps://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
-	imaps://news.public-inbox.org/inbox.comp.mail.public-inbox.meta.0
+	imaps://;AUTH=ANONYMOUS@public-inbox.org/inbox.comp.mail.public-inbox.meta.0
 	https://public-inbox.org/meta/
 
-AUTH=ANONYMOUS is supported for IMAP, but any username + password works
+AUTH=ANONYMOUS is recommended for IMAP, but any username + password works
 
 And as Tor hidden services:
 
 	http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/
 	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.mail.public-inbox.meta
-	imap://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.mail.public-inbox.meta.0
+	imap://;AUTH=ANONYMOUS@4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.mail.public-inbox.meta.0
 
 You may also clone all messages via git:
 
