unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
* MIME types for image attachments
@ 2020-11-07 19:10 Leah Neukirchen
  2020-11-07 20:39 ` Eric Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Leah Neukirchen @ 2020-11-07 19:10 UTC (permalink / raw)
  To: meta

Hi,

I just noticed this on a plain public-inbox 1.6.0 installation:

https://inbox.vuxu.org/9fans/8F5F1B4BCF0E2F1DA17BDFBF06430DC7@abbatoir.fios-router.home/T/#u
> [-- Attachment #2: Type: image/png, Size: 56860 bytes --]

However, when I click on it:

% curl -I https://inbox.vuxu.org/9fans/8F5F1B4BCF0E2F1DA17BDFBF06430DC7@abbatoir.fios-router.home/2-a.bin
HTTP/1.1 200 OK
Server: nginx/1.18.0
Date: Sat, 07 Nov 2020 19:08:48 GMT
Content-Type: application/octet-stream
Content-Length: 56860
Connection: keep-alive

Any reason this is not served as image/png?  I don't think serving
image/* types is particularily dangerous, and it easily allows looking
at attached images from the browser.

Thanks,
-- 
Leah Neukirchen  <leah@vuxu.org>  https://leahneukirchen.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: MIME types for image attachments
  2020-11-07 19:10 MIME types for image attachments Leah Neukirchen
@ 2020-11-07 20:39 ` Eric Wong
  2020-11-08  0:05   ` Leah Neukirchen
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Wong @ 2020-11-07 20:39 UTC (permalink / raw)
  To: Leah Neukirchen; +Cc: meta

Leah Neukirchen <leah@vuxu.org> wrote:
> Hi,
> 
> I just noticed this on a plain public-inbox 1.6.0 installation:
> 
> https://inbox.vuxu.org/9fans/8F5F1B4BCF0E2F1DA17BDFBF06430DC7@abbatoir.fios-router.home/T/#u
> > [-- Attachment #2: Type: image/png, Size: 56860 bytes --]
> 
> However, when I click on it:
> 
> % curl -I https://inbox.vuxu.org/9fans/8F5F1B4BCF0E2F1DA17BDFBF06430DC7@abbatoir.fios-router.home/2-a.bin
> HTTP/1.1 200 OK
> Server: nginx/1.18.0
> Date: Sat, 07 Nov 2020 19:08:48 GMT
> Content-Type: application/octet-stream
> Content-Length: 56860
> Connection: keep-alive
> 
> Any reason this is not served as image/png?  I don't think serving
> image/* types is particularily dangerous, and it easily allows looking
> at attached images from the browser.

Several reasons off the top of my head (there may be more):

1) Image rendering libraries and complex graphics stacks increase
   attack surface.  IIRC libpng/libjpeg have both had problems with
   malicious data in the past, and could be in the future.

   From what I can tell, text-only stacks seem barely capable of
   displaying text without arbitrary code execution.  I'm not
   optimistic about something as complex as image rendering from
   untrusted sources.

2) Risk of illegal or objectionable content being viewed by
   readers and bystanders, especially when in public (libraries,
   coffee shops, planes, etc).  The risk of accidental clicks
   seems higher in public due to spills/bumps, too, especially
   in unfamiliar environments.

   The current practice of linkifying URLs poses that problem,
   too, but the public-inbox admin isn't responsible for hosting
   the content in those URLs, bringing us to...

3) (Probably) risk to admins hosting public-inbox instances if
   there's illegal content.  Right now, the data is still there,
   but having it less obviously visible probably helps reduce
   exposure when combined with 2).

I am not a lawyer, and laws vary wildly by jurisdiction;
so I think it's prudent to err on the side of paranoia when
dealing in untrusted data sources.

That said, a patch + options to allow passing through certain
content types for the server to pass through could be accepted.
It needs to also require a secondary option visible to the client
(via opt-in cookie or POST), to avoid surprising differences
between differently-configured server instances.

Risks will need to be documented for the admin, and the current
behavior needs to remain the default.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: MIME types for image attachments
  2020-11-07 20:39 ` Eric Wong
@ 2020-11-08  0:05   ` Leah Neukirchen
  2020-11-08  7:49     ` [PATCH] wwwattach: set "Content-Disposition: attachment" Eric Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Leah Neukirchen @ 2020-11-08  0:05 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong <e@80x24.org> writes:

> Leah Neukirchen <leah@vuxu.org> wrote:
>> Hi,
>> 
>> I just noticed this on a plain public-inbox 1.6.0 installation:
>> 
>> https://inbox.vuxu.org/9fans/8F5F1B4BCF0E2F1DA17BDFBF06430DC7@abbatoir.fios-router.home/T/#u
>> > [-- Attachment #2: Type: image/png, Size: 56860 bytes --]
>> 
>> However, when I click on it:
>> 
>> % curl -I
>> https://inbox.vuxu.org/9fans/8F5F1B4BCF0E2F1DA17BDFBF06430DC7@abbatoir.fios-router.home/2-a.bin
>> HTTP/1.1 200 OK
>> Server: nginx/1.18.0
>> Date: Sat, 07 Nov 2020 19:08:48 GMT
>> Content-Type: application/octet-stream
>> Content-Length: 56860
>> Connection: keep-alive
>> 
>> Any reason this is not served as image/png?  I don't think serving
>> image/* types is particularily dangerous, and it easily allows looking
>> at attached images from the browser.
>
> Several reasons off the top of my head (there may be more):
>
> 1) Image rendering libraries and complex graphics stacks increase
>    attack surface.  IIRC libpng/libjpeg have both had problems with
>    malicious data in the past, and could be in the future.
>
>    From what I can tell, text-only stacks seem barely capable of
>    displaying text without arbitrary code execution.  I'm not
>    optimistic about something as complex as image rendering from
>    untrusted sources.

Well, that's what probably literally every other website on the web does. ;)

> 2) Risk of illegal or objectionable content being viewed by
>    readers and bystanders, especially when in public (libraries,
>    coffee shops, planes, etc).  The risk of accidental clicks
>    seems higher in public due to spills/bumps, too, especially
>    in unfamiliar environments.
>
>    The current practice of linkifying URLs poses that problem,
>    too, but the public-inbox admin isn't responsible for hosting
>    the content in those URLs, bringing us to...

Yes, I don't want to inline the data but just display it on click.
It's the same as any other Mailman or Google Groups archive.

> 3) (Probably) risk to admins hosting public-inbox instances if
>    there's illegal content.  Right now, the data is still there,
>    but having it less obviously visible probably helps reduce
>    exposure when combined with 2).

Notice that deep-linking this attachment (e.g. with <img src= />)
will already display the image, as it triggers content autodetection.

> I am not a lawyer, and laws vary wildly by jurisdiction;
> so I think it's prudent to err on the side of paranoia when
> dealing in untrusted data sources.
>
> That said, a patch + options to allow passing through certain
> content types for the server to pass through could be accepted.
> It needs to also require a secondary option visible to the client
> (via opt-in cookie or POST), to avoid surprising differences
> between differently-configured server instances.

I think it would be more correct to send the real MIME type
and "Content-Disposition: attachment" (or "inline" then, when asked for).

(However this does not prevent hotlinking either...)

> Risks will need to be documented for the admin, and the current
> behavior needs to remain the default.

-- 
Leah Neukirchen  <leah@vuxu.org>  https://leahneukirchen.org

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] wwwattach: set "Content-Disposition: attachment"
  2020-11-08  0:05   ` Leah Neukirchen
@ 2020-11-08  7:49     ` Eric Wong
  2020-11-23 14:15       ` [PATCH v2] wwwattach: prevent deep-linking via Referer match Eric Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Wong @ 2020-11-08  7:49 UTC (permalink / raw)
  To: Leah Neukirchen; +Cc: meta

Leah Neukirchen <leah@vuxu.org> wrote:
> Eric Wong <e@80x24.org> writes:
> > 1) Image rendering libraries and complex graphics stacks increase
> >    attack surface.  IIRC libpng/libjpeg have both had problems with
> >    malicious data in the past, and could be in the future.
> >
> >    From what I can tell, text-only stacks seem barely capable of
> >    displaying text without arbitrary code execution.  I'm not
> >    optimistic about something as complex as image rendering from
> >    untrusted sources.
> 
> Well, that's what probably literally every other website on the web does. ;)

Yes, but this project has never been about doing what others do :>

> >    The current practice of linkifying URLs poses that problem,
> >    too, but the public-inbox admin isn't responsible for hosting
> >    the content in those URLs, bringing us to...
> 
> Yes, I don't want to inline the data but just display it on click.
> It's the same as any other Mailman or Google Groups archive.

OK, it seems possible to support optionally.  I don't think it's
a good default for Mailman given the risk.  Of course Google has
vast legal resources which we can't expect our users to have.

> > 3) (Probably) risk to admins hosting public-inbox instances if
> >    there's illegal content.  Right now, the data is still there,
> >    but having it less obviously visible probably helps reduce
> >    exposure when combined with 2).
> 
> Notice that deep-linking this attachment (e.g. with <img src= />)
> will already display the image, as it triggers content autodetection.

Yikes, I did not know that!(*)  It doesn't happen with dillo,
though Firefox seems affected.

(*) Because I never use <img> out of concern for folks with
    slow and/or metered Internet access.

> > I am not a lawyer, and laws vary wildly by jurisdiction;
> > so I think it's prudent to err on the side of paranoia when
> > dealing in untrusted data sources.
> >
> > That said, a patch + options to allow passing through certain
> > content types for the server to pass through could be accepted.
> > It needs to also require a secondary option visible to the client
> > (via opt-in cookie or POST), to avoid surprising differences
> > between differently-configured server instances.
> 
> I think it would be more correct to send the real MIME type
> and "Content-Disposition: attachment" (or "inline" then, when asked for).

I think passing MIME type is too risky by default even with
"Content-Disposition: attachment".  Browsers can still
auto-launch external programs on attachments, right?

But C-D:a definitely seems needed, and below is a patch for it.

> (However this does not prevent hotlinking either...)

Right.  I'd be much more comfortable if the default behavior
was consistently "Save As..." and then have the user process
it as a separate step (perhaps on an air-gapped machine).

Care to test the following patch?

It's also right where you or someone else would need to
make changes to optionally enable MIME type pass through.

Thanks.

--------8<-------
Subject: [PATCH] wwwattach: set "Content-Disposition: attachment"

This prevents `<img src=' tags from being used to deep link
image attachments.  Some browsers favor content detection and
will display images irrespective of the Content-Type header
being "application/octet-stream".

Reported-by: Leah Neukirchen <leah@vuxu.org>
---
 lib/PublicInbox/WwwAttach.pm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/PublicInbox/WwwAttach.pm b/lib/PublicInbox/WwwAttach.pm
index 0b2cda90..a6ed6caa 100644
--- a/lib/PublicInbox/WwwAttach.pm
+++ b/lib/PublicInbox/WwwAttach.pm
@@ -30,6 +30,9 @@ sub get_attach_i { # ->each_part callback
 	} else { # TODO: allow user to configure safe types
 		$res->[1]->[1] = 'application/octet-stream';
 		$part = $part->body;
+
+		# prevent deep-linking via <img>:
+		push @{$res->[1]}, 'Content-Disposition', 'attachment';
 	}
 	push @{$res->[1]}, 'Content-Length', bytes::length($part);
 	$res->[2]->[0] = $part;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2] wwwattach: prevent deep-linking via Referer match
  2020-11-08  7:49     ` [PATCH] wwwattach: set "Content-Disposition: attachment" Eric Wong
@ 2020-11-23 14:15       ` Eric Wong
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2020-11-23 14:15 UTC (permalink / raw)
  To: meta; +Cc: Leah Neukirchen

This prevents `<img src=' tags from being used to deep-link
image attachments from HTML outside of the current host and
reduces potential for abuse.

Some browsers (e.g. Firefox) favor content detection and will
display images irrespective of the Content-Type header being
"application/octet-stream", and "Content-Disposition: attachment"
doesn't stop them, either.

Tested with dillo and Firefox.

Reported-by: Leah Neukirchen <leah@vuxu.org>
---
  Deployed on public-inbox.org (http://ou63pmih66umazou.onion/)
  but not others.

 lib/PublicInbox/WwwAttach.pm | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/WwwAttach.pm b/lib/PublicInbox/WwwAttach.pm
index 0b2cda90..09c66d02 100644
--- a/lib/PublicInbox/WwwAttach.pm
+++ b/lib/PublicInbox/WwwAttach.pm
@@ -9,6 +9,22 @@ use bytes (); # only for bytes::length
 use PublicInbox::EmlContentFoo qw(parse_content_type);
 use PublicInbox::Eml;
 
+sub referer_match ($) {
+	my ($ctx) = @_;
+	my $env = $ctx->{env};
+	my $referer = $env->{HTTP_REFERER} // '';
+	return 1 if $referer eq ''; # no referer is always OK for wget/curl
+
+	# prevent deep-linking from other domains on some browsers (Firefox)
+	# n.b.: $ctx->{-inbox}->base_url($env) with INBOX_URL won't work
+	# with dillo, we can only match "$url_scheme://$HTTP_HOST/" without
+	# path components
+	my $base_url = $env->{'psgi.url_scheme'} . '://' .
+			($env->{HTTP_HOST} //
+			 "$env->{SERVER_NAME}:$env->{SERVER_PORT}") . '/';
+	index($referer, $base_url) == 0;
+}
+
 sub get_attach_i { # ->each_part callback
 	my ($part, $depth, $idx) = @{$_[0]};
 	my $ctx = $_[1];
@@ -28,8 +44,14 @@ sub get_attach_i { # ->each_part callback
 								$ctx->{env});
 		$part = $ctx->zflush($part->body);
 	} else { # TODO: allow user to configure safe types
-		$res->[1]->[1] = 'application/octet-stream';
-		$part = $part->body;
+		if (referer_match($ctx)) {
+			$res->[1]->[1] = 'application/octet-stream';
+			$part = $part->body;
+		} else {
+			$res->[0] = 403;
+			$res->[1]->[1] = 'text/plain';
+			$part = "Deep-linking prevented\n";
+		}
 	}
 	push @{$res->[1]}, 'Content-Length', bytes::length($part);
 	$res->[2]->[0] = $part;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-11-23 14:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-07 19:10 MIME types for image attachments Leah Neukirchen
2020-11-07 20:39 ` Eric Wong
2020-11-08  0:05   ` Leah Neukirchen
2020-11-08  7:49     ` [PATCH] wwwattach: set "Content-Disposition: attachment" Eric Wong
2020-11-23 14:15       ` [PATCH v2] wwwattach: prevent deep-linking via Referer match Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).