unofficial mirror of guix-devel@gnu.org 
* Backdoor in upstream xz-utils
@ 2024-03-29 17:51 ` Ryan Prior
  2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
  0 siblings, 2 replies; 33+ messages in thread
From: Ryan Prior @ 2024-03-29 17:51 UTC (permalink / raw)
  To: Guix Devel

I'm reading today that a backdoor is present in xz's upstream tarball (but not in git), starting at version 5.6.0. Source: https://www.openwall.com/lists/oss-security/2024/03/29/4

Guix currently packages xz-utils 5.2.8 as "xz" using the upstream tarball. Is there a way we can blacklist known bad versions? Should we switch from using upstream tarballs to some fork with more responsible maintainers?

Ryan


* Re: Backdoor in upstream xz-utils
  2024-03-29 17:51 ` Ryan Prior
@ 2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
  2024-03-29 20:55     ` Tomas Volf
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
  1 sibling, 1 reply; 33+ messages in thread
From: Felix Lechner via Development of GNU Guix and the GNU System distribution. @ 2024-03-29 20:39 UTC (permalink / raw)
  To: Ryan Prior, Guix Devel; +Cc: guix-security

Hi Ryan,

On Fri, Mar 29 2024, Ryan Prior wrote:

> I'm reading today that a backdoor is present in xz's upstream tarball
> (but not in git), starting at version 5.6.0. Source:
> https://www.openwall.com/lists/oss-security/2024/03/29/4

Thanks for sending this!  This is an extremely serious vulnerability
with criminal intent.  I cc'd guix-security@gnu.org just in case you
haven't.

> Guix currently packages xz-utils 5.2.8 as "xz" using the upstream
> tarball. [...] Should we switch from using upstream tarballs to some
> fork with more responsible maintainers?

Guix's habit of building from tarballs is a poor idea because tarballs
often differ.  For example, maintainers may choose to ship a ./configure
script that is otherwise not present in Git (although a configure.ac
might be).  Guix should build from Git.

> Is there a way we can blacklist known bad versions?

Having said all that, I am not sure Guix is affected.

On my systems, the 'detect.sh' script shows no reference to liblzma in
sshd.  Everyone, please send additional reports.

Kind regards
Felix


* Re: Backdoor in upstream xz-utils
  2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
@ 2024-03-29 20:55     ` Tomas Volf
  2024-03-30 21:02       ` Ricardo Wurmus
  0 siblings, 1 reply; 33+ messages in thread
From: Tomas Volf @ 2024-03-29 20:55 UTC (permalink / raw)
  To: Felix Lechner; +Cc: Ryan Prior, Guix Devel, guix-security

Hello,

On 2024-03-29 13:39:59 -0700, Felix Lechner via Development of GNU Guix and the GNU System distribution. wrote:
> > Is there a way we can blacklist known bad versions?
>
> Having said all that, I am not sure Guix is affected.
>
> On my systems, the 'detect.sh' script shows no reference to liblzma in
> sshd.  Everyone, please send additional reports.

If nothing else, our xz is at 5.2.8.  I think the question was whether there is
a way to blacklist a specific known-bad tarball so that no one updates to it by
accident.

(I do not believe Guix would be vulnerable even when built from the malicious
tarball, but that is a separate issue.)

Have a nice day,
Tomas

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.


* Re: Backdoor in upstream xz-utils
@ 2024-03-29 20:57 John Kehayias
  2024-03-29 17:51 ` Ryan Prior
  2024-03-31 15:04 ` Backdoor in upstream xz-utils Rostislav Svoboda
  0 siblings, 2 replies; 33+ messages in thread
From: John Kehayias @ 2024-03-29 20:57 UTC (permalink / raw)
  To: Felix Lechner; +Cc: Ryan Prior, Guix Devel, guix-security

Hi Ryan, Felix, and guix-devel,

On Fri, Mar 29, 2024 at 01:39 PM, Felix Lechner via Reports of security issues in Guix itself and in packages provided by Guix wrote:

> Hi Ryan,
>
> On Fri, Mar 29 2024, Ryan Prior wrote:
>
>> I'm reading today that a backdoor is present in xz's upstream tarball
>> (but not in git), starting at version 5.6.0. Source:
>> <https://www.openwall.com/lists/oss-security/2024/03/29/4>
>
> Thanks for sending this!  This is an extremely serious vulnerability
> with criminal intent.  I cc'd guix-security@gnu.org just in case you
> haven't.
>

At least I (as part of guix-security) am aware and have been reading
the analysis and further investigation.

Both clever and interesting, but also worrisome. I think we were
rather lucky this was found relatively quickly, though it seems to
point to a bad actor and throws into question other projects (like
libarchive) which have contributions from the same identity. Likely
other accounts are involved too, so maybe on a positive side this
unravels other issues.

The discussion on Hacker News has also been informative (though rather
long now): <https://news.ycombinator.com/item?id=39865810>

>> Guix currently packages xz-utils 5.2.8 as "xz" using the upstream
>> tarball. [...] Should we switch from using upstream tarballs to some
>> fork with more responsible maintainers?
>
> Guix's habit of building from tarballs is a poor idea because tarballs
> often differ.  For example, maintainers may choose to ship a ./configure
> script that is otherwise not present in Git (although a configure.ac
> might be).  Guix should build from Git.
>

We discussed a bit on #guix today about this. A movement to sourcing
more directly from Git in general has been discussed before, though
has some hurdles. I will let someone more knowledgeable about the
details chime in, but yes, something we should do.

Unfortunately in this case, while it seems the older versions don't
have *this* exploit, given the perpetrator either is or has control over
a maintainer account, it throws into question a lot more than the most
recent version. We will have to keep a careful eye on this. I'm not
currently aware of anything untoward for our current version, so far.

>> Is there a way we can blacklist known bad versions?
>

I'm not sure what you mean, but I don't think so. The main danger is
in guix time-machine to the past, as you are (purposefully) going to
older versions of software. The manual warns about this
<https://guix.gnu.org/en/manual/devel/en/html_node/Invoking-guix-time_002dmachine.html>
though we should perhaps also warn at runtime.

Even better would be if we can warn about known bad versions. Such a
tool was started (guix health) here:
<https://issues.guix.gnu.org/31444> Anyone up for reviving it, now
that we have some changes that should make this more doable (based on
a quick glance of more recent messages)?

> Having said all that, I am not sure Guix is affected.
>
> On my systems, the 'detect.sh' script shows no reference to liblzma in
> sshd.  Everyone, please send additional reports.
>

Pretty sure we are not affected, at least with what is known: the
exploit targets particular systems and things like argv[0] being
/usr/sbin/sshd. A combination perhaps of who or what was being
targeted as well as trying to make this harder to discover.

Still, we should exercise an abundance of caution and pay close
attention, as there is much we don't know and a history of commits to go
through. We should also be suspicious in general of things like binary
files added to a release tarball (as a project we always try to make
sure there are no binary files anyway); this is a clear example of a
clever and malicious way of causing harm.

Please do feel free to report privately any concerns or potential
affected packages to guix-security@gnu.org as well. And if you are
interested in helping with these things, I'm sure we could rotate in
some people for that team.

Thanks all! An action-packed Friday.

John




* Re: Backdoor in upstream xz-utils
  2024-03-29 20:55     ` Tomas Volf
@ 2024-03-30 21:02       ` Ricardo Wurmus
  0 siblings, 0 replies; 33+ messages in thread
From: Ricardo Wurmus @ 2024-03-30 21:02 UTC (permalink / raw)
  To: Tomas Volf; +Cc: Felix Lechner, Ryan Prior, guix-security, guix-devel


Tomas Volf <~@wolfsden.cz> writes:

> On 2024-03-29 13:39:59 -0700, Felix Lechner via Development of GNU Guix and the GNU System distribution. wrote:
>> > Is there a way we can blacklist known bad versions?
>>
>> Having said all that, I am not sure Guix is affected.
>>
>> On my systems, the 'detect.sh' script shows no reference to liblzma in
>> sshd.  Everyone, please send additional reports.
>
> If nothing else, our xz is at 5.2.8.  I think the question was whether there is
> a way to blacklist a specific known-bad tarball so that no one updates to it by
> accident.

The properties field on a package definition can be used to record
arbitrary information, which could be read by `guix lint`.
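
To make the idea concrete, here is a minimal sketch of the kind of
check such a lint pass could run.  It is written in Python rather than
Guile purely for illustration, and it is not how `guix lint` works
today: the KNOWN_BAD_VERSIONS table and the check_known_bad function
are hypothetical names.  Only the two xz versions come from this
thread.

```python
# Hypothetical denylist keyed by package name.  In Guix this information
# would presumably live in each package's properties field instead.
KNOWN_BAD_VERSIONS = {
    "xz": {"5.6.0", "5.6.1"},  # backdoored releases named in this thread
}


def check_known_bad(name, version, denylist=KNOWN_BAD_VERSIONS):
    """Return a warning string if NAME at VERSION is denylisted, else None."""
    if version in denylist.get(name, set()):
        return f"{name}@{version}: version is marked as known bad"
    return None
```

With this table, check_known_bad("xz", "5.2.8") returns None, while
check_known_bad("xz", "5.6.1") returns a warning, so an update to a
denylisted version would be flagged before it is committed.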

-- 
Ricardo



* Re: Backdoor in upstream xz-utils
  2024-03-29 20:57 Backdoor in upstream xz-utils John Kehayias
  2024-03-29 17:51 ` Ryan Prior
@ 2024-03-31 15:04 ` Rostislav Svoboda
  1 sibling, 0 replies; 33+ messages in thread
From: Rostislav Svoboda @ 2024-03-31 15:04 UTC (permalink / raw)
  To: John Kehayias; +Cc: Felix Lechner, Ryan Prior, Guix Devel, guix-security

> >> Is there a way we can blacklist known bad versions?
>
> I'm not sure what you mean, but I don't think so.

For a start, what about adding a short comment:

diff --git a/gnu/packages/compression.scm b/gnu/packages/compression.scm
index 5de17b6b51..fd5ab7ba00 100644
--- a/gnu/packages/compression.scm
+++ b/gnu/packages/compression.scm
@@ -493,6 +493,8 @@ (define-public pbzip2
 (define-public xz
   (package
    (name "xz")
+;;; Be reminded of the xz/liblzma backdoor in the versions 5.6.0 and 5.6.1!
+;;; See https://www.openwall.com/lists/oss-security/2024/03/29/4
    (version "5.2.8")
    (source (origin
             (method url-fetch)

as a single commit, with an appropriate commit message. That's a lot of
bang for very little buck.

> The main danger is in guix time-machine to the past

Good point. So then a little note here, too:

diff --git a/doc/guix.texi b/doc/guix.texi
index 69a904473c..60909adf5f 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -5012,10 +5012,13 @@ Invoking guix time-machine
 @quotation Note
 The history of Guix is immutable and @command{guix time-machine}
 provides the exact same software as they are in a specific Guix
-revision.  Naturally, no security fixes are provided for old versions
-of Guix or its channels.  A careless use of @command{guix time-machine}
-opens the door to security vulnerabilities.  @xref{Invoking guix pull,
-@option{--allow-downgrades}}.
+revision.  Naturally, no security fixes are provided for old versions of
+Guix or its channels.  A careless use of @command{guix time-machine}
+opens the door to security vulnerabilities, or potentially even
+backdoors. (Do you remember the
+@uref{https://www.openwall.com/lists/oss-security/2024/03/29/4, backdoor
+in upstream xz/liblzma leading to ssh server compromise}?)
+@xref{Invoking guix pull, @option{--allow-downgrades}}.
 @end quotation

Cheers Bost



* backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-03-29 17:51 ` Ryan Prior
  2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
@ 2024-04-04 10:34   ` Giovanni Biscuolo
  2024-04-04 15:12     ` Attila Lendvai
                       ` (3 more replies)
  1 sibling, 4 replies; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-04 10:34 UTC (permalink / raw)
  To: Guix Devel, guix-security; +Cc: Felix Lechner, Ryan Prior

Hello everybody,

I know for sure that Guix maintainers and developers are working on
this; I'm just asking them to find some time to inform users, and
possibly discuss with them (also on guix-devel), what measures GNU Guix
- the software distribution - can/should deploy to try to avoid this
kind of attack.

Please consider that this (sub)thread is _not_ specific to xz-utils but
to the specific attack vector (matrix?) used to inject a backdoor in a
binary during a build phase, in a _very_ stealthy way.

Also, since Guix _is_ downstream, I'd like this (sub)thread to
concentrate on what *Guix* can/should do to strengthen the build process
/independently/ of what upstreams (or other distributions) can/should
do.

First of all, I understand the xz backdoor attack was complex (both
socially and technically) and all the details are still under scrutiny,
but AFAIU the way the backdoor was injected, by "infecting" the
**build phase** of the software (and obfuscating the payload in
binaries), is very alarming.  It is something all distributions aiming
at reproducible builds must examine very carefully (and they actually
_are_ examining it).

John Kehayias <john.kehayias@protonmail.com> writes:

[...]

> On Fri, Mar 29, 2024 at 01:39 PM, Felix Lechner via Reports of security issues in Guix itself and in packages provided by Guix wrote:
>
>> Hi Ryan,
>>
>> On Fri, Mar 29 2024, Ryan Prior wrote:

[...]

>>> Guix currently packages xz-utils 5.2.8 as "xz" using the upstream
>>> tarball. [...] Should we switch from using upstream tarballs to some
>>> fork with more responsible maintainers?
>>
>> Guix's habit of building from tarballs is a poor idea because tarballs
>> often differ.

First of all: should software be considered reproducible if it produces
different binaries depending on whether it is compiled from the source
code repository (git or some other VCS) or from the officially released
source tarball?

My first thought is no.

>> For example, maintainers may choose to ship a ./configure script that
>> is otherwise not present in Git (although a configure.ac might be).
>> Guix should build from Git.

Two useful pointers explaining how the backdoor has been injected are
[1] (general workflow) and [2] (payload obfuscation)

The first and *indispensable* condition for the attack to be successful
is this:

--8<---------------cut here---------------start------------->8---

* The release tarballs upstream publishes don't have the same code that
 GitHub has. This is common in C projects so that downstream consumers
 don't need to remember how to run autotools and autoconf. The version
 of build-to-host.m4 in the release tarballs differs wildly from the
 upstream on GitHub.

[...]

* Explain dist tarballs, why we use them, what they do, link to
  autotools docs, etc

 * "Explaining the history of it would be very helpful I think. It also
 explains how a single person was able to insert code in an open source
 project that no one was able to peer review. It is pragmatically
 impossible, even if technically possible once you know the problem is
 there, to peer review a tarball prepared in this manner."

--8<---------------cut here---------------end--------------->8---
(from [1])

Let me highlight this: «It is pragmatically impossible [...] to peer
review a tarball prepared in this manner.»

There is no doubt that the release tarball is a much weaker "trusted
source" (trusted by peer review, not by authority) than the upstream
DVCS repository.

It's *very* noteworthy that the backdoor was discovered thanks to a
performance issue and _not_ during a peer review of the source
code... the _build_ code *is* source code, no?

It's not the first time a source release tarball of free software is
compromised [3], but the way the compromise worked in this case is
something new (or at least never spotted before, right?).

> We discussed a bit on #guix today about this. A movement to sourcing
> more directly from Git in general has been discussed before, though
> has some hurdles.

Could someone knowledgeable about the details please describe the
hurdles to sourcing from a DVCS (possibly one other than git)?

> I will let someone more knowledgeable about the details chime in, but
> yes, something we should do.

I'm definitely _not_ the knowledgeable one, but I'd like to share the
results of my research.

Is it possible to enhance our build-system(s) (e.g. gnu-build-system) so
they can /ignore/ pre-built .m4 or similar scripts and rebuild them
during the build process?

Richard W.M. Jones on fedora-devel ML proposed [4]:

--8<---------------cut here---------------start------------->8---

(1) We should routinely delete autoconf-generated cruft from upstream
projects and regenerate it in %prep. It is easier to study the real
source rather than dig through the convoluted, generated shell script in
an upstream './configure' looking for back doors. For most projects,
just running "autoreconf -fiv" is enough.

--8<---------------cut here---------------end--------------->8---

There is an interesting bug report [5] about autoreconf:

--8<---------------cut here---------------start------------->8---

While analyzing the recent xz backdoor hook into the build system [A],
I noticed that one of the aspects why the hook worked was because it
seems like «autoreconf -f -i» (that is run in Debian as part of
dh-autoreconf via dh) still seems to take the serial into account,
which was bumped in the tampered .m4 file. If either the gettext.m4
had gotten downgraded (to the version currently in Debian, which would
not have pulled the tampered build-to-host.m4), or once Debian upgrades
gettext, the build-to-host.m4 would get downgraded to the upstream
clean version, then the hook would have been disabled and the backdoor
would be inert. (Of course at that point the malicious actor would
have found another way to hook into the build system, but the less
avenues there are the better.)

I've tried to search the list and checked for old bug reports on the
debbugs.gnu.org site, but didn't notice anything. To me this looks like
a very unexpected behavior, but it's not clear whether this is intentional
or a bug. In any case regardless of either position, it would be good to
improve this (either by fixing --force to force things even if
downgrading, or otherwise perhaps to add a new option to really force
everything).

--8<---------------cut here---------------end--------------->8---

So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of
tampered .m4 macros (and other possibly tampered build configuration
scripts)?

IMHO "ignoring" (deleting) pre-built build scripts in Guix
build-system(s) should be considered... or is that /already/ the case?
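
As a rough illustration of what "deleting pre-built build scripts"
could mean in practice, the Python sketch below lists files that
autoreconf can most likely regenerate.  The heuristic and the function
name generated_autotools_files are my own assumptions, not anything
gnu-build-system does.  It deliberately leaves files under m4/ alone,
since telling vendored third-party macros (such as build-to-host.m4)
apart from a project's own macros is exactly the hard part.

```python
import os


def generated_autotools_files(tree):
    """Return sorted relative paths of files autoreconf can likely regenerate.

    Heuristic: 'configure' and 'aclocal.m4' count as generated when a
    configure.ac (or configure.in) sits in the same directory, and
    'Makefile.in' counts as generated when a Makefile.am sits next to it.
    """
    generated = []
    for root, _dirs, files in os.walk(tree):
        names = set(files)
        if names & {"configure.ac", "configure.in"}:
            generated += [os.path.join(root, n)
                          for n in ("configure", "aclocal.m4") if n in names]
        if "Makefile.am" in names and "Makefile.in" in names:
            generated.append(os.path.join(root, "Makefile.in"))
    return sorted(os.path.relpath(p, tree) for p in generated)
```

A build phase could delete the files this returns and then regenerate
them, so that only the reviewed inputs (configure.ac, Makefile.am)
decide what gets executed.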

Also, I found this thread [6] interesting, especially this message
from Jacob Bachmeyer:

--8<---------------cut here---------------start------------->8---

The *user* could catch issues like this backdoor, since the backdoor
appears (based on what I have read so far) to materialize certain object
files while configure is running, while `find . -iname '*.o'` /should/
return nothing before make is run. This also suggests that running "make
clean" after configure would kill at least this backdoor.

--8<---------------cut here---------------end--------------->8---

Something to apply in Guix gnu-build-system?
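
If the description quoted above holds, the check itself is cheap.  A
minimal Python sketch follows; the suffix list and the name
find_object_files are illustrative assumptions, and nothing like this
is claimed to exist in gnu-build-system.

```python
import os

# Suffixes treated as compiled artifacts; illustrative, not exhaustive.
OBJECT_SUFFIXES = (".o", ".a", ".so", ".lo")


def find_object_files(tree, suffixes=OBJECT_SUFFIXES):
    """Return sorted relative paths under TREE that look like compiled objects."""
    found = []
    for root, _dirs, files in os.walk(tree):
        for name in files:
            # Case-insensitive match, like `find . -iname '*.o'`.
            if name.lower().endswith(suffixes):
                found.append(os.path.relpath(os.path.join(root, name), tree))
    return sorted(found)
```

Run between the configure and build phases, a non-empty result would
mean that something other than make has produced object files, which
is the anomaly the quoted message describes.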

He also writes:

--8<---------------cut here---------------start------------->8---

A *very* observant (unreasonably so) user might notice that "make" did
not build the objects that the backdoor provided.

--8<---------------cut here---------------end--------------->8---

Is there a way to enhance gnu-build-system in order to make it notice
that some object was not built by make?

He then goes on explaining:

--8<---------------cut here---------------start------------->8---

Of course, an attacker could sneak around this as well by moving the
process for unpacking the backdoor object to a Makefile rule, but that
is more likely to "stick out" to an observant user, as well as being an
easy target for automated analysis ("Which files have 'special' rules?")
since you cannot obfuscate those from make(1) and expect them to still
work.

--8<---------------cut here---------------end--------------->8---

Given the above observation that «it is pragmatically impossible [...]
to peer review a tarball prepared in this manner», I strongly doubt that
a possible Makefile tampering _in_the_release_tarball_ is easy to peer
review; I'd ask: is such an "automated analysis" (see above) feasible
in a dedicated build-system phase?

Anyway I'm asking myself: would a release tarball that is *possibly
different from the official code in a DVCS*, but carries a *valid* GPG
signature (please see [3]), really have been peer reviewed, or is that
«pragmatically impossible»?

In other words: what if the backdoor was injected directly in the source
code of the *official* release tarball signed with a valid GPG signature
(and obviously with a valid sha256 hash)?

Do upstream developer communities peer review release tarballs, or do
they "just" peer review the code in the official DVCS?

Also, in (info "(guix) origin Reference") I see that Guix packages can have a
list of uri(s) for the origin of source code, see xz as an example [7]:
are they intended to be multiple independent sources to be compared in
order to prevent possible tampering or are they "just" alternatives to
be used if the first listed uri is unavailable?

If it is the former, a solution would be to specify multiple
independent release tarballs for each package, so that it would be
harder to compromise two release sources, but that is not something
under Guix control.

All in all: should we really avoid the "pragmatically impossible to be
peer reviewed" release tarballs?

WDYT?

Happy hacking! Gio'

[...]

[1] https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27
«FAQ on the xz-utils backdoor (CVE-2024-3094)» (constantly updated)

[2] https://gynvael.coldwind.pl/?lang=en&id=782
«xz/liblzma: Bash-stage Obfuscation Explained»

[3]
e.g. https://web.archive.org/web/20110708023004/http://www.h-online.com/open/news/item/Vsftpd-backdoor-discovered-in-source-code-update-1272310.html
«Vsftpd backdoor discovered in source code - update»
"a bad tarball had been downloaded from the vsftpd master site with an
invalid GPG signature"

[4]
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/YWMNOEJ34Q7QLBWQAB5TM6A2SVJFU4RV/
«Three steps we could take to make supply chain attacks a bit harder»

[5] https://lists.gnu.org/archive/html/bug-autoconf/2024-03/msg00000.html

[6] https://lists.gnu.org/archive/html/automake/2024-03/msg00007.html
«GNU Coding Standards, automake, and the recent xz-utils backdoor»

[7]
https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/compression.scm#n494
--8<---------------cut here---------------start------------->8---
(define-public xz
  (package
   (name "xz")
   (version "5.2.8")
   (source (origin
            (method url-fetch)
            (uri (list (string-append "http://tukaani.org/xz/xz-" version
                                      ".tar.gz")
                       (string-append "http://multiprecision.org/guix/xz-"
                                      version ".tar.gz")))
--8<---------------cut here---------------end--------------->8---



P.S.: in a way, I see this kind of attack as exploiting a form of
statefulness of the build system, in this case "build-to-host.m4" was
/state/; I think that build systems (too) should be stateless and Guix
is doing a great job of reaching this goal.

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
@ 2024-04-04 15:12     ` Attila Lendvai
  2024-04-04 16:47       ` Giovanni Biscuolo
  2024-04-04 15:47     ` Giovanni Biscuolo
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: Attila Lendvai @ 2024-04-04 15:12 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security, Felix Lechner, Ryan Prior

> Also, in (info "(guix) origin Reference") I see that Guix packages can have a
> list of uri(s) for the origin of source code, see xz as an example [7]:
> are they intended to be multiple independent sources to be compared in
> order to prevent possible tampering or are they "just" alternatives to
> be used if the first listed uri is unavailable?


a source origin is identified by its cryptographic hash (stored in its sha256 field); i.e. it doesn't matter *where* the source archive was acquired from. if the hash matches the one in the package definition, then it's the same archive that the guix packager has seen while packaging.
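
a minimal python sketch of that property, for illustration only: the
helper names sha256_hex and matches_recorded_hash are made up here, and
the digest is compared in hex for simplicity, whereas guix package
definitions write the hash in a base32 encoding.

```python
import hashlib


def sha256_hex(path, chunk_size=1 << 16):
    """Return the hex-encoded SHA-256 digest of the file at PATH."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large tarballs need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path, expected_hex):
    """True if the file's content hashes to EXPECTED_HEX, whatever its origin."""
    return sha256_hex(path) == expected_hex.lower()
```

two copies fetched from different mirrors pass or fail together, because
only the bytes are hashed. note that this pins what the packager saw; it
says nothing about whether that archive was trustworthy to begin with.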

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“We’ll know our disinformation program is complete when everything the American public believes is false.”
	— William Casey (1913–1987), the director of CIA 1981-1987




* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
  2024-04-04 15:12     ` Attila Lendvai
@ 2024-04-04 15:47     ` Giovanni Biscuolo
  2024-04-04 19:48       ` Attila Lendvai
  2024-04-04 23:03     ` Ricardo Wurmus
  2024-04-05 16:52     ` Jan Wielkiewicz
  3 siblings, 1 reply; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-04 15:47 UTC (permalink / raw)
  To: Guix Devel

Hello,

a couple of additional (IMO) useful resources...

Giovanni Biscuolo <g@xelera.eu> writes:

[...]

> Let me highlight this: «It is pragmatically impossible [...] to peer
> review a tarball prepared in this manner.»
>
> There is no doubt that the release tarball is a very weak "trusted
> source" (trusted by peer review, not by authority) than the upstream
> DVCS repository.

This kind of attack was described by Daniel Stenberg in his «HOWTO
backdoor curl» article of 2021-03-30 as the "skip-git-altogether" method:

https://daniel.haxx.se/blog/2021/03/30/howto-backdoor-curl/
--8<---------------cut here---------------start------------->8---

The skip-git-altogether methods

As I’ve described above, it is really hard even for a skilled developer
to write a backdoor and have that landed in the curl git repository and
stick there for longer than just a very brief period.

If the attacker instead can just sneak the code directly into a release
archive then it won’t appear in git, it won’t get tested and it won’t
get easily noticed by team members!

curl release tarballs are made by me, locally on my machine. After I’ve
built the tarballs I sign them with my GPG key and upload them to the
curl.se origin server for the world to download. (Web users don’t
actually hit my server when downloading curl. The user visible web site
and downloads are hosted by Fastly servers.)

An attacker that would infect my release scripts (which btw are also in
the git repository) or do something to my machine could get something
into the tarball and then have me sign it and then create the “perfect
backdoor” that isn’t detectable in git and requires someone to diff the
release with git in order to detect – which usually isn’t done by anyone
that I know of.

[...] I of course do my best to maintain proper login sanitation,
updated operating systems and use of safe passwords and encrypted
communications everywhere. But I’m also a human so I’m bound to do
occasional mistakes.

Another way could be for the attacker to breach the origin download
server and replace one of the tarballs there with an infected version,
and hope that people skip verifying the signature when they download it
or otherwise notice that the tarball has been modified. I do my best at
maintaining server security to keep that risk to a minimum. Most people
download the latest release, and then it’s enough if a subset checks the
signature for the attack to get revealed sooner rather than later.

--8<---------------cut here---------------end--------------->8---

Unfortunately, in that section Stenberg misses one attack vector he
mentioned in an earlier section of the article, named "The tricking a
user method":

--8<---------------cut here---------------start------------->8---

We can even include more forced “convincing” such as direct threats
against persons or their families: “push this code or else…”. This way
of course cannot be protected against using 2fa, better passwords or
things like that.

--8<---------------cut here---------------end--------------->8---

...and an attack vector involving more subtle ways (let's call it
distributed social engineering) to convince the upstream developer and
other contributors and/or third parties that they need a project
co-maintainer authorized to publish _official_ release tarballs.

Following Stenberg's attack classification, since the supply-chain
attack was intended to install a backdoor in the _sshd_ service, and
_not_ in xz-utils or liblzma, we can classify this attack as:

  skip-git-altogether to install a backdoor further-down-the-chain,
  precisely in a _dependency_ of the attacked one, during a period of
  "weakness" of the upstream maintainers

Stenberg closes his article with this update and one related reply to a
comment:

--8<---------------cut here---------------start------------->8---

Dependencies

Added after the initial post. Lots of people have mentioned that curl
can get built with many dependencies and maybe one of those would be an
easier or better target. Maybe they are, but they are products of their
own individual projects and an attack on those projects/products would
not be an attack on curl or backdoor in curl by my way of looking at it.

In the curl project we ship the source code for curl and libcurl and the
users, the ones that builds the binaries from that source code will get
the dependencies too.

[...]

 Jean Hominal says: 
 April 1, 2021 at 14:04 

 I think the big difference why you “missed” dependencies as an attack
 vector is because today, most application developers ship their
 dependencies in their application binaries (by linking statically or
 shipping a container) – in such a case, I would definitely count an
 attack on such a dependency, that is then shipped as part of the
 project’s artifacts, as a successful attack on the project.

 However, as you only ship a source artifact – of course, dependencies
 *are* out of scope in your case.

 Daniel Stenberg says: 
 April 1, 2021 at 15:05 

 Jean: Right. I don’t want to dismiss the risk or the danger of an
 attack to a curl dependency.  However, it is not possible for me or the
 curl project to keep them safe!

--8<---------------cut here---------------end--------------->8---

That leaves a number of open questions about some developers' attitude
towards _distributing_ their software, but it's off-topic here IMO.

Anyway, let me highlight, again, the "pragmatically impossible peer
review of release tarballs" argument; Stenberg says: «the “perfect
backdoor” that isn’t detectable in git and requires someone to diff the
release with git in order to detect – which usually isn’t done by anyone
that I know of.»
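
The check Stenberg describes, diffing the release with git, can be
sketched in a few lines.  What follows is only a minimal Python
illustration under stated assumptions, not something curl or Guix ships:
the function names (tarball_digests, tree_digests, compare) are
hypothetical, the tarball is assumed to have a single top-level
directory, and a checkout of the matching tag is assumed to be on disk.

```python
import hashlib
import io
import tarfile
from pathlib import Path


def tarball_digests(fileobj, strip_components: int = 1) -> dict[str, str]:
    """Map each regular file in a tarball to its SHA-256 digest.

    `fileobj` is a seekable binary file object; the leading
    `strip_components` path elements (the top-level directory) are dropped.
    """
    digests = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = Path(member.name).parts[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            digests["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return digests


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file of a VCS checkout (ignoring .git) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if path.is_file() and ".git" not in relative.parts:
            digests[relative.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare(tarball: dict[str, str], checkout: dict[str, str]) -> dict[str, list[str]]:
    """Report files that exist on one side only or differ in content."""
    common = set(tarball) & set(checkout)
    return {
        "only_in_tarball": sorted(set(tarball) - set(checkout)),
        "only_in_checkout": sorted(set(checkout) - set(tarball)),
        "modified": sorted(p for p in common if tarball[p] != checkout[p]),
    }
```

Every path reported under "only_in_tarball" or "modified" is code that
no reviewer of the repository has seen; for an autotools project that
list would usually include the generated configure script, which is
exactly why nobody reads it.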

[...]

> Is it possible to enhance our build-system(s) (e.g. gnu-build-system) so
> they can /ignore/ pre-built .m4 or similar scripts and rebuild them
> during the build process?

There is a related security issue for PHP [1], with an interesting
thread on the php.internals mailing list (via externals.io [2]):

--8<---------------cut here---------------start------------->8---

Consider removing autogenerated files from tarballs

[...] I believe that it would be a good idea to remove the huge attack
surface offered by the pre-generated autoconf build scripts and lexers,
offered in the release tarballs.

[...] this injection mode makes sense, as extra files in the tarball not
present in the git repo would raise suspicions, but machine-generated
configure scripts containing hundreds of thousands of lines of code not
present in the upstream VCS are the norm, and are usually not checked
before execution.

[...] Specifically in the case of PHP, along from the configure script,
the tarball also bundles generated lexer files which contain actual C
code, which is an additional attack vector

[...] To prevent attacks from malevolent/compromised RMs, I propose
completely removing all autogenerated files from the release tarballs,
and ensuring their content exactly matches the content of the associated
git tag

[...] Of course this means that users will have to generate the build
scripts when compiling PHP, as when installing PHP from the VCS repo.

[...] Distros like arch linux already re-generate the configure scripts
from scratch, but I believe that no distinction should be made, everyone
should get a tarball containing only the bare source code, without
leaving to the user the choice to re-generate the build files, or use a
potentially compromised build script.

[...] The current standard way of distributing generated configure files
in tarballs is precisely what allowed the xz supply chain attack to go
unnoticed for so long.

I strongly believe all projects using autotools, including PHP, should
switch away from this "standard" way of doing things.

[...] when a user downloads a source tarball, there's a false sense of
security rooted in the mistaken belief that the source code in the
tarball matches the one distributed in the VCS, but in reality, the
tarball also contains potentially malicious semi-compiled blobs, not
present in the VCS.

--8<---------------cut here---------------end--------------->8---

Are "configure scripts containing hundreds of thousands of lines of code
not present in the upstream VCS" really the norm?

If so, can we consider hundreds of thousands of lines of configure
scripts and other (auto)generated files bundled in release tarballs
"pragmatically impossible" to peer review?

Can we consider those artifacts as sort-of-binary and "force" our
build-systems to _regenerate_ *all* of them?

...or is it better to completely avoid release tarballs as our source
URIs?

[...]

Thanks, Gio'


[1] https://github.com/php/php-src/issues/13838

[2] https://externals.io/message/122811

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 15:12     ` Attila Lendvai
@ 2024-04-04 16:47       ` Giovanni Biscuolo
  0 siblings, 0 replies; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-04 16:47 UTC (permalink / raw)
  To: Attila Lendvai; +Cc: Guix Devel, guix-security

Hi Attila,

Attila Lendvai <attila@lendvai.name> writes:

>> Also, in (info "(guix) origin Reference") I see that Guix packages
>> can have a list of uri(s) for the origin of source code, see xz as an
>> example [7]: are they intended to be multiple independent sources to
>> be compared in order to prevent possible tampering or are they "just"
>> alternatives to be used if the first listed uri is unavailable?
>
> a source origin is identified by its cryptographic hash (stored in its
> sha256 field); i.e. it doesn't matter *where* the source archive was
> acquired from. if the hash matches the one in the package definition,
> then it's the same archive that the guix packager has seen while
> packaging.

Ehrm, you are right, mine was a stupid question :-)

We *are* already verifying that tarballs have not been tampered
with... by anyone other than the release manager :-(

[...]

Happy hacking! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 15:47     ` Giovanni Biscuolo
@ 2024-04-04 19:48       ` Attila Lendvai
  2024-04-04 20:32         ` Ekaitz Zarraga
  2024-04-05 10:13         ` Giovanni Biscuolo
  0 siblings, 2 replies; 33+ messages in thread
From: Attila Lendvai @ 2024-04-04 19:48 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: Guix Devel

> Are "configure scripts containing hundreds of thousands of lines
> of code not present in the upstream VCS" really the norm?


pretty much, for all C and C++ projects that use autoconf... which are numerous, especially among the core GNU components.


> If so, can we consider hundreds of thousands of lines of configure
> scripts and other (auto)generated files bundled in release tarballs
> "pragmatically impossible" to peer review?


yes.


> Can we consider those artifacts as sort-of-binary and "force" our
> build-systems to regenerate all of them?


that would be a good practice.


> ...or is it better to completely avoid release tarballs as our source
> URIs?


yes, and this^ would guarantee the previous point, but it's not always trivial.

as an example see this: https://issues.guix.gnu.org/61750

in short: when building shepherd from git the man files need to be generated using the program help2man. this invokes the binary with --help and formats the output as a man page. the usefulness of this is questionable, but the point is that it breaks cross-compilation, because the host cannot execute the target binary.

but these generated man files are part of the release tarball, so cross-compilation works fine using the tarball.

all in all, just by following my gut instincts, i was advocating for building everything from git even before the exposure of this backdoor. in fact, i found it surprising as a guix newbie that not everything is built from git (or their VCS of choice).

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“For if you [the rulers] suffer your people to be ill-educated, and their manners to be corrupted from their infancy, and then punish them for those crimes to which their first education disposed them, what else is to be concluded from this, but that you first make thieves [and outlaws] and then punish them.”
	— Sir Thomas More (1478–1535), 'Utopia', Book 1




* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 19:48       ` Attila Lendvai
@ 2024-04-04 20:32         ` Ekaitz Zarraga
  2024-04-10 13:57           ` Ludovic Courtès
  2024-04-05 10:13         ` Giovanni Biscuolo
  1 sibling, 1 reply; 33+ messages in thread
From: Ekaitz Zarraga @ 2024-04-04 20:32 UTC (permalink / raw)
  To: Attila Lendvai, Giovanni Biscuolo; +Cc: Guix Devel

Hi,

I just want to add some perspective from the bootstrapping.

On 2024-04-04 21:48, Attila Lendvai wrote:
> 
> all in all, just by following my gut instincts, i was advocating for building everything from git even before the exposure of this backdoor. in fact, i found it surprising as a guix newbie that not everything is built from git (or their VCS of choice).

That has happened to me too.
Why not use Git directly always?

In the bootstrapping it's also a problem, as all those tools (autotools) 
must be bootstrapped, and they require other programs (compilers) that 
actually use them. And we'll be forced to use git, too, or at least 
clone the bootstrapping repos, git-archive them ourselves and host them 
properly signed. At least, we could challenge them using git (similar to 
what we do with the substitutes), which we cannot do right now with the 
release tarballs against the actual code of the repository.

In live-bootstrap they just write the build scripts by hand, and ignore 
whatever the ./configure script says. That's also a reasonable way to 
tackle the bootstrapping, but it's a hard one. Thankfully, we are 
working together in this Bootstrapping effort so we can learn from them 
and adapt their recipes to our Guix commencement.scm module. This would 
be some effort, but it's actually doable.

Hope this adds something useful to the discussion,

Ekaitz




* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
  2024-04-04 15:12     ` Attila Lendvai
  2024-04-04 15:47     ` Giovanni Biscuolo
@ 2024-04-04 23:03     ` Ricardo Wurmus
  2024-04-05  7:06       ` Giovanni Biscuolo
  2024-04-05 16:52     ` Jan Wielkiewicz
  3 siblings, 1 reply; 33+ messages in thread
From: Ricardo Wurmus @ 2024-04-04 23:03 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security, Felix Lechner, Ryan Prior

[mu4e must have changed the key bindings for replies, so here is my mail
again, this time as a wide reply.]

Giovanni Biscuolo <g@xelera.eu> writes:

> So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of
> tampered .m4 macros (and other possibly tampered build configuration
> script)?
>
> IMHO "ignoring" (deleting) pre-built build scripts in Guix
> build-system(s) should be considered... or is /already/ so?

The gnu-build-system has a bootstrap phase, but it only does something
when a configure script does not already exist.  We sometimes force it
to bootstrap the build system when we patch configure.ac.

In previous discussions there were no big objections to always
bootstrapping the build system files from autoconf/automake sources.

This particular backdoor relied on a number of obfuscations:

- binary test data.  Nobody ever looks at binaries.

- incomprehensibility of autotools output.  This one is fundamentally a
  social problem and easily extends to other complex build systems.  In
  the xz case, the instructions for assembling the shell snippets to
  inject the backdoor could hide in plain sight, just because configure
  scripts are expected to be near incomprehensible.  They contain no
  comments, are filled to the brim with portable (lowest common
  denominator) shell magic, and contain bizarrely named variables.

Not using generated output is a good idea anyway and removes the
requirement to trust that the release tarballs are faithful derivations
from the autotools sources, but given the bland complexity of build system
code (whether that's recursive Makefiles, CMake cruft, or the infamous
gorilla spit[1] of autotools) I don't see a good way out.

[1] https://www.gnu.org/software/autoconf/manual/autoconf-2.65/autoconf.html#History

> Given the above observation that <<it is pragmatically impossible [...]
> to peer review a tarball prepared in this manner>>, I strongly doubt that
> a possible Makefile tampering _in_the_release_tarball_ is easy to peer
> review; I'd ask: is such an "automated analysis" (see above) feasible
> in a dedicated build-system phase?

I don't think it's feasible.  Since Guix isn't a regular user (the
target audience of configure scripts) it has no business depending on
generated configure scripts.  It should build these from source.

> In other words: what if the backdoor was injected directly in the source
> code of the *official* release tarball signed with a valid GPG signature
> (and obviously with a valid sha256 hash)?

A malicious maintainer can sign bad release tarballs.  A malicious
contributor can push signed commits that contain backdoors in code.

> Do upstream developer communities peer review release tarballs or do
> they "just" peer review the code in the official DVCS?

Most do neither.  I'd guess that virtually *nobody* reviews tarballs
beyond automated tests (like what the GNU maintainers' GNUmakefile /
maint.mk does when preparing a release).

> Also, in (info "(guix) origin Reference") I see that Guix packages can have a
> list of uri(s) for the origin of source code, see xz as an example [7]:
> are they intended to be multiple independent sources to be compared in
> order to prevent possible tampering or are they "just" alternatives to
> be used if the first listed uri is unavailable?

They are alternative URLs, much like what the mirror:// URLs do.

> If the former is the case, a solution would be to specify multiple
> independent release tarballs for each package, so that it would be
> harder to compromise two release sources, but that is not something under
> Guix control.

We have hashes for this purpose.  A tarball that was modified since the
package definition has been published would have a different hash.  This
is not a statement about tampering, but only says that our expectations
(from the time of packaging) have not been met.
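
To make the limits of that check concrete, here is a minimal Python
sketch.  It is an illustration only: the helper names are hypothetical,
the byte strings stand in for real tarballs, and it uses hexadecimal
digests for simplicity, whereas Guix package definitions record the
SHA-256 in a different textual encoding.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(data: bytes, recorded: str) -> bool:
    """True when the downloaded bytes are the ones the packager hashed."""
    return sha256_hex(data) == recorded


# Placeholder for the tarball as it was when the package was defined.
seen_by_packager = b"release tarball bytes as published upstream"
recorded = sha256_hex(seen_by_packager)

# The same bytes fetched from any alternative URL pass the check ...
from_mirror = bytes(seen_by_packager)
# ... while bytes modified after packaging do not.
modified_later = seen_by_packager + b" plus an injected payload"

print(matches_recorded_hash(from_mirror, recorded))
print(matches_recorded_hash(modified_later, recorded))
```

The check is silent about what the tarball contained at packaging time:
if the release manager published a backdoored tarball, its hash is the
one that gets recorded and every later download matches it.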

> All in all: should we really avoid the "pragmatically impossible to be
> peer reviewed" release tarballs?

Yes.

-- 
Ricardo



* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 23:03     ` Ricardo Wurmus
@ 2024-04-05  7:06       ` Giovanni Biscuolo
  2024-04-05  7:39         ` Ricardo Wurmus
  0 siblings, 1 reply; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-05  7:06 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: Guix Devel, guix-security

Hello Ricardo,

Ricardo Wurmus <rekado@elephly.net> writes:

> Giovanni Biscuolo <g@xelera.eu> writes:
>
>> So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of
>> tampered .m4 macros (and other possibly tampered build configuration
>> script)?
>>
>> IMHO "ignoring" (deleting) pre-built build scripts in Guix
>> build-system(s) should be considered... or is /already/ so?
>
> The gnu-build-system has a bootstrap phase, but it only does something
> when a configure script does not already exist.  We sometimes force it
> to bootstrap the build system when we patch configure.ac.
>
> In previous discussions there were no big objections to always
> bootstrapping the build system files from autoconf/automake sources.

But AFAIU the bootstrap is not always done, right?

If so, given that there are no big objections to always bootstrapping the
build system files, what is the technical reason it's not done?

> This particular backdoor relied on a number of obfuscations:
>
> - binary test data.  Nobody ever looks at binaries.

Yes, and the presence of binary data (e.g. for testing or other included
media) is not something under downstream (Guix) control, so we have to
live with it.  No?

> - incomprehensibility of autotools output.  This one is fundamentally a
>   social problem and easily extends to other complex build systems.  In
>   the xz case, the instructions for assembling the shell snippets to
>   inject the backdoor could hide in plain sight, just because configure
>   scripts are expected to be near incomprehensible.  They contain no
>   comments, are filled to the brim with portable (lowest common
>   denominator) shell magic, and contain bizarrely named variables.

Yes, I understand this well; for this reason I call configure scripts
near-binary-artifacts, kinda like *.o files.

From a reproducibility and security POV this is a nightmare and no one
should ever trust such configure scripts.

> Not using generated output is a good idea anyway and removes the
> requirement to trust that the release tarballs are faithful derivations
> from the autotools sources, but given the bland complexity of build system
> code (whether that's recursive Makefiles, CMake cruft, or the infamous
> gorilla spit[1] of autotools) I don't see a good way out.

I guess I miss the technical details about why it's not possible to
_always_ bootstrap the build system files from autoconf/automake
sources: do you have any reference documentation or technical article as
a reference, please?

> [1]
> https://www.gnu.org/software/autoconf/manual/autoconf-2.65/autoconf.html#History

I'll study the autoconf history :-)

>> Given the above observation that «it is pragmatically impossible [...]
>> to peer review a tarball prepared in this manner», I strongly doubt that
>> a possible Makefile tampering _in_the_release_tarball_ is easy to peer
>> review; I'd ask: is such an "automated analysis" (see above) feasible
>> in a dedicated build-system phase?
>
> I don't think it's feasible.  Since Guix isn't a regular user (the
> target audience of configure scripts) it has no business depending on
> generated configure scripts.  It should build these from source.

Maybe I misunderstand your argument or, more probably, I was too
cryptic.  I mean, Someone™ is saying that moving the unpacking of the
backdoor object to a Makefile rule makes it an easy target for
_automated_ analysis: is that someone wrong or is there a way to analyze
a Makefile to answer "Which files have 'special' rules?"

>> In other words: what if the backdoor was injected directly in the source
>> code of the *official* release tarball signed with a valid GPG signature
>> (and obviously with a valid sha256 hash)?
>
> A malicious maintainer can sign bad release tarballs.  A malicious
> contributor can push signed commits that contain backdoors in code.

Oh yes, but it's much harder to hide backdoors in code published as
signed (signed?!?) commits in a DVCS.

Obviously no security system is perfect, but Some™ are (much) less
perfect than others. :-)

>> Do upstream developer communities peer review release tarballs or do
>> they "just" peer review the code in the official DVCS?
>
> Most do neither.  I'd guess that virtually *nobody* reviews tarballs
> beyond automated tests (like what the GNU maintainers' GNUmakefile /
> maint.mk does when preparing a release).

I guess that "nobody" includes Guix package contributors and
committers...  Then I'd say that virtually *nobody* should trust
tarballs!  :-O

To be clear: I'm not suggesting that "tarball reviews" - that is,
verifying the /almost/ exact correspondence of the tarball with the
corresponding DVCS commit - should be added as a requirement for
contributors or maintainers... it would be too burdensome.

>> Also, in (info "(guix) origin Reference") I see that Guix packages can have a
>> list of uri(s) for the origin of source code, see xz as an example [7]:
>> are they intended to be multiple independent sources to be compared in
>> order to prevent possible tampering or are they "just" alternatives to
>> be used if the first listed uri is unavailable?
>
> They are alternative URLs, much like what the mirror:// URLs do.

OK understood, thanks!

[...]

>> All in all: should we really avoid the "pragmatically impossible to be
>> peer reviewed" release tarballs?
>
> Yes.

I tend to agree! :-(

Thank you! Giovanni

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-05  7:06       ` Giovanni Biscuolo
@ 2024-04-05  7:39         ` Ricardo Wurmus
  0 siblings, 0 replies; 33+ messages in thread
From: Ricardo Wurmus @ 2024-04-05  7:39 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security

Giovanni Biscuolo <g@xelera.eu> writes:

> Hello Ricardo,
>
> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Giovanni Biscuolo <g@xelera.eu> writes:
>>
>>> So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of
>>> tampered .m4 macros (and other possibly tampered build configuration
>>> script)?
>>>
>>> IMHO "ignoring" (deleting) pre-built build scripts in Guix
>>> build-system(s) should be considered... or is /already/ so?
>>
>> The gnu-build-system has a bootstrap phase, but it only does something
>> when a configure script does not already exist.  We sometimes force it
>> to bootstrap the build system when we patch configure.ac.
>>
>> In previous discussions there were no big objections to always
>> bootstrapping the build system files from autoconf/automake sources.
>
> But AFAIU the bootstrap is not always done, right?

It is not.  See guix/build/gnu-build-system.scm:

    (if (not (script-exists? "configure")) ...)
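
The difference between that condition and "always bootstrap" can be
modelled in a few lines.  This is a Python sketch of the decision logic
only, not the actual Guile code of the build system: the function names
and the table of generated files are assumptions made for illustration.

```python
# Generated files and the autotools inputs they are produced from.
# This table is an illustrative assumption, not an exhaustive list.
GENERATED_FROM = {
    "configure": ("configure.ac", "configure.in"),
    "Makefile.in": ("Makefile.am",),
}


def bootstraps_today(files: set[str]) -> bool:
    """Behaviour described in this thread: the bootstrap phase only does
    something when no configure script is shipped."""
    return "configure" not in files


def stale_generated_files(files: set[str]) -> list[str]:
    """Proposed behaviour: list shipped generated files whose inputs are
    also present, so they can be deleted and rebuilt from source."""
    return sorted(
        generated
        for generated, inputs in GENERATED_FROM.items()
        if generated in files and any(src in files for src in inputs)
    )


# A typical release tarball ships both the inputs and the generated output.
release_tarball = {"configure", "configure.ac", "Makefile.am", "Makefile.in"}
print(bootstraps_today(release_tarball))
print(stale_generated_files(release_tarball))
```

With a typical release tarball the first function says "do nothing",
which is the case under discussion; the second lists what an
always-bootstrap phase would throw away before regenerating it.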

> If so, given that there are no big objections to always bootstrap the
> build system files, what is the technical reason it's not done?

I don't think there is a technical reason.  It's just one of those
things that need someone doing them.

>> Not using generated output is a good idea anyway and removes the
>> requirement to trust that the release tarballs are faithful derivations
>> from the autotools sources, but given the bland complexity of build system
>> code (whether that's recursive Makefiles, CMake cruft, or the infamous
>> gorilla spit[1] of autotools) I don't see a good way out.
>
> I guess I miss the technical details about why it's not possible to
> _always_ bootstrap the build system files from autoconf/automake
> sources: do you have any reference documentation or technical article as
> a reference, please?

I didn't say it's not possible.  Someone's gotta start a branch and
build it all out.  There may be some annoyance closer to the bootstrap
origins (because we may not easily be able to run an approximation of
autotools or even VCS tools closer to the bootstrap seeds), but I think
we're already using custom Makefiles in some of these cases to simplify
bootstrapping.

It's just work.  Someone's gotta do it.  It's probably not super
complicated, but given the large number of packages we have it won't be
fast.

-- 
Ricardo



* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 19:48       ` Attila Lendvai
  2024-04-04 20:32         ` Ekaitz Zarraga
@ 2024-04-05 10:13         ` Giovanni Biscuolo
  2024-04-05 14:51           ` Attila Lendvai
  1 sibling, 1 reply; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-05 10:13 UTC (permalink / raw)
  To: Attila Lendvai, guix-security; +Cc: Guix Devel

Hi Attila and guix-security team,

Attila Lendvai <attila@lendvai.name> writes:

>> Are really "configure scripts containing hundreds of thousands of lines
>> of code not present in the upstream VCS" the norm?
>
> pretty much for all C and C++ projects that use autoconf... which is
> numerous, especially among the core GNU components.

OK, thank you for the confirmation.

[...]

>> ...or is it better to completely avoid release tarballs as our sources
>> uris?
>
> yes, and this^ would guarantee the previous point, but it's not always trivial.
>
> as an example see this: https://issues.guix.gnu.org/61750

[...]

> it breaks crosscompilation, because the host cannot execute the target
> binary.

OK thanks, I missed that.

In general, is there really no other solution for projects than to
distribute some artifacts "out of band" or to give up cross-compiling?!?

Are there other issues (different from the "host cannot execute target
binary") that make release tarballs indispensable for some upstream
projects?

AFAIU the only thing that /could/ "save" source tarballs is their
/scientific/ reproducibility.  In this direction there is a very
interesting patchset from Janneke Nieuwenhuizen that tries to get a
reproducible _Guix_ release tarball:

https://issues.guix.gnu.org/70169
«Reproducible `make dist' tarball in defiance of Autotools and Gettext»

Obviously having a reproducible tarball makes _practical_ the
"pragmatically impossible" task to reproduce a release tarball to check
if it corresponds to the same **build** (make dist) performed in the
official DVCS repo; only this could "save" all the "build software using
release tarball" workflow.

...but /in general/ here we are _downstream_, we have absolutely no
control over upstream, and it's _very_ unlikely that we'll see a *good*
solution to the tarball reproducibility problem applied "in the wild
upstream" soon.

I said "a *good* solution" because some proposals I'm reading about are
/bad/ _complications_ that absolutely do NOT solve the source
tarball reproducibility problem [1]; for example:

1. build the tarball on the RM host using a docker container
(unreproducible build) and call it "a reproducible release tarball":
https://medium.com/@lanoxx/creating-reproducible-release-tarballs-fa2e2ce745a7

2. have a CI system based on github actions [2] and call it "fully
verifiable": https://externals.io/message/122811#122814 (from
php.internals mailing list)

So, while "almost all the world" is applying _wrong_ solutions to the
source tarball reproducibility problem, what can Guix do?

Even if We™ (ehrm) find a solution to the source tarball reproducibility
problem (potentially allowing us to patch all the upstream makefiles
with specific phases in our package definitions) are we really going to
start our own (or one managed by the reproducible build community)
"reproducible source tarballs" repository?  Is this feasible?

I think there is no solution that can "pragmatically save" the source
tarballs of all the software packaged in Guix (and all other
distributions part of the reproducible builds effort).

> but these generated man files are part of the release tarball, so
> cross compilation works fine using the tarball.

AFAIU *in this case* there is an easy alternative: distribute the
(generated) man files as *code* tracked in the DVCS (e.g. git) repo
itself.  IMHO it's likely that this workflow can fix most if not all the
crosscompilation issues, no?

In general, AFAIU it's against reproducibility to distribute
pre-generated (compiled? transpiled?) artifacts in a tarball that are
not present in the official DVCS repo, especially when tarballs are
_not_ reproducible (and they are not in likely 99.9% of cases).

> all in all, just by following my gut instincts, i was advocating for
> building everything from git even before the exposure of this
> backdoor. in fact, i found it surprising as a guix newbie that not
> everything is built from git (or their VCS of choice).

Given the current situation so clearly exposed by the "xz backdoor"
case, this is something Guix should seriously consider.

I mean: Guix should seriously consider dropping source tarballs and
_also_ all pre-compiled artifacts distributed only via those tarballs.

I don't like this proposal, but I see no other "pragmatically possible"
solution.

AFAIU there is no need to rush, but I'm afraid that the class of attacks
we can call "supply-chain backdoor injection via source tarballs that
are pragmatically impossible to verify" is hard to deploy but
unfortunately not _too_ hard.

[...]

Thanks! Gio'


[1] this boils down to the unfortunate fact that "reproducibility" is a
very misunderstood concept [1.1], even by some very skilled (experienced?)
programmers

[1.1] because it's strictly related to good _redistribution_ of
_trusted_ software, not to good programming

[2]
https://docs.github.com/en/actions/learn-github-actions/understanding-github-actions#runners
«each workflow run executes in a fresh, newly-provisioned virtual machine.»
see also
https://www.paloaltonetworks.com/blog/prisma-cloud/unpinnable-actions-github-security/
for security concerns about GitHub actions relying on Docker containers
used for "reproducibility" purposes.

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-05 10:13         ` Giovanni Biscuolo
@ 2024-04-05 14:51           ` Attila Lendvai
  2024-04-13  7:42             ` Giovanni Biscuolo
  0 siblings, 1 reply; 33+ messages in thread
From: Attila Lendvai @ 2024-04-05 14:51 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: guix-security, Guix Devel

> Are there other issues (different from the "host cannot execute target
> binary") that make release tarballs indispensable for some upstream
> projects?


i didn't mean to say that tarballs are indispensable. i just wanted to point out that it's not as simple as going through each package definition and robotically changing the source origin from tarball to git repo. it costs some effort, but i don't mean to suggest that it's not worth doing.


> So, while "almost all the world" is applying wrong solutions to the
> source tarball reproducibility problem, what can Guix do?


AFAIU the plan is straightforward: change all package definitions to point to the (git) repos of the upstream, and ignore any generated ./configure script if one happens to be checked into the repo.

it involves quite some work, both in quantity, and also some thinking around surprises.

i think a good first step would be to reword the packaging guidelines in the doc to strongly prefer VCS sources instead of tarballs.


> Even if We™ (ehrm) find a solution to the source tarball reproducibility
> problem (potentially allowing us to patch all the upstream makefiles
> with specific phases in our package definitions) are we really going to
> start our own (or one managed by the reproducible build community)
> "reproducible source tarballs" repository? Is this feasible?


but why would that be any better than simply building from git? which, i think, would even take less effort.


> > but these generated man files are part of the release tarball, so
> > cross compilation works fine using the tarball.
> 
> 
> AFAIU in this case there is an easy alternative: distribute the
> (generated) man files as code tracked in the DVCS (e.g. git) repo
> itself.


yes, that would work in this case (although, that man page is guaranteed to go stale). my proposal was to simply drop the generated man file. it adds very little value (although it's not zero; web search, etc).

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“It is easy to be conspicuously 'compassionate' if others are being forced to pay the cost.”
	— Murray N. Rothbard (1926–1995)



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
                       ` (2 preceding siblings ...)
  2024-04-04 23:03     ` Ricardo Wurmus
@ 2024-04-05 16:52     ` Jan Wielkiewicz
  3 siblings, 0 replies; 33+ messages in thread
From: Jan Wielkiewicz @ 2024-04-05 16:52 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security, Felix Lechner, Ryan Prior

On Thu, 04 Apr 2024 12:34:42 +0200
Giovanni Biscuolo <g@xelera.eu> wrote:

> Hello everybody,
> 
> I know for sure that Guix maintainers and developers are working on
> this, I'm just asking to find some time to inform and possibly discuss
> with users (also in guix-devel) on what measures GNU Guix - the
> software distribution - can/should deploy to try to avoid this kind
> of attacks.

What about integrating ClamAV into the build farms (if this isn't a
thing already)? ClamAV could scan source files and freshly-built
packages and perhaps detect obvious malware. AFAIK it can also detect
CVEs. Guix already has ClamAV packaged so this shouldn't be that hard.

--

Jan Wielkiewicz


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 20:32         ` Ekaitz Zarraga
@ 2024-04-10 13:57           ` Ludovic Courtès
  2024-04-11 12:43             ` Andreas Enge
  2024-04-13  6:13             ` Giovanni Biscuolo
  0 siblings, 2 replies; 33+ messages in thread
From: Ludovic Courtès @ 2024-04-10 13:57 UTC (permalink / raw)
  To: Ekaitz Zarraga; +Cc: Attila Lendvai, Giovanni Biscuolo, Guix Devel

Hi,

Ekaitz Zarraga <ekaitz@elenq.tech> skribis:

> On 2024-04-04 21:48, Attila Lendvai wrote:
>> all in all, just by following my gut insctincts, i was advodating
>> for building everything from git even before the exposure of this
>> backdoor. in fact, i found it surprising as a guix newbie that not
>> everything is built from git (or their VCS of choice).
>
> That has happened to me too.
> Why not use Git directly always?

Because it create{s,d} a bootstrapping issue.  The
“builtin:git-download” method was added only recently to guix-daemon and
cannot be assumed to be available yet:

  https://issues.guix.gnu.org/65866

> In the bootstrapping it's also a problem, as all those tools
> (autotools) must be bootstrapped, and they require other programs
> (compilers) that actually use them. And we'll be forced to use git,
> too, or at least clone the bootstrapping repos, git-archive them
> ourselves and host them properly signed. At least, we could challenge
> them using git (similar to what we do with the substitutes), which we
> cannot do right now with the release tarballs against the actual code
> of the repository.

I think we should gradually move to building everything from
source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.

This has been suggested several times before.  The difficulty, as you
point out, will lie in addressing bootstrapping issues with core
packages: glibc, GCC, Binutils, Coreutils, etc.  I’m not sure how to do
that but…

> In live-bootstrap they just write the build scripts by hand, and
> ignore whatever the ./configure script says. That's also a reasonable
> way to tackle the bootstrapping, but it's a hard one. Thankfully, we
> are working together in this Bootstrapping effort so we can learn from
> them and adapt their recipes to our Guix commencement.scm module. This
> would be some effort, but it's actually doable.

… live-bootstrap can probably be a good source of inspiration to find a
way to build those core packages (or some of them) straight from a VCS
checkout.  And here the trick will be to find a way to do that in a
concise and maintainable way (generating config.h and Makefiles by hand
may prove unmaintainable in practice.)

Ludo’.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-10 13:57           ` Ludovic Courtès
@ 2024-04-11 12:43             ` Andreas Enge
  2024-04-11 12:56               ` Ekaitz Zarraga
                                 ` (2 more replies)
  2024-04-13  6:13             ` Giovanni Biscuolo
  1 sibling, 3 replies; 33+ messages in thread
From: Andreas Enge @ 2024-04-11 12:43 UTC (permalink / raw)
  To: Ludovic Courtès
  Cc: Ekaitz Zarraga, Attila Lendvai, Giovanni Biscuolo, Guix Devel

Hello,

Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès:
> I think we should gradually move to building everything from
> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.

the big drawback of this approach is that we would lose maintainers'
signatures, right?

Would the suggestion to use signed tarballs, but to autoreconf the
generated files, not be a better compromise between trusting and
distrusting upstream maintainers?

Andreas



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-11 12:43             ` Andreas Enge
@ 2024-04-11 12:56               ` Ekaitz Zarraga
  2024-04-11 13:49                 ` Andreas Enge
  2024-04-12 13:09               ` Attila Lendvai
  2024-04-12 20:42               ` Ludovic Courtès
  2 siblings, 1 reply; 33+ messages in thread
From: Ekaitz Zarraga @ 2024-04-11 12:56 UTC (permalink / raw)
  To: Andreas Enge, Ludovic Courtès
  Cc: Attila Lendvai, Giovanni Biscuolo, Guix Devel

Hi,

On 2024-04-11 14:43, Andreas Enge wrote:
> Hello,
> 
> Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès:
>> I think we should gradually move to building everything from
>> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
> 
> the big drawback of this approach is that we would lose maintainers'
> signatures, right?
> 
> Would the suggestion to use signed tarballs, but to autoreconf the
> generated files, not be a better compromise between trusting and
> distrusting upstream maintainers?
> 
> Andreas
> 

Probably not, because the release tarballs might contain code that is not 
present in the Git history and there are not that many eyes checking 
them. This time it was autoconf, but it might be anything else.
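
A minimal sketch of how such a discrepancy could be surfaced, assuming
both the release tarball and a `git archive` export of the matching tag
are available as tar data. The function names, the in-memory demo
archives and the strip_components convention are illustrative
assumptions, not existing Guix tooling:

```python
import hashlib
import io
import tarfile


def tar_digests(tar_bytes, strip_components=1):
    """Map each regular file in a tar archive to its SHA-256 digest.

    strip_components drops leading path elements, since release
    tarballs usually wrap everything in a "name-version/" directory.
    """
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = member.name.split("/")[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            digests["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return digests


def compare_sources(release, checkout):
    """Return (only_in_release, modified) relative to the VCS export."""
    only_in_release = sorted(set(release) - set(checkout))
    modified = sorted(path for path in release.keys() & checkout.keys()
                      if release[path] != checkout[path])
    return only_in_release, modified


def make_tar(files, prefix):
    """Build an uncompressed tar archive in memory (demo helper only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(f"{prefix}/{name}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical project: the release ships a generated ./configure
    # and an m4 file that differs from the one tracked in the VCS.
    checkout = make_tar({"configure.ac": b"AC_INIT\n",
                         "m4/check.m4": b"original\n"}, "proj")
    release = make_tar({"configure.ac": b"AC_INIT\n",
                        "m4/check.m4": b"tampered\n",
                        "configure": b"#! /bin/sh\n"}, "proj-1.0")
    only, modified = compare_sources(tar_digests(release),
                                     tar_digests(checkout))
    print("only in release:", only)
    print("differs from VCS:", modified)
```

Anything reported as present only in the release, or as differing from
the checkout, would still need a human look; legitimately generated
files such as ./configure show up in that list too.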

The maintainers' machines can be hijacked too... I think it's just 
better to obtain the exact same code that is easy to find and everybody 
is reading.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-11 12:56               ` Ekaitz Zarraga
@ 2024-04-11 13:49                 ` Andreas Enge
  2024-04-11 14:05                   ` Ekaitz Zarraga
                                     ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Andreas Enge @ 2024-04-11 13:49 UTC (permalink / raw)
  To: Ekaitz Zarraga
  Cc: Ludovic Courtès, Attila Lendvai, Giovanni Biscuolo,
	Guix Devel

Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga:
> I think it's just better to
> obtain the exact same code that is easy to find

The exact same code as what? Actually I often wonder when looking for
a project and end up with a Github repository how I could distinguish
the "original" from its clones in a VCS. With the signature by the
known (this may also be a wrong assumption, admittedly) maintainer
there is at least some form of assurance of origin.

> and everybody is reading.

This is a steep claim! I agree that nobody reads generated files in
a release tarball, but I am not sure how many other files are actually
read.

Andreas



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-11 13:49                 ` Andreas Enge
@ 2024-04-11 14:05                   ` Ekaitz Zarraga
  2024-04-13  0:14                   ` Skyler Ferris
  2024-04-13  6:50                   ` Giovanni Biscuolo
  2 siblings, 0 replies; 33+ messages in thread
From: Ekaitz Zarraga @ 2024-04-11 14:05 UTC (permalink / raw)
  To: Andreas Enge
  Cc: Ludovic Courtès, Attila Lendvai, Giovanni Biscuolo,
	Guix Devel

Hi,

>> and everybody is reading.
> 
> This is a steep claim! I agree that nobody reads generated files in
> a release tarball, but I am not sure how many other files are actually
> read.

Yea, it is. I'd also love to know how effective reading a 
release tarball is vs. a VCS repo. Quality of the reading is also very 
important. I simply don't even try to read a tarball, not having the 
history makes the understanding very difficult. If I find a piece of 
code that seems odd, I would like to `git blame` it and see what was the 
reason for the inclusion, who included it and so on.

It's not much, but it's better than nothing. Although, I'd understand if 
you told me the history might be misleading, too.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-11 12:43             ` Andreas Enge
  2024-04-11 12:56               ` Ekaitz Zarraga
@ 2024-04-12 13:09               ` Attila Lendvai
  2024-04-12 20:42               ` Ludovic Courtès
  2 siblings, 0 replies; 33+ messages in thread
From: Attila Lendvai @ 2024-04-12 13:09 UTC (permalink / raw)
  To: Andreas Enge
  Cc: Ludovic Courtès, Ekaitz Zarraga, Giovanni Biscuolo,
	Guix Devel

> > I think we should gradually move to building everything from
> > source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
> 
> 
> the big drawback of this approach is that we would lose maintainers'
> signatures, right?


it's possible to sign git commits and (annotated) tags, too.

it's good practice to enable signing by default.

admittedly though, few people sign all their commits, and even fewer sign their tags.

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Never appeal to a man's "better nature". He may not have one. Invoking his self-interest gives you more leverage.”
	— Robert Heinlein (1907–1988), 'Time Enough For Love' (1973)



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-11 12:43             ` Andreas Enge
  2024-04-11 12:56               ` Ekaitz Zarraga
  2024-04-12 13:09               ` Attila Lendvai
@ 2024-04-12 20:42               ` Ludovic Courtès
  2 siblings, 0 replies; 33+ messages in thread
From: Ludovic Courtès @ 2024-04-12 20:42 UTC (permalink / raw)
  To: Andreas Enge
  Cc: Ekaitz Zarraga, Attila Lendvai, Giovanni Biscuolo, Guix Devel

Hi!

Andreas Enge <andreas@enge.fr> skribis:

> Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès:
>> I think we should gradually move to building everything from
>> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
>
> the big drawback of this approach is that we would lose maintainers'
> signatures, right?

Yes.  But as Attila wrote, one can hope that they provide a way to
authenticate at least part of their VCS history, for example with signed
tags.  (Ideally everyone would use ‘guix git authenticate’ of course.)

> Would the suggestion to use signed tarballs, but to autoreconf the
> generated files, not be a better compromise between trusting and
> distrusting upstream maintainers?

IMO starting from an authenticated VCS checkout is clearer.

Ludo’.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-11 13:49                 ` Andreas Enge
  2024-04-11 14:05                   ` Ekaitz Zarraga
@ 2024-04-13  0:14                   ` Skyler Ferris
  2024-04-19 14:31                     ` Ludovic Courtès
  2024-04-13  6:50                   ` Giovanni Biscuolo
  2 siblings, 1 reply; 33+ messages in thread
From: Skyler Ferris @ 2024-04-13  0:14 UTC (permalink / raw)
  To: Andreas Enge, Ekaitz Zarraga
  Cc: Ludovic Courtès, Attila Lendvai, Giovanni Biscuolo,
	Guix Devel

Hi all,

On 4/11/24 06:49, Andreas Enge wrote:
> Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga:
>> I think it's just better to
>> obtain the exact same code that is easy to find
> The exact same code as what? Actually I often wonder when looking for
> a project and end up with a Github repository how I could distinguish
> the "original" from its clones in a VCS. With the signature by the
> known (this may also be a wrong assumption, admittedly) maintainer
> there is at least some form of assurance of origin.

I think this assumption deserves a lot more scrutiny than it typically 
gets (this is a general statement not particular to your message; even 
the tails project gets this part of security wrong and they are 
generally diligent in their efforts). I find it difficult to download 
PGP keys with any degree of confidence. Often, I see a file with a 
signature and a key served by the same web page, all coming from the 
same server. PGP keys are only useful if the attacker compromised the 
information that the user is receiving from the web page (for example, 
by gaining control of the web server or compromising the HTTPS session). 
In the typical scenario I have encountered, the attacker would also 
replace the key and signature with ones that they generated themselves.

In short, I'm not sure that we actually get any value from checking the 
PGP signature for most projects. Either HTTPS is good enough or the 
attacker won. 99% of the time HTTPS is good enough (though it is notable 
that the remaining 1% has a disproportionate impact on the affected 
population).

Some caveats:

It's difficult for me to use web of trust effectively because I haven't 
met anyone who uses PGP keys IRL. I'm ultimately trusting my internet 
connection and servers which are either semi-centralized (there are not 
that many open keyservers, it's an oligopoly for lack of a better term) 
or have the problem described above. So maybe everyone else is using web 
of trust effectively and I don't know what I'm talking about. =)

The key download could be compared to the "trust on first use" model 
that SSH uses. It's not clear to me how effective a simple text box 
saying "we rotated our keys so you need to re-download it!" would be, 
but I suspect that most people would download without a second thought. 
It might be interesting to add public keys and signature locations to 
package definitions and have Guix re-verify the signature when it 
downloads the source. This would provide more scrutiny when keys are 
rotated (because of the review process) and would prevent harm from the 
situation where the package author is re-downloading the key each time 
the software is updated.
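
A rough sketch of the pinning idea, assuming a package record could
carry the signature location and a digest of the reviewed key.
SourceRecord, its field names and the example.org URLs are
hypothetical, and the SHA-256 digest merely stands in for a real
OpenPGP fingerprint; actual signature verification (e.g. with gpg) is
left out:

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRecord:
    """Hypothetical fields a package definition might carry."""
    url: str
    signature_url: str
    pinned_key_digest: str  # stand-in for an OpenPGP fingerprint


def key_digest(key_bytes: bytes) -> str:
    # Stand-in only: real OpenPGP fingerprints are computed differently.
    return hashlib.sha256(key_bytes).hexdigest()


def check_key(record: SourceRecord, downloaded_key: bytes) -> bool:
    """Accept a key only if it matches the one that went through review."""
    return key_digest(downloaded_key) == record.pinned_key_digest


if __name__ == "__main__":
    reviewed_key = b"key material seen during review"
    record = SourceRecord(
        url="https://example.org/foo-1.0.tar.gz",
        signature_url="https://example.org/foo-1.0.tar.gz.sig",
        pinned_key_digest=key_digest(reviewed_key),
    )
    print(check_key(record, reviewed_key))              # same key
    print(check_key(record, b"freshly 'rotated' key"))  # different key
```

With something along these lines, a rotated key fails the check until
the pinned value is changed in the package definition, which forces the
rotation through review.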

The review process also adds a significant layer of protection because 
an attacker would need to compromise the HTTPS session of the reviewer 
in addition to the original package author (assuming that the signature 
is re-checked by the reviewer; I'm not sure how often this happens in 
practice). In principle it should be difficult for an attacker to 
predict who will be reviewing which issue. However, if the pool of 
reviewers is small it would be easier for the attacker to predict this 
or just compromise all of the reviewers. Also, if there was some way for 
the attacker to launch a general attack on people working out of the 
Guix repository then the value of this protection becomes negligible.

The above two paragraphs are somewhat at odds: if Guix has the public 
key baked in and knows where to download the signature, some reviewers 
might not double-check the key that they get from the website because 
Guix is doing it for them. On one hand, I generally think that 
automating security makes it worse because once it's automated there's a 
system of rules for attackers to manipulate. On the other hand, if we 
assume people aren't doing the things they need to then no amount of 
technical support will give us a secure system. How much is reasonable 
to expect of people? From my extremely biased perspective, it's 
difficult to say.

>> and everybody is reading.
> This is a steep claim! I agree that nobody reads generated files in
> a release tarball, but I am not sure how many other files are actually
> read.
>
> Andreas

I would guess that the level of the protection is strongly correlated 
with the popularity of the project among developers who need to add 
features or fix bugs. I don't think anybody reads a source repository 
"cover to cover", but we rummage around in the code on an as-needed 
basis. It would probably be difficult to sneak something into core 
projects like glibc or gcc, but pretty easy to sneak something into 
"emojis-but-cooler.js". It would be better to have comprehensive audits 
of all the projects, but that's not something Guix can manage by itself. 
It could make it easier to free up resources for that task, but I digress.

While it is hyperbolic to say that "with enough eyes, all bugs are 
shallow" there is a kernel of truth to it. There's a reason they hid the 
noticeably malicious macros in the release tarball.

In solidarity,
Skyler





^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-10 13:57           ` Ludovic Courtès
  2024-04-11 12:43             ` Andreas Enge
@ 2024-04-13  6:13             ` Giovanni Biscuolo
  1 sibling, 0 replies; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-13  6:13 UTC (permalink / raw)
  To: Ludovic Courtès, Ekaitz Zarraga; +Cc: Attila Lendvai, Guix Devel

Hello,

Ludovic Courtès <ludo@gnu.org> writes:

> Ekaitz Zarraga <ekaitz@elenq.tech> skribis:
>
>> On 2024-04-04 21:48, Attila Lendvai wrote:
>>> all in all, just by following my gut insctincts, i was advodating
>>> for building everything from git even before the exposure of this
>>> backdoor. in fact, i found it surprising as a guix newbie that not
>>> everything is built from git (or their VCS of choice).
>>
>> That has happened to me too.
>> Why not use Git directly always?
>
> Because it create{s,d} a bootstrapping issue.  The
> “builtin:git-download” method was added only recently to guix-daemon and
> cannot be assumed to be available yet:
>
>   https://issues.guix.gnu.org/65866

This fortunately will help a lot with the "everything built from git"
part of the "wishlist", but what about the non-zero occurrences of
"other upstream VCSs"?

[...]

> I think we should gradually move to building everything from
> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
>
> This has been suggested several times before.  The difficulty, as you
> point out, will lie in addressing bootstrapping issues with core
> packages: glibc, GCC, Binutils, Coreutils, etc.  I’m not sure how to do
> that but…

Does it have to be an "all or nothing" choice?  I mean "continue using
release tarballs" vs "use git" for "all"?

If using git is unfeasible for bootstrapping reasons [1], why not
continue using release tarballs with some _extra_ verification steps
and possibly add some automation steps to "lint" to help contributors
and committers check that there are not "quasi-binary" seeds [2] hidden
in release tarballs?
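
As a sketch of what such a "lint" helper might flag, assuming an
unpacked tarball is represented as a mapping from path to file content:
the list of autotools-generated names and the NUL-byte test for binary
data are simplistic heuristics chosen for illustration, not an existing
"guix lint" checker, and a stealthy seed could evade them:

```python
# Names commonly produced by Autoconf/Automake/Libtool rather than
# written by hand (illustrative, not exhaustive).
GENERATED_NAMES = {"configure", "Makefile.in", "aclocal.m4",
                   "config.h.in", "ltmain.sh"}


def classify(path, data):
    """Return the reasons a single file deserves extra scrutiny."""
    findings = []
    if path.rsplit("/", 1)[-1] in GENERATED_NAMES:
        findings.append("autotools-generated")
    if b"\0" in data[:8192]:  # crude "looks binary" heuristic
        findings.append("binary")
    return findings


def lint_tree(files):
    """files maps path -> bytes; return {path: findings} for flagged files."""
    report = {}
    for path, data in sorted(files.items()):
        found = classify(path, data)
        if found:
            report[path] = found
    return report


if __name__ == "__main__":
    tree = {
        "src/main.c": b"int main(void) { return 0; }\n",
        "configure": b"#! /bin/sh\n",
        "tests/files/good.xz": b"\xfd7zXZ\x00\x00",
    }
    for path, found in lint_tree(tree).items():
        print(path, found)
```

The report only points at files that nobody can pragmatically review;
deciding whether to regenerate or drop them remains a packaging
decision.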

WDYT?

[...]

Grazie! Gio'



[1] or other reasons specific to a package that should be documented
when needed, at least with a comment in the package definition

[2] the autogenerated files that are not pragmatically verifiable

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-11 13:49                 ` Andreas Enge
  2024-04-11 14:05                   ` Ekaitz Zarraga
  2024-04-13  0:14                   ` Skyler Ferris
@ 2024-04-13  6:50                   ` Giovanni Biscuolo
  2024-04-13 10:26                     ` Skyler Ferris
  2 siblings, 1 reply; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-13  6:50 UTC (permalink / raw)
  To: Andreas Enge, Ekaitz Zarraga
  Cc: Ludovic Courtès, Attila Lendvai, Guix Devel

Hello,

general reminder: please remember the specific scope of this (sub)thread

--8<---------------cut here---------------start------------->8---

 Please consider that this (sub)thread is _not_ specific to xz-utils but
 to the specific attack vector (matrix?) used to inject a backdoor in a
 binary during a build phase, in a _very_ stealthy way.

 Also, since Guix _is_ downstream, I'd like this (sub)thread to
 concentrate on what *Guix* can/should do to strengthen the build process
 /independently/ of what upstreams (or other distributions) can/should
 do.

--8<---------------cut here---------------end--------------->8---
(https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/)

...and if needed read that message again to understand the context,
please.

Andreas Enge <andreas@enge.fr> writes:

> Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga:
>> I think it's just better to
>> obtain the exact same code that is easy to find
>
> The exact same code as what?

As what is contained in the official tool used by upstream to track
their code, which is the one and _only_ place that is /pragmatically/
open to scrutiny by other upstream and _downstream_ contributors.

> Actually I often wonder when looking for a project and end up with a
> Github repository how I could distinguish the "original" from its
> clones in a VCS.

Actually it's a little bit of "intelligence work" but it's something
that usually downstream should really do: have a reasonable level of
trust that the origin is really the upstream one.

But here we are /brainstorming/ about the very issue that led to the
backdoor injection, and that issue is how to avoid "backdoor injections
via build subversion exploiting semi-binary seeds in release tarballs".
(see the scope above)

> With the signature by the known (this may also be a wrong assumption,
> admittedly) maintainer there is at least some form of assurance of
> origin.

We should definitely drop the idea of "trust by authority" as a
sufficient requisite for verifiability, which is one assumption behind
reproducible builds.

The XZ backdoor injection absolutely demonstrates that one and just one
_co-maintainer_ was able to hide a trojan in the _signed_ release
tarball and the payload in the git archive (as very obfuscated binary),
so it was _the origin_ that was "infected".

It's NOT important _who_ injected the backdoor (and it _was_ upstream),
but _how_.

In other words, we need a _pragmatic_ way (possibly with helping tools)
to "challenge the upstream authority" :-)

>> and everybody is reading.
>
> This is a steep claim! I agree that nobody reads generated files in
> a release tarball, but I am not sure how many other files are actually
> read.

Let's say that at least /someone/ should be _able_ to read the files,
but in the attack we are considering /no one/ is _pragmatically_ able to
read the (auto)generated semi-binary seeds in the release tarballs.

Security is a complex system, especially when considering the entire
supply chain: let's focus on this _specific_ weakness of the supply
chain. :-)


Ciao! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-05 14:51           ` Attila Lendvai
@ 2024-04-13  7:42             ` Giovanni Biscuolo
  0 siblings, 0 replies; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-13  7:42 UTC (permalink / raw)
  To: Attila Lendvai; +Cc: guix-security, Guix Devel

Hi Attila,

sorry for the delay in my reply,

I'm asking myself if this (sub)thread should be "condensed" in a
dedicated RFC (are RFCs official workflows in Guix, now?); if so, I
volunteer to file such an RFC in the next weeks.

Attila Lendvai <attila@lendvai.name> writes:

>> Are there other issues (different from the "host cannot execute target
>> binary") that makes relesase tarballs indispensable for some upstream
>> projects?
>
>
> i didn't mean to say that tarballs are indispensible. i just wanted to
> point out that it's not as simple as going through each package
> definition and robotically changing the source origin from tarball to
> git repo. it costs some effort, but i don't mean to suggest that it's
> not worth doing.

OK understood thanks!

[...]

> i think a good first step would be to reword the packaging guidelines
> in the doc to strongly prefer VCS sources instead of tarballs.

I agree.

>> Even if We™ (ehrm) find a solution to the source tarball reproducibility
>> problem (potentially allowing us to patch all the upstream makefiles
>> with specific phases in our packages definitions) are we really going to
>> start our own (or one managed by the reproducible build community)
>> "reproducible source tarballs" repository? Is this feaseable?
>
> but why would that be any better than simply building from git? which,
> i think, would even take less effort.

I agree, I was just brainstorming.

[...]

Thanks, Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-13  6:50                   ` Giovanni Biscuolo
@ 2024-04-13 10:26                     ` Skyler Ferris
  2024-04-13 12:47                       ` Giovanni Biscuolo
  0 siblings, 1 reply; 33+ messages in thread
From: Skyler Ferris @ 2024-04-13 10:26 UTC (permalink / raw)
  To: Giovanni Biscuolo, Andreas Enge, Ekaitz Zarraga
  Cc: Ludovic Courtès, Attila Lendvai, Guix Devel

Hi again,

On 4/12/24 23:50, Giovanni Biscuolo wrote:
> Hello,
>
> general reminder: please remember the specific scope of this (sub)thread
>
> --8<---------------cut here---------------start------------->8---
>
>   Please consider that this (sub)thread is _not_ specific to xz-utils but
>   to the specific attack vector (matrix?) used to inject a backdoor in a
>   binary during a build phase, in a _very_ stealthy way.
>
>   Also, since Guix _is_ downstream, I'd like this (sub)thread to
>   concentrate on what *Guix* can/should do to strenghten the build process
>   /independently/ of what upstreams (or other distributions) can/should
>   do.
>
> --8<---------------cut here---------------end--------------->8---
> (https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/)
>
> ...and if needed read that message again to understand the context,
> please.
>
>
I assume that this was an indirect response to the email I sent 
previously where I discussed the problems with PGP signatures on release 
files. I believe that this was in scope because of the discussion about 
whether to use VCS checkouts which lack signatures or release tarballs 
which have signatures. If the signatures on the release tarballs are not 
providing us with additional confidence then we are not losing anything 
by switching to the VCS checkout. Analysis of the effectiveness of what 
upstream projects are doing is relevant when trying to determine what we 
are capable of doing. I also pointed out that a change to Guix such as 
adding signature metadata to packages could help make up for problems 
with upstream workflows and how the review process provides additional 
confidence, demonstrating how this analysis is relevant to what we 
currently do or could possibly do. Please let me know if you think that this 
is incorrect.

Additionally, I need to correct something that I previously said. I 
stated this:

On 4/12/24 17:14, Skyler Ferris wrote:
> even the tails project gets this part of security wrong and they are generally diligent in their efforts

Without first double-checking the current state of the project. While 
this was true at one point, they have since updated their website and 
clearly explain the problem and what their new verification method is 
able to protect against at 
https://tails.net/contribute/design/download_verification/. I apologize 
for disseminating outdated information.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-13 10:26                     ` Skyler Ferris
@ 2024-04-13 12:47                       ` Giovanni Biscuolo
  2024-04-14 16:22                         ` Skyler Ferris
  0 siblings, 1 reply; 33+ messages in thread
From: Giovanni Biscuolo @ 2024-04-13 12:47 UTC (permalink / raw)
  To: Skyler Ferris; +Cc: Guix Devel

Hello Skyler,

Skyler Ferris <skyvine@protonmail.com> writes:

> On 4/12/24 23:50, Giovanni Biscuolo wrote:

>> general reminder: please remember the specific scope of this (sub)thread

[...]

>> (https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/)
>>
>> ...and if needed read that message again to understand the context,
>> please.
>>
> I assume that this was an indirect response to the email I sent 
> previously where I discussed the problems with PGP signatures on release 
> files.

No, believe me! I'm sorry I gave you this impression. :-)

> I believe that this was in scope

To be clear: I did not mean to say - even indirectly - that you
were out of scope _or_ that you did not understand the context.

Also, I really did not mean to /appear/ as the "coordinator" of this
(sub)thread and even less to /appear/ as the one who decides what's in
scope and what's OT; obviously everyone is absolutely free to decide
what is in scope and that she or he understood the context.

> because of the discussion about whether to use VCS checkouts which
> lack signatures or release tarballs which have signatures.

I still have not commented on what you discussed just because I lack time,
not interest; if I can I'll do it ASAP™ :-(

[...]

Thanks! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-13 12:47                       ` Giovanni Biscuolo
@ 2024-04-14 16:22                         ` Skyler Ferris
  0 siblings, 0 replies; 33+ messages in thread
From: Skyler Ferris @ 2024-04-14 16:22 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: Guix Devel

On 4/13/24 05:47, Giovanni Biscuolo wrote:
> Hello Skyler,
>
> Skyler Ferris <skyvine@protonmail.com> writes:
>
>> On 4/12/24 23:50, Giovanni Biscuolo wrote:
>>> general reminder: please remember the specific scope of this (sub)thread
> [...]
>
>>> (https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/)
>>>
>>> ...and if needed read that message again to understand the context,
>>> please.
>>>
>> I assume that this was an indirect response to the email I sent
>> previously where I discussed the problems with PGP signatures on release
>> files.
> No, believe me! I'm sorry I gave you this impression. :-)
>
>> I believe that this was in scope
> To be clear: I did not mean to say - even indirectly - that you
> were out of scope _or_ that you did not understand the context.
>
> Also, I really did not mean to /appear/ as the "coordinator" of this
> (sub)thread and even less to /appear/ as the one who decides what's in
> scope and what's OT; obviously everyone is absolutely free to decide
> what is in scope and that she or he understood the context.
>
>> because of the discussion about whether to use VCS checkouts which
>> lack signatures or release tarballs which have signatures.
> I still have not commented on what you discussed just because I lack time,
> not interest; if I can I'll do it ASAP™ :-(
>
> [...]
>
> Thanks! Gio'
>
Thanks for clarifying! Misunderstandings happen sometimes. I look
forward to hearing your thoughts if you're able to find time to share
them! =)




* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-13  0:14                   ` Skyler Ferris
@ 2024-04-19 14:31                     ` Ludovic Courtès
  0 siblings, 0 replies; 33+ messages in thread
From: Ludovic Courtès @ 2024-04-19 14:31 UTC (permalink / raw)
  To: Skyler Ferris
  Cc: Andreas Enge, Ekaitz Zarraga, Attila Lendvai, Giovanni Biscuolo,
	Guix Devel

Hi,

Skyler Ferris <skyvine@protonmail.com> skribis:

> In short, I'm not sure that we actually get any value from checking the 
> PGP signature for most projects. Either HTTPS is good enough or the 
> attacker won. 99% of the time HTTPS is good enough (though it is notable 
> that the remaining 1% has a disproportionate impact on the affected 
> population).

When checking PGP signatures, you end up with a trust-on-first-use
model: the first time, you download a PGP key that you know nothing
about and you authenticate code against that, which gives no
information.

On subsequent releases though, you can ensure (ideally) that releases
still originate from the same party.
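
As an illustration of the trust-on-first-use idea described above, here is a
minimal sketch in Python. It is not how Guix or any particular tool
implements this: the class name `TofuStore`, the `check` method and the
fingerprints are all hypothetical, and real signature verification (e.g. with
GnuPG) is assumed to have already produced the signer's fingerprint.

```python
class TofuStore:
    """Toy trust-on-first-use store mapping a project to a pinned key fingerprint.

    Hypothetical helper for illustration only; it assumes the signature was
    already verified elsewhere and only tracks *who* signed.
    """

    def __init__(self):
        self._pins = {}

    def check(self, project, signer_fingerprint):
        pinned = self._pins.get(project)
        if pinned is None:
            # First use: we know nothing about this key, so pinning it
            # gives no assurance about this particular release.
            self._pins[project] = signer_fingerprint
            return "first-use"
        if pinned == signer_fingerprint:
            # Later releases: same party as before.
            return "same-signer"
        # The signer changed: worth investigating before trusting.
        return "signer-changed"


store = TofuStore()
print(store.check("example-project", "AAAA 1111"))
print(store.check("example-project", "AAAA 1111"))
print(store.check("example-project", "BBBB 2222"))
```

The first call only records the key; it is the later calls that carry
information, which is the point made above about subsequent releases.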

HTTPS has nothing to do with that: it just proves that the web server
holds a valid certificate for its domain name.

But really, the gold standard, if I dare forego any form of modesty, is
the ‘.guix-authorizations’ model as it takes care of key distribution as
well as authorization delegation and revocation.

  https://doi.org/10.22152/programming-journal.org/2023/7/1
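
A rough way to picture that model, as a toy sketch in Python and not the
actual Guix implementation: assume each commit records who signed it and the
set of keys listed in its ‘.guix-authorizations’ file, and treat a commit as
authorized when its signer is listed in the authorizations of each of its
parents. The data layout, the function names and the key names below are all
assumptions made for illustration.

```python
def commit_is_authorized(commit, history):
    """True if the commit's signer is listed by every parent commit."""
    entry = history[commit]
    return all(entry["signer"] in history[p]["authorized"]
               for p in entry["parents"])


def authenticate(history, intro, tip):
    """Walk from tip back to the trusted introductory commit, checking each commit."""
    stack, seen = [tip], set()
    while stack:
        commit = stack.pop()
        if commit == intro or commit in seen:
            continue
        seen.add(commit)
        if not history[commit]["parents"]:
            return False  # reached a root that is not the trusted introduction
        if not commit_is_authorized(commit, history):
            return False
        stack.extend(history[commit]["parents"])
    return True


# Hypothetical history: alice delegates to bob in c1, bob revokes alice in c2.
history = {
    "intro": {"parents": [], "signer": "alice", "authorized": {"alice"}},
    "c1": {"parents": ["intro"], "signer": "alice", "authorized": {"alice", "bob"}},
    "c2": {"parents": ["c1"], "signer": "bob", "authorized": {"bob"}},
    "c3": {"parents": ["c2"], "signer": "alice", "authorized": {"bob"}},
}
print(authenticate(history, "intro", "c2"))
print(authenticate(history, "intro", "c3"))
```

Because the list of authorized keys travels with the history itself, adding
a key (delegation) and removing one (revocation) are ordinary commits that
must themselves be authorized.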

Ludo’.



end of thread, other threads:[~2024-04-19 14:32 UTC | newest]

Thread overview: 33+ messages
2024-03-29 20:57 Backdoor in upstream xz-utils John Kehayias
2024-03-29 17:51 ` Ryan Prior
2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
2024-03-29 20:55     ` Tomas Volf
2024-03-30 21:02       ` Ricardo Wurmus
2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
2024-04-04 15:12     ` Attila Lendvai
2024-04-04 16:47       ` Giovanni Biscuolo
2024-04-04 15:47     ` Giovanni Biscuolo
2024-04-04 19:48       ` Attila Lendvai
2024-04-04 20:32         ` Ekaitz Zarraga
2024-04-10 13:57           ` Ludovic Courtès
2024-04-11 12:43             ` Andreas Enge
2024-04-11 12:56               ` Ekaitz Zarraga
2024-04-11 13:49                 ` Andreas Enge
2024-04-11 14:05                   ` Ekaitz Zarraga
2024-04-13  0:14                   ` Skyler Ferris
2024-04-19 14:31                     ` Ludovic Courtès
2024-04-13  6:50                   ` Giovanni Biscuolo
2024-04-13 10:26                     ` Skyler Ferris
2024-04-13 12:47                       ` Giovanni Biscuolo
2024-04-14 16:22                         ` Skyler Ferris
2024-04-12 13:09               ` Attila Lendvai
2024-04-12 20:42               ` Ludovic Courtès
2024-04-13  6:13             ` Giovanni Biscuolo
2024-04-05 10:13         ` Giovanni Biscuolo
2024-04-05 14:51           ` Attila Lendvai
2024-04-13  7:42             ` Giovanni Biscuolo
2024-04-04 23:03     ` Ricardo Wurmus
2024-04-05  7:06       ` Giovanni Biscuolo
2024-04-05  7:39         ` Ricardo Wurmus
2024-04-05 16:52     ` Jan Wielkiewicz
2024-03-31 15:04 ` Backdoor in upstream xz-utils Rostislav Svoboda
