* Re: Backdoor in upstream xz-utils @ 2024-03-29 20:57 John Kehayias 2024-03-29 17:51 ` Ryan Prior 2024-03-31 15:04 ` Backdoor in upstream xz-utils Rostislav Svoboda 0 siblings, 2 replies; 34+ messages in thread From: John Kehayias @ 2024-03-29 20:57 UTC (permalink / raw) To: Felix Lechner; +Cc: Ryan Prior, Guix Devel, guix-security -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 -----BEGIN PGP SIGNATURE----- iQJRBAEBCgA7FiEEpCB7VsJVEJ8ssxV+SZCXrl6oFdkFAmYHK0sdHGpvaG4ua2Vo YXlpYXNAcHJvdG9ubWFpbC5jb20ACgkQSZCXrl6oFdkFRA//WaJMegtHd88wlq0V QovAYD7+d6zj5DxgVTiGKXckyKWx7AceVJ0WVp9MB+WxU8dEXepEnd9AHOA4v/Fb HLy4prms+noIpXqHW5y6EDgbMiBUX2rk6UVq7qnLCPujfv3hrJl4S7B5fJxjLSM/ M++F40YKc6PNSjQHi9BH5+Vl70jGCIzXNcomvEanu4SAsXLSlEwvOlnAPD57mb4k n4Tg4d7ExXjdi7/qdq/OnF2RGQjiLQ4qX7AeSu8kIaEaK3WdMy1JO1fy9vaZNuSg oCuUGJYCFj60BEYDQdUM8NiNe76zVzXvP/wKrR1XpqsnK9keKKEZpuZCQmJApgCJ dvVbrU8OfKPJ/B7CwNJu32FyrdgQt53ytYjNxs/cNNjB2ciDeIGszCzxwytRZz4k JEbE8VZrUACNvQXCdRbr1Jse1+FuM2hjTwILdia/A8GcWn9tfmfGdqlqOuw6c8qG hYX7l3+3t0c7VzLhgs2iE/BEKtUAYCrwRf+10J9dOm4TzmbEbg7+1j7FJcYhmIgJ qeEXistWXx7FY2Yl0UjrNtxi3UGR5rnx2hAb3zEcMoqcHHKuKz/X8aeMfIHryn23 rQms/cVwAPeR908xwbJgqkzQhY5A9DrU+0VGssILyXKvMYp6xTXJ6cf2gGLyhAFF VerlLVFCEHunNyWr94ZTeXr3p00= =dUKI -----END PGP SIGNATURE----- Hi Ryan, Felix, and guix-devel, On Fri, Mar 29, 2024 at 01:39 PM, Felix Lechner via Reports of security issues in Guix itself and in packages provided by Guix wrote: > Hi Ryan, > > On Fri, Mar 29 2024, Ryan Prior wrote: > >> I'm reading today that a backdoor is present in xz's upstream tarball >> (but not in git), starting at version 5.6.0. Source: >> <https://www.openwall.com/lists/oss-security/2024/03/29/4> > > Thanks for sending this! This is an extremely serious vulnerability > with criminal intent. I cc'd guix-security@gnu.org just in case you > haven't. > At least me (as part of guix-security) is aware and have been reading the analysis and further investigation. Both clever and interesting, but also worrisome. 
I think we were rather lucky this was found relatively quickly, though it seems to point to a bad actor and throws into question other projects (like libarchive) which have contributions from the same identity. Likely other accounts are involved too, so maybe on a positive side this unravels other issues. The discussion on Hacker News has also been informative (though rather long now): <https://news.ycombinator.com/item?id=39865810> >> Guix currently packages xz-utils 5.2.8 as "xz" using the upstream >> tarball. [...] Should we switch from using upstream tarballs to some >> fork with more responsible maintainers? > > Guix's habit of building from tarballs is a poor idea because tarballs > often differ. For example, maintainers may choose to ship a ./configure > script that is otherwise not present in Git (although a configure.ac > might be). Guix should build from Git. > We discussed a bit on #guix today about this. A movement to sourcing more directly from Git in general has been discussed before, though has some hurdles. I will let someone more knowledgeable about the details chime in, but yes, something we should do. Unfortunately in this case, while it seems the older versions don't have *this* exploit, given the perpetrator either is or has control over a maintainer account, it throws into question a lot more than the most recent version. We will have to keep a careful eye on this. I'm not currently aware of anything untoward for our current version, so far. >> Is there a way we can blacklist known bad versions? > I'm not sure what you mean, but I don't think so. The main danger is in guix time-machine to the past, as you are (purposefully) going to older versions of software. This is warned in the manual <https://guix.gnu.org/en/manual/devel/en/html_node/Invoking-guix-time_002dmachine.html> though we should perhaps do this at runtime as well. Even better would be if we can warn about known bad versions. 
Such a tool was started (guix health) here: <https://issues.guix.gnu.org/31444> Anyone up for reviving it, now that we have some changes that should make this more doable (based on a quick glance of more recent messages)? > Having said all that, I am not sure Guix is affected. > > On my systems, the 'detect.sh' script shows no referece to liblzma in > sshd. Everyone, please send additional reports. > Pretty sure we are not affected, at least with what is known: the exploit targets particular systems and things like argv[0] being /usr/sbin/sshd. A combination perhaps of who or what was being targeted as well as trying to make this harder to discover. Still, we should have an abundance of caution and pay close attention, as there is much we don't know and a history of commits to go through. As well as being suspicious in general of things like binary files added to a release tarball (as a project we always try to make sure there are no binary files anyway), this is a clear example of a clever/malicious way of causing harm. Please do feel free to report privately any concerns or potential affected packages to guix-security@gnu.org as well. And if you are interested in helping with these things, I'm sure we could rotate in some people for that team. Thanks all! An action-packed Friday. John ^ permalink raw reply [flat|nested] 34+ messages in thread
* Backdoor in upstream xz-utils
@ 2024-03-29 17:51 ` Ryan Prior
  2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
  0 siblings, 2 replies; 34+ messages in thread
From: Ryan Prior @ 2024-03-29 17:51 UTC
To: Guix Devel

I'm reading today that a backdoor is present in xz's upstream tarball (but not in git), starting at version 5.6.0. Source: https://www.openwall.com/lists/oss-security/2024/03/29/4

Guix currently packages xz-utils 5.2.8 as "xz" using the upstream tarball. Is there a way we can blacklist known bad versions? Should we switch from using upstream tarballs to some fork with more responsible maintainers?

Ryan
* Re: Backdoor in upstream xz-utils
  2024-03-29 17:51 ` Ryan Prior
@ 2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
  2024-03-29 20:55     ` Tomas Volf
  2024-04-04 10:34   ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
  1 sibling, 1 reply; 34+ messages in thread
From: Felix Lechner via Development of GNU Guix and the GNU System distribution. @ 2024-03-29 20:39 UTC
To: Ryan Prior, Guix Devel; +Cc: guix-security

Hi Ryan,

On Fri, Mar 29 2024, Ryan Prior wrote:

> I'm reading today that a backdoor is present in xz's upstream tarball
> (but not in git), starting at version 5.6.0. Source:
> https://www.openwall.com/lists/oss-security/2024/03/29/4

Thanks for sending this! This is an extremely serious vulnerability with criminal intent. I cc'd guix-security@gnu.org just in case you haven't.

> Guix currently packages xz-utils 5.2.8 as "xz" using the upstream
> tarball. [...] Should we switch from using upstream tarballs to some
> fork with more responsible maintainers?

Guix's habit of building from tarballs is a poor idea because tarballs often differ from what is in the Git repository. For example, maintainers may choose to ship a ./configure script that is otherwise not present in Git (although a configure.ac might be). Guix should build from Git.

> Is there a way we can blacklist known bad versions?

Having said all that, I am not sure Guix is affected.

On my systems, the 'detect.sh' script shows no reference to liblzma in sshd. Everyone, please send additional reports.

Kind regards
Felix
* Re: Backdoor in upstream xz-utils
  2024-03-29 20:39 ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
@ 2024-03-29 20:55   ` Tomas Volf
  2024-03-30 21:02     ` Ricardo Wurmus
  0 siblings, 1 reply; 34+ messages in thread
From: Tomas Volf @ 2024-03-29 20:55 UTC
To: Felix Lechner; +Cc: Ryan Prior, Guix Devel, guix-security

Hello,

On 2024-03-29 13:39:59 -0700, Felix Lechner via Development of GNU Guix and the GNU System distribution. wrote:
> > Is there a way we can blacklist known bad versions?
>
> Having said all that, I am not sure Guix is affected.
>
> On my systems, the 'detect.sh' script shows no reference to liblzma in
> sshd. Everyone, please send additional reports.

If nothing else, our xz is at 5.2.8. I think the question was whether there is a way to blacklist a specific known-bad tarball, to ensure no one updates to it by accident.

(I do not believe Guix would be vulnerable even when built from the malicious tarball, but that is a separate issue.)

Have a nice day,
Tomas

--
There are only two hard things in Computer Science: cache invalidation, naming things and off-by-one errors.
* Re: Backdoor in upstream xz-utils
  2024-03-29 20:55 ` Tomas Volf
@ 2024-03-30 21:02   ` Ricardo Wurmus
  0 siblings, 0 replies; 34+ messages in thread
From: Ricardo Wurmus @ 2024-03-30 21:02 UTC
To: Tomas Volf; +Cc: Felix Lechner, Ryan Prior, guix-security, guix-devel

Tomas Volf <~@wolfsden.cz> writes:

> On 2024-03-29 13:39:59 -0700, Felix Lechner via Development of GNU Guix and the GNU System distribution. wrote:
>> > Is there a way we can blacklist known bad versions?
>>
>> Having said all that, I am not sure Guix is affected.
>>
>> On my systems, the 'detect.sh' script shows no reference to liblzma in
>> sshd. Everyone, please send additional reports.
>
> If nothing else, our xz is at 5.2.8. I think the question was if there is a way
> to blacklist specific known tarball to ensure no-one updates to it by accident.

The properties field on a package definition can be used to record arbitrary information, which could be read by `guix lint`.

--
Ricardo
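To make the idea of a "known bad versions" check a bit more concrete, here is a minimal sketch in Python rather than Guile, meant only to show the shape such a lint pass might take. Everything in it is hypothetical: `KNOWN_BAD` and `lint_known_bad` are invented names, and nothing here reflects how `guix lint` or the package `properties` field is actually implemented. The two xz versions listed are the ones reported as backdoored upstream (CVE-2024-3094).

```python
# Hypothetical deny-list: package name -> versions recorded as known-bad.
# In Guix this information might instead live in a package's `properties`
# field; that mapping is an assumption, not an existing convention.
KNOWN_BAD = {
    "xz": {"5.6.0", "5.6.1"},
}


def lint_known_bad(name, version, known_bad=KNOWN_BAD):
    """Return a list of warnings; empty when the version is not listed."""
    if version in known_bad.get(name, set()):
        return [f"{name}@{version}: version is recorded as known-bad"]
    return []
```

With this sketch, `lint_known_bad("xz", "5.2.8")` yields no warnings, while an accidental update to 5.6.1 would be flagged before it is committed.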
* backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-03-29 17:51 ` Ryan Prior
  2024-03-29 20:39   ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
@ 2024-04-04 10:34   ` Giovanni Biscuolo
  2024-04-04 15:12     ` Attila Lendvai
                       ` (3 more replies)
  1 sibling, 4 replies; 34+ messages in thread
From: Giovanni Biscuolo @ 2024-04-04 10:34 UTC
To: Guix Devel, guix-security; +Cc: Felix Lechner, Ryan Prior

Hello everybody,

I know for sure that Guix maintainers and developers are working on this; I'm just asking them to find some time to inform users, and possibly discuss with them (also in guix-devel), what measures GNU Guix - the software distribution - can/should deploy to try to avoid this kind of attack.

Please consider that this (sub)thread is _not_ specific to xz-utils but to the specific attack vector (matrix?) used to inject a backdoor in a binary during a build phase, in a _very_ stealthy way.

Also, since Guix _is_ downstream, I'd like this (sub)thread to concentrate on what *Guix* can/should do to strengthen the build process /independently/ of what upstreams (or other distributions) can/should do.

First of all, I understand the xz backdoor attack was complex (both socially and technically) and all the details are still under scrutiny, but AFAIU the way the backdoor was injected, by "infecting" the **build phase** of the software (and obfuscating the payload in binaries), is very alarming and is something all distributions aiming at reproducible builds must examine very closely (and they actually _are_ examining it).

John Kehayias <john.kehayias@protonmail.com> writes:

[...]

> On Fri, Mar 29, 2024 at 01:39 PM, Felix Lechner via Reports of security issues in Guix itself and in packages provided by Guix wrote:
>
>> Hi Ryan,
>>
>> On Fri, Mar 29 2024, Ryan Prior wrote:

[...]
>>> Guix currently packages xz-utils 5.2.8 as "xz" using the upstream
>>> tarball. [...] Should we switch from using upstream tarballs to some
>>> fork with more responsible maintainers?
>>
>> Guix's habit of building from tarballs is a poor idea because tarballs
>> often differ.

First of all: can software be considered reproducible if it produces different binaries depending on whether it is compiled from the source code repository (git or another VCS) or from the official released source tarball?

My first thought is no.

>> For example, maintainers may choose to ship a ./configure script that
>> is otherwise not present in Git (although a configure.ac might be).
>> Guix should build from Git.

Two useful pointers explaining how the backdoor has been injected are [1] (general workflow) and [2] (payload obfuscation).

The first and *indispensable* condition for the attack to be successful is this:

--8<---------------cut here---------------start------------->8---
* The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.

[...]

* Explain dist tarballs, why we use them, what they do, link to autotools docs, etc

 * "Explaining the history of it would be very helpful I think. It also explains how a single person was able to insert code in an open source project that no one was able to peer review. It is pragmatically impossible, even if technically possible once you know the problem is there, to peer review a tarball prepared in this manner."
--8<---------------cut here---------------end--------------->8---
(from [1])

Let me highlight this: «It is pragmatically impossible [...]
to peer review a tarball prepared in this manner.»

There is no doubt that the release tarball is a much weaker "trusted source" (trusted by peer review, not by authority) than the upstream DVCS repository.

It's *very* noteworthy that the backdoor was discovered thanks to a performance issue and _not_ during a peer review of the source code... the _build_ code *is* source code, no?

It's not the first time a source release tarball of free software has been compromised [3], but the way the compromise worked in this case is something new (or at least never spotted before, right?).

> We discussed a bit on #guix today about this. A movement to sourcing
> more directly from Git in general has been discussed before, though
> has some hurdles.

Could someone knowledgeable about the details please describe the hurdles to sourcing from a DVCS (possibly one other than git)?

> I will let someone more knowledgeable about the details chime in, but
> yes, something we should do.

I'm definitely _not_ the knowledgeable one, but I'd like to share the results of my research.

Is it possible to enhance our build-system(s) (e.g. gnu-build-system) so they can /ignore/ pre-built .m4 or similar scripts and rebuild them during the build process?

Richard W.M. Jones on fedora-devel ML proposed [4]:

--8<---------------cut here---------------start------------->8---
(1) We should routinely delete autoconf-generated cruft from upstream projects and regenerate it in %prep. It is easier to study the real source rather than dig through the convoluted, generated shell script in an upstream './configure' looking for back doors. For most projects, just running "autoreconf -fiv" is enough.
--8<---------------cut here---------------end--------------->8--- There is an interesting bug report [5] about autoreconf: --8<---------------cut here---------------start------------->8--- While analyzing the recent xz backdoor hook into the build system [A], I noticed that one of the aspects why the hook worked was because it seems like «autoreconf -f -i» (that is run in Debian as part of dh-autoreconf via dh) still seems to take the serial into account, which was bumped in the tampered .m4 file. If either the gettext.m4 had gotten downgraded (to the version currently in Debian, which would not have pulled the tampered build-to-host.m4), or once Debian upgrades gettext, the build-to-host.m4 would get downgraded to the upstream clean version, then the hook would have been disabled and the backdoor would be inert. (Of course at that point the malicious actor would have found another way to hook into the build system, but the less avenues there are the better.) I've tried to search the list and checked for old bug reports on the debbugs.gnu.org site, but didn't notice anything. To me this looks like a very unexpected behavior, but it's not clear whether this is intentional or a bug. In any case regardless of either position, it would be good to improve this (either by fixing --force to force things even if downgrading, or otherwise perhaps to add a new option to really force everything). --8<---------------cut here---------------end--------------->8--- So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of tampered .m4 macros (and other possibly tampered build configuration script)? IMHO "ignoring" (deleting) pre-built build scripts in Guix build-system(s) should be considered... or is /already/ so? 
Also, I found this thread [6] interesting, especially this message [7] from Jacob Bachmeyer:

--8<---------------cut here---------------start------------->8---
The *user* could catch issues like this backdoor, since the backdoor appears (based on what I have read so far) to materialize certain object files while configure is running, while `find . -iname '*.o'` /should/ return nothing before make is run. This also suggests that running "make clean" after configure would kill at least this backdoor.
--8<---------------cut here---------------end--------------->8---

Something to apply in Guix gnu-build-system?

He also writes:

--8<---------------cut here---------------start------------->8---
A *very* observant (unreasonably so) user might notice that "make" did not build the objects that the backdoor provided.
--8<---------------cut here---------------end--------------->8---

Is there a way to enhance gnu-build-system in order to make it notice that some object was not built by make?

He then goes on to explain:

--8<---------------cut here---------------start------------->8---
Of course, an attacker could sneak around this as well by moving the process for unpacking the backdoor object to a Makefile rule, but that is more likely to "stick out" to an observant user, as well as being an easy target for automated analysis ("Which files have 'special' rules?") since you cannot obfuscate those from make(1) and expect them to still work.
--8<---------------cut here---------------end--------------->8---

Given the above observation that «it is pragmatically impossible [...] to peer review a tarball prepared in this manner», I strongly doubt that a possible Makefile tampering _in_the_release_tarball_ is easy to peer review; I'd ask: is such an "automated analysis" (see above) feasible in a dedicated build-system phase?
Anyway I'm asking myself: would a release tarball that is *possibly different from the official code in a DVCS*, but carries a *valid* GPG signature (please see [3]), really have been peer reviewed, or is that «pragmatically impossible»?

In other words: what if the backdoor was injected directly in the source code of the *official* release tarball signed with a valid GPG signature (and obviously with a valid sha256 hash)?

Do upstream developer communities peer review release tarballs, or do they "just" peer review the code in the official DVCS?

Also, in (info "(guix) origin Reference") I see that Guix packages can have a list of uri(s) for the origin of source code, see xz as an example [7]: are they intended to be multiple independent sources to be compared in order to prevent possible tampering, or are they "just" alternatives to be used if the first listed uri is unavailable?

If it is the former, a solution would be to specify multiple independent release tarballs for each package, so that it would be harder to compromise two release sources, but that is not something under Guix control.

All in all: should we really avoid the "pragmatically impossible to be peer reviewed" release tarballs?

WDYT?

Happy hacking! Gio'

[...]

[1] https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27 «FAQ on the xz-utils backdoor (CVE-2024-3094)» (constantly updated)

[2] https://gynvael.coldwind.pl/?lang=en&id=782 «xz/liblzma: Bash-stage Obfuscation Explained»

[3] e.g.
https://web.archive.org/web/20110708023004/http://www.h-online.com/open/news/item/Vsftpd-backdoor-discovered-in-source-code-update-1272310.html «Vsftpd backdoor discovered in source code - update» "a bad tarball had been downloaded from the vsftpd master site with an invalid GPG signature"

[4] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/YWMNOEJ34Q7QLBWQAB5TM6A2SVJFU4RV/ «Three steps we could take to make supply chain attacks a bit harder»

[5] https://lists.gnu.org/archive/html/bug-autoconf/2024-03/msg00000.html

[6] https://lists.gnu.org/archive/html/automake/2024-03/msg00007.html «GNU Coding Standards, automake, and the recent xz-utils backdoor»

[7] https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/compression.scm#n494

--8<---------------cut here---------------start------------->8---
(define-public xz
  (package
   (name "xz")
   (version "5.2.8")
   (source (origin
            (method url-fetch)
            (uri (list (string-append "http://tukaani.org/xz/xz-" version
                                      ".tar.gz")
                       (string-append "http://multiprecision.org/guix/xz-"
                                      version ".tar.gz")))
--8<---------------cut here---------------end--------------->8---

P.S.: in a way, I see this kind of attack as exploiting a form of statefulness of the build system; in this case "build-to-host.m4" was /state/. I think that build systems (too) should be stateless, and Guix is doing a great job to reach this goal.

--
Giovanni Biscuolo

Xelera IT Infrastructures
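The check Jacob Bachmeyer describes in the message above (no object files should exist in the tree before make has run) is simple enough to sketch. The following is a rough Python illustration, not an existing gnu-build-system phase: `find_prebuilt_objects` and the suffix list are assumptions made up for this example, and a real phase would presumably be written in Guile and run between the configure and build phases.

```python
import os

# Suffixes treated as "build products"; this list is an assumption and
# would need tuning per project (some ship test fixtures with such names).
OBJECT_SUFFIXES = (".o", ".lo", ".a", ".so")


def find_prebuilt_objects(source_dir, suffixes=OBJECT_SUFFIXES):
    """List files under source_dir that look like compiled objects.

    Meant to be called after ./configure and before make: at that point
    an honest source tree is expected to contain none.
    """
    hits = []
    for root, _dirs, files in os.walk(source_dir):
        for fname in files:
            # Case-insensitive match, like `find . -iname '*.o'`.
            if fname.lower().endswith(suffixes):
                full = os.path.join(root, fname)
                hits.append(os.path.relpath(full, source_dir))
    return sorted(hits)
```

A non-empty result would be a reason to stop the build and look closer. As the quoted message itself points out, an attacker could move the unpacking into a Makefile rule instead, so this only closes one avenue.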
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 10:34 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
@ 2024-04-04 15:12   ` Attila Lendvai
  2024-04-04 16:47     ` Giovanni Biscuolo
  2024-04-04 15:47   ` Giovanni Biscuolo
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Attila Lendvai @ 2024-04-04 15:12 UTC
To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security, Felix Lechner, Ryan Prior

> Also, in (info "(guix) origin Reference") I see that Guix packages can have a
> list of uri(s) for the origin of source code, see xz as an example [7]:
> are they intended to be multiple independent sources to be compared in
> order to prevent possible tampering or are they "just" alternatives to
> be used if the first listed uri is unavailable?

a source origin is identified by its cryptographic hash (stored in its sha256 field); i.e. it doesn't matter *where* the source archive was acquired from. if the hash matches the one in the package definition, then it's the same archive that the guix packager has seen while packaging.

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“We’ll know our disinformation program is complete when everything the American public believes is false.”
	-- William Casey (1913–1987), the director of CIA 1981-1987
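The point about content addressing can be illustrated with a short Python sketch. It is not Guix's `url-fetch` implementation: `fetch_verified` and the caller-supplied `fetch` callable are hypothetical, and the hash is compared as a hex digest for simplicity, whereas Guix package definitions write the sha256 in a base32 encoding.

```python
import hashlib


def fetch_verified(uris, expected_sha256, fetch):
    """Try each URI in order; return (uri, data) for the first match.

    `fetch` is any callable taking a URI and returning bytes (an
    assumption of this sketch). The URI list only provides fallbacks:
    what identifies the source is the hash, not the location.
    """
    for uri in uris:
        try:
            data = fetch(uri)
        except OSError:
            continue  # mirror unavailable, try the next one
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return uri, data
        # Content differs from what the packager saw: reject and move on.
    raise ValueError("no URI yielded content matching the expected hash")
```

Note what this does and does not establish: a match proves the bytes are the ones the packager hashed, wherever they came from, but it says nothing about whether that archive was trustworthy in the first place.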
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 15:12 ` Attila Lendvai
@ 2024-04-04 16:47   ` Giovanni Biscuolo
  0 siblings, 0 replies; 34+ messages in thread
From: Giovanni Biscuolo @ 2024-04-04 16:47 UTC
To: Attila Lendvai; +Cc: Guix Devel, guix-security

Hi Attila,

Attila Lendvai <attila@lendvai.name> writes:

>> Also, in (info "(guix) origin Reference") I see that Guix packages
>> can have a list of uri(s) for the origin of source code, see xz as an
>> example [7]: are they intended to be multiple independent sources to
>> be compared in order to prevent possible tampering or are they "just"
>> alternatives to be used if the first listed uri is unavailable?
>
> a source origin is identified by its cryptographic hash (stored in its
> sha256 field); i.e. it doesn't matter *where* the source archive was
> acquired from. if the hash matches the one in the package definition,
> then it's the same archive that the guix packager has seen while
> packaging.

Ehrm, you are right, mine was a stupid question :-)

We *are* already verifying that tarballs have not been tampered with... by anyone other than the release manager :-(

[...]

Happy hacking! Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-04 10:34 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo 2024-04-04 15:12 ` Attila Lendvai @ 2024-04-04 15:47 ` Giovanni Biscuolo 2024-04-04 19:48 ` Attila Lendvai 2024-04-04 23:03 ` Ricardo Wurmus 2024-04-05 16:52 ` Jan Wielkiewicz 3 siblings, 1 reply; 34+ messages in thread From: Giovanni Biscuolo @ 2024-04-04 15:47 UTC (permalink / raw) To: Guix Devel [-- Attachment #1: Type: text/plain, Size: 9052 bytes --] Hello, a couple of additional (IMO) useful resources... Giovanni Biscuolo <g@xelera.eu> writes: [...] > Let me highlight this: «It is pragmatically impossible [...] to peer > review a tarball prepared in this manner.» > > There is no doubt that the release tarball is a very weak "trusted > source" (trusted by peer review, not by authority) than the upstream > DVCS repository. This kind of attack was described by Daniel Stenberg in his «HOWTO backdoor curl» article in 2021.03.30 as "skip-git-altogether" method: https://daniel.haxx.se/blog/2021/03/30/howto-backdoor-curl/ --8<---------------cut here---------------start------------->8--- The skip-git-altogether methods As I’ve described above, it is really hard even for a skilled developer to write a backdoor and have that landed in the curl git repository and stick there for longer than just a very brief period. If the attacker instead can just sneak the code directly into a release archive then it won’t appear in git, it won’t get tested and it won’t get easily noticed by team members! curl release tarballs are made by me, locally on my machine. After I’ve built the tarballs I sign them with my GPG key and upload them to the curl.se origin server for the world to download. (Web users don’t actually hit my server when downloading curl. The user visible web site and downloads are hosted by Fastly servers.) 
An attacker that would infect my release scripts (which btw are also in the git repository) or do something to my machine could get something into the tarball and then have me sign it and then create the “perfect backdoor” that isn’t detectable in git and requires someone to diff the release with git in order to detect – which usually isn’t done by anyone that I know of.

[...]

I of course do my best to maintain proper login sanitation, updated operating systems and use of safe passwords and encrypted communications everywhere. But I’m also a human so I’m bound to do occasional mistakes.

Another way could be for the attacker to breach the origin download server and replace one of the tarballs there with an infected version, and hope that people skip verifying the signature when they download it or otherwise notice that the tarball has been modified. I do my best at maintaining server security to keep that risk to a minimum. Most people download the latest release, and then it’s enough if a subset checks the signature for the attack to get revealed sooner rather than later.
--8<---------------cut here---------------end--------------->8---

Unfortunately, in that section Stenberg misses one attack vector he mentioned in an earlier section of the article, named "The tricking a user method":

--8<---------------cut here---------------start------------->8---
We can even include more forced “convincing” such as direct threats against persons or their families: “push this code or else…”. This way of course cannot be protected against using 2fa, better passwords or things like that.
--8<---------------cut here---------------end--------------->8---

...and an attack vector involving more subtle ways (let's call it distributed social engineering) to convince the upstream developer and other contributors and/or third parties that they need a project co-maintainer authorized to publish _official_ release tarballs.
Following Stenberg's classification of attacks, since the supply-chain attack was intended to install a backdoor in the _sshd_ service, and _not_ in xz-utils or liblzma, we can classify this attack as:

skip-git-altogether to install a backdoor further-down-the-chain, precisely in a _dependency_ of the attacked one, during a period of "weakness" of the upstream maintainers.

Stenberg closes his article with this update and one related reply to a comment:

--8<---------------cut here---------------start------------->8---
Dependencies

Added after the initial post. Lots of people have mentioned that curl can get built with many dependencies and maybe one of those would be an easier or better target. Maybe they are, but they are products of their own individual projects and an attack on those projects/products would not be an attack on curl or backdoor in curl by my way of looking at it.

In the curl project we ship the source code for curl and libcurl and the users, the ones that builds the binaries from that source code will get the dependencies too.

[...]

Jean Hominal says: April 1, 2021 at 14:04

I think the big difference why you “missed” dependencies as an attack vector is because today, most application developers ship their dependencies in their application binaries (by linking statically or shipping a container) – in such a case, I would definitely count an attack on such a dependency, that is then shipped as part of the project’s artifacts, as a successful attack on the project.

However, as you only ship a source artifact – of course, dependencies *are* out of scope in your case.

Daniel Stenberg says: April 1, 2021 at 15:05

Jean: Right. I don’t want to dismiss the risk or the danger of an attack to a curl dependency. However, it is not possible for me or the curl project to keep them safe!
--8<---------------cut here---------------end--------------->8---

That leaves a number of open questions about some developers' attitude towards _distributing_ their software, but it's off-topic here IMO.

Anyway, let me highlight, again, the "pragmatically impossible peer review of release tarballs" argument; Stenberg says: «the “perfect backdoor” that isn’t detectable in git and requires someone to diff the release with git in order to detect – which usually isn’t done by anyone that I know of.»

[...]

> Is it possible to enhance our build-system(s) (e.g. gnu-build-system) so
> they can /ignore/ pre-built .m4 or similar script and rebuild them
> during the build process?

There is a related security issue for PHP [1], with an interesting thread on the php.internals mailing list (via externals.io [2]):

--8<---------------cut here---------------start------------->8---
Consider removing autogenerated files from tarballs

[...]

I believe that it would be a good idea to remove the huge attack surface offered by the pre-generated autoconf build scripts and lexers, offered in the release tarballs.

[...]

this injection mode makes sense, as extra files in the tarball not present in the git repo would raise suspicions, but machine-generated configure scripts containing hundreds of thousands of lines of code not present in the upstream VCS are the norm, and are usually not checked before execution.

[...]

Specifically in the case of PHP, along from the configure script, the tarball also bundles generated lexer files which contain actual C code, which is an additional attack vector

[...]

To prevent attacks from malevolent/compromised RMs, I propose completely removing all autogenerated files from the release tarballs, and ensuring their content exactly matches the content of the associated git tag

[...]

Of course this means that users will have to generate the build scripts when compiling PHP, as when installing PHP from the VCS repo.

[...]
Distros like arch linux already re-generate the configure scripts from
scratch, but I believe that no distinction should be made, everyone
should get a tarball containing only the bare source code, without
leaving to the user the choice to re-generate the build files, or use a
potentially compromised build script.

[...] The current standard way of distributing generated configure files
in tarballs is precisely what allowed the xz supply chain attack to go
unnoticed for so long. I strongly believe all projects using autotools,
including PHP, should switch away from this "standard" way of doing
things.

[...] when a user downloads a source tarball, there's a false sense of
security rooted in the mistaken belief that the source code in the
tarball matches the one distributed in the VCS, but in reality, the
tarball also contains potentially malicious semi-compiled blobs, not
present in the VCS.

--8<---------------cut here---------------end--------------->8---

Are "configure scripts containing hundreds of thousands of lines of code
not present in the upstream VCS" really the norm?

If so, can we consider the hundreds of thousands of lines of configure
scripts and other (auto)generated files bundled in release tarballs
"pragmatically impossible" to peer review?

Can we consider those artifacts as sort-of-binary and "force" our
build-systems to _regenerate_ *all* of them?

...or is it better to completely avoid release tarballs as our source
URIs?

[...]

Thanks, Gio'

[1] https://github.com/php/php-src/issues/13838
[2] https://externals.io/message/122811

--
Giovanni Biscuolo

Xelera IT Infrastructures
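The "diff the release with git" check that Stenberg mentions can be
sketched in a few lines. This is only an illustration, not something
Guix does today: it assumes the release tarball has already been
unpacked and the matching tag checked out, and the names file_digests
and compare_trees are made up for this example.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip VCS metadata: it is never part of a release tarball.
        if path.is_file() and ".git" not in relative.parts:
            digests[relative.as_posix()] = hashlib.sha256(
                path.read_bytes()
            ).hexdigest()
    return digests


def compare_trees(tarball_root: Path, checkout_root: Path) -> dict[str, list[str]]:
    """Report files that exist on one side only, or whose content differs."""
    tarball = file_digests(tarball_root)
    checkout = file_digests(checkout_root)
    return {
        "only_in_tarball": sorted(set(tarball) - set(checkout)),
        "only_in_checkout": sorted(set(checkout) - set(tarball)),
        "differing": sorted(
            name
            for name in set(tarball) & set(checkout)
            if tarball[name] != checkout[name]
        ),
    }
```

For an autotools project the "only_in_tarball" list is expected to be
long (configure, Makefile.in and friends), which is exactly the review
problem discussed above: the report tells you which files are not in the
VCS, not whether they are harmless.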
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
From: Attila Lendvai @ 2024-04-04 19:48 UTC
To: Giovanni Biscuolo; +Cc: Guix Devel

> Are really "configure scripts containing hundreds of thousands of lines
> of code not present in the upstream VCS" the norm?

pretty much, for all C and C++ projects that use autoconf... and there
are many of those, especially among the core GNU components.

> If so, can we consider hundreds of thousand of lines of configure
> scripts and other (auto)generated files bundled in release tarballs
> "pragmatically impossible" to be peer reviewed?

yes.

> Can we consider that artifacts as sort-of-binary and "force" our
> build-systems to regenerate all them?

that would be a good practice.

> ...or is it better to completely avoid release tarballs as our sources
> uris?

yes, and this^ would guarantee the previous point, but it's not always
trivial.

as an example see this: https://issues.guix.gnu.org/61750

in short: when building shepherd from git the man files need to be
generated using the program help2man. this invokes the binary with
--help and formats the output as a man page.

the usefulness of this is questionable, but the point is that it breaks
cross-compilation, because the host cannot execute the target binary.

but these generated man files are part of the release tarball, so
cross-compilation works fine using the tarball.

all in all, just by following my gut instincts, i was advocating for
building everything from git even before the exposure of this backdoor.
in fact, i found it surprising as a guix newbie that not everything is built from git (or their VCS of choice). -- • attila lendvai • PGP: 963F 5D5F 45C7 DFCD 0A39 -- “For if you [the rulers] suffer your people to be ill-educated, and their manners to be corrupted from their infancy, and then punish them for those crimes to which their first education disposed them, what else is to be concluded from this, but that you first make thieves [and outlaws] and then punish them.” — Sir Thomas More (1478–1535), 'Utopia', Book 1 ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-04 19:48 ` Attila Lendvai @ 2024-04-04 20:32 ` Ekaitz Zarraga 2024-04-10 13:57 ` Ludovic Courtès 2024-04-05 10:13 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo 1 sibling, 1 reply; 34+ messages in thread From: Ekaitz Zarraga @ 2024-04-04 20:32 UTC (permalink / raw) To: Attila Lendvai, Giovanni Biscuolo; +Cc: Guix Devel Hi, I just want to add some perspective from the bootstrapping. On 2024-04-04 21:48, Attila Lendvai wrote: > > all in all, just by following my gut insctincts, i was advodating for building everything from git even before the exposure of this backdoor. in fact, i found it surprising as a guix newbie that not everything is built from git (or their VCS of choice). That has happened to me too. Why not use Git directly always? In the bootstrapping it's also a problem, as all those tools (autotools) must be bootstrapped, and they require other programs (compilers) that actually use them. And we'll be forced to use git, too, or at least clone the bootstrapping repos, git-archive them ourselves and host them properly signed. At least, we could challenge them using git (similar to what we do with the substitutes), which we cannot do right now with the release tarballs against the actual code of the repository. In live-bootstrap they just write the build scripts by hand, and ignore whatever the ./configure script says. That's also a reasonable way to tackle the bootstrapping, but it's a hard one. Thankfully, we are working together in this Bootstrapping effort so we can learn from them and adapt their recipes to our Guix commencement.scm module. This would be some effort, but it's actually doable. Hope this adds something useful to the discussion, Ekaitz ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-04 20:32 ` Ekaitz Zarraga @ 2024-04-10 13:57 ` Ludovic Courtès 2024-04-11 12:43 ` Andreas Enge ` (2 more replies) 0 siblings, 3 replies; 34+ messages in thread From: Ludovic Courtès @ 2024-04-10 13:57 UTC (permalink / raw) To: Ekaitz Zarraga; +Cc: Attila Lendvai, Giovanni Biscuolo, Guix Devel Hi, Ekaitz Zarraga <ekaitz@elenq.tech> skribis: > On 2024-04-04 21:48, Attila Lendvai wrote: >> all in all, just by following my gut insctincts, i was advodating >> for building everything from git even before the exposure of this >> backdoor. in fact, i found it surprising as a guix newbie that not >> everything is built from git (or their VCS of choice). > > That has happened to me too. > Why not use Git directly always? Because it create{s,d} a bootstrapping issue. The “builtin:git-download” method was added only recently to guix-daemon and cannot be assumed to be available yet: https://issues.guix.gnu.org/65866 > In the bootstrapping it's also a problem, as all those tools > (autotools) must be bootstrapped, and they require other programs > (compilers) that actually use them. And we'll be forced to use git, > too, or at least clone the bootstrapping repos, git-archive them > ourselves and host them properly signed. At least, we could challenge > them using git (similar to what we do with the substitutes), which we > cannot do right now with the release tarballs against the actual code > of the repository. I think we should gradually move to building everything from source—i.e., fetching code from VCS and adding Autoconf & co. as inputs. This has been suggested several times before. The difficulty, as you point out, will lie in addressing bootstrapping issues with core packages: glibc, GCC, Binutils, Coreutils, etc. 
I’m not sure how to do that but… > In live-bootstrap they just write the build scripts by hand, and > ignore whatever the ./configure script says. That's also a reasonable > way to tackle the bootstrapping, but it's a hard one. Thankfully, we > are working together in this Bootstrapping effort so we can learn from > them and adapt their recipes to our Guix commencement.scm module. This > would be some effort, but it's actually doable. … live-bootstrap can probably be a good source of inspiration to find a way to build those core packages (or some of them) straight from a VCS checkout. And here the trick will be to find a way to do that in a concise and maintainable way (generating config.h and Makefiles by hand may prove unmaintainable in practice.) Ludo’. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-10 13:57 ` Ludovic Courtès @ 2024-04-11 12:43 ` Andreas Enge 2024-04-11 12:56 ` Ekaitz Zarraga ` (2 more replies) 2024-04-13 6:13 ` Giovanni Biscuolo 2024-05-07 18:22 ` 3 kinds of bootstrap (was Re: backdoor injection via release tarballs combined with binary artifacts) Simon Tournier 2 siblings, 3 replies; 34+ messages in thread From: Andreas Enge @ 2024-04-11 12:43 UTC (permalink / raw) To: Ludovic Courtès Cc: Ekaitz Zarraga, Attila Lendvai, Giovanni Biscuolo, Guix Devel Hello, Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès: > I think we should gradually move to building everything from > source—i.e., fetching code from VCS and adding Autoconf & co. as inputs. the big drawback of this approach is that we would lose maintainers' signatures, right? Would the suggestion to use signed tarballs, but to autoreconf the generated files, not be a better compromise between trusting and distrusting upstream maintainers? Andreas ^ permalink raw reply [flat|nested] 34+ messages in thread
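To make the "keep the signed tarball but regenerate the generated files"
idea concrete, here is a rough sketch of how one might list the
candidates for regeneration in an unpacked tarball. The marker strings
are the header comments I believe Autoconf, Automake and aclocal write
into their output; treat them, and the helper name find_generated_files,
as assumptions of this example rather than a complete or authoritative
list.

```python
from pathlib import Path

# File name -> header marker assumed to be written by the generating tool.
GENERATED_MARKERS = {
    "configure": b"Generated by GNU Autoconf",
    "Makefile.in": b"generated by automake",
    "aclocal.m4": b"generated automatically by aclocal",
}


def find_generated_files(root: Path, header_bytes: int = 4096) -> list[str]:
    """List files under root that look like autotools output.

    Only the first header_bytes of each candidate are inspected, since
    the marker comment normally sits at the very top of the file.
    """
    found = []
    for path in sorted(root.rglob("*")):
        marker = GENERATED_MARKERS.get(path.name)
        if marker is None or not path.is_file():
            continue
        with path.open("rb") as handle:
            if marker in handle.read(header_bytes):
                found.append(path.relative_to(root).as_posix())
    return found
```

Note the limits of such a filter: it only recognises files that announce
themselves as generated. Anything else that is shipped in the tarball
but absent from the VCS would pass straight through, so this would
complement a tarball-versus-checkout comparison, not replace it.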
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
From: Ekaitz Zarraga @ 2024-04-11 12:56 UTC
To: Andreas Enge, Ludovic Courtès
Cc: Attila Lendvai, Giovanni Biscuolo, Guix Devel

Hi,

On 2024-04-11 14:43, Andreas Enge wrote:
> Hello,
>
> On Wed, Apr 10, 2024 at 03:57:20PM +0200, Ludovic Courtès wrote:
>> I think we should gradually move to building everything from
>> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs.
>
> the big drawback of this approach is that we would lose maintainers'
> signatures, right?
>
> Would the suggestion to use signed tarballs, but to autoreconf the
> generated files, not be a better compromise between trusting and
> distrusting upstream maintainers?
>
> Andreas
>

Probably not, because the release tarballs might contain code that is
not present in the Git history and there are not that many eyes checking
them. This time it was autoconf, but it might be anything else.

The maintainers' machines can be hijacked too...

I think it's just better to obtain the exact same code that is easy to
find and everybody is reading.
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-11 12:56 ` Ekaitz Zarraga @ 2024-04-11 13:49 ` Andreas Enge 2024-04-11 14:05 ` Ekaitz Zarraga ` (2 more replies) 0 siblings, 3 replies; 34+ messages in thread From: Andreas Enge @ 2024-04-11 13:49 UTC (permalink / raw) To: Ekaitz Zarraga Cc: Ludovic Courtès, Attila Lendvai, Giovanni Biscuolo, Guix Devel Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga: > I think it's just better to > obtain the exact same code that is easy to find The exact same code as what? Actually I often wonder when looking for a project and end up with a Github repository how I could distinguish the "original" from its clones in a VCS. With the signature by the known (this may also be a wrong assumption, admittedly) maintainer there is at least some form of assurance of origin. > and everybody is reading. This is a steep claim! I agree that nobody reads generated files in a release tarball, but I am not sure how many other files are actually read. Andreas ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-11 13:49 ` Andreas Enge @ 2024-04-11 14:05 ` Ekaitz Zarraga 2024-04-13 0:14 ` Skyler Ferris 2024-04-13 6:50 ` Giovanni Biscuolo 2 siblings, 0 replies; 34+ messages in thread From: Ekaitz Zarraga @ 2024-04-11 14:05 UTC (permalink / raw) To: Andreas Enge Cc: Ludovic Courtès, Attila Lendvai, Giovanni Biscuolo, Guix Devel Hi, >> and everybody is reading. > > This is a steep claim! I agree that nobody reads generated files in > a release tarball, but I am not sure how many other files are actually > read. Yea, it is. I'd also love to know how effective is the reading in a release tarball vs a VCS repo. Quality of the reading is also very important. I simply don't even try to read a tarball, not having the history makes the understanding very difficult. If I find a piece of code that seems odd, I would like to `git blame` it and see what was the reason for the inclusion, who included it and so on. It's not much, but it's better than nothing. Although, I'd understand if you told me the history might be misleading, too. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-11 13:49 ` Andreas Enge 2024-04-11 14:05 ` Ekaitz Zarraga @ 2024-04-13 0:14 ` Skyler Ferris 2024-04-19 14:31 ` Ludovic Courtès 2024-04-13 6:50 ` Giovanni Biscuolo 2 siblings, 1 reply; 34+ messages in thread From: Skyler Ferris @ 2024-04-13 0:14 UTC (permalink / raw) To: Andreas Enge, Ekaitz Zarraga Cc: Ludovic Courtès, Attila Lendvai, Giovanni Biscuolo, Guix Devel Hi all, On 4/11/24 06:49, Andreas Enge wrote: > Am Thu, Apr 11, 2024 at 02:56:24PM +0200 schrieb Ekaitz Zarraga: >> I think it's just better to >> obtain the exact same code that is easy to find > The exact same code as what? Actually I often wonder when looking for > a project and end up with a Github repository how I could distinguish > the "original" from its clones in a VCS. With the signature by the > known (this may also be a wrong assumption, admittedly) maintainer > there is at least some form of assurance of origin. I think this assumption deserves a lot more scrutiny than it typically gets (this is a general statement not particular to your message; even the tails project gets this part of security wrong and they are generally diligent in their efforts). I find it difficult to download PGP keys with any degree of confidence. Often, I see a file with a signature and a key served by the same web page, all coming from the same server. PGP keys are only useful if the attacker compromised the information that the user is receiving from the web page (for example, by gaining control of the web server or compromising the HTTPS session). In the typical scenario I have encountered, the attacker would also replace the key and signature with ones that they generated themself. In short, I'm not sure that we actually get any value from checking the PGP signature for most projects. Either HTTPS is good enough or the attacker won. 
99% of the time HTTPS is good enough (though it is notable that the remaining 1% has a disproportionate impact on the affected population). Some caveats: It's difficult for me to use web of trust effectively because I haven't met anyone who uses PGP keys IRL. I'm ultimately trusting my internet connection and servers which are either semi-centralized (there are not that many open keyservers, it's an oligopoly for lack of a better term) or have the problem described above. So maybe everyone else is using web of trust effectively and I don't know what I'm talking about. =) The key download could be compared to the "trust on first use" model that SSH uses. It's not clear to me how effective a simple text box saying "we rotated our keys so you need to re-download it!" would be, but I suspect that most people would download without a second thought. It might be interesting to add public keys and signature locations to package definitions and have Guix re-verify the signature when it downloads the source. This would provide more scrutiny when keys are rotated (because of the review process) and would prevent harm from the situation where the package author is re-downloading the key each time the software is updated. The review process also adds a significant layer of protection because an attacker would need to compromise the HTTPS session of the reviewer in addition to the original package author (assuming that the signature is re-checked by the reviewer; I'm not sure how often this happens in practice). In principle it should be difficult for an attacker to predict who will be reviewing which issue. However, if the pool of reviewers is small it would be easier for the attacker to predict this or just compromise all of the reviewers. Also, if there was some way for the attacker to launch a general attack on people working out of the Guix repository then the value of this protection becomes negligible. 
The above two paragraphs are somewhat at odds: if Guix has the public key baked in and knows where to download the signature, some reviewers might not double-check the key that they get from the website because Guix is doing it for them. On one hand, I generally think that automating security makes it worse because once it's automated there's a system of rules for attackers to manipulate. On the other hand, if we assume people aren't doing the things they need to then no amount of technical support will give us a secure system. How much is reasonable to expect of people? From my extremely biased perspective, it's difficult to say. >> and everybody is reading. > This is a steep claim! I agree that nobody reads generated files in > a release tarball, but I am not sure how many other files are actually > read. > > Andreas I would guess that the level of the protection is strongly correlated with the popularity of the project among developers who need to add features or fix bugs. I don't think anybody reads a source repository "cover to cover", but we rummage around in the code on an as-needed basis. It would probably be difficult to sneak something into core projects like glibc or gcc, but pretty easy to sneak something into "emojis-but-cooler.js". It would be better to have comprehensive audits of all the projects, but that's not something Guix can manage by itself. It could make it easier to free up resources for that task, but I digress. While it is hyperbolic to say that "with enough eyes, all bugs are shallow" there is a kernel of truth to it. There's a reason they hid the noticeably malicious macros in the release tarball. In solidarity, Skyler ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-13 0:14 ` Skyler Ferris @ 2024-04-19 14:31 ` Ludovic Courtès 0 siblings, 0 replies; 34+ messages in thread From: Ludovic Courtès @ 2024-04-19 14:31 UTC (permalink / raw) To: Skyler Ferris Cc: Andreas Enge, Ekaitz Zarraga, Attila Lendvai, Giovanni Biscuolo, Guix Devel Hi, Skyler Ferris <skyvine@protonmail.com> skribis: > In short, I'm not sure that we actually get any value from checking the > PGP signature for most projects. Either HTTPS is good enough or the > attacker won. 99% of the time HTTPS is good enough (though it is notable > that the remaining 1% has a disproportionate impact on the affected > population). When checking PGP signatures, you end up with a trust-on-first-use model: the first time, you download a PGP key that you know nothing about and you authenticate code against that, which gives no information. On subsequent releases though, you can ensure (ideally) that releases still originates from the same party. HTTPS has nothing to do with that: it just proves that the web server holds a valid certificate for its domain name. But really, the gold standard, if I dare forego any form of modesty, is the ‘.guix-authorizations’ model as it takes care of key distribution as well as authorization delegation and revocation. https://doi.org/10.22152/programming-journal.org/2023/7/1 Ludo’. ^ permalink raw reply [flat|nested] 34+ messages in thread
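The trust-on-first-use behaviour described above can be sketched as a
small key-pinning helper. This is a toy illustration only: it does not
verify any signature itself, it assumes the signer's key fingerprint has
already been extracted by a real OpenPGP tool, and the pin file layout
and the name check_signer are invented for this example. It is not how
'.guix-authorizations' works.

```python
import json
from pathlib import Path


def check_signer(pin_file: Path, project: str, fingerprint: str) -> str:
    """Compare a release signer's fingerprint with the pinned one.

    Returns "first-use" when nothing was pinned yet (the fingerprint is
    then recorded, which by itself proves nothing about the signer),
    "match" when the release comes from the pinned key, and "mismatch"
    otherwise. A mismatch never overwrites the pin: a key rotation has
    to be reviewed and re-pinned by a human.
    """
    pins = json.loads(pin_file.read_text()) if pin_file.exists() else {}
    known = pins.get(project)
    if known is None:
        pins[project] = fingerprint
        pin_file.write_text(json.dumps(pins, indent=2, sort_keys=True))
        return "first-use"
    return "match" if known == fingerprint else "mismatch"
```

The first call gives no assurance at all, which is Skyler's point; the
value only shows up on later releases, which is Ludovic's. Keeping such
a pin in a reviewed place would also cover the "we rotated our keys,
please re-download" scenario mentioned earlier in the thread.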
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
From: Giovanni Biscuolo @ 2024-04-13 6:50 UTC
To: Andreas Enge, Ekaitz Zarraga
Cc: Ludovic Courtès, Attila Lendvai, Guix Devel

Hello,

general reminder: please remember the specific scope of this (sub)thread

--8<---------------cut here---------------start------------->8---

Please consider that this (sub)thread is _not_ specific to xz-utils but
to the specific attack vector (matrix?) used to inject a backdoor in a
binary during a build phase, in a _very_ stealthy way.

Also, since Guix _is_ downstream, I'd like this (sub)thread to
concentrate on what *Guix* can/should do to strengthen the build process
/independently/ of what upstreams (or other distributions) can/should
do.

--8<---------------cut here---------------end--------------->8---
(https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/)

...and if needed read that message again to understand the context,
please.

Andreas Enge <andreas@enge.fr> writes:

> On Thu, Apr 11, 2024 at 02:56:24PM +0200, Ekaitz Zarraga wrote:
>> I think it's just better to
>> obtain the exact same code that is easy to find
>
> The exact same code as what?

The same as what is contained in the official tool used by upstream to
track their code, which is the one and _only_ place that is
/pragmatically/ open to scrutiny by other upstream and _downstream_
contributors.

> Actually I often wonder when looking for a project and end up with a
> Github repository how I could distinguish the "original" from its
> clones in a VCS.

Actually it's a little bit of "intelligence work" but it's something
that usually downstream should really do: have a reasonable level of
trust that the origin is really the upstream one.

But here we are /brainstorming/ about the very issue that led to the
backdoor injection, and that issue is how to avoid "backdoor injections
via build subversion exploiting semi-binary seeds in release tarballs".
(see the scope above)

> With the signature by the known (this may also be a wrong assumption,
> admittedly) maintainer there is at least some form of assurance of
> origin.

We should definitely drop the idea of "trust by authority" as a
sufficient requisite for verifiability, that is one assumption for
reproducible builds.

The XZ backdoor injection absolutely demonstrates that one and just one
_co-maintainer_ was able to hide a trojan in the _signed_ release
tarball and the payload in the git archive (as a heavily obfuscated
binary), so it was _the origin_ that was "infected".

It's NOT important _who_ injected the backdoor (and it _was_ upstream),
but _how_.

In other words, we need a _pragmatic_ way (possibly with helping tools)
to "challenge the upstream authority" :-)

>> and everybody is reading.
>
> This is a steep claim! I agree that nobody reads generated files in
> a release tarball, but I am not sure how many other files are actually
> read.

Let's say that at least /someone/ should be _able_ to read the files,
but in the attack we are considering /no one/ is _pragmatically_ able to
read the (auto)generated semi-binary seeds in the release tarballs.

Security is a complex system, especially when considering the entire
supply chain: let's focus on this _specific_ weakness of the supply
chain. :-)

Ciao! Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
From: Skyler Ferris @ 2024-04-13 10:26 UTC
To: Giovanni Biscuolo, Andreas Enge, Ekaitz Zarraga
Cc: Ludovic Courtès, Attila Lendvai, Guix Devel

Hi again,

On 4/12/24 23:50, Giovanni Biscuolo wrote:
> Hello,
>
> general reminder: please remember the specific scope of this (sub)thread
>
> --8<---------------cut here---------------start------------->8---
>
> Please consider that this (sub)thread is _not_ specific to xz-utils but
> to the specific attack vector (matrix?) used to inject a backdoor in a
> binary during a build phase, in a _very_ stealthy way.
>
> Also, since Guix _is_ downstream, I'd like this (sub)thread to
> concentrate on what *Guix* can/should do to strengthen the build process
> /independently/ of what upstreams (or other distributions) can/should
> do.
>
> --8<---------------cut here---------------end--------------->8---
> (https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/)
>
> ...and if needed read that message again to understand the context,
> please.
>
>

I assume that this was an indirect response to the email I sent
previously where I discussed the problems with PGP signatures on release
files. I believe that this was in scope because of the discussion about
whether to use VCS checkouts which lack signatures or release tarballs
which have signatures. If the signatures on the release tarballs are not
providing us with additional confidence then we are not losing anything
by switching to the VCS checkout. Analysis of the effectiveness of what
upstream projects are doing is relevant when trying to determine what we
are capable of doing. I also pointed out that a change to Guix such as
adding signature metadata to packages could help make up for problems
with upstream workflows and how the review process provides additional
confidence, demonstrating how this analysis is relevant to what we
currently do or could possibly do. Please let me know if you think that
this is incorrect.

Additionally, I need to correct something that I previously said. I
stated this:

On 4/12/24 17:14, Skyler Ferris wrote:
> even the tails project gets this part of security wrong and they are generally diligent in their efforts

without first double-checking the current state of the project. While
this was true at one point, they have since updated their website and
clearly explain the problem and what their new verification method is
able to protect against at
https://tails.net/contribute/design/download_verification/. I apologize
for disseminating outdated information.
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
From: Giovanni Biscuolo @ 2024-04-13 12:47 UTC
To: Skyler Ferris; +Cc: Guix Devel

Hello Skyler,

Skyler Ferris <skyvine@protonmail.com> writes:

> On 4/12/24 23:50, Giovanni Biscuolo wrote:
>> general reminder: please remember the specific scope of this (sub)thread

[...]

>> (https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/)
>>
>> ...and if needed read that message again to understand the context,
>> please.
>>
> I assume that this was an indirect response to the email I sent
> previously where I discussed the problems with PGP signatures on release
> files.

No, believe me! I'm sorry I gave you this impression. :-)

> I believe that this was in scope

To be clear: I did not mean to say, even indirectly, that you were out
of scope _or_ that you did not understand the context.

Also, I really did not mean to /appear/ as the "coordinator" of this
(sub)thread and even less to /appear/ as the one who decides what's in
scope and what's OT; obviously everyone is absolutely free to decide
what is in scope and whether she or he understood the context.

> because of the discussion about whether to use VCS checkouts which
> lack signatures or release tarballs which have signatures.

I still have not commented on what you discussed just because I lack
time, not interest; if I can I'll do it ASAP™ :-(

[...]

Thanks! Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-13 12:47 ` Giovanni Biscuolo @ 2024-04-14 16:22 ` Skyler Ferris 0 siblings, 0 replies; 34+ messages in thread From: Skyler Ferris @ 2024-04-14 16:22 UTC (permalink / raw) To: Giovanni Biscuolo; +Cc: Guix Devel On 4/13/24 05:47, Giovanni Biscuolo wrote: > Hello Skyler, > > Skyler Ferris <skyvine@protonmail.com> writes: > >> On 4/12/24 23:50, Giovanni Biscuolo wrote: >>> general reminder: please remember the specific scope of this (sub)thread > [...] > >>> (https://yhetil.org/guix/8734s1mn5p.fsf@xelera.eu/) >>> >>> ...and if needed read that message again to understand the context, >>> please. >>> >> I assume that this was an indirect response to the email I sent >> previously where I discussed the problems with PGP signatures on release >> files. > No, believe me! I'm sorry I gave you this impression. :-) > >> I believe that this was in scope > To be clear: not only I did not mean to say - even indirectly - that you > where out of scope _or_ that you did not understand the context. > > Also, I really did not mean to /appear/ as the "coordinator" of this > (sub)thread and even less to /appear/ as the one who decides what's in > scope and what's OT; obviously everyone is absolutely free to decide > what is in scope and that she or he understood the context . > >> because of the discussion about whether to use VCS checkouts which >> lack signatures or release tarballs which have signatures. > I still have not commented what you discussed just because I lack time, > not interest; if I can I'll do it ASAP™ :-( > > [...] > > Thanks! Gio' > Thanks for clarifying! Misunderstandings happen sometimes. I look forward to hearing your thoughts if you're able to find time to share them! =) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-11 12:43 ` Andreas Enge 2024-04-11 12:56 ` Ekaitz Zarraga @ 2024-04-12 13:09 ` Attila Lendvai 2024-04-12 20:42 ` Ludovic Courtès 2 siblings, 0 replies; 34+ messages in thread From: Attila Lendvai @ 2024-04-12 13:09 UTC (permalink / raw) To: Andreas Enge Cc: Ludovic Courtès, Ekaitz Zarraga, Giovanni Biscuolo, Guix Devel > > I think we should gradually move to building everything from > > source—i.e., fetching code from VCS and adding Autoconf & co. as inputs. > > > the big drawback of this approach is that we would lose maintainers' > signatures, right? it's possible to sign git commits and (annotated) tags, too. it's good practice to enable signing by default. admittedly though, few people sign all their commits, and even fewer sign their tags. -- • attila lendvai • PGP: 963F 5D5F 45C7 DFCD 0A39 -- “Never appeal to a man's "better nature". He may not have one. Invoking his self-interest gives you more leverage.” — Robert Heinlein (1907–1988), 'Time Enough For Love' (1973) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-11 12:43 ` Andreas Enge 2024-04-11 12:56 ` Ekaitz Zarraga 2024-04-12 13:09 ` Attila Lendvai @ 2024-04-12 20:42 ` Ludovic Courtès 2 siblings, 0 replies; 34+ messages in thread From: Ludovic Courtès @ 2024-04-12 20:42 UTC (permalink / raw) To: Andreas Enge Cc: Ekaitz Zarraga, Attila Lendvai, Giovanni Biscuolo, Guix Devel Hi! Andreas Enge <andreas@enge.fr> skribis: > Am Wed, Apr 10, 2024 at 03:57:20PM +0200 schrieb Ludovic Courtès: >> I think we should gradually move to building everything from >> source—i.e., fetching code from VCS and adding Autoconf & co. as inputs. > > the big drawback of this approach is that we would lose maintainers' > signatures, right? Yes. But as Attila wrote, one can hope that they provide a way to authenticate at least part of their VCS history, for example with signed tags. (Ideally everyone would use ‘guix git authenticate’ of course.) > Would the suggestion to use signed tarballs, but to autoreconf the > generated files, not be a better compromise between trusting and > distrusting upstream maintainers? IMO starting from an authenticated VCS checkout is clearer. Ludo’. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-10 13:57 ` Ludovic Courtès 2024-04-11 12:43 ` Andreas Enge @ 2024-04-13 6:13 ` Giovanni Biscuolo 2024-05-07 18:22 ` 3 kinds of bootstrap (was Re: backdoor injection via release tarballs combined with binary artifacts) Simon Tournier 2 siblings, 0 replies; 34+ messages in thread From: Giovanni Biscuolo @ 2024-04-13 6:13 UTC (permalink / raw) To: Ludovic Courtès, Ekaitz Zarraga; +Cc: Attila Lendvai, Guix Devel [-- Attachment #1: Type: text/plain, Size: 1998 bytes --] Hello, Ludovic Courtès <ludo@gnu.org> writes: > Ekaitz Zarraga <ekaitz@elenq.tech> skribis: > >> On 2024-04-04 21:48, Attila Lendvai wrote: >>> all in all, just by following my gut insctincts, i was advodating >>> for building everything from git even before the exposure of this >>> backdoor. in fact, i found it surprising as a guix newbie that not >>> everything is built from git (or their VCS of choice). >> >> That has happened to me too. >> Why not use Git directly always? > > Because it create{s,d} a bootstrapping issue. The > “builtin:git-download” method was added only recently to guix-daemon and > cannot be assumed to be available yet: > > https://issues.guix.gnu.org/65866 This fortunately will help a lot with the "everything built from git" part of the "whishlist", but what about the not zero occurrences of "other upstream VCSs"? [...] > I think we should gradually move to building everything from > source—i.e., fetching code from VCS and adding Autoconf & co. as inputs. > > This has been suggested several times before. The difficulty, as you > point out, will lie in addressing bootstrapping issues with core > packages: glibc, GCC, Binutils, Coreutils, etc. I’m not sure how to do > that but… does it have to be an "all of nothing" choiche? I mean "continue using release tarballs" vs "use git" for "all"? 
If using git is unfeasible for bootstrapping reasons [1], why not continue using release tarballs with some _extra_ verification steps, and possibly add some automated checks to "lint" to help contributors and committers check that there are no "quasi-binary" seeds [2] hidden in release tarballs? WDYT? [...] Thanks! Gio' [1] or other reasons specific to a package that should be documented when needed, at least with a comment in the package definition [2] the autogenerated files that are not pragmatically verifiable -- Giovanni Biscuolo Xelera IT Infrastructures [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 849 bytes --] ^ permalink raw reply [flat|nested] 34+ messages in thread
* 3 kinds of bootstrap (was Re: backdoor injection via release tarballs combined with binary artifacts) 2024-04-10 13:57 ` Ludovic Courtès 2024-04-11 12:43 ` Andreas Enge 2024-04-13 6:13 ` Giovanni Biscuolo @ 2024-05-07 18:22 ` Simon Tournier 2 siblings, 0 replies; 34+ messages in thread From: Simon Tournier @ 2024-05-07 18:22 UTC (permalink / raw) To: Ludovic Courtès, Ekaitz Zarraga Cc: Attila Lendvai, Giovanni Biscuolo, Guix Devel

Hi,

I am late to the party…

On Wed., 10 April 2024 at 15:57, Ludovic Courtès <ludo@gnu.org> wrote:

>> That has happened to me too.
>> Why not use Git directly always?
>
> Because it create{s,d} a bootstrapping issue.  The
> “builtin:git-download” method was added only recently to guix-daemon and
> cannot be assumed to be available yet:
>
>   https://issues.guix.gnu.org/65866

[...]

> I think we should gradually move to building everything from
> source, i.e., fetching code from VCS and adding Autoconf & co. as inputs.
>
> This has been suggested several times before.  The difficulty, as you
> point out, will lie in addressing bootstrapping issues with core
> packages: glibc, GCC, Binutils, Coreutils, etc.  I’m not sure how to do
> that but…

[...]

> … live-bootstrap can probably be a good source of inspiration to find a
> way to build those core packages (or some of them) straight from a VCS
> checkout.

IMHO, we need to distinguish, because there are different types of issues and thus different potential workarounds. :-)

 1. Bootstrap how to download source code.
 2. Bootstrap how to build core packages.
 3. Bootstrap the driver (say guix-daemon and helpers).

Well, having solutions for #1 and #3 would naturally provide a solution for #2. Although the devil is in the details. ;-)

About #1
========

You cannot use the binary ’git’ in order to download the source code of Git to build the binary ’git’. Yeah, circular dependency.
:-)

Therefore, the Git source code is pulled using another method, say from a tarball; that method also needs to be built from source, so it needs yet another method. The usual chicken-or-the-egg problem.

The current workaround is to “hide” the problem and introduce a “builtin:download” method: it’s an “opaque” binary that is hard to inspect. Roughly, the workaround was introduced by [1] in Oct. 2016. Almost 8 years ago, so it works! :-)

The argument for accepting this “opaque” method is that it is a fixed-output derivation. In other words, we know the SHA256 checksum beforehand. Thus the claim is: being “opaque” does not matter because the SHA256 checksum can be computed independently and all the source code can be audited.

To cut another cycle, another “opaque” method has been introduced: “builtin:git-download”. The same reasoning applies.

Do not get me wrong about “opaque”. I mean that the method depends on the pair user-revision and daemon-revision. In other words, it is not straightforward to know when Alice and Bob are using the exact same method for downloading source code. Since it is not fully transparent, it is “opaque”. :-)

Somehow we are applying to everything what we only need for cutting a specific circular dependency. We have some packages named ’foo-bootstrap’ that are meant to solve a dependency problem between packages; we do not use them everywhere, only for cutting a circular dependency.

I think a similar strategy should be applied to the fetch methods. We could have “git-fetch” relying on the initial Git method, i.e., a transparent derivation where it’s straightforward to audit everything: the dependencies and the builder. And for some specific cases, we could have “git-fetch/bootstrap” relying on “builtin:git-download”. That makes it easier to know which packages deserve the most care.
I think that applying “builtin:download” and “builtin:git-download” to all “url-fetch” and “git-fetch” origins lowers the overall level of transparency just to solve a very specific bootstrapping problem.

Last about #1, please note that transparency does not come for free and has drawbacks: when running, say, “guix time-machine -C past.scm -- build -S”, all the dependencies for downloading would be the ones of past.scm. In other words, to download today the source code of a 5-year-old package, say using ’hg-fetch’, we need Python and Mercurial as they were 5 years ago, even though we do not expect the content to differ from what today’s Python and Mercurial would fetch.

About #3
========

That’s the really hard topic! The bootstrapping story is not fully done yet.

Assuming trust for #1, the bootstrap of Guix starts with ’bootstrap-seeds’, roughly 232KiB. Take a moment, that’s impressive, :-) right? Obviously, I leave aside Haskell, Ocaml@5, etc.

Well, diving further. These 232KiB alone are not enough. It also requires helpers: tar (1.3MiB), bash (1.3MiB), mkdir (0.7MiB) and xz (0.844MiB). Moreover, it requires two drivers: a static Guile binary (14MiB) and guix-daemon.

You get it: how to trust these helpers? Two approaches: (a) implement something directly in hex/assembler and/or (b) exploit the Guile binary (à la Scheme on bare metal).

About guix-daemon, one solution is a daemon written directly in Guile and compatible with that very Guile binary. Or at least a minimalist daemon with just enough features for building up to guix-daemon.

Another option is the “Extreme bootstrapping” [3], which is my understanding of live-bootstrap: somehow, remove guix-daemon from the picture and convert the derivation (the one read by guix-daemon) to a minimal Guile script that would be executed during startup. See the proof-of-concept in the branch wip-system-bootstrap [4].

Just my lengthy opinion… Or maybe some ideas for GSoC.
;-)

1: https://issues.guix.gnu.org/22774#3
2: https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down
3: https://guix.gnu.org/en/blog/2019/reproducible-builds-summit-5th-edition
4: https://git.savannah.gnu.org/cgit/guix.git/log/?h=wip-system-bootstrap

Cheers,
simon

^ permalink raw reply [flat|nested] 34+ messages in thread
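The fixed-output argument in the message above (the download method may be "opaque" because the checksum is known beforehand) can be illustrated with a minimal Python sketch. Everything here is hypothetical: the fetcher functions are stand-ins, not Guix code, and a hex SHA256 over raw bytes is a simplification of the hash Guix actually records for an origin.

```python
import hashlib
from typing import Callable


def sha256_hex(data: bytes) -> str:
    """Hex SHA256 of raw bytes (a simplification of what Guix records)."""
    return hashlib.sha256(data).hexdigest()


def is_accepted(fetcher: Callable[[], bytes], expected_sha256: str) -> bool:
    """Run any fetcher, transparent or opaque, and accept its output only
    when the content hash equals the one declared in advance."""
    return sha256_hex(fetcher()) == expected_sha256


# Hypothetical source content and its hash, declared ahead of time.
SOURCE = b"int main(void) { return 0; }\n"
EXPECTED = sha256_hex(SOURCE)


def opaque_fetcher() -> bytes:
    # Stand-in for a built-in download method we cannot easily inspect.
    return SOURCE


def transparent_fetcher() -> bytes:
    # Stand-in for an auditable method; same bytes, different code path.
    return bytes(bytearray(SOURCE))


def tampering_fetcher() -> bytes:
    # A fetcher that alters the content is rejected whatever its internals.
    return SOURCE + b"/* injected */\n"
```

The check only ties the output to the declared hash. It says nothing about whether the content behind that hash was trustworthy in the first place, which is the part the xz case exploited.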
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-04 19:48 ` Attila Lendvai 2024-04-04 20:32 ` Ekaitz Zarraga @ 2024-04-05 10:13 ` Giovanni Biscuolo 2024-04-05 14:51 ` Attila Lendvai 1 sibling, 1 reply; 34+ messages in thread From: Giovanni Biscuolo @ 2024-04-05 10:13 UTC (permalink / raw) To: Attila Lendvai, guix-security; +Cc: Guix Devel [-- Attachment #1: Type: text/plain, Size: 5431 bytes --] Hi Attila and guix-security team, Attila Lendvai <attila@lendvai.name> writes: >> Are really "configure scripts containing hundreds of thousands of lines >> of code not present in the upstream VCS" the norm? > > pretty much for all C and C++ projects that use autoconf... which is > numerous, especially among the core GNU components. OK, thank you for the confirmation. [...] >> ...or is it better to completely avoid release tarballs as our sources >> uris? > > yes, and this^ would guarantee the previous point, but it's not always trivial. > > as an example see this: https://issues.guix.gnu.org/61750 [...] > it breaks crosscompilation, because the host cannot execute the target > binary. OK thanks, I missed that. In general, there is really no other solution for projects than to distribute some artifacts "out of band" or renounce to crosscompile?!? Are there other issues (different from the "host cannot execute target binary") that makes relesase tarballs indispensable for some upstream projects? AFAIU the only thing that /could/ "save" source tarballs it's their /scientific/ reproducibility. 
In this direction there is a very interesting patchset from Janneke Nieuwenhuizen to try to get a reproducible _Guix_ release tarball: https://issues.guix.gnu.org/70169 «Reproducible `make dist' tarball in defiance of Autotools and Gettext» Obviously having a reproducible tarball makes _practical_ the "pragmatically impossible" task to reproduce a release tarball to check if it corresponds to the same **build** (make dist) performed in the official DVCS repo; only this could "save" all the "build software using release tarball" workflow. ...but /in general/ here we are _downstream_, we have absolutely no control over upstream, and it's _very_ unlikely that we'll see a *good* solution to the tarball reproduciblity problem applied "in the wild upstream" soon. I said "a **good* solution" because some proposals I'm reading about are /bad/ _complications_ that absolutely are NOT really solving the source tarball reproduciblity problem [1]; for example: 1. build the tarball on the RM host using a docker container (unreproducible built) and call it "a reproducible release tarball": https://medium.com/@lanoxx/creating-reproducible-release-tarballs-fa2e2ce745a7 2. have a CI system based on github actions [2] and call it "fully verifiable": https://externals.io/message/122811#122814 (from php.internals mailing list) So, while "almost all the world" is applying _wrong_ solutions to the source tarball reproducibility problem, what can Guix do? Even if We™ (ehrm) find a solution to the source tarball reproducibility problem (potentially allowing us to patch all the upstream makefiles with specific phases in our packages definitions) are we really going to start our own (or one managed by the reproducible build community) "reproducible source tarballs" repository? Is this feaseable? I think there is no solution that can "pragmatically save" the source tarballs of all the software packaged in Guix (and all other distributions part of the reproducible builds effort). 
> but these generated man files are part of the release tarball, so > cross compilation works fine using the tarball. AFAIU *in this case* there is an easy alternative: distribute the (generated) man files as *code* tracked in the DVCS (e.g. git) repo itself. IMHO it's likely that this workflow can fix most if not all the crosscompilation issues, no? In general, AFAIU it's against reproducibility to distribute pre-generated (compiled? transpiled?) artifacts in a tarball that are not present in the official DVCS repo, especially when tarballs are _not_ reproducible (and they are not in likely 99.9% of cases). > all in all, just by following my gut insctincts, i was advodating for > building everything from git even before the exposure of this > backdoor. in fact, i found it surprising as a guix newbie that not > everything is built from git (or their VCS of choice). Given the current situation so clearly exposed by the "xz backdoor" case, this is something Guix should seriously consider. I mean: Guix should seriously consider to drop source tarballs and _also_ all pre-compiled artifacts distributed only via that tarballs. I don't like this proposal, but I see no other "pragmatically possible" solution. AFAIU no need to rush, but I'm afraid that the class of attacks we can call "supply-chain backdoor injection due to source tarball pragmatically impossible verifiability" are hard to deploy but unfortunately not _too_ hard. [...] Thanks! Gio' [1] this boils down to the unfortunate fact that "reproducibility" is a very misunderstood concept [1.1], even by some very skilled (experienced?) 
programmers [1.1] because it's strictly related to good _redistribution_ of _trusted_ software, not to good programming [2] https://docs.github.com/en/actions/learn-github-actions/understanding-github-actions#runners «each workflow run executes in a fresh, newly-provisioned virtual machine.» see also https://www.paloaltonetworks.com/blog/prisma-cloud/unpinnable-actions-github-security/ for security concerns about GitHub actions relying on Docker containers used for "reproducibility" purposes. -- Giovanni Biscuolo Xelera IT Infrastructures [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 849 bytes --] ^ permalink raw reply [flat|nested] 34+ messages in thread
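On the reproducible release tarball point raised in the message above, here is a minimal Python sketch of the underlying idea: if file order and metadata (timestamps, owner, mode) are normalized, the same set of files always yields byte-identical archives, so two parties can compare hashes. This is an assumption-laden illustration using only the standard `tarfile` module; it is not how the patchset in issue 70169 works, and a real `make dist` involves generated files that this sketch ignores.

```python
import hashlib
import io
import tarfile
from typing import Dict


def deterministic_tar(files: Dict[str, bytes], mtime: int = 0) -> bytes:
    """Build an uncompressed tar archive whose bytes depend only on the
    file names and contents, not on insertion order or the build host."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):          # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime              # fixed timestamp
            info.mode = 0o644               # fixed permissions
            info.uid = info.gid = 0         # no host-specific owner ids
            info.uname = info.gname = ""    # no host-specific owner names
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def tar_digest(files: Dict[str, bytes]) -> str:
    """SHA256 of the normalized archive."""
    return hashlib.sha256(deterministic_tar(files)).hexdigest()


# Hypothetical project contents, listed in two different orders.
FILES_A = {"configure.ac": b"AC_INIT\n", "src/main.c": b"int main(void){return 0;}\n"}
FILES_B = {"src/main.c": b"int main(void){return 0;}\n", "configure.ac": b"AC_INIT\n"}
```

With this normalization, `tar_digest(FILES_A)` and `tar_digest(FILES_B)` agree, while changing a single byte of any file changes the digest.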
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-05 10:13 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo @ 2024-04-05 14:51 ` Attila Lendvai 2024-04-13 7:42 ` Giovanni Biscuolo 0 siblings, 1 reply; 34+ messages in thread From: Attila Lendvai @ 2024-04-05 14:51 UTC (permalink / raw) To: Giovanni Biscuolo; +Cc: guix-security, Guix Devel > Are there other issues (different from the "host cannot execute target > binary") that makes relesase tarballs indispensable for some upstream > projects? i didn't mean to say that tarballs are indispensible. i just wanted to point out that it's not as simple as going through each package definition and robotically changing the source origin from tarball to git repo. it costs some effort, but i don't mean to suggest that it's not worth doing. > So, while "almost all the world" is applying wrong solutions to the > source tarball reproducibility problem, what can Guix do? AFAIU the plan is straightforward: change all package definitions to point to the (git) repos of the upstream, and ignore any generated ./configure scripts if it happens to be checked into the repo. it involves quite some work, both in quantity, and also some thinking around surprises. i think a good first step would be to reword the packaging guidelines in the doc to strongly prefer VCS sources instead of tarballs. > Even if We™ (ehrm) find a solution to the source tarball reproducibility > problem (potentially allowing us to patch all the upstream makefiles > with specific phases in our packages definitions) are we really going to > start our own (or one managed by the reproducible build community) > "reproducible source tarballs" repository? Is this feaseable? but why would that be any better than simply building from git? which, i think, would even take less effort. 
> > but these generated man files are part of the release tarball, so > > cross compilation works fine using the tarball. > > > AFAIU in this case there is an easy alternative: distribute the > (generated) man files as code tracked in the DVCS (e.g. git) repo > itself. yes, that would work in this case (although, that man page is guaranteed to go stale). my proposal was to simply drop the generated man file. it adds very little value (although it's not zero; web search, etc). -- • attila lendvai • PGP: 963F 5D5F 45C7 DFCD 0A39 -- “It is easy to be conspicuously 'compassionate' if others are being forced to pay the cost.” — Murray N. Rothbard (1926–1995) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-05 14:51 ` Attila Lendvai @ 2024-04-13 7:42 ` Giovanni Biscuolo 0 siblings, 0 replies; 34+ messages in thread From: Giovanni Biscuolo @ 2024-04-13 7:42 UTC (permalink / raw) To: Attila Lendvai; +Cc: guix-security, Guix Devel [-- Attachment #1: Type: text/plain, Size: 1552 bytes --] Hi Attila, sorry for the delay in my reply, I'm asking myself if this (sub)thread should be "condensed" in a dedicated RFC (are RFCs official workflows in Guix, now?); if so, I volunteer to file such an RFC in the next weeks. Attila Lendvai <attila@lendvai.name> writes: >> Are there other issues (different from the "host cannot execute target >> binary") that makes relesase tarballs indispensable for some upstream >> projects? > > > i didn't mean to say that tarballs are indispensible. i just wanted to > point out that it's not as simple as going through each package > definition and robotically changing the source origin from tarball to > git repo. it costs some effort, but i don't mean to suggest that it's > not worth doing. OK understood thanks! [...] > i think a good first step would be to reword the packaging guidelines > in the doc to strongly prefer VCS sources instead of tarballs. I agree. >> Even if We™ (ehrm) find a solution to the source tarball reproducibility >> problem (potentially allowing us to patch all the upstream makefiles >> with specific phases in our packages definitions) are we really going to >> start our own (or one managed by the reproducible build community) >> "reproducible source tarballs" repository? Is this feaseable? > > but why would that be any better than simply building from git? which, > i think, would even take less effort. I agree, I was just brainstorming. [...] 
Thanks, Gio' -- Giovanni Biscuolo Xelera IT Infrastructures [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 849 bytes --] ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-04 10:34 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo 2024-04-04 15:12 ` Attila Lendvai 2024-04-04 15:47 ` Giovanni Biscuolo @ 2024-04-04 23:03 ` Ricardo Wurmus 2024-04-05 7:06 ` Giovanni Biscuolo 2024-04-05 16:52 ` Jan Wielkiewicz 3 siblings, 1 reply; 34+ messages in thread From: Ricardo Wurmus @ 2024-04-04 23:03 UTC (permalink / raw) To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security, Felix Lechner, Ryan Prior [mu4e must have changed the key bindings for replies, so here is my mail again, this time as a wide reply.] Giovanni Biscuolo <g@xelera.eu> writes: > So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of > tampered .m4 macros (and other possibly tampered build configuration > script)? > > IMHO "ignoring" (deleting) pre-built build scripts in Guix > build-system(s) should be considered... or is /already/ so? The gnu-build-system has a bootstrap phase, but it only does something when a configure script does not already exist. We sometimes force it to bootstrap the build system when we patch configure.ac. In previous discussions there were no big objections to always bootstrapping the build system files from autoconf/automake sources. This particular backdoor relied on a number of obfuscations: - binary test data. Nobody ever looks at binaries. - incomprehensibility of autotools output. This one is fundamentally a social problem and easily extends to other complex build systems. In the xz case, the instructions for assembling the shell snippets to inject the backdoor could hide in plain sight, just because configure scripts are expected to be near incomprehensible. They contain no comments, are filled to the brim with portable (lowest common denominator) shell magic, and contain bizarrely named variables. 
Not using generated output is a good idea anyway and removes the requirement to trust that the release tarballs are faithful derivations from the autotools sources, but given the bland complexity of build system code (whether that's recursive Makefiles, CMake cruft, or the infamous gorilla spit[1] of autotools) I don't see a good way out. [1] https://www.gnu.org/software/autoconf/manual/autoconf-2.65/autoconf.html#History > Given the above observation that <<it is pragmatically impossible [...] > to peer review a tarball prepared in this manner>>, I strongly doubt that > a possible Makefile tampering _in_the_release_tarball_ is easy to peer > review; I'd ask: is it feaseable such an "automated analysis" (see > above) in a dedicated build-system phase? I don't think it's feasible. Since Guix isn't a regular user (the target audience of configure scripts) it has no business depending on generated configure scripts. It should build these from source. > In other words: what if the backdoor was injected directly in the source > code of the *official* release tarball signed with a valid GPG signature > (and obviously with a valid sha256 hash)? A malicious maintainer can sign bad release tarballs. A malicious contributor can push signed commits that contain backdoors in code. > Do upstream developer communities peer review release tarballs or they > "just" peer review the code in the official DVCS? Most do neither. I'd guess that virtually *nobody* reviews tarballs beyond automated tests (like what the GNU maintainers' GNUmakefile / maint.mk does when preparing a release). > Also, in (info "(guix) origin Reference") I see that Guix packages can have a > list of uri(s) for the origin of source code, see xz as an example [7]: > are they intended to be multiple independent sources to be compared in > order to prevent possible tampering or are they "just" alternatives to > be used if the first listed uri is unavailable? 
They are alternative URLs, much like what the mirror:// URLs do. > If the case is the first, a solution would be to specify multiple > independent release tarballs for each package, so that it would be > harder to copromise two release sources, but that is not something under > Guix control. We have hashes for this purpose. A tarball that was modified since the package definition has been published would have a different hash. This is not a statement about tampering, but only says that our expectations (from the time of packaging) have not been met. > All in all: should we really avoid the "pragmatically impossible to be > peer reviewed" release tarballs? Yes. -- Ricardo ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) 2024-04-04 23:03 ` Ricardo Wurmus @ 2024-04-05 7:06 ` Giovanni Biscuolo 2024-04-05 7:39 ` Ricardo Wurmus 0 siblings, 1 reply; 34+ messages in thread From: Giovanni Biscuolo @ 2024-04-05 7:06 UTC (permalink / raw) To: Ricardo Wurmus; +Cc: Guix Devel, guix-security [-- Attachment #1: Type: text/plain, Size: 5583 bytes --] Hello Ricardo, Ricardo Wurmus <rekado@elephly.net> writes: > Giovanni Biscuolo <g@xelera.eu> writes: > >> So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of >> tampered .m4 macros (and other possibly tampered build configuration >> script)? >> >> IMHO "ignoring" (deleting) pre-built build scripts in Guix >> build-system(s) should be considered... or is /already/ so? > > The gnu-build-system has a bootstrap phase, but it only does something > when a configure script does not already exist. We sometimes force it > to bootstrap the build system when we patch configure.ac. > > In previous discussions there were no big objections to always > bootstrapping the build system files from autoconf/automake sources. But AFAIU the boostrap is not always done, right? If so, given that there are no big objections to always bootstrap the build system files, what is the technical reason it's not done? > This particular backdoor relied on a number of obfuscations: > > - binary test data. Nobody ever looks at binaries. Yes, and the presence of binary data (e.g. for testing or other included media) is not something under downstream (Guix) control, so we have to live with it. No? > - incomprehensibility of autotools output. This one is fundamentally a > social problem and easily extends to other complex build systems. In > the xz case, the instructions for assembling the shell snippets to > inject the backdoor could hide in plain sight, just because configure > scripts are expected to be near incomprehensible. 
They contain no > comments, are filled to the brim with portable (lowest common > denominator) shell magic, and contain bizarrely named variables. Yes I understand this well, for this reason I call configure scripts near-binary-artifacts, kinda *.o files From a reproducibility and security POV this is a nightmare and no one should never ever trust such configure scripts > Not using generated output is a good idea anyway and removes the > requirement to trust that the release tarballs are faithful derivations > from the autotools sources, but given the bland complexity of build system > code (whether that's recursive Makefiles, CMake cruft, or the infamous > gorilla spit[1] of autotools) I don't see a good way out. I guess I miss the technical details about why it's not possible to _always_ bootstrap the build system files from autoconf/automake sources: do you have any reference documentation or technical article as a reference, please? > [1] > https://www.gnu.org/software/autoconf/manual/autoconf-2.65/autoconf.html#History I'll study the autoconf history :-) >> Given the above observation that «it is pragmatically impossible [...] >> to peer review a tarball prepared in this manner», I strongly doubt that >> a possible Makefile tampering _in_the_release_tarball_ is easy to peer >> review; I'd ask: is it feaseable such an "automated analysis" (see >> above) in a dedicated build-system phase? > > I don't think it's feasible. Since Guix isn't a regular user (the > target audience of configure scripts) it has no business depending on > generated configure scripts. It should build these from source. Maybe I misunderstand your argument or, more probably, I was too cryptic. I mean, Someone™ is telling that moving the unpacking of the backdoor object to a Makefile rule is an easy target for _automated_ analisys: is that someone wrong or is there a way to analyze a Makefile to answer "Which files have 'special' rules?" 
>> In other words: what if the backdoor was injected directly in the source >> code of the *official* release tarball signed with a valid GPG signature >> (and obviously with a valid sha256 hash)? > > A malicious maintainer can sign bad release tarballs. A malicious > contributor can push signed commits that contain backdoors in code. Oh yes, but it's way more harder to hide backdoors in code published as signed (signed?!?) commits in a DVCS. Obviously no security system is perfect, but Some™ is (very) less perfect than others. :-) >> Do upstream developer communities peer review release tarballs or they >> "just" peer review the code in the official DVCS? > > Most do neither. I'd guess that virtually *nobody* reviews tarballs > beyond automated tests (like what the GNU maintainers' GNUmakefile / > maint.mk does when preparing a release). I guess that in "nobody" are included Guix package contributors and committers... Then I'd say that virtually *nobody* should trust tarball! :-O To be clear: I'm not suggesting that "tarball reviews" - that is, verify the /almost/ exact correspondence of the tarball with the corresponding DVCS commit - should be added as a requirement for contributors or maintainers... it would be too burdensome. >> Also, in (info "(guix) origin Reference") I see that Guix packages can have a >> list of uri(s) for the origin of source code, see xz as an example [7]: >> are they intended to be multiple independent sources to be compared in >> order to prevent possible tampering or are they "just" alternatives to >> be used if the first listed uri is unavailable? > > They are alternative URLs, much like what the mirror:// URLs do. OK understood, thanks! [...] >> All in all: should we really avoid the "pragmatically impossible to be >> peer reviewed" release tarballs? > > Yes. I tend to agree! :-( Thank you! 
Giovanni -- Giovanni Biscuolo Xelera IT Infrastructures [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 849 bytes --] ^ permalink raw reply [flat|nested] 34+ messages in thread
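The "tarball review" mentioned in the message above (checking how closely a release tarball corresponds to the matching VCS commit) could be approximated mechanically. The following Python sketch assumes both trees have already been loaded as path-to-content mappings; the file names and contents are invented for illustration and do not describe the xz tarball.

```python
import hashlib
from typing import Dict, List, NamedTuple


class TarballReport(NamedTuple):
    only_in_tarball: List[str]   # e.g. generated files shipped but not tracked
    only_in_vcs: List[str]       # tracked files left out of the release
    differing: List[str]         # same path, different content


def compare_file_sets(tarball: Dict[str, bytes],
                      vcs: Dict[str, bytes]) -> TarballReport:
    """Compare two file trees by path and by SHA256 of each file's content."""
    t = {path: hashlib.sha256(data).hexdigest() for path, data in tarball.items()}
    v = {path: hashlib.sha256(data).hexdigest() for path, data in vcs.items()}
    return TarballReport(
        only_in_tarball=sorted(set(t) - set(v)),
        only_in_vcs=sorted(set(v) - set(t)),
        differing=sorted(p for p in set(t) & set(v) if t[p] != v[p]),
    )


# Hypothetical VCS checkout and the release tarball derived from it.
VCS = {
    "configure.ac": b"AC_INIT([demo], [1.0])\n",
    "m4/check.m4": b"dnl original macro\n",
    "src/main.c": b"int main(void) { return 0; }\n",
}
TARBALL = {
    **VCS,
    "configure": b"#!/bin/sh\n# generated script\n",
    "m4/check.m4": b"dnl modified macro\n",
}

REPORT = compare_file_sets(TARBALL, VCS)
```

A report like this only narrows down where to look: every entry in `only_in_tarball` and `differing` still needs a human to decide whether it is a legitimate generated file or something else.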
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-05  7:06 ` Giovanni Biscuolo
@ 2024-04-05  7:39 ` Ricardo Wurmus
  0 siblings, 0 replies; 34+ messages in thread
From: Ricardo Wurmus @ 2024-04-05  7:39 UTC
  To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security

Giovanni Biscuolo <g@xelera.eu> writes:

> Hello Ricardo,
>
> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Giovanni Biscuolo <g@xelera.eu> writes:
>>
>>> So AFAIU using a fixed "autoreconf -fi" should mitigate the risks of
>>> tampered .m4 macros (and other possibly tampered build configuration
>>> script)?
>>>
>>> IMHO "ignoring" (deleting) pre-built build scripts in Guix
>>> build-system(s) should be considered... or is /already/ so?
>>
>> The gnu-build-system has a bootstrap phase, but it only does something
>> when a configure script does not already exist.  We sometimes force it
>> to bootstrap the build system when we patch configure.ac.
>>
>> In previous discussions there were no big objections to always
>> bootstrapping the build system files from autoconf/automake sources.
>
> But AFAIU the bootstrap is not always done, right?

It is not.  See guix/build/gnu-build-system.scm:

    (if (not (script-exists? "configure")) ...)

> If so, given that there are no big objections to always bootstrap the
> build system files, what is the technical reason it's not done?

I don't think there is a technical reason.  It's just one of those
things that need someone doing them.

>> Not using generated output is a good idea anyway and removes the
>> requirement to trust that the release tarballs are faithful derivations
>> from the autotools sources, but given the bland complexity of build system
>> code (whether that's recursive Makefiles, CMake cruft, or the infamous
>> gorilla spit[1] of autotools) I don't see a good way out.
>
> I guess I miss the technical details about why it's not possible to
> _always_ bootstrap the build system files from autoconf/automake
> sources: do you have any reference documentation or technical article as
> a reference, please?

I didn't say it's not possible.  Someone's gotta start a branch and
build it all out.  There may be some annoyance closer to the bootstrap
origins (because we may not easily be able to run an approximation of
autotools or even VCS tools closer to the bootstrap seeds), but I think
we're already using custom Makefiles in some of these cases to simplify
bootstrapping.

It's just work.  Someone's gotta do it.  It's probably not super
complicated, but given the large number of packages we have it won't be
fast.

-- 
Ricardo
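Since the bootstrap phase only acts when no configure script exists, one way
to force regeneration for a single package is to delete the shipped script
before that phase runs.  What follows is a minimal, untested sketch of that
idea, using GNU Hello purely as a stand-in.  The variable name
`hello-bootstrapped' and the phase name `delete-shipped-configure' are made up
for illustration, and the native inputs are a guess at what autoreconf needs
rather than a verified list.  Note that it leaves shipped .m4 files and other
generated files untouched, so on its own it would not address tampered macros.

```scheme
;; Hypothetical package variant, for illustration only; not part of Guix.
(use-modules (guix packages)
             (guix gexp)
             (gnu packages base)        ;hello
             (gnu packages autotools))  ;autoconf, automake

(define hello-bootstrapped
  (package
    (inherit hello)
    (name "hello-bootstrapped")
    (arguments
     ;; Assumption: the inherited package has no arguments worth keeping,
     ;; so they are simply replaced here.
     (list #:phases
           #~(modify-phases %standard-phases
               ;; With "configure" gone, the stock 'bootstrap phase no
               ;; longer skips itself and tries to regenerate the script.
               (add-after 'unpack 'delete-shipped-configure
                 (lambda _
                   (delete-file "configure"))))))
    (native-inputs
     ;; Assumption: autoconf and automake are enough; a real package may
     ;; also need gettext, libtool, texinfo, or others.
     (modify-inputs (package-native-inputs hello)
       (prepend autoconf automake)))))
```

Doing this distribution-wide would rather mean changing the condition in the
bootstrap phase itself, which is the "start a branch and build it all out"
work described above.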
* Re: backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils)
  2024-04-04 10:34 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
                   ` (2 preceding siblings ...)
  2024-04-04 23:03 ` Ricardo Wurmus
@ 2024-04-05 16:52 ` Jan Wielkiewicz
  3 siblings, 0 replies; 34+ messages in thread
From: Jan Wielkiewicz @ 2024-04-05 16:52 UTC
  To: Giovanni Biscuolo; +Cc: Guix Devel, guix-security, Felix Lechner, Ryan Prior

On Thu, 04 Apr 2024 12:34:42 +0200
Giovanni Biscuolo <g@xelera.eu> wrote:

> Hello everybody,
>
> I know for sure that Guix maintainers and developers are working on
> this, I'm just asking to find some time to inform and possibly discuss
> with users (also in guix-devel) on what measures GNU Guix - the
> software distribution - can/should deploy to try to avoid this kind
> of attack.

What about integrating ClamAV into the build farms (if this isn't a
thing already)?  ClamAV could scan source files and freshly built
packages and perhaps detect obvious malware.  AFAIK it can also detect
CVEs.  Guix already has ClamAV packaged, so this shouldn't be that hard.

--
Jan Wielkiewicz
* Re: Backdoor in upstream xz-utils
  2024-03-29 20:57 Backdoor in upstream xz-utils John Kehayias
  2024-03-29 17:51 ` Ryan Prior
@ 2024-03-31 15:04 ` Rostislav Svoboda
  1 sibling, 0 replies; 34+ messages in thread
From: Rostislav Svoboda @ 2024-03-31 15:04 UTC
  To: John Kehayias; +Cc: Felix Lechner, Ryan Prior, Guix Devel, guix-security

> >> Is there a way we can blacklist known bad versions?
>
> I'm not sure what you mean, but I don't think so.

For a start, what about adding a short comment:

diff --git a/gnu/packages/compression.scm b/gnu/packages/compression.scm
index 5de17b6b51..fd5ab7ba00 100644
--- a/gnu/packages/compression.scm
+++ b/gnu/packages/compression.scm
@@ -493,6 +493,8 @@ (define-public pbzip2
 (define-public xz
   (package
    (name "xz")
+;;; Be reminded of the xz/liblzma backdoor in the versions 5.6.0 and 5.6.1!
+;;; See https://www.openwall.com/lists/oss-security/2024/03/29/4
    (version "5.2.8")
    (source (origin
             (method url-fetch)

as a single commit, with an appropriate commit message.  That's a bang
for pretty much no money.

> The main danger is in guix time-machine to the past

Good point.  So then a little note here, too:

diff --git a/doc/guix.texi b/doc/guix.texi
index 69a904473c..60909adf5f 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -5012,10 +5012,13 @@ Invoking guix time-machine
 @quotation Note
 The history of Guix is immutable and @command{guix time-machine}
 provides the exact same software as they are in a specific Guix
-revision.  Naturally, no security fixes are provided for old versions
-of Guix or its channels.  A careless use of @command{guix time-machine}
-opens the door to security vulnerabilities.  @xref{Invoking guix pull,
-@option{--allow-downgrades}}.
+revision.  Naturally, no security fixes are provided for old versions of
+Guix or its channels.  A careless use of @command{guix time-machine}
+opens the door to security vulnerabilities, or potentially even
+backdoors.  (Do you remember the
+@uref{https://www.openwall.com/lists/oss-security/2024/03/29/4, backdoor
+in upstream xz/liblzma leading to ssh server compromise}?)
+@xref{Invoking guix pull, @option{--allow-downgrades}}.
 @end quotation

Cheers

Bost
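Beyond a comment in the package definition, the "blacklist known bad
versions" question could in principle be answered with a small lookup table
consulted before building or time-travelling.  The sketch below is plain
Guile and purely hypothetical: the thread states that Guix has no such
mechanism, and the names `%known-bad-versions', `known-bad-version?' and
`warn-if-known-bad' are invented here for illustration.  Only the two xz
versions named in the thread are listed.

```scheme
;; Hypothetical table: package name followed by its known-bad versions.
;; Nothing like this exists in Guix; the layout is an assumption.
(define %known-bad-versions
  '(("xz" "5.6.0" "5.6.1")))

(define (known-bad-version? name version)
  "Return #t if VERSION of package NAME is listed in %KNOWN-BAD-VERSIONS."
  (let ((versions (assoc-ref %known-bad-versions name)))
    (and versions
         (member version versions)
         #t)))

(define (warn-if-known-bad name version)
  "Print a warning to the error port when NAME at VERSION is known-bad."
  (when (known-bad-version? name version)
    (format (current-error-port)
            "warning: ~a ~a is a known-bad release~%" name version)))

;; Usage:
;; (known-bad-version? "xz" "5.6.1")  ;=> #t
;; (known-bad-version? "xz" "5.2.8")  ;=> #f
```

A table like this only covers versions already known to be bad; it would not
have caught the backdoor before its disclosure, and it says nothing about
older releases made by the same maintainer account.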
end of thread, other threads:[~2024-05-07 18:58 UTC | newest]

Thread overview: 34+ messages
2024-03-29 20:57 Backdoor in upstream xz-utils John Kehayias
2024-03-29 17:51 ` Ryan Prior
2024-03-29 20:39 ` Felix Lechner via Development of GNU Guix and the GNU System distribution.
2024-03-29 20:55 ` Tomas Volf
2024-03-30 21:02 ` Ricardo Wurmus
2024-04-04 10:34 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
2024-04-04 15:12 ` Attila Lendvai
2024-04-04 16:47 ` Giovanni Biscuolo
2024-04-04 15:47 ` Giovanni Biscuolo
2024-04-04 19:48 ` Attila Lendvai
2024-04-04 20:32 ` Ekaitz Zarraga
2024-04-10 13:57 ` Ludovic Courtès
2024-04-11 12:43 ` Andreas Enge
2024-04-11 12:56 ` Ekaitz Zarraga
2024-04-11 13:49 ` Andreas Enge
2024-04-11 14:05 ` Ekaitz Zarraga
2024-04-13  0:14 ` Skyler Ferris
2024-04-19 14:31 ` Ludovic Courtès
2024-04-13  6:50 ` Giovanni Biscuolo
2024-04-13 10:26 ` Skyler Ferris
2024-04-13 12:47 ` Giovanni Biscuolo
2024-04-14 16:22 ` Skyler Ferris
2024-04-12 13:09 ` Attila Lendvai
2024-04-12 20:42 ` Ludovic Courtès
2024-04-13  6:13 ` Giovanni Biscuolo
2024-05-07 18:22 ` 3 kinds of bootstrap (was Re: backdoor injection via release tarballs combined with binary artifacts) Simon Tournier
2024-04-05 10:13 ` backdoor injection via release tarballs combined with binary artifacts (was Re: Backdoor in upstream xz-utils) Giovanni Biscuolo
2024-04-05 14:51 ` Attila Lendvai
2024-04-13  7:42 ` Giovanni Biscuolo
2024-04-04 23:03 ` Ricardo Wurmus
2024-04-05  7:06 ` Giovanni Biscuolo
2024-04-05  7:39 ` Ricardo Wurmus
2024-04-05 16:52 ` Jan Wielkiewicz
2024-03-31 15:04 ` Backdoor in upstream xz-utils Rostislav Svoboda
Code repositories for project(s) associated with this public inbox:

	https://git.savannah.gnu.org/cgit/guix.git