unofficial mirror of guix-devel@gnu.org 
* Sniping "guix deploy"
From: Julien Lepiller @ 2021-09-09 23:45 UTC
  To: guix-devel

Hi guix!

A few months ago, I published a paper on "Analyzing Infrastructure as
Code to Prevent Intra-update Sniping Vulnerabilities"
(https://lepiller.eu/pdf/hayha-extended.pdf) (and the tool:
https://github.com/roptat/hayha).

Although in the paper we focus on CloudFormation and AWS, I believe the
same kind of issue can be found in any cloud or IaC tooling, such as
guix deploy.

To give you a concrete example, imagine the following situation. Not
very realistic, but I hope it gets the point across.

You manage your local network with guix deploy, and it contains a
router and a web server. Imagine you want to update the web server's
config to add an SSH service that accepts root logins with no
password. You are aware this is a security risk, but you trust your
local network, so you also update the router's configuration to add a
firewall rule blocking any SSH attempt from the outside.

Unfortunately, although each system is updated atomically (services,
however, are not reloaded atomically), the infrastructure as a whole is
not. The server could be updated first, exposing root login to the
internet for as long as the router has not been updated. An attacker
only has to hit that window, hence the name "sniping".
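
To make the window concrete, here is a minimal sketch in Python (not
Guile, and not anything guix deploy actually does). It models the two
machines of the example as booleans and lists every state the network
passes through for a given update order. All names are hypothetical.

```python
# Hypothetical model of the example: True means "new configuration applied".
def exposed(router_updated: bool, server_updated: bool) -> bool:
    """The server accepts passwordless root SSH from outside once it has
    been updated while the router does not yet block SSH."""
    return server_updated and not router_updated


def states_during(order):
    """Return every state the network passes through, one per update step."""
    state = {"router": False, "server": False}
    seen = []
    for machine in order:
        state[machine] = True
        seen.append(dict(state))
    return seen


def has_sniping_window(order) -> bool:
    """True if some intermediate state leaves the server exposed."""
    return any(exposed(s["router"], s["server"]) for s in states_during(order))


print(has_sniping_window(["server", "router"]))  # True
print(has_sniping_window(["router", "server"]))  # False
```

Both orders end in the same final state; only the intermediate state
differs, which is why checking the final configuration alone does not
reveal the problem.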

I think this is a serious threat, despite the silly example, as the
attacker only needs to be there at the right time, with no specific
knowledge or technique. In the example, any bot would soon discover the
root login and maybe take automated actions to retain access.

However, it is also a security issue inherent to this type of tool
(and you could just as well make the same mistake manually), so it's
not clear to me what to do.

Possible mitigations rely on the user's awareness of the potential issue.
In the previous example, we would need to update the router first, and
only update the server once the router is updated. For a roll-back
(resetting the firewall and removing ssh access), the other order is
required. In other IaC tools, there is at least a way to describe
dependencies between systems/services. I think we should at least
implement such a feature in Guix too.

As a rule of thumb, when you update multiple systems and one system
provides security for another, you should update the security system
before the protected system if you restrict access, and the other way
around if you allow more access. Maybe we could add that to the manual,
in addition to letting users configure upgrade order?
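
One possible reading of this rule of thumb, as a small Python sketch.
The function name and the "restrict"/"allow" vocabulary are made up for
illustration, and the change is described from the point of view of the
system that provides security.

```python
def update_order(guard: str, protected: str, guard_change: str) -> list:
    """Order two updates according to the rule of thumb.

    guard_change says what the update does to the guarding system
    (hypothetical vocabulary for this sketch):
      "restrict": the guard blocks more than before, so it goes first
      "allow":    the guard blocks less than before, so it goes last
    """
    if guard_change == "restrict":
        return [guard, protected]
    if guard_change == "allow":
        return [protected, guard]
    raise ValueError(f"unknown guard_change: {guard_change!r}")


# Adding the firewall rule and the SSH service: router first.
print(update_order("router", "server", "restrict"))  # ['router', 'server']
# Rolling back (dropping the rule, removing SSH): server first.
print(update_order("router", "server", "allow"))     # ['server', 'router']
```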

In our paper, we were able to see that because CloudFormation has
explicit "references" between systems. It's also more of an issue in
CloudFormation, since you declare only small independent components and
not whole systems (security resources are always separate from the
resources they protect). There might be a way to improve the Guix
language to force using references between systems, which would allow
us to adopt a solution similar to what we propose in the paper. Or
maybe it's time to advocate for "immutable infrastructure" :)

Wdyt?



* Re: Sniping "guix deploy"
From: Katherine Cox-Buday @ 2021-09-10  4:06 UTC
  To: Julien Lepiller; +Cc: guix-devel

Julien Lepiller <julien@lepiller.eu> writes:

> As a rule of thumb, when you update multiple systems and one system
> provides security for another, you should update the security system
> before the protected system if you restrict access, and the other way
> around if you allow more access. Maybe we could add that to the manual,
> in addition to letting users configure upgrade order?

I am not a Guix deploy expert (although I have just begun using it --
it's great!), but it seems to me that this is a higher-order mechanism
that wields deploy in a certain way.

I.e., as I understand it, deploy is going to walk the list of machines
you give it and run the deployment synchronously. If you wanted to build
references between machines, for security, live-upgrades, or otherwise,
I would think you'd declare some kind of data structure other than a
list (probably a graph) which would tell deploy what order to perform
the deployment.

Rolling back is a whole other matter, but I imagine that same data
structure could tell deploy how to walk it in reverse.

So, not that I have time to work on any of this, but what I would do is
shore up deploy so that it's nice and robust (I think it's still early
days), and then work on a declarative way to define machines and their
dependencies, and then a function to produce a sequence from these data
structures so deploy can stay an iterative type process.
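
A rough sketch of that last step, in Python rather than Guile, using the
standard library's graphlib module (Python 3.9+). The machine names and
the DEPENDS_ON table are hypothetical; nothing like this exists in guix
deploy as far as this thread says.

```python
from graphlib import TopologicalSorter

# Hypothetical declaration: each machine maps to the set of machines
# that must be deployed before it.
DEPENDS_ON = {
    "router": set(),
    "web-server": {"router"},
    "database": {"router"},
}


def deploy_sequence(depends_on):
    """Flatten the dependency graph into a list, dependencies first.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


def rollback_sequence(depends_on):
    """Walk the same graph in reverse for a roll-back."""
    return list(reversed(deploy_sequence(depends_on)))


print(deploy_sequence(DEPENDS_ON))    # "router" comes first
print(rollback_sequence(DEPENDS_ON))  # "router" comes last
```

The relative order of "web-server" and "database" is not specified by
the graph, so the sketch makes no promise about it.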

Just late night thoughts from me :) Thanks for the input!

-- 
Katherine



* Re: Sniping "guix deploy"
From: Jack Hill @ 2021-09-10  5:07 UTC
  To: Katherine Cox-Buday; +Cc: guix-devel

Thanks Julien and Katherine

On Thu, 9 Sep 2021, Katherine Cox-Buday wrote:

> Julien Lepiller <julien@lepiller.eu> writes:
>
>> As a rule of thumb, when you update multiple systems and one system
>> provides security for another, you should update the security system
>> before the protected system if you restrict access, and the other way
>> around if you allow more access. Maybe we could add that to the manual,
>> in addition to letting users configure upgrade order?
>
> I am not a Guix deploy expert (although I have just begun using it --
> it's great!), but it seems to me that this is a higher-order mechanism
> that wields deploy in a certain way.

I'm also not a deploy expert, nor have I finished reading Julien's
paper, but I really like this idea of building mechanisms around guix
deploy, so I wanted to reply. Thanks for suggesting it.

> I.e., as I understand it, deploy is going to walk the list of machines
> you give it and run the deployment synchronously. If you wanted to build
> references between machines, for security, live-upgrades, or otherwise,
> I would think you'd declare some kind of data structure other than a
> list (probably a graph) which would tell deploy what order to perform
> the deployment.
>
> Rolling back is a whole other matter, but I imagine that same data
> structure could tell deploy how to walk it in reverse.
>
> So, not that I have time to work on any of this, but what I would do is
> shore up deploy so that it's nice and robust (I think it's still early
> days), and then work on a declarative way to define machines and their
> dependencies, and then a function to produce a sequence from these data
> structures so deploy can stay an iterative type process.
>
> Just late night thoughts from me :) Thanks for the input!

I know that CoreOS at one point took cluster-wide knowledge into account 
when performing upgrades (so as not to take down all replicas at once for 
instance). It and similar projects are probably too far afield to take 
implementation ideas from, but some of the high level concepts might be 
worth taking inspiration from.
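
As an illustration of the high-level concept only (this is not how
CoreOS implemented it), a Python sketch that splits the replicas of one
service into upgrade waves so that they are never all down at once. The
function name and the max_unavailable parameter are assumptions.

```python
def upgrade_waves(replicas, max_unavailable=1):
    """Split replicas into consecutive waves of at most max_unavailable
    machines, refusing a plan that would upgrade every replica at once."""
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    if len(replicas) > 1 and max_unavailable >= len(replicas):
        raise ValueError("this would take down every replica at once")
    return [
        replicas[i:i + max_unavailable]
        for i in range(0, len(replicas), max_unavailable)
    ]


print(upgrade_waves(["web1", "web2", "web3"]))
# [['web1'], ['web2'], ['web3']]
```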

Another higher level mechanism that we might need for deploy is some way
to do IO when building operating system definitions. Something along the
lines of how "facts" are handled in the various configuration management
tools. I see this as being useful for doing things like reconfiguring a
system, but leaving the disk partitioning as it is now without having
to declare what the partitions are ahead of time.* In Scheme, there is
nothing stopping us from doing IO at arbitrary points in an operating
system definition, but as this takes away from the declarative nature of
the configuration, having shared mechanisms for doing this in a clear,
maintainable, and principled way could be very useful.
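
One way to picture such a mechanism, sketched in Python with made-up
names: all IO is confined to a single fact-gathering step, and the
configuration is then built by a pure function of those facts. The probe
here is a fake; a real one would have to query the target machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facts:
    """Immutable snapshot of discovered state (hypothetical shape)."""
    file_systems: tuple  # (device, mount point) pairs


def gather_facts(probe) -> Facts:
    """The only place where IO happens; probe is any callable that
    returns (device, mount point) pairs."""
    return Facts(file_systems=tuple(probe()))


def build_config(host_name: str, facts: Facts) -> dict:
    """Pure function: the same host name and facts always give the
    same configuration."""
    return {"host-name": host_name, "file-systems": list(facts.file_systems)}


def fake_probe():
    # Stand-in for real discovery, so the sketch runs anywhere.
    return [("/dev/sda1", "/"), ("/dev/sda2", "/home")]


config = build_config("web-server", gather_facts(fake_probe))
print(config["file-systems"])
```

Keeping the facts in one explicit value means they can be logged or
saved alongside the configuration they produced.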

Late night brainstorming is fun.

Take care,
Jack

* I suppose file systems are the most obvious place where ugly state like
this can sneak its way in since they are essentially giant buckets to hold
state, which is probably why this example has been running around in my
head. While it might be nice to think about eliminating all state, that
doesn't seem workable, but maybe we can come up with some way to represent
the state declaratively à la Haskell's IO and State monads.


* Re: Sniping "guix deploy"
From: Katherine Cox-Buday @ 2021-09-10 15:36 UTC
  To: Jack Hill; +Cc: guix-devel

Jack Hill <jackhill@jackhill.us> writes:

> I know that CoreOS at one point took cluster-wide knowledge into
> account when performing upgrades (so as not to take down all replicas
> at once for instance). It and similar projects are probably too far
> afield to take implementation ideas from, but some of the high level
> concepts might be worth taking inspiration from.

Yeah, good point. There are probably a lot of other primitives here
worth implementing. I incorrectly scoped my statement to the
threat-model given.

> Another higher level mechanism that we might need for deploy is some
> way to do IO when building operating system definitions. Something
> along the lines of how "facts" are handled in the various
> configuration management tools. I see this as being useful for doing
> things like reconfiguring a system, but leaving the disk partitioning
> as it is now without having to declare what the partitions are
> ahead of time.* In Scheme, there is nothing stopping us from doing IO
> at arbitrary points in an operating system definition, but as this
> takes away from the declarative nature of the configuration, having
> shared mechanisms for doing this in a clear, maintainable, and
> principled way could be very useful.

Maybe a blanket #t could mean "leave this alone" when specified for an
operating-system field?
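
A sketch of how that could behave, in Python with hypothetical names.
It uses a dedicated KEEP marker instead of a literal true value, on the
assumption that a plain boolean would be ambiguous for fields whose real
value can itself be a boolean.

```python
KEEP = object()  # hypothetical "leave this alone" marker


def resolve(declared: dict, current: dict) -> dict:
    """Build the effective configuration: fields marked KEEP take their
    value from the machine's current state, all others are as declared."""
    resolved = {}
    for field, value in declared.items():
        if value is KEEP:
            if field not in current:
                raise KeyError(f"no current value to keep for {field!r}")
            resolved[field] = current[field]
        else:
            resolved[field] = value
    return resolved


declared = {"host-name": "web-server", "file-systems": KEEP}
current = {"host-name": "old-name", "file-systems": [("/dev/sda1", "/")]}
print(resolve(declared, current))
# {'host-name': 'web-server', 'file-systems': [('/dev/sda1', '/')]}
```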

> Late night brainstorming is fun.

That it is! And though this discussion is fun, I think it's very
premature. There are a handful of bugs[1] open for deploy. One is
critical[2], and I just filed one[3] because deploy can share
state/provenance between all machines deployed together. And I'm not
sure whether anyone is actively working to make it better.

As always I find myself wishing I had more time to contribute.

> * I suppose file systems are the most obvious place where ugly state
> like this can sneak its way in since they are essentially giant
> buckets to hold state, which is probably why this example has been
> running around in my head. While it might be nice to think about
> eliminating all state, that doesn't seem workable, but maybe we can
> come up with some way to represent the state declaratively à la
> Haskell's IO and State monads.

I am very fond of "trapdoors" in systems which are constrained/rigid by
design. It gives you all of the benefit of the constraints, but leaves a
way for users to work around bugs, or play with things until they get it
right and can bring it back into the fold.

[1] - https://issues.guix.gnu.org/search?query=%22guix+deploy%22+is%3Aopen
[2] - https://issues.guix.gnu.org/46756
[3] - https://issues.guix.gnu.org/50468

-- 
Katherine


