Capsicum vs Pledge Final Thoughts

Published: Monday, June 15, 2020
Tags: tech code dhcpcd sandbox

Following on from Capsicum vs Pledge Part 2, I thought I would post my final thoughts on the topic as the development of these sandbox technologies draws to a close in dhcpcd.

But first, let us discuss …

The POSIX Resource Limited sandbox

POSIX documents setrlimit(2), which caps the resources a process may consume. Using it to disable the ability to open new files, sockets, etc., or create new processes is actually pretty powerful.
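As a rough illustration of the idea, here is a minimal sketch in C. It is not dhcpcd's actual code: the function names are made up for this example, and RLIMIT_NPROC is assumed to be available as an extension since POSIX itself does not define it. The last function applies the limits in a forked child, so the caller stays unrestricted, and reports whether open(2) was then refused.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Set both the soft and the hard limit of one resource to zero. */
int
rlimit_zero(int resource)
{
	struct rlimit rzero = { .rlim_cur = 0, .rlim_max = 0 };

	return setrlimit(resource, &rzero);
}

/* Hypothetical helper: apply the limits discussed in the text. */
int
sandbox_rlimits(void)
{
	/* No new file descriptors: files, sockets, pipes and so on. */
	if (rlimit_zero(RLIMIT_NOFILE) == -1)
		return -1;
	/* No file may be written past a size of zero bytes. */
	if (rlimit_zero(RLIMIT_FSIZE) == -1)
		return -1;
#ifdef RLIMIT_NPROC
	/* Not in POSIX, but common: no new processes for this user. */
	if (rlimit_zero(RLIMIT_NPROC) == -1)
		return -1;
#endif
	return 0;
}

/*
 * Apply the limits in a child and try to open a file there.
 * Returns 1 if open(2) was refused with EMFILE, 0 if it was not,
 * and -1 if the check itself could not be run.
 */
int
open_refused_after_limits(void)
{
	pid_t pid;
	int status, fd;

	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		if (sandbox_rlimits() == -1)
			_exit(2);
		fd = open("/dev/null", O_RDONLY);
		_exit(fd == -1 && errno == EMFILE ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	if (WEXITSTATUS(status) == 0)
		return 1;
	return WEXITSTATUS(status) == 1 ? 0 : -1;
}
```

Descriptors that were open before the limits were applied keep working, which is the whole point: the process is left with exactly the resources it was given.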

Thanks to the privilege separation (privsep) that dhcpcd now has in order to support both Capsicum and Pledge, this turned out to be pretty easy to implement. The only issue is the poll(2) interface, which dhcpcd makes great use of. Implementations found on Linux, OpenBSD and Solaris return EINVAL when the nfds argument is greater than RLIMIT_NOFILE, whereas the other OSes dhcpcd supports don't. On those three the limit therefore cannot be dropped to zero and has to stay high enough for poll(2) to keep working, which means an attacker could close an existing file descriptor and try to open a new one. On OpenBSD this is not much of a big deal because pledge should stop that from happening. On Linux and Solaris, they can't open an existing file thanks to the chrooted empty directory, but they can create a new one. Thanks to RLIMIT_FSIZE, though, they can't actually write to it. At most they could create a network socket and send arbitrary data over it. For Linux, we could look into using seccomp to stop this.
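The poll(2) difference is easy to probe. The sketch below is only an illustration, with a function name invented for this example: it forks a child, sets RLIMIT_NOFILE to zero there, and polls one already-open pipe descriptor with a zero timeout to see whether the call is accepted.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Returns 1 if poll(2) still accepts an already-open descriptor once
 * RLIMIT_NOFILE is zero, 0 if it fails with EINVAL, and -1 if the
 * probe itself could not be run.
 */
int
poll_works_with_zero_nofile(void)
{
	struct rlimit rzero = { .rlim_cur = 0, .rlim_max = 0 };
	struct pollfd pfd;
	int fds[2], status;
	pid_t pid;

	if (pipe(fds) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		if (setrlimit(RLIMIT_NOFILE, &rzero) == -1)
			_exit(2);
		pfd.fd = fds[0];
		pfd.events = POLLIN;
		pfd.revents = 0;
		/* nfds is 1, which is now greater than RLIMIT_NOFILE. */
		if (poll(&pfd, 1, 0) != -1)
			_exit(0);
		_exit(errno == EINVAL ? 1 : 2);
	}
	close(fds[0]);
	close(fds[1]);
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	if (WEXITSTATUS(status) == 0)
		return 1;
	return WEXITSTATUS(status) == 1 ? 0 : -1;
}
```

Going by the behaviour described above, this should return 0 on Linux, OpenBSD and Solaris and 1 on NetBSD and FreeBSD.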

For implementations such as NetBSD and FreeBSD, setting RLIMIT_NOFILE to zero with poll(2) still working means they cannot create any new file descriptors. This means that if an attacker breaches a resource limited process, it can only work with the resources it already has because it cannot fork another process, nor open any files, sockets, etc. It's also running as an unprivileged user locked in an empty directory, so there is nothing to see or do there. All the resources it currently has are:

PF_INET, PF_LINK, etc sockets that can only query for data
network proxy process (receives only, does not send)
BPF processes (send and receive, the read and write filters are also locked)
privileged actioneer (filters ioctls, validates paths and outbound traffic)

So the only way to do anything outside of what dhcpcd normally does is to use a facility that does not need to create a file or socket, or fork a process. The only remaining avenue of attack is to break the privileged actioneer process, which cannot be protected by any sandboxing as it needs to do a lot of stuff. Both OpenBSD's and FreeBSD's dhclient have such a privileged process as well.

The privileged actioneer process itself doesn't do a great deal. For example, to add a route on BSD you create an RTM_ADD message. The master process will do this and, instead of writing to the PF_ROUTE socket itself, it will pass the message to the privileged actioneer process, which in turn writes to its PF_ROUTE socket. Every ioctl, path accessed or network bound packet is validated by the privileged actioneer. Even though it's generic, it's also locked down.
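To give a feel for what that validation could look like, here is a small sketch. It is not taken from dhcpcd: the directory, the request numbers and the function names are all placeholders invented for this example. The idea is simply that the privileged side refuses anything that is not on an allow-list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: the only directory the privileged process will touch. */
#define ALLOWED_DIR "/var/db/dhcpcd/"

/* Placeholder values; a real list would hold actual ioctl requests. */
static const unsigned long allowed_ioctls[] = { 0x1001UL, 0x1002UL };

/* Accept only ioctl requests that appear on the allow-list. */
int
ioctl_ok(unsigned long req)
{
	size_t i;

	for (i = 0; i < sizeof(allowed_ioctls) / sizeof(allowed_ioctls[0]); i++)
		if (allowed_ioctls[i] == req)
			return 1;
	return 0;
}

/* Accept only paths inside ALLOWED_DIR that contain no ".." at all. */
int
path_ok(const char *path)
{
	size_t dlen = strlen(ALLOWED_DIR);

	if (path == NULL || strncmp(path, ALLOWED_DIR, dlen) != 0)
		return 0;
	/* The directory itself is not a file we would act on. */
	if (path[dlen] == '\0')
		return 0;
	/* Stricter than necessary, but simple to reason about. */
	return strstr(path, "..") == NULL;
}
```

A request from a compromised sandboxed process for /etc/passwd, or for an ioctl that is not listed, is rejected before the privileged process acts on it.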

So what extra do Capsicum and Pledge bring to the table?

Both bring system call filtering. For example, sysctl(2) does not need to create a new resource, so resource limits alone cannot block it. Information available to an ordinary user, such as uname, may not be something you want leaked to these sandboxed processes. But then the question to ask is what can they do with it?

Pledge overcomes the RLIMIT_NOFILE limitation for poll(2) on OpenBSD.

Capsicum goes a bit further by limiting the rights of each resource you have in terms of what you can do with it.
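For comparison, here is a minimal sketch of entering each sandbox. The helper name is made up, and the promise and rights chosen are just an example for a process that only reads, writes and polls an existing descriptor; they are not the set dhcpcd actually uses. On any other system the helper fails with ENOSYS.

```c
#include <errno.h>
#include <unistd.h>
#if defined(__FreeBSD__)
#include <sys/capsicum.h>
#endif

/*
 * Hypothetical helper: restrict fd where the platform allows it, then
 * sandbox the whole process. Returns 0 on success and -1 on failure,
 * with errno set to ENOSYS where neither mechanism is available.
 */
int
sandbox_enter(int fd)
{
#if defined(__OpenBSD__)
	/* Pledge is per-process; there are no per-descriptor rights. */
	(void)fd;
	return pledge("stdio", NULL);
#elif defined(__FreeBSD__)
	cap_rights_t rights;

	/* Only allow reading, writing and polling on this descriptor. */
	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_EVENT);
	if (cap_rights_limit(fd, &rights) == -1)
		return -1;
	/* Enter capability mode: no more access to global namespaces. */
	return cap_enter();
#else
	(void)fd;
	errno = ENOSYS;
	return -1;
#endif
}
```

The Pledge branch is a single call, while the Capsicum branch needs a rights set for every descriptor the process keeps.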

What they both bring to the table though is making sandboxing easier. Here, Pledge is the outright winner. As I pointed out in my initial blog post, Pledge is really easy. But make it too easy and it’s not as secure as it could be. Capsicum is harder and the resource limited sandbox is harder still.

The take-away point from this is that while Capsicum and Pledge are nice, they don't beat a good design.