Maybe I'm missing something about the use case, but I'm not sure I quite follow.
Sure, something needs to have permission to use the higher level of privilege. On your typical POSIX OS, your program is probably born with the ability to create arbitrary TCP/UDP sockets by default; on a capability OS, maybe you've explicitly provided it with access to your network stack. Regardless, at the entry point to your program you presumably have modules providing arbitrary network access in scope somehow.
If I'm understanding correctly, the case you described is that you have an HTTP client module that you'd like to have direct access to the network, but you'd like to restrict the consumers of the HTTP client to only querying certain hosts. From the start of your program, you'd instantiate an HTTP client (passing it a capability to use the network interface) then instantiate one of those HTTP client proxy objects that only allows communication with one host (passing it a capability to use the HTTP client). From there, you pass the capability to that proxy object to the unprivileged consumer of the module.
This seems to work without any kind of stack walking authentication logic, just normal variable scope, provided the language is capability-based. Am I missing something?
Exactly. What usually happens in capability systems is that the main() method gets all the capabilities (or whatever capabilities the user allowed it) and then does dependency injection to distribute those to other components. No need for complex stack-based authentication or policy rule evaluation.
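To make that concrete, here is a minimal Java sketch of the wiring described above. Every name in it (`Network`, `HttpService`, `SingleHostClient`, `WeatherWidget`, the example hosts) is made up for illustration, and the "network" is a stand-in lambda rather than real sockets. It only works as a security boundary if the language really is capability-safe, i.e. the widget has no other way to reach the network.

```java
// Hypothetical capability: whoever holds a Network reference may contact any host.
interface Network {
    String fetch(String host, String path);
}

// Hypothetical HTTP client interface handed to consumers.
interface HttpService {
    String get(String host, String path);
}

// The privileged client: it can reach the network only via the object it was given.
final class DirectHttpClient implements HttpService {
    private final Network network;

    DirectHttpClient(Network network) {
        this.network = network;
    }

    @Override
    public String get(String host, String path) {
        return network.fetch(host, path);
    }
}

// Attenuating proxy: forwards requests for one host and rejects everything else.
final class SingleHostClient implements HttpService {
    private final HttpService delegate;
    private final String allowedHost;

    SingleHostClient(HttpService delegate, String allowedHost) {
        this.delegate = delegate;
        this.allowedHost = allowedHost;
    }

    @Override
    public String get(String host, String path) {
        if (!allowedHost.equals(host)) {
            throw new SecurityException("host not allowed: " + host);
        }
        return delegate.get(host, path);
    }
}

// Unprivileged consumer: it only ever sees the HttpService it was handed.
final class WeatherWidget {
    private final HttpService http;

    WeatherWidget(HttpService http) {
        this.http = http;
    }

    String forecast() {
        return http.get("weather.example", "/today");
    }

    String snoop() {
        return http.get("intranet.example", "/secrets");
    }
}

class CapabilityDemo {
    public static void main(String[] args) {
        // main() holds the full authority and hands out narrower views of it.
        Network network = (host, path) -> "response from " + host + path;
        HttpService full = new DirectHttpClient(network);
        HttpService restricted = new SingleHostClient(full, "weather.example");
        WeatherWidget widget = new WeatherWidget(restricted);

        System.out.println(widget.forecast());
        try {
            widget.snoop();
        } catch (SecurityException e) {
            System.out.println("blocked: " + e.getMessage());
        }
    }
}
```

The restriction lives entirely in which reference the widget receives; nothing inspects the call stack.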
Indeed, if you look at the history of Java sandbox escapes they are largely confused deputy attacks: some privileged code source can be tricked into doing something it shouldn’t do.
You can build a sandboxing language without any sort of stack walking. seL4+C does this. It doesn't have especially good usability at scale, and it's not easy to modularise.
You're imagining a system where there's no specific authentication system for code. Instead in order to use a library, you need to explicitly and manually obtain all the capabilities it needs then pass them in, and in main() you get a kind of god object that can do everything that then needs to be progressively wrapped. If a library needs access to a remote service, you have to open the socket yourself and pass that in, and the library then needs to plumb it through the whole stack manually to the point where it's needed. If the library develops a need for a new permission then the API must change and again, the whole thing has to be manually plumbed through. This is unworkable when you don't control all the code in question and thus can't change your APIs, and as sandboxing is often used for plugins, well, that's a common problem.
There's no obvious way to modularise or abstract away that code. It can't come from the library itself because that's what you're trying to sandbox. So you have to wire up the library to the capabilities yourself. In some cases this would be extremely painful. What if the library in question is actually a networking library like Netty? There could be dozens or hundreds of entry points that eventually want to open a network connection of some sort.
What does this god object look like? It would need to hold basically the entire operating system interface via a single access point. That's not ideal. In particular, loading native code would need to also be a capability, which means any library that optimised by introducing a C version of something would need to change its entire API, potentially in many places. This sort of design pattern would also encourage/force every library to have a similar "demi-god" object approach, to reduce the pain of repeatedly passing in or creating capabilities. Sometimes that would work OK, other times it wouldn't.
The stack walking approach is a bit like SELinux. It allows for a conventional OO class library, without the need for some sort of master or god object, and all the permissions things need can be centralised in one place. Changes to permissions are just one or two extra lines in the security config file rather than a potentially large set of code diffs.
Now all that said, reasonable people can totally disagree about all of this. The JVM has been introducing more capability objects with time. For example the newer MethodHandle reflection object is a capability. FileChannel is a capability (I think!). You could build a pure capability language that runs on the JVM and maybe someone should. Perhaps the usability issues are not as big a deal as they seem. It would require libraries to be wrapped and their APIs changed, including the Java standard library, but the existing functionality could all be reused. The new libraries would just be a thin set of wrappers and forwarders over pre-existing functionality, but there'd be no way for anything except the god object to reach code that'd do a stack walk. Then the security manager can be disabled, and no checks will occur. It'd be a pure object capability approach.
> If a library needs access to a remote service, you have to open the socket yourself and pass that in, and the library then needs to plumb it through the whole stack manually to the point where it's needed.
You don't need to do this. There are a variety of ways to handle this, just as you would any other kind of dependency injection:
1. Design libraries to actually be modular so that dependencies (including capabilities) can be injected just where they are needed.
2. Pass in a factory object that lets the library construct sockets as and when it needs them. You can then enforce any arbitrary checks at the point of creating the socket. (This is much more flexible than a Java policy file).
3. Use a powerbox pattern [1] to allow the user to be directly asked each time the library attempts to open a socket. This is not always good UX, but sometimes it is the right solution.
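As a rough illustration of option 2, the factory might look something like the Java sketch below. `Connection`, `ConnectionFactory`, `AllowListFactory` and `FeedLibrary` are hypothetical names, and `Connection` is a stand-in for a real socket so the example runs without network access.

```java
import java.util.Set;

// Hypothetical stand-in for a real socket.
interface Connection {
    String remote();
}

// Hypothetical capability: the only way the library can obtain connections.
interface ConnectionFactory {
    Connection open(String host, int port);
}

// Wraps another factory and applies an arbitrary check at creation time.
final class AllowListFactory implements ConnectionFactory {
    private final ConnectionFactory delegate;
    private final Set<String> allowedHosts;

    AllowListFactory(ConnectionFactory delegate, Set<String> allowedHosts) {
        this.delegate = delegate;
        this.allowedHosts = allowedHosts;
    }

    @Override
    public Connection open(String host, int port) {
        if (!allowedHosts.contains(host)) {
            throw new SecurityException("connection refused by policy: " + host);
        }
        return delegate.open(host, port);
    }
}

// The library asks the factory for connections as and when it needs them;
// it never needs a pre-opened socket plumbed through its call stack.
final class FeedLibrary {
    private final ConnectionFactory connections;

    FeedLibrary(ConnectionFactory connections) {
        this.connections = connections;
    }

    String refresh(String host) {
        Connection c = connections.open(host, 443);
        return "connected to " + c.remote();
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        // Unrestricted factory held by the application, not by the library.
        ConnectionFactory raw = (host, port) -> () -> host + ":" + port;
        ConnectionFactory limited = new AllowListFactory(raw, Set.of("feeds.example"));
        FeedLibrary library = new FeedLibrary(limited);

        System.out.println(library.refresh("feeds.example"));
        try {
            library.refresh("evil.example");
        } catch (SecurityException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check inside `open` is ordinary code, so it can depend on anything the application knows at that moment (time of day, user consent, rate limits), which is where the extra flexibility comes from.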
> If the library develops a need for a new permission then the API must change and again, the whole thing has to be manually plumbed through.
Capturing permission requirements in the API is a good thing! With the stack walking/policy based approach I won't know the library needs a new permission until some library call suddenly fails at runtime.
The policy file isn't required, by the way. That's just a default implementation. My PDF viewer had a hard-coded policy and didn't use the file.
OK, so in a pure capability language how would you implement this: program A depends on dynamically loaded/installed plugin B written by some third party, that in turn depends on library C. One day library C gets a native implementation of some algorithm to speed it up. To load that native library requires a capability, as native code can break the sandbox. However:
1. You can't change the API of C because plugin B depends on it and would break.
2. You can't pass in a "load native library" capability to plugin B because you don't know in advance that B wants to use C, and if you did, B could just grab the capability before it gets passed to C and abuse it. So you need to pass the capability directly from A to C. But now A has to have a direct dependency on C and initialise it even if it's not otherwise being used by A or B.
Stack walking solves both these problems. You can increase the set of permissions required by library C without changing its callers, and you don't have the problem of needing to short-circuit everything and create a screwed up dependency graph.
> With the stack walking/policy based approach I won't know the library needs a new permission until some library call suddenly fails at runtime
You often wouldn't need to. What permissions a module has is dependent on its implementation. It's legitimate for a library to be upgraded such that it needs newer permissions but that fact is encapsulated and abstracted away - just like if it needed a newer Java or a newer transitive dependency.
> OK, so in a pure capability language how would you implement this: program A depends on dynamically loaded/installed plugin B written by some third party, that in turn depends on library C. One day library C gets a native implementation of some algorithm to speed it up. To load that native library requires a capability, as native code can break the sandbox.
Now, I'm a little outside my area of expertise due to not having worked with capability systems very much yet. (There aren't that many of them and they're still often obscure, so even just trying to gain experience with them is difficult at this point.)
But that said... in an ideal capability system, isn't the idea that native code could just break the sandbox also wrong? I would imagine that in such a system, depending on another module that's running native code would be just fine, and the capability system's constraints would still apply. Maybe that could be supported by the OS itself on a capability OS; maybe the closest thing we'll get to native code for that on our existing POSIX systems is something like WASI[0].
> You often wouldn't need to. What permissions a module has is dependent on its implementation. It's legitimate for a library to be upgraded such that it needs newer permissions but that fact is encapsulated and abstracted away - just like if it needed a newer Java or a newer transitive dependency.
If our goal is to know that the dependencies we're using don't have more authority than they need, isn't it a problem if a module's permissions may increase without explicit input from the module's user (transitive or otherwise)?
One of the foundations of object-capability security is memory safety, so loading arbitrary native code does subvert that. You can get around this by, for example, requiring native code to be loaded in a separate process. As you say, a capability OS and/or CPU architecture [1] is able to confine native code.
> isn’t it a problem if a module’s permissions may increase without explicit input from the module’s user (transitive or otherwise)?
The module's permissions can't increase without explicit input, e.g. changes to the policy file. But the person who cares about the sandbox integrity is the user of the overall software or computing system. The plugin developer doesn't really care how the API is implemented or what permissions it needs. They just want it to work. The person who cares is the person who owns the resource or data an attacker may be trying to breach.
The beauty of object-capability security is that it completely aligns with normal object-oriented design. So you can always recast these discussions to not be about security: how would I inject any other new dependency I needed without changing the API of all intermediaries? And there is a whole literature of design patterns for doing this.
All you'd do there is make the injector a semantic equivalent of the AccessController. The injector must have some sort of security policy, after all, to decide whether a component is allowed to request injection of a capability. Whether you structure it as a single subsystem that is responsible for intercepting object construction and applying policy based on the home module of what's being constructed, or whether you determine that module via stack walks, the end result is very similar: some central engine decides what components can do and then applies that policy.
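To show what I mean by a policy-holding injector, here is a toy Java sketch. `PolicyInjector`, `NetworkAccess` and the component names are all invented for illustration; no existing DI engine is being described. The component name is passed in as a plain string purely to keep the example short, whereas a real engine would have to derive the requester's identity itself (from the class being constructed, or from a stack walk), which is exactly the part that makes it resemble the AccessController.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical capability type.
interface NetworkAccess {
    String fetch(String host);
}

// Toy central engine: it owns the capabilities and the policy that says
// which component may be given which capability type.
final class PolicyInjector {
    private final Map<String, Set<Class<?>>> policy = new HashMap<>();
    private final Map<Class<?>, Object> capabilities = new HashMap<>();

    // Register the object that implements a capability type.
    <T> void provide(Class<T> type, T capability) {
        capabilities.put(type, capability);
    }

    // Policy entry: the named component may receive this capability type.
    void grant(String component, Class<?> type) {
        policy.computeIfAbsent(component, k -> new HashSet<>()).add(type);
    }

    // Called when a component asks for a capability to be injected.
    <T> T request(String component, Class<T> type) {
        Set<Class<?>> granted = policy.getOrDefault(component, Set.of());
        if (!granted.contains(type)) {
            throw new SecurityException(
                component + " may not receive " + type.getSimpleName());
        }
        Object capability = capabilities.get(type);
        if (capability == null) {
            throw new IllegalStateException("no provider for " + type.getSimpleName());
        }
        return type.cast(capability);
    }
}

class InjectorDemo {
    public static void main(String[] args) {
        PolicyInjector injector = new PolicyInjector();
        injector.provide(NetworkAccess.class, host -> "fetched " + host);
        injector.grant("http-library", NetworkAccess.class);

        NetworkAccess net = injector.request("http-library", NetworkAccess.class);
        System.out.println(net.fetch("example.org"));

        try {
            injector.request("untrusted-plugin", NetworkAccess.class);
        } catch (SecurityException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Granting a component a new permission here is one more `grant` call in a central place, which is the same shape as adding a line to a policy file.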
The Java approach is nice because it avoids any need for DI. DI is not a widely accepted pattern, and there are no DI engines that would support this kind of policy-driven injection. Whilst DI is popular in some areas of software like Java web servers, it hardly features in most other languages and areas. No programming language has built-in support for it, and that includes modern languages like Kotlin. DI engines have meanwhile changed how they work pretty radically over time: compare the original Spring XML DI to Guice to Dagger 2. Plus, DI is awkward when the dependency you need isn't a singleton. How would I express, for example, "I need a capability injected that gives me access to the $HOME/.local/cache/app-name directory"? Annotation-based DI struggles with this, but with the AccessController it's natural: the component just requests what it needs, and that's checked against a policy, which can be dynamically loaded from a file or built by code.