Can't fault people for studying this, I guess. But for the sort of future technology that is the main concern, i.e. super-sophisticated systems, is there anything that would prevent just circumventing the assumptions about possible decision functions, let's say by inserting a thorough game theoretical analysis somewhere along the cognitive pipeline, and use that to discard all courses of action that come out as vulnerable?
Seems like it would often not even be that expensive compared to the rest of the calculations that you might assume would be needed?
> use that to discard all courses of action that come out as vulnerable?
If we're aiming for an arbitrarily intelligent agent, what if it concludes its goals are best achieved via actions that would be blocked by such a system? How do you know it couldn't come up with a way to perform those actions without triggering that censoring mechanism via some obfuscation or subterfuge? How can you be confident you won't be outsmarted by a system whose conceit is to be smarter than you?
The one thing I've been mulling over with regard to this is that it is theoretically possible for a less intelligent agent to ensnare a more intelligent agent.
For example, Ted Kaczynski is currently in jail, and it's not because his captors are smarter than him.
One type of superintelligence is collective superintelligence. Ted may be one slightly smarter person, but he was up against many people of varying intelligence.
I don't know if he was smart so much as driven, however.