It is critical that AI systems be developed so that they can be deployed safely and robustly, in a way that broadly benefits society. Both the practical, near-term risks and harms and the potential hazards posed by more capable, more autonomous AI systems further down the road deserve rigorous examination.
We’re most interested in the detailed, real-world work of making complex technological systems safe. Consider bridges, planes, chemical processing plants, or nuclear reactors: these are all complicated systems that, thanks to significant engineering effort, are extraordinarily safe. That safety comes from designing the system to be safe from the ground up, rather than bolting safety on retroactively, and this approach requires a theoretical and practical understanding of the underlying principles of every component.
As a field, we currently lack a strong theoretical understanding of how deep learning works, let alone of more abstract concepts like human understanding, values, and ethics. We believe it is critical to make progress on understanding these phenomena. Such understanding will help us, like engineers in other fields, create systems that are safe and robust by design. We want to build agents that are interpretable by construction, so that researchers can understand a complex agent's goals and plans, as well as the anticipated side effects of its actions.
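As a loose illustration of what "interpretable by construction" can mean, here is a minimal Python sketch, entirely our own and not a description of any existing system: an agent whose goal, plan, and anticipated side effects are explicit data structures a researcher can audit before anything runs, rather than opaque internal state. All names here (`TransparentAgent`, `Step`, and so on) are hypothetical.

```python
# Hypothetical sketch: an agent whose goal, plan, and anticipated
# side effects are explicit, inspectable data rather than opaque state.
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str
    rationale: str                                      # why this action
    anticipated_side_effects: list[str] = field(default_factory=list)


@dataclass
class TransparentAgent:
    goal: str
    plan: list[Step] = field(default_factory=list)

    def propose(self, action: str, rationale: str,
                side_effects: list[str]) -> None:
        """Record a planned action together with its stated justification."""
        self.plan.append(Step(action, rationale, side_effects))

    def explain(self) -> str:
        """Render the goal and plan in a human-auditable form."""
        lines = [f"Goal: {self.goal}"]
        for i, step in enumerate(self.plan, start=1):
            lines.append(f"{i}. {step.action} (because: {step.rationale})")
            for effect in step.anticipated_side_effects:
                lines.append(f"   side effect: {effect}")
        return "\n".join(lines)


agent = TransparentAgent(goal="Schedule a meeting for the team")
agent.propose(
    action="Read participants' calendars",
    rationale="Need free slots before proposing a time",
    side_effects=["Accesses private calendar data"],
)
print(agent.explain())  # the plan can be reviewed before any action executes
```

Real agents are of course far more complex, but the design choice the sketch points at is the same: the representations a researcher needs for oversight are built into the system from the start, not reconstructed after the fact.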
We also believe that governments, international norms, and regulation have a very important role to play in ensuring that AI is deployed in a broadly beneficial way, and we will continue to collaborate on such work. For an early example of this kind of work, see the NIST AI Risk Management Framework.
We will share more about our approach to safety, as well as related research and new initiatives, over the coming months. If you’re interested in developing AI systems that are designed with safety and robustness in mind, see our job postings for an idea of our current projects.