
Four Principles of Explainable Artificial Intelligence

129 Citations · 2020
P. Jonathon Phillips, Carina A. Hahn, Peter Fontana

It is proposed that explainable AI systems deliver accompanying evidence or reasons for outcomes and processes; provide explanations that are understandable to individual users; provide explanations that correctly reflect the system's process for generating the output; and operate only under conditions for which they were designed and when they reach sufficient confidence in their output.

Abstract

We introduce four principles for explainable artificial intelligence (AI) that comprise the fundamental properties of explainable AI systems. They were developed to encompass the multidisciplinary nature of explainable AI, including the fields of computer science, engineering, and psychology. Because one-size-fits-all explanations do not exist, different users will require different types of explanations. We present five categories of explanation and summarize theories of explainable AI. We give an overview of the algorithms in the field that cover the major classes of explainable algorithms. As a baseline comparison, we assess how well explanations provided by people follow our four principles. This assessment provides insights into the challenges of designing explainable AI systems.