Machine learning and AI are facing a new challenge: making models more explainable.
This means developing new methodologies to describe the behaviour of widely adopted black-box models, i.e., high-performing models whose internal logic is challenging to describe, justify, and understand from a human perspective.
The ultimate goal of an explainability method is to faithfully describe the behaviour of a (black-box) model so that users can better understand its logic, thereby increasing their trust in and acceptance of the system.
Unfortunately, state-of-the-art explainability approaches may not be enough to guarantee that explanations are fully understandable from a human perspective. For this reason, human-in-the-loop methods have been widely employed to enhance and/or evaluate explanations of machine learning models. These approaches either collect human knowledge that AI systems can then employ or involve humans directly to achieve their objectives (e.g., evaluating or improving the system).
Based on these assumptions and requirements, we published a review article that presents a literature overview on collecting and employing human knowledge to improve and evaluate the understandability of machine learning models through human-in-the-loop approaches. The paper also discusses the challenges, the state of the art, and future trends in explainability.
The paper starts from the definition of the notion of “explanation” as an “interface between humans and a decision-maker that is, at the same time, both an accurate proxy of the decision-maker and comprehensible to humans”. Such a description highlights two fundamental features an explanation should have. It must be accurate, i.e., it must faithfully represent the model’s behaviour, and comprehensible, i.e., any human should be able to understand the meaning it conveys.
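To make the "accurate proxy" requirement concrete, the following is a minimal sketch that measures how often a simple explanation agrees with the model it describes, a quantity often called fidelity. It is not taken from the paper: the functions `black_box`, `surrogate`, and `fidelity`, as well as the decision rules and the evaluation grid, are hypothetical and chosen only for illustration.

```python
from itertools import product


def black_box(x1: float, x2: float) -> int:
    """Hypothetical opaque model: a nonlinear decision rule (assumption)."""
    return int(x1 * x1 + 0.5 * x2 > 0.6)


def surrogate(x1: float, x2: float) -> int:
    """Hypothetical human-readable explanation: a single threshold rule."""
    return int(x1 > 0.7)


def fidelity(model, explanation, points) -> float:
    """Fraction of points on which the explanation predicts the same label as the model."""
    agree = sum(model(*p) == explanation(*p) for p in points)
    return agree / len(points)


# Evaluate agreement on a regular 11 x 11 grid over [0, 1] x [0, 1].
grid = [i / 10 for i in range(11)]
points = list(product(grid, grid))

score = fidelity(black_box, surrogate, points)
print(f"fidelity of the surrogate rule: {score:.2f}")
```

A score like this captures only the accuracy side of the definition. Whether the rule is comprehensible cannot be computed from the model alone, which is where the human-in-the-loop approaches surveyed in the paper come in.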
The Role of Human Knowledge in Explainable AI
The figure above summarizes the four main ways to use human knowledge in explainability, namely: knowledge collection for explainability (red), explainability evaluation (green), understanding the human perspective on explainability (blue), and improving model explainability (yellow). In the schema, the icons represent human actors.
You may cite the paper as: