Leveraging Large Language Models to Generate Natural Language Explanations of AI Systems - A Framework for Natural Language Explanations
Abstract
The accelerating deployment of artificial intelligence (AI) across various domains necessitates advances in explainable AI (XAI) to enhance transparency and user interaction, including calibrating trust and reliance. This thesis introduces a framework that leverages large language models (LLMs) to generate free-text natural language explanations (NLEs) of AI systems without the need for human-annotated data. One aim of the framework is to make explanations accessible and comprehensible to non-technical users. The framework integrates explainer models with LLMs to transform complex AI outputs into natural language. The thesis evaluates the framework’s effectiveness in generating faithful NLEs for a text classification task. A user study then examines how these explanations affect user satisfaction and reliance. The results show that while the framework can generate explanations faithful to the explainer model’s output, user satisfaction did not differ significantly from a traditional explanation method (LIME). However, the results indicate that NLEs can reduce over-reliance on AI systems. The thesis highlights critical considerations in selecting explainer models and in tailoring explanations to the context and to user expectations. It also opens avenues for future work, including richer interaction with explanations through conversational agents and the possibility of tailoring explanations to individual users.
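As a rough illustration of the pipeline the abstract describes, and not the thesis's actual implementation, the sketch below shows how LIME-style feature attributions from an explainer model might be verbalized into a prompt for an LLM to produce a free-text NLE. The function name, prompt wording, and example attributions are all illustrative assumptions.

```python
# Illustrative sketch (assumed, not the thesis's code): converting explainer
# output into an LLM prompt that requests a natural language explanation.

def build_nle_prompt(text, prediction, attributions, top_k=3):
    """Build an LLM prompt from LIME-style (token, weight) attributions.

    `attributions` is a list of (token, weight) pairs: positive weights
    support the predicted label, negative weights oppose it.
    """
    # Rank tokens by absolute influence and keep the top_k most influential.
    ranked = sorted(attributions, key=lambda tw: abs(tw[1]), reverse=True)[:top_k]
    evidence = "; ".join(
        f"'{tok}' ({'supports' if w > 0 else 'opposes'} the label, weight {w:+.2f})"
        for tok, w in ranked
    )
    # The LLM is instructed to stay faithful to the listed evidence only.
    return (
        f"The classifier labeled the text below as '{prediction}'.\n"
        f"Text: {text}\n"
        f"Most influential words: {evidence}.\n"
        "Explain this prediction in one short paragraph for a non-technical "
        "reader, using only the evidence listed above."
    )

prompt = build_nle_prompt(
    text="The plot was dull and the acting was worse.",
    prediction="negative",
    attributions=[("dull", 0.41), ("worse", 0.33), ("plot", -0.05), ("acting", 0.12)],
)
print(prompt)
```

Restricting the prompt to the explainer's own evidence is one plausible way to pursue the faithfulness property the abstract evaluates, since the LLM is asked to verbalize the attribution scores rather than invent its own rationale.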