ConLUX: Concept-Based Local Unified Explanations.

CoRR (2024)

Abstract

With the rapid advancement of machine learning models, there is a significant demand for model-agnostic explanation techniques, which can explain these models across different architectures. Mainstream model-agnostic explanation techniques generate local explanations based on basic features (e.g., words for text models and (super-)pixels for image models). However, these explanations often do not align with the decision-making processes of the target models and end-users, resulting in explanations that are unfaithful and difficult for users to understand. On the other hand, concept-based techniques provide explanations based on high-level features (e.g., topics for text models and objects for image models), but most are model-specific or require additional pre-defined external concept knowledge. To address this limitation, we propose ConLUX, a general framework to provide concept-based local explanations for any machine learning model. Our key insight is that we can automatically extract high-level concepts from large pre-trained models, and uniformly extend existing local model-agnostic techniques to provide unified concept-based explanations. We have instantiated ConLUX on four different types of explanation techniques: LIME, Kernel SHAP, Anchor, and LORE, and applied these techniques to text and image models. Our evaluation results demonstrate that 1) compared to the vanilla versions, ConLUX offers more faithful explanations and makes them more understandable to users, and 2) by offering multiple forms of explanations, ConLUX outperforms state-of-the-art concept-based explanation techniques specifically designed for text and image models, respectively.
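To make the idea of lifting a local model-agnostic explainer from basic features to concepts more concrete, here is a minimal sketch of a LIME-style surrogate fitted over binary concept masks. It is not the authors' implementation. The function `concept_lime`, the `render` callback (which stands in for whatever step turns a set of kept concepts back into a model input), the kernel choice, and the toy `predict` model are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def concept_lime(predict, n_concepts, render, n_samples=500, kernel_width=0.75, seed=0):
    """LIME-style sketch: fit a weighted linear surrogate over concept masks.

    predict: black-box model returning a scalar score (hypothetical interface).
    render:  hypothetical callback mapping a 0/1 concept mask to a model input.
    """
    rng = np.random.default_rng(seed)
    # Each row is a perturbation: 1 keeps a concept, 0 removes it.
    masks = rng.integers(0, 2, size=(n_samples, n_concepts))
    masks[0] = 1  # always include the unperturbed input
    # Query the black box on the input rendered from each mask.
    y = np.array([predict(render(m)) for m in masks], dtype=float)
    # Proximity weights: exponential kernel on the fraction of removed concepts.
    dist = 1.0 - masks.mean(axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Weighted least squares with an intercept column.
    X = np.hstack([np.ones((n_samples, 1)), masks])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # one attribution weight per concept

# Toy usage: three made-up concepts and a made-up model that ignores "price".
concepts = ["food", "service", "price"]
render = lambda m: {c for c, keep in zip(concepts, m) if keep}
predict = lambda kept: 0.1 + 0.6 * ("food" in kept) + 0.3 * ("service" in kept)
weights = concept_lime(predict, len(concepts), render)
print(dict(zip(concepts, np.round(weights, 3))))
```

In this toy setting the black box is itself linear in the concept indicators, so the surrogate's weights match the model's own coefficients; for a real model the weights are only a local approximation around the explained input.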

Chat Paper

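[Key points]: The paper proposes the ConLUX framework, which uses automatically extracted high-level concepts to provide concept-based local unified explanations for any machine learning model, improving both the faithfulness of the explanations and how well users understand them.

[Method]: High-level concepts are extracted automatically from large pre-trained models, and existing local model-agnostic techniques are uniformly extended to produce concept-based local explanations.

[Experiments]: The authors instantiate ConLUX on four explanation techniques (LIME, Kernel SHAP, Anchor, and LORE) and apply them to text and image models. The results show that ConLUX provides more faithful and more understandable explanations than the vanilla versions, and that it outperforms state-of-the-art concept-based explanation techniques designed specifically for text and image models. The dataset names are not explicitly mentioned in the paper.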
