Explainable AI (XAI) Methods for Convolutional Neural Networks

Introduction

This is the page for the undergraduate thesis of Antonio Fernando Silva e Cruz Filho and João Gabriel Andrade de Araujo Josephik, advised by Professor Nina Hirata.

The goal of this project is to study, explore, and compare different explainability methods for Convolutional Neural Networks.

Proposal

Modern neural networks are highly complex, consisting of billions of parameters spread across multiple layers. As a result, these systems are often seen as “black box” models: it is possible to observe the input and the output, but not the internal mechanisms of the network.

In this context, multiple methods have been developed over the years to shed light on the “reasoning” behind a system’s decisions. We aim to study methods developed specifically for Convolutional Neural Networks.

Throughout the course of this research project, several important topics were (or will be) studied. These topics may include:

  • Feature visualization methods, such as DeepDream
  • Pixel attribution methods, implementing Grad-CAM (see the sketch after this list)
  • Using interpretable surrogate models to explain Convolutional Neural Networks, with techniques such as LIME
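As a concrete illustration of the pixel attribution item above, the sketch below outlines a minimal Grad-CAM computation in PyTorch: gradients of the target class score are taken with respect to the feature maps of a late convolutional layer, globally average-pooled into channel weights, and used to form a ReLU-rectified heatmap. The ResNet-18 backbone, the choice of layer4, and the input handling are illustrative assumptions, not code from this project.

```python
# Minimal Grad-CAM sketch (illustrative; model and layer choice are assumptions).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1").eval()

activations, gradients = {}, {}

def forward_hook(module, inputs, output):
    # Save the feature maps of the hooked layer during the forward pass.
    activations["value"] = output.detach()

def backward_hook(module, grad_input, grad_output):
    # Save the gradients flowing back into the hooked layer.
    gradients["value"] = grad_output[0].detach()

# Hook the last convolutional block (assumption: ResNet-18's layer4).
model.layer4.register_forward_hook(forward_hook)
model.layer4.register_full_backward_hook(backward_hook)

def grad_cam(image, target_class=None):
    """Return a heatmap (H x W, values in [0, 1]) for `target_class`.

    `image` is assumed to be a normalized tensor of shape (1, 3, H, W).
    """
    scores = model(image)
    if target_class is None:
        target_class = scores.argmax(dim=1).item()
    model.zero_grad()
    scores[0, target_class].backward()

    acts = activations["value"]                       # (1, C, h, w) feature maps
    grads = gradients["value"]                        # (1, C, h, w) gradients
    weights = grads.mean(dim=(2, 3), keepdim=True)    # global-average-pooled gradients
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))  # weighted sum + ReLU
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    return cam[0, 0]
```

The resulting heatmap can be overlaid on the input image to highlight the regions that most influenced the prediction for the chosen class.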

Using the tools studied in this research, the project aims to contribute to existing knowledge about explainability and, possibly, to create new tools and techniques for explaining deep learning models.

Important Links

  • Thesis
  • Poster
  • Source Code
  • Presentation slides

References