adversarial-explainable-ai by hbaniecki

Resource list on adversarial explainable AI (AdvXAI)

created 5 years ago
321 stars

Top 85.7% on sourcepulse

Project Summary

This repository serves as a curated list of academic papers focusing on adversarial attacks and defenses within Explainable Artificial Intelligence (XAI). It aims to consolidate research on the vulnerabilities of XAI methods, providing a resource for researchers and practitioners interested in the security and trustworthiness of AI explanations.

How It Works

The project compiles a comprehensive survey of literature exploring how adversarial manipulations can fool XAI techniques or compromise their integrity. It categorizes attack strategies and defense mechanisms, offering a unified notation and taxonomy to structure the field of Adversarial Explainable AI (AdvXAI). The goal is to highlight known vulnerabilities and propose future research directions for developing more robust interpretation methods.

Quick Start & Requirements

This repository is a collection of research papers and does not involve direct code execution or installation. All requirements are related to accessing and reading academic publications.

Highlighted Details

  • Comprehensive survey of adversarial attacks and defenses in XAI.
  • Unified notation and taxonomy for the AdvXAI field.
  • Discussion of vulnerabilities in popular XAI methods like LIME and SHAP.
  • Identification of research gaps and future directions for robust XAI.
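To make the LIME/SHAP vulnerability concrete, below is a minimal, hypothetical sketch (not code from the repository) of the kind of off-manifold "scaffolding" attack the survey catalogues: a model that relies on a sensitive feature on realistic inputs, but switches to an innocuous feature on the random perturbations a LIME-style explainer generates, so the explanation hides the bias. All names and the detection heuristic here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def biased_model(X):
    # Hypothetical scaffolded classifier: real data always has x[2] == 1,
    # while Gaussian perturbations almost never do. On-manifold inputs are
    # classified by the sensitive feature x[0]; off-manifold (perturbed)
    # inputs by the innocuous feature x[1], hiding the bias.
    on_manifold = np.isclose(X[:, 2], 1.0)
    return np.where(on_manifold, X[:, 0] > 0, X[:, 1] > 0).astype(float)

def lime_style_attribution(model, x, n_samples=5000, scale=1.0):
    # Crude LIME-style explainer: sample Gaussian perturbations around x
    # and fit a linear surrogate to the model's outputs by least squares.
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))
    y = model(Z)
    A = np.column_stack([Z - x, np.ones(n_samples)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1]  # per-feature surrogate weights

x = np.array([1.0, -1.0, 1.0])  # an on-manifold instance
weights = lime_style_attribution(biased_model, x)
# The surrogate assigns its weight to the innocuous feature x[1], even
# though on real data the model's decision hinges on the sensitive x[0].
```

Because the explainer's perturbations fall off the data manifold, the surrogate only ever observes the innocuous branch; defenses surveyed in the repository typically constrain perturbations to the data distribution for exactly this reason.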

Maintenance & Community

The primary contributor is Hubert Baniecki, whose accompanying survey paper was published in Information Fusion. The repository itself is a static collection of research links.

Licensing & Compatibility

The repository does not specify a license, as it is a collection of links to academic papers. Licensing of the individual papers is governed by their respective publishers.

Limitations & Caveats

This repository is a curated list of papers and does not provide any code or tools for implementing or testing adversarial XAI techniques. Its utility is limited to academic research and literature review.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
