audio_adversarial_examples by carlini

Research code for targeted audio adversarial examples

Created 8 years ago

308 stars

Top 87.3% on SourcePulse

1 Expert Loves This Project

Clarence Chio

Cofounder of Coverbase, Unit21

Project Summary

This repository provides code for generating targeted adversarial examples against speech-to-text (STT) systems, specifically DeepSpeech. It lets researchers and security professionals probe the robustness of STT models by crafting audio whose perturbation is hard for humans to notice but which the model transcribes incorrectly.

How It Works

The project implements optimization-based attacks that search for minimal audio perturbations which change the STT output. Because the STT model (DeepSpeech) is differentiable, the gradient of the loss function with respect to the input audio can be computed and used to guide the search. This gradient-based approach makes the attack targeted: the perturbed speech is transcribed as a specific phrase chosen by the attacker rather than as an arbitrary error.
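The gradient-guided search described above can be illustrated with a minimal sketch. It is not the repository's code: the "model" here is a hypothetical linear classifier that assigns one label to each fixed-size audio frame, the loss is per-frame cross-entropy rather than the loss a real STT system would use, and every name (`targeted_attack`, `target_loss`, `eps`, `lr`) is an assumption made for illustration. Only the overall pattern carries over: differentiate a target loss with respect to the input and step the perturbation downhill while keeping it small.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def target_loss(audio, weights, target):
    """Mean cross-entropy between the toy model's per-frame output and the target labels."""
    frames = audio.reshape(-1, weights.shape[0])
    probs = softmax(frames @ weights)
    return float(-np.log(probs[np.arange(len(target)), target]).mean())


def targeted_attack(audio, target, weights, eps=0.05, lr=0.005, steps=300):
    """Projected gradient descent on a perturbation bounded by eps per sample.

    `weights` stands in for a differentiable model (hypothetical, not DeepSpeech).
    """
    frames = audio.reshape(-1, weights.shape[0])
    onehot = np.eye(weights.shape[1])[target]
    delta = np.zeros_like(frames)
    for _ in range(steps):
        probs = softmax((frames + delta) @ weights)
        # Gradient of the per-frame cross-entropy with respect to the input samples.
        grad = (probs - onehot) @ weights.T
        # Step toward the target labels, then clip so the perturbation stays small.
        delta = np.clip(delta - lr * grad, -eps, eps)
    return (frames + delta).reshape(audio.shape)


rng = np.random.default_rng(0)
frame_len, n_labels, n_frames = 64, 4, 6
weights = rng.normal(size=(frame_len, n_labels))         # toy model parameters
audio = rng.normal(scale=0.1, size=n_frames * frame_len)  # toy "recording"
target = rng.integers(0, n_labels, size=n_frames)         # attacker-chosen labels

adversarial = targeted_attack(audio, target, weights)
loss_before = target_loss(audio, weights, target)
loss_after = target_loss(adversarial, weights, target)
predicted = (adversarial.reshape(-1, frame_len) @ weights).argmax(axis=1)

print("target loss before:", loss_before, "after:", loss_after)
print("frames matching target:", int((predicted == target).sum()), "of", n_frames)
```

The clipping step is one simple way to keep the perturbation bounded; it is used here only because it is easy to verify, and the distortion measure and optimizer in the actual project may differ.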

Quick Start & Requirements

Install/Run: Docker image is provided for GPU execution.
- Build: docker build -t aae_deepspeech_093_gpu .
- Run: docker run --gpus all -v /absolute/path/to/data:/data -v /absolute/path/to/tmp:/tmp -ti aae_deepspeech_093_gpu
Prerequisites: NVIDIA GPU, CUDA 10.1, CuDNN v7.6, Docker, NVIDIA Container Toolkit.
Setup: Requires building a Docker image and setting up NVIDIA Container Toolkit for GPU support.
Links:
- Paper: https://arxiv.org/abs/1801.01944
- Docker install: https://docs.docker.com/install/
- NVIDIA Container Toolkit: https://github.com/NVIDIA/nvidia-docker

Highlighted Details

Code is for targeted adversarial examples on speech-to-text systems.
Supports DeepSpeech v0.9.3 with TensorFlow 1.15.4.
Older versions (TF 1.14, DeepSpeech 0.4.1) are available via specific commits.
Reproducing the original paper's results requires checking out commit a8d5f675ac8659072732d3de2152411f07c7aa3a.

Maintenance & Community

The project is associated with Nicholas Carlini and David Wagner.
No explicit community channels (Discord/Slack) or roadmap are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. The code is provided for research purposes.

Limitations & Caveats

The README explicitly states "THIS IS NOT THE CODE USED IN THE PAPER," so results or methodology may differ from the published ones. It also describes reproducing the paper's exact setup as potentially difficult because of dependency management ("dependency hell"). The provided Docker image requires a GPU, and GPU use through Docker on Windows or macOS is noted as potentially unsupported.

Health Check

Last Commit

3 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days