OpenAI Gym environment for malware manipulation research
This repository provides a malware manipulation environment for OpenAI Gym, enabling reinforcement learning agents to learn functionality-preserving transformations on PE files to evade machine learning-based malware detection. It is targeted at researchers and developers in cybersecurity and AI, offering a framework to train agents that bypass static analysis malware detectors.
How It Works
The core approach uses OpenAI Gym to define a reinforcement learning environment where the "environment" is a PE file and the "agent" is an algorithm that applies binary manipulations. The agent receives observations about the malware sample and a reward signal based on its success in bypassing a classifier. The environment leverages the LIEF library for on-the-fly binary modification, supporting actions like appending data, repacking with UPX, changing section names, and modifying headers.
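The interaction loop can be sketched with a minimal self-contained mock (this is illustrative only, not the project's actual gym_malware API; in the real environment the actions are applied to PE bytes via LIEF and the reward comes from a classifier's verdict):

```python
import random

# Action labels mirroring the manipulations described above (assumed names).
ACTIONS = ["append_bytes", "upx_pack", "rename_section", "modify_header"]

class MockMalwareEnv:
    """Stand-in for the Gym environment: the observation is a feature
    vector and the reward is positive when the (simulated) classifier
    is evaded."""

    def __init__(self, max_turns=10):
        self.max_turns = max_turns

    def reset(self):
        self.turns = 0
        self.score = 0.9          # simulated classifier score (malicious)
        return [self.score]       # observation: here, just the score

    def step(self, action):
        self.turns += 1
        # Pretend each manipulation nudges the classifier score downward.
        self.score -= random.uniform(0.0, 0.2)
        evaded = self.score < 0.5              # classifier threshold
        done = evaded or self.turns >= self.max_turns
        reward = 10.0 if evaded else 0.0
        return [self.score], reward, done, {}

env = MockMalwareEnv()
obs = env.reset()
done = False
while not done:
    action = random.choice(ACTIONS)            # random-agent baseline
    obs, reward, done, info = env.step(action)
print("evaded" if reward > 0 else "not evaded")
```

A learning agent would replace the random action choice with a policy trained on the observation/reward signal.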
Quick Start & Requirements
- Install dependencies: `pip install -r requirements.txt`
- Place PE malware samples in `gym_malware/gym_malware/envs/utils/samples/`
- A VirusTotal API key is needed to download samples using `download_samples.py`
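The environment draws its training binaries from that samples folder. The snippet below is an illustrative sketch (not the project's code) of how such a loader might enumerate candidate samples from a directory:

```python
import os
import tempfile

def list_samples(sample_dir):
    """Return sorted paths of candidate PE samples in the samples folder."""
    return sorted(
        os.path.join(sample_dir, f)
        for f in os.listdir(sample_dir)
        if os.path.isfile(os.path.join(sample_dir, f))
    )

# Demonstrate with a throwaway directory standing in for
# gym_malware/gym_malware/envs/utils/samples/
with tempfile.TemporaryDirectory() as d:
    for name in ("a.exe", "b.exe"):
        open(os.path.join(d, name), "wb").close()
    samples = list_samples(d)
    print(len(samples))  # → 2
```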
Maintenance & Community
No specific community links (Discord/Slack) or roadmap are provided in the README. The project is associated with Hyrum S. Anderson and David Evans.
Licensing & Compatibility
The repository does not explicitly state a license. The associated research paper is available on arXiv. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project requires specific Python versions and a manual LIEF install. Acquiring malware samples requires a VirusTotal API key. The default classifier is a single bundled model, and its effectiveness against diverse malware families is not documented.