AutoAudit  by ddzipp

LLM for cybersecurity

created 2 years ago
342 stars

Top 81.9% on sourcepulse

Project Summary

This project provides LLMs fine-tuned for cybersecurity tasks, aiming to improve threat detection, analysis, and response. It targets cybersecurity professionals and researchers, offering automated analysis of code, network protocols, and security knowledge, and it integrates ClamAV to form a practical security scanning platform.

How It Works

The project leverages fine-tuned LLMs, including versions based on Alpaca-Lora (AutoAudit-7B) and Llama3-8B-instruct (AutoAudit-8B-Instruct), to process and analyze cybersecurity-related data. The approach focuses on generating detailed analysis, security ratings, risk assessments, and solutions, drawing from a dataset organized in the Alpaca format, combining human annotation and self-generated data from security sources.
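To make the dataset layout concrete, here is a minimal sketch of one Alpaca-format record and the prompt built from it. The three field names and the prompt template are the standard Alpaca ones; the record content itself is a hypothetical illustration, not a sample from the AutoAudit dataset, and `build_prompt` is an assumed helper name.

```python
# Standard Alpaca prompt template for records that carry an "input" field.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

# Hypothetical record (assumption): illustrates the shape only.
record = {
    "instruction": "Analyze the following code for security issues.",
    "input": 'query = "SELECT * FROM users WHERE id = " + user_id',
    "output": (
        "Risk: high. The query concatenates untrusted input, which allows "
        "SQL injection. Use parameterized queries instead."
    ),
}


def build_prompt(rec: dict) -> str:
    """Render the model prompt; the 'output' field is the training target."""
    return ALPACA_TEMPLATE.format(
        instruction=rec["instruction"], input=rec["input"]
    )


prompt = build_prompt(record)
```

During fine-tuning the `output` text would be appended after `### Response:` as the target; at inference time the model generates it.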

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires ClamAV installation and path configuration.
  • Configure the paths for the Llama base model (sandbox/yahma/llama-7b-hf) and the LoRA weights (sandbox/lilBuffaloEirc/autoaudit_20230703_attempt2).
  • Run with python manage.py runserver.
  • Python 3.8 is specified for the Conda environment.
  • Official documentation is available via the "文档/Wiki" (Documentation/Wiki) link.
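Before running the server, it can help to verify the prerequisites listed above. The following is a small preflight sketch, not part of the project: the function names are hypothetical, and it assumes the two sandbox paths are resolved relative to the project root and that ClamAV exposes its `clamscan` binary on the PATH.

```python
import shutil
from pathlib import Path
from typing import List

# Paths taken from the setup steps; resolving them relative to the
# project root is an assumption.
MODEL_DIRS = [
    "sandbox/yahma/llama-7b-hf",
    "sandbox/lilBuffaloEirc/autoaudit_20230703_attempt2",
]


def missing_model_dirs(project_root: str) -> List[str]:
    """Return the expected model directories that do not exist yet."""
    root = Path(project_root)
    return [d for d in MODEL_DIRS if not (root / d).is_dir()]


def clamav_on_path() -> bool:
    """Report whether the clamscan executable can be found on the PATH."""
    return shutil.which("clamscan") is not None
```

Calling `missing_model_dirs(".")` from the repository root lists whatever still needs to be downloaded or configured.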

Highlighted Details

  • Offers AutoAudit-7B (demo, Alpaca-Lora based) and AutoAudit-8B-Instruct (Llama3-8B-instruct based) models.
  • Integrates with ClamAV for a security scanning platform.
  • Dataset generation uses a custom GPT published in the GPT Store.
  • Future plans include synthesizing a high-quality cybersecurity corpus and integrating tools like Nmap and Metasploit.
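The summary does not describe how the platform calls ClamAV, so the following is only a sketch of one common approach: invoking the `clamscan` command-line tool and parsing its per-file result lines. The function names are hypothetical. It relies on documented `clamscan` behavior: infected files are reported as `<path>: <signature> FOUND`, and the exit code is 0 for a clean scan, 1 when a virus is found, and 2 on error.

```python
import subprocess
from typing import Dict


def parse_clamscan_output(text: str) -> Dict[str, str]:
    """Map each infected file path to the signature name clamscan reported."""
    findings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.endswith(" FOUND"):
            continue  # skip "OK" lines and anything else
        path, sep, rest = line.rpartition(": ")
        if not sep:
            continue  # not a "<path>: <signature> FOUND" line
        findings[path] = rest[: -len(" FOUND")]
    return findings


def scan(path: str) -> Dict[str, str]:
    """Recursively scan a path with clamscan and return the infected files."""
    proc = subprocess.run(
        ["clamscan", "--no-summary", "-r", path],
        capture_output=True,
        text=True,
    )
    # Exit code 0 means clean, 1 means at least one detection.
    if proc.returncode not in (0, 1):
        raise RuntimeError(f"clamscan failed: {proc.stderr.strip()}")
    return parse_clamscan_output(proc.stdout)
```

The findings dictionary could then be passed to the model as context for a risk assessment, although whether AutoAudit does this is not stated in the source.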

Maintenance & Community

  • The README acknowledges Eric Ma and the CUHKSZ He Lab.
  • Community interaction is encouraged via the "提问/Issues" (Questions/Issues) and "讨论/Discussions" (Discussions) links.

Licensing & Compatibility

  • License details are not explicitly stated in the provided README text.

Limitations & Caveats

  • AutoAudit-7B has limited contextual understanding; better results would require a model with more parameters.
  • The AutoAudit-Qwen model is in the exploration and planning stage due to limited Chinese cybersecurity corpus availability.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 16 stars in the last 90 days
