SecGPT by Clouditera

Open-source LLM for cybersecurity tasks

Created 2 years ago

2,887 stars

Top 16.4% on SourcePulse

1 Expert Loves This Project

chiphuyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Project Summary

SecGPT is an open-source large language model specifically designed for cybersecurity tasks, aiming to enhance security defense efficiency and effectiveness through AI. It targets cybersecurity professionals, researchers, and engineers, offering an intelligent assistant for various security operations.

How It Works

SecGPT integrates natural language understanding, code generation, and security knowledge reasoning. It is built upon foundational models like Qwen2.5-Instruct and DeepSeek-R1, enhanced through extensive pre-training, instruction fine-tuning, and reinforcement learning on a proprietary, large-scale cybersecurity corpus (over 5TB). This approach aims to significantly improve the model's comprehension, reasoning, and response capabilities in specialized security contexts.

Quick Start & Requirements

Installation: Deploy using vLLM. Requires Python 3.10+ and PyTorch with CUDA support.

Commands:

conda create -n secgpt-vllm python=3.10 -y
conda activate secgpt-vllm
pip install --upgrade pip
pip install vllm
CUDA_VISIBLE_DEVICES= xxx(GPU index) vllm serve ./secgpt --tokenizer ./secgpt --tensor-parallel-size 4 --max-model-len 32768 --gpu-memory-utilization 0.9 --dtype bfloat16

Resources: Requires GPU(s) with CUDA. Specific GPU memory utilization is set to 0.9.
Links:
- Source & Docs: https://github.com/Clouditera/secgpt
- Models: https://huggingface.co/clouditera/secgpt, https://modelscope.cn/models/clouditera/SecGPT-14B
- Datasets: https://huggingface.co/datasets/clouditera/security-paper-datasets

Highlighted Details

Achieves significant performance gains across security-specific benchmarks (CISSP, CS-EVAL) and general capabilities (CEVAL, GSM8K, BBH) compared to base models.
Demonstrates advanced capabilities in vulnerability analysis, log/traffic analysis, threat hunting, code auditing, and reverse engineering.
Trained on a 5TB+ cybersecurity corpus, including structured data with 70+ fields and 14 categories, covering theoretical, adversarial, and applied security knowledge.
Offers a lightweight SecGPT-Mini version capable of running efficiently on CPUs.

Maintenance & Community

Actively developed by Clouditera.
Community engagement is encouraged via GitHub for suggestions, issue reporting, code contributions, and experience sharing.

Licensing & Compatibility

The specific license is not explicitly stated in the README, but the project is presented as open-source for research and exchange.
A disclaimer notes that public release or commercial deployment requires users to assume legal and compliance responsibilities.

Limitations & Caveats

The model's output is subject to the limitations of its training data coverage and requires user judgment for accuracy and applicability.
The developers disclaim responsibility for any direct or indirect damages arising from the model's use.

Health Check

Last Commit

6 months ago

Responsiveness

1 day

Pull Requests (30d)

0

Issues (30d)

0

Star History

68 stars in the last 30 days

Explore Similar Projects

CTFKnow by tszdanger

Framework for evaluating LLMs on cybersecurity CTF challenges

Created 6 months ago

Updated 6 months ago

maldev-links by CodeXTF2

Malware dev links collection

Created 3 years ago

Updated 7 months ago

Halberd by vectra-ai-research

Multi-cloud security testing platform for attack emulation

Created 1 year ago

Updated 2 days ago

AutoAudit by ddzipp

LLM for cybersecurity

Created 2 years ago

Updated 10 months ago

Awesome-LLM4Security by liu673

A curated repository of LLM applications in cybersecurity

Created 1 year ago

Updated 3 weeks ago

penetration-testing-roadmap by securitycipher

Comprehensive guide for cybersecurity penetration testing

Created 2 years ago

Updated 1 year ago

PotatoTool by HotBoy-java

Network security tool for red team, blue team, and security enthusiasts

Created 1 year ago

Updated 10 months ago

Starred by

Dan Guido

Dan Guido(Cofounder of Trail of Bits).

Awesome-LLM4Cybersecurity by tmylla

Literature review of LLMs in cybersecurity

Created 2 years ago

Updated 1 month ago

Starred by

Marc Klingen

Marc Klingen(Cofounder of Langfuse).

awesome-llm-security by corca-ai

Awesome LLM security resources

Created 2 years ago

Updated 4 months ago

SecurityProduct by birdhan

Open-source security product source code for IDS, IPS, WAF, honeypots, etc

Created 7 years ago

Updated 1 year ago

Awesome-OSINT-For-Everything by Astrosp

OSINT resource aggregator for infosec, cybersecurity, and investigations

Created 3 years ago

Updated 1 week ago

www-project-top-10-for-large-language-model-applications by OWASP

Security awareness document for LLM application security

Created 2 years ago

Updated 4 days ago

Feedback? Help us improve.