subwiz by hadriansecurity

Subdomain discovery with a lightweight GPT model

Created 1 year ago

344 stars

Top 80.8% on SourcePulse

Project Summary

subwiz is a lightweight transformer model for subdomain discovery, aimed at security researchers and penetration testers. Built on a nanoGPT-based architecture, it predicts new subdomains from a seed list of known ones, offering an alternative to traditional brute-force or passive enumeration.

How It Works

subwiz uses an ultra-lightweight transformer model (17.3M parameters) trained on 26 million subdomain tokens. At inference time it runs beam search to return the N most likely subdomain sequences, which gives more diverse results than generating a single sequence. The model is small enough to run on CPU, and it also supports CUDA and MPS devices.
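To make the beam search idea concrete, here is a minimal sketch in Python. It is not subwiz's implementation: the toy_next_token_probs function, its token vocabulary, and its probabilities are all hypothetical stand-ins for a real model. It only illustrates how keeping several partial sequences alive yields the N most likely completions.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical toy "model" (an assumption, not the subwiz model): maps the
# tokens generated so far to a probability distribution over the next token.
# The special token "<end>" closes a sequence.
def toy_next_token_probs(prefix: Tuple[str, ...]) -> Dict[str, float]:
    if not prefix:
        return {"api": 0.5, "dev": 0.35, "mail": 0.15}
    if prefix[-1] in ("api", "dev") and len(prefix) < 2:
        return {"<end>": 0.6, "-staging": 0.4}
    return {"<end>": 1.0}

def beam_search(
    next_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
    beam_width: int,
    max_len: int = 4,
) -> List[Tuple[float, str]]:
    """Return up to beam_width finished sequences as (log-probability, text)."""
    beams: List[Tuple[float, Tuple[str, ...]]] = [(0.0, ())]
    finished: List[Tuple[float, str]] = []
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            for token, prob in next_probs(tokens).items():
                new_score = score + math.log(prob)
                if token == "<end>":
                    finished.append((new_score, "".join(tokens)))
                else:
                    candidates.append((new_score, tokens + (token,)))
        if not candidates:
            break
        # Keep only the beam_width most likely partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished[:beam_width]

if __name__ == "__main__":
    for log_prob, label in beam_search(toy_next_token_probs, beam_width=3):
        print(f"{label}  p={math.exp(log_prob):.2f}")
```

With a beam width of 1 this reduces to greedy decoding and returns a single label; widening the beam is what surfaces the lower-probability alternatives.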

Quick Start & Requirements

Install with pip (pip install subwiz) or pipx (pipx install subwiz).
Requires Python.
Model and tokenizer are downloaded automatically on first run.
Official Hugging Face model: HadrianSecurity/subwiz

Highlighted Details

Predicts N most likely subdomain sequences using beam search.
Ultra-lightweight transformer model (17.3M parameters).
Trained on 26 million subdomain tokens.
Supports CPU, CUDA, and MPS devices.

Maintenance & Community

Developed by Hadrian Security.
Model available on Hugging Face.

Licensing & Compatibility

MIT License.
Compatible with commercial use and closed-source linking.

Limitations & Caveats

When subwiz is run with the --no-resolve flag, predicted subdomains are not checked for existence, so a separate resolution step is needed to confirm them. The model's effectiveness depends on the quality and diversity of its training data.
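A separate resolution step can be as simple as a DNS lookup per candidate. The sketch below uses only the Python standard library; the function names resolves and filter_resolving are illustrative and not part of subwiz, and the example hostnames are placeholders. The lookup function is injectable so the filter can be exercised without network access.

```python
import socket
from typing import Callable, Iterable, List

def resolves(hostname: str, lookup: Callable = socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(lookup(hostname, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the name does not resolve; UnicodeError: invalid label.
        return False

def filter_resolving(
    candidates: Iterable[str], lookup: Callable = socket.getaddrinfo
) -> List[str]:
    """Keep only the candidate hostnames that resolve, preserving order."""
    return [name for name in candidates if resolves(name, lookup)]

if __name__ == "__main__":
    # Placeholder predictions; replace with real model output.
    predicted = ["api.example.com", "dev-staging.example.com"]
    print(filter_resolving(predicted))
```

Note that a zone with wildcard DNS makes every candidate appear to resolve, so results from such zones need an extra check.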

Health Check

Last Commit

2 months ago

Responsiveness

Inactive

Star History

4 stars in the last 30 days