subwiz by hadriansecurity

Subdomain discovery with a lightweight GPT model

Created 1 year ago
292 stars

Top 90.4% on SourcePulse

Project Summary

subwiz is a lightweight transformer model for subdomain discovery, aimed at security researchers and penetration testers. Built on a nanoGPT-based architecture, it predicts new subdomains from a seed list of known ones, offering an alternative to traditional brute-force or passive enumeration.

How It Works

subwiz uses an ultra-lightweight transformer model (17.3M parameters) trained on 26 million subdomain tokens. It uses beam search to predict the N most likely subdomain sequences, which yields more diverse results than generating a single sequence. The model is designed for efficiency and runs on CPU, CUDA, and MPS devices.
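To make the beam search idea concrete, here is a minimal sketch in Python. It is not subwiz's implementation: the character-level toy model, the END marker, and the names beam_search and toy_model are all hypothetical, chosen only to show how keeping several candidate sequences per step produces the N most likely outputs.

```python
import math
from typing import Callable, Dict, List, Tuple

END = "$"  # hypothetical end-of-sequence marker, not subwiz's real token


def toy_model(prefix: str) -> Dict[str, float]:
    """Hypothetical stand-in for a language model: next-token log-probabilities."""
    table = {
        "": {"a": 0.6, "d": 0.4},
        "a": {"p": 0.9, END: 0.1},
        "ap": {"i": 0.9, END: 0.1},
        "api": {END: 1.0},
        "d": {"e": 1.0},
        "de": {"v": 1.0},
        "dev": {END: 1.0},
    }
    probs = table.get(prefix, {END: 1.0})
    return {token: math.log(p) for token, p in probs.items()}


def beam_search(
    next_log_probs: Callable[[str], Dict[str, float]],
    beam_width: int,
    max_len: int,
) -> List[Tuple[str, float]]:
    """Return up to beam_width finished sequences with the highest log-probability."""
    beams: List[Tuple[str, float]] = [("", 0.0)]
    finished: List[Tuple[str, float]] = []
    for _ in range(max_len):
        candidates: List[Tuple[str, float]] = []
        for prefix, score in beams:
            for token, log_p in next_log_probs(prefix).items():
                if token == END:
                    # The sequence is complete; record it with its total score.
                    finished.append((prefix, score + log_p))
                else:
                    candidates.append((prefix + token, score + log_p))
        if not candidates:
            break
        # Keep only the beam_width best partial sequences for the next step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_width]


result = beam_search(toy_model, beam_width=2, max_len=5)
labels = [label for label, _ in result]
print(labels)
```

With a beam width of 1 this reduces to greedy decoding and returns a single label; a wider beam keeps lower-ranked prefixes alive long enough to surface additional complete candidates.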

Quick Start & Requirements

  • Install with pip (pip install subwiz) or pipx (pipx install subwiz).
  • Requires Python.
  • Model and tokenizer are downloaded automatically on first run.
  • Official Hugging Face model: HadrianSecurity/subwiz

Highlighted Details

  • Predicts N most likely subdomain sequences using beam search.
  • Ultra-lightweight transformer model (17.3M parameters).
  • Trained on 26 million subdomain tokens.
  • Supports CPU, CUDA, and MPS devices.
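For readers unfamiliar with the three device types, the following sketch shows one common way to choose among them in PyTorch. It assumes the model runs on PyTorch, which this summary does not state, and the helper name pick_device is hypothetical; it is not taken from subwiz.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what this machine offers."""
    try:
        import torch  # assumption: PyTorch is the runtime
    except ImportError:
        # Without PyTorch installed there is nothing to probe.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps only exists in newer PyTorch releases, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(device)
```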

Maintenance & Community

  • Developed by Hadrian Security.
  • Model available on Hugging Face.

Licensing & Compatibility

  • MIT License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

With the --no-resolve flag, predicted subdomains are not checked for existence, so they must be resolved in a separate step. The model's effectiveness depends on the quality and diversity of its training data.
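A separate resolution step can be as simple as a DNS lookup per candidate. The sketch below uses only the Python standard library; the function names resolves and filter_resolving are hypothetical, and this is an illustration of the idea rather than how subwiz resolves names internally.

```python
import socket
from typing import Callable, Iterable, List


def resolves(hostname: str) -> bool:
    """Return True if the system resolver finds at least one address for hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: lookup failed; UnicodeError: the name is not a valid hostname.
        return False


def filter_resolving(
    candidates: Iterable[str],
    resolver: Callable[[str], bool] = resolves,
) -> List[str]:
    """Keep only the candidates the resolver accepts, preserving input order."""
    return [host for host in candidates if resolver(host)]


# Usage with a fake resolver so the example runs without network access.
known = {"api.example.com", "dev.example.com"}
predicted = ["api.example.com", "zzz.example.com", "dev.example.com"]
live = filter_resolving(predicted, resolver=lambda host: host in known)
print(live)
```

Note that a domain with wildcard DNS will resolve every candidate, so a lookup alone cannot confirm that a predicted subdomain is real in that case.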

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 5
  • Star history: 23 stars in the last 30 days
