supercharger by catid

CLI tool for LLM-powered code generation and unit testing

created 2 years ago
351 stars

Top 80.4% on sourcepulse

Project Summary

Supercharger aims to automate software development by leveraging locally-hosted Large Language Models (LLMs) to generate code and unit tests. It targets developers and researchers who want to accelerate coding, offering a framework for distributed LLM inference and automated code validation.

How It Works

Supercharger uses a distributed architecture in which a load balancer manages multiple worker nodes, each running LLMs optimized for code generation and testing. Using prompt engineering tailored for code, the system generates multiple code/test pairs and iteratively executes them until a passing pair is found. An AI evaluator scores the code and tests, and a virtual machine sandbox isolates execution of candidate code.
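
The core loop can be pictured roughly as below. This is a minimal sketch under stated assumptions, not Supercharger's actual code: every name here (find_passing_pair, generate_code, generate_tests, run_in_sandbox, score_pair, max_attempts) is invented for illustration, and whether the real system stops at the first passing pair or keeps the best-scored one is a detail the summary does not specify.

```python
# Minimal sketch of a generate-and-test loop (illustrative only).
# Every callable below is a hypothetical stand-in for a component described
# above: the LLM workers, the sandbox runner, and the AI evaluator.

from typing import Callable, Optional, Tuple

def find_passing_pair(
    task: str,
    generate_code: Callable[[str], str],
    generate_tests: Callable[[str, str], str],
    run_in_sandbox: Callable[[str, str], bool],
    score_pair: Callable[[str, str], float],
    max_attempts: int = 10,
) -> Optional[Tuple[float, str, str]]:
    """Generate code/test candidates until a pair passes, keeping the best-scored one."""
    best: Optional[Tuple[float, str, str]] = None
    for _ in range(max_attempts):
        code = generate_code(task)            # LLM worker: candidate implementation
        tests = generate_tests(task, code)    # LLM worker: matching unit tests
        if not run_in_sandbox(code, tests):   # run inside an isolated sandbox
            continue                          # discard pairs that fail their own tests
        quality = score_pair(code, tests)     # AI evaluator rates the passing pair
        if best is None or quality > best[0]:
            best = (quality, code, tests)
    return best
```

In the actual project, the generation calls would be dispatched through the load balancer to GPU worker nodes rather than run in-process as shown here.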

Quick Start & Requirements

  • Install: Clone the repository, set up a Conda environment (conda create -n supercharger python=3.10), activate it (conda activate supercharger), and run ./update.sh.
  • Prerequisites: Docker, Python 3.10, Conda, passwordless SSH access between nodes.
  • Hardware: Designed for clusters of Linux servers, each with multiple GPUs (e.g., two RTX 3090 or 4090 GPUs) for model parallelism. Tested with the Baize-30B model using 8-bit quantization.
  • Resources: Requires significant GPU resources and a distributed setup.
  • Docs: https://docs.google.com/spreadsheets/d/1TYBNr_UPJ7wCzJThuk5ysje7K1x-_62JhBeXDbmrjA8/edit?usp=sharing

Highlighted Details

  • Generates multiple code and unit test combinations, executing them until a valid pair passes.
  • Utilizes an AI to score code and test quality.
  • Implements thorough code cleaning to remove LLM artifacts (a rough sketch of the idea follows this list).
  • Executes candidate code within a virtual machine for safety.
  • Supports distributed inference across multiple nodes via a load balancer.
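
As an illustration of the code-cleaning bullet above: LLM responses often wrap code in markdown fences and surround it with prose, so a cleaner can extract just the code before execution. This is a hypothetical sketch of that general idea, not Supercharger's actual cleaner; clean_llm_output and the regex are invented for illustration.

```python
import re

# Hypothetical sketch of cleaning LLM output before execution: keep only the
# contents of a fenced code block, if one is present. Not Supercharger's code.

_FENCE_RE = re.compile(r"```[a-zA-Z]*\n(.*?)```", re.DOTALL)

def clean_llm_output(raw: str) -> str:
    """Return the code portion of an LLM response.

    If the response contains a fenced code block, keep only its contents;
    otherwise return the whole response with surrounding whitespace trimmed.
    """
    match = _FENCE_RE.search(raw)
    return (match.group(1) if match else raw).strip()

# Example: clean_llm_output("Here you go:\n```python\nprint('hi')\n```")
# returns "print('hi')".
```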

Maintenance & Community

  • The project is maintained by catid (catid.io).
  • No specific community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license; verify the licensing terms before any use, especially commercial use.

Limitations & Caveats

  • The setup requires a distributed environment with multiple GPUs and specific hardware configurations.
  • The launch_cluster.sh script may leave zombie processes, requiring manual cleanup via ./kill_gpu_users.sh.
  • The project is described as having future work items, suggesting it may not be feature-complete.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 1 star in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake) and Zhiqiang Xie (Author of SGLang).

veScale by volcengine

Top 0.1% on sourcepulse
839 stars
PyTorch-native framework for LLM training
created 1 year ago
updated 3 weeks ago
Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake) and Travis Fischer (Founder of Agentic).

lingua by facebookresearch

Top 0.1% on sourcepulse
5k stars
LLM research codebase for training and inference
created 9 months ago
updated 2 weeks ago