squeeze-evolve
Multi-model orchestration for verifier-free evolutionary LLM scaling
Top 97.2% on SourcePulse
Squeeze-Evolve is an open-source framework that reduces LLM inference cost by combining a verifier-free evolutionary approach with multi-model orchestration. It routes each inference task to the most cost-effective model for its estimated difficulty, aiming for equivalent or better accuracy at a fraction of the expense. It targets researchers and power users optimizing LLM deployments.
How It Works
The system employs an evolutionary loop to refine candidate solutions. Its core innovation is "fitness-based routing": candidate groups are scored for difficulty (using confidence or diversity proxies) and dynamically routed to tiered models—expensive for hard problems, cheaper for easy ones, and a lightweight aggregator for consensus. This parallel, adaptive routing optimizes resource utilization and minimizes overall inference cost.
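The routing step described above can be illustrated with a minimal sketch. This is not the project's actual code: the `CandidateGroup` type, the two proxy functions, the tier names, and the thresholds `easy_below` and `hard_at` are all hypothetical, chosen only to show how a difficulty score could map a candidate group to one of three tiers.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CandidateGroup:
    """Hypothetical container for one group of candidate solutions."""
    answers: list[str]        # final answer extracted from each candidate
    confidences: list[float]  # per-candidate confidence in [0, 1]


def diversity_difficulty(group: CandidateGroup) -> float:
    """Diversity proxy: share of candidates that disagree with the majority answer."""
    majority_count = Counter(group.answers).most_common(1)[0][1]
    return 1.0 - majority_count / len(group.answers)


def confidence_difficulty(group: CandidateGroup) -> float:
    """Confidence proxy: one minus the mean candidate confidence."""
    return 1.0 - sum(group.confidences) / len(group.confidences)


def route(group: CandidateGroup,
          score=diversity_difficulty,
          easy_below: float = 0.2,  # assumed threshold, not from the project
          hard_at: float = 0.6) -> str:  # assumed threshold, not from the project
    """Map a group's difficulty score to a model tier."""
    difficulty = score(group)
    if difficulty < easy_below:
        return "aggregator"   # near-consensus: a lightweight aggregator suffices
    if difficulty >= hard_at:
        return "expensive"    # hard group: send to the strongest model
    return "cheap"            # everything in between goes to the cheaper model


consensus = CandidateGroup(["42"] * 5, [0.9] * 5)
mixed = CandidateGroup(["42", "42", "42", "7", "9"], [0.6] * 5)
scattered = CandidateGroup(["1", "2", "3", "4", "5"], [0.3] * 5)
print([route(g) for g in (consensus, mixed, scattered)])
```

Passing `score=confidence_difficulty` switches the proxy without changing the routing logic. In a real deployment the thresholds would presumably be tuned against the cost and accuracy of the models in each tier.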
Quick Start & Requirements
Installation: Clone the repository with submodules (git clone --recurse-submodules), then run uv sync --dev or pip install -e ".[dev]". Optional cloud storage support (AWS, GCS) requires the squeeze-evolve[aws] or squeeze-evolve[gcs] extra. The forked vllm with a custom confidence engine can be installed with VLLM_USE_PRECOMPILED=1 uv pip install --editable external/vllm. CLI tools (squeeze-evolve-client, squeeze-evolve-server) and benchmarks (AIME 2025, HMMT 2025, GPQA-Diamond) are included. The vllm dependency implies a GPU/CUDA environment.
Highlighted Details
vllm fork with GPU-accelerated confidence engine (4-10x lower scoring latency).
Maintenance & Community
Described as "actively evolving research code" with ongoing productionization efforts. Contributions and feedback are welcomed via issues. No specific community channels or detailed contributor information are provided.
Licensing & Compatibility
Licensed under the Apache License 2.0, permissive for commercial use and closed-source integration.
Limitations & Caveats
As "actively evolving research code," users should expect potential instability or incomplete features. The forked vllm submodule may introduce build complexities. No explicit unsupported platforms or known bugs are detailed.