itigges22: Boosts frozen LLM performance for efficient, self-hosted AI
Adaptive Test-time Learning and Autonomous Specialization (ATLAS) provides a self-hosted framework for running large language models locally, achieving competitive performance against frontier API models without fine-tuning or cloud reliance. It targets power users and researchers seeking cost-effective, private AI solutions on single consumer GPUs. The system wraps a frozen, quantized model within an intelligent infrastructure, enabling autonomous specialization and iterative refinement for complex tasks.
How It Works
ATLAS employs a multi-phase pipeline: Phase 1 generates candidate solutions using PlanSearch and BudgetForcing. Phase 2 scores and tests these candidates via a Geometric Lens (using self-embeddings for scoring) and sandbox execution. Tasks failing verification proceed to Phase 3, where the model generates its own test cases and iteratively refines solutions using PR-CoT (self-verified repair). This approach leverages a frozen, quantized model (e.g., Qwen3-14B-Q4_K_M) and avoids external API calls, data exfiltration, or usage metering, running entirely on local hardware.
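The three-phase loop above can be sketched as a toy program. Everything here is an illustrative assumption, not the project's actual API: the function names, the toy "addition" task, and the character-overlap scorer stand in for PlanSearch/BudgetForcing generation, self-embedding scoring, sandboxed testing, and PR-CoT self-repair.

```python
import re

def generate_candidates(n=4):
    # Phase 1 stand-in: emit n candidate programs (trivial variants).
    return [f"def solve(x):\n    return x + {i}" for i in range(n)]

def score_candidates(candidates, reference):
    # Phase 2a stand-in for the Geometric Lens: rank candidates by a toy
    # character-set similarity to a reference, not real self-embeddings.
    def sim(c):
        a, b = set(c), set(reference)
        return len(a & b) / len(a | b)
    return sorted(candidates, key=sim, reverse=True)

def sandbox_execute(candidate, test_input, expected):
    # Phase 2b stand-in: run the candidate and check one test case.
    # (A real sandbox isolates execution; bare exec() is just a demo.)
    ns = {}
    exec(candidate, ns)
    return ns["solve"](test_input) == expected

def refine(candidate):
    # Phase 3 stand-in for PR-CoT self-repair: bump the additive constant.
    k = int(re.search(r"x \+ (\d+)", candidate).group(1))
    return candidate.replace(f"x + {k}", f"x + {k + 1}")

def run_pipeline(test_input, expected, max_repairs=10):
    ranked = score_candidates(generate_candidates(),
                              reference="def solve(x):\n    return x + 0")
    for cand in ranked:
        for _ in range(max_repairs):
            if sandbox_execute(cand, test_input, expected):
                return cand          # verified solution
            cand = refine(cand)      # iterative self-repair on failure
    return None                      # no candidate survived verification
```

For example, `run_pipeline(3, 8)` takes the top-ranked candidate (`x + 0`), fails sandbox verification, and repairs it step by step until `x + 5` passes, mirroring the generate-score-test-refine flow described above.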
Quick Start & Requirements
Copy atlas.conf.example to atlas.conf (setting MODEL_PATH, DATA_DIR, and GPU), run sudo ./scripts/install.sh, verify with ./scripts/verify-install.sh, and execute benchmarks with python3 benchmark/v3_runner.py.
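The quick-start steps above, written out as a shell session (paths and script names as given in the README; run from the repository root and adjust config values for your hardware):

```shell
cp atlas.conf.example atlas.conf      # then edit MODEL_PATH, DATA_DIR, GPU
sudo ./scripts/install.sh             # install dependencies (needs root)
./scripts/verify-install.sh           # sanity-check the installation
python3 benchmark/v3_runner.py        # run the V3 benchmark suite
```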
Maintenance & Community
No specific community links (Discord/Slack) or details on notable contributors/sponsorships are provided in the README.
Licensing & Compatibility
Licensed under the A.T.L.A.S Source Available License v1.0. This license may have restrictions on commercial use or redistribution; consult the LICENSE file for specifics.
Limitations & Caveats
The current V3.0 release is primarily optimized for LiveCodeBench, with other benchmarks (GPQA, SciCode) requiring further tuning for cross-domain generalization. The Geometric Lens candidate discrimination is limited by an undertrained scoring model, and the G(x) metric tensor is currently dormant or undergoing redesign. The task pipeline is single-threaded, and a stdio handling bug exists in the SandboxAdapter. V3.1 is planned to address these limitations, including model swaps, pipeline redesigns, and parallel task execution.