ThunderAgent-org
Fast, program-aware agentic inference system
Top 92.9% on SourcePulse
Summary
ThunderAgent is a program-aware agentic inference system designed to enhance the throughput and stability of agentic workflows. It targets researchers and developers working with LLM agents, providing a unified interface for tool management and optimized inference scheduling, leading to significant performance gains and more reliable long-running deployments.
How It Works
The system functions as an agentic workflow scheduler, sitting between agent clients and infrastructure. Its core innovation is a program-aware scheduler that optimizes KV-cache hit rates and balances memory across nodes, boosting inference throughput by 1.5-3.6x. It also features robust tool-call lifecycle management with automatic resource reclamation and supports multiple inference backends like vLLM and SGLang.
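The program-aware scheduling idea described above can be sketched as follows: requests tagged with the same program identifier share prompt prefixes, so routing them to the same worker keeps those prefixes resident in that worker's KV cache while a load counter keeps memory balanced. This is an illustrative sketch only, not ThunderAgent's actual implementation; the class, worker names, and method names are hypothetical.

```python
class ProgramAwareScheduler:
    """Illustrative sketch: pin requests from the same agentic program to
    one worker (so shared prompt prefixes hit that worker's KV cache),
    and pick the least-loaded worker for new programs."""

    def __init__(self, workers):
        self.workers = list(workers)
        self.load = {w: 0 for w in self.workers}  # outstanding requests per worker
        self.affinity = {}                        # program_id -> pinned worker

    def route(self, program_id):
        # Reuse the worker that already holds this program's cached prefix.
        worker = self.affinity.get(program_id)
        if worker is None:
            # First request of this program: pick the least-loaded worker.
            worker = min(self.workers, key=self.load.__getitem__)
            self.affinity[program_id] = worker
        self.load[worker] += 1
        return worker

    def complete(self, program_id):
        # Release one unit of load when a request finishes.
        self.load[self.affinity[program_id]] -= 1


sched = ProgramAwareScheduler(["node-0", "node-1"])
a1 = sched.route("agent-A")  # agent-A pinned to the least-loaded node
b1 = sched.route("agent-B")  # agent-B balanced onto the other node
a2 = sched.route("agent-A")  # same node as a1: KV-cache prefix reuse
```

A real scheduler would also weigh cache size, eviction, and cross-node memory pressure; the point here is only the affinity-plus-balancing shape of the policy.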
Quick Start & Requirements
Installation involves cloning the repository and running pip install -e . from the repository root. Users must also install a compatible backend, such as vLLM (uv pip install vllm --torch-backend=auto). ThunderAgent is then launched via the thunderagent command, and requests are directed to its configured port (e.g., 9000). Embedding it into existing workflows requires adding a program_id to the OpenAI API call's extra_body. Further details can be found in the project's paper: https://arxiv.org/abs/2602.13692.
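The integration step above can be sketched with the official OpenAI Python client. Only the program_id-in-extra_body convention comes from the README; the model name, session identifier, base URL, and port 9000 are assumptions for illustration.

```python
def chat_completion_args(program_id, messages, model="my-model"):
    """Build arguments for client.chat.completions.create(...).
    ThunderAgent's scheduler reads program_id from extra_body to group
    requests belonging to the same agentic program (per the README);
    the model name here is a placeholder."""
    return {
        "model": model,
        "messages": messages,
        "extra_body": {"program_id": program_id},
    }


kwargs = chat_completion_args(
    "agent-session-42",  # hypothetical program identifier
    [{"role": "user", "content": "Summarize today's tool-call results."}],
)

# With the OpenAI client pointed at ThunderAgent (assumed local port 9000):
# from openai import OpenAI
# client = OpenAI(base_url="http://localhost:9000/v1", api_key="EMPTY")
# response = client.chat.completions.create(**kwargs)
```

The OpenAI Python client merges extra_body keys into the request JSON, so ThunderAgent can read program_id server-side without any change to the rest of the call.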
Maintenance & Community
Contributions are welcomed via pull requests. Enterprise inquiries, including technical consulting and sponsorship, can be directed to hkang342@gatech.edu. No specific community channels (e.g., Discord, Slack) or roadmap details are provided in the README.
Licensing & Compatibility
ThunderAgent is released under the permissive MIT license, which allows commercial use and integration into closed-source projects, requiring only that the copyright and license notice be retained.
Limitations & Caveats
The provided README does not detail specific limitations, unsupported platforms, known bugs, or alpha/beta status. The system appears to be presented as a production-ready solution, though setup requires familiarity with LLM inference frameworks like vLLM.