NVlabs
RL framework for training efficient agentic tool orchestrators
Top 59.8% on SourcePulse
ToolOrchestra provides an end-to-end RL training framework for orchestrating intelligent tools and specialized models, enabling efficient agentic workflows. It targets researchers and engineers building complex, multi-turn AI agents, offering a method to train small, highly capable orchestrator models that surpass larger, generalist LLMs in performance and efficiency. The framework allows agents to coordinate diverse tools and models, leading to state-of-the-art results on challenging benchmarks with significantly reduced computational cost.
How It Works
ToolOrchestra employs end-to-end reinforcement learning to train small orchestrator models (e.g., Orchestrator-8B) that dynamically coordinate tool usage and reasoning. The core approach involves an orchestrator agent alternating between planning and executing tool calls, interacting with a diverse set of resources including basic utilities (search, code interpreter), specialized LLMs (coding, math), and generalist LLMs. Optimization occurs via outcome, efficiency, and preference rewards, supported by a scalable pipeline for synthesizing training tasks. This method achieves superior performance and efficiency by leveraging specialized components rather than relying solely on monolithic models.
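The plan-then-call loop and the three reward terms described above can be illustrated with a minimal sketch. This is not the ToolOrchestra implementation: the names `run_episode`, `combined_reward`, `scripted_policy`, the stub tools, their costs, and the reward weights are all hypothetical, and the RL update that would consume the reward is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class Step:
    tool: str
    argument: str
    observation: str
    cost: float


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)
    answer: Optional[str] = None

    @property
    def total_cost(self) -> float:
        return sum((step.cost for step in self.steps), 0.0)


# Hypothetical types: a tool table maps a name to (callable, cost per call);
# a policy maps the trajectory so far to (action, argument).
ToolTable = Dict[str, Tuple[Callable[[str], str], float]]
Policy = Callable[[Trajectory], Tuple[str, str]]


def run_episode(task: str, policy: Policy, tools: ToolTable,
                max_turns: int = 8) -> Trajectory:
    """Alternate between planning (policy) and executing tool calls."""
    trajectory = Trajectory(task)
    for _ in range(max_turns):
        action, argument = policy(trajectory)      # planning step
        if action == "answer":
            trajectory.answer = argument
            break
        tool_fn, cost = tools[action]              # execution step
        trajectory.steps.append(Step(action, argument, tool_fn(argument), cost))
    return trajectory


def combined_reward(trajectory: Trajectory, gold_answer: str,
                    preferred_tools: Set[str], w_outcome: float = 1.0,
                    w_efficiency: float = 0.1,
                    w_preference: float = 0.05) -> float:
    """Weighted sum of outcome, efficiency, and preference terms.

    The weights and the exact form of each term are assumptions.
    """
    outcome = 1.0 if trajectory.answer == gold_answer else 0.0
    efficiency = -trajectory.total_cost            # cheaper episodes score higher
    preference = float(sum(s.tool in preferred_tools for s in trajectory.steps))
    return (w_outcome * outcome + w_efficiency * efficiency
            + w_preference * preference)


# Stub tools standing in for a specialized model and a generalist LLM.
TOOLS: ToolTable = {
    "math_model": (lambda query: "4", 1.0),
    "generalist_llm": (lambda query: "4", 5.0),
}


def scripted_policy(trajectory: Trajectory) -> Tuple[str, str]:
    """Stand-in for the trained orchestrator: one tool call, then answer."""
    if not trajectory.steps:
        return "math_model", trajectory.task
    return "answer", trajectory.steps[-1].observation


trajectory = run_episode("What is 2 + 2?", scripted_policy, TOOLS)
print(trajectory.answer, combined_reward(trajectory, "4", {"math_model"}))
```

In this sketch, routing the task to the cheaper stub tool yields a higher reward than routing it to the more expensive one, which is the kind of pressure an efficiency term is meant to create.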
Quick Start & Requirements
... toolorchestra directory, and set up Conda environments (toolorchestra, retriever, vllm1) with Python 3.12. Install dependencies using pip install -r requirements.txt and specific packages per environment.
... retriever), FAISS-GPU (for retriever), Tavily API key, Hugging Face datasets/checkpoints. Environment variables (INDEX_DIR, CHECKPOINT_PATH, HF_HOME, REPO_PATH, CKPT_DIR) must be set.
... resume_h100.py) suggest high-end GPU requirements (e.g., H100).
... https://gitlab-master.nvidia.com/dler/toolorchestra
Index Data: https://huggingface.co/datasets/multi-train/index
Checkpoints: https://huggingface.co/multi-train/ToolOrchestrator
Tavily API: https://app.tavily.com/home
Highlighted Details
Maintenance & Community
The project lists authors from NVIDIA and The University of Hong Kong. No specific community channels (e.g., Discord, Slack) or roadmap links are provided in the README.
Licensing & Compatibility
The code is licensed under the Apache 2.0 license. This license is permissive and generally compatible with commercial use and linking within closed-source projects.
Limitations & Caveats
The setup process is complex, requiring the creation and management of multiple distinct Conda environments with specific library versions (e.g., PyTorch, CUDA). Extensive use of environment variables for paths and API keys adds to the configuration overhead. Evaluation procedures, particularly for HLE, may require running components in separate processes, indicating potential complexities or dependencies that need careful management.
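One way to reduce the configuration overhead is to check the environment before launching anything. The sketch below assumes only the five variable names listed under Quick Start & Requirements; the helper `missing_env_vars` is hypothetical and not part of the repository. The Tavily API key is left out because this summary does not name the variable that holds it.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Variable names taken from the Quick Start section of this summary.
REQUIRED_VARS = ("INDEX_DIR", "CHECKPOINT_PATH", "HF_HOME", "REPO_PATH", "CKPT_DIR")


def missing_env_vars(env: Optional[Mapping[str, str]] = None,
                     required: Sequence[str] = REQUIRED_VARS) -> List[str]:
    """Return the required variables that are unset or blank, in order."""
    if env is None:
        env = os.environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```

Running such a check once per Conda environment catches a forgotten export before a long job starts.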