Vision-language-action framework for cross-environment policy learning
UniVLA is a unified vision-language-action framework for learning generalist robotic policies across diverse environments and embodiments. It targets robotics and AI researchers and engineers building adaptable, efficient control systems, and reports significant improvements over prior methods such as OpenVLA.
How It Works
UniVLA introduces task-centric latent actions, learned without supervision by a VQ-VAE, to create an embodiment-agnostic action space. This lets the model leverage data from heterogeneous sources without requiring explicit action labels. A generalist policy is pretrained over this latent action space, and lightweight, embodiment-specific action decoders are then attached for deployment, enabling efficient fine-tuning and adaptation.
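To make the latent-action idea concrete, here is a minimal PyTorch sketch of a VQ-VAE-style quantization step. It is illustrative only, not UniVLA's implementation; the class name, codebook size, and dimensions are assumptions.

    import torch
    import torch.nn as nn

    class LatentActionQuantizer(nn.Module):
        # Minimal VQ-VAE-style quantizer (illustrative, not UniVLA's code):
        # snaps a continuous latent to its nearest codebook entry, yielding
        # a discrete "latent action" token in an embodiment-agnostic space.
        def __init__(self, codebook_size: int = 16, dim: int = 128):
            super().__init__()
            self.codebook = nn.Embedding(codebook_size, dim)

        def forward(self, z: torch.Tensor):
            # z: (batch, dim) latent encoding of, e.g., a pair of video frames
            distances = torch.cdist(z, self.codebook.weight)  # (batch, codebook_size)
            tokens = distances.argmin(dim=-1)                 # discrete latent-action ids
            z_q = self.codebook(tokens)                       # quantized embeddings
            # Straight-through estimator: gradients bypass the argmin
            z_q = z + (z_q - z).detach()
            return tokens, z_q

    quantizer = LatentActionQuantizer()
    tokens, z_q = quantizer(torch.randn(4, 128))  # 4 samples -> 4 token ids

Pretraining can then treat these discrete tokens as a shared action vocabulary across embodiments, which is what allows unlabeled, heterogeneous data to contribute to one policy.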
Quick Start & Requirements
From the root of a cloned copy of the repository, install in editable mode:

    pip install -e .
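After installation, adapting the pretrained policy to a specific robot amounts to training a lightweight, embodiment-specific action decoder on top of it. The sketch below shows what such a head could look like; the class name, API, and dimensions are hypothetical, not taken from the repository.

    import torch
    import torch.nn as nn

    class EmbodimentDecoder(nn.Module):
        # Hypothetical lightweight head (not the repo's API): maps latent-action
        # embeddings from the frozen generalist policy to one robot's
        # continuous low-level action space.
        def __init__(self, latent_dim: int = 128, action_dim: int = 7):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(latent_dim, 256),
                nn.ReLU(),
                nn.Linear(256, action_dim),  # e.g. 7-DoF arm commands
            )

        def forward(self, latent_action: torch.Tensor) -> torch.Tensor:
            return self.net(latent_action)

    decoder = EmbodimentDecoder()
    actions = decoder(torch.randn(4, 128))  # (4, 7) per-embodiment actions

Because only this small head is trained per embodiment, fine-tuning stays cheap relative to retraining the full generalist policy.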
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats