FluxVLA: End-to-end platform for embodied AI and VLA engineering
Top 88.3% on SourcePulse
FluxVLA is a full-stack, end-to-end engineering platform for embodied intelligence and Vision-Language-Action (VLA) models. It targets researchers and engineers, and standardizes VLA development and deployment from data to real-robot applications in order to reduce engineering complexity.
How It Works
FluxVLA is built on unified configuration, standardized interfaces, and decoupled modules, which together support a complete engineering loop. It supports several VLA models (GR00T, Pi0.5, OpenVLA), LLM backbones (Llama, Gemma, Qwen), and vision backbones (DINOv2, SigLIP). Training strategies include FSDP, DDP, and LoRA, with support for Parquet datasets and safetensors weights. Other features include multi-GPU evaluation, Real-Time Chunking (RTC) for trajectory continuity, and accelerated inference via Triton kernels and CUDA Graph capture.
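To make the "unified configuration plus decoupled modules" idea concrete, here is a minimal sketch of a config-driven component registry. It is not FluxVLA's actual API: the registry, the `register` and `build` helpers, the stub classes, and the config keys are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical registry mapping a config string to a component factory.
# None of these names are taken from FluxVLA itself.
REGISTRY: Dict[str, Dict[str, Callable[..., object]]] = {"vision": {}, "llm": {}}


def register(kind: str, name: str):
    """Decorator that records a component class under (kind, name)."""
    def wrap(cls):
        REGISTRY[kind][name] = cls
        return cls
    return wrap


@register("vision", "siglip")
@dataclass
class SigLIPStub:
    image_size: int = 224


@register("vision", "dinov2")
@dataclass
class DINOv2Stub:
    image_size: int = 224


@register("llm", "qwen")
@dataclass
class QwenStub:
    hidden_size: int = 1024


def build(config: dict) -> dict:
    """Instantiate every component named in a single config dict."""
    built = {}
    for kind, spec in config.items():
        spec = dict(spec)  # copy so the caller's config is left untouched
        name = spec.pop("type")
        try:
            factory = REGISTRY[kind][name]
        except KeyError:
            raise ValueError(f"unknown {kind} component: {name!r}") from None
        built[kind] = factory(**spec)
    return built


# One config selects the backbones; swapping "siglip" for "dinov2"
# changes the vision module without touching any other code.
config = {
    "vision": {"type": "siglip", "image_size": 384},
    "llm": {"type": "qwen"},
}
model_parts = build(config)
```

The design point is that each backbone only has to satisfy the registry's interface, so a new model or backbone can be added by registering one more class rather than editing the training or evaluation code.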
Quick Start & Requirements
Installation requires a conda environment with Python 3.10 and a CUDA-enabled PyTorch build (e.g., torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124). flash-attn, av, and the remaining dependencies are installed with pip install -r requirements.txt. Key prerequisites are CUDA >= 12.4 and the system libraries needed for EGL rendering. Experiment tracking via wandb or TensorBoard is supported. The README links to the PyTorch installation and EGL configuration guides.
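The version prerequisites above can be checked before installing anything heavy. The following is a minimal sketch, not part of FluxVLA: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helpers, and the CUDA version is passed in as a string (with PyTorch installed, `torch.version.cuda` is one place to read it, and it is `None` on CPU-only builds).

```python
import sys
from typing import List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '12.4' or '2.6.0+cu124' into a tuple of leading integers."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found: str, minimum: str) -> bool:
    """True when `found` is at least `minimum`, compared numerically."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '12' compares like '12.0'
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(python: Tuple[int, int], cuda: Optional[str]) -> List[str]:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if tuple(python[:2]) != (3, 10):
        problems.append(f"Python 3.10 expected, found {python[0]}.{python[1]}")
    if cuda is None:
        problems.append("PyTorch build has no CUDA support")
    elif not meets_minimum(cuda, "12.4"):
        problems.append(f"CUDA >= 12.4 expected, found {cuda}")
    return problems


# The CUDA string here is a placeholder; substitute the value from your system.
print(check_environment(sys.version_info[:2], "12.4"))
```

Comparing integer tuples rather than raw strings matters because a string comparison would rank "12.10" below "12.4". The same `meets_minimum` helper can cover the Triton 3.2.0+ requirement noted for the RTX 5090, for example `meets_minimum(installed_triton_version, "3.2.0")`.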
Maintenance & Community
The project acknowledges contributions from NVIDIA Isaac, OpenVLA, and Qwen. Support is available via GitHub issues. The roadmap includes expanding backbone support, integrating VLM/CoT data training, and adding support for tools like Isaac Sim.
Licensing & Compatibility
The README does not explicitly state a license, so the licensing terms should be confirmed with the project before any commercial use.
Limitations & Caveats
GR00T evaluation on LIBERO is unstable and sensitive to environmental factors. RTX 5090 requires updated Triton (3.2.0+). Installation can be complex, with potential issues related to CMake, NumPy version conflicts, and Hugging Face connectivity, often needing specific environment variable settings.