kairos-sensenova  by kairos-agi

World model for embodied AI grounded in physics

Created 4 months ago
379 stars

Top 75.0% on SourcePulse

Project Summary

Kairos 3.0 is a unified cross-embodiment world modeling framework designed to address three challenges in Embodied AI: heterogeneous data, long-horizon reasoning, and constrained edge compute. Aimed at engineers, researchers, and power users, it provides a 4B-parameter model capable of real-time edge deployment, enabling high-precision action prediction and HD generation for both physical and digital embodied AI applications.

How It Works

Kairos 3.0 takes fundamental physical and causal laws as its cognitive foundation, integrating real-robot interaction, structured human behavior, and Chain-of-Thought (CoT) data. Its unified multimodal architecture handles understanding, generation, and action prediction within a single loop. A key innovation is its custom Hybrid Linear Attention operator, which reduces temporal complexity from O(n^2) to O(n), so cost grows linearly rather than quadratically with sequence length n. This cuts VRAM and compute overhead while preserving long-sequence capability, which in turn supports the efficient, low-latency inference that embodied AI requires.
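
The O(n^2) to O(n) reduction can be illustrated with a generic kernelized linear attention. This is a minimal sketch, not the Hybrid Linear Attention operator itself, whose internals are not described here. The feature map (elu(x) + 1), the causal setup, and all function names are assumptions chosen for illustration, and plain Python lists are used only to keep the example dependency-free.

```python
import math
import random

def feature_map(x):
    """elu(x) + 1 applied elementwise: keeps every feature strictly positive."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def quadratic_attention(q, k, v):
    """Causal attention via explicit pairwise scores: O(n^2) work in sequence length n."""
    phi_q = [feature_map(row) for row in q]
    phi_k = [feature_map(row) for row in k]
    out = []
    for t in range(len(q)):
        # Step t scores against every earlier position, so work grows with t.
        weights = [dot(phi_q[t], phi_k[j]) for j in range(t + 1)]
        total = sum(weights)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights)) / total
                    for c in range(len(v[0]))])
    return out

def linear_attention(q, k, v):
    """Same output from a fixed-size running state: O(n) work in sequence length n."""
    d, d_v = len(q[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d)]  # running sum of phi(k_j) v_j^T
    norm = [0.0] * d                         # running sum of phi(k_j)
    out = []
    for q_t, k_t, v_t in zip(q, k, v):
        phi_q, phi_k = feature_map(q_t), feature_map(k_t)
        for i in range(d):
            norm[i] += phi_k[i]
            for c in range(d_v):
                state[i][c] += phi_k[i] * v_t[c]
        # Each step touches only the d x d_v state, independent of position.
        denom = dot(phi_q, norm)
        out.append([sum(phi_q[i] * state[i][c] for i in range(d)) / denom
                    for c in range(d_v)])
    return out

def random_matrix(rng, rows, cols):
    """Hypothetical helper: a rows x cols matrix of standard normal samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
```

Calling both functions on the same inputs, for example three matrices from `random_matrix(random.Random(0), 16, 4)`, gives outputs that agree to floating-point tolerance. The difference is in cost: the recurrent form carries a state whose size does not depend on n, which is where the memory savings of linear attention variants generally come from.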

Quick Start & Requirements

Install via Docker (recommended; images are provided for A800/A100, RTX 5090, and METAX C500) or build a Python environment with pip install -r requirements.txt. The pip method requires Python >= 3.10, PyTorch >= 2.6, and CUDA >= 12.6. Inference additionally requires large model weights (Qwen2.5-VL, Wan2.1-VAE), which must be downloaded separately.
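
Before taking the pip route, the version prerequisites above can be checked with a short script. The following is a sketch under stated assumptions: the helper names (`parse_version`, `meets_minimum`, `check_environment`, `detect_versions`) are hypothetical and not part of the project, and the CUDA value read from PyTorch is the version PyTorch was built against, which may differ from the system toolkit.

```python
import re
import sys

# Minimum versions quoted for the pip installation method.
MINIMUMS = {"python": "3.10", "torch": "2.6", "cuda": "12.6"}

def parse_version(text):
    """Extract the leading numeric components, e.g. '2.6.0+cu126' -> (2, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))

def meets_minimum(installed, minimum):
    """Compare numerically, so that '3.10' is correctly newer than '3.9'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b

def check_environment(installed):
    """Map each requirement to True/False; a missing or None version fails."""
    results = {}
    for name, minimum in MINIMUMS.items():
        version = installed.get(name)
        results[name] = version is not None and meets_minimum(version, minimum)
    return results

def detect_versions():
    """Collect versions from the running interpreter; torch is optional here."""
    versions = {"python": ".".join(map(str, sys.version_info[:3])),
                "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return versions
    versions["torch"] = torch.__version__
    versions["cuda"] = torch.version.cuda  # None on CPU-only builds
    return versions
```

Running `check_environment(detect_versions())` returns a dictionary such as `{"python": True, "torch": False, "cuda": False}` on a machine without PyTorch. Versions are compared as integer tuples rather than strings because a plain string comparison would rank "3.9" above "3.10".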

  • GitHub Repository: https://github.com/kairos-agi/kairos-sensenova
  • Hugging Face Models: https://huggingface.co/kairos-agi/kairos-sensenova
  • ModelScope Models: https://modelscope.cn/models/kairos-team/

Highlighted Details

  • Achieves State-of-the-Art (SOTA) performance on benchmarks like PAI-Bench (80.03 for robot, 80.84 general) and WorldModelBench.
  • Demonstrates cross-embodiment generalization across single-arm, dual-arm, and dexterous-hand platforms, with native support for Agibot G1, Unitree G1, and Songling PIPER.
  • Runs inference on-robot in real time with low resource consumption, enabling deployment on edge devices.
  • The compact 4B parameter model matches or surpasses larger models (e.g., Cosmos 2.5 14B) in various benchmarks, balancing precision and efficiency.
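
To give the parameter counts above some scale, here is a rough back-of-the-envelope estimate of weight storage alone. It is a sketch, not a measured figure: the numeric precisions are assumptions (the precision the released weights use is not stated here), the counts are the nominal "4B" and "14B", and activations, caches, and the separately downloaded Qwen2.5-VL and Wan2.1-VAE weights are excluded.

```python
# Bytes needed to store one parameter at common precisions (assumed, not project-specific).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}

def weight_memory_gib(num_params, dtype):
    """Approximate storage for the weights only, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

if __name__ == "__main__":
    for label, count in [("4B", 4_000_000_000), ("14B", 14_000_000_000)]:
        print(f"{label} at bf16: {weight_memory_gib(count, 'bf16'):.2f} GiB")
    # 4B at bf16: 7.45 GiB
    # 14B at bf16: 26.08 GiB
```

Under these assumptions a nominal 4B model needs about 3.5 times less weight memory than a 14B one at the same precision, which is one reason a smaller model is easier to fit on edge hardware.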

Maintenance & Community

The project is developed and maintained by the Kairos Team, specializing in Embodied Intelligence and World Model research. Further community engagement details beyond the GitHub repository are not explicitly provided in the README.

Licensing & Compatibility

Kairos is open-sourced under the Apache License 2.0. This permissive license allows free use, modification, and distribution, including in commercial products, provided the license text and attribution notices are retained. It places no restrictions on linking with closed-source software.

Limitations & Caveats

The METAX C500 platform is not supported when installing via requirements.txt; Docker is the only viable option for this hardware. Inference requires downloading substantial additional model weights (Qwen2.5-VL, Wan2.1-VAE) post-installation. The Docker image tag v0.0.1 suggests this may be an early release version.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 15
  • Issues (30d): 4
  • Star History: 342 stars in the last 30 days
