F1y1113: Memory-augmented world models for visual navigation
Top 98.8% on SourcePulse
Unified World Models (UniWM) introduces a unified, memory-augmented world model paradigm for visual navigation, integrating egocentric visual foresight and planning within a single autoregressive backbone. It targets researchers and engineers in AI and robotics, enabling stable, coherent reasoning over extended horizons by tightly aligning visual imagination with action decisions.
How It Works
The core approach employs a multimodal autoregressive backbone that unifies visual foresight and planning. Unlike modular systems, UniWM explicitly grounds action decisions in visually imagined outcomes. A hierarchical memory mechanism further integrates short-term perceptual details with long-term trajectory context, facilitating robust reasoning over extended horizons.
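To make the hierarchical memory idea concrete, here is a minimal sketch under stated assumptions. It is not UniWM's implementation: the class name HierarchicalMemory, the parameters short_capacity and chunk_size, and the use of mean pooling to compress old steps are all hypothetical choices made for illustration. The sketch keeps the most recent per-step features verbatim (short-term detail) and folds older features into coarse summaries (long-term trajectory context).

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


@dataclass
class HierarchicalMemory:
    """Illustrative two-level memory (hypothetical, not the UniWM code)."""

    short_capacity: int = 4  # assumption: recent steps kept at full detail
    chunk_size: int = 2      # assumption: evicted steps pooled per summary
    short_term: Deque[Vector] = field(default_factory=deque)
    long_term: List[Vector] = field(default_factory=list)
    pending: List[Vector] = field(default_factory=list)

    def write(self, feature: Vector) -> None:
        """Store a new step; compress the oldest step once the window is full."""
        self.short_term.append(feature)
        if len(self.short_term) > self.short_capacity:
            self.pending.append(self.short_term.popleft())
            if len(self.pending) == self.chunk_size:
                # Fold a full chunk of old steps into one long-term summary.
                self.long_term.append(mean_pool(self.pending))
                self.pending.clear()

    def read(self) -> List[Vector]:
        """Return context ordered oldest to newest: summaries, then recent steps."""
        return self.long_term + self.pending + list(self.short_term)
```

In a design like this, the context handed to the backbone grows with the number of summaries rather than the number of steps, which is one plausible way to keep long-horizon rollouts tractable.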
Quick Start & Requirements
To install, create a Conda environment with Python 3.10, activate it, then install PyTorch 2.4.0 and the project dependencies with pip install -r requirements.txt --user. A partial dataset is available in data_samples/ for debugging and format demonstration. Training and evaluation use torchrun for multi-GPU distributed execution and require the paths to the data directories and model checkpoints.
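Before launching a multi-GPU run, it can help to confirm that the environment matches the stated requirements. The following is a small preflight sketch, not part of the project: the function names installed_version and preflight are hypothetical, and the only facts it encodes are the ones stated above (Python 3.10, PyTorch 2.4.0, and a data_samples/ directory).

```python
import sys
from importlib import metadata
from pathlib import Path
from typing import List, Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def preflight(
    python_version: Tuple[int, int],
    torch_version: Optional[str],
    sample_dir: Path,
) -> List[str]:
    """Collect human-readable problems instead of failing on the first one."""
    problems: List[str] = []
    if python_version != (3, 10):
        major, minor = python_version
        problems.append(f"expected Python 3.10, found {major}.{minor}")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif not torch_version.startswith("2.4.0"):
        # Local build tags such as "2.4.0+cu121" still pass this check.
        problems.append(f"expected PyTorch 2.4.0, found {torch_version}")
    if not sample_dir.is_dir():
        problems.append(f"sample data directory not found: {sample_dir}")
    return problems


if __name__ == "__main__":
    issues = preflight(
        sys.version_info[:2],
        installed_version("torch"),
        Path("data_samples"),  # directory name taken from the README summary
    )
    for issue in issues:
        print("WARNING:", issue)
```

Run from the repository root inside the activated Conda environment, the script prints one warning per mismatch and nothing when every check passes.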
Maintenance & Community
To discuss contributions, contact yfeidong@uw.edu or fyiwu@uw.edu. No community channels (e.g., Discord, Slack) are listed.
Licensing & Compatibility
The provided README does not specify a software license. Potential users should verify licensing terms before adoption, especially for commercial or closed-source integration.
Limitations & Caveats
Only a partial dataset is provided, primarily for debugging and demonstrating data formats, so users will likely need to supply their own datasets for full training and evaluation.