UniWM by F1y1113

Memory-augmented world models for visual navigation

Created 2 months ago
255 stars

Top 98.8% on SourcePulse

Project Summary

UniWM (Unified World Models) is a memory-augmented world model for visual navigation that integrates egocentric visual foresight and planning within a single autoregressive backbone. It targets researchers and engineers in AI and robotics, and aims for stable, coherent reasoning over extended horizons by tightly aligning visual imagination with action decisions.

How It Works

The core approach employs a multimodal autoregressive backbone that unifies visual foresight and planning. Unlike modular systems, UniWM explicitly grounds action decisions in visually imagined outcomes. A hierarchical memory mechanism further integrates short-term perceptual details with long-term trajectory context, facilitating robust reasoning over extended horizons.
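To make the two-level memory idea concrete, here is a minimal toy sketch. It is not UniWM's implementation: the class name HierarchicalMemory, the parameters short_capacity and chunk_size, and the mean-pooling summary are all assumptions chosen for illustration. The only idea taken from the description above is that recent perceptual detail is kept as-is while older steps are compressed into longer-term trajectory context.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


@dataclass
class HierarchicalMemory:
    """Toy two-level memory (hypothetical, not the UniWM code).

    Recent features are kept verbatim in short_term; features evicted
    from it are mean-pooled in chunks and appended to long_term.
    """
    short_capacity: int = 4  # assumed: number of recent steps kept in full detail
    chunk_size: int = 2      # assumed: evicted steps pooled into one summary
    short_term: Deque[Vector] = field(default_factory=deque)
    long_term: List[Vector] = field(default_factory=list)
    _pending: List[Vector] = field(default_factory=list)

    def write(self, feature: Vector) -> None:
        """Store a new per-step feature, compressing the oldest if needed."""
        self.short_term.append(feature)
        if len(self.short_term) > self.short_capacity:
            self._pending.append(self.short_term.popleft())
        if len(self._pending) == self.chunk_size:
            self.long_term.append(mean_pool(self._pending))
            self._pending = []

    def read(self) -> List[Vector]:
        """Context for the next prediction: summaries first, then recent steps."""
        return self.long_term + list(self.short_term)
```

In this sketch the context returned by read() grows by one summary for every chunk_size evicted steps rather than by one entry per step, which is one simple way to keep long trajectories affordable for an autoregressive model.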

Quick Start & Requirements

To install, create and activate a Conda environment with Python 3.10, install PyTorch 2.4.0, then install the project dependencies with pip install -r requirements.txt --user. A partial dataset in data_samples/ is provided for debugging and for demonstrating the data format. Training and evaluation run through torchrun for multi-GPU distributed execution and require the data directories and model checkpoints to be specified.

Highlighted Details

  • Integrates visual foresight and planning into a single, unified world model.
  • Features a hierarchical memory mechanism for short-term perception and long-term context.
  • Supports multi-GPU distributed training and various evaluation modes (single-step, task-level, rollout).
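The summary does not define the evaluation modes listed above, so the following sketch rests on an assumption about what two of them usually mean: single-step evaluation restarts every prediction from the ground-truth state, while rollout evaluation feeds the model its own predictions so that errors can accumulate. The function names, the scalar State type, and the toy predictor are hypothetical and unrelated to the UniWM codebase; task-level evaluation is not covered here.

```python
from typing import Callable, List

State = float
Predictor = Callable[[State], State]


def single_step_error(predict: Predictor, trajectory: List[State]) -> float:
    """Mean absolute error when every prediction starts from the true state."""
    errors = [
        abs(predict(state) - target)
        for state, target in zip(trajectory, trajectory[1:])
    ]
    return sum(errors) / len(errors)


def rollout_error(predict: Predictor, trajectory: List[State]) -> float:
    """Mean absolute error when the model consumes its own predictions."""
    state = trajectory[0]
    errors = []
    for target in trajectory[1:]:
        state = predict(state)  # the prediction becomes the next input
        errors.append(abs(state - target))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    truth = [0.0, 1.0, 2.0, 3.0, 4.0]

    def slightly_short(state: State) -> State:
        # Toy predictor that undershoots the true +1.0 step.
        return state + 0.9

    print(single_step_error(slightly_short, truth))
    print(rollout_error(slightly_short, truth))
```

With this toy predictor the rollout error is larger than the single-step error because each small undershoot compounds over the horizon, which is the kind of drift that long-horizon evaluation is meant to expose.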

Maintenance & Community

Contributions can be discussed by contacting yfeidong@uw.edu or fyiwu@uw.edu. No explicit community channels (e.g., Discord, Slack) are listed.

Licensing & Compatibility

The provided README does not specify a software license. Potential users should verify licensing terms before adoption, especially for commercial or closed-source integration.

Limitations & Caveats

Only a partial dataset is provided, primarily for debugging and demonstrating data formats. Users will likely need to supply their own datasets for full training and evaluation.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1

Star History

15 stars in the last 30 days
