UniWM by F1y1113

Memory-augmented world models for visual navigation

Created 4 months ago

268 stars

Top 96.0% on SourcePulse

Project Summary

Unified World Models (UniWM) is a memory-augmented world model for visual navigation that integrates egocentric visual foresight and planning within a single autoregressive backbone. It targets researchers and engineers in AI and robotics, and aims for stable, coherent reasoning over extended horizons by tightly aligning visual imagination with action decisions.

How It Works

The core approach employs a multimodal autoregressive backbone that unifies visual foresight and planning. Unlike modular systems, UniWM explicitly grounds action decisions in visually imagined outcomes. A hierarchical memory mechanism further integrates short-term perceptual details with long-term trajectory context, facilitating robust reasoning over extended horizons.
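To make the idea of a hierarchical memory concrete, here is a minimal sketch of a two-level buffer: a small window that keeps the most recent observations in full, plus a sparse trace that keeps every k-th observation as long-term context. This is not UniWM's implementation. The class name `HierarchicalMemory`, the parameters `short_capacity` and `long_stride`, and the use of plain strings as observations are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List


class HierarchicalMemory:
    """Toy two-level memory (illustrative only, not UniWM's actual design)."""

    def __init__(self, short_capacity: int = 4, long_stride: int = 3) -> None:
        # Short-term: dense window over the most recent observations.
        self.short_term: Deque[str] = deque(maxlen=short_capacity)
        # Long-term: sparse trace that keeps every `long_stride`-th observation.
        self.long_term: List[str] = []
        self.long_stride = long_stride
        self._step = 0

    def write(self, observation: str) -> None:
        """Store one observation in both levels, subsampling the long-term one."""
        self.short_term.append(observation)
        if self._step % self.long_stride == 0:
            self.long_term.append(observation)
        self._step += 1

    def read(self) -> List[str]:
        """Return sparse older context followed by the dense recent window."""
        recent = list(self.short_term)
        older = [obs for obs in self.long_term if obs not in recent]
        return older + recent


memory = HierarchicalMemory(short_capacity=4, long_stride=3)
for t in range(10):
    memory.write(f"obs{t}")
context = memory.read()
```

In this sketch the context handed to a model stays bounded by the short window plus a slowly growing sparse trace, which is one simple way to keep both recent detail and trajectory-level history available.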

Quick Start & Requirements

Installation involves creating and activating a Conda environment with Python 3.10, then installing PyTorch 2.4.0 and the project dependencies with pip install -r requirements.txt --user. A partial dataset in data_samples/ is provided for debugging and for demonstrating the data format. Training and evaluation run through torchrun for multi-GPU distributed execution and require the data directories and model checkpoints to be specified.
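After setting up the environment, a quick check that the interpreter and PyTorch versions match the stated requirements can save a failed launch. The sketch below is not part of the repository. The helper names `installed_torch_version` and `check_environment` are hypothetical, and the check assumes only what is stated above (Python 3.10 and PyTorch 2.4.0).

```python
import sys
from importlib import metadata
from typing import List, Optional, Sequence

EXPECTED_PYTHON = (3, 10)   # from the requirements described above
EXPECTED_TORCH = "2.4.0"    # from the requirements described above


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


def check_environment(python_version: Sequence[int],
                      torch_version: Optional[str]) -> List[str]:
    """Return a list of mismatches; an empty list means the versions match."""
    problems = []
    if tuple(python_version[:2]) != EXPECTED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python 3.10 expected, found {found}")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    # Ignore a local build suffix such as "+cu121" when comparing versions.
    elif torch_version.split("+")[0] != EXPECTED_TORCH:
        problems.append(f"PyTorch {EXPECTED_TORCH} expected, found {torch_version}")
    return problems


if __name__ == "__main__":
    for problem in check_environment(sys.version_info, installed_torch_version()):
        print(problem)
```

Running the script inside the activated Conda environment prints nothing when both versions match.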

Highlighted Details

Integrates visual foresight and planning into a single, unified world model.
Features a hierarchical memory mechanism for short-term perception and long-term context.
Supports multi-GPU distributed training and various evaluation modes (single-step, task-level, rollout).
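The summary names three evaluation modes without defining them. The sketch below shows one common reading of those terms on a toy one-dimensional model: single-step evaluation conditions every prediction on the ground-truth previous state, rollout evaluation feeds the model its own predictions, and task-level evaluation checks whether the rollout ends close enough to the goal. These definitions, the function names, and the integer toy model are assumptions for illustration and may differ from how UniWM's scripts implement the modes.

```python
from typing import Callable, List

Model = Callable[[int], int]  # toy model: maps a state to the predicted next state


def single_step_errors(model: Model, trajectory: List[int]) -> List[int]:
    """Error per step when each prediction starts from the true previous state."""
    return [abs(model(prev) - nxt) for prev, nxt in zip(trajectory, trajectory[1:])]


def rollout_errors(model: Model, trajectory: List[int]) -> List[int]:
    """Error per step when the model is fed its own previous prediction."""
    state = trajectory[0]
    errors = []
    for target in trajectory[1:]:
        state = model(state)
        errors.append(abs(state - target))
    return errors


def task_success(model: Model, trajectory: List[int], tolerance: int) -> bool:
    """Task-level check: does the rollout finish within `tolerance` of the goal?"""
    errors = rollout_errors(model, trajectory)
    return bool(errors) and errors[-1] <= tolerance


ground_truth = [0, 2, 4, 6, 8]                # true dynamics: +2 per step


def biased_model(state: int) -> int:
    return state + 3                          # toy model overshoots by 1 each step


step_err = single_step_errors(biased_model, ground_truth)
roll_err = rollout_errors(biased_model, ground_truth)
```

In this toy setting the single-step error stays constant while the rollout error grows with the horizon, which is why the two kinds of evaluation can rank models differently.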

Maintenance & Community

Contributions can be discussed by contacting yfeidong@uw.edu or fyiwu@uw.edu. No explicit community channels (e.g., Discord, Slack) are listed.

Licensing & Compatibility

The provided README does not specify a software license. Potential users should verify licensing terms before adoption, especially for commercial or closed-source integration.

Limitations & Caveats

Only a partial dataset is provided, intended mainly for debugging and demonstrating data formats. Users will likely need to supply their own datasets for comprehensive training and evaluation.

Health Check

Last Commit

3 months ago

Responsiveness

Inactive

Star History

6 stars in the last 30 days