Unified-IO 2 code for training, inference, and demo
Unified-IO 2 is a multimodal foundation model designed for researchers and practitioners working with diverse data types including vision, language, audio, and action. It offers a unified framework for training and inference across these modalities, building upon the T5X codebase and providing pre-trained checkpoints for various model sizes.
How It Works
Unified-IO 2 handles all modalities autoregressively within a single model. Its preprocessing pipeline has three stages: task-specific steps (resizing images, converting audio to mel-spectrograms), modality-general preprocessing (tokenization, handling missing modalities), and a feature converter that produces consistent, fixed-size tensors suitable for JAX. Sharing one data pipeline and one model architecture across modalities supports cross-modal learning and generation.
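The feature-converter idea can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the dictionary keys, and the sizes MAX_TEXT_LEN and IMAGE_SHAPE are all hypothetical, and NumPy stands in for JAX arrays. It shows only the two behaviours named above: padding or truncating to a fixed size, and substituting a placeholder when a modality is missing.

```python
import numpy as np

# Hypothetical fixed sizes, chosen only for illustration.
MAX_TEXT_LEN = 8
IMAGE_SHAPE = (4, 4, 3)


def pad_or_truncate(tokens, length, pad_id=0):
    """Return (ids, mask), each with exactly `length` entries."""
    ids = np.full((length,), pad_id, dtype=np.int32)
    mask = np.zeros((length,), dtype=np.int32)
    n = min(len(tokens), length)
    if n:
        ids[:n] = tokens[:n]
        mask[:n] = 1  # 1 marks real tokens, 0 marks padding
    return ids, mask


def convert_features(example):
    """Map a dict with optional 'text' and 'image' keys to fixed-shape arrays."""
    text_ids, text_mask = pad_or_truncate(example.get("text", []), MAX_TEXT_LEN)

    image = example.get("image")
    if image is None:
        # Missing modality: zero placeholder plus a flag marking it absent.
        image_arr = np.zeros(IMAGE_SHAPE, dtype=np.float32)
        image_present = np.int32(0)
    else:
        image_arr = np.asarray(image, dtype=np.float32)
        if image_arr.shape != IMAGE_SHAPE:
            raise ValueError(f"expected image of shape {IMAGE_SHAPE}, got {image_arr.shape}")
        image_present = np.int32(1)

    return {
        "text_ids": text_ids,
        "text_mask": text_mask,
        "image": image_arr,
        "image_present": image_present,
    }


if __name__ == "__main__":
    out = convert_features({"text": [5, 6, 7]})  # text-only example, no image
    for name, value in out.items():
        print(name, np.shape(value))
```

Fixed shapes matter because a jit-compiled JAX function is recompiled whenever its input shapes change. Emitting every example at the same size, with masks marking padded or absent inputs, avoids that cost.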
Quick Start & Requirements
Install: pip install -e . (for GPU/CPU) or pip install -e '.[tpu]' (for TPU). Additional dependencies for the demo: pip install -e '.[demo]'
Prerequisites: Python (possible orbax.checkpoint conflicts). CUDA for GPU. A LLaMa tokenizer model file is required.
Checkpoints: hosted on S3 (e.g., s3://ai2-prior-uio/public/uio2-checkpoints/large-3m). Download command example: aws s3 --no-sign-request cp --recursive s3://ai2-prior-uio/public/uio2-checkpoints/large-3m large-3m --exclude "state*"
Demo: run jupyter notebook demo.ipynb after installing the demo dependencies.
Highlighted Details
Maintenance & Community
The project is from Allen Institute for AI (AI2). Specific community channels (Discord/Slack) or roadmap details are not explicitly mentioned in the README.
Licensing & Compatibility
The README does not explicitly state a license. Given the affiliation with AI2 and the use of T5X, it is likely to be permissive, but users should verify. Compatibility with commercial or closed-source projects would require license confirmation.
Limitations & Caveats
GPU/CPU setups are noted as not well-tested, with a primary focus on TPUs. Python 3.9 may encounter compatibility issues with orbax.checkpoint and JAX. Some datasets require manual preprocessing steps beyond the provided scripts.