Multimodal AI library for content understanding and generation
Top 34.2% on sourcepulse
UForm is a compact, multimodal AI library designed for efficient content understanding and generation across text, images, and video. It targets developers and researchers needing to deploy AI capabilities on diverse platforms, from servers to smartphones, offering significant speedups and reduced resource footprints compared to larger models.
How It Works
UForm leverages custom-trained, compact transformer models. Its embedding models utilize Matryoshka-style embeddings, allowing for flexible dimensionality (64-768) and efficient retrieval. Generative models are built on efficient architectures like Qwen and LLaMA, enabling tasks such as chat, image captioning, and Visual Question Answering (VQA). The library emphasizes portability via ONNX and native support for quantization (f32 to i8, b1) and embedding slicing, integrating tightly with SimSIMD for numerical operations and USearch for vector indexing.
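Matryoshka-style embeddings allow a long vector to be truncated to a shorter prefix and renormalized, trading accuracy for speed. A minimal pure-Python sketch of that idea (illustrative only, not UForm's implementation; the 8-dim vector is a toy stand-in for UForm's 64-768 dims):

```python
import math

def slice_embedding(vec, dim):
    """Keep the first `dim` components of a Matryoshka-style
    embedding and L2-normalize the result."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "full" embedding; real Matryoshka models pack the most
# informative dimensions first, so prefixes stay useful.
full = [0.9, 0.4, 0.1, -0.3, 0.05, 0.02, -0.01, 0.01]
small = slice_embedding(full, 4)  # coarse 4-dim version for fast pre-filtering
print(len(small))  # 4
```

In a retrieval pipeline, the short slices drive a cheap first-pass search and the full vectors rerank the survivors.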
Quick Start & Requirements
Install with pip install uform. Using the PyTorch models additionally requires the transformers and torch packages.
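The quantization path advertised under How It Works (f32 down to i8 and packed b1) can be sketched in a few lines of pure Python. This illustrates the idea only, not UForm's actual code, which delegates such kernels to SimSIMD:

```python
def quantize_i8(vec):
    """Scalar-quantize floats (assumed roughly in [-1, 1],
    e.g. components of a normalized embedding) to signed 8-bit ints."""
    return [max(-127, min(127, round(x * 127))) for x in vec]

def quantize_b1(vec):
    """Binary-quantize by sign, packing 8 components per byte,
    for a 32x storage reduction over f32."""
    bits = [1 if x > 0 else 0 for x in vec]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

emb = [0.5, -0.25, 0.0, 1.0, -1.0, 0.125, -0.5, 0.75]
print(quantize_i8(emb))  # [64, -32, 0, 127, -127, 16, -64, 95]
print(quantize_b1(emb))  # one byte holding the sign bits 1 0 0 1 0 1 0 1
```

The i8 vectors still support approximate dot products, while b1 vectors are compared with Hamming distance; both formats are what a USearch-style index would store.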
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project marks the uform-gen model with a ⚠️ emoji, suggesting potential instability or deprecation. Future video support is indicated with 🔜 emojis. The license is not explicitly stated, which may pose a barrier to commercial adoption.