pixeltable  by pixeltable

AI data infrastructure for multimodal apps, using a declarative, incremental approach

Created 2 years ago
950 stars

Top 38.6% on SourcePulse

Project Summary

Pixeltable provides a declarative data infrastructure for multimodal AI applications, addressing the complexity of stitching together disparate tools for data ingestion, transformation, indexing, and orchestration. It targets AI engineers and researchers building production-ready multimodal applications, offering a unified framework to simplify data plumbing and accelerate development.

How It Works

Pixeltable operates as a database, storing metadata and computed results persistently. Users define data processing and AI workflows declaratively using computed columns on tables. The engine automatically handles data ingestion (referencing files in place), transformation via Python UDFs or built-in operations, AI model integration for inference, and vector index creation for semantic search. Its core advantage lies in incremental computation, ensuring only necessary recomputations occur when data or code changes, alongside automatic versioning and lineage tracking.
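The idea of computed columns with incremental computation can be illustrated with a small, self-contained sketch. This is a toy in-memory model and not Pixeltable's API: the class `ToyTable`, its methods, and the `word_count` function are hypothetical names chosen for illustration. It assumes a computed column is simply a function of a row, and shows that adding a column backfills existing rows once, while later inserts compute values only for the new rows.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class ToyTable:
    """Toy in-memory table with computed columns (hypothetical, not Pixeltable's API)."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self.computed: Dict[str, Callable[[Row], Any]] = {}

    def add_computed_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        # Register the column, then backfill rows that already exist.
        self.computed[name] = fn
        for row in self.rows:
            row[name] = fn(row)

    def insert(self, new_rows: List[Row]) -> None:
        # Only the newly inserted rows are evaluated; stored results are reused.
        for values in new_rows:
            row = dict(values)
            for name, fn in self.computed.items():
                row[name] = fn(row)
            self.rows.append(row)


calls = {"n": 0}  # counts how often the column function actually runs


def word_count(row: Row) -> int:
    calls["n"] += 1
    return len(row["text"].split())


table = ToyTable()
table.insert([{"text": "a red bicycle"}, {"text": "two dogs playing"}])
table.add_computed_column("n_words", word_count)  # backfills the 2 existing rows
table.insert([{"text": "sunset over the sea"}])   # evaluates only the 1 new row

print(calls["n"])                           # 3 evaluations in total, not 5
print([r["n_words"] for r in table.rows])   # [3, 3, 4]
```

A real engine would also persist results and track versions and lineage; the sketch only demonstrates the "compute once per row, reuse afterwards" behavior.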

Quick Start & Requirements

  • Install via pip: pip install pixeltable
  • Requires Python 3.8+
  • Supports Linux, macOS, and Windows.
  • See Installation and Quick Start.

Highlighted Details

  • Unified multimodal interface for images, video, audio, and documents.
  • Declarative computed columns for automatic processing and AI model integration.
  • Built-in vector search and similarity indexing.
  • Supports Python UDFs and agentic workflows with LLM tool calling.
  • Persistent storage with automatic versioning and lineage tracking.
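
As a rough illustration of what similarity search over stored embeddings involves, the following sketch ranks items by cosine similarity to a query vector. It is a generic brute-force example, not Pixeltable's index implementation: the function names `cosine` and `top_k`, the file names, and the three-dimensional embedding values are all made up for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    items: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k items most similar to the query, best match first."""
    scored = [(name, cosine(query, vec)) for name, vec in items]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical embeddings; a real system would obtain these from a model.
embeddings = [
    ("cat.jpg", [0.9, 0.1, 0.0]),
    ("dog.jpg", [0.8, 0.3, 0.1]),
    ("invoice.pdf", [0.0, 0.2, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], embeddings, k=2)
print([name for name, _ in results])  # ['cat.jpg', 'dog.jpg']
```

Production vector indexes typically use approximate nearest-neighbor structures rather than scanning every row, but the ranking criterion is the same.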

Maintenance & Community

  • Active development with a public roadmap for cloud infrastructure and deployment.
  • Community support available via Discord.
  • Contributions are welcomed via their Contributing Guide.

Licensing & Compatibility

  • Licensed under the Apache 2.0 License.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is actively under development, with a roadmap indicating future cloud features. While it supports various AI integrations, specific model compatibility or performance tuning for niche use cases may require custom UDFs.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 week
  • Pull requests (30d): 37
  • Issues (30d): 4
  • Star history: 223 stars in the last 30 days
