diffgram by diffgram

AI datastore for schemas, BLOBs, and predictions

created 7 years ago
1,875 stars

Top 23.6% on sourcepulse

Project Summary

Diffgram is an AI datastore for managing schemas, BLOBs, and predictions, aimed at AI developers and researchers. It offers integrated human supervision, data workflow management, and a UI catalog to improve AI data utilization and compliance, particularly for data containing PII.

How It Works

Diffgram functions as a centralized datastore for AI projects, supporting a wide array of data types including images, video, 3D, text, audio, and geospatial data. Its architecture emphasizes user control over data and integrates features for data labeling (human supervision), AI application workflows, and visual data exploration via a UI catalog.

Quick Start & Requirements

  • Users install Diffgram themselves; the README does not detail specific installation commands.
  • No specific hardware or software prerequisites are listed beyond general application requirements.
  • Links to a video explainer and commercial open-source license are provided.

Highlighted Details

  • Supports a broad range of media types for data labeling, including image, video, 3D, text, audio, and geospatial, with conversational/LLM support in preview.
  • Emphasizes data control and compliance, particularly for PII data.
  • Features integrated human supervision for data annotation and AI data application workflows.
  • Includes 706 tests (E2E, unit, etc.), indicating a focus on quality.

Maintenance & Community

  • The project has been in commercial use since 2018.
  • A Slack community is available via invite.
  • News, roadmap, and development system information are linked.

Licensing & Compatibility

  • The project uses a new Diffgram License version 2 (DLv2) and a Contributor License (CL), both available at no financial cost.
  • Details on commercial use compatibility are available via the linked license.

Limitations & Caveats

  • Document, HTML, and DICOM data types are listed on the roadmap, indicating they are not yet fully supported.
  • Conversational & LLM support is noted as "Preview."

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 16 stars in the last 90 days
