diffgram  by diffgram

AI datastore for schemas, BLOBs, and predictions

Created 7 years ago
1,882 stars

Top 23.1% on SourcePulse

Project Summary

Diffgram is an AI datastore designed for managing schemas, BLOBs, and predictions, targeting AI developers and researchers. It offers integrated human supervision, data workflow management, and a UI catalog to enhance AI data utilization and compliance, particularly for PII data.

How It Works

Diffgram functions as a centralized datastore for AI projects, supporting a wide array of data types including images, video, 3D, text, audio, and geospatial data. Its architecture emphasizes user control over data and integrates features for data labeling (human supervision), AI application workflows, and visual data exploration via a UI catalog.

Quick Start & Requirements

  • Users install Diffgram themselves; the README does not detail specific installation commands.
  • No specific hardware or software prerequisites are listed beyond general application requirements.
  • The README links to a video explainer and to the commercial open-source license.

Highlighted Details

  • Supports a broad range of media types for data labeling, including image, video, 3D, text, audio, and geospatial, with conversational/LLM support in preview.
  • Emphasizes data control and compliance, particularly for PII data.
  • Features integrated human supervision for data annotation and AI data application workflows.
  • Includes 706 tests (E2E, unit, and others), which suggests a focus on quality.

Maintenance & Community

  • The project has been in commercial use since 2018.
  • A Slack community is available via invite.
  • The README links to news, the roadmap, and development system information.

Licensing & Compatibility

  • The project uses a new Diffgram License version 2 (DLv2) and a Contributor License (CL), both available at no financial cost.
  • Details on commercial use compatibility are available via the linked license.

Limitations & Caveats

  • Document, HTML, and DICOM data types are listed on the roadmap, indicating they are not yet fully supported.
  • Conversational & LLM support is noted as "Preview."

Health Check

  • Last commit: 10 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 30 days
