nomic by nomic-ai

Python client for massive unstructured data interaction

Created 3 years ago
1,812 stars

Top 23.8% on SourcePulse

Project Summary

nomic is the Python client for Nomic Atlas, a browser-based platform for exploring, labeling, searching, and sharing massive unstructured datasets. It is aimed at researchers and developers who work with text, image, audio, and video data and need to organize large collections and find patterns in them.

How It Works

Atlas leverages embeddings to represent unstructured data, allowing for semantic search and clustering into topics. It generates and stores these embeddings, providing access to both high-dimensional latent representations and 2D projections for visualization. The platform facilitates programmatic access to data structures, individual data points, and automatically generated topic models, enabling both coding-based and no-code interaction.
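To make the idea of embedding-based semantic search concrete, here is a minimal sketch in plain Python. It does not use the nomic client or any Atlas API. The vectors, document IDs, and function names are hypothetical and chosen only for illustration, and a real system would use a learned embedding model and an approximate nearest-neighbor index instead of a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, corpus, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in corpus.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional "embeddings"; real embeddings have
# hundreds of dimensions and come from a trained model.
corpus = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_taxes": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

results = semantic_search(query, corpus, top_k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Documents whose vectors point in a similar direction to the query rank highest, which is the property that lets an embedding space support search and topic clustering by meaning instead of by exact keyword match.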

Quick Start & Requirements

Highlighted Details

  • Supports datasets from hundreds to tens of millions of points.
  • Handles multiple data modalities: text, image, audio, video.
  • Features semantic search, topic clustering, data tagging, and deduplication.
  • Offers shareable, interactive maps with or without coding.

Maintenance & Community

Licensing & Compatibility

  • License details are not explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not specify the exact license, which may impact commercial adoption. There are no explicit mentions of supported operating systems or hardware requirements beyond general Python compatibility.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 21 stars in the last 30 days
