xtreme1  by xtreme1-io

Open-source platform for multimodal training data annotation

created 3 years ago
1,045 stars

Top 36.6% on sourcepulse

Project Summary

Xtreme1 is an open-source, all-in-one platform designed for multimodal data annotation, curation, and ontology management, targeting machine learning engineers and researchers. It aims to streamline the creation of training data for computer vision and LLM applications, offering AI-powered tools to enhance efficiency in tasks like 2D/3D object detection, segmentation, and LiDAR-camera fusion.

How It Works

Xtreme1 uses a Docker-based architecture to provide a suite of annotation tools. It supports data types including images and 3D LiDAR point clouds, with integrations for popular libraries such as OpenPCDet and AB3DMOT for LiDAR-camera fusion. The platform provides pre-labeling and interactive models, configurable hierarchical ontologies, and features for data management, quality monitoring, and error identification. It also includes beta support for RLHF annotation for LLMs.

Quick Start & Requirements

  • Install: Download and unzip the release package, then run docker compose up from the package directory.
  • Prerequisites: Docker Desktop (4.1+) or Docker Engine (20.10+) with Docker Compose Plugin (2.0+). For model deployment, an NVIDIA GPU with CUDA Driver and NVIDIA Container Toolkit is required on a Linux server.
  • Resource Footprint: 2GB+ RAM, 10GB+ disk space. Model deployment requires 4GB+ GPU RAM.
  • Docs: http://docs.xtreme1.io/
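
The version prerequisites above can be checked before running docker compose up. The following is a minimal sketch, not part of Xtreme1: the helper names parse_version and meets_minimum are hypothetical, and it assumes you pass in the text printed by docker --version and docker compose version.

```python
import re

# Minimum versions quoted in the prerequisites above.
MIN_DOCKER_ENGINE = (20, 10)
MIN_COMPOSE_PLUGIN = (2, 0)


def parse_version(text):
    """Extract the first 'major.minor' pair from a version string."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_text, minimum):
    """Return True if the version found in version_text is at least `minimum`."""
    return parse_version(version_text) >= minimum


# Example strings in the shape these commands typically print.
print(meets_minimum("Docker version 24.0.7, build afdd53b", MIN_DOCKER_ENGINE))
print(meets_minimum("Docker Compose version v2.23.0", MIN_COMPOSE_PLUGIN))
```

Comparing (major, minor) tuples avoids the pitfall of comparing version strings lexically, where "20.9" would sort after "20.10".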

Highlighted Details

  • Supports multimodal data labeling: images, 3D LiDAR, and 2D/3D sensor fusion.
  • Includes AI-assisted pre-labeling and interactive models for various CV tasks.
  • Features an RLHF annotation tool for LLMs (beta).
  • Offers data curation and visualization tools.

Maintenance & Community

The project is hosted by the LF AI & Data Foundation. Community engagement is encouraged via Twitter and GitHub Issues.

Licensing & Compatibility

Licensed under Apache 2.0, which permits commercial use and linking from closed-source software.

Limitations & Caveats

Built-in model containers require Linux with NVIDIA hardware. Running on ARM CPUs may require emulation (e.g., setting platform: linux/amd64 in docker-compose.override.yml), which can reduce performance. RLHF features are in beta.
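
The ARM workaround mentioned above can be sketched as follows. This is an illustration under assumptions, not the project's own tooling: the helper names are hypothetical, and "example-service" is a placeholder, since the summary does not say which services need the override.

```python
import platform

# Machine names that platform.machine() commonly reports on ARM hosts.
ARM_MACHINES = {"arm64", "aarch64"}


def needs_amd64_emulation(machine):
    """Return True if the given machine name looks like an ARM CPU."""
    return machine.lower() in ARM_MACHINES


def render_override(service_names):
    """Build docker-compose.override.yml text pinning services to linux/amd64."""
    lines = ["services:"]
    for name in service_names:
        lines.append(f"  {name}:")
        lines.append("    platform: linux/amd64")
    return "\n".join(lines) + "\n"


if needs_amd64_emulation(platform.machine()):
    # "example-service" is a placeholder; substitute the real service names.
    print(render_override(["example-service"]))
```

Docker Compose merges docker-compose.override.yml with the main compose file automatically when both sit in the same directory, so the generated text only needs to name the services that must run under emulation.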

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 4
  • Issues (30d): 0
  • Star history: 57 stars in the last 90 days
