ModelEngine-Group
Enterprise data platform for AI model development and RAG
Top 80.7% on SourcePulse
DataMate is an enterprise-grade data processing platform for AI model fine-tuning and Retrieval-Augmented Generation (RAG). It covers the end-to-end data lifecycle in a single product: data collection, management, cleaning, synthesis, annotation, evaluation, and knowledge generation.
How It Works
The platform employs a visual orchestration engine, enabling users to design complex data processing workflows via a drag-and-drop interface. Its core strength lies in a rich, extensible operator ecosystem, supporting both pre-built and custom operators. This modular approach facilitates efficient pipeline construction and integration of diverse data processing tasks.
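The operator idea can be illustrated with a small, purely hypothetical sketch. Nothing below is DataMate's actual API: the `Pipeline` class, the operator signature (a function that takes a record and returns a record, or `None` to drop it), and the operator names are all assumptions made for illustration of how pre-built and custom operators might be chained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical types: a record is a plain dict, and an operator maps a
# record to a new record, or to None when the record should be dropped.
Record = Dict[str, str]
Operator = Callable[[Record], Optional[Record]]


def strip_whitespace(record: Record) -> Optional[Record]:
    """Stand-in for a 'pre-built' operator: trim the text field."""
    return {**record, "text": record["text"].strip()}


def drop_empty(record: Record) -> Optional[Record]:
    """Stand-in for a 'pre-built' operator: discard empty records."""
    return record if record["text"] else None


def min_length(n: int) -> Operator:
    """Stand-in for a 'custom' operator: keep texts of at least n characters."""
    def op(record: Record) -> Optional[Record]:
        return record if len(record["text"]) >= n else None
    return op


@dataclass
class Pipeline:
    """Hypothetical linear pipeline that applies operators in order."""
    operators: List[Operator] = field(default_factory=list)

    def add(self, op: Operator) -> "Pipeline":
        self.operators.append(op)
        return self

    def run(self, records: List[Record]) -> List[Record]:
        kept = []
        for record in records:
            current: Optional[Record] = dict(record)
            for op in self.operators:
                current = op(current)
                if current is None:
                    break  # an operator dropped this record
            else:
                kept.append(current)  # every operator passed it through
        return kept
```

Under these assumptions, `Pipeline().add(strip_whitespace).add(drop_empty).add(min_length(5))` run over records with the texts `"  hello world  "`, `"   "`, and `"hi"` keeps only the first, trimmed to `"hello world"`. A visual editor would presumably build an equivalent operator chain from the drag-and-drop layout.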
Quick Start & Requirements
wget -qO docker-compose.yml https://raw.githubusercontent.com/ModelEngine-Group/DataMate/refs/heads/main/deployment/docker/datamate/docker-compose.yml \
&& REGISTRY=ghcr.io/modelengine-group/ docker compose up -d
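Once the containers have started, a quick reachability check can confirm that something is answering on the local port. The sketch below assumes the service is exposed at http://localhost:30000, the address listed in this section; the helper name `is_reachable` is hypothetical and not part of DataMate.

```python
import http.client
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` with any status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status; it is still up.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a malformed response.
        return False
```

Calling `is_reachable("http://localhost:30000")` shortly after `docker compose up -d` may return `False` while the containers are still initializing, so retrying after a short wait is reasonable.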
http://localhost:30000
[...] DEVELOPMENT.md, AGENTS.md, and service-specific READMEs within the repository structure.
[...] (make install-label-studio), Mineru PDF processing (make build-mineru, make install-mineru), and DeerFlow LLM service (make install-deer-flow).
Highlighted Details
Maintenance & Community
The project features active CI pipelines for backend and frontend services. Contributions are managed via standard GitHub Issues and Pull Requests. No dedicated community channels (e.g., Slack, Discord) or public roadmap are explicitly detailed in the README.
Licensing & Compatibility
DataMate is released under the permissive MIT license. This license allows use, modification, and distribution, including integration into commercial and closed-source applications, provided the copyright and license notice are retained.
Limitations & Caveats
The provided README does not explicitly detail known limitations, alpha/beta status, or specific unsupported platforms. Deployment complexity may vary based on the chosen method (Docker Compose vs. Kubernetes/Helm).