datapizza-ai  by datapizza-labs

Accelerate GenAI development and production deployment

Created 2 months ago
1,900 stars

Top 22.8% on SourcePulse

Project Summary

Datapizza AI is a Python framework for building reliable, production-ready generative AI (GenAI) applications with less overhead. It targets engineers and power users, and aims for faster development, predictable agent behavior, and easier debugging.

How It Works

The framework emphasizes an "API-first" and "observable by design" approach, offering "less abstraction, more control." It provides a vendor-agnostic core with clear interfaces, allowing seamless swapping of AI providers (OpenAI, Google Gemini, Anthropic, Mistral, Azure) and integration of tools like web search and document processing. Core functionalities include composable, reusable blocks, advanced document ingestion pipelines, and built-in observability via OpenTelemetry tracing for detailed performance monitoring and bottleneck identification.
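
As a rough illustration of the vendor-agnostic and observable-by-design ideas, the sketch below uses only the Python standard library. Every name in it (ChatClient, EchoClient, UpperClient, TracedClient, summarize) is hypothetical and is not taken from the datapizza-ai API. The timing list is a simple stand-in for a tracing span, not real OpenTelemetry instrumentation.

```python
import time
from dataclasses import dataclass, field
from typing import Protocol


class ChatClient(Protocol):
    """Hypothetical provider-neutral interface (not the datapizza-ai API)."""

    def invoke(self, prompt: str) -> str: ...


class EchoClient:
    """Stand-in for one provider: returns the prompt unchanged."""

    def invoke(self, prompt: str) -> str:
        return prompt


class UpperClient:
    """Stand-in for a second provider: returns the prompt upper-cased."""

    def invoke(self, prompt: str) -> str:
        return prompt.upper()


@dataclass
class TracedClient:
    """Wraps any ChatClient and records (client name, seconds) per call.

    A stand-in for a tracing span; a real setup would emit spans through
    an OpenTelemetry tracer instead of appending to a list.
    """

    inner: ChatClient
    spans: list[tuple[str, float]] = field(default_factory=list)

    def invoke(self, prompt: str) -> str:
        start = time.perf_counter()
        try:
            return self.inner.invoke(prompt)
        finally:
            elapsed = time.perf_counter() - start
            self.spans.append((type(self.inner).__name__, elapsed))


def summarize(client: ChatClient, text: str) -> str:
    """Application code depends only on the interface, not on a provider."""
    return client.invoke(f"Summarize: {text}")
```

Swapping providers then means passing a different object to summarize, and wrapping it in TracedClient adds timing without touching the application code.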

Quick Start & Requirements

Installation is straightforward via pip: pip install datapizza-ai. Additional client packages (e.g., datapizza-ai-clients-openai) can be installed separately. The framework requires Python 3.10+ and API keys for the chosen AI providers. A quick start guide and comprehensive documentation are available at docs.datapizza.ai.
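
The two stated requirements (Python 3.10+ and provider API keys) can be checked before running anything else. The helper below is a minimal sketch that uses only the standard library. The function name check_environment is hypothetical, and the names of the environment variables to check are an assumption to be confirmed against each provider's documentation.

```python
import os
import sys


def check_environment(required_vars, min_version=(3, 10), environ=None, version=None):
    """Return a list of human-readable problems; an empty list means ready.

    environ and version can be injected for testing; by default the real
    process environment and interpreter version are used.
    """
    environ = os.environ if environ is None else environ
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)

    problems = []
    if version < tuple(min_version):
        problems.append(
            f"Python {min_version[0]}.{min_version[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for name in required_vars:
        # Treat unset and empty variables the same way.
        if not environ.get(name):
            problems.append(f"missing environment variable: {name}")
    return problems
```

For example, check_environment(["OPENAI_API_KEY"]) returns an empty list when the interpreter is recent enough and that variable is set. The variable name here is the conventional one for OpenAI and is used only as an illustration.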

Highlighted Details

  • Multi-Provider & Tool Support: Integrates with OpenAI, Google Gemini, Anthropic, Mistral, and Azure, offering built-in web search, document processing, and custom tool capabilities.
  • Observability: Features OpenTelemetry tracing for end-to-end instrumentation, client I/O tracing, and custom spans to pinpoint performance bottlenecks.
  • Document Processing & RAG: Includes pipelines for parsing PDFs/DOCX/images, smart chunking, embedding generation, and RAG systems with query rewriting and retrieval.
  • Vendor Agnosticism: Designed for easy model and provider swapping without significant code rewiring, promoting flexibility and migration ease.
  • Multi-Agent Systems: Supports building complex collaborative AI systems with specialized agents.
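
To make the chunking step of a document pipeline concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a deliberately simple technique, not the framework's "smart chunking", and the function name chunk_text and its parameters are hypothetical.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a boundary still appears whole in at least one chunk.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a chunk has reached the end of the text.
        if start + size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and indexed for retrieval. Larger overlap improves recall at chunk boundaries at the cost of more chunks to embed.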

Maintenance & Community

The project is actively developed by "Datapizza, the AI native company." Community engagement is fostered through a Discord server (https://discord.gg/s5sJNHz2C8) and GitHub Issues for bug reports and feature requests. Contributions are welcomed across bug fixes, features, and documentation.

Licensing & Compatibility

Datapizza AI is released under the permissive MIT License, allowing for broad compatibility with commercial and closed-source projects.

Limitations & Caveats

The framework's "less abstraction, more control" philosophy may require more configuration effort than highly opinionated solutions. Each AI provider needs its own API key to be configured. The project does not state its alpha/beta status, known bugs, or deprecation schedule.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 40
  • Issues (30d): 40

Star History

1,917 stars in the last 30 days
