haystack by deepset-ai

AI orchestration framework for LLM application development

Created 6 years ago

24,300 stars

Top 1.7% on SourcePulse

32 Experts Love This Project

Tobi Lutke

Cofounder of Shopify

Rodrigo Nader

Cofounder of Langflow

Max Deichmann

Cofounder of Langfuse

Anton Troynikov

Cofounder of Chroma

and 28 more!

Project Summary

Haystack is an AI orchestration framework designed for building production-ready LLM applications, particularly those involving retrieval-augmented generation (RAG), question answering, semantic search, and conversational agents. It targets developers and researchers who need to connect various components like LLMs, vector databases, and file converters into flexible pipelines or agents to interact with custom data.

How It Works

Haystack uses a modular, pipeline-based architecture: users chain distinct components (e.g., retrievers, readers, generators) into end-to-end NLP workflows. The design is technology-agnostic, so a pipeline can integrate with, and switch between, different LLM providers (OpenAI, Cohere, Hugging Face, Azure, Bedrock, SageMaker) and vector databases. Because each component is a separate, replaceable unit, it is easier to experiment and to adopt new models or data stores as they appear.
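The pipeline idea described above can be illustrated with a small, standard-library-only sketch. This is not Haystack's API: `ToyPipeline`, `keyword_retriever`, `prompt_builder`, and `echo_generator` are hypothetical names invented for illustration, and the "generator" is a stand-in that returns a retrieved document instead of calling an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Toy corpus: plain strings standing in for indexed documents.
DOCS = [
    "Haystack pipelines chain components together.",
    "Vector databases store embeddings for retrieval.",
    "Paris is the capital of France.",
]


def keyword_retriever(data: Dict) -> Dict:
    """Hypothetical retriever: keep documents that share a word with the query."""
    terms = set(data["query"].lower().split())
    hits = [d for d in DOCS if terms & set(d.lower().rstrip(".").split())]
    return {**data, "documents": hits}


def prompt_builder(data: Dict) -> Dict:
    """Combine the retrieved documents and the query into one prompt string."""
    context = "\n".join(data["documents"])
    prompt = f"Context:\n{context}\n\nQuestion: {data['query']}"
    return {**data, "prompt": prompt}


def echo_generator(data: Dict) -> Dict:
    """Stand-in for an LLM call: answer with the first retrieved document."""
    docs = data["documents"]
    return {**data, "answer": docs[0] if docs else "No answer found."}


@dataclass
class ToyPipeline:
    """Runs steps in order, passing each step's output dict to the next."""
    steps: List[Callable[[Dict], Dict]] = field(default_factory=list)

    def add(self, step: Callable[[Dict], Dict]) -> "ToyPipeline":
        self.steps.append(step)
        return self

    def run(self, query: str) -> Dict:
        data: Dict = {"query": query}
        for step in self.steps:
            data = step(data)
        return data


pipeline = ToyPipeline().add(keyword_retriever).add(prompt_builder).add(echo_generator)
result = pipeline.run("What is the capital of France?")
print(result["answer"])
```

Swapping `echo_generator` for another function with the same dict-in, dict-out shape changes the final step without touching retrieval or prompt building, which is the kind of component-level substitution the modular design is meant to allow.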

Quick Start & Requirements

Install via pip: pip install haystack-ai
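Requires Python 3.8 or later (newer haystack-ai releases may require a more recent version; see the package's PyPI page)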
Official documentation: https://docs.haystack.deepset.ai/
Cookbook for recipes: https://haystack.deepset.ai/cookbook
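After running the install command above, one way to confirm the package is present is to query the installed distribution metadata. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, and the distribution name `haystack-ai` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("haystack-ai")
if found:
    print(f"haystack-ai version: {found}")
else:
    print("haystack-ai is not installed in this environment")
```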

Highlighted Details

Supports RAG, question answering, semantic search, and complex decision-making agents.
Offers flexibility to use various LLMs and vector databases, including local and cloud-hosted models.
Provides tools for data handling, including file conversion, cleaning, splitting, training, and evaluation.
Integrates with deepset Cloud for managed solutions and Hayhooks for self-hosted REST APIs.
Introduces deepset Studio for visual pipeline creation and deployment.

Maintenance & Community

Actively developed with regular updates; community contributions are encouraged.
Community support available via GitHub Discussions and Discord.
Follow on Twitter: @haystack_ai

Licensing & Compatibility

Apache 2.0 License.
Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

Haystack collects anonymous usage statistics about pipeline components by default; the documentation describes how to opt out. Although the framework is flexible, pipelines with many integrations can require significant configuration effort.
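A minimal sketch of opting out from Python, assuming the `HAYSTACK_TELEMETRY_ENABLED` environment variable described in the Haystack telemetry documentation. The variable name and accepted value are not stated on this page, so verify them against the documentation for the installed version.

```python
import os

# Assumption: Haystack checks this variable when it is imported, so it is
# set before any `import haystack` statement runs in the process.
os.environ["HAYSTACK_TELEMETRY_ENABLED"] = "False"

# import haystack  # uncomment once haystack-ai is installed

print(os.environ["HAYSTACK_TELEMETRY_ENABLED"])
```

Setting the variable in the shell or the deployment environment has the same effect and avoids depending on import order.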

Health Check

Last Commit

17 hours ago

Responsiveness

1 day

Pull Requests (30d)

165

Issues (30d)

Star History

351 stars in the last 30 days