haystack by deepset-ai

AI orchestration framework for LLM application development

Created 6 years ago
23,241 stars

Top 1.8% on SourcePulse

Project Summary

Haystack is an AI orchestration framework designed for building production-ready LLM applications, particularly those involving retrieval-augmented generation (RAG), question answering, semantic search, and conversational agents. It targets developers and researchers who need to connect various components like LLMs, vector databases, and file converters into flexible pipelines or agents to interact with custom data.

How It Works

Haystack uses a modular, pipeline-based architecture in which users chain distinct components (e.g., retrievers, readers, generators) into end-to-end NLP workflows. Its technology-agnostic design lets users integrate and switch between LLM providers (OpenAI, Cohere, Hugging Face, Azure, Bedrock, SageMaker) and vector databases, which makes it easier to experiment and to adapt as models and tooling change.
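
To make the idea of chaining swappable components concrete, here is a minimal sketch in plain Python. It does not use Haystack's API: `MiniPipeline`, `keyword_retriever`, `template_generator`, and the `DOCS` list are hypothetical names invented for this illustration, and the "retriever" is a naive keyword match rather than a real vector search.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical illustration only: none of these names come from Haystack.
# A component is any callable that reads the shared state and returns new keys.
Component = Callable[[Dict], Dict]


class MiniPipeline:
    """Runs named components in order, merging each output into shared state."""

    def __init__(self) -> None:
        self._steps: List[Tuple[str, Component]] = []

    def add(self, name: str, component: Component) -> "MiniPipeline":
        self._steps.append((name, component))
        return self  # allow chained .add(...) calls

    def run(self, data: Dict) -> Dict:
        for _name, component in self._steps:
            data = {**data, **component(data)}
        return data


# Toy document store standing in for a vector database.
DOCS = ["Haystack chains components into pipelines.", "Bananas are yellow."]


def keyword_retriever(data: Dict) -> Dict:
    """Return documents sharing at least one word with the query."""
    words = set(data["query"].lower().split())
    hits = [d for d in DOCS if words & set(d.lower().rstrip(".").split())]
    return {"documents": hits}


def template_generator(data: Dict) -> Dict:
    """Stand-in for an LLM call: formats the retrieved context into an answer."""
    context = " ".join(data["documents"]) or "no context found"
    return {"answer": f"Based on: {context}"}


pipeline = (
    MiniPipeline()
    .add("retriever", keyword_retriever)
    .add("generator", template_generator)
)
result = pipeline.run({"query": "haystack pipelines"})
print(result["answer"])
```

Because each step only depends on the keys it reads and writes, swapping `template_generator` for a different callable (for example, one that calls a hosted model) would not require changes to the retriever. That is the property the modular design described above is meant to provide.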

Quick Start & Requirements

Highlighted Details

  • Supports RAG, question answering, semantic search, and complex decision-making agents.
  • Offers flexibility to use various LLMs and vector databases, including local and cloud-hosted models.
  • Provides tools for data handling (file conversion, cleaning, and splitting), as well as for training and evaluation.
  • Integrates with deepset Cloud for managed solutions and Hayhooks for self-hosted REST APIs.
  • Introduces deepset Studio for visual pipeline creation and deployment.

Maintenance & Community

  • Active development with regular updates and community contributions encouraged.
  • Community support available via GitHub Discussions and Discord.
  • Follow on Twitter: @haystack_ai

Licensing & Compatibility

  • Apache 2.0 License.
  • Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

Haystack collects anonymous usage statistics about pipeline components by default; users can opt out by following the instructions in the documentation. Although the framework is flexible, managing complex pipelines with many integrations may require significant configuration effort.
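
As a sketch of what opting out can look like: to our knowledge, Haystack's telemetry documentation describes an environment variable named `HAYSTACK_TELEMETRY_ENABLED`. Treat the variable name and value below as an assumption and confirm them against the telemetry page for the version you are running.

```python
import os

# Assumption: variable name and value as described in Haystack's telemetry
# documentation; verify against the docs for your installed version.
os.environ["HAYSTACK_TELEMETRY_ENABLED"] = "False"

# Set the variable before Haystack is imported so it is visible at import time.
# import haystack
```

Setting the variable in the shell or deployment environment instead of in code would have the same effect and keeps the choice out of application logic.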

Health Check

  • Last commit: 4 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 125
  • Issues (30d): 44

Star History

  • 429 stars in the last 30 days
