haystack by deepset-ai

AI orchestration framework for LLM application development

created 5 years ago
21,712 stars

Top 2.0% on sourcepulse

Project Summary

Haystack is an AI orchestration framework designed for building production-ready LLM applications, particularly those involving retrieval-augmented generation (RAG), question answering, semantic search, and conversational agents. It targets developers and researchers who need to connect various components like LLMs, vector databases, and file converters into flexible pipelines or agents to interact with custom data.

How It Works

Haystack uses a modular, pipeline-based architecture: users chain distinct components (e.g., retrievers, readers, generators) into end-to-end NLP workflows. Its technology-agnostic design lets users integrate and switch between LLM providers (OpenAI, Cohere, Hugging Face, Azure, Bedrock, SageMaker) and vector databases, which makes it easier to experiment and to adopt new models and tools as they appear.
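
The chaining idea can be illustrated with a minimal, library-free sketch. This is not Haystack's API: `MiniPipeline`, `keyword_retriever`, `template_generator`, `uppercase_generator`, and `build_pipeline` are hypothetical names invented for illustration, and the "retriever" is a toy keyword matcher rather than a vector search. The sketch only shows how components with a shared input/output contract can be chained and swapped.

```python
import re
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical types for this sketch (not Haystack's API): every component
# takes the shared state dict and returns a dict of new or updated keys.
State = Dict[str, Any]
Component = Callable[[State], State]


class MiniPipeline:
    """Runs named components in order, merging each output into shared state."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, Component]] = []

    def add(self, name: str, component: Component) -> "MiniPipeline":
        self.steps.append((name, component))
        return self  # allow chained .add(...) calls

    def run(self, inputs: State) -> State:
        state = dict(inputs)  # copy so the caller's dict is not mutated
        for _name, component in self.steps:
            state.update(component(state))
        return state


# Toy in-memory "document store" standing in for a vector database.
DOCS = [
    "Pipelines chain retrievers and generators.",
    "Haystack is released under the Apache 2.0 license.",
]


def tokens(text: str) -> set:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_retriever(state: State) -> State:
    """Pick the document sharing the most words with the query."""
    query_tokens = tokens(state["query"])
    best = max(DOCS, key=lambda doc: len(query_tokens & tokens(doc)))
    return {"documents": [best]}


def template_generator(state: State) -> State:
    """Stand-in for an LLM call: wraps the retrieved text in a template."""
    return {"answer": f"Based on the retrieved text: {state['documents'][0]}"}


def uppercase_generator(state: State) -> State:
    """A second stand-in 'provider' with the same contract."""
    return {"answer": state["documents"][0].upper()}


def build_pipeline(generator: Component) -> MiniPipeline:
    # Only the generator changes; the retriever step is reused as-is.
    return MiniPipeline().add("retriever", keyword_retriever).add("generator", generator)


if __name__ == "__main__":
    query = {"query": "What license does Haystack use?"}
    print(build_pipeline(template_generator).run(query)["answer"])
    print(build_pipeline(uppercase_generator).run(query)["answer"])
```

Because both generators accept and return the same dictionary shape, swapping one for the other requires no change to the retriever step, which is the property the provider-switching claim above relies on.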

Quick Start & Requirements

Highlighted Details

  • Supports RAG, question answering, semantic search, and complex decision-making agents.
  • Offers flexibility to use various LLMs and vector databases, including local and cloud-hosted models.
  • Provides tools for file conversion, cleaning, and splitting, as well as for training and evaluation.
  • Integrates with deepset Cloud for managed solutions and Hayhooks for self-hosted REST APIs.
  • Introduces deepset Studio for visual pipeline creation and deployment.

Maintenance & Community

  • Active development with regular updates and community contributions encouraged.
  • Community support available via GitHub Discussions and Discord.
  • Follow on Twitter: @haystack_ai

Licensing & Compatibility

  • Apache 2.0 License.
  • Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

Haystack collects anonymous usage statistics about pipeline components by default; the documentation describes how to opt out. Although the framework is flexible, pipelines that combine many integrations may require significant configuration effort.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 57
  • Issues (30d): 44

Star History

1,291 stars in the last 90 days

Explore Similar Projects

Starred by Lewis Tunstall (Researcher at Hugging Face), Lysandre Debut (Chief Open-Source Officer at Hugging Face), and 3 more.

FARM by deepset-ai

0%
2k
NLP framework for transfer learning with BERT & Co
created 6 years ago
updated 1 year ago
Starred by Omar Sanseviero (DevRel at Google DeepMind), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 4 more.

argilla by argilla-io

0.4%
5k
Collaboration tool for building high-quality AI datasets
created 4 years ago
updated 4 days ago