synthesizer by SciPhi-AI

LLM framework for RAG and data creation

Created 2 years ago
629 stars

Top 52.7% on SourcePulse

Project Summary

Synthesizer[ΨΦ] is a Python framework designed for generating synthetic data and implementing Retrieval-Augmented Generation (RAG) pipelines. It targets developers and researchers needing to create custom datasets for LLM training or RAG systems, and those looking to quickly evaluate RAG performance against real-world data sources. The framework offers integrated RAG capabilities and supports multiple LLM providers.

How It Works

Synthesizer employs a modular architecture, allowing users to integrate various LLM providers (Anthropic, OpenAI, vLLM, HuggingFace, SciPhi) and RAG providers (e.g., Agent Search API). It facilitates custom data creation by leveraging LLMs to generate tailored datasets, and enables RAG pipeline evaluation through its rag_harness script, which can benchmark performance against specified data sources and LLM configurations.
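The provider-based design described above can be illustrated with a minimal sketch. It does not use the synthesizer package's actual API: every name here (LLMProvider, RAGProvider, EchoLLM, KeywordRAG, answer_with_context) is hypothetical and only shows how interchangeable LLM and retrieval backends might be composed into one RAG call.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Hypothetical interface: turns a prompt into a completion."""

    def complete(self, prompt: str) -> str: ...


class RAGProvider(Protocol):
    """Hypothetical interface: returns up to k context passages for a query."""

    def retrieve(self, query: str, k: int = 2) -> list[str]: ...


class EchoLLM:
    """Stand-in backend (assumption): returns the last line of the prompt."""

    def complete(self, prompt: str) -> str:
        return prompt.strip().splitlines()[-1]


class KeywordRAG:
    """Stand-in retriever (assumption): ranks documents by shared words."""

    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        terms = set(query.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def answer_with_context(llm: LLMProvider, rag: RAGProvider, query: str) -> str:
    """Retrieve context, build a prompt, and pass it to whichever LLM is plugged in."""
    context = "\n".join(rag.retrieve(query))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return llm.complete(prompt)
```

Because both roles are expressed as interfaces, swapping EchoLLM for a client that calls a hosted model would not require changes to answer_with_context.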

Quick Start & Requirements

  • Primary install: pip install sciphi-synthesizer
  • Prerequisites: SCIPHI_API_KEY environment variable.
  • Documentation: Synthesizer Documentation
  • Community: Discord
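Since the only stated prerequisite is the SCIPHI_API_KEY environment variable, a script may want to fail early when it is missing. The helper below is a small sketch using only the Python standard library; require_api_key is a hypothetical name and is not part of the package.

```python
import os


def require_api_key(name: str = "SCIPHI_API_KEY") -> str:
    """Return the key from the environment, or raise if it is unset or blank."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running.")
    return key
```

Calling require_api_key() at the top of a script surfaces a missing key before any network request is attempted.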

Highlighted Details

  • Supports custom data generation for LLM training and RAG.
  • Built-in RAG provider interface with turnkey integration for Agent Search API.
  • Enables RAG pipeline performance evaluation.
  • Offers integration with multiple LLM providers including OpenAI, Anthropic, vLLM, and HuggingFace.
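To make the idea of RAG pipeline evaluation concrete, here is a minimal sketch of scoring a pipeline against question/answer pairs. It is not the rag_harness script, and the source does not say which metrics that script reports; exact-match accuracy and the names normalize and exact_match_accuracy are assumptions chosen for illustration.

```python
from typing import Callable, Iterable


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as misses."""
    return " ".join(text.lower().split())


def exact_match_accuracy(
    pipeline: Callable[[str], str],
    examples: Iterable[tuple[str, str]],
) -> float:
    """Fraction of questions whose normalized answer equals the reference."""
    pairs = list(examples)
    if not pairs:
        return 0.0
    hits = sum(normalize(pipeline(q)) == normalize(ref) for q, ref in pairs)
    return hits / len(pairs)
```

Any callable that maps a question to an answer string can be scored this way, so the same function can compare different LLM or retrieval configurations on one dataset.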

Maintenance & Community

The project actively engages its community via Discord and provides email support for inquiries.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README.

Limitations & Caveats

The README requires a SCIPHI_API_KEY, which suggests that at least some functionality depends on SciPhi's hosted services. The only RAG provider integration it highlights is "agent-search". With the last commit made a year ago, the project may no longer be under active development.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
