embedchainjs by mem0ai

JavaScript framework for LLM-powered bots over any dataset

created 2 years ago
326 stars

Top 84.8% on sourcepulse

Project Summary

embedchainjs is a JavaScript framework designed to simplify the creation of LLM-powered chatbots that can query any dataset. It abstracts the complexities of data loading, chunking, embedding generation, and vector storage, enabling users to quickly build bots over web pages, PDF files, or custom QnA pairs. The framework targets developers looking for an easy-to-use solution for RAG (Retrieval-Augmented Generation) applications.

How It Works

embedchainjs uses LangChain for LLM orchestration, OpenAI's Ada model for embeddings, and the ChatGPT API for generating answers from retrieved context. It handles the entire RAG pipeline: data ingestion, chunking, embedding, and storage in a vector database (ChromaDB by default, which requires Docker). When a query is made, it embeds the query, retrieves similar documents from the vector store, and passes them as context to the LLM, which produces the final answer.
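
The query-time flow described above can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not embedchainjs code: `toyEmbed` is a hypothetical word-count stand-in for the Ada embedding model, the in-memory array stands in for ChromaDB, and the prompt wording in `buildPrompt` is an assumption. No LLM call is made; the sketch stops at the prompt that would be sent.

```javascript
// Hypothetical stand-in for an embedding model: a sparse word-count vector.
// ASSUMPTION: the real framework gets dense vectors from OpenAI's Ada model.
function toyEmbed(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Cosine similarity between two sparse word-count vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [word, n] of a) {
    dot += n * (b.get(word) || 0);
    normA += n * n;
  }
  for (const n of b.values()) normB += n * n;
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Ingestion: keep each chunk next to its vector (in-memory stand-in for ChromaDB).
function ingest(chunks) {
  return chunks.map((text) => ({ text, vector: toyEmbed(text) }));
}

// Query time: embed the query, rank stored chunks by similarity, keep the top k.
function retrieve(store, query, k = 1) {
  const queryVector = toyEmbed(query);
  return store
    .map((doc) => ({ text: doc.text, score: cosine(queryVector, doc.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Assemble the prompt that would be passed to the LLM (wording is illustrative).
function buildPrompt(query, hits) {
  const context = hits.map((hit) => hit.text).join("\n");
  return `Answer using only this context.\nContext:\n${context}\nQuestion: ${query}`;
}

const store = ingest([
  "ChromaDB is the database that stores embedding vectors.",
  "PDF files are split into chunks.",
]);
const question = "Which database stores the embedding vectors?";
const hits = retrieve(store, question, 1);
const prompt = buildPrompt(question, hits);
console.log(prompt);
```

Swapping `toyEmbed` for a real embedding call and the array for a vector database changes the quality of retrieval but not the shape of the pipeline.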

Quick Start & Requirements

  • Install: npm install embedchain && npm install -S openai@^3.3.0
  • Prerequisites:
    • Node.js
    • openai package version 3.x (not 4.x)
    • Docker for ChromaDB
    • OpenAI API Key (set as OPENAI_API_KEY in a .env file)
  • Setup: Set up ChromaDB in Docker and configure the OpenAI API key before running a bot.
  • Docs: https://github.com/mem0ai/embedchainjs
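
Because two of the prerequisites above fail only at runtime (a missing API key, a 4.x openai install), a small preflight check can surface them early. The helper below is hypothetical and not part of embedchainjs; it takes the environment object and the installed openai version string as plain arguments so it stays self-contained.

```javascript
// Hypothetical preflight helper; NOT part of embedchainjs.
// Returns a list of human-readable problems (empty when everything looks fine).
function checkPrerequisites(env, openaiVersion) {
  const problems = [];
  if (!env.OPENAI_API_KEY) {
    problems.push("OPENAI_API_KEY is not set");
  }
  // Only the major version matters: the framework expects openai 3.x.
  const major = Number.parseInt(String(openaiVersion).split(".")[0], 10);
  if (major !== 3) {
    problems.push(`openai package must be 3.x, found ${openaiVersion}`);
  }
  return problems;
}

// A 4.x install with no key set reports both problems.
const failing = checkPrerequisites({}, "4.20.0");
console.log(failing);

// A 3.x install with a key set reports none.
const passing = checkPrerequisites({ OPENAI_API_KEY: "sk-example" }, "3.3.0");
console.log(passing);
```

In a real project the first argument would typically be `process.env` (after the .env file has been loaded) and the second the version string of the installed openai package.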

Highlighted Details

  • Supports adding data from web pages, local PDF files, and QnA pairs.
  • Includes a dryRun method to test retrieval without consuming full prompt tokens.
  • Abstracts common RAG challenges like chunking strategy and embedding model selection.
  • Built on LangChain, OpenAI embeddings (Ada), OpenAI LLM (ChatGPT), and ChromaDB.
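
To make the chunking step concrete, here is a minimal sketch of one common strategy: fixed-size character windows with overlap. It is an illustration only; the source does not state which chunking strategy or sizes embedchainjs uses, and `chunkText` with its parameters is a hypothetical name.

```javascript
// Hypothetical fixed-size chunker with overlap; NOT embedchainjs's own strategy.
// size: maximum characters per chunk; overlap: characters shared by neighbours.
function chunkText(text, size, overlap) {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError("require size > 0 and 0 <= overlap < size");
  }
  const chunks = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once a chunk has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

// Windows of 4 characters that share 1 character with the previous window.
const pieces = chunkText("abcdefghij", 4, 1);
console.log(pieces); // three chunks: "abcd", "defg", "ghij"
```

The overlap keeps a sentence that straddles a boundary at least partly present in both neighbouring chunks, at the cost of storing some text twice.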

Maintenance & Community

  • Author: Taranjeet Singh (@taranjeetio)
  • Citation details are provided for academic use.

Licensing & Compatibility

  • License: Not explicitly stated in the README, but the Python counterpart is MIT.
  • Compatibility: Requires version 3.x of the openai package.

Limitations & Caveats

The framework requires an older version of the openai package (3.x, not 4.x), which may conflict with newer Node.js projects or with other dependencies that expect 4.x. The default vector database, ChromaDB, requires Docker, which adds operational overhead. Data formats beyond web pages, PDF files, and QnA pairs are not yet supported and are added in response to user requests.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 90 days
