JavaScript framework for LLM-powered bots over any dataset
Top 84.8% on sourcepulse
embedchainjs is a JavaScript framework designed to simplify the creation of LLM-powered chatbots that can query any dataset. It abstracts the complexities of data loading, chunking, embedding generation, and vector storage, enabling users to quickly build bots over web pages, PDF files, or custom QnA pairs. The framework targets developers looking for an easy-to-use solution for RAG (Retrieval-Augmented Generation) applications.
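A minimal usage sketch, assuming the App factory and the add/addLocal/query methods described in the project's documentation; the loader type strings and URLs below are illustrative placeholders and should be checked against the installed version:

```js
const dotenv = require("dotenv");
dotenv.config(); // loads OPENAI_API_KEY from a local .env file

const { App } = require("embedchain");

async function main() {
  // Create a bot instance; by default this talks to a local ChromaDB server.
  const bot = await App();

  // Ingest a web page, a PDF file, and a custom QnA pair.
  await bot.add("web_page", "https://example.com/article");
  await bot.add("pdf_file", "https://example.com/whitepaper.pdf");
  await bot.addLocal("qna_pair", [
    "What does this project do?",
    "It builds LLM-powered bots over your own data.",
  ]);

  // Query: retrieval and the ChatGPT call happen under the hood.
  const answer = await bot.query("What is the whitepaper about?");
  console.log(answer);
}

main();
```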
How It Works
embedchainjs leverages LangChain for its core LLM orchestration, using OpenAI's Ada model for embeddings and the ChatGPT API to generate answers from retrieved context. It handles the entire RAG pipeline: data ingestion, chunking, embedding, and storage in a vector database (ChromaDB by default, which requires Docker). When a query is made, it embeds the query, retrieves similar documents from the vector store, and passes them as context to the LLM to produce the final answer.
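The query path can be pictured roughly as follows. This is an illustrative sketch of the pipeline using the openai 3.x SDK, not the library's internal code; retrieveSimilarDocs is a hypothetical placeholder for the ChromaDB similarity search:

```js
const { Configuration, OpenAIApi } = require("openai"); // openai 3.x SDK

const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

// Hypothetical helper: ask the vector store (ChromaDB by default) for the
// stored chunks whose embeddings are closest to the query embedding.
async function retrieveSimilarDocs(queryEmbedding, topK) {
  return []; // placeholder: a real implementation queries the vector database
}

async function answerQuery(query) {
  // 1. Embed the query with OpenAI's Ada embedding model.
  const embRes = await openai.createEmbedding({
    model: "text-embedding-ada-002",
    input: query,
  });
  const queryEmbedding = embRes.data.data[0].embedding;

  // 2. Retrieve the most similar chunks from the vector store.
  const contextChunks = await retrieveSimilarDocs(queryEmbedding, 3);

  // 3. Pass the chunks as context to the ChatGPT API for the final answer.
  const chatRes = await openai.createChatCompletion({
    model: "gpt-3.5-turbo",
    messages: [
      {
        role: "user",
        content:
          "Use the following context to answer the query.\n" +
          `Context:\n${contextChunks.join("\n")}\n\nQuery: ${query}`,
      },
    ],
  });
  return chatRes.data.choices[0].message.content;
}
```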
Quick Start & Requirements
- Install: npm install embedchain && npm install -S openai@^3.3.0
- Requires the openai package at version 3.x (not 4.x).
- Requires an OPENAI_API_KEY, set in a .env file (see the snippet below).
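For example, a minimal .env file in the project root (the key value is a placeholder):

```
# .env
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxx
```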
Highlighted Details

- A dryRun method to test retrieval without consuming full prompt tokens (see the sketch below).
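A short sketch of how dryRun might be called, assuming it takes the same query string as query and returns the constructed prompt rather than a model answer:

```js
async function inspectPrompt(bot) {
  // dryRun performs retrieval and prompt construction but skips the ChatGPT
  // call, so no full prompt tokens are consumed.
  const prompt = await bot.dryRun("What is the whitepaper about?");
  console.log(prompt); // inspect the retrieved context before spending tokens
}
```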
Maintenance & Community

Licensing & Compatibility
Compatibility is constrained by the required openai package version (3.x).

Limitations & Caveats
The framework currently requires a specific older version of the openai package (3.x), which may cause compatibility issues with newer Node.js projects or other dependencies. The default vector database, ChromaDB, requires Docker, which adds operational overhead. Support for additional data formats is pending user requests.
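For reference, a locally running Chroma server is commonly started with Docker along these lines (the image name and port are assumptions about a standard ChromaDB setup, not commands taken from this project):

```
docker run -d -p 8000:8000 chromadb/chroma
```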