lotus  by lotus-data

Query engine for LLM-powered data processing using semantic operators

Created 1 year ago
1,550 stars

Top 26.4% on SourcePulse

Project Summary

LOTUS is a semantic query engine designed for efficient LLM-powered data processing, targeting data scientists and engineers who need to build complex reasoning pipelines over structured and unstructured data. It offers a declarative, Pandas-like API with semantic operators that leverage natural language expressions for data transformations, simplifying the creation of AI-driven analytics.

How It Works

LOTUS implements a semantic operator model, extending traditional relational operators with natural language predicates. This approach allows users to define data operations (like joins, filters, and aggregations) using high-level, human-readable expressions. The engine then optimizes and executes these operations using various AI-based algorithms, abstracting away the underlying LLM complexities and enabling flexible, composable AI pipelines.
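To make the semantic operator model concrete, here is a minimal, standard-library-only sketch of a filter and a join driven by natural language expressions. It is not LOTUS's implementation: the helper names (render, sem_filter, sem_join as free functions), the toy_judge lookup table that stands in for an LLM call, and the sample rows are all assumptions made for illustration. The join is a naive nested loop with none of the optimization the engine is described as performing.

```python
import re
from typing import Callable, Dict, List

Row = Dict[str, str]
Judge = Callable[[str], bool]  # hypothetical stand-in for a yes/no LLM call


def render(langex: str, row: Row) -> str:
    """Fill each {Column} placeholder in the expression with the row's value."""
    return re.sub(r"\{([^{}]+)\}", lambda m: str(row[m.group(1)]), langex)


def sem_filter(rows: List[Row], langex: str, judge: Judge) -> List[Row]:
    """Keep the rows for which the judge accepts the rendered expression."""
    return [row for row in rows if judge(render(langex, row))]


def sem_join(left: List[Row], right: List[Row], langex: str, judge: Judge) -> List[Row]:
    """Naive nested-loop join: test every left/right pair against the expression."""
    joined = []
    for left_row in left:
        for right_row in right:
            merged = {**left_row, **right_row}
            if judge(render(langex, merged)):
                joined.append(merged)
    return joined


# Toy "model": a fixed set of statements treated as true (illustration only).
KNOWN_TRUE = {
    "Linear Algebra requires a lot of math",
    "Taking Linear Algebra will help me learn Math",
}


def toy_judge(prompt: str) -> bool:
    return prompt in KNOWN_TRUE


courses = [{"Course Name": "Linear Algebra"}, {"Course Name": "Art History"}]
skills = [{"Skill": "Math"}, {"Skill": "Painting"}]

math_heavy = sem_filter(courses, "{Course Name} requires a lot of math", toy_judge)
matches = sem_join(courses, skills, "Taking {Course Name} will help me learn {Skill}", toy_judge)
```

The point of the sketch is the shape of the interface: the predicate is a sentence with column placeholders, and the decision is delegated to a model rather than to a boolean expression over column values.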

Quick Start & Requirements

  • Installation: pip install lotus-ai (stable) or pip install git+https://github.com/lotus-data/lotus.git@main (latest).
  • Prerequisites: Python 3.10, with Conda recommended. Mac users need a specific Faiss installation (CPU or GPU). Requires access to an LLM (e.g., OpenAI, Ollama, vLLM), including API keys for hosted providers; the model is configured via lotus.settings.configure(lm=lm).
  • Resources: LLM API usage costs apply.
  • Docs/Demo: Colab tutorial, Documentation, Examples.
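Before installing, the prerequisites above can be checked with a small script. The sketch below is a hypothetical helper, not part of LOTUS: the function name preflight is invented here, and it assumes an OpenAI-hosted model whose key is read from the OPENAI_API_KEY environment variable. It flags any interpreter other than 3.10 only because that is the version named in the prerequisites; other versions may well work.

```python
import os
import sys
from typing import List, Mapping, Tuple


def preflight(version: Tuple[int, int], env: Mapping[str, str],
              key_name: str = "OPENAI_API_KEY") -> List[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    # Assumption: treat anything other than 3.10 as worth a warning, not an error.
    if tuple(version) != (3, 10):
        warnings.append(
            f"Python {version[0]}.{version[1]} differs from the 3.10 named in the prerequisites"
        )
    # Assumption: a hosted provider is used, so a missing key is worth flagging.
    if not env.get(key_name):
        warnings.append(f"{key_name} is not set; a hosted provider would reject requests")
    return warnings


if __name__ == "__main__":
    for line in preflight(sys.version_info[:2], os.environ):
        print("warning:", line)
```

For a locally served model (for example through Ollama or vLLM), the key check may not apply and can be skipped by passing a different key_name or ignoring that warning.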

Highlighted Details

  • Supports a wide range of LLMs via LiteLLM, and uses SentenceTransformers models for retrieval and reranking.
  • Offers semantic operators like sem_join, sem_filter, sem_map, sem_extract, sem_agg, sem_topk, sem_sim_join, and sem_search.
  • Integrates seamlessly with Pandas DataFrames.
  • Leverages natural language expressions for defining complex data operations.
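The retrieval-style operators in the list above (such as sem_search) rank rows by similarity to a query rather than by a yes/no judgment. The sketch below illustrates that idea using only the standard library. It is an assumption-laden toy: toy_embed is a bag-of-words counter standing in for a SentenceTransformers embedding, and the free function sem_search with its (rows, column, query, k) signature is hypothetical, not the LOTUS API.

```python
import math
from collections import Counter
from typing import Dict, List

Row = Dict[str, str]


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sem_search(rows: List[Row], column: str, query: str, k: int) -> List[Row]:
    """Return the k rows whose text in `column` is most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(
        rows,
        key=lambda row: cosine(toy_embed(row[column]), query_vec),
        reverse=True,
    )
    return ranked[:k]


docs = [
    {"Title": "intro to linear algebra"},
    {"Title": "history of modern art"},
    {"Title": "linear models for regression"},
]

top_two = sem_search(docs, "Title", "linear algebra basics", k=2)
```

A real embedding model would also match paraphrases that share no words with the query, which a word-count vector cannot do; the ranking logic stays the same.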

Maintenance & Community

Health Check

  • Last commit: 6 days ago
  • Responsiveness: 1 day
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 18 stars in the last 30 days
