pandas-ai  by sinaptik-ai

Python SDK for conversational data analysis using LLMs and RAG

created 2 years ago
21,198 stars

Top 2.1% on sourcepulse

Project Summary

PandasAI is a Python library that makes data analysis conversational: users query databases, CSV files, and data lakes in natural language. It targets both technical and non-technical users, and aims to simplify data interaction and speed up analysis workflows through Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG).

How It Works

PandasAI integrates with LLMs to interpret natural language queries and translate them into executable code (e.g., Python/Pandas or SQL). It supports RAG for enhanced context and accuracy, allowing it to query various data sources. The library can also generate visualizations based on user prompts, and it supports querying across multiple DataFrames. A key feature is its Docker sandbox for secure code execution.
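
The query-to-code loop described above can be sketched in a few lines. This is a minimal illustration of the general pattern only, not PandasAI's API: `fake_llm`, `answer`, and the `SALES` rows are hypothetical names made up for this example, the "model" is a hard-coded stub, and a list of dicts stands in for a DataFrame so the snippet runs with the standard library alone.

```python
from typing import Any, Callable

# Toy in-memory table: a list of row dicts standing in for a DataFrame.
SALES = [
    {"country": "US", "revenue": 5000},
    {"country": "UK", "revenue": 3200},
    {"country": "FR", "revenue": 2900},
]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns Python source as text.

    A real model would generate this from the prompt. Here it is hard-coded.
    """
    return "result = sum(row['revenue'] for row in rows)"


def answer(question: str, rows: list, llm: Callable[[str], str] = fake_llm) -> Any:
    # 1. Build a prompt that describes the data and carries the question.
    columns = sorted(rows[0]) if rows else []
    prompt = (
        f"Columns: {columns}\n"
        f"Question: {question}\n"
        "Write Python that assigns the answer to `result`."
    )
    # 2. Ask the model for code.
    code = llm(prompt)
    # 3. Execute the code in a namespace that exposes only the data and a
    #    handful of builtins. This limits accidents; it is NOT a security
    #    boundary.
    namespace = {
        "__builtins__": {"sum": sum, "len": len, "max": max, "min": min},
        "rows": rows,
    }
    exec(code, namespace)
    return namespace["result"]


total = answer("What is the total revenue?", SALES)
print(total)
```

Swapping `fake_llm` for a real model client would leave the surrounding steps (describe the data, request code, execute, read back `result`) unchanged, which is why the model is passed in as a plain callable.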

Quick Start & Requirements

Highlighted Details

  • Supports querying SQL databases, CSVs, and Parquet files.
  • Enables natural language querying across multiple DataFrames.
  • Offers a Docker sandbox for secure code execution.
  • Can generate data visualizations based on natural language prompts.
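
To show why executing generated code outside the host process matters, here is a rough sketch that runs a code string in a separate Python interpreter with a timeout. It is an assumption-laden stand-in: the source only says PandasAI offers a Docker sandbox, and a plain subprocess does not provide container-level isolation. `run_isolated` and the sample data are hypothetical names for this example.

```python
import json
import subprocess
import sys


def run_isolated(code: str, data: list, timeout: float = 5.0):
    """Run `code` in a fresh interpreter and return its `result` value.

    The code receives the data as `rows` and must assign a
    JSON-serialisable value to `result`.
    """
    wrapper = (
        "import json, sys\n"
        "rows = json.load(sys.stdin)\n"
        f"{code}\n"
        "print(json.dumps(result))\n"
    )
    proc = subprocess.run(
        # -I starts Python in isolated mode (ignores user site-packages
        # and PYTHON* environment variables).
        [sys.executable, "-I", "-c", wrapper],
        input=json.dumps(data),
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired on runaway code
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return json.loads(proc.stdout)


# Pretend this string came back from an LLM.
generated = "result = max(r['revenue'] for r in rows)"
best = run_isolated(generated, [{"revenue": 5000}, {"revenue": 3200}])
print(best)
```

A crash or infinite loop in the generated code stays in the child process here. A container adds what this sketch lacks: filesystem, network, and resource isolation.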

Maintenance & Community

  • Beta Notice: Release v3 is currently in beta.
  • Community: Discord server available for discussions.
  • Contributing: Guidelines are provided for contributions.

Licensing & Compatibility

  • License: MIT Expat license for the core library. The pandasai/ee directory has a separate license.
  • Commercial Use: Compatibility for commercial use is implied by the MIT license for the core library, but specific terms for enterprise offerings should be confirmed.

Limitations & Caveats

Release v3 is currently in beta, so features and functionality are still in progress and subject to change. The default LLM (BambooLLM) requires an API key. Other LLMs can be configured, but the README does not explicitly describe how.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 day
  • Pull requests (30d): 3
  • Issues (30d): 15
  • Star history: 1,423 stars in the last 90 days
