Data science tool for conversational data analysis using LLMs
DataHorse is an open-source Python library and tool that democratizes data science by enabling users to interact with, analyze, and visualize data, as well as build machine learning models, using plain English commands. It is designed for business users and individuals without technical expertise, allowing them to derive insights and make data-driven decisions quickly and easily.
How It Works
DataHorse leverages Large Language Models (LLMs) to interpret natural language instructions and translate them into executable data manipulation, analysis, and machine learning operations. This conversational approach abstracts away complex syntax and coding requirements, making data science accessible to a broader audience. The library supports data modification, visualization, model training, and testing through a simple, chat-like interface.
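The translate-then-execute flow described above can be illustrated with a minimal sketch. This is not DataHorse's implementation or API: the functions translate and run, the keyword matching that stands in for the LLM, and the sample rows are all hypothetical, chosen only to show how a plain-English request might become a structured operation that is then applied to data.

```python
from statistics import mean

# Hypothetical operations a request can be mapped to.
OPERATIONS = {"mean": mean, "sum": sum, "max": max}


def translate(request, columns):
    """Stand-in for the LLM step (an assumption, not DataHorse code).

    Maps a plain-English request to a small plan: which operation to run
    and which column to run it on. A real system would ask a model.
    """
    text = request.lower()
    # Pick the first column whose name appears in the request.
    column = next((c for c in columns if c.lower() in text), None)
    if column is None:
        raise ValueError(f"no known column mentioned in: {request!r}")
    if "average" in text or "mean" in text:
        op = "mean"
    elif "total" in text or "sum" in text:
        op = "sum"
    elif "highest" in text or "max" in text:
        op = "max"
    else:
        raise ValueError(f"cannot interpret request: {request!r}")
    return {"op": op, "column": column}


def run(request, rows):
    """Translate the request into a plan, then execute it against the rows."""
    plan = translate(request, list(rows[0].keys()))
    values = [row[plan["column"]] for row in rows]
    return OPERATIONS[plan["op"]](values)


# Hypothetical sample data.
rows = [
    {"region": "north", "sales": 120, "units": 3},
    {"region": "south", "sales": 80, "units": 2},
    {"region": "east", "sales": 100, "units": 5},
]

print(run("What is the average sales?", rows))
print(run("Show the total units", rows))
```

Keeping the translation step separate from execution means the plan can be inspected or validated before anything runs, which matters when the translator is a model whose output is not guaranteed to be correct.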
Quick Start & Requirements
Install the library:
pip install datahorse
To run the Streamlit app, install its dependencies and start it:
pip install -r requirements.text
streamlit run app.py
Highlighted Details
Supports a seed parameter and caching via cache_req=True.
Maintenance & Community
The project encourages contributions and provides a contributing guide. Users can follow the project on LinkedIn.
Licensing & Compatibility
The README does not explicitly state the license type.
Limitations & Caveats
The README does not detail specific limitations, unsupported platforms, or known bugs. The reliance on LLMs suggests potential costs or setup complexity for model access or API usage, which the README does not fully explain.