Data-Copilot by zwq2018

LLM-based system for autonomous data workflows

created 2 years ago
1,505 stars

Top 28.0% on sourcepulse

Project Summary

Data-Copilot is an LLM-based system that autonomously manages, processes, analyzes, predicts, and visualizes data for users, with a focus on Chinese financial markets. It aims to bridge the gap between vast datasets and human understanding by transforming raw data into informative results and interactive interfaces.

How It Works

Data-Copilot uses LLMs (GPT-3.5, Azure-GPT-3.5, Qwen-72b-Chat) to interpret user requests and to design, dispatch, and execute workflows autonomously. It acts as a "designer" that creates interface tools and as a "dispatcher" that invokes those tools, sequentially or in parallel, to fetch, process, and visualize data from heterogeneous sources such as Chinese stock, fund, economic, and financial data. Generating and executing workflows autonomously is intended to reduce manual intervention in complex data analysis tasks.
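
The dispatcher idea can be illustrated with a minimal sketch. It is not Data-Copilot's actual implementation: the tool names (`fetch_prices`, `mean`), the placeholder data, the workflow format, and the `$name` convention for referring to earlier results are all assumptions made for illustration. Stages run in order, and the steps inside one stage run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

# Placeholder data standing in for a real source such as Tushare (hypothetical values).
PRICES: Dict[str, List[float]] = {"AAA": [10.0, 11.0, 12.0], "BBB": [20.0, 22.0]}


def fetch_prices(symbol: str) -> List[float]:
    """Hypothetical interface tool: return a price series for a symbol."""
    return PRICES[symbol]


def mean(values: List[float]) -> float:
    """Hypothetical interface tool: arithmetic mean of a series."""
    return sum(values) / len(values)


TOOLS: Dict[str, Callable[..., Any]] = {"fetch_prices": fetch_prices, "mean": mean}


def resolve(arg: Any, results: Dict[str, Any]) -> Any:
    """Replace a "$name" argument with the stored result called name."""
    if isinstance(arg, str) and arg.startswith("$"):
        return results[arg[1:]]
    return arg


def run_workflow(workflow: List[List[dict]], tools: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Run stages sequentially; run the steps within each stage in parallel."""
    results: Dict[str, Any] = {}
    with ThreadPoolExecutor() as pool:
        for stage in workflow:
            # Submit every step of this stage before waiting on any of them.
            futures = {
                step["out"]: pool.submit(
                    tools[step["tool"]],
                    *[resolve(a, results) for a in step["args"]],
                )
                for step in stage
            }
            # Collect results so that the next stage can reference them.
            for name, future in futures.items():
                results[name] = future.result()
    return results


workflow = [
    [  # stage 1: two independent fetches
        {"tool": "fetch_prices", "args": ["AAA"], "out": "a"},
        {"tool": "fetch_prices", "args": ["BBB"], "out": "b"},
    ],
    [  # stage 2: depends on the results of stage 1
        {"tool": "mean", "args": ["$a"], "out": "a_mean"},
        {"tool": "mean", "args": ["$b"], "out": "b_mean"},
    ],
]

results = run_workflow(workflow, TOOLS)
print(results["a_mean"], results["b_mean"])
```

In a system of this kind, the workflow structure would be produced by the LLM from the user's request rather than written by hand; here it is hard-coded to keep the sketch self-contained.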

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Run: python main.py (for core processing) or python app.py (for Gradio demo).
  • Prerequisites: OpenAI API key (or Azure equivalent with api-base and engine) and a Tushare token.
  • Demo: Available on Hugging Face Space.
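
Because both credentials are required before anything runs, a small pre-flight check can save a failed start. The sketch below assumes the credentials are supplied through environment variables named `OPENAI_API_KEY` and `TUSHARE_TOKEN`; those names are assumptions for illustration, and the project itself may read its keys differently (for example from the script or the Gradio interface).

```python
import os
from typing import List, Mapping

# Assumed variable names; the project may expect its credentials elsewhere.
REQUIRED = ("OPENAI_API_KEY", "TUSHARE_TOKEN")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required credentials that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("All required credentials are set.")
```

Running this before python main.py or python app.py shows which credential still needs to be provided.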

Highlighted Details

  • Supports Chinese stock, fund, economic, and financial data.
  • Autonomous workflow design and execution for data processing and visualization.
  • Can generate versatile interface tools through self-request and iterative refinement.
  • Outputs results as text summaries, images, and tables.

Maintenance & Community

  • Project associated with authors from Zhejiang University.
  • Contact email provided for questions.
  • Acknowledgements include ChatGPT, Tushare, and Qwen.

Licensing & Compatibility

  • No explicit license is mentioned in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Data access is currently restricted to Chinese financial data because of the 4k input token limit of GPT-3.5. Support for foreign financial markets is planned.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 32 stars in the last 90 days
