DATAGEN  by starpig1129

AI-driven multi-agent research assistant for data analysis and report writing

created 1 year ago
1,389 stars

Top 29.7% on sourcepulse

Project Summary

DATAGEN is an AI-driven research assistant designed to automate hypothesis generation, data analysis, visualization, and report writing. It targets researchers and analysts seeking to streamline complex data exploration tasks through a multi-agent system. The platform aims to accelerate the research lifecycle by intelligently coordinating specialized AI agents.

How It Works

DATAGEN employs a multi-agent architecture orchestrated by LangGraph, leveraging LangChain and OpenAI's GPT models. Specialized agents handle distinct research functions like hypothesis generation, data processing, visualization, code execution, web searching, and report writing. These agents communicate and coordinate through a state graph, enabling dynamic workflow adjustments and real-time optimization for complex research processes.
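To make the state-graph idea concrete, here is a minimal sketch of agents coordinating through shared state. It deliberately uses only the Python standard library instead of LangGraph, and every name in it (ResearchState, hypothesis_agent, process_agent, report_agent, run_graph) is hypothetical; it illustrates the routing pattern and is not DATAGEN's actual code.

```python
from typing import Callable, Dict, List, Optional, TypedDict


class ResearchState(TypedDict):
    """Shared state handed from agent to agent (hypothetical fields)."""
    hypothesis: str
    findings: List[str]
    report: str
    next_agent: Optional[str]


def hypothesis_agent(state: ResearchState) -> ResearchState:
    # A real agent would call an LLM here; this is a fixed placeholder.
    state["hypothesis"] = "Column A correlates with column B"
    state["next_agent"] = "process"
    return state


def process_agent(state: ResearchState) -> ResearchState:
    # Stand-in for data processing / code execution.
    state["findings"].append(f"analysed: {state['hypothesis']}")
    state["next_agent"] = "report"
    return state


def report_agent(state: ResearchState) -> ResearchState:
    state["report"] = "; ".join(state["findings"])
    state["next_agent"] = None  # None ends the workflow
    return state


AGENTS: Dict[str, Callable[[ResearchState], ResearchState]] = {
    "hypothesis": hypothesis_agent,
    "process": process_agent,
    "report": report_agent,
}


def run_graph(state: ResearchState, start: str = "hypothesis",
              max_steps: int = 10) -> List[str]:
    """Follow each agent's routing decision until it returns None."""
    visited: List[str] = []
    current: Optional[str] = start
    while current is not None and len(visited) < max_steps:
        visited.append(current)
        state = AGENTS[current](state)
        current = state["next_agent"]
    return visited
```

Because each agent writes the name of the next agent into the shared state, the route can change at run time, which is the property the paragraph above calls dynamic workflow adjustment. The max_steps cap is an assumed safeguard against routing loops.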

Quick Start & Requirements

  • Installation: Clone the repository, create a Conda environment (conda create -n data_assistant python=3.10), activate it (conda activate data_assistant), and install dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.10+, Conda, a Jupyter Notebook environment, a ChromeDriver executable, and an OpenAI API key (required). Optional: Firecrawl and LangChain API keys.
  • Configuration: Set environment variables in a .env file, including DATA_STORAGE_PATH, CONDA_PATH, CONDA_ENV, CHROMEDRIVER_PATH, and API keys.
  • Usage: Run via main.ipynb or python main.py. Place data files (e.g., YourDataName.csv) in the data_storage directory.
  • Links: DATAGEN Digital (under development).
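Since the configuration step depends on several variables being present, a quick pre-flight check can catch a missing one before a run starts. The sketch below parses .env-style text and reports unset variables. The four path/environment names come from the list above; OPENAI_API_KEY is an assumed name for the API key entry, and parse_env and missing_settings are hypothetical helpers, not part of DATAGEN.

```python
from typing import Dict, List

# The first four names are listed in the configuration step above.
# OPENAI_API_KEY is an assumption about how the key is named in .env.
REQUIRED = [
    "DATA_STORAGE_PATH",
    "CONDA_PATH",
    "CONDA_ENV",
    "CHROMEDRIVER_PATH",
    "OPENAI_API_KEY",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values: Dict[str, str]) -> List[str]:
    """Return required names that are absent or empty, in REQUIRED order."""
    return [name for name in REQUIRED if not values.get(name)]
```

A typical use would be to read the .env file with open(".env").read(), pass the text to parse_env, and stop early if missing_settings returns anything.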

Highlighted Details

  • Advanced multi-agent system with specialized agents for diverse research tasks.
  • Intelligent task distribution, coordination, and real-time adaptation.
  • Smart context management via a "Note Taker" agent for efficient memory utilization.
  • Described in the README as offering enterprise-grade performance with a robust and scalable architecture.
  • Upcoming strategic partnership with CTL GROUP for AI Crypto Intelligence.

Maintenance & Community

The project is actively maintained by starpig1129. Further community engagement details (Discord/Slack, roadmap) are not explicitly provided in the README.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The system makes heavy use of the OpenAI API and may incur significant costs. The README lists several known issues: OpenAI Internal Server Errors, a NoteTaker agent that needs efficiency improvements, and overall runtime that needs optimization. The agent system may also modify input data, so back up data files before running it.
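Because the agents may modify input data, one simple precaution is to copy the data directory before each run. The following is a minimal sketch under that assumption; backup_data is a hypothetical helper, and the timestamped folder naming is just one possible convention.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_data(data_dir: str, backup_root: str) -> Path:
    """Copy data_dir into a new timestamped folder under backup_root."""
    source = Path(data_dir)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree creates missing parent folders and raises if target exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(source, target)
    return target
```

For example, backup_data("data_storage", "backups") would be called once before launching main.py or main.ipynb.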

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star history: 115 stars in the last 90 days
