AI-driven multi-agent research assistant for data analysis and report writing
DATAGEN is an AI-driven research assistant designed to automate hypothesis generation, data analysis, visualization, and report writing. It targets researchers and analysts seeking to streamline complex data exploration tasks through a multi-agent system. The platform aims to accelerate the research lifecycle by intelligently coordinating specialized AI agents.
How It Works
DATAGEN employs a multi-agent architecture orchestrated by LangGraph, leveraging LangChain and OpenAI's GPT models. Specialized agents handle distinct research functions like hypothesis generation, data processing, visualization, code execution, web searching, and report writing. These agents communicate and coordinate through a state graph, enabling dynamic workflow adjustments and real-time optimization for complex research processes.
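To make the state-graph idea concrete, here is a minimal sketch in plain Python. It is not LangGraph's API and not DATAGEN's actual code: the agent names, the state fields, and the `run_graph` helper are all hypothetical, and each agent is a stub standing in for an LLM-backed step. It only illustrates the pattern in which every agent reads a shared state, updates it, and names the agent that should run next.

```python
from typing import Callable, Dict, List, Optional, TypedDict


class ResearchState(TypedDict):
    """Shared state handed from agent to agent (field names are hypothetical)."""
    hypothesis: str
    notes: List[str]
    next_agent: Optional[str]  # None means the workflow is finished


# An agent takes the current state and returns an updated copy.
Agent = Callable[[ResearchState], ResearchState]


def hypothesis_agent(state: ResearchState) -> ResearchState:
    # Stub: a real agent would call an LLM to propose a hypothesis.
    return {
        **state,
        "hypothesis": "sales rise with ad spend",
        "notes": state["notes"] + ["hypothesis drafted"],
        "next_agent": "process",
    }


def process_agent(state: ResearchState) -> ResearchState:
    # Stub: a real agent would clean and analyze the data.
    return {
        **state,
        "notes": state["notes"] + ["data processed"],
        "next_agent": "report",
    }


def report_agent(state: ResearchState) -> ResearchState:
    # Stub: a real agent would write the report, then end the run.
    return {
        **state,
        "notes": state["notes"] + ["report written"],
        "next_agent": None,
    }


AGENTS: Dict[str, Agent] = {
    "hypothesis": hypothesis_agent,
    "process": process_agent,
    "report": report_agent,
}


def run_graph(
    agents: Dict[str, Agent],
    entry: str,
    state: ResearchState,
    max_steps: int = 10,
) -> ResearchState:
    """Run agents in sequence, following each agent's routing decision."""
    current: Optional[str] = entry
    for _ in range(max_steps):  # step cap guards against routing loops
        if current is None:
            break
        state = agents[current](state)
        current = state["next_agent"]
    return state


if __name__ == "__main__":
    initial: ResearchState = {"hypothesis": "", "notes": [], "next_agent": "hypothesis"}
    final = run_graph(AGENTS, "hypothesis", initial)
    print(final["notes"])
```

Because each agent chooses its successor at run time, the route through the graph can change from one run to the next, which is the kind of dynamic workflow adjustment described above.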
Quick Start & Requirements
Create a Conda environment (conda create -n data_assistant python=3.10), activate it (conda activate data_assistant), and install dependencies (pip install -r requirements.txt).
Configure the .env file, including DATA_STORAGE_PATH, CONDA_PATH, CONDA_ENV, CHROMEDRIVER_PATH, and API keys.
Run main.ipynb or python main.py. Place data files (e.g., YourDataName.csv) in the data_storage directory.
Highlighted Details
Maintenance & Community
The project is actively maintained by starpig1129. Further community engagement details (Discord/Slack, roadmap) are not explicitly provided in the README.
Licensing & Compatibility
Licensed under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The system makes heavy use of the OpenAI API and may incur significant costs. The README lists known issues: occasional OpenAI Internal Server Errors, a NoteTaker that needs efficiency improvements, and overall runtime that still needs optimization. Because the agent system may modify input data, back up your data files before running it.
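Since the agents may modify input data, one simple precaution is to copy the data directory before each run. The sketch below uses only the Python standard library. The `backup_data` helper and the `backups` folder name are hypothetical and are not part of DATAGEN.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_data(src: Path, backup_root: Path) -> Path:
    """Copy src into a new timestamped folder under backup_root and return its path."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = backup_root / f"{src.name}-{stamp}"
    # copytree creates missing parent folders and raises FileExistsError
    # if dest already exists, so an earlier backup is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

Calling backup_data(Path("data_storage"), Path("backups")) before starting the agents would leave an untouched copy of the input files to restore from.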