AI dataset generator for realistic data
This project provides an AI-powered tool for generating realistic datasets for demos, learning, and dashboards, targeting developers and data analysts. It simplifies data creation through a conversational prompt builder and integrates with Metabase for immediate data exploration, offering free CSV/SQL exports after an initial low-cost preview.
How It Works
The core of the generator uses OpenAI's GPT-4o to interpret user prompts and create a detailed data specification (schema, business rules). Actual data rows are then generated locally using the Faker library based on this LLM-generated spec. This approach ensures that only the initial preview or schema definition incurs OpenAI costs; subsequent data exports are free and instantaneous.
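The split between spec and rows can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `DatasetSpec` shape, the `generateRows` and `toCsv` helpers, and the small seeded random generator that stands in for Faker so the example has no dependencies. The project's actual spec format and generation code are not shown in the source.

```typescript
// Hypothetical spec shape: what an LLM might return for a prompt.
// The project's real spec format is not documented here.
type FieldSpec =
  | { name: string; kind: "id" }
  | { name: string; kind: "choice"; options: string[] }
  | { name: string; kind: "number"; min: number; max: number };

interface DatasetSpec {
  table: string;
  fields: FieldSpec[];
}

type Row = Record<string, string | number>;

// Small seeded generator (a linear congruential generator) used in place
// of Faker. Returns values in [0, 1).
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Generate rows locally from the spec. No API call happens here, which is
// why repeated exports would cost nothing once the spec exists.
function generateRows(spec: DatasetSpec, count: number, seed: number): Row[] {
  const rand = seededRandom(seed);
  const rows: Row[] = [];
  for (let i = 0; i < count; i++) {
    const row: Row = {};
    for (const field of spec.fields) {
      switch (field.kind) {
        case "id":
          row[field.name] = i + 1;
          break;
        case "choice":
          row[field.name] = field.options[Math.floor(rand() * field.options.length)];
          break;
        case "number":
          row[field.name] = field.min + Math.floor(rand() * (field.max - field.min + 1));
          break;
      }
    }
    rows.push(row);
  }
  return rows;
}

// Serialize rows to CSV, quoting values that contain commas, quotes, or newlines.
function toCsv(spec: DatasetSpec, rows: Row[]): string {
  const header = spec.fields.map((f) => f.name);
  const quote = (v: string | number): string => {
    const s = String(v);
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = rows.map((row) => header.map((h) => quote(row[h])).join(","));
  return [header.join(","), ...lines].join("\n");
}

// Example spec, as if produced once by the LLM step.
const exampleSpec: DatasetSpec = {
  table: "orders",
  fields: [
    { name: "order_id", kind: "id" },
    { name: "status", kind: "choice", options: ["pending", "shipped", "returned"] },
    { name: "quantity", kind: "number", min: 1, max: 10 },
  ],
};

console.log(toCsv(exampleSpec, generateRows(exampleSpec, 5, 42)));
```

In this sketch the spec is the only artifact that would need an LLM call. Row generation is a pure function of the spec, the row count, and the seed, so the same spec can be re-exported at any size without further API usage.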
Quick Start & Requirements
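Install dependencies with npm install and run with npm run dev.
Configure the environment by copying .env.example to .env.local and adding the OpenAI API key.
Once running, the app is available at http://localhost:3000.
Highlighted Details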
Maintenance & Community
The project is maintained by Metabase. Further community interaction details are not specified in the README.
Licensing & Compatibility
The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.
Limitations & Caveats
Generation requires an OpenAI API key, and each data preview incurs API costs. Exports are free, but the quality and realism of the generated data depend on how the LLM interprets the prompt and on what the Faker library can produce.