vercel: AI agentic framework for Next.js coding evaluations
Top 98.3% on SourcePulse
This repository provides an automated framework for evaluating AI model competency on Next.js coding tasks, covering Next.js versions up to 15.5.6. It lets developers and researchers benchmark AI agents against specific Next.js development challenges, offering insight into model performance and identifying areas for improvement.
How It Works
The system leverages @vercel/agent-eval to run evaluations. Each evaluation is a self-contained Next.js project within the evals/ directory, comprising a PROMPT.md defining the task, EVAL.ts for Vitest assertions (hidden from the agent), and necessary project files. A smart runner automatically detects new models or evals, executing only uncompleted pairs and cleaning up infrastructure failures before exporting results.
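The runner's "execute only uncompleted pairs" behavior amounts to a set difference over (model, eval) combinations. The sketch below illustrates that idea; the function name, key format, and types are assumptions for illustration, not the actual @vercel/agent-eval API.

```typescript
// Hypothetical sketch of the runner's pair selection. A completed run is
// keyed as "model::evalName" (an assumed format, e.g. derived from
// agent-results.json); everything not in that set still needs to run.
type Pair = { model: string; evalName: string };

function pendingPairs(
  models: string[],
  evals: string[],
  completed: Set<string>,
): Pair[] {
  const pairs: Pair[] = [];
  for (const model of models) {
    for (const evalName of evals) {
      if (!completed.has(`${model}::${evalName}`)) {
        pairs.push({ model, evalName });
      }
    }
  }
  return pairs;
}

// Example: one of four (model, eval) pairs is already done, so three remain.
const todo = pendingPairs(
  ["gpt-x", "claude-y"],
  ["app-router", "server-actions"],
  new Set(["gpt-x::app-router"]),
);
console.log(todo.length); // 3
```

A memoized run (npm run eval) would then iterate only over this remaining list, while a forced run would ignore the completed set entirely.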
Quick Start & Requirements
Run npm install, create a .env.local (copying the provided .env template) and set VERCEL_OIDC_TOKEN and AI_GATEWAY_API_KEY. Then use npm run eval (memoized), npm run eval:dry (preview), npm run eval -- --force (rerun all), or npm run eval:smoke (sanity check). Run npm run export-results to write agent-results.json. New evals and experiments go in the evals/ and experiments/ directories, respectively.
Highlighted Details
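The steps above can be summarized as a command sequence; the .env copy step and placeholder values are assumptions based on the setup description, so check the repository's README for the exact file names:

```shell
# Hypothetical quick-start sequence (environment values are placeholders)
npm install
cp .env .env.local          # then edit VERCEL_OIDC_TOKEN and AI_GATEWAY_API_KEY
npm run eval:dry            # preview which (model, eval) pairs would run
npm run eval                # memoized: runs only uncompleted pairs
npm run eval -- --force     # rerun everything from scratch
npm run export-results      # writes agent-results.json
```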
Results are exported to agent-results.json and published to nextjs.org/evals.
Maintenance & Community
No specific details regarding contributors, sponsorships, or community channels were found in the provided README.
Licensing & Compatibility
The project's license is detailed in the LICENSE file. Specific compatibility notes for commercial use or closed-source linking are not detailed in the README.
Limitations & Caveats
Requires specific Vercel environment variables for operation. The framework is designed for AI model evaluation rather than direct application development, and its scope is limited to Next.js versions up to 15.5.6.