web-codegen-scorer by angular

Tool for evaluating LLM-generated web code quality

Created 2 months ago
484 stars

Top 63.5% on SourcePulse

Project Summary

Web Codegen Scorer addresses the need for empirical evaluation of AI-generated web code. It gives developers and researchers a way to make data-driven decisions about LLM-generated code through systematic testing and comparison, supporting prompt iteration, model comparison, and quality monitoring in web development workflows.

How It Works

The scorer targets web code specifically and applies well-established quality metrics. Users configure an evaluation by choosing the LLM, framework, and tools, supplying system instructions, and optionally integrating MCP servers. Built-in checks cover build success, runtime errors, accessibility, security, LLM-based ratings, and coding best practices, and the tool can attempt automatic repair of detected issues. A report viewer UI visualizes and compares evaluation results.
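As a rough sketch, the end-to-end workflow looks like this once the CLI is installed (see Quick Start below); "my-eval" is a hypothetical environment name used for illustration:

    # Scaffold a custom evaluation; this generates an environment config
    # in which the model, framework, system instructions, and any MCP
    # servers are specified (the config details are assumptions here).
    web-codegen-scorer init

    # Run the evaluation against the configured environment
    # ("my-eval" is a placeholder, not a built-in environment).
    web-codegen-scorer eval --env=my-eval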

Quick Start & Requirements

Installation is via npm: npm install -g web-codegen-scorer. Setup requires exporting API keys for LLM providers (e.g., GEMINI_API_KEY, OPENAI_API_KEY) as environment variables. A basic evaluation can be run with web-codegen-scorer eval --env=angular-example. Custom evaluations are initiated with web-codegen-scorer init. For local development, pnpm install is required, followed by commands like pnpm run eval.
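Putting the documented commands together, a minimal session looks roughly like this (the API key values are placeholders):

    # Install the CLI globally
    npm install -g web-codegen-scorer

    # Provide API keys for the LLM providers you plan to use
    export GEMINI_API_KEY="<your-gemini-key>"
    export OPENAI_API_KEY="<your-openai-key>"

    # Run the bundled example evaluation
    web-codegen-scorer eval --env=angular-example

    # For local development of the scorer itself
    pnpm install
    pnpm run eval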

Highlighted Details

  • Comprehensive built-in checks: build success, runtime errors, accessibility, security, LLM rating, and coding best practices.
  • Automatic issue repair functionality for detected code problems.
  • Supports any web library, framework, or LLM, not limited to Angular or Google models.
  • Configurable with custom Retrieval-Augmented Generation (RAG) endpoints.
  • Features an intuitive report viewer UI for results analysis.

Maintenance & Community

Developed by the Angular team at Google, the project has a roadmap for expanding checks, including interaction testing, Core Web Vitals measurement, and evaluation of LLM edits on existing codebases. No specific community channels (e.g., Discord, Slack) or direct social handles are mentioned in the README.

Licensing & Compatibility

The provided README does not specify the software license. Users should verify licensing terms before adoption, especially concerning commercial use or integration with closed-source projects.

Limitations & Caveats

The tool is actively evolving: its current checks are not exhaustive, and features such as interaction testing and additional built-in checks and testing scenarios are planned for future releases.

Health Check

Last Commit: 23 hours ago
Responsiveness: Inactive
Pull Requests (30d): 57
Issues (30d): 2

Star History: 134 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Travis Fischer (Founder of Agentic), and 6 more.

AlphaCodium by Codium-ai

Top 0.1% on SourcePulse · 4k stars
Code generation research paper implementation
Created 1 year ago · Updated 11 months ago