testpilot by githubnext

Unit test generator for npm packages using LLMs

Created 1 year ago
557 stars

Top 57.5% on SourcePulse

Project Summary

TestPilot is a research prototype for automatically generating unit tests for JavaScript/TypeScript npm packages using large language models (LLMs). It targets researchers and developers exploring LLM-based test generation, offering a framework that requires no additional training data.

How It Works

TestPilot prompts an LLM with a test skeleton that includes the target function's signature and body, along with usage examples mined from the package's documentation. The LLM's completion is parsed into a runnable unit test. Optionally, a failing test triggers re-prompting with the failure details so the model can refine it. This approach needs neither example test-function pairs nor reinforcement learning.
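
As a rough illustration of the approach (the names, fields, and prompt layout here are assumptions for the sketch, not TestPilot's actual internals), a prompt skeleton for an exported function might be assembled like this, with the LLM asked to complete the test body:

```typescript
// Hypothetical sketch of TestPilot-style prompt assembly. The real prompt
// format differs; this only illustrates the idea of combining signature,
// body, and doc examples into a test skeleton for the LLM to complete.

interface FunctionInfo {
  name: string;           // e.g. "slugify"
  signature: string;      // e.g. "slugify(input: string): string"
  body: string;           // source text of the function
  docExamples: string[];  // usage snippets mined from the package docs
}

function buildPrompt(pkg: string, fn: FunctionInfo): string {
  return [
    `// Unit test for ${pkg}.${fn.name}`,
    `// Signature: ${fn.signature}`,
    fn.body,
    ...fn.docExamples.map((ex) => `// Usage example: ${ex}`),
    `const assert = require('assert');`,
    `const { ${fn.name} } = require('${pkg}');`,
    `describe('${fn.name}', () => {`,
    `  it('', () => {`, // the LLM completes the test from here
  ].join("\n");
}
```

The model's completion is appended to the skeleton and parsed as JavaScript; if the resulting test fails, the failure message can be fed back into a follow-up prompt for refinement.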

Quick Start & Requirements

  • Install: from a tarball (TestPilot is not published to the npm registry) or from source (npm install and npm run build in the repository root).
  • Prerequisites: access to a Codex-style LLM exposing a completion API. Set the TESTPILOT_LLM_API_ENDPOINT and TESTPILOT_LLM_AUTH_HEADERS environment variables (see the client sketch after this list).
  • Dependencies: mocha is needed to run the generated tests if the target package does not already depend on it.
  • Docs: the accompanying research paper is available on arXiv and IEEE Xplore.
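
A minimal sketch of a completion client driven by these two environment variables, assuming the auth headers are supplied as a JSON object and the endpoint speaks a Codex-style completion schema (the request and response field names are assumptions):

```typescript
// Minimal sketch of an LLM completion client configured via the two
// environment variables TestPilot expects. The wire format shown here
// (prompt/max_tokens in, choices[0].text out) is an assumption.

async function complete(prompt: string): Promise<string> {
  const endpoint = process.env.TESTPILOT_LLM_API_ENDPOINT;
  if (!endpoint) throw new Error("TESTPILOT_LLM_API_ENDPOINT is not set");
  const auth = JSON.parse(process.env.TESTPILOT_LLM_AUTH_HEADERS ?? "{}");

  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...auth },
    body: JSON.stringify({ prompt, max_tokens: 256 }),
  });
  if (!res.ok) throw new Error(`completion request failed: ${res.status}`);
  const data = await res.json();
  return data.choices?.[0]?.text ?? "";
}
```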

Highlighted Details

  • Generates tests for exported functions in npm packages.
  • Supports re-prompting for test refinement upon failure.
  • Includes a benchmarking harness for evaluating performance on multiple packages.
  • Offers a reproduction mode that replays benchmark runs from recorded API requests and responses (sketched after this list).
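
The record-and-replay idea behind reproduction mode can be sketched as a response cache keyed on the request payload (a hypothetical illustration, not TestPilot's actual on-disk format):

```typescript
// Hypothetical record/replay wrapper: a recording run stores each LLM
// response keyed by a hash of the prompt; a replay run answers from the
// cache instead of calling the API, making benchmark runs repeatable.

import { createHash } from "node:crypto";

async function completeWithReplay(
  prompt: string,
  cache: Map<string, string>, // persisted to disk in a real harness
  mode: "record" | "replay",
  callApi: (prompt: string) => Promise<string>
): Promise<string> {
  const key = createHash("sha256").update(prompt).digest("hex");
  if (mode === "replay") {
    const hit = cache.get(key);
    if (hit === undefined) throw new Error("no recorded response for prompt");
    return hit;
  }
  const response = await callApi(prompt);
  cache.set(key, response);
  return response;
}
```

Note that a refinement prompt embeds the observed failure message, so a replay on a different system can compute a different key and miss the cache; this is the caveat noted under Limitations below.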

Maintenance & Community

  • Archived project; refer to neu-se/testpilot2.
  • Maintained by Max Schaefer, Frank Tip, and Sarah Nadi.
  • Not officially supported; file issues for questions or feedback.

Licensing & Compatibility

  • MIT License.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

This version is archived and intended for research use; for everyday test generation, Copilot Chat is recommended instead. Reproduction mode may fail to replay refined tests because the failure messages embedded in refinement prompts are system-specific.

Health Check

  • Last Commit: 7 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Explore Similar Projects

Starred by Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Travis Fischer (Founder of Agentic), and 1 more.

evalite by mattpocock

Top 0.8% · 864 stars
TypeScript testing framework for LLM apps
Created 10 months ago · Updated 3 weeks ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Edward Z. Yang (Research Engineer at Meta; Maintainer of PyTorch), and 5 more.

yet-another-applied-llm-benchmark by carlini

Top 0.2% · 1k stars
LLM benchmark for evaluating models on previously asked programming questions
Created 1 year ago · Updated 4 months ago
Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), and 3 more.

human-eval by openai

Top 0.4% · 3k stars
Evaluation harness for LLMs trained on code
Created 4 years ago · Updated 8 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Meng Zhang (Cofounder of TabbyML), and 3 more.

qodo-cover by qodo-ai

Top 0.2% · 5k stars
CLI tool for AI-powered test generation and code coverage enhancement
Created 1 year ago · Updated 2 months ago