testpilot  by githubnext

Unit test generator for npm packages using LLMs

created 1 year ago
556 stars

Top 58.5% on sourcepulse

Project Summary

TestPilot is a research prototype for automatically generating unit tests for JavaScript/TypeScript npm packages using large language models (LLMs). It targets researchers and developers exploring LLM-based test generation, offering a framework that requires no additional training data.

How It Works

TestPilot prompts an LLM with a test skeleton for the function under test, together with the function's signature, its body, and usage examples mined from documentation. The LLM's response is parsed into a runnable unit test. Optionally, a failing test triggers a re-prompt that includes the failure details, so the model can refine the test. This approach needs neither example pairs of functions and tests nor reinforcement learning.
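
The prompt-then-refine loop described above can be sketched as follows. This is a minimal illustration, not TestPilot's actual implementation: every name here (FunctionInfo, buildPrompt, assembleTest, generateTest, and the complete and run callbacks) is hypothetical, the skeleton layout is an assumption, and the callbacks are synchronous only to keep the sketch short, whereas a real completion API call would be asynchronous.

```typescript
// Hypothetical names throughout; this is not TestPilot's actual API.
interface FunctionInfo {
  name: string;       // exported function under test
  signature: string;  // e.g. "add(a, b)"
  body?: string;      // function source, if it should appear in the prompt
  examples: string[]; // usage snippets mined from documentation
}

interface RunResult {
  passed: boolean;
  error?: string;
}

type Completer = (prompt: string) => string; // stands in for the LLM call
type Runner = (testSource: string) => RunResult; // stands in for running mocha

// Build a test skeleton that stops where the LLM should start writing.
function buildPrompt(pkg: string, fn: FunctionInfo, failure?: string): string {
  const comment = (text: string) =>
    text.split("\n").map((line) => `// ${line}`).join("\n");
  const parts: string[] = [
    `const assert = require('assert');`,
    `const pkg = require('${pkg}');`,
    comment(`Signature: ${fn.signature}`),
  ];
  if (fn.body) parts.push(comment(fn.body));
  for (const example of fn.examples) parts.push(comment(`Example: ${example}`));
  if (failure) parts.push(comment(`A previous attempt failed with: ${failure}`));
  parts.push(`describe('${pkg}', function () {`);
  parts.push(`  it('tests ${fn.name}', function () {`);
  return parts.join("\n") + "\n";
}

// Treat the completion as the test body and close the two open blocks.
function assembleTest(prompt: string, completion: string): string {
  return `${prompt}${completion.replace(/\s+$/, "")}\n  });\n});\n`;
}

// Generate a test; on failure, re-prompt with the error up to maxRetries times.
function generateTest(
  pkg: string,
  fn: FunctionInfo,
  complete: Completer,
  run: Runner,
  maxRetries = 1
): { test: string; passed: boolean } {
  let failure: string | undefined;
  let test = "";
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const prompt = buildPrompt(pkg, fn, failure);
    test = assembleTest(prompt, complete(prompt));
    const result = run(test);
    if (result.passed) return { test, passed: true };
    failure = result.error ?? "unknown failure";
  }
  return { test, passed: false };
}
```

Because the model only ever sees a skeleton plus context drawn from the package itself, nothing in this loop depends on a corpus of existing tests.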

Quick Start & Requirements

  • Install: TestPilot is not published on the npm registry. Install it from a tarball, or build it from source by running npm install and npm run build in the repository root.
  • Prerequisites: Access to a Codex-style LLM with a completion API. Set TESTPILOT_LLM_API_ENDPOINT and TESTPILOT_LLM_AUTH_HEADERS environment variables.
  • Dependencies: Requires mocha for testing if not already a project dependency.
  • Docs: Research paper available on arXiv and IEEE Xplore.
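
As an illustration of the two environment variables named above, the sketch below reads and validates them. It assumes that TESTPILOT_LLM_AUTH_HEADERS holds a JSON object mapping header names to values; that format is an assumption here, and loadLlmConfig and LlmConfig are hypothetical names, not part of TestPilot. The function takes the environment as a parameter so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper; the JSON-object header format is an assumption.
interface LlmConfig {
  endpoint: string;
  headers: Record<string, string>;
}

type Env = Record<string, string | undefined>;

function loadLlmConfig(env: Env): LlmConfig {
  const endpoint = env["TESTPILOT_LLM_API_ENDPOINT"];
  const rawHeaders = env["TESTPILOT_LLM_AUTH_HEADERS"];
  if (!endpoint) throw new Error("TESTPILOT_LLM_API_ENDPOINT is not set");
  if (!rawHeaders) throw new Error("TESTPILOT_LLM_AUTH_HEADERS is not set");

  let parsed: unknown;
  try {
    parsed = JSON.parse(rawHeaders);
  } catch (err) {
    throw new Error("TESTPILOT_LLM_AUTH_HEADERS must be valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("TESTPILOT_LLM_AUTH_HEADERS must be a JSON object");
  }

  // Copy the entries, coercing each value to a string header value.
  const source = parsed as Record<string, unknown>;
  const headers: Record<string, string> = {};
  for (const key of Object.keys(source)) {
    headers[key] = String(source[key]);
  }
  return { endpoint, headers };
}
```

In a Node.js script this would typically be called as loadLlmConfig(process.env), so that a missing or malformed variable fails early instead of surfacing later as a rejected API request.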

Highlighted Details

  • Generates tests for exported functions in npm packages.
  • Supports re-prompting for test refinement upon failure.
  • Includes a benchmarking harness for evaluating performance on multiple packages.
  • Offers a reproduction mode to replay benchmark runs using recorded API requests and responses.
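
The idea behind a reproduction mode can be sketched as a record-and-replay wrapper around the completion call. This is only an illustration under stated assumptions: the names Recording, recordingCompleter, and replayingCompleter are hypothetical, and keying the recording on the exact prompt text is a simplification rather than a description of how TestPilot stores its requests and responses.

```typescript
// Hypothetical record/replay wrappers; not TestPilot's actual storage format.
type Completer = (prompt: string) => string;

interface Recording {
  [prompt: string]: string;
}

// Wrap a live completer so every request/response pair is stored in `log`.
function recordingCompleter(live: Completer, log: Recording): Completer {
  return (prompt) => {
    const response = live(prompt);
    log[prompt] = response;
    return response;
  };
}

// Answer only from the recording; an unseen prompt is an error, never a live call.
function replayingCompleter(log: Recording): Completer {
  return (prompt) => {
    if (!Object.prototype.hasOwnProperty.call(log, prompt)) {
      throw new Error("No recorded response for this prompt");
    }
    return log[prompt];
  };
}
```

With exact-match keys like these, a replay misses whenever a prompt differs from the recorded one by even a single character, for example when it embeds a failure message containing a machine-specific path.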

Maintenance & Community

  • Archived project; refer to neu-se/testpilot2.
  • Maintained by Max Schaefer, Frank Tip, and Sarah Nadi.
  • Not officially supported; file issues for questions or feedback.

Licensing & Compatibility

  • MIT License.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

This version is archived and intended for research. For daily use, Copilot Chat is recommended. Reproduction mode may fail to replay refined tests, because the failure messages that drive refinement can be system-specific.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 14 stars in the last 90 days
