neurips_llm_efficiency_challenge by llm-efficiency-challenge

Competition toolkit for efficient LLM inference on a single GPU

created 2 years ago · 256 stars · Top 99.0% on sourcepulse

Project Summary

This repository provides the framework and guidelines for the NeurIPS Large Language Model Efficiency Challenge, targeting researchers and engineers aiming to optimize LLM performance under strict resource constraints (1 LLM, 1 GPU, 1 Day). It supports reproducible submissions via Dockerfiles and standardized evaluation with the HELM benchmark suite.

How It Works

Submissions are packaged as Dockerfiles containing all necessary code and dependencies. The containers built from these Dockerfiles expose HTTP endpoints (/process and /tokenize) that the HELM evaluation framework queries. The approach emphasizes reproducibility and standardized evaluation, allowing participants to build on the provided sample submissions (Lit-GPT, llama-recipes) or use custom frameworks for fine-tuning and deployment.
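The shape of such a submission server can be sketched with the standard library alone (the sample submissions use FastAPI, but the endpoint contract is the same). Note this is a minimal illustration: the official OpenAPI specification defines the real request/response schemas, and the field names below (`prompt`, `text`, `tokens`) are assumptions, not the official contract.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class SubmissionHandler(BaseHTTPRequestHandler):
    """Toy stand-in for a submission server exposing /process and /tokenize."""

    def _read_json(self) -> dict:
        length = int(self.headers.get("Content-Length", 0))
        return json.loads(self.rfile.read(length) or b"{}")

    def _write_json(self, payload: dict) -> None:
        body = json.dumps(payload).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        req = self._read_json()
        if self.path == "/process":
            # A real submission would run model inference here.
            self._write_json({"text": "echo: " + req.get("prompt", "")})
        elif self.path == "/tokenize":
            # Whitespace split as a placeholder for the model's tokenizer.
            self._write_json({"tokens": req.get("text", "").split()})
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging


def serve(port: int = 8080) -> None:
    """Block and serve HELM evaluation traffic on the given port."""
    HTTPServer(("0.0.0.0", port), SubmissionHandler).serve_forever()
```

During evaluation, HELM issues POST requests against these two routes; swapping the echo logic for real inference and tokenization is the participant's job.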

Quick Start & Requirements

  • Install/Run: Follow the instructions in the sample submissions (Lit-GPT, llama-recipes) and the repository's main.py for FastAPI server setup.
  • Prerequisites: Docker, Python, approved LLMs/datasets (a list is provided), and access to an NVIDIA A100 (40GB) or RTX 4090 GPU for local evaluation. GPU funding options are available.
  • Resources: Local evaluation requires significant GPU resources. Submission evaluation can take 1-2 hours per submission.
  • Links: Sample Submissions, HELM Evaluation, Discord
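For a local smoke test before submitting, a small client can exercise the two endpoints. This sketch assumes a submission container already listening on localhost:8080 with JSON request/response bodies; the endpoint paths come from the competition spec, but the payload fields are illustrative.

```python
import json
import urllib.request


def query(endpoint: str, payload: dict,
          host: str = "http://localhost:8080") -> dict:
    """POST a JSON payload to a running submission server and decode the reply.

    `endpoint` should be "/process" or "/tokenize"; the exact payload schema
    is defined by the competition's OpenAPI spec, not by this helper.
    """
    req = urllib.request.Request(
        host + endpoint,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Example use: `query("/process", {"prompt": "Hello"})` against your locally built container, before handing the Dockerfile to the evaluation pipeline.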

Highlighted Details

  • Submissions are evaluated against a secret subset of HELM tasks, plus custom held-out tasks.
  • A Discord-based leaderboard bot (evalbot#4372) allows for early testing and performance feedback.
  • Final submissions require a single Dockerfile; model weights must be downloaded at build or runtime, not included directly.
  • AWS credits are available for eligible participants with proposals.

Maintenance & Community

  • The challenge is organized by the NeurIPS LLM Efficiency Challenge committee.
  • Community support and discussion are primarily channeled through their Discord server.
  • Key dates and timeline are available.

Licensing & Compatibility

  • Uses approved LLMs and datasets with specific licensing considerations. Participants must ensure their chosen models/datasets are permissible.
  • Compatibility for commercial use depends on the licenses of the chosen LLMs and datasets.

Limitations & Caveats

The exact evaluation tasks are not disclosed until after the submission deadline. Participants must ensure their Dockerfile correctly builds and runs the HTTP server according to the provided OpenAPI specification.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems) and Jiayi Pan (author of SWE-Gym; AI researcher at UC Berkeley).

SWE-Gym by SWE-Gym

  • 513 stars · top 1.0%
  • Environment for training software engineering agents
  • Created 9 months ago; updated 4 days ago
  • Starred by David Cournapeau (author of scikit-learn), Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), and 6 more.

llm-numbers by ray-project

  • 4k stars · top 0%
  • LLM developer's reference for key numbers
  • Created 2 years ago; updated 1 year ago