ParScale by QwenLM

Research paper introducing parallel scaling for language models

created 2 months ago
417 stars

Top 71.3% on sourcepulse

View on GitHub
Project Summary

This repository introduces "Parallel Scaling" (ParScale), a novel paradigm for scaling Large Language Models (LLMs) that complements parameter scaling and inference-time scaling. It targets researchers and practitioners seeking to improve LLM performance and efficiency: capability grows logarithmically with the number of parallel streams, at significantly lower resource overhead than parameter scaling.

How It Works

ParScale applies $P$ diverse, learnable transformations to the input, runs the resulting $P$ streams through the LLM in parallel, and dynamically aggregates their outputs. Theoretical analysis and experiments both indicate a logarithmic scaling law: $P$ parallel streams yield gains comparable to scaling the parameter count by $O(\log P)$, making ParScale an efficient substitute for parameter growth, particularly on reasoning-intensive tasks.
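
The following is a minimal PyTorch sketch of this idea, not the authors' implementation: it realizes the $P$ learnable input transformations as stream-specific prefix embeddings, batches the $P$ streams through a shared stand-in backbone, and aggregates the outputs with a learned softmax gate. All module and parameter names here are illustrative.

```python
# Minimal sketch of the ParScale idea (illustrative, not the official code):
# P learnable input transformations (stream-specific prefixes), one shared
# backbone run over all P streams in a single batched forward pass, and a
# learned, input-dependent aggregation over the P outputs.
import torch
import torch.nn as nn


class ParallelScalingWrapper(nn.Module):
    def __init__(self, backbone: nn.Module, d_model: int, num_streams: int, prefix_len: int = 4):
        super().__init__()
        self.backbone = backbone            # any module mapping (B, T, D) -> (B, T, D)
        self.num_streams = num_streams      # P
        # P diverse, learnable input transformations, realized as prefixes.
        self.prefixes = nn.Parameter(torch.randn(num_streams, prefix_len, d_model) * 0.02)
        # Dynamic aggregation: score each stream's output, softmax over streams.
        self.gate = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        P = self.num_streams
        # Replicate the input for each stream and prepend its learned prefix.
        x_rep = x.unsqueeze(0).expand(P, B, T, D).reshape(P * B, T, D)
        prefix = self.prefixes.unsqueeze(1).expand(P, B, -1, D).reshape(P * B, -1, D)
        h = self.backbone(torch.cat([prefix, x_rep], dim=1))[:, -T:, :]  # (P*B, T, D)
        h = h.reshape(P, B, T, D)
        # Aggregate the P streams with learned, input-dependent weights.
        w = torch.softmax(self.gate(h), dim=0)                           # (P, B, T, 1)
        return (w * h).sum(dim=0)                                        # (B, T, D)


if __name__ == "__main__":
    torch.manual_seed(0)
    # Stand-in backbone; a real LLM would be a causal decoder, not an encoder.
    dummy_backbone = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2
    )
    model = ParallelScalingWrapper(dummy_backbone, d_model=64, num_streams=4)
    print(model(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```

Because all $P$ streams share the backbone weights and run in one batched forward pass, the parameter count stays fixed while compute is multiplied, which is the trade-off ParScale exploits.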

Quick Start & Requirements

  • Install: for the cost analysis, clone the llm-analysis repository and run pip install . inside it.
  • Prerequisites: CUDA and Python; pretrained models are available on Hugging Face (see the loading sketch after this list).
  • Resources: Requires GPU for inference.
  • Links: Hugging Face Models, Paper
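
A hypothetical loading sketch using the transformers library (the model ID below is a placeholder, not a confirmed checkpoint name; see the Hugging Face link above). As noted under Limitations & Caveats, trust_remote_code=True is required because the checkpoints ship custom modeling code:

```python
# Hypothetical usage sketch; the model ID is a placeholder, not a confirmed
# checkpoint name. Check the linked Hugging Face page for real repositories.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ParScale/ParScale-1.8B-P8"  # placeholder; see the Hugging Face link
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # executes custom modeling code from the checkpoint
    device_map="auto",       # a GPU is required for practical inference
)

inputs = tokenizer("Parallel scaling works by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```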

Highlighted Details

  • Achieves $O(\log P)$ scaling, comparable to parameter scaling (see the rough illustration after this list).
  • Universal applicability across model architectures, tasks, and data.
  • Demonstrates superior inference efficiency: up to 22x smaller memory increase and 6x smaller latency increase than parameter scaling for the same performance gain (batch size = 1).
  • Enables cost-efficient training via a two-stage strategy and dynamic adaptation at inference time with frozen parameters.
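
As a back-of-the-envelope illustration of what $O(\log P)$ scaling means in practice (the coefficient below is made up, not the paper's fitted value): if $P$ streams acted like an effective parameter multiplier of roughly $1 + k \log P$, each doubling of the stream count would add a constant increment rather than doubling the benefit:

```python
# Illustrative arithmetic only; k = 0.5 is a hypothetical coefficient, not
# the paper's fitted scaling-law constant. Each doubling of P adds the same
# increment k * log(2) to the multiplier -- logarithmic, not linear, growth.
import math

k = 0.5  # hypothetical coefficient
for P in (1, 2, 4, 8, 16):
    print(f"P={P:2d}: effective parameter multiplier ~ {1 + k * math.log(P):.2f}")
```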

Maintenance & Community

The project is associated with authors from institutions such as Tsinghua University. The README does not explicitly provide further community-engagement details.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not specify limitations or caveats regarding unsupported platforms, known bugs, or alpha status. Note that loading the Hugging Face models requires trust_remote_code=True, which executes model code shipped with the checkpoint and therefore carries the usual security considerations.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
Star History: 420 stars in the last 90 days

Explore Similar Projects

Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 2 more.

matmulfreellm by ridgerchu

MatMul-free language models
Top 0.1% · 3k stars · created 1 year ago · updated 1 week ago