ParScale by QwenLM

Research paper introducing parallel scaling for language models

Created 4 months ago
444 stars

Top 67.6% on SourcePulse

View on GitHub
Project Summary

This repository introduces "Parallel Scaling" (ParScale), a novel paradigm for scaling Large Language Models (LLMs) that complements parameter scaling and inference-time scaling. It targets researchers and practitioners seeking to improve LLM performance and efficiency, offering capability gains that grow logarithmically with the number of parallel streams at a much lower memory and latency cost than equivalent parameter growth.

How It Works

ParScale applies $P$ diverse, learnable transformations to the input and processes them in parallel through the same LLM; the $P$ outputs are then dynamically aggregated into a single prediction. Both theory and experiments indicate a logarithmic scaling law ($O(\log P)$) in the number of parallel streams, suggesting that parallel computation is an efficient substitute for parameter growth, particularly on reasoning-intensive tasks.
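To make the mechanism concrete, here is a minimal PyTorch sketch of the idea, not the repository's actual implementation: the wrapper class, the additive per-stream shift (standing in for the learned input transformations), and the gating head are all illustrative assumptions.

```python
# Hedged sketch of the ParScale idea: apply P learnable input transformations,
# run them through a shared backbone in one batched forward pass, and aggregate
# the P outputs with dynamically computed weights.
import torch
import torch.nn as nn

class ParallelScalingWrapper(nn.Module):
    def __init__(self, backbone: nn.Module, hidden_size: int, num_streams: int):
        super().__init__()
        self.backbone = backbone            # shared LLM trunk
        self.num_streams = num_streams      # P parallel streams
        # One learnable additive transform per stream (a stand-in for learned prefixes).
        self.stream_shift = nn.Parameter(torch.zeros(num_streams, hidden_size))
        # Small head that scores each stream's output for dynamic aggregation.
        self.gate = nn.Linear(hidden_size, 1)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, hidden_size)
        b, s, h = hidden.shape
        # Replicate the input P times and apply a different learnable transform to each copy.
        streams = hidden.unsqueeze(0) + self.stream_shift.view(self.num_streams, 1, 1, -1)
        streams = streams.reshape(self.num_streams * b, s, h)
        outputs = self.backbone(streams)                       # one batched forward for all P streams
        outputs = outputs.reshape(self.num_streams, b, s, h)
        # Dynamic aggregation: softmax over per-stream scores, then a weighted sum.
        weights = torch.softmax(self.gate(outputs).mean(dim=2), dim=0)   # (P, batch, 1)
        return (weights.unsqueeze(2) * outputs).sum(dim=0)               # (batch, seq, hidden_size)

# Usage with a toy backbone standing in for the LLM trunk.
toy_backbone = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))
model = ParallelScalingWrapper(toy_backbone, hidden_size=64, num_streams=4)
out = model(torch.randn(2, 16, 64))   # -> shape (2, 16, 64)
```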

Quick Start & Requirements

  • Install: for the cost analysis, clone the llm-analysis repository and run pip install . inside it.
  • Prerequisites: CUDA and Python. Pretrained models are available on Hugging Face (see the loading sketch after this list).
  • Resources: Requires GPU for inference.
  • Links: Hugging Face Models, Paper
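A minimal loading sketch, assuming the checkpoints are standard transformers causal-LM repos whose parallel-scaling forward pass ships as custom remote code (hence trust_remote_code=True). The repo id below is a placeholder; substitute one of the checkpoints listed on the Hugging Face page.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<parscale-checkpoint-on-hugging-face>"   # placeholder, not a real repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,   # required: the custom modeling code lives in the model repo
    device_map="auto",        # place weights on the available GPU(s); requires accelerate
)

inputs = tokenizer("Parallel scaling lets a model", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```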

Highlighted Details

  • Achieves $O(\log P)$ scaling with the number of parallel streams, comparable to parameter scaling (a schematic form of the law follows this list).
  • Universal applicability across model architectures, tasks, and data.
  • Superior inference efficiency: up to 22x less memory increase and 6x less latency increase than parameter scaling that delivers the same performance gain (at batch size 1).
  • Cost-efficient training via a two-stage strategy, plus dynamic adaptation of $P$ at inference time with frozen parameters.
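As a schematic reading of the $O(\log P)$ claim (an assumed Chinchilla-style form, not the paper's exact fitted law), the $P$ streams act roughly as a multiplier on the effective parameter count:

$$
\mathcal{L}(N, P) \;\approx\; \frac{A}{\bigl(N \cdot (k \log P + 1)\bigr)^{\alpha}} + E
$$

where $N$ is the parameter count, $P$ the number of parallel streams, and $A$, $k$, $\alpha$, $E$ are fitted constants; the precise functional form and coefficients are given in the paper.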

Maintenance & Community

The project is associated with authors from institutions like Tsinghua University. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not list limitations or caveats such as unsupported platforms, known bugs, or alpha status. Note that loading the Hugging Face models requires "trust_remote_code=True", meaning the repository's custom modeling code is downloaded and executed locally, which is a security consideration worth reviewing.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 15 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 20 more.

alpa by alpa-projects

Top 0.0% on SourcePulse
3k stars
Auto-parallelization framework for large-scale neural network training and serving
Created 4 years ago
Updated 1 year ago
Starred by Tobi Lutke (Cofounder of Shopify), Li Jiang (Coauthor of AutoGen; Engineer at Microsoft), and 26 more.

ColossalAI by hpcaitech

Top 0.1% on SourcePulse
41k stars
AI system for large-scale parallel training
Created 3 years ago
Updated 16 hours ago