ParScale by QwenLM

Research paper introducing parallel scaling for language models

Created 8 months ago
466 stars

Top 65.2% on SourcePulse

Project Summary

This repository introduces "Parallel Scaling" (ParScale), a novel paradigm for scaling Large Language Models (LLMs) that complements parameter scaling and inference-time scaling. It targets researchers and practitioners seeking to improve LLM performance and efficiency, offering capability gains that grow logarithmically with the number of parallel streams while incurring significantly less memory and latency overhead than parameter scaling.

How It Works

ParScale applies $P$ diverse, learnable transformations to the input, processes them in parallel through the LLM, and dynamically aggregates the outputs. Theoretically and empirically, this yields a logarithmic scaling law: the gain from $P$ parallel streams is comparable to growing the parameter count by $O(\log P)$, suggesting ParScale is an efficient substitute for parameter growth, particularly on reasoning-intensive tasks.
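Below is a minimal PyTorch sketch of this mechanism, assuming a simplified setup in which each stream prepends its own learnable prefix to the input embeddings, all streams share one backbone, and a small gating head produces the dynamic aggregation weights. Class and parameter names are illustrative, not the repository's API, and a real implementation would batch the streams together rather than loop over them.

```python
# Illustrative sketch of the ParScale idea (not the repository's implementation).
# Assumptions: each of the P streams gets its own learnable prefix, all streams share
# the same backbone, and a gating head produces dynamic per-stream aggregation weights.
import torch
import torch.nn as nn


class ParallelScalingWrapper(nn.Module):
    def __init__(self, backbone: nn.Module, hidden_dim: int, num_streams: int = 4, prefix_len: int = 8):
        super().__init__()
        self.backbone = backbone                      # shared trunk (hypothetical interface: embeds -> hidden states)
        self.num_streams = num_streams
        # One learnable prefix per stream: the "diverse, learnable transformations".
        self.prefixes = nn.Parameter(torch.randn(num_streams, prefix_len, hidden_dim) * 0.02)
        # Gating head that scores each stream's output for dynamic aggregation.
        self.gate = nn.Linear(hidden_dim, 1)

    def forward(self, inputs_embeds: torch.Tensor) -> torch.Tensor:
        # inputs_embeds: (batch, seq, hidden)
        batch, seq, _ = inputs_embeds.shape
        outputs = []
        for p in range(self.num_streams):             # looped here for clarity; batch the streams in practice
            prefix = self.prefixes[p].unsqueeze(0).expand(batch, -1, -1)
            stream_in = torch.cat([prefix, inputs_embeds], dim=1)
            stream_out = self.backbone(stream_in)     # (batch, prefix + seq, hidden)
            outputs.append(stream_out[:, -seq:, :])   # keep only the positions of the original input
        stacked = torch.stack(outputs, dim=1)         # (batch, P, seq, hidden)
        # Dynamic aggregation: softmax over per-stream gate scores.
        scores = self.gate(stacked.mean(dim=2))       # (batch, P, 1)
        weights = torch.softmax(scores, dim=1).unsqueeze(2)
        return (weights * stacked).sum(dim=1)         # (batch, seq, hidden)


if __name__ == "__main__":
    # Toy backbone standing in for the shared LLM trunk.
    hidden = 64
    toy_backbone = nn.Sequential(nn.Linear(hidden, hidden), nn.GELU(), nn.Linear(hidden, hidden))
    model = ParallelScalingWrapper(toy_backbone, hidden_dim=hidden, num_streams=4)
    x = torch.randn(2, 10, hidden)                    # dummy input embeddings
    print(model(x).shape)                             # torch.Size([2, 10, 64])
```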

Quick Start & Requirements

  • Install: clone the llm-analysis repository and run pip install . inside it (needed for the cost analysis).
  • Prerequisites: CUDA and Python. Pretrained models are available on Hugging Face (a loading sketch follows this list).
  • Resources: Requires GPU for inference.
  • Links: Hugging Face Models, Paper
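The snippet below is a minimal loading sketch, assuming the Hugging Face checkpoints follow the standard transformers interface; the model id is a placeholder rather than a verified checkpoint name, so substitute an actual ParScale model from the Hub.

```python
# Minimal loading sketch (assumed standard transformers usage; model id is a placeholder).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ParScale/<checkpoint-name>"  # placeholder: pick an actual ParScale checkpoint on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,   # the README notes this is required to load the custom modeling code
    torch_dtype="auto",
    device_map="auto",        # place weights on the available GPU
)

prompt = "Explain parallel scaling for language models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```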

Highlighted Details

  • Achieves $O(\log P)$ scaling, comparable to parameter scaling.
  • Universal applicability across model architectures, tasks, and data.
  • Demonstrates superior inference efficiency: for equivalent performance gains, up to 22x less memory increase and 6x less latency increase than parameter scaling (batch size = 1).
  • Enables cost-efficient training via a two-stage strategy and dynamic adaptation at inference time with frozen parameters.

Maintenance & Community

The project is associated with authors from institutions like Tsinghua University. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not list limitations or caveats such as unsupported platforms, known bugs, or alpha status. Loading the Hugging Face models requires trust_remote_code=True, which executes the repository's custom modeling code and therefore carries the usual security considerations.

Health Check

  • Last Commit: 7 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering", "Designing Machine Learning Systems"), Jiayi Pan (author of SWE-Gym; MTS at xAI), and 20 more.

  • alpa by alpa-projects — Auto-parallelization framework for large-scale neural network training and serving. 0.0% · 3k stars · created 4 years ago · updated 2 years ago.