aqt by google

Quantized training and inference library for JAX

Created 4 years ago
337 stars

Top 81.6% on SourcePulse

Summary

AQT (Accurate Quantized Training) is a JAX library for quantizing tensor operations. It aims to produce high-quality int8 models with no hand-tuning, and it targets both quantization researchers and production ML workloads. It offers substantial training speedups and bit-exact consistency between training and serving, which simplifies deployment and avoids the quantization-induced training-serving bias typical of post-training quantization.

How It Works

AQT injects quantized tensor operations directly into JAX computations, primarily by replacing lax.dot_general, the core operation behind matmul, einsum, and conv. Because the substitution happens at the operation level, quantization can be applied to any JAX computation, not only neural networks. Under its "What You Train Is What You Serve" (WYSIWYS) principle, quantization is folded into the model checkpoint, so training and serving are bit-exact. Serving then avoids re-quantization overhead and needs less memory bandwidth.
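To make the idea of a quantized lax.dot_general concrete, here is a minimal sketch in plain JAX. It is not AQT's implementation or API: the helper names quantize_int8 and int8_dot_general are hypothetical, and the symmetric per-channel scaling with round-to-nearest is an assumed scheme chosen for illustration.

```python
import jax
import jax.numpy as jnp
from jax import lax


def quantize_int8(x, axis):
    """Symmetric int8 quantization with one scale per slice along `axis`.

    Hypothetical helper for illustration; not part of AQT.
    """
    # Pick the scale so the largest magnitude along `axis` maps to 127.
    max_abs = jnp.max(jnp.abs(x), axis=axis, keepdims=True)
    scale = jnp.where(max_abs > 0, max_abs / 127.0, 1.0)
    q = jnp.clip(jnp.round(x / scale), -127, 127).astype(jnp.int8)
    return q, scale


def int8_dot_general(lhs, rhs):
    """Quantized stand-in for a 2-D matmul routed through lax.dot_general."""
    lhs_q, lhs_scale = quantize_int8(lhs, axis=1)  # (m, k), scale (m, 1)
    rhs_q, rhs_scale = quantize_int8(rhs, axis=0)  # (k, n), scale (1, n)
    # Contract lhs dim 1 with rhs dim 0; there are no batch dimensions.
    dims = (((1,), (0,)), ((), ()))
    # Multiply in int8 and accumulate in int32 to avoid overflow.
    acc = lax.dot_general(lhs_q, rhs_q, dims,
                          preferred_element_type=jnp.int32)
    # Dequantize: the scales factor out of the contraction.
    return acc.astype(jnp.float32) * lhs_scale * rhs_scale


key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.normal(key_a, (4, 64))
b = jax.random.normal(key_b, (64, 8))
exact = a @ b
approx = int8_dot_general(a, b)
rel_err = jnp.linalg.norm(approx - exact) / jnp.linalg.norm(exact)
print("relative error:", float(rel_err))
```

The scales are computed along the contraction axis so that they can be applied after the integer matmul, which keeps the expensive part of the computation in int8.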

Quick Start & Requirements

  • Installation: pip install aqtp
  • Prerequisites: JAX, Flax. Examples utilize jax.numpy, numpy, and matplotlib.
  • Usage: Import from aqt.jax.v2, then integrate into a JAX framework (Flax, Pax) by substituting AQT's quantized variant for lax.dot_general.
  • Resources: No specific setup time or resource footprint detailed; assumes standard ML training environments.
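The substitution pattern described under Usage can be sketched without any framework. In the example below, the layer function dense takes its dot_general as an argument, so a quantized variant with the same signature can be swapped in. Everything here is an assumption made for illustration: fake_quant, quantized_dot_general, and dense are hypothetical names, and the per-tensor fake quantization is a stand-in rather than AQT's quantized variant.

```python
import jax
import jax.numpy as jnp
from jax import lax


def fake_quant(x, bits=8):
    """Round x onto a symmetric integer grid, then map it back to float."""
    qmax = 2.0 ** (bits - 1) - 1
    max_abs = jnp.max(jnp.abs(x))
    scale = jnp.where(max_abs > 0, max_abs / qmax, 1.0)
    return jnp.round(x / scale) * scale


def quantized_dot_general(lhs, rhs, dimension_numbers, precision=None,
                          preferred_element_type=None):
    """Hypothetical drop-in that mirrors the lax.dot_general signature."""
    return lax.dot_general(fake_quant(lhs), fake_quant(rhs),
                           dimension_numbers, precision=precision,
                           preferred_element_type=preferred_element_type)


def dense(x, w, dot_general=lax.dot_general):
    """A layer that exposes its dot_general as an injection point."""
    # Contract the last dim of x with the first dim of w.
    dims = (((x.ndim - 1,), (0,)), ((), ()))
    return dot_general(x, w, dims)


x = jax.random.normal(jax.random.PRNGKey(0), (2, 16))
w = jax.random.normal(jax.random.PRNGKey(1), (16, 4))
y_float = dense(x, w)                                     # default float path
y_quant = dense(x, w, dot_general=quantized_dot_general)  # injected variant
max_diff = jnp.max(jnp.abs(y_float - y_quant))
```

Recent Flax releases follow the same idea: linen layers such as nn.Dense accept a dot_general argument, so the layer code itself does not need to change.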

Highlighted Details

  • Achieves excellent int8 model quality without manual tuning.
  • Delivers significant training speedups on contemporary ML accelerators.
  • Guarantees bit-exact model behavior between training and serving (WYSIWYS).
  • Features flexible configuration for quantization parameters (bits, calibration, rounding).
  • Accelerates backpropagation, with a reported 1.2x to 1.4x reduction in step time on large Transformer models.
  • Supports "serving conversion" for inference optimization via checkpointed quantized weights.
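The WYSIWYS and serving-conversion points above can be illustrated with a small sketch. The training path quantizes float weights on every call, while the serving path reads int8 weights and scales that were computed once and stored. Because both paths run the same integer arithmetic, their outputs match exactly. This is an assumed, simplified scheme: the function names and the dictionary used as a checkpoint are hypothetical and do not reflect AQT's checkpoint format.

```python
import jax
import jax.numpy as jnp
from jax import lax

DIMS = (((1,), (0,)), ((), ()))  # plain (m, k) x (k, n) matmul


def quantize(x, axis):
    """Symmetric int8 quantization with scales along the contraction axis."""
    max_abs = jnp.max(jnp.abs(x), axis=axis, keepdims=True)
    scale = jnp.where(max_abs > 0, max_abs / 127.0, 1.0)
    q = jnp.clip(jnp.round(x / scale), -127, 127).astype(jnp.int8)
    return q, scale


def int8_matmul(x_q, x_scale, w_q, w_scale):
    acc = lax.dot_general(x_q, w_q, DIMS, preferred_element_type=jnp.int32)
    return acc.astype(jnp.float32) * x_scale * w_scale


def train_forward(x, w_float):
    """Training path: float weights are quantized on the fly."""
    x_q, x_scale = quantize(x, axis=1)
    w_q, w_scale = quantize(w_float, axis=0)
    return int8_matmul(x_q, x_scale, w_q, w_scale)


def convert_for_serving(w_float):
    """One-time conversion: keep only int8 weights and their scales."""
    w_q, w_scale = quantize(w_float, axis=0)
    return {"w_q": w_q, "w_scale": w_scale}


def serve_forward(x, ckpt):
    """Serving path: weights are already quantized, only x is quantized."""
    x_q, x_scale = quantize(x, axis=1)
    return int8_matmul(x_q, x_scale, ckpt["w_q"], ckpt["w_scale"])


x = jax.random.normal(jax.random.PRNGKey(0), (4, 32))
w = jax.random.normal(jax.random.PRNGKey(1), (32, 8))
ckpt = convert_for_serving(w)
train_out = train_forward(x, w)
serve_out = serve_forward(x, ckpt)
bit_exact = bool(jnp.array_equal(train_out, serve_out))

# Stored weight bytes: int8 takes one byte per element, float32 takes four.
float_bytes = w.size * w.dtype.itemsize
int8_bytes = ckpt["w_q"].size * ckpt["w_q"].dtype.itemsize
```

In this sketch the stored weights shrink by a factor of four relative to float32 (ignoring the small per-channel scale array), which is where the reduced memory bandwidth at inference comes from.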

Maintenance & Community

AQT is developed by Google and has been validated with Google frameworks such as Flax, Pax, and MaxText. The README lists AQTv1 as slated for deprecation in early Q1 2024 and recommends AQTv2. Issues are reported through GitHub. No community channels (Discord, Slack) or social handles are listed.

Licensing & Compatibility

The provided README does not state a license. Without one, usage rights for commercial use or integration into closed-source projects cannot be assessed from the README alone.

Limitations & Caveats

Multiple versions coexist, and AQTv1 is scheduled for deprecation, so new work should target AQTv2. The main adoption risk is the missing license statement, which leaves usage rights and compatibility unclear.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days
