Quantized training and inference library for JAX
Summary
AQT (Accurate Quantized Training) is a JAX library for tensor operation quantization, enabling high-quality int8 models with zero hand-tuning. It targets researchers and production ML workloads, delivering substantial training speedups and bit-exact training-to-serving consistency, simplifying deployment and mitigating quantization-induced bias.
How It Works
AQT injects quantized tensor operations, primarily lax.dot_general (the core primitive behind matmul, einsum, and conv), directly into JAX computations. Because it operates at this level, quantization can be applied to any JAX computation, not just neural networks. A key advantage is its "What You Train Is What You Serve" (WYSIWYS) principle: bit-exact consistency between training and serving, achieved by folding quantization scales into model checkpoints. This eliminates re-quantization overhead and reduces memory bandwidth during inference.
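The core idea, quantizing both operands to int8, accumulating in a wider integer type, and rescaling back to float, can be sketched in plain numpy. This is an illustration of the arithmetic only; quantize_int8 and int8_matmul are hypothetical names, not AQT's actual API.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: map floats onto [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a, b):
    """Quantized matmul: int8 operands, int32 accumulation, float rescale."""
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    # Widen before the matmul so the int32 accumulator cannot overflow int8.
    acc = qa.astype(np.int32) @ qb.astype(np.int32)
    return acc.astype(np.float32) * (sa * sb)
```

In AQT itself this arithmetic lives behind a drop-in replacement for jax.lax.dot_general, so any matmul, einsum, or convolution that lowers to dot_general picks up the quantization automatically.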
Quick Start & Requirements
Install via pip install aqtp. The examples additionally depend on jax.numpy, numpy, and matplotlib. The recommended API lives in aqt.jax.v2. AQT is designed for integration into JAX frameworks (Flax, Pax) by substituting lax.dot_general with AQT's quantized variant.
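Once integrated, serving benefits from the quantization scales being folded into the checkpoint: weights are quantized once at checkpoint-write time, and the serving path never re-quantizes. A conceptual numpy sketch (fold_weights and serve_matmul are hypothetical names, not AQT's API):

```python
import numpy as np

def fold_weights(w):
    """Quantize weights once, at checkpoint-write time (symmetric int8)."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def serve_matmul(x, q_weights, scale):
    """Serving path: int8 weights loaded from the checkpoint are
    dequantized on the fly; no re-quantization step is needed."""
    return x @ (q_weights.astype(np.float32) * scale)
```

Storing int8 weights roughly quarters checkpoint size relative to float32 and cuts memory bandwidth at inference, which is the benefit behind the WYSIWYS claim above.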
Maintenance & Community
AQT is a Google-developed library, validated across internal frameworks like Flax, Pax, and MaxText. AQTv1 is slated for deprecation in early Q1 2024, with AQTv2 as the recommended version. Issue reporting is via GitHub. No community channels (Discord, Slack) or social handles are provided.
Licensing & Compatibility
The provided README omits explicit licensing information. This absence is a critical blocker for assessing commercial use or integration into closed-source projects.
Limitations & Caveats
The library includes multiple versions, with AQTv1 scheduled for deprecation. The most significant limitation for adoption is the lack of stated licensing, preventing clear understanding of usage rights and compatibility.