The Aquila2 project provides open-source large language and chat models, including 7B, 34B, and 70B parameter variants, with a focus on strong performance across various benchmarks and long-context understanding. It is targeted at researchers and developers looking to leverage advanced LLMs for diverse applications, offering fine-tuning capabilities and efficient inference options.
How It Works
Aquila2 models are based on the Transformer architecture. Long-context variants such as AquilaChat2-34B-16K extend context understanding through positional encoding interpolation and supervised fine-tuning on extensive conversation datasets. Pretraining uses FlagScale, a framework built on Megatron-LM for efficient large-scale training.
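The project does not spell out the interpolation mechanics here; as an illustrative sketch only, linear positional interpolation for rotary position embeddings (RoPE) rescales position indices so a longer context maps into the position range seen during training. The function name, dimensions, and scale below are hypothetical:

```python
import torch

def rope_angles(dim: int, positions: torch.Tensor,
                base: float = 10000.0, scale: float = 1.0) -> torch.Tensor:
    # Standard RoPE frequencies: one inverse frequency per pair of dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Positional interpolation: scale < 1 compresses positions, so a model
    # trained on 4K positions can attend over 16K tokens (scale = 4096/16384).
    return torch.outer(positions.float() * scale, inv_freq)

# Illustrative: squeeze 16K token positions into a 4K-trained position range.
angles = rope_angles(dim=128, positions=torch.arange(16384), scale=4096 / 16384)
```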
Quick Start & Requirements
- Installation: `pip install -r requirements.txt`
- Prerequisites: Python 3.10+, PyTorch 1.12+ (2.0+ recommended), Transformers 4.32+, CUDA 11.4+ (recommended for GPU/flash-attention).
- Optional: flash-attention for faster inference and lower memory use; a Docker image is available.
- Resources: Inference examples are provided for single- and multi-GPU setups (see the sketch after this list). Fine-tuning scripts for the 7B and 34B models (full-parameter, LoRA, Q-LoRA) are also available.
- Links: Hugging Face, ModelScope, FlagOpen
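A minimal inference sketch using the Hugging Face Transformers API; the model id `BAAI/AquilaChat2-7B`, the prompt, and the generation settings are assumptions for illustration, not the project's own example code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BAAI/AquilaChat2-7B"  # assumed Hugging Face model id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to reduce GPU memory
    device_map="auto",           # spread layers across GPUs (needs `accelerate`)
    trust_remote_code=True,      # Aquila2 ships custom model code
)
model.eval()

inputs = tokenizer("What is quantum computing?", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```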
Highlighted Details
- Aquila2-34B v1.2 shows significant improvements on reasoning and comprehension datasets, approaching GPT-3.5 levels.
- Long-context models (e.g., AquilaChat2-34B-16K) demonstrate leading performance among open-source options, comparable to GPT-3.5-16K.
- Supports 4-bit quantization (bitsandbytes, GPTQ) and AWQ for a reduced memory footprint with minimal performance loss (see the sketch after this list).
- Fine-tuning scripts for full-parameter, LoRA, and Q-LoRA are provided for 7B and 34B models.
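As a hedged illustration of the 4-bit path, an NF4 load through Transformers' `BitsAndBytesConfig` might look like the following; the model id and config values are assumptions, and the repo's own quantization scripts may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for matmuls on dequantized weights
    bnb_4bit_quant_type="nf4",             # NormalFloat4, common for LLM weights
)

model_id = "BAAI/AquilaChat2-7B"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```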
Maintenance & Community
- Active development with recent releases of 70B models and performance updates for 34B models.
- Community contributions are encouraged via GitHub Issues and Pull Requests. WeChat groups are available for contact.
Licensing & Compatibility
- Project License: Apache 2.0.
- Model Licenses: BAAI Aquila Model License Agreement for 7B/34B models, and a specific BAAI Aquila 70B Model License Agreement for 70B models. These may have restrictions on commercial use or redistribution.
Limitations & Caveats
- A GSM8K data-leakage issue in pre-training was identified and addressed, and the affected results were removed.
- The 70B models are experimental.
- FlagScale, the pretraining framework, is in its early stages.