360zhinao by Qihoo360

LLM series for base, chat, search, and reranking models

Created 1 year ago · 290 stars · Top 91.7% on sourcepulse

Project Summary

The 360Zhinao project provides a suite of large language models, including 7B-parameter base and chat variants with context lengths up to 360K tokens, alongside specialized retrieval and reranking models. It targets developers and researchers seeking high-performance Chinese and multilingual NLP capabilities, offering competitive benchmark results and extensive context handling.

How It Works

The 7B models are trained on a 3.4-trillion-token corpus emphasizing Chinese, English, and code. The chat models achieve extended context lengths through a two-stage approach: continual pretraining with an increased RoPE base at a 32K context length, followed by fine-tuning on long-context data, including synthetic multi-document and single-document QA. The retrieval and reranking models combine specialized architectures with fine-tuning techniques, including data filtering, source enhancement, and negative-example mining, to achieve state-of-the-art performance on benchmarks like C-MTEB.
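
To illustrate the RoPE-base idea, here is a minimal sketch of the general technique (not the project's actual training code): raising the base lowers the rotary frequencies, so positions far beyond the original window still map to distinguishable phase angles.

```python
import torch

def rope_frequencies(dim: int, base: float) -> torch.Tensor:
    """Per-pair inverse frequencies used by rotary position embeddings (RoPE)."""
    return 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))

# Illustrative values only: a 128-dim attention head, the common default base
# of 10,000, and a larger base of the kind used when continually pretraining
# toward longer contexts.
head_dim = 128
default_freqs = rope_frequencies(head_dim, base=10_000.0)
extended_freqs = rope_frequencies(head_dim, base=1_000_000.0)

# The lowest frequency shrinks as the base grows, stretching each rotation's
# wavelength so that distant positions remain distinguishable.
print(default_freqs[-1].item(), extended_freqs[-1].item())
```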

Quick Start & Requirements

  • Installation: pip install -r requirements.txt (Python >= 3.8, PyTorch >= 2.0, Transformers >= 4.37.2, CUDA >= 11.4). Flash-Attention 2 is recommended for performance.
  • Inference: Demos are provided for Hugging Face Transformers and ModelScope (a minimal Transformers sketch follows this list). vLLM deployment is supported with specific installation steps.
  • Finetuning: Scripts provided using DeepSpeed for multi-GPU training.
  • Resources: Requires significant GPU resources for inference and fine-tuning.
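
As a rough starting point, the sketch below loads a chat model with Hugging Face Transformers. The model ID is an assumption based on the project's naming on Hugging Face; the repository's demos show the exact identifiers and the expected chat message format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: model ID assumed from the project's Hugging Face naming; check the
# repository's model links for the exact identifier.
MODEL_ID = "qihoo360/360Zhinao-7B-Chat-4K"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",       # spread weights across available GPUs
    torch_dtype="auto",      # keep the checkpoint's native precision
    trust_remote_code=True,  # the repo ships custom modeling code
)

inputs = tokenizer("Briefly introduce yourself.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

A plain generate call is used here for brevity; the project's own demo scripts handle the chat formatting expected by the chat variants.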

Highlighted Details

  • 360Zhinao-7B-Chat-360K offers the longest context window among open-source Chinese models (as of April 2024).
  • Achieves over 98% accuracy on English and Chinese NeedleInAHaystack tasks for long-context evaluation.
  • 360Zhinao-search and 360Zhinao-1.8B-Reranking models ranked first on C-MTEB Retrieval and Reranking leaderboards, respectively.
  • Supports Int4 quantization via AutoGPTQ (a loading sketch follows this list).
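
Since recent Transformers versions can load GPTQ checkpoints directly when auto-gptq and optimum are installed, loading an Int4 variant should look roughly like this sketch. The -Int4 model ID is hypothetical; consult the repository for the quantized checkpoints it actually publishes.

```python
# Assumes: pip install auto-gptq optimum (on top of the base requirements)
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Int4 checkpoint name; check the repository's model list for
# the quantized weights it actually publishes.
MODEL_ID = "qihoo360/360Zhinao-7B-Chat-4K-Int4"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    trust_remote_code=True,  # quantization settings are read from the checkpoint config
)
```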

Maintenance & Community

The project is actively updated, with recent releases including specialized search and reranking models and extensions to Llama 3. Links to Hugging Face, ModelScope, and a technical report are provided.

Licensing & Compatibility

The source code is licensed under Apache 2.0. Commercial use of the models requires applying to the developers and complying with the separate "360 Zhinao Open-Source Model License."

Limitations & Caveats

The CLI demo does not support the mps device on macOS. Commercial use requires explicit permission and adherence to a separate license agreement, which may impose restrictions beyond the Apache 2.0 code license.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 90 days
