Yi-1.5 by 01-ai

Yi-1.5: upgraded open-source language model series

Created 1 year ago

555 stars

Top 57.6% on SourcePulse

2 Experts Love This Project

hiyouga

Author of LLaMA-Factory

abidlabs

Cofounder of Gradio

Project Summary

Yi-1.5 is a suite of large language models (LLMs) offering enhanced capabilities in coding, math, reasoning, and instruction following. Targeting developers and researchers, it provides improved performance over its predecessor, Yi, with models available in 34B, 9B, and 6B parameter sizes.

How It Works

Yi-1.5 is built upon a foundation of continuous pre-training on a 500 billion token corpus, followed by fine-tuning on 3 million diverse samples. This extensive training regimen aims to bolster its proficiency in complex cognitive tasks while retaining strong language understanding and commonsense reasoning abilities.

Quick Start & Requirements

Installation: pip install -r requirements.txt
Prerequisites: Python 3.10+, transformers library. Models can be downloaded from Hugging Face, ModelScope, or WiseModel.
Local Inference: Example provided using Hugging Face transformers for local execution on CUDA-enabled GPUs.
Ollama: Supports running Yi-1.5 models locally via ollama run yi:v1.5.
vLLM: Deployment via vLLM's OpenAI-compatible API server is supported.
Web Demo: Local web demo available via python demo/web_demo.py -c <your-model-path>.
Docs: Yi Cookbook, FAQ, Learning Hub

Highlighted Details

Enhanced performance in coding, math, reasoning, and instruction-following.
Available in 34B, 9B, and 6B parameter sizes.
OpenAI-compatible API available via Yi Platform, Replicate, and OpenRouter.
Supports fine-tuning with popular frameworks like LLaMA-Factory, Swift, XTuner, and Firefly.

Maintenance & Community

Active community support via Discord and Twitter.
Discord, Twitter

Licensing & Compatibility

Licensed under Apache 2.0.
Derivative works require attribution to 01.AI.

Limitations & Caveats

The README does not specify hardware requirements for each model size or provide explicit benchmarks comparing Yi-1.5 against other leading LLMs.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

0

Issues (30d)

0

Star History

0 stars in the last 30 days

Explore Similar Projects

Aurora by WangRongsheng

Code for a research paper on instruction-tuning a Chinese chat model

Created 2 years ago

Updated 1 year ago

EXAONE-Deep by LG-AI-EXAONE

EXAONE Deep: Reasoning-focused language models (2.4B-32B params)

Created 10 months ago

Updated 7 months ago

Cheetah by DCDmllm

Multimodal LLM for following zero-shot demonstrative instructions

Created 2 years ago

Updated 1 year ago

Deepdive-llama3-from-scratch by therealoliver

Llama3 inference walkthrough, step-by-step

Created 10 months ago

Updated 10 months ago

Starred by

Maxime Labonne

Maxime Labonne(Head of Post-Training at Liquid AI),

Omar Sanseviero

Omar Sanseviero(DevRel at Google DeepMind), and

1 more.

bonito by BatsResearch

Synthetic data generator for instruction tuning datasets

Created 1 year ago

Updated 6 months ago

CoLLiE by OpenMOSS

LLM training toolkit for efficient collaborative tuning

Created 2 years ago

Updated 1 year ago

step_into_llm by mindspore-lab

Online course for large language model (LLM) techniques using MindSpore

Created 2 years ago

Updated 2 weeks ago

Starred by

Yaowei Zheng

Yaowei Zheng(Author of LLaMA-Factory).

GLM-V by zai-org

Multimodal reasoning model with a "thinking" paradigm

Created 6 months ago

Updated 3 weeks ago

Starred by

Thomas Wolf

Thomas Wolf(Cofounder of Hugging Face),

Omar Sanseviero

Omar Sanseviero(DevRel at Google DeepMind), and

1 more.

starcoder2 by bigcode-project

Code generation model family (3B, 7B, 15B) for code completion

Created 2 years ago

Updated 1 year ago

Chinese-Vicuna by Facico

Chinese LLaMA fine-tuning project for instruction-following

Created 2 years ago

Updated 8 months ago

Starred by

Vincent Weisser

Vincent Weisser(Cofounder of Prime Intellect),

Chaoyu Yang

Chaoyu Yang(Founder of Bento), and

11 more.

mistral-inference by mistralai

Inference library for Mistral models

Created 2 years ago

Updated 1 month ago

Starred by

Didier Lopes

Didier Lopes(Founder of OpenBB),

Chip Huyen

Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems"), and

3 more.

DeepSeek-Coder-V2 by deepseek-ai

Open-source code language model comparable to GPT4-Turbo

Created 1 year ago

Updated 2 months ago

Feedback? Help us improve.