Chinese-LLaVA by LinkSoul-AI

Open-source, commercially usable multimodal model for bilingual visual-text dialogue

Created 2 years ago
377 stars

Top 75.4% on SourcePulse

Project Summary

This repository provides Chinese-LLaVA, an open-source, commercially usable multimodal model for bilingual (Chinese and English) visual-text dialogue. It also includes the Chinese-LLaVA-Vision-Instructions dataset for visual instruction tuning. The project targets researchers and developers working on multimodal AI in Chinese and English.

How It Works

Chinese-LLaVA builds on the LLaVA architecture, pairing Chinese-capable language models such as Chinese-Llama-2-7B and Baichuan-7B with a visual encoder. This lets the model interpret an image together with a text prompt and respond in either Chinese or English.

Quick Start & Requirements

  • Install: Clone the repository, create a conda environment (conda create -n Cllava python=3.10), activate it (conda activate Cllava), and install the package (pip install -e .).
  • Prerequisites: Python 3.10.
  • Demo: Available at HuggingFace Spaces.
  • Model Downloads: HuggingFace and Baidu Netdisk links provided for Chinese-LLaVA-Chinese-Llama-2-7B and Chinese-LLaVA-Baichuan-7B.
  • Dataset: Available on HuggingFace Datasets and Baidu Netdisk.
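
The install steps above can be sketched as a small shell helper. This is a minimal sketch, not the project's own script: the repository URL is an assumption inferred from the project and owner names, and the names run, install_chinese_llava, REPO_URL, ENV_NAME, and DRY_RUN are hypothetical and introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the install steps listed above.
# Assumption: REPO_URL is inferred from the project and owner names.
REPO_URL="${REPO_URL:-https://github.com/LinkSoul-AI/Chinese-LLaVA.git}"
ENV_NAME="${ENV_NAME:-Cllava}"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Clone the repository, create and activate the Python 3.10 environment,
# then install the package in editable mode from the repository root.
# Note: `conda activate` needs conda's shell hook to be loaded (conda init).
install_chinese_llava() {
  run git clone "$REPO_URL" Chinese-LLaVA &&
  run cd Chinese-LLaVA &&
  run conda create -y -n "$ENV_NAME" python=3.10 &&
  run conda activate "$ENV_NAME" &&
  run pip install -e .
}
```

After sourcing the snippet, DRY_RUN=1 install_chinese_llava prints the five commands without executing them, which is a cheap way to review the steps before running install_chinese_llava for real.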

Highlighted Details

  • Supports both Chinese and English in multimodal dialogue.
  • Offers pre-trained models based on Chinese-Llama-2-7B and Baichuan-7B.
  • Includes a bilingual visual instruction dataset for fine-tuning.
  • Provides a quick inference script for testing.

Maintenance & Community

  • WeChat group available for communication.

Licensing & Compatibility

  • License: Apache-2.0.
  • Compatibility: The permissive Apache-2.0 license allows commercial use.

Limitations & Caveats

The README lists training details, int4 quantization, and Docker deployment as "TODO" items, so these features may be incomplete or still under development.

Health Check

  • Last commit: 2 years ago.
  • Responsiveness: Inactive.
  • Pull requests (30d): 0.
  • Issues (30d): 0.
  • Star history: 3 stars in the last 30 days.
