Inference code for the Persimmon-8B LLM
This repository provides inference code for Persimmon-8B, a large language model from Adept AI. It lets users download and run both the base and chat-fine-tuned versions of the model for text generation.
How It Works
The inference code is designed to serve the Persimmon-8B model via a REST API. It leverages a Dockerized environment for dependency management and ease of deployment. The core functionality involves loading the model weights, processing input prompts according to a specific chat format (human: {prompt}\n\nadept:), and generating text outputs.
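As a minimal sketch of the chat format quoted above, the helper below wraps a user message in the human/adept template. The names `CHAT_TEMPLATE` and `build_chat_prompt` are hypothetical and not taken from the repository, and trimming surrounding whitespace from the message is an assumption rather than documented behavior.

```python
# Template copied from the chat format described above.
CHAT_TEMPLATE = "human: {prompt}\n\nadept:"


def build_chat_prompt(prompt: str) -> str:
    """Wrap a user message in the human/adept chat format.

    Hypothetical helper for illustration only. Stripping surrounding
    whitespace is an assumption, not something the source specifies.
    """
    return CHAT_TEMPLATE.format(prompt=prompt.strip())


if __name__ == "__main__":
    # The formatted string is what would be passed to the model as input.
    print(build_chat_prompt("What is a persimmon?"))
```

The resulting string ends with `adept:` so that the model's generated text continues as the assistant's turn.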
Quick Start & Requirements
Build the Docker image with docker build -f docker/Dockerfile -t 'adeptdocker' . and then run sh docker_launch.sh.
Highlighted Details
Maintenance & Community
No specific community channels or maintenance details are provided in the README.
Licensing & Compatibility
Limitations & Caveats
The chat model is licensed under CC-BY-NC 4.0, which prohibits commercial use. Running the model without memory optimizations requires an 80GB GPU; workarounds may allow it to run on 40GB cards.