adept-inference  by persimmon-ai-labs

Inference code for the Persimmon-8B LLM

created 1 year ago
415 stars

Top 71.7% on sourcepulse

Project Summary

This repository provides inference code for Persimmon-8B, a large language model from Adept AI. It lets users download and run both the base and chat-fine-tuned versions of the model for text generation.

How It Works

The inference code serves the Persimmon-8B model via a REST API and runs in a Dockerized environment, which handles dependency management and simplifies deployment. It loads the model weights, wraps each input prompt in the chat format human: {prompt}\n\nadept: (where \n\n is two newline characters), and generates text output.
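The chat format described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the repository: the function name format_chat_prompt is hypothetical, and the sketch assumes the template is applied exactly as quoted, with no trailing space after "adept:".

```python
def format_chat_prompt(prompt: str) -> str:
    """Wrap a user message in the chat format quoted in the summary.

    Hypothetical helper for illustration; the repository may build
    the prompt differently.
    """
    # "human: " + message, then a blank line, then "adept:" as the cue
    # for the model to begin its reply.
    return f"human: {prompt}\n\nadept:"


formatted = format_chat_prompt("What is the capital of France?")
print(repr(formatted))
```

The model's completion would then be whatever text it generates after the final "adept:" marker.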

Quick Start & Requirements

  • Install/Run: Build the Docker image with docker build -f docker/Dockerfile -t 'adeptdocker' . and then launch it with sh docker_launch.sh.
  • Prerequisites: Requires an 80GB GPU for naive execution. A 40GB GPU may suffice after removing unused embeddings, reducing the sequence length, or using 8-bit quantization.
  • Model Download: Checkpoints are available via OCI bucket links provided in the README.
  • Documentation: User guide and model details are in the README.
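As a rough illustration of why 8-bit quantization eases the GPU requirement above, weight storage scales linearly with the number of bytes per parameter. The sketch below assumes a nominal 8 billion parameters (taken from the model name, not a measured count) and covers weights only. Activations, the attention cache, and framework overhead are not modeled, so the numbers do not reproduce the 80GB or 40GB figures.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: nominal parameter count inferred from the "8B" in the model name.
NOMINAL_PARAMS = 8e9

fp16_gb = weight_memory_gb(NOMINAL_PARAMS, 2)  # 16-bit weights: 2 bytes each
int8_gb = weight_memory_gb(NOMINAL_PARAMS, 1)  # 8-bit weights: 1 byte each

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"8-bit weights:  {int8_gb:.1f} GB")
```

Under these assumptions, 8-bit storage halves the weight footprint. How much total memory that saves in practice depends on sequence length and the other factors left out here.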

Highlighted Details

  • Offers both a base and a chat-fine-tuned version of Persimmon-8B.
  • Requires specific prompt formatting for optimal chat model performance.
  • Runs with a tensor parallelism of 1, so the model is not sharded across GPUs and must fit in the memory of a single device.

Maintenance & Community

No specific community channels or maintenance details are provided in the README.

Licensing & Compatibility

  • Base model: Apache 2.0 license.
  • Chat model: CC-BY-NC 4.0 license (Non-Commercial use).

Limitations & Caveats

The chat model's CC-BY-NC 4.0 license restricts commercial use. Running the model without modification requires an 80GB GPU; the workarounds listed under Quick Start & Requirements may allow it to fit on a 40GB card.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
