ReplitLM by replit

Inference code and configs for ReplitLM model family

Created 2 years ago
996 stars

Top 37.4% on SourcePulse

Project Summary

This repository provides inference code and configurations for Replit's family of code-focused large language models, ReplitLM. It targets developers and researchers looking to leverage or fine-tune code generation models, offering integration with Hugging Face Transformers and MosaicML's LLM Foundry for advanced training.

How It Works

ReplitLM models are designed for code understanding and generation. The repository facilitates their use via Hugging Face Transformers, allowing direct loading and inference. For fine-tuning and further training, it strongly recommends MosaicML's LLM Foundry and Composer, which provide optimized training pipelines, state-of-the-art techniques, and PyTorch-based components for efficient model adaptation on custom datasets.
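The Transformers loading path described above can be sketched as follows. This is a minimal, hedged example assuming the `transformers` package; the `generate_code` helper name and the sampling parameters are illustrative, not from the repo. The heavy model download is deferred into the function body so nothing is fetched at import time, and `trust_remote_code=True` is passed because the model ships custom modeling code on the Hub.

```python
def generate_code(prompt: str, max_new_tokens: int = 64) -> str:
    """Sketch: load replit/replit-code-v1-3b and complete `prompt`.

    Imports are deferred so the (large) model is only downloaded when
    the function is actually called. `trust_remote_code=True` is needed
    because the checkpoint includes custom model/tokenizer code.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "replit/replit-code-v1-3b", trust_remote_code=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        "replit/replit-code-v1-3b", trust_remote_code=True
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.2,  # low temperature: code generation favors determinism
    )
    # Decode the full sequence, dropping special tokens.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate_code("def fibonacci(n):")` would return the prompt plus a sampled completion, at the cost of downloading the ~3B-parameter checkpoint on first use.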

Quick Start & Requirements

  • Inference: Models are available on Hugging Face (replit/replit-code-v1-3b) and load directly with the Hugging Face Transformers library.
  • Training: Requires installing LLM Foundry and Composer, and converting your dataset to the Mosaic StreamingDataset (MDS) format.
  • Prerequisites: Python, PyTorch, Hugging Face libraries. LLM Foundry setup is recommended via Docker. Specific requirements are detailed in requirements.txt.
  • Links: Hosted Demo, LLM Foundry, Composer.
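The dataset-conversion step above can be sketched with the `mosaicml-streaming` package's `MDSWriter`. This is an assumption-laden sketch: the JSONL input layout, the `text` column name, and the `jsonl_to_mds` helper are illustrative, not the repo's actual conversion script.

```python
import json


def jsonl_to_mds(jsonl_path: str, out_dir: str) -> int:
    """Sketch: write each JSONL record's 'text' field into an MDS shard
    directory that LLM Foundry's streaming dataloader can read.

    Returns the number of samples written. The `streaming` import is
    deferred since it is an optional training-only dependency.
    """
    from streaming import MDSWriter

    count = 0
    # columns maps field name -> MDS dtype; 'str' is the plain-text type.
    with MDSWriter(out=out_dir, columns={"text": "str"}, compression="zstd") as writer:
        with open(jsonl_path) as f:
            for line in f:
                record = json.loads(line)
                writer.write({"text": record["text"]})
                count += 1
    return count
```

The resulting shard directory can then be pointed at from an LLM Foundry training YAML as a local StreamingDataset source.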

Highlighted Details

  • replit-code-v1-3b model available, with v1_5 coming soon.
  • Models are trained on a mixture of 20 languages, emphasizing programming languages such as Python, JavaScript, and Java, plus markup formats like Markdown.
  • Detailed guides for instruction tuning using Hugging Face Transformers (Alpaca-style) and LLM Foundry.
  • Workaround provided for saving checkpoints with tokenizers that include .py files when using Composer.
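The Alpaca-style instruction tuning mentioned above hinges on a fixed prompt template. The sketch below follows the standard Alpaca convention (instruction, optional input, response); the repo's own tuning scripts may differ in detail.

```python
# Standard Alpaca prompt templates (assumed; the repo's guide may vary).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def format_example(instruction: str, input_text: str = "") -> str:
    """Render one training example in the Alpaca prompt format.

    The target completion is appended after '### Response:\n' during
    dataset construction; only the prompt half is built here.
    """
    if input_text:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return PROMPT_NO_INPUT.format(instruction=instruction)
```

During fine-tuning, each `(prompt, response)` pair is concatenated and tokenized, with loss typically masked over the prompt tokens.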

Maintenance & Community

The project is maintained by Replit, though the health check below shows recent commit activity has stalled. Community channels (e.g., Discord or Slack) are not mentioned in the README.

Licensing & Compatibility

  • Model Checkpoints & Vocabulary: CC BY-SA 4.0
  • Code: Apache 2.0
  • The CC BY-SA 4.0 license for models requires attribution and sharing of derivatives under the same license, which may have implications for commercial use or closed-source integration.

Limitations & Caveats

The replit-code-v1_5-3b model is listed as "Coming Soon." A workaround is required for saving checkpoints with certain tokenizers when using LLM Foundry/Composer, indicating potential integration friction. The CC BY-SA 4.0 license for models may restrict commercial applications.

Health Check
Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
5 stars in the last 30 days

Explore Similar Projects

Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

SwissArmyTransformer by THUDM

0.3%
1k
Transformer library for flexible model development
Created 4 years ago
Updated 8 months ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 11 more.

ctransformers by marella

0.1%
2k
Python bindings for fast Transformer model inference
Created 2 years ago
Updated 1 year ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 5 more.

matmulfreellm by ridgerchu

0.0%
3k
MatMul-free language models
Created 1 year ago
Updated 1 month ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 17 more.

open_llama by openlm-research

0.1%
8k
Open-source reproduction of LLaMA models
Created 2 years ago
Updated 2 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Vincent Weisser (Cofounder of Prime Intellect), and 15 more.

codellama by meta-llama

0.0%
16k
Inference code for CodeLlama models
Created 2 years ago
Updated 1 year ago