large-language-models  by databricks-academy

Courseware for LLM application through production

created 2 years ago
806 stars

Top 44.7% on sourcepulse

Project Summary

This repository provides notebooks and slides for the "Large Language Models: Application through Production" course, targeting individuals seeking to learn and apply LLMs from development to deployment. It offers a structured curriculum for mastering LLM applications.

How It Works

The course material is delivered via Databricks notebooks, designed to be imported directly into a Databricks workspace. It leverages Databricks Runtime for Machine Learning, specifically version 13.3 LTS, to ensure compatibility with pre-installed ML packages. The notebooks cover a range of LLM topics, including fine-tuning, with specific cluster configurations recommended for different modules.

Quick Start & Requirements

  • Import: Import via Git URL (https://github.com/databricks-academy/large-language-models.git) or download .dbc files from GitHub releases.
  • Databricks Runtime: Databricks Runtime 13.3 LTS for Machine Learning is required.
  • Cluster: A Single Node cluster is recommended. GPU instances (g5.2xlarge) are needed for the fine-tuning notebooks (LLM 04a, LLM 04L). CPU instances (i3.xlarge, i3.2xlarge) are sufficient for the other notebooks.
  • Datasets: Run LLM 00a - Install Datasets notebook first; installation can take up to 25 minutes.
  • Documentation: Databricks Repos Setup, Databricks Runtime 13.3 LTS ML Release Notes.
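
The cluster requirements above can be sketched as a small helper that builds a Single Node cluster specification per notebook. This is a minimal sketch, not part of the courseware: the function name `cluster_spec` and the prefix list are hypothetical, and the field names and runtime keys (`13.3.x-cpu-ml-scala2.12`, `13.3.x-gpu-ml-scala2.12`) assume the shape of a Databricks Clusters API request on AWS, so check them against your workspace before use.

```python
import json

# Hypothetical list of notebook-name prefixes that need a GPU, taken from
# the fine-tuning notebooks named in the requirements above (both spellings).
GPU_NOTEBOOK_PREFIXES = ("LLM 04a", "LLM 04L", "LLM04L")


def cluster_spec(notebook_name: str) -> dict:
    """Build an assumed Single Node cluster spec for a course notebook."""
    needs_gpu = notebook_name.startswith(GPU_NOTEBOOK_PREFIXES)
    return {
        "cluster_name": "llm-course-gpu" if needs_gpu else "llm-course-cpu",
        # Assumed runtime keys for Databricks Runtime 13.3 LTS ML.
        "spark_version": (
            "13.3.x-gpu-ml-scala2.12" if needs_gpu else "13.3.x-cpu-ml-scala2.12"
        ),
        # Instance types quoted in the requirements above.
        "node_type_id": "g5.2xlarge" if needs_gpu else "i3.xlarge",
        # Single Node mode: no workers, driver runs Spark locally.
        "num_workers": 0,
        "spark_conf": {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",
        },
        "custom_tags": {"ResourceClass": "SingleNode"},
    }


if __name__ == "__main__":
    # Print the spec for a fine-tuning notebook as JSON.
    print(json.dumps(cluster_spec("LLM 04a - Fine-tuning LLMs"), indent=2))
```

The sketch only builds the request body; submitting it (through the UI, CLI, or REST API) is left out because that step depends on workspace authentication.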

Highlighted Details

  • Comprehensive curriculum covering LLM application through production.
  • Specific guidance on Databricks cluster configuration for different LLM tasks.
  • Includes a dedicated notebook for pre-installing datasets and models to optimize performance.
  • Course slides are available as PDF downloads from GitHub releases.

Maintenance & Community

This repository is part of the Databricks Academy curriculum. Further community or maintenance details are not specified in the README.

Licensing & Compatibility

The repository's license is not specified in the README. Compatibility is primarily with the Databricks platform and specific Databricks Runtime versions.

Limitations & Caveats

The courseware is tested only on Databricks Runtime 13.3 LTS for Machine Learning. On other runtime versions the notebooks may need significant additional library installations and are not guaranteed to run. GPU instances are mandatory for the fine-tuning notebooks.
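
One way to catch a runtime mismatch early is a guard cell at the top of a notebook. The following is a minimal sketch, not something the courseware ships: the helper name `is_tested_runtime` is hypothetical, and it assumes the cluster exposes its runtime version through the `DATABRICKS_RUNTIME_VERSION` environment variable. It checks only the version number, not whether the ML variant of the runtime is in use.

```python
import os
from typing import Optional

# Runtime version the course states it was tested on.
TESTED_RUNTIME = ("13", "3")


def is_tested_runtime(version: Optional[str]) -> bool:
    """Return True if the version string starts with major.minor 13.3."""
    if not version:
        return False
    parts = version.strip().split(".")
    return tuple(parts[:2]) == TESTED_RUNTIME


if __name__ == "__main__":
    # Assumption: Databricks sets this variable on cluster nodes.
    current = os.environ.get("DATABRICKS_RUNTIME_VERSION")
    if not is_tested_runtime(current):
        print(f"Warning: runtime {current!r} is not the tested 13.3 LTS release.")
```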

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

19 stars in the last 90 days
