training-fine-tuning-large-language-models-workshop-dhs2024 by dipanjanS

Workshop for training and fine-tuning large language models

Created 11 months ago · 301 stars · Top 89.6% on sourcepulse

Project Summary

This repository provides comprehensive materials for a full-day workshop on training and fine-tuning Large Language Models (LLMs), targeting data scientists and ML engineers. It offers hands-on notebooks and presentations covering essential LLM concepts, from basic embeddings and prompt engineering to advanced techniques like parameter-efficient fine-tuning (PEFT) and reinforcement learning from human feedback (RLHF).

How It Works

The workshop is structured into five modules, progressing from foundational knowledge to advanced alignment techniques. It utilizes Hugging Face's Transformers and PEFT libraries, demonstrating practical applications with models like Phi-3 Mini, Llama 3.1, and GPT-2. The approach emphasizes hands-on coding within Jupyter notebooks, complemented by conceptual explanations in presentation slides, enabling participants to build and adapt LLMs for various tasks.
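To give a flavor of the hands-on style, the sketch below loads a small instruction-tuned model with the Transformers pipeline API, roughly the kind of local-LLM usage the prompt-engineering module works with. The model ID and generation settings are illustrative assumptions, not code taken from the notebooks.

```python
# Illustrative sketch (not from the workshop notebooks): run a small
# local LLM for text generation with Hugging Face Transformers.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",  # assumed model ID
    device_map="auto",
)

prompt = "Explain parameter-efficient fine-tuning in one sentence."
output = generator(prompt, max_new_tokens=60, do_sample=False)
print(output[0]["generated_text"])
```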

Quick Start & Requirements

  • Installation: Each module includes a *_Install_Requirements.ipynb notebook detailing necessary library installations.
  • Environment: PyTorch 2.4.0, Python 3.11, CUDA 12.4, and Ubuntu 22.04 make up the recommended stack (a quick environment-check sketch follows this list).
  • Hardware: A GPU with at least 48GB VRAM (e.g., NVIDIA A40) is recommended for the fine-tuning tasks, plus a 30GB disk volume for storing LLM weights.
  • Resources: Links to module-specific setup notebooks are provided within the README.
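Before running the module notebooks, it can help to confirm that the environment matches the recommended stack. The snippet below is a minimal sanity-check sketch, not part of the repository; the authoritative installation steps live in each module's *_Install_Requirements.ipynb notebook.

```python
# Quick environment sanity check (illustrative sketch): confirms the
# PyTorch/CUDA versions and available GPU memory roughly match the
# workshop's recommendation (PyTorch 2.4.0, CUDA 12.4, ~48 GB VRAM).
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
```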

Highlighted Details

  • Covers prompt engineering with local LLMs (Phi-3 Mini, Llama 3.1).
  • Demonstrates Parameter-Efficient Fine-Tuning (PEFT) techniques like QLoRA (see the sketch after this list).
  • Includes building custom Retrieval-Augmented Generation (RAG) pipelines.
  • Explores alignment methods: RLHF, PPO, DPO, and ORPO.
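To make the QLoRA bullet concrete, here is a minimal sketch of attaching LoRA adapters to a 4-bit quantized base model with Transformers, bitsandbytes, and PEFT. The model ID, rank, and other hyperparameters are illustrative assumptions; the workshop notebooks contain the authoritative code.

```python
# Minimal QLoRA-style setup sketch (illustrative, not from the repo).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # 4-bit base weights (the "Q" in QLoRA)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",   # assumed model ID
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                 # illustrative adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",          # attach adapters to all linear layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only the LoRA adapters are trainable
```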

Maintenance & Community

  • The repository is maintained by Dipanjan (DJ) Sarkar.
  • References Hugging Face documentation and various blogs for foundational concepts and implementation details.

Licensing & Compatibility

  • No license is explicitly stated in the material summarized here; check the repository itself for licensing terms before reusing code or content.
  • Content is intended for educational purposes within the workshop context.

Limitations & Caveats

  • The workshop environment is optimized for specific hardware (48GB VRAM GPU) and software versions, which may require adjustments for different setups.
  • Some exercises require downloading large model weights (e.g., Llama 3.1), which takes significant disk space.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
