llms by IbrahimSobh

Collection of resources for large language models

Created 2 years ago

394 stars

Top 73.1% on SourcePulse

Project Summary

This repository surveys Large Language Models (LLMs), covering theoretical foundations, practical applications, and implementation details. It is aimed at researchers, engineers, and practitioners who want to understand and use LLMs, and it discusses model architectures, training methods, and deployment strategies.

How It Works

The repository traces language modeling from statistical n-gram models to neural architectures such as the Transformer. It covers probability distributions over word sequences, perplexity as an evaluation metric, and the advantages of neural models in handling long-range dependencies and avoiding the sparsity problems of count-based models. The practical section shows implementations of popular models such as GPT, BERT, Falcon, and Llama, demonstrating text generation, fine-tuning, and retrieval-augmented generation (RAG).
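
To make the statistical side concrete, here is a minimal sketch of a bigram language model with add-one smoothing and a perplexity function. It is not taken from the repository; the function names and the toy corpus are assumptions chosen for illustration, and out-of-vocabulary words are not handled properly.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        # Boundary markers let the model score sentence starts and ends.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one (Laplace) smoothing."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(sentence, unigrams, bigrams):
    """Perplexity = exp of the average negative log-probability per predicted token."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    log_prob = sum(math.log(bigram_prob(p, w, unigrams, bigrams)) for p, w in pairs)
    return math.exp(-log_prob / len(pairs))


# Hypothetical toy corpus, for illustration only.
corpus = ["the cat sat", "the dog sat"]
unigrams, bigrams = train_bigram(corpus)
seen = perplexity("the cat sat", unigrams, bigrams)
shuffled = perplexity("sat the cat", unigrams, bigrams)
```

A word order that appears in the corpus gets a lower perplexity than a shuffled one. Most bigram counts are zero even on this tiny corpus, which is the sparsity problem that smoothing patches over and that neural models address by sharing parameters across contexts.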

Quick Start & Requirements

Installation: Primarily uses the Hugging Face transformers library (pip install transformers). Specific models may require additional dependencies like torch or gpt4all.
Prerequisites: Python 3.x, PyTorch, and potentially CUDA for GPU acceleration. Some examples require API keys (e.g., OpenAI, Google).
Resources: Model sizes vary significantly; running larger models locally may require substantial RAM and GPU VRAM.
Links: Hugging Face Transformers documentation: https://huggingface.co/docs/transformers/index
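
As a rough starting point, text generation with the transformers pipeline API can be sketched as follows. This is an illustrative wrapper, not code from the repository: the helper name generate_text and the default model "gpt2" are assumptions, and the first call downloads the model weights.

```python
def generate_text(prompt, model_name="gpt2", max_new_tokens=20):
    """Return the prompt plus a generated continuation.

    Assumes `transformers` and a backend such as `torch` are installed;
    `model_name` is an illustrative default, not a recommendation.
    """
    # Imported lazily so this file can be loaded before dependencies are installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling generate_text("Large language models are") returns the prompt followed by up to 20 newly generated tokens. Larger models can be substituted through model_name, subject to the RAM and VRAM limits noted above.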

Highlighted Details

Detailed explanations of statistical vs. neural language modeling.
Practical code examples for popular LLMs (GPT-2, BERT, Falcon, Llama, CodeT5+).
Comprehensive overview of decoding strategies (greedy, beam search, sampling, top-k, top-p).
In-depth coverage of prompt engineering techniques (zero-shot, few-shot, chain-of-thought).
Introduction to parameter-efficient fine-tuning (PEFT) methods like LoRA and Prompt Tuning.
Explanation and implementation of Retrieval Augmented Generation (RAG).
Overview of LangChain for building LLM-powered applications, including chains, agents, memory, and document handling.
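
The decoding strategies listed above can be sketched on a toy next-token distribution. This is a simplified illustration under stated assumptions, not the repository's code: the probabilities and function names are hypothetical, real decoders operate on model logits over a full vocabulary, and beam search is omitted for brevity.

```python
import random


def greedy(probs):
    """Greedy decoding: always pick the most probable token."""
    return max(probs, key=probs.get)


def top_k_filter(probs, k):
    """Top-k: keep the k most probable tokens and renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def top_p_filter(probs, p):
    """Top-p (nucleus): keep the smallest set of most probable tokens
    whose cumulative probability reaches p, then renormalize."""
    kept, cumulative = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}


def sample(probs, rng=random):
    """Draw one token at random according to its probability."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token distribution, for illustration only.
next_token_probs = {"cat": 0.5, "dog": 0.3, "car": 0.15, "zebra": 0.05}

best = greedy(next_token_probs)
top2 = top_k_filter(next_token_probs, k=2)
nucleus = top_p_filter(next_token_probs, p=0.9)
token = sample(nucleus, rng=random.Random(0))
```

Greedy decoding is deterministic, while sampling from the filtered distributions trades some likelihood for variety. Both filters remove the low-probability tail ("zebra" here) before sampling.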

Maintenance & Community

The repository is a survey and educational resource, not a continuously maintained software project. It references widely used libraries and models from the NLP community.

Licensing & Compatibility

The repository itself does not specify a license. However, it uses and demonstrates models and libraries (e.g., Hugging Face Transformers, Falcon) that carry their own licenses, many of which are permissive and allow commercial use (e.g., Apache 2.0 for the Transformers library and for the Falcon 7B and 40B models). Users must adhere to the licenses of the individual models and libraries they choose to use.

Limitations & Caveats

This repository is primarily an educational survey and collection of examples. It does not provide a unified framework or a single entry point, so users will need to adapt the code and manage dependencies for each model and task. Some examples require specific hardware (such as a GPU) or API access.

Explore Similar Projects

Mengzi3 by Langboat

llm-seminar by craffel

step_into_llm by mindspore-lab

awesome-transformer-nlp by cedrickchee

Mastering-Transformers by PacktPublishing

large_concept_model by facebookresearch

GLM by THUDM

zero_nlp by yuanzhoulvpi2017

awesome-pretrained-chinese-nlp-models by lonePatient

happy-llm by datawhalechina

unilm by microsoft

Hands-On-Large-Language-Models by HandsOnLLM