machine-learning-list by elicit

ML curriculum for foundation model learning

Created 1 year ago

1,426 stars

Top 28.3% on SourcePulse

3 Experts Love This Project

Jeff Hammerbacher

Cofounder of Cloudera

Carlos E. Jimenez

Coauthor of SWE-bench, SWE-agent

Travis Fischer

Founder of Agentic

Project Summary

This repository provides a comprehensive, tiered reading list for understanding foundation models, from fundamental concepts to recent research. It is designed for people new to machine learning, particularly those interested in language models, and aims to balance knowledge needed for practical deployment with techniques that matter for long-term scalability.

How It Works

The curriculum is structured into tiers, guiding readers through foundational topics like Transformers and key architectures (GPT, LLaMA, T5, Mamba), then progressing to training, finetuning, reasoning strategies (Chain-of-Thought, Tree of Thoughts), and advanced applications. Each topic is broken down into recommended reading, with "Tier 1" papers offering essential overviews and subsequent tiers delving into more specialized or recent research.

Quick Start & Requirements

This is a curated list of academic papers and resources, not a software library, so nothing needs to be installed or run. Internet access is needed to open the linked papers and resources.

Highlighted Details

Extensive coverage of Transformer architectures, including visual explanations and implementation details.
Detailed sections on reasoning strategies, task decomposition, and tool use in LLMs.
Broad application areas, from science and forecasting to search, ranking, and production deployment.
In-depth exploration of advanced topics like world models, causality, interpretability, and reinforcement learning.

Maintenance & Community

The list is maintained by andreas@elicit.com. Entries marked "✨ Added after 2024/4/1" flag recent additions, which suggests the list is still being updated.

Licensing & Compatibility

This repository contains links to external academic papers and resources. The licensing and compatibility of these linked resources vary by their original source.

Limitations & Caveats

The list is a reading curriculum, not a runnable codebase. Although comprehensive, it takes significant self-directed effort to work through. Because LLM research moves quickly, the "frontier" topics may become outdated quickly.

Health Check

Last Commit

1 month ago

Responsiveness

Inactive

Star History

16 stars in the last 30 days