MinT by dpressel

Minimal PyTorch library for Transformer tutorials

Created 3 years ago · 256 stars · Top 99.0% on sourcepulse

View on GitHub

Project Summary

MinT provides a minimal, from-scratch PyTorch implementation of common Transformer architectures, targeting researchers and engineers who want to understand and build these models. It offers a series of tutorials and a reusable Python package for implementing BERT, GPT, BART, T5, and SentenceBERT, facilitating hands-on learning and customization.

How It Works

MinT implements core Transformer components, such as attention mechanisms, feed-forward networks, and positional encodings, in pure PyTorch. It prioritizes clarity and educational value, building each model step by step through its tutorials. Subword tokenization is delegated to HuggingFace's tokenizers library, a deliberate choice made for its speed and widespread adoption.
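
As an illustration of the kind of component the tutorials build, here is a minimal single-head self-attention module in pure PyTorch. This is a sketch in the spirit of the library, not MinT's actual code; the class and parameter names are hypothetical.

    import math
    import torch
    import torch.nn as nn

    class SingleHeadAttention(nn.Module):
        """Scaled dot-product self-attention, the kind of building block
        the MinT tutorials derive from scratch. Illustrative only; this
        is not MinT's actual code."""

        def __init__(self, d_model: int):
            super().__init__()
            self.query = nn.Linear(d_model, d_model)
            self.key = nn.Linear(d_model, d_model)
            self.value = nn.Linear(d_model, d_model)
            self.scale = math.sqrt(d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model)
            q, k, v = self.query(x), self.key(x), self.value(x)
            scores = q @ k.transpose(-2, -1) / self.scale  # (batch, seq, seq)
            weights = scores.softmax(dim=-1)               # each row sums to 1
            return weights @ v                             # weighted sum of values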

Quick Start & Requirements

  • Install with pip install .[examples] to get full functionality, including the bundled examples.
  • Requires PyTorch.
  • GPU with CUDA is recommended for training.
  • The HuggingFace tokenizers library is a core dependency (see the sketch after this list).
  • Pretraining on Wikipedia requires wikiextractor and a Wikipedia dump.
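
Since tokenization is delegated to HuggingFace's tokenizers, a typical first step looks like the sketch below. The pretrained identifier "bert-base-uncased" is an assumption for illustration and requires network access to download.

    from tokenizers import Tokenizer

    # Load a pretrained WordPiece tokenizer from the HuggingFace Hub.
    # "bert-base-uncased" is one common choice, assumed here for illustration.
    tok = Tokenizer.from_pretrained("bert-base-uncased")

    enc = tok.encode("Transformers from scratch")
    print(enc.tokens)  # subword pieces (special tokens like [CLS]/[SEP] may be added)
    print(enc.ids)     # the integer ids a model consumes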

Highlighted Details

  • Tutorials cover building BERT, GPT, GPT2, BART, T5, and SentenceBERT from scratch.
  • Includes examples for in-memory and out-of-memory pretraining on custom datasets.
  • Demonstrates fine-tuning for classification tasks.
  • Features a REPL for interactive BERT masked string completion; the sketch after this list shows the underlying idea.
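
Masked completion boils down to predicting a replacement token at each [MASK] position. Below is a hedged sketch of that loop, not MinT's actual REPL code: model and tok are placeholders for a trained BERT-style network and a HuggingFace tokenizer, and the logits shape is assumed.

    import torch

    def complete_masked(model, tok, text: str) -> str:
        """Replace each [MASK] in `text` with the model's top prediction.
        Hypothetical helper: `model` is assumed to map (1, seq_len) ids to
        (1, seq_len, vocab_size) logits; `tok` is a tokenizers.Tokenizer."""
        enc = tok.encode(text)
        ids = torch.tensor([enc.ids])
        mask_id = tok.token_to_id("[MASK]")
        with torch.no_grad():
            logits = model(ids)
        preds = logits.argmax(dim=-1)[0].tolist()
        # Keep original tokens; substitute predictions only at masked slots.
        filled = [p if i == mask_id else i for i, p in zip(enc.ids, preds)]
        return tok.decode(filled)

    # Example usage (with a trained model):
    # complete_masked(model, tok, "The capital of France is [MASK].")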

Maintenance & Community

The project appears to be a personal or educational effort by dpressel. No specific community channels or active maintenance signals are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. This is a critical omission when evaluating commercial use or integration into closed-source projects.

Limitations & Caveats

The project is described as "minimalistic" and "from scratch," implying it may lack the robustness, extensive features, or optimizations of larger, more established libraries. The lack of an explicit license is a significant caveat.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
