kan-gpt by AdityaNG

PyTorch implementation of GPTs using Kolmogorov-Arnold Networks (KANs) for language modeling

Created 1 year ago

723 stars

Top 47.7% on SourcePulse

1 Expert Loves This Project

Elvis Saravia

Founder of DAIR.AI

Project Summary

This repository provides a PyTorch implementation of Generative Pre-trained Transformers (GPTs) that leverage Kolmogorov-Arnold Networks (KANs) for language modeling. It aims to explore the potential of KANs in improving GPT architectures, offering a novel alternative to traditional MLP-based transformers for researchers and practitioners in natural language processing.

How It Works

KAN-GPT replaces the standard MLP layers within the GPT architecture with KANs. KANs represent functions as a composition of univariate functions on learnable grids, offering a potentially more efficient and interpretable alternative to MLPs. This approach allows for a more flexible and potentially higher-performing model by learning complex relationships through these univariate functions.
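To make the idea of "univariate functions on learnable grids" concrete, here is a minimal, dependency-free sketch of a single KAN-style layer. It is an illustration under stated assumptions, not the code used by kan_gpt or pykan: the class names PiecewiseLinearEdge and ToyKANLayer are hypothetical, and each edge function is a piecewise-linear interpolant over a fixed grid rather than the B-spline parameterization KAN implementations typically use. Each output is the sum of one learned univariate function per input, in contrast to an MLP layer's weighted sum followed by a fixed nonlinearity.

```python
import bisect
import random


class PiecewiseLinearEdge:
    """One univariate function phi(x), stored as its values at grid knots.

    Assumption: piecewise-linear interpolation stands in for the B-splines
    that KAN implementations typically use.
    """

    def __init__(self, grid, values):
        assert len(grid) == len(values) and len(grid) >= 2
        self.grid = list(grid)      # sorted knot positions
        self.values = list(values)  # the parameters that training would adjust

    def __call__(self, x):
        # Clamp to the grid range, then interpolate between the two nearest knots.
        x = min(max(x, self.grid[0]), self.grid[-1])
        hi = min(max(bisect.bisect_right(self.grid, x), 1), len(self.grid) - 1)
        lo = hi - 1
        t = (x - self.grid[lo]) / (self.grid[hi] - self.grid[lo])
        return (1 - t) * self.values[lo] + t * self.values[hi]


class ToyKANLayer:
    """Maps in_dim inputs to out_dim outputs: y[j] = sum_i phi[j][i](x[i])."""

    def __init__(self, in_dim, out_dim, grid, rng):
        # One independent univariate function per (output, input) pair.
        self.edges = [
            [
                PiecewiseLinearEdge(grid, [rng.uniform(-1.0, 1.0) for _ in grid])
                for _ in range(in_dim)
            ]
            for _ in range(out_dim)
        ]

    def __call__(self, x):
        return [sum(phi(xi) for phi, xi in zip(row, x)) for row in self.edges]


# Usage: a single edge, then a randomly initialised 3-input, 2-output layer.
phi = PiecewiseLinearEdge(grid=[0.0, 1.0, 2.0], values=[0.0, 10.0, 20.0])
print(phi(0.5))  # 5.0, halfway between the knots at 0.0 and 1.0

layer = ToyKANLayer(in_dim=3, out_dim=2, grid=[-1.0, 0.0, 1.0], rng=random.Random(0))
print(layer([0.2, -0.4, 0.9]))  # two floats, one per output unit
```

In a KAN-based GPT, a layer of this shape would take the place of the feed-forward MLP inside each transformer block, with the knot values trained by gradient descent. The sketch omits training entirely and only shows the forward computation.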

Quick Start & Requirements

Install from PyPI: pip install kan_gpt
Development setup requires cloning the repo, downloading the datasets (TinyShakespeare, MNIST, WebText), and installing dependencies with pip install -r requirements.txt followed by pip install -e . (an editable install run from the repo root).
Training can be initiated with python3 -m kan_gpt.train.
Prompting is demonstrated via python -m kan_gpt.prompt --prompt "..." --model_path (checkpoint).
Official documentation and usage examples are available in KAN_GPT.ipynb and kan_gpt/prompt.py.

Highlighted Details

Replaces MLP layers in GPT with Kolmogorov-Arnold Networks (KANs).
Includes scripts for dataset downloading (TinyShakespeare, MNIST, WebText).
Provides training scripts for both KAN-based and MLP-based GPTs.
Preliminary results suggest KAN-GPT performs slightly better than MLP-GPT on the Tiny Shakespeare dataset.

Maintenance & Community

The project is developed by Aditya Nalgunda Ganesh; the most recent commit was about a year ago.
References include minGPT, pykan, webtext, and tinyshakespeare.
A CONTRIBUTING.md file is available for development guidelines.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README.

Limitations & Caveats

The project is in its early stages, with several TODO items including auto-downloading model weights, integrating with PyTorch Lightning, and adding comprehensive test cases.
Performance comparisons are currently limited to the Tiny Shakespeare dataset.
Loosening the dependency constraints in requirements.txt is also listed as a TODO.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

1 star in the last 30 days