PyTorch implementation of GPTs using Kolmogorov-Arnold Networks (KANs) for language modeling
This repository provides a PyTorch implementation of Generative Pre-trained Transformers (GPTs) that use Kolmogorov-Arnold Networks (KANs) for language modeling. It explores whether KANs can improve GPT architectures and gives researchers and practitioners in natural language processing an alternative to conventional MLP-based transformers.
How It Works
KAN-GPT replaces the standard MLP layers within the GPT architecture with KANs. Where an MLP applies fixed activation functions to learned linear combinations of its inputs, a KAN represents a function as a composition of learned univariate functions defined on learnable grids. The motivation is that this may be more efficient and more interpretable than an MLP; whether it also produces a more flexible or higher-performing language model is the question the project explores.
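To make the idea of "univariate functions on grids" concrete, here is a minimal, dependency-free sketch of a single KAN-style layer. It is not the repository's implementation: the class name `KANLayer`, its parameters, and the piecewise-linear (hat-function) basis are assumptions chosen for brevity, swapped in for the higher-order B-splines that KANs typically use, and the grid here is fixed rather than learned.

```python
import random


class KANLayer:
    """Toy KAN-style layer (hypothetical, for illustration only).

    Output j is the sum over inputs i of a univariate function phi_ij(x_i).
    Each phi_ij is piecewise linear on a shared, fixed grid of knots.
    """

    def __init__(self, in_dim, out_dim, grid_min=-1.0, grid_max=1.0,
                 num_knots=5, seed=0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.step = (grid_max - grid_min) / (num_knots - 1)
        self.knots = [grid_min + k * self.step for k in range(num_knots)]
        # coef[j][i][k] is the value of phi_ij at knot k.
        # These values play the role of the trainable parameters.
        self.coef = [[[rng.uniform(-0.1, 0.1) for _ in range(num_knots)]
                      for _ in range(in_dim)]
                     for _ in range(out_dim)]

    def _basis(self, x):
        """Hat-function basis values at x; inputs are clamped to the grid."""
        x = min(max(x, self.knots[0]), self.knots[-1])
        return [max(0.0, 1.0 - abs(x - t) / self.step) for t in self.knots]

    def forward(self, xs):
        """Map a list of in_dim floats to a list of out_dim floats."""
        bases = [self._basis(x) for x in xs]
        outputs = []
        for j in range(self.out_dim):
            total = 0.0
            for i in range(self.in_dim):
                # phi_ij(x_i): linear interpolation between knot values.
                total += sum(c * b for c, b in zip(self.coef[j][i], bases[i]))
            outputs.append(total)
        return outputs


layer = KANLayer(in_dim=3, out_dim=2)
# Set every edge function to phi(x) = x by storing the knot positions
# as the knot values; each output then becomes the sum of the inputs.
for j in range(layer.out_dim):
    for i in range(layer.in_dim):
        layer.coef[j][i] = list(layer.knots)

xs = [0.1, -0.5, 0.9]
ys = layer.forward(xs)
```

In a GPT block, a layer of this kind would stand where the feed-forward MLP normally sits; a practical version would operate on tensors and train the knot values by gradient descent.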
Quick Start & Requirements
Install: pip install kan_gpt
From source: pip install -r requirements.txt and pip install -e .
Train: python3 -m kan_gpt.train
Prompt: python -m kan_gpt.prompt --prompt "..." --model_path (checkpoint)
See: KAN_GPT.ipynb and kan_gpt/prompt.py
Highlighted Details
Maintenance & Community
A CONTRIBUTING.md file is available for development guidelines.
Licensing & Compatibility
Limitations & Caveats