torch-conv-kan  by IvanDrokin

PyTorch implementation for convolutional Kolmogorov-Arnold Networks research

created 1 year ago
513 stars

Top 61.8% on sourcepulse

Project Summary

This repository collects implementations of Convolutional Kolmogorov-Arnold Networks (CKANs) for researchers and practitioners exploring alternative neural network architectures. It provides several CKAN layer variants, ResNet- and DenseNet-style model architectures built from those layers, and training scripts for benchmarking on standard datasets.

How It Works

The core idea is to replace each scalar weight of a traditional convolutional kernel with a learnable univariate function (a phi function), inspired by the Kolmogorov-Arnold representation theorem. The non-linear transformation is therefore learned inside the convolution operation itself rather than being left entirely to a subsequent activation function, which allows more flexible and potentially more efficient feature extraction.
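The difference from an ordinary convolution can be sketched in a few lines of framework-free Python. This is a simplified 1D illustration, not the repository's implementation: the function names, the choice of Gaussian radial basis functions, and all numeric values are assumptions made for brevity. Each kernel position k applies its own function phi_k(x) = sum_j coeffs[k][j] * B_j(x) where a classical kernel would multiply by a single weight.

```python
import math


def rbf_basis(x, centers, width):
    """Evaluate Gaussian radial basis functions B_j at a scalar x (assumed basis)."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


def kan_conv1d(signal, coeffs, centers, width):
    """Valid-mode 1D 'KAN convolution' (hypothetical helper).

    coeffs[k][j] weights basis function j at kernel position k, so every
    kernel position applies a learnable univariate function phi_k instead
    of a scalar weight.
    """
    kernel_size = len(coeffs)
    out = []
    for i in range(len(signal) - kernel_size + 1):
        total = 0.0
        for k in range(kernel_size):
            basis = rbf_basis(signal[i + k], centers, width)
            total += sum(c * b for c, b in zip(coeffs[k], basis))  # phi_k(x)
        out.append(total)
    return out


def linear_conv1d(signal, weights):
    """Ordinary valid-mode 1D convolution (cross-correlation) for comparison."""
    k = len(weights)
    return [
        sum(weights[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Illustrative values only; in a real layer coeffs would be trained.
centers = [-1.0, 0.0, 1.0]
width = 1.0
coeffs = [[0.5, -0.2, 0.1], [0.0, 1.0, 0.0], [-0.3, 0.4, 0.2]]
signal = [0.0, 0.5, -0.5, 1.0, -1.0]

kan_out = kan_conv1d(signal, coeffs, centers, width)
lin_out = linear_conv1d(signal, [0.5, 1.0, -0.3])
```

Both functions slide a window of three samples over the signal. The only change is that the per-position multiplication is replaced by a per-position function evaluation, which is also why a KAN convolution carries several coefficients per kernel position instead of one.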

Quick Start & Requirements

Highlighted Details

  • Implements KANConv, KALNConv, FastKANConv, KACNConv, KAGNConv, WavKANConv, KAJNConv, and ReLU KAN Conv layers.
  • Offers Bottleneck Convolutional KAN layers to reduce parameter count.
  • Includes ResNet-like (ResKANets), DenseNet-like (DenseKANets), VGG-like (VGGKANs), and U-Net-like (UKANet, U2KANet) model architectures.
  • Achieves competitive results on MNIST and Tiny ImageNet, with preliminary benchmarks on CIFAR datasets and ImageNet1k.
  • Provides pretrained checkpoints for VGG-like models on ImageNet1k.
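To see why a bottleneck design can reduce the parameter count, the following back-of-the-envelope sketch compares two coefficient counts. It rests on assumptions that are not taken from the repository: that a KAN convolution stores one coefficient per input channel, output channel, kernel position, and basis function, and that a bottleneck layer wraps a narrower KAN convolution between a 1x1 squeeze and a 1x1 expand convolution. Biases, normalization, and any residual path are ignored, and all function names are hypothetical.

```python
def kan_conv_params(c_in, c_out, kernel_size, num_basis):
    """Coefficient count of a 2D KAN convolution under the stated assumption."""
    return c_in * c_out * kernel_size ** 2 * num_basis


def bottleneck_kan_conv_params(c_in, c_out, kernel_size, num_basis, reduction):
    """Coefficient count for an assumed squeeze -> KAN conv -> expand layout."""
    mid_in = max(c_in // reduction, 1)
    mid_out = max(c_out // reduction, 1)
    squeeze = c_in * mid_in      # 1x1 linear convolution into the narrow space
    inner = kan_conv_params(mid_in, mid_out, kernel_size, num_basis)
    expand = mid_out * c_out     # 1x1 linear convolution back to full width
    return squeeze + inner + expand


full = kan_conv_params(64, 64, kernel_size=3, num_basis=8)
slim = bottleneck_kan_conv_params(64, 64, kernel_size=3, num_basis=8, reduction=4)
print(full, slim, full / slim)  # 294912 20480 14.4
```

Under these assumptions the basis-function coefficients dominate the cost, so shrinking the channel width seen by the KAN convolution pays off roughly quadratically in the reduction factor.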

Maintenance & Community

  • Project status: Under Development with frequent updates.
  • Based on several other KAN-related repositories (TorchKAN, FastKAN, etc.).
  • Contributions are welcome; issues can be raised for problems.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README. It is based on other projects which may have their own licenses. Users should verify licensing for commercial or closed-source use.

Limitations & Caveats

The README notes that the results are preliminary and that model architectures have not been exhaustively explored. ChebyKAN-based convolutions exhibit stability issues. Performance on CIFAR-10/100 is reported to be significantly lower than that of classical convolutions, which indicates that further architectural research is needed for these datasets.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 30 stars in the last 90 days
