VardaGPT by ixaxaar

Associative memory-enhanced GPT-2 model

created 2 years ago
336 stars

Top 83.0% on sourcepulse

Project Summary

VardaGPT enhances GPT-2 with an associative memory powered by FAISS, aiming to improve context retrieval and text generation. It's designed for researchers and developers interested in memory-augmented language models.

How It Works

VardaGPT integrates a FAISS-based associative memory with a GPT-2 model. During inference and training, it retrieves relevant information from the memory based on input embeddings. This retrieved information is concatenated with the original input embeddings before being processed by the GPT-2 transformer. This approach allows the model to access and utilize a larger, external knowledge base, potentially leading to more coherent and contextually relevant text generation.
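
The core mechanism is easiest to see in code. The following is a minimal sketch of the retrieve-and-concatenate idea, not VardaGPT's actual implementation: it assumes the Hugging Face transformers GPT-2, random vectors standing in for the stored memory, and mean pooling of the input as the query strategy (the project's real pooling and retrieval choices live in its source).

```python
# Minimal sketch of retrieve-and-concatenate memory augmentation.
# Assumptions (not from the VardaGPT README): mean-pooled queries,
# random placeholder memory vectors, Hugging Face GPT2Model.
import faiss
import numpy as np
import torch
from transformers import GPT2Model

d = 768  # GPT-2 small hidden size
k = 4    # memory slots retrieved per query

# Placeholder memory: random vectors stand in for stored knowledge embeddings.
memory = np.random.rand(10_000, d).astype("float32")
index = faiss.IndexFlatL2(d)
index.add(memory)

model = GPT2Model.from_pretrained("gpt2")

def forward_with_memory(input_embeds: torch.Tensor) -> torch.Tensor:
    """input_embeds: (batch, seq_len, d) -> hidden states of the augmented sequence."""
    # Query the memory with the mean input embedding (one plausible pooling choice).
    query = input_embeds.mean(dim=1).detach().cpu().numpy().astype("float32")
    _, ids = index.search(query, k)            # (batch, k) nearest-neighbour ids
    retrieved = torch.from_numpy(memory[ids])  # (batch, k, d)
    # Prepend the retrieved vectors to the inputs before the transformer runs.
    augmented = torch.cat([retrieved, input_embeds], dim=1)
    return model(inputs_embeds=augmented).last_hidden_state
```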

Quick Start & Requirements

  • Primary install / run commands: pip install -r requirements.txt, then python train_varda_gpt_associative.py
  • Non-default prerequisites: Python 3.7+, PyTorch 1.8.1+, FAISS (CPU version specified).
  • Links: GitHub Repo

Highlighted Details

  • Leverages FAISS for efficient similarity search in the associative memory (a write-path sketch follows this list).
  • Modifies GPT-2 architecture to incorporate memory retrieval and concatenation.
  • Includes custom loss functions and training scripts for the memory-enhanced model.
  • Supports training on datasets like WikiText-2.
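
The bullets above describe the read path; an associative memory also needs a write path. Here is a hedged sketch of how new entries could be encoded and added to the FAISS index. The remember helper, the mean pooling, and the example sentence are illustrative, not taken from VardaGPT's code.

```python
# Hedged sketch of a memory "write" path: encode text with GPT-2 and store
# its pooled embedding in a FAISS index so later queries can retrieve it.
# The `remember` helper is hypothetical, not part of VardaGPT's API.
import faiss
import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
encoder = GPT2Model.from_pretrained("gpt2")
index = faiss.IndexFlatL2(768)  # 768 = GPT-2 small hidden size

def remember(text: str, index: faiss.Index) -> None:
    """Encode `text` and append its mean-pooled embedding to the memory index."""
    tokens = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**tokens).last_hidden_state       # (1, seq_len, 768)
    vector = hidden.mean(dim=1).numpy().astype("float32")  # (1, 768)
    index.add(vector)

remember("FAISS enables fast nearest-neighbour search over dense vectors.", index)
```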

Maintenance & Community

  • The repository is maintained by ixaxaar.
  • No specific community channels (Discord/Slack) or roadmap are mentioned in the README.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: the permissive MIT license allows commercial use and integration with closed-source projects.

Limitations & Caveats

  • The README focuses on training the VardaGPTAssociative model with the CPU build of FAISS; GPU usage is not documented (a generic FAISS GPU sketch follows this list).
  • The project appears to be focused on GPT-2, and compatibility with newer or larger models is not discussed.
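
For reference, stock FAISS does provide a GPU path via the faiss-gpu package. The snippet below is standard FAISS usage, shown only to illustrate what a GPU index looks like; whether VardaGPT's training scripts accept one is not stated in the README.

```python
# Generic FAISS GPU usage (requires the faiss-gpu package, not faiss-cpu).
# Standard FAISS API; nothing here is specific to, or documented by, VardaGPT.
import faiss
import numpy as np

d = 768
cpu_index = faiss.IndexFlatL2(d)
cpu_index.add(np.random.rand(1_000, d).astype("float32"))

# Clone the index onto all available GPUs; searches then run on-device.
gpu_index = faiss.index_cpu_to_all_gpus(cpu_index)
distances, ids = gpu_index.search(np.random.rand(4, d).astype("float32"), 5)
```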
Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
