Curated list of transformer models
This repository is a curated list of transformer models, categorized by architecture and modality, aimed at researchers and practitioners in NLP, computer vision, and speech processing. Each entry provides the model name, a description, links to the Hugging Face or GitHub repository and the original paper, the originating source, and the license, making it easier to discover and select a suitable pre-trained model for a given task.
How It Works
The list is organized into distinct categories such as Encoder, Decoder, Encoder+Decoder, Multimodal, Vision, Audio, Recommendation, and Grounded Situation Recognition. Each entry includes essential metadata, allowing users to quickly assess model capabilities, origins, and licensing terms. The curation aims to cover a broad spectrum of transformer applications.
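To make the per-entry metadata concrete, here is a minimal sketch of how one entry could be modeled in Python. The `ModelEntry` class and its field names are hypothetical, chosen only to mirror the columns described above; the example values for BERT are real.

```python
from dataclasses import dataclass

# Hypothetical schema mirroring the metadata each list entry provides.
@dataclass
class ModelEntry:
    name: str         # model name, e.g. "BERT"
    description: str  # one-line summary of the model
    repo_url: str     # Hugging Face or GitHub link
    paper_url: str    # link to the original paper
    source: str       # originating organization or lab
    license: str      # e.g. "Apache 2.0", "MIT", or a custom license

# Example entry (BERT, an Encoder-category model):
bert = ModelEntry(
    name="BERT",
    description="Bidirectional encoder pre-trained with masked language modeling",
    repo_url="https://huggingface.co/bert-base-uncased",
    paper_url="https://arxiv.org/abs/1810.04805",
    source="Google",
    license="Apache 2.0",
)
```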
Quick Start & Requirements
This is a curated list, not a runnable codebase. To use the models, refer to the individual model links provided for installation and usage instructions.
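That said, for models hosted on Hugging Face the loading pattern is usually uniform. Below is a minimal sketch assuming the `transformers` library and PyTorch are installed; `bert-base-uncased` stands in for any model ID from the list.

```python
# pip install transformers torch
from transformers import AutoTokenizer, AutoModel

# Substitute any Hugging Face model ID from the list here.
model_id = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Tokenize a sample sentence and run a forward pass.
inputs = tokenizer("A curated list of transformer models.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```

Models distributed only via GitHub (or gated behind an approval process) will have their own setup steps, so always check the linked repository first.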
Maintenance & Community
The list is maintained by abacaj; community contributions are welcome via pull requests or by reaching out on Twitter.
Licensing & Compatibility
Licenses vary significantly, including Apache 2.0, MIT, BSD 3-Clause, CC BY 4.0, CC BY-NC-SA 4.0, and custom licenses with use-based restrictions. Some models, such as LLaMA and OPT, require approval and are restricted to non-commercial use. VALL-E depends on a CC-BY-NC-licensed library.
Limitations & Caveats
The list is a directory and does not provide direct access to the models themselves. Users must consult individual model repositories for specific usage, dependencies, and potential compatibility issues, especially concerning non-commercial licenses or restricted use cases.