pure_attention by mmmwhy

Attention-based models for CV and NLP tasks

Created 8 years ago
808 stars

Top 43.8% on SourcePulse

Project Summary

This repository provides a unified PyTorch implementation of attention-based models for both Natural Language Processing (NLP) and Computer Vision (CV) tasks. It aims to establish a strong baseline for future research by offering a streamlined codebase for core Transformer components and common downstream applications, enabling efficient training and ONNX-based deployment.

How It Works

The project implements the core encoder and decoder stacks of the Transformer architecture, drawing inspiration from Hugging Face's transformers library but with a deliberately smaller codebase. This makes it possible to compare and validate the implementation directly against established models such as BERT and confirm that the results match. The goal is a foundational backbone that can reach state-of-the-art performance across diverse NLP and CV tasks.
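To make the scope concrete, here is a minimal PyTorch sketch of the BERT-style (post-norm) encoder block that such an implementation boils down to. This is illustrative only, not the repository's actual code; the class and argument names are assumptions.

```python
# Minimal sketch of a BERT-style (post-norm) Transformer encoder block.
# Illustrative only: NOT the repository's code; names are assumptions.
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, d_model=768, n_heads=12, d_ff=3072, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # Self-attention sublayer: residual connection + post-LayerNorm, as in BERT.
        attn_out, _ = self.attn(x, x, x, key_padding_mask=key_padding_mask)
        x = self.norm1(x + self.drop(attn_out))
        # Position-wise feed-forward sublayer.
        x = self.norm2(x + self.drop(self.ffn(x)))
        return x

x = torch.randn(2, 16, 768)   # (batch, seq_len, hidden)
out = EncoderBlock()(x)
print(out.shape)              # torch.Size([2, 16, 768])
```

A full BERT encoder is essentially a token/position embedding layer followed by a stack of twelve such blocks, which is why output-level comparison against Hugging Face's implementation is a meaningful validation.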

Quick Start & Requirements

  • Installation: PyTorch.
  • Prerequisites: Python and the Hugging Face transformers library, used for downloading pretrained models (see the sketch after this list).
  • Resources: domestic (China) download mirrors for Hugging Face models are provided.
  • Documentation: a guide to the domestic download mirrors for transformers models (transformers国内下载镜像).
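A minimal quick-start sketch, assuming the standard Hugging Face transformers API for fetching one of the mirrored checkpoints; "bert-base-chinese" is taken from the README's mirror list, everything else is ordinary transformers usage.

```python
# Minimal quick-start sketch: fetch a mirrored Chinese BERT checkpoint with
# the Hugging Face transformers library, per the prerequisites above.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese").eval()

# Encode a sample sentence ("Attention is all you need") and run the encoder.
inputs = tokenizer("注意力就是你所需要的一切", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, 768])
```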

Highlighted Details

  • PyTorch implementation of Transformer encoder and BERT.
  • Produces results consistent with Hugging Face's encoder (a consistency-check sketch follows this list).
  • Provides domestic download mirrors for popular Chinese NLP models (bert-base-chinese, chinese-roberta-wwm-ext, etc.).
  • Planned implementation of Transformer decoder for seq2seq tasks.
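Since output consistency with Hugging Face is the project's main validation claim, here is a sketch of how such a check can be run. The `my_encoder` wrapper is hypothetical: here it simply wraps the reference model so the script runs end to end; in practice it would be the repository's encoder loaded with the same pretrained weights.

```python
# Sketch of an output-consistency check against Hugging Face's BERT.
import torch
from transformers import BertModel, BertTokenizer

# Reference implementation from Hugging Face.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
reference = BertModel.from_pretrained("bert-base-chinese").eval()

# Hypothetical stand-in for the repository's encoder: wraps the reference
# so this script is runnable; swap in the real model for an actual check.
def my_encoder(input_ids, attention_mask):
    return reference(input_ids=input_ids,
                     attention_mask=attention_mask).last_hidden_state

inputs = tokenizer("今天天气不错", return_tensors="pt")
with torch.no_grad():
    ref_out = reference(**inputs).last_hidden_state
    my_out = my_encoder(inputs["input_ids"], inputs["attention_mask"])

# Same inputs, same pretrained weights -> activations should agree
# to within floating-point tolerance.
torch.testing.assert_close(my_out, ref_out, rtol=1e-4, atol=1e-5)
print("outputs consistent")
```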

Maintenance & Community

The project is maintained by mmmwhy, though activity has been limited (the last commit was three years ago; see Health Check below). Further community engagement details are not specified in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README.

Limitations & Caveats

The project is in its first phase, and the current focus is validating the core encoder and decoder implementations. Implementations of NLP downstream tasks (sequence labeling, classification), ViT, UNILM, MAE, the GPT series, and ONNX export/inference are still planned.

Health Check

  • Last Commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Lewis Tunstall (Research Engineer at Hugging Face), and 4 more.

fastformers by microsoft

0%
707
NLU optimization recipes for transformer models
Created 5 years ago
Updated 6 months ago
Starred by Elvis Saravia (Founder of DAIR.AI) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

awesome-transformer-nlp by cedrickchee

0%
1k
Curated list of NLP resources for Transformer networks
Created 6 years ago
Updated 10 months ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 5 more.

matmulfreellm by ridgerchu

0.0%
3k
MatMul-free language models
Created 1 year ago
Updated 1 month ago