spacy-transformers by explosion

spaCy extension for transformer models

Created 6 years ago
1,393 stars

Top 29.0% on SourcePulse

View on GitHub
Project Summary

This package provides spaCy components and architectures to integrate Hugging Face's transformer models (BERT, XLNet, GPT-2, etc.) into spaCy pipelines. It enables users to leverage state-of-the-art NLP models for tasks within the spaCy ecosystem, offering convenient access to powerful pre-trained representations.

How It Works

The package introduces a Transformer pipeline component that acts as a bridge to Hugging Face's transformers library. It automatically aligns the transformer's wordpiece output with spaCy's tokenization, so downstream components can consume transformer features without worrying about mismatched token boundaries. This lets advanced transformer architectures slot into spaCy's established pipeline structure and configuration system.
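A minimal Python sketch of what this looks like in practice, assuming spaCy v3.x with spacy-transformers installed and network access to download the distilbert-base-uncased weights from the Hugging Face Hub (the model name is illustrative, and the trf_data attribute layout shown reflects spacy-transformers v1.1+):

    import spacy

    # Start from a blank English pipeline and add the bridge component.
    nlp = spacy.blank("en")
    nlp.add_pipe(
        "transformer",
        config={
            "model": {
                "@architectures": "spacy-transformers.TransformerModel.v3",
                "name": "distilbert-base-uncased",  # any Hugging Face model name
            }
        },
    )
    nlp.initialize()  # loads the tokenizer and the pretrained weights

    doc = nlp("spaCy integrates transformer models.")
    trf = doc._.trf_data
    print(trf.wordpieces.strings)  # wordpiece tokens produced by the HF tokenizer
    print(trf.align.lengths)       # how many wordpieces align to each spaCy token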

Quick Start & Requirements

  • Install via pip: pip install 'spacy[transformers]' (a quick usage check follows this list).
  • Requirements: Python 3.6+, PyTorch v1.5+, spaCy v3.0+.
  • For GPU installation, append your CUDA version to the extras, e.g. pip install 'spacy[transformers,cuda110]'.
  • Documentation: https://spacy.io/usage/transformers
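Once installed, a quick sanity check in Python might look like the following (assuming the en_core_web_trf pipeline has been downloaded separately with python -m spacy download en_core_web_trf):

    import spacy

    nlp = spacy.load("en_core_web_trf")  # RoBERTa-based English pipeline
    doc = nlp("Apple is looking at buying U.K. startup for $1 billion.")
    print([(ent.text, ent.label_) for ent in doc.ents])
    print(nlp.pipe_names)  # the shared 'transformer' component comes first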

Highlighted Details

  • Enables multi-task learning by backpropagating from multiple pipeline components to a single transformer.
  • Integrates with spaCy v3's configuration system for training and customization.
  • Supports automatic alignment of transformer outputs to spaCy's tokenization.
  • Allows customizing which transformer data is saved on the Doc and how long documents are split up for processing (see the sketch after this list).
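As an example of that last point, the span getter that splits long documents before they reach the transformer can be swapped out through the component config. A hedged Python sketch with illustrative window/stride values (the registered names follow spacy-transformers v1.x; other keys fall back to the component's defaults):

    import spacy

    nlp = spacy.blank("en")
    nlp.add_pipe(
        "transformer",
        config={
            "model": {
                "@architectures": "spacy-transformers.TransformerModel.v3",
                "name": "roberta-base",
                # Split each Doc into overlapping windows so long documents
                # fit within the transformer's sequence length limit.
                "get_spans": {
                    "@span_getters": "spacy-transformers.strided_spans.v1",
                    "window": 128,
                    "stride": 96,
                },
            }
        },
    )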

Maintenance & Community

  • Issues and bug reports should be filed on spaCy's issue tracker.
  • Discussion threads can be opened on the spaCy discussion board.

Licensing & Compatibility

  • The package is distributed under the MIT License.
  • Compatible with spaCy v3.x.

Limitations & Caveats

The Transformer component itself does not provide task-specific heads (e.g., for token or text classification). To run pre-trained task-specific models such as classifiers directly, the spacy-huggingface-pipelines package is recommended instead.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Lewis Tunstall (Research Engineer at Hugging Face), and 4 more.

Explore Similar Projects

fastformers by microsoft

0%
707
NLU optimization recipes for transformer models
Created 5 years ago
Updated 6 months ago
Starred by Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake) and Thomas Wolf (Cofounder of Hugging Face).

transformer by sannykim

0%
546
Resource list for studying Transformers
Created 6 years ago
Updated 1 year ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; Author of CS 231n), and 5 more.

matmulfreellm by ridgerchu

0.0%
3k
MatMul-free language models
Created 1 year ago
Updated 1 month ago