spacy-llm by explosion

spaCy plugin for LLM-powered NLP pipelines

Created 2 years ago
1,313 stars

Top 30.5% on SourcePulse

Project Summary

This package integrates Large Language Models (LLMs) into spaCy's structured NLP pipelines, enabling rapid prototyping of NLP tasks without requiring training data. It targets developers and researchers seeking to leverage LLMs for tasks like NER, text classification, and summarization, offering a flexible way to combine LLM-powered components with traditional spaCy models.

How It Works

spacy-llm provides a modular system for defining LLM tasks, where each task specifies how prompts are built and how responses are parsed. It offers interfaces to major LLM providers (OpenAI, Cohere, Anthropic, Google, Azure) and to open-source models hosted on Hugging Face. The package also supports LangChain integration and includes a map-reduce approach for prompts that exceed the context window: the input is split into pieces that fit, each piece is processed separately, and the results are merged.
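
The map-reduce idea can be sketched in plain Python. This is an illustration of the general technique, not spacy-llm's implementation: `split_into_shards`, `map_reduce`, `fake_entity_model`, and the character budget are all hypothetical names and choices, and the "model" is a stub that looks up known names instead of calling an LLM.

```python
from typing import Callable, List

# Hypothetical stand-in vocabulary for the stub model below.
KNOWN_NAMES = {"Alice", "Bob", "Paris", "Berlin"}


def split_into_shards(text: str, max_chars: int) -> List[str]:
    """Greedily pack whitespace-separated words into shards of at most max_chars."""
    shards: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars or not current:
            # Fits, or a single oversized word gets a shard of its own.
            current = candidate
        else:
            shards.append(current)
            current = word
    if current:
        shards.append(current)
    return shards


def map_reduce(
    text: str, max_chars: int, run_on_shard: Callable[[str], List[str]]
) -> List[str]:
    """Map: run the model on every shard. Reduce: merge results without duplicates."""
    merged: List[str] = []
    for shard in split_into_shards(text, max_chars):
        for item in run_on_shard(shard):
            if item not in merged:
                merged.append(item)
    return merged


def fake_entity_model(shard: str) -> List[str]:
    """Stub for an LLM call: returns the known names that occur in the shard."""
    words = [w.strip(".,") for w in shard.split()]
    return [w for w in words if w in KNOWN_NAMES]


text = "Alice met Bob in Paris. Later Bob flew to Berlin."
print(map_reduce(text, 25, fake_entity_model))
```

A real pipeline would budget in tokens rather than characters, and the merge step depends on the task: entity lists can be concatenated, while summaries or classification scores need their own fusion rule.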

Quick Start & Requirements

  • Install via pip: python -m pip install spacy-llm
  • Requires spaCy installation.
  • API keys for LLM providers (e.g., OpenAI) must be set as environment variables.
  • Official documentation: https://spacy.io/api/large-language-models
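
A minimal first run might look like the sketch below. It assumes the `llm_textcat` factory and its OpenAI default model as shown in the project README, so it checks for `OPENAI_API_KEY` before building the pipeline; `missing_api_keys` and `classify` are hypothetical helper names, and the labels are only examples.

```python
import os
from typing import Dict, List, Mapping


def missing_api_keys(required: List[str], env: Mapping[str, str]) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


def classify(text: str) -> Dict[str, float]:
    """Zero-shot text classification; needs spacy-llm installed and a valid key."""
    import spacy  # imported lazily so the key check works without spaCy

    nlp = spacy.blank("en")
    # Assumption: "llm_textcat" is the factory name used in the project README.
    llm = nlp.add_pipe("llm_textcat")
    llm.add_label("INSULT")
    llm.add_label("COMPLIMENT")
    return nlp(text).cats


if __name__ == "__main__":
    missing = missing_api_keys(["OPENAI_API_KEY"], os.environ)
    if missing:
        print("Set these environment variables first:", ", ".join(missing))
    else:
        print(classify("You look gorgeous!"))
```

Checking the key up front gives a clear message instead of a provider error raised when the pipeline is first used. Other providers expect their own variables, so the `required` list would change with the model.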

Highlighted Details

  • Integrates LLMs as serializable spaCy components.
  • Supports a wide range of tasks out-of-the-box, including NER, text classification, summarization, and more.
  • Enables custom task implementation via spaCy's registry.
  • Offers map-reduce for prompts that exceed the model's context window.
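
The shape of a custom task can be sketched without the package installed. The example assumes the two-part design described above (one method builds prompts, one parses responses); it works on plain strings rather than spaCy `Doc` objects, and `TASKS` and `register_task` are hypothetical stand-ins for spaCy's registry, not its actual API.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for spaCy's registry: maps a task name to a factory.
TASKS: Dict[str, Callable[[], object]] = {}


def register_task(name: str):
    """Decorator that stores a task factory under `name`."""
    def decorator(factory):
        TASKS[name] = factory
        return factory
    return decorator


class SentimentTask:
    """Turns texts into prompts and raw model replies into labels."""

    LABELS = ("POSITIVE", "NEGATIVE")

    def generate_prompts(self, texts: Iterable[str]) -> List[str]:
        return [
            f"Classify the sentiment as POSITIVE or NEGATIVE.\nText: {t}\nAnswer:"
            for t in texts
        ]

    def parse_responses(self, responses: Iterable[str]) -> List[str]:
        parsed = []
        for reply in responses:
            label = reply.strip().upper()
            # LLM output is free text, so fall back to UNKNOWN rather than crash.
            parsed.append(label if label in self.LABELS else "UNKNOWN")
        return parsed


@register_task("my_namespace.Sentiment.v1")
def make_sentiment_task() -> SentimentTask:
    return SentimentTask()


task = TASKS["my_namespace.Sentiment.v1"]()
prompts = task.generate_prompts(["Great library!"])
labels = task.parse_responses([" positive\n", "maybe"])
print(prompts[0])
print(labels)
```

Looking tasks up by a versioned string name is what lets a pipeline config refer to them, and tolerant parsing matters because a model's reply is not guaranteed to match the requested format.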

Maintenance & Community

  • Bug reports can be filed on the spaCy issue tracker.
  • Discussion board available for questions and feedback.
  • Migration guides are provided.

Licensing & Compatibility

  • The package is distributed under the MIT license, the same permissive license as spaCy. Users must still adhere to the terms of service of whichever LLM providers they use.

Limitations & Caveats

This package is experimental, and minor version updates may introduce breaking changes to the interface. While LLMs are powerful for prototyping, the README notes that traditional supervised learning models often offer better efficiency, reliability, control, and accuracy for production use cases when sufficient training data is available.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 14 stars in the last 30 days
