scikit-llm by BeastByteAI

SDK for integrating LLMs into scikit-learn pipelines

created 2 years ago
3,473 stars

Top 14.2% on sourcepulse

Project Summary

Scikit-LLM enables the integration of large language models (LLMs) into the scikit-learn ecosystem, targeting data scientists and ML engineers who want to leverage LLMs for text analysis within a familiar framework. It simplifies using LLMs for tasks like classification, offering a scikit-learn-compatible API.

How It Works

The library provides scikit-learn-compatible estimators that wrap various LLMs, abstracting away the complexities of API calls and prompt engineering. It allows users to treat LLMs as interchangeable components within scikit-learn pipelines, facilitating experimentation and deployment.
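
To make the estimator idea concrete, here is a minimal sketch of an LLM wrapped behind scikit-learn's `fit`/`predict` convention. It is not Scikit-LLM's implementation: the class `ZeroShotLLMClassifier`, the helper `build_prompt`, the prompt wording, and the stub `keyword_llm` (which stands in for a real API call) are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence


def build_prompt(text: str, labels: Sequence[str]) -> str:
    """Format a zero-shot classification prompt (hypothetical wording)."""
    options = ", ".join(labels)
    return (
        f"Classify the text into one of: {options}.\n"
        f"Text: {text}\n"
        "Answer with the label only."
    )


class ZeroShotLLMClassifier:
    """Hypothetical estimator that follows the fit/predict convention."""

    def __init__(self, llm: Callable[[str], str]):
        # `llm` is any callable mapping a prompt string to a reply string.
        self.llm = llm

    def fit(self, X, y):
        # Zero-shot: nothing is trained, only the candidate labels are stored.
        self.classes_ = sorted(set(y))
        return self

    def predict(self, X) -> List[str]:
        predictions = []
        for text in X:
            reply = self.llm(build_prompt(text, self.classes_)).strip().lower()
            # Fall back to the first label when the reply is not a known label
            # (this sketch assumes lowercase labels).
            predictions.append(reply if reply in self.classes_ else self.classes_[0])
        return predictions


def keyword_llm(prompt: str) -> str:
    """Stand-in for a real model call; a keyword rule, not an LLM."""
    text = prompt.split("Text: ", 1)[1].splitlines()[0].lower()
    return "positive" if "great" in text or "love" in text else "negative"


clf = ZeroShotLLMClassifier(llm=keyword_llm).fit(None, ["positive", "negative"])
predictions = clf.predict(["I love this library", "The docs are confusing"])
print(predictions)
```

Because the model call is injected as a plain callable, the stub can be replaced by a function that calls a hosted model without changing the code that uses the estimator.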

Quick Start & Requirements

Highlighted Details

  • Zero-shot text classification example provided using GPT-4.
  • Supports integration with scikit-learn pipelines.
  • Offers a consistent API for different LLMs.
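
The pipeline and consistent-API points can be illustrated with a small sketch, assuming nothing about Scikit-LLM's internals. `MiniPipeline` is a simplified stand-in for `sklearn.pipeline.Pipeline`, and `TextCleaner`, `LLMLabeler`, `backend_a`, and `backend_b` are hypothetical names; the two backends are keyword rules standing in for different LLM providers.

```python
class TextCleaner:
    """Transformer step: collapses whitespace and lowercases each text."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [" ".join(text.split()).lower() for text in X]


class LLMLabeler:
    """Final step: delegates each prediction to an injected backend callable."""

    def __init__(self, backend):
        self.backend = backend

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        return self

    def predict(self, X):
        return [self.backend(text, self.classes_) for text in X]


class MiniPipeline:
    """Simplified stand-in for sklearn.pipeline.Pipeline (illustration only)."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y):
        *transformers, final = self.steps
        for step in transformers:
            X = step.fit(X, y).transform(X)
        final.fit(X, y)
        return self

    def predict(self, X):
        *transformers, final = self.steps
        for step in transformers:
            X = step.transform(X)
        return final.predict(X)


# Two hypothetical backends with the same signature; each stands in for a
# different model provider.
def backend_a(text, labels):
    return "spam" if "free money" in text else "ham"


def backend_b(text, labels):
    return "spam" if text.endswith("!!!") else "ham"


texts = ["  FREE   Money inside ", "Meeting at noon"]
labels = ["spam", "ham"]

results = {}
for name, backend in [("a", backend_a), ("b", backend_b)]:
    pipe = MiniPipeline([TextCleaner(), LLMLabeler(backend)])
    results[name] = pipe.fit(texts, labels).predict(texts + ["Win now!!!"])
print(results)
```

Only the backend passed to the final step changes between the two runs; the cleaning step and the calling code stay the same, which is the property a consistent estimator API is meant to provide.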

Maintenance & Community

  • Project authors: Iryna Kondrashchenko and Oleh Kostromin.
  • Community engagement encouraged via GitHub issues and Discord.
  • Related projects: Dingo, Falcon.

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Designed for use with scikit-learn, so it targets Python.

Limitations & Caveats

The library currently focuses on OpenAI models and requires users to manage their own API keys and costs. The README does not specify supported LLM providers beyond OpenAI or detail performance benchmarks.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1

Star History

  • 48 stars in the last 90 days
