tanuki.py by Tanuki

SDK for LLM-powered apps that get cheaper/faster via model distillation

Created 2 years ago
692 stars

Top 49.2% on SourcePulse

Project Summary

Tanuki is a Python library for building LLM-powered applications with predictable, type-safe outputs and automatic cost/latency reductions. It targets developers seeking to integrate LLMs into their workflows as reliably as traditional functions, offering benefits like automatic model distillation for performance gains and test-driven alignment for behavioral consistency.

How It Works

Tanuki allows developers to decorate Python function stubs with @tanuki.patch. When called, an LLM generates a response that is programmatically cast to the function's specified return type (e.g., Pydantic models, literals). Behavior is refined using @tanuki.align decorated functions containing assert statements, which serve as training data for model distillation. Over time, Tanuki trains smaller, specialized models to emulate the behavior of larger ones, reducing costs and latency by up to 90% and 80% respectively.
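The mechanism above can be sketched offline with a toy `patch`/`align` pair. This is a minimal mock, not Tanuki's implementation: the `fake_llm` function, its canned reply, and the decorator bodies are all illustrative stand-ins, and the real library dispatches to an actual model and harvests `align` asserts as training data.

```python
from typing import Literal, get_args, get_type_hints

def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call so this sketch runs offline;
    # Tanuki would query OpenAI, Bedrock, or Together AI here.
    return "Good"

def patch(fn):
    # Toy version of @tanuki.patch: call the (fake) LLM with the stub's
    # docstring and arguments, then cast the reply to the annotated return type.
    return_type = get_type_hints(fn)["return"]

    def wrapper(*args, **kwargs):
        raw = fake_llm(f"{fn.__doc__}\nInput: {args} {kwargs}")
        allowed = get_args(return_type)
        if allowed and raw not in allowed:
            raise TypeError(f"cannot cast {raw!r} to {return_type}")
        return raw

    return wrapper

def align(fn):
    # Toy version of @tanuki.align: just run the asserts; Tanuki would also
    # keep them as examples for few-shot prompting and distillation.
    fn()
    return fn

@patch
def classify_sentiment(msg: str) -> Literal["Good", "Bad"]:
    """Classify the sentiment of the message as Good or Bad."""

@align
def align_classify_sentiment():
    assert classify_sentiment("I love this library!") == "Good"

print(classify_sentiment("I love this library!"))  # prints: Good
```

The key idea is that the function body stays empty: the docstring and type annotations alone specify the behavior, and out-of-range replies fail loudly instead of leaking into the program.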

Quick Start & Requirements

  • Install via pip install tanuki.py or poetry add tanuki.py.
  • Requires an OpenAI API key set as OPENAI_API_KEY.
  • Official documentation and examples are available.
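Per the requirements above, setup amounts to installing the package and exporting the key (the key value below is a placeholder):

```shell
pip install tanuki.py            # or: poetry add tanuki.py
export OPENAI_API_KEY="sk-..."   # placeholder; substitute your own key
```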

Highlighted Details

  • Test-Driven Alignment (TDA): Uses assert statements in @tanuki.align functions to define and enforce LLM behavior, ensuring predictable outputs.
  • Automatic Distillation: Trains smaller, cheaper, and faster models from larger ones based on usage data, reducing operational costs.
  • Type-Safe Outputs: Enforces strict output types (Python base types, Pydantic, Literals) to prevent LLM-induced bugs.
  • Broad Model Support: Integrates with OpenAI, Amazon Bedrock, and Together AI.
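To make the type-safety point concrete, here is a stdlib-only sketch of what strict casting of an LLM's JSON reply looks like. The `ActionItem` dataclass and `cast_json_reply` helper are hypothetical illustrations of the guarantee, not Tanuki's own API:

```python
import json
from dataclasses import dataclass, fields

@dataclass
class ActionItem:
    description: str
    deadline_days: int

def cast_json_reply(raw: str, cls):
    # Parse an LLM's JSON reply and enforce the dataclass's field types,
    # raising instead of silently propagating a malformed answer.
    data = json.loads(raw)
    for f in fields(cls):
        if f.name not in data:
            raise ValueError(f"missing field {f.name!r}")
        if not isinstance(data[f.name], f.type):
            raise TypeError(f"field {f.name!r} is not {f.type.__name__}")
    return cls(**data)

item = cast_json_reply('{"description": "file report", "deadline_days": 3}', ActionItem)
print(item.deadline_days)  # prints: 3
```

A reply like `{"deadline_days": "soon"}` would raise rather than produce a string where downstream code expects an integer, which is the class of LLM-induced bug this design prevents.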

Maintenance & Community

  • The README links a Discord server for community engagement, though commit activity (see Health Check below) suggests development has stalled.
  • Specific contributor or sponsorship details are not prominent in the README.

Licensing & Compatibility

  • The README does not explicitly state a license. This requires further investigation before commercial use or integration into closed-source projects.

Limitations & Caveats

  • Currently, model distillation is limited to GPT-4 (teacher) to GPT-3.5 (student) for OpenAI models; other providers are not yet supported for distillation.
  • Not ideal for tasks that require extensive context or open-ended natural-language generation.
  • Support for asynchronous functions and tool usage is noted as a future roadmap item.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days

Explore Similar Projects

Starred by Edward Sun (Research Scientist at Meta Superintelligence Lab), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 4 more.

batch_invariant_ops by thinking-machines-lab

Top 2.0% · 823 stars
Enhance LLM inference determinism
Created 1 month ago · Updated 1 week ago
Starred by Lianmin Zheng (Coauthor of SGLang, vLLM), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 1 more.

MiniCPM by OpenBMB

Top 0.2% · 8k stars
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago · Updated 6 days ago