telos-sdk by learningCatHD

AI agent gateway for significant LLM cost reduction

Created 2 months ago

644 stars

Top 50.8% on SourcePulse

Project Summary

TELOS SDK addresses the significant cost and latency associated with large language model (LLM) interactions by introducing a cache-aware gateway protocol. It targets developers and researchers building AI agents, offering substantial savings on token billing and potentially faster inference without altering existing agent logic or model behavior.

How It Works

TELOS acts as a transparent proxy between an AI agent and its LLM backend. It employs a cache-aware mechanism to identify and reuse common prefixes within conversation histories or tool calls. Instead of re-processing and re-billing these repeated segments on every turn, TELOS serves them from a local cache, drastically reducing the number of input tokens billed by the LLM provider. This approach is advantageous as it directly targets the cost bottleneck without requiring modifications to the agent's prompts, underlying models, or core functionalities.
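The prefix-reuse idea can be illustrated with a toy model. This is a minimal sketch, not the TELOS implementation: the function names (common_prefix_len, billed_tokens), the token lists, and the assumption that a reused prefix is not billed at all are hypothetical and chosen only to show the arithmetic.

```python
def common_prefix_len(a, b):
    """Length of the longest shared prefix of two token lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def billed_tokens(turns):
    """Return (tokens billed without reuse, tokens billed with prefix reuse).

    Toy assumption: tokens in the prefix shared with the previous request
    are not billed again. Real provider pricing may differ.
    """
    without_reuse = 0
    with_reuse = 0
    previous = []
    for prompt in turns:
        without_reuse += len(prompt)
        reused = common_prefix_len(previous, prompt)
        with_reuse += len(prompt) - reused
        previous = prompt
    return without_reuse, with_reuse


# Hypothetical 6-turn conversation: every request resends the full
# history and appends two new tokens.
history = []
turns = []
for i in range(6):
    history = history + [f"t{i}a", f"t{i}b"]
    turns.append(list(history))

full, cached = billed_tokens(turns)
saving = 1 - cached / full
print(f"billed without reuse: {full}, with reuse: {cached}, saving: {saving:.1%}")
```

Because each request repeats the whole history, the share of reusable tokens grows with conversation length, which is why longer agent sessions would be expected to benefit most under this model.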

Quick Start & Requirements

Install with the one-line script for Linux, macOS, WSL2, or Android Termux (curl -fsSL https://raw.githubusercontent.com/learningCatHD/telos-sdk/main/scripts/install.sh | bash), or with uv's pip interface (uv pip install -U telos-sdk). After installation, run telos init to configure the gateway and telos dashboard to view savings. Python 3.10+ is required. Detailed guides and documentation are available at docs.telosai.pro.

Highlighted Details

Token Bill Savings: The project claims a 50% to 90% reduction in billed input tokens, cites a real-world 6-turn conversation example with a 92.3% saving, and reports end-to-end cost savings of 40.5% verified on SWE-bench.
Agent Behavior Guarantee: A/B testing via SWE-bench indicates no statistically significant regression in agent performance or output quality (McNemar p = 0.66).
Inference Speed: Cache hits bypass redundant token processing, which is intended to reduce time-to-first-token.
Data Privacy: The gateway operates locally (127.0.0.1) and captures no content; usage logs record only token counts, ensuring no sensitive prompt or response text leaves the user's environment.
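For readers unfamiliar with the McNemar test cited above, the following is a minimal sketch of the exact (binomial) two-sided version for paired pass/fail outcomes. The discordant counts used here are hypothetical and are not the project's SWE-bench data; the function name mcnemar_exact is also an illustrative choice.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value.

    b: tasks solved by the baseline only
    c: tasks solved by the gateway run only
    Tasks where both runs agree do not affect the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of a split at least this uneven under a fair coin,
    # doubled for the two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(7, 9)
print(f"p = {p_value:.4f}")
```

A large p-value means the paired results give no evidence that one configuration solves more tasks than the other. It does not prove the two are equivalent.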

Maintenance & Community

Core contributors include Zheng Wang, Shenzhi Wang, HongTao Zhong, Shiji Song, and Gao Huang. The project recently launched its documentation site, and its news items cover integrations such as cc-switch coexistence and Codex.app support. No direct community channels (e.g., Discord, Slack) are listed in the README.

Licensing & Compatibility

The core of the project is licensed under the Apache 2.0 license, which is permissive and generally compatible with commercial use and closed-source applications.

Limitations & Caveats

The TELOS SDK is currently in beta, so ongoing development, undiscovered bugs, and API changes should be expected. Specific limitations or unsupported platforms are not detailed in the provided README.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive
Star History: 0 stars in the last 30 days