python-toon by xaviviro

LLM data serialization for token efficiency

Created 1 month ago
293 stars

Top 90.2% on SourcePulse

Project Summary

This repository provides a Python implementation of Token-Oriented Object Notation (TOON), a data format designed to reduce the number of tokens needed to send structured data to Large Language Models (LLMs). It targets developers who want to lower LLM API costs with a compact, readable alternative to JSON, and it claims a 30-60% token reduction. This repository is deprecated, however, and users should migrate to the official implementation, toon-format/toon-python.

How It Works

TOON combines YAML-style indentation for nested objects with a CSV-like tabular layout for arrays of uniform objects. Its core design principle is to minimize syntax by omitting redundant punctuation such as braces, brackets, and most quotes. It also carries explicit metadata, such as array length indicators [N], which aid validation and keep the structure unambiguous while using fewer tokens than JSON.
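To make the tabular idea concrete, here is a minimal sketch of how a uniform list of objects could be written as a header plus CSV-like rows. The function name encode_table is hypothetical and is not the python-toon API. The header shape key[N]{fields}: is an assumption based on the description above, and the sketch ignores quoting, escaping, nesting, and type-specific formatting.

```python
def encode_table(key, rows, indent="  "):
    """Write a list of uniform dicts as a TOON-style tabular array.

    Illustrative only: hypothetical helper, not the library's encoder.
    Values are written with str() and are not quoted or escaped.
    """
    if not rows:
        return key + "[0]:"
    fields = list(rows[0].keys())
    # The tabular form only applies when every row has the same keys.
    if any(list(row.keys()) != fields for row in rows):
        raise ValueError("rows are not uniform")
    # Header carries the array length and the field names once.
    header = key + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = [header]
    for row in rows:
        lines.append(indent + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)


users = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
print(encode_table("users", users))
```

For the two-row input above, the sketch emits the header line users[2]{id,name}: followed by one indented, comma-separated line per row, so the field names appear once instead of once per object as in JSON.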

Quick Start & Requirements

  • Installation: pip install python-toon
  • Prerequisites: Python. No other prerequisites are specified.
  • Note: This repository is deprecated. The official implementation is available at toon-format/toon-python.

Highlighted Details

  • Claims a 30-60% token reduction compared to JSON.
  • Employs minimal syntax, eliminating redundant punctuation.
  • Supports tabular arrays using a CSV-like row format for uniform object collections.
  • Includes explicit metadata like array length indicators [N] for validation.
  • Maintains semantic clarity for LLM consumption.
  • Claims 100% output compatibility with the original TypeScript implementation.
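The length indicator [N] mentioned in the list above can be used to detect truncated or padded data. The following is a rough sketch under stated assumptions: decode_table and HEADER are hypothetical names, not part of python-toon, the header shape key[N]{fields}: is assumed, and all values are returned as plain strings with no quoting or type handling.

```python
import re

# Assumed header shape: key[N]{field1,field2,...}:
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def decode_table(text):
    """Parse a TOON-style tabular array and check it against its declared length.

    Illustrative only: hypothetical helper, values are kept as strings.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input")
    match = HEADER.match(lines[0].strip())
    if match is None:
        raise ValueError("not a tabular header")
    key = match.group(1)
    declared = int(match.group(2))
    fields = match.group(3).split(",")
    rows = [ln.strip().split(",") for ln in lines[1:]]
    # The [N] marker lets the reader detect missing or extra rows.
    if len(rows) != declared:
        raise ValueError(f"declared {declared} rows, found {len(rows)}")
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("row width does not match the field list")
    return key, [dict(zip(fields, row)) for row in rows]


key, records = decode_table("users[2]{id,name}:\n  1,Alice\n  2,Bob")
```

If a row is dropped in transit, for example when a model's output is cut off, the declared count no longer matches and the sketch raises ValueError instead of silently returning partial data.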

Maintenance & Community

This repository is marked as deprecated. Development has moved to the official implementation, toon-format/toon-python, which is also where support and community engagement are expected to take place.

Licensing & Compatibility

The project is licensed under the MIT License. This license is permissive and generally compatible with commercial use and closed-source linking, allowing broad adoption without significant restrictions.

Limitations & Caveats

The primary limitation is that this repository is deprecated and no longer actively maintained. Users are strongly advised to migrate to the official toon-format/toon-python repository to benefit from ongoing development, support, and bug fixes.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 4
  • Star history: 225 stars in the last 30 days
