simonw
CLI tool for counting and truncating text based on tokens
Top 75.5% on SourcePulse
ttok is a command-line utility for counting and truncating text based on token counts, primarily for use with Large Language Models (LLMs). It leverages OpenAI's tiktoken library, making it useful for developers and researchers working with LLM APIs that have token-based pricing or context window limits.
How It Works
The tool uses the tiktoken library to encode text into integer token IDs, the same representation an LLM receives as input. Because tokenizers differ between OpenAI models, the -m flag lets users name the model so that counts match the tokenizer that model uses. The core functionality is counting the tokens in text passed as arguments or piped on standard input, and truncating text to a specified token limit with the -t flag.
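The count-and-truncate behavior described above can be sketched in a few lines of Python. This is an illustrative sketch, not ttok's actual source: the function names `load_encoder`, `count_tokens`, and `truncate_to_tokens` are hypothetical, and `ToyEncoder` is a made-up whitespace tokenizer included only so the example runs without tiktoken installed. The only tiktoken calls assumed are `tiktoken.encoding_for_model`, `encode`, and `decode`.

```python
from typing import Dict, List, Protocol


class Encoder(Protocol):
    """Anything with tiktoken-style encode/decode methods."""

    def encode(self, text: str) -> List[int]: ...
    def decode(self, tokens: List[int]) -> str: ...


def load_encoder(model: str) -> Encoder:
    """Hypothetical helper: look up the tiktoken encoding for a model name."""
    import tiktoken  # imported lazily so the rest of the sketch runs without it

    return tiktoken.encoding_for_model(model)


def count_tokens(text: str, enc: Encoder) -> int:
    """Count tokens by encoding the text and measuring the result."""
    return len(enc.encode(text))


def truncate_to_tokens(text: str, limit: int, enc: Encoder) -> str:
    """Keep at most `limit` tokens, then decode them back to text."""
    tokens = enc.encode(text)
    return enc.decode(tokens[: max(limit, 0)])


class ToyEncoder:
    """Illustrative stand-in: one token per whitespace-separated word."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._words: List[str] = []

    def encode(self, text: str) -> List[int]:
        out = []
        for word in text.split():
            if word not in self._ids:
                self._ids[word] = len(self._words)
                self._words.append(word)
            out.append(self._ids[word])
        return out

    def decode(self, tokens: List[int]) -> str:
        return " ".join(self._words[t] for t in tokens)


if __name__ == "__main__":
    enc = ToyEncoder()  # swap in load_encoder("gpt-4") when tiktoken is installed
    print(count_tokens("one two three", enc))
    print(truncate_to_tokens("one two three", 2, enc))
```

Passing the encoder in as an argument keeps the counting and truncation logic independent of any particular tokenizer, which is roughly the role the -m flag plays on the command line.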
Quick Start & Requirements
pip install ttok
brew install simonw/llm/ttok
Highlighted Details
--encode) and decode them back to text (--decode).
--tokens).
-i -).
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The tool relies on the tiktoken library, meaning its accuracy is tied to the library's updates and support for specific models. No specific limitations are mentioned in the README regarding unsupported platforms or known bugs.
1 year ago
Inactive