JS library for OpenAI GPT model token encoding/decoding
This library provides the fastest JavaScript Byte Pair Encoding (BPE) tokenizer for OpenAI's GPT models, designed for developers working with LLMs in JavaScript environments. It offers a performant, low-footprint solution for encoding and decoding text to and from tokens, supporting all current OpenAI models and offering advanced features like chat tokenization and asynchronous streaming.
How It Works
The library is a direct port of OpenAI's tiktoken library, implemented in TypeScript for type safety and performance. It utilizes BPE algorithms to convert text into integer token sequences, mirroring OpenAI's official tokenization. Key advantages include synchronous operation, generator functions for streaming, efficient isWithinTokenLimit checks, and a memory-efficient design without global caches.
Quick Start & Requirements
npm install gpt-tokenizer
Highlighted Details
- Supports all current OpenAI encodings (o200k_base, cl100k_base, etc.).
- encodeChat for efficient chat message tokenization.
- Async generator functions (decodeAsyncGenerator) for streaming token processing.
- isWithinTokenLimit function for quick token count checks without full encoding.

Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
EncodeOptions.