JSON repair tool for LLM outputs
This Python module repairs malformed JSON produced by Large Language Models (LLMs). It turns syntactically invalid JSON strings into strings that standard libraries can parse. It is aimed at developers who need to parse LLM-generated data reliably and want a lightweight way to handle common LLM output errors.
How It Works
The library employs a heuristic-based approach to fix JSON. It parses the JSON string according to the standard BNF definition, identifying and correcting common syntax errors such as missing quotes, misplaced commas, unescaped characters, and incomplete structures. When errors are detected, it applies simple, intelligent fixes like adding missing delimiters, quoting unquoted strings, and cleaning up extraneous characters or whitespace.
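To make the idea of heuristic repair concrete, here is a toy sketch that uses only the standard library. It is not the library's implementation: the function name toy_repair is hypothetical, and it handles just three of the cases mentioned above (an unterminated string, unclosed brackets, and trailing commas).

```python
import json
import re


def toy_repair(text):
    """Toy illustration only: parse strictly, then try a few simple fixes."""
    try:
        return json.loads(text)  # valid input needs no repair
    except json.JSONDecodeError:
        pass

    # Scan once, tracking open brackets and whether we are inside a string.
    stack = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()

    fixed = text
    if in_string:
        fixed += '"'  # terminate a string that was cut off
    fixed += "".join(reversed(stack))  # close brackets, innermost first

    # Drop trailing commas before a closing bracket. This naive regex would
    # also touch a ",}" sequence inside a string value; acceptable for a toy.
    fixed = re.sub(r",\s*([}\]])", r"\1", fixed)

    # Raises json.JSONDecodeError if the simple fixes were not enough.
    return json.loads(fixed)
```

For example, toy_repair('{"a": [1, 2,], "b": "x"') returns {"a": [1, 2], "b": "x"}. A real repair tool covers far more cases, such as unquoted keys and unescaped characters.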
Quick Start & Requirements
Install: pip install json-repair
Usage: from json_repair import repair_json
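A minimal sketch of how the package might be wired into an application, assuming json_repair.loads() accepts a string the way json.loads() does. The helper name load_llm_json is hypothetical, and the ImportError fallback is an addition for illustration so that the snippet still loads where the package is not installed.

```python
import json

try:
    import json_repair  # third-party; installed with `pip install json-repair`
except ImportError:
    json_repair = None  # fallback so the sketch loads without the package


def load_llm_json(text):
    """Hypothetical helper: prefer json_repair.loads, else plain json.loads."""
    if json_repair is not None:
        # Returns Python objects, repairing the input first if needed.
        return json_repair.loads(text)
    # Without the package, only strictly valid JSON can be parsed.
    return json.loads(text)


record = load_llm_json('{"name": "Ada", "tags": ["a", "b"]}')
```

With the package installed, the same call is also expected to accept malformed input such as a string with a missing closing brace.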
Highlighted Details
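Works as a drop-in replacement for json.loads() and json.load() via json_repair.loads() and json_repair.load().
Can be installed as a command-line tool with pipx install json-repair.
Preserves non-Latin characters in the output when ensure_ascii=False is passed.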
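The ensure_ascii flag has the same name as the one on the standard library's json.dumps(). The sketch below shows what the flag does using the standard library alone; that json_repair forwards it to the serializer in the same way is an assumption here, not something this page confirms.

```python
import json

data = {"greeting": "こんにちは"}

# Default behaviour (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(data)

# With ensure_ascii=False the original characters are kept in the output string.
readable = json.dumps(data, ensure_ascii=False)

# Both strings decode to the same data; only the textual form differs.
same = json.loads(escaped) == json.loads(readable)
```

The escaped form is still valid JSON, so the flag matters mainly when the repaired string is shown to people or stored as text.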
Maintenance & Community
The project follows strict semantic versioning and TDD, with frequent updates and no breaking changes in minor/patch versions. Users are advised to pin the dependency with the version specifier json_repair==0.* in their requirements.
Licensing & Compatibility
The library is available under a permissive license, suitable for commercial use and integration into closed-source projects.
Limitations & Caveats
While comprehensive, the library may not cover every obscure JSON corruption scenario; users are encouraged to contribute examples or pull requests for unhandled edge cases. The skip_json_loads=True option skips the initial check with the standard json parser, so it should only be used when the input is already known to be invalid JSON.
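One way to satisfy that condition is to try the strict parser first and call the repair step only after it has failed, since at that point the input is known to be invalid. This is a sketch under assumptions: the helper name parse_strict_then_repair is hypothetical, and the ImportError fallback exists only so the snippet loads without the package.

```python
import json

try:
    from json_repair import repair_json  # third-party; pip install json-repair
except ImportError:
    repair_json = None  # fallback so the sketch loads without the package


def parse_strict_then_repair(text):
    """Hypothetical helper: strict parse first, repair only on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        if repair_json is None:
            raise  # no repair tool available; surface the original error
        # json.loads has already rejected the input, so the library's own
        # validity check would be redundant and can be skipped.
        return repair_json(text, return_objects=True, skip_json_loads=True)
```

Valid input never reaches the repair path, so the cost of repairing is paid only for strings that actually need it.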