json_repair  by mangiucugna

JSON repair tool for LLM outputs

created 1 year ago
2,531 stars

Top 18.9% on sourcepulse

Project Summary

json_repair is a Python module that repairs the malformed JSON that Large Language Models (LLMs) often emit, so that standard libraries can parse it. It is aimed at developers who need to parse LLM-generated data reliably and want a lightweight way to handle the most common output errors.

How It Works

The library takes a heuristic approach. It parses the input according to the standard JSON BNF grammar and, where the input deviates from it, corrects common syntax errors such as missing quotes, misplaced commas, unescaped characters, and incomplete structures. Typical fixes include adding missing delimiters, quoting unquoted strings, and removing extraneous characters or whitespace.
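To make the idea of a heuristic fix concrete, here is a minimal sketch of one such repair: closing an unterminated string and any brackets left open when output is cut off. It is not the library's implementation; the function name close_truncated_json and the CLOSERS table are made up for this illustration, and it uses only the standard library.

```python
import json

# Maps each opening bracket to the closer it will eventually need.
CLOSERS = {"{": "}", "[": "]"}


def close_truncated_json(text: str) -> str:
    """Append the closing quote and brackets that a truncated JSON string lacks."""
    stack = []          # closers still owed, innermost last
    in_string = False   # True while scanning inside a double-quoted string
    escaped = False     # True right after a backslash inside a string

    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in CLOSERS:
            stack.append(CLOSERS[ch])
        elif ch in "}]" and stack and stack[-1] == ch:
            stack.pop()

    repaired = text
    if in_string:
        repaired += '"'               # terminate the dangling string
    else:
        repaired = repaired.rstrip()
        if repaired.endswith(","):
            repaired = repaired[:-1]  # drop a dangling trailing comma

    # Close the innermost structure first.
    return repaired + "".join(reversed(stack))


truncated = '{"name": "Ada", "tags": ["x", "y'
fixed = close_truncated_json(truncated)
parsed = json.loads(fixed)
```

This sketch covers only truncation. Input cut off right after a key and colon, for example, would still be invalid after this pass, which is the kind of case a fuller set of heuristics has to handle.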

Quick Start & Requirements

Highlighted Details

  • Supports fixing syntax errors, malformed arrays/objects, and auto-completing missing values.
  • Offers drop-in replacements for json.loads() and json.load() via json_repair.loads() and json_repair.load().
  • Ships a command-line interface, installable with pipx install json-repair.
  • Preserves non-Latin characters in the output when ensure_ascii=False is passed.
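The effect of ensure_ascii=False is easiest to see with the standard library's json.dumps, which takes a parameter of the same name. The sketch below uses only the stdlib and assumes json_repair's option behaves the same way when it serializes the repaired result; the sample data is invented.

```python
import json

data = {"city": "東京", "greeting": "héllo"}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(data)

# With ensure_ascii=False the characters are written out as-is.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same object; only the readability of the serialized text differs. With the package installed, the corresponding call would be json_repair.repair_json(text, ensure_ascii=False).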

Maintenance & Community

The project follows strict semantic versioning and test-driven development (TDD). Updates are frequent, and minor and patch releases introduce no breaking changes. Users are advised to pin the dependency as json_repair==0.*.

Licensing & Compatibility

The library is available under a permissive license, suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

While comprehensive, the library may not cover all obscure JSON corruption scenarios, and users are encouraged to contribute examples or pull requests for unhandled edge cases. The skip_json_loads=True option should only be used when the input is guaranteed to be invalid JSON.
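The caveat about skip_json_loads=True is easier to follow with the control flow written out. The sketch below is an assumption about the general pattern (try a strict parse first, repair only on failure), not the library's code; loads_with_repair, skip_strict, and toy_repair are hypothetical names, and toy_repair is a deliberately trivial stand-in for a real repairer.

```python
import json
from typing import Any, Callable


def loads_with_repair(
    text: str,
    repair: Callable[[str], Any],
    skip_strict: bool = False,
) -> Any:
    """Parse text strictly unless skip_strict is set; fall back to repair()."""
    if not skip_strict:
        try:
            return json.loads(text)   # valid input never reaches the repairer
        except json.JSONDecodeError:
            pass                      # invalid input falls through to repair
    return repair(text)


def toy_repair(text: str) -> Any:
    """Stand-in repairer for the demo: only adds one missing closing brace."""
    return json.loads(text + "}")


valid = loads_with_repair('{"a": 1}', toy_repair)
broken = loads_with_repair('{"a": 1', toy_repair)
forced = loads_with_repair('{"a": 1', toy_repair, skip_strict=True)
```

Under this reading, skipping the strict attempt only saves work when the input is already known to be broken; for input that might be valid, the strict parse is the cheaper and safer first step. With the package installed, json_repair.loads could be passed as the repair argument.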

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 3
  • Star history: 685 stars in the last 90 days
