CodeI/O by hkust-nlp

Research paper enhancing LLMs' reasoning via code I/O prediction

created 5 months ago
538 stars

Top 59.8% on sourcepulse

Project Summary

CodeI/O is a novel method for enhancing the reasoning abilities of large language models (LLMs) by transforming code-based reasoning patterns into natural-language Chain-of-Thought rationales. It targets researchers and developers aiming to improve LLM performance across diverse reasoning tasks, including symbolic, scientific, and commonsense reasoning, by extracting universal reasoning primitives while preserving procedural rigor.

How It Works

CodeI/O systematically converts diverse code patterns into natural language rationales, decoupling reasoning from specific code syntax while retaining logical structure. This approach allows for multi-task enhancement and fully verifiable predictions through cached ground-truth matching or code re-execution. An enhanced version, CodeI/O++, incorporates multi-turn revisions for improved accuracy.
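
The verification step described above can be sketched in a few lines of Python. This is a hedged illustration only: every function and variable name here is hypothetical and does not mirror the repository's actual API.

```python
# Minimal sketch of CodeI/O-style verifiable prediction: a model's
# predicted output is checked by re-executing the ground-truth code.
# All names are illustrative, not the repo's actual API.
def verify_output_prediction(func, inputs, predicted_output):
    """Re-execute func on inputs and compare against the prediction."""
    actual = func(**inputs)
    return actual == predicted_output

# A toy ground-truth function (not drawn from the released dataset):
def sum_of_squares(n):
    return sum(i * i for i in range(1, n + 1))

print(verify_output_prediction(sum_of_squares, {"n": 3}, 14))  # True: 1 + 4 + 9
print(verify_output_prediction(sum_of_squares, {"n": 3}, 15))  # False
```

Because verification reduces to executing deterministic code, any prediction can be checked automatically, which is what makes the generated training signal fully verifiable.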

Quick Start & Requirements

  • Setup: Install via requirements.txt or environment.yaml using Conda.
    conda create -n codeio_exec python=3.11
    conda activate codeio_exec
    pip install -r requirements.txt
    
  • Prerequisites: Python 3.11 and Conda. The provided environment may need additional packages to execute some Python code samples.
  • Data Processing: Requires API access (OpenAI or DeepSeek) for inference steps. Local inference with frameworks like vLLM or sglang is also supported. The process involves converting raw code, parsing I/O pairs, generating predictions, verification, and optional multi-turn revision.
  • Training: Compatible with frameworks like LLaMA-Factory.
  • Resources: The project provides a Hugging Face dataset (hkust-nlp/CodeIO-Pyedu-Reasoning) and pre-trained models (Qwen 2.5 7B Coder, LLaMA 3.1 8B, DeepSeek v2 Lite Coder).
  • Documentation: Links to paper, project page, released resources, and dataset are available.
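
The data-processing loop above (parse I/O pairs, generate predictions, verify, optionally revise) can be sketched as follows. The predictor and reviser here are stand-ins for the LLM API calls the real pipeline makes; all names are hypothetical.

```python
# Hedged sketch of the CodeI/O processing loop, with stub functions
# standing in for the LLM prediction/revision calls. Names are
# hypothetical and do not mirror the repository's code.
def cache_ground_truth(func, inputs):
    """Run the raw code on a sampled input to cache its true output."""
    return func(**inputs)

def predict_with_revision(predict_fn, revise_fn, func, inputs, max_turns=2):
    """CodeI/O++-style loop: verify a prediction, revise it on failure."""
    truth = cache_ground_truth(func, inputs)
    pred = predict_fn(inputs)
    for _ in range(max_turns):
        if pred == truth:                # verification by ground-truth matching
            return pred, True
        pred = revise_fn(inputs, pred)   # the real pipeline asks an LLM to revise
    return pred, pred == truth

# Toy demo: a "model" that guesses wrong once, then is corrected by revision.
double = lambda x: x * 2
first_guess = lambda inputs: 7                # wrong initial prediction
revise = lambda inputs, old: inputs["x"] * 2  # revised prediction
print(predict_with_revision(first_guess, revise, double, {"x": 5}))  # (10, True)
```

In the actual pipeline the revision step feeds verification feedback back to the model, which is the CodeI/O++ enhancement mentioned above.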

Highlighted Details

  • Universal transformation of code patterns into natural language Chain-of-Thought rationales.
  • Syntax-decoupled reasoning that preserves logical structure.
  • Demonstrated performance improvements across multiple reasoning task categories.
  • Fully verifiable prediction outputs via code re-execution or ground-truth matching.

Maintenance & Community

The project is associated with HKUST NLP. Further community or maintenance details are not explicitly provided in the README.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The provided setup does not guarantee execution of all Python code types. The data processing pipeline relies heavily on external API calls, which may be subject to rate limits or changes. Only the PythonEdu-Reasoning subset of the dataset is released due to collaborator compliance requirements.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 26 stars in the last 90 days
