arc-lang-public by jerber

LLM-powered asynchronous solver for Abstraction and Reasoning Corpus (ARC) puzzles

Created 5 months ago
312 stars

Top 86.5% on SourcePulse

1 Expert Loves This Project
Project Summary

This project provides an asynchronous pipeline for solving Abstraction and Reasoning Corpus (ARC) puzzles using large language models (LLMs). It targets researchers and power users seeking to automate complex reasoning tasks by iteratively generating, scoring, and refining LLM-generated instructions. The system aims to produce candidate solutions for ARC challenges, facilitating participation in competitions and research.

How It Works

The core of the system is an asynchronous pipeline orchestrated by src/run.py. It processes ARC challenge datasets in batches, using a monitored semaphore to cap API concurrency. At each step, LLMs generate instructions from the training grids, which are then scored via cross-validation: another LLM call executes each instruction and its outputs are compared against held-out training examples. The system iteratively revises poorly performing instructions or synthesizes new plans from the best-performing ones, feeding the results back into the scoring loop. Finally, the strongest instructions are used to generate multiple candidate outputs for the hidden test grids. This iterative refinement, combined with asynchronous execution, is central to its approach.
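The generate-score-revise loop described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the stand-in LLM calls, and the "keep the top half, regenerate the rest" policy are all hypothetical; the real pipeline lives in src/run.py.

```python
import asyncio
import random

async def bounded_call(sem: asyncio.Semaphore, coro_fn, *args):
    """Run one LLM call under the shared concurrency semaphore."""
    async with sem:
        return await coro_fn(*args)

async def generate_instruction(grid):
    # Stand-in for an LLM call that writes an instruction from a training grid.
    await asyncio.sleep(0)
    return f"map cells of {grid}"

async def score_instruction(instruction):
    # Stand-in for the cross-validation scoring call.
    await asyncio.sleep(0)
    return random.random()

async def solve(train_grids, max_concurrency=8, rounds=3):
    sem = asyncio.Semaphore(max_concurrency)
    # 1. Generate one candidate instruction per training grid, concurrently.
    instructions = list(await asyncio.gather(
        *(bounded_call(sem, generate_instruction, g) for g in train_grids)
    ))
    best = None
    for _ in range(rounds):
        # 2. Score all candidates concurrently.
        scores = await asyncio.gather(
            *(bounded_call(sem, score_instruction, i) for i in instructions)
        )
        ranked = sorted(zip(scores, instructions), reverse=True)
        best = ranked[0][1]
        # 3. Revision step: keep the top half, regenerate the rest.
        keep = [i for _, i in ranked[: max(1, len(ranked) // 2)]]
        fresh = await asyncio.gather(
            *(bounded_call(sem, generate_instruction, g)
              for g in train_grids[: len(instructions) - len(keep)])
        )
        instructions = keep + list(fresh)
    return best

best = asyncio.run(solve(["grid-a", "grid-b", "grid-c", "grid-d"]))
```

In this shape, the semaphore bounds only the LLM calls themselves, so scoring and generation rounds can overlap freely up to the concurrency cap.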

Quick Start & Requirements

  • Installation: Use uv sync or pip install.
  • Prerequisites: Python 3.12+, LLM provider access tokens (e.g., xAI Grok, OpenAI, Anthropic, Gemini, DeepSeek, OpenRouter), and the MAX_CONCURRENCY environment variable. Optional: NEON_DSN for PostgreSQL persistence.
  • Running: Execute python src/run.py for a basic smoke test using default configurations and the 2025 evaluation challenges. Custom configurations and paths can be used via run_from_json.

Highlighted Details

  • Iterative instruction generation, scoring, revision, and pooling loop for robust problem-solving.
  • Asynchronous execution leveraging asyncio with concurrency control via MonitoredSemaphore.
  • Support for multiple LLM providers through configurable RunConfig presets.
  • Optional persistence of intermediate results and final guesses to PostgreSQL.
  • Outputs are formatted for direct submission to ARC competitions.
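The MonitoredSemaphore mentioned above is not shown in the summary; a plausible sketch is an asyncio.Semaphore wrapper that also records in-flight and peak usage. The class body and metric names here are assumptions, not the project's actual implementation.

```python
import asyncio

class MonitoredSemaphore:
    """Semaphore wrapper that tracks in-flight and peak usage.

    Hypothetical sketch; the project's real MonitoredSemaphore may
    expose different metrics.
    """

    def __init__(self, limit: int):
        self._sem = asyncio.Semaphore(limit)
        self.in_flight = 0
        self.peak = 0

    async def __aenter__(self):
        await self._sem.acquire()
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        return self

    async def __aexit__(self, *exc):
        self.in_flight -= 1
        self._sem.release()

async def demo():
    sem = MonitoredSemaphore(2)

    async def task():
        async with sem:
            await asyncio.sleep(0.01)

    # Five tasks contend for two slots; peak should never exceed the limit.
    await asyncio.gather(*(task() for _ in range(5)))
    return sem.peak, sem.in_flight

peak, remaining = asyncio.run(demo())
```

Tracking peak usage this way makes it easy to verify that MAX_CONCURRENCY is actually being respected against each provider's limits.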

Maintenance & Community

No specific details regarding maintainers, community channels (e.g., Discord, Slack), or project roadmap were found in the provided README text.

Licensing & Compatibility

The license type is not specified in the provided README text, which may pose a compatibility concern for certain use cases, particularly commercial applications.

Limitations & Caveats

The system requires careful environment variable configuration, including provider API keys and MAX_CONCURRENCY. Visualization features (VIZ=1) may fail on headless servers. Repeated rate-limiting errors indicate that MAX_CONCURRENCY should be lowered. The absence of explicit licensing information is a notable caveat.
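Besides lowering MAX_CONCURRENCY, a common complement for handling provider rate limits is jittered exponential backoff. The sketch below is generic, not taken from this project; RateLimitError and the flaky call are stand-ins for a real provider exception and API call.

```python
import asyncio
import random

class RateLimitError(Exception):
    """Stand-in for a provider's rate-limit exception."""

async def with_backoff(call, retries=5, base_delay=0.5):
    """Retry an async call with jittered exponential backoff."""
    for attempt in range(retries):
        try:
            return await call()
        except RateLimitError:
            if attempt == retries - 1:
                raise
            # Double the delay each attempt, with random jitter to
            # avoid synchronized retries across concurrent tasks.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            await asyncio.sleep(delay)

async def demo():
    attempts = 0

    async def flaky():
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise RateLimitError
        return "ok"

    result = await with_backoff(flaky, base_delay=0.001)
    return result, attempts

result, attempts = asyncio.run(demo())
```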

Health Check
Last Commit

1 month ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
9 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Junyang Lin (core maintainer at Alibaba Qwen), and 1 more.

chameleon-llm by lupantech

1k stars
Research paper code for plug-and-play compositional reasoning with LLMs
Created 2 years ago
Updated 2 years ago