little-coder by itayinbarr

Coding agent optimized for small local LLMs

Created 2 weeks ago

517 stars

Top 60.5% on SourcePulse

Project Summary

Summary

This project addresses the challenge of deploying capable coding agents on resource-constrained hardware by optimizing them for smaller, locally runnable Large Language Models (LLMs). It provides a framework and specific adaptations that enable frontier-level coding performance with models typically ranging from 5GB to 25GB, targeting users who prioritize local execution and privacy. The primary benefit is achieving high-fidelity coding assistance without relying on powerful cloud infrastructure.

How It Works

little-coder is built upon pi, a minimal agent substrate, by implementing its specialized coding adaptations as modular pi extensions. This design allows for fine-grained control over agent behavior, enabling users to mix, disable, or add extensions via .pi/settings.json. Key adaptations include the Write-vs-Edit tool invariant, per-turn tool-skill injection, algorithm cheat-sheet injection, a thinking-budget cap, output repair, and quality monitoring, all designed to optimize performance for smaller LLMs.
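
A settings file that toggles individual adaptations might look like the sketch below. This is a minimal illustration assuming pi reads a per-project `.pi/settings.json` with per-extension booleans; the key names here are invented from the adaptation descriptions above, not the project's actual identifiers.

```json
{
  "extensions": {
    "write-vs-edit-invariant": true,
    "tool-skill-injection": true,
    "algorithm-cheat-sheet": true,
    "thinking-budget-cap": false,
    "output-repair": true,
    "quality-monitoring": true
  }
}
```

In this hypothetical layout, setting a key to `false` (here, the thinking-budget cap) would disable that extension while leaving the others active; check the repository's documentation for the real schema.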

Quick Start & Requirements

  • Primary install: Clone the repository and run npm install.
  • Prerequisites: Node.js 20+ is required for the pi runtime. Users need either a local LLM (e.g., via llama.cpp or Ollama) or an API key for a supported cloud provider. Python 3.10+ and Docker are necessary only for running benchmarks.
  • Links: pi.dev is recommended for understanding the underlying pi agent framework.
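
The install steps above can be sketched as the following shell session; the repository URL is an assumption based on the author and project name, and may differ from the actual location.

```shell
# Quick-start sketch (assumed URL; adjust to the actual repository location).
git clone https://github.com/itayinbarr/little-coder.git
cd little-coder

# The pi runtime requires Node.js 20+; confirm before installing.
node --version

npm install
```

Python 3.10+ and Docker only come into play if you intend to run the benchmarks, so they are omitted here.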

Highlighted Details

  • Achieved 45.56% on the Aider Polyglot benchmark using a 9.7B Qwen model, significantly outperforming a vanilla baseline.
  • A pre-pi Python version with a 35B Qwen model reached 78.67% on Aider Polyglot.
  • On the current pi substrate, a 35B Qwen model achieved 40.0% on Terminal-Bench v0.1.1.
  • All benchmark results were obtained using consumer laptop hardware (e.g., RTX 5070 Laptop GPU) without cloud inference.
  • Core adaptations are implemented as modular pi extensions, allowing flexible customization and integration.

Maintenance & Community

The README does not provide explicit links to community channels (like Discord or Slack) or a detailed roadmap. Development appears active with "in progress" notes for certain benchmark integrations.

Licensing & Compatibility

The project is licensed under Apache 2.0, with MIT also mentioned for the pi dependency. This license is permissive and generally compatible with commercial use and closed-source linking. Upstream attribution is tracked in a NOTICE file.

Limitations & Caveats

The full research rationale and architectural details are primarily documented on an external Substack, requiring users to consult external resources for complete context. Reproducing specific historical benchmark results may necessitate checking out older commits and following distinct Python-based setup instructions.

Health Check

  • Last Commit: 19 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 7
  • Star History: 524 stars in the last 17 days
