how_to_fix_your_context by langchain-ai

Optimizing LLM context windows for improved performance

Created 3 months ago
452 stars

Top 66.6% on SourcePulse

View on GitHub
Project Summary

This repository addresses a core limitation of Large Language Models (LLMs): context windows are finite, and performance degrades as input length grows. It demonstrates six context engineering techniques that help developers keep LLMs efficient and accurate when handling extensive information, making it a practical reference for AI engineers and researchers who need to manage context effectively.

How It Works

This project uses LangGraph, a framework for building stateful AI applications, to implement and illustrate six context engineering techniques: Retrieval-Augmented Generation (RAG), Tool Loadout, Context Quarantine, Context Pruning, Context Summarization, and Context Offloading. Each technique is presented in a Jupyter notebook that shows how LangGraph's node-and-edge architecture supports complex, stateful agent workflows. That architecture gives granular control over data flow and state management, which is what makes fine-grained context manipulation practical.
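To make the node-and-edge model concrete, here is a minimal, self-contained sketch (illustrative code, not taken from the notebooks): state is a typed dict, nodes are functions that return state updates, and edges wire them together.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    question: str
    answer: str

def answer_node(state: State) -> dict:
    # A real node would call an LLM here; this one just echoes the question.
    return {"answer": f"You asked: {state['question']}"}

builder = StateGraph(State)
builder.add_node("answer", answer_node)  # nodes transform the shared state
builder.add_edge(START, "answer")        # edges define the control flow
builder.add_edge("answer", END)
graph = builder.compile()

print(graph.invoke({"question": "What is context engineering?"}))
```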

Quick Start & Requirements

  • Installation: Clone the repository, create and activate a virtual environment (e.g., using uv venv), and install dependencies with uv pip install -r requirements.txt.
  • Prerequisites: Python 3.9 or higher and the uv package manager.
  • Environment Variables: Set OPENAI_API_KEY and/or ANTHROPIC_API_KEY, depending on which model provider you use (a typical setup cell is sketched after this list).
  • Resources: Links to notebooks are provided within the README for specific technique implementations.
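For the API keys, a setup cell along the following lines is common in LangChain notebooks; the `set_env` helper here is illustrative, not necessarily what the repo defines.

```python
import getpass
import os

def set_env(var: str) -> None:
    # Prompt for a key only if it is not already set in the shell environment.
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

set_env("OPENAI_API_KEY")
set_env("ANTHROPIC_API_KEY")
```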

Highlighted Details

  • RAG: Implements a RAG agent using LangGraph with a retrieval tool, demonstrating effective context augmentation for complex queries (see the first sketch after this list).
  • Tool Loadout: Demonstrates semantic tool selection by indexing math functions in a vector store and dynamically binding only the relevant tools, reducing context confusion (sketched below).
  • Context Quarantine: Uses a LangGraph Supervisor architecture to give each agent in a multi-agent system its own isolated context window, preventing context clash and distraction (sketched below).
  • Context Pruning: Extends RAG with an intelligent pruning step that uses GPT-4o-mini to remove irrelevant content, significantly reducing token usage (e.g., from 25k to 11k tokens; sketched below).
  • Context Summarization: Condenses tool call results or documents into concise summaries with GPT-4o-mini, shrinking context size while preserving essential information (sketched below).
  • Context Offloading: Demonstrates temporary (scratchpad) and persistent (key-value store) context management outside the LLM's immediate context, enabling memory across interactions (sketched below).
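A minimal sketch of the RAG pattern, assuming an OpenAI model and LangChain's in-memory vector store; the corpus, prompts, and retrieval settings in the actual notebook will differ.

```python
from langchain_core.documents import Document
from langchain_core.tools import tool
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langgraph.prebuilt import create_react_agent

# Index a toy corpus; the notebook indexes real documents.
vectorstore = InMemoryVectorStore(OpenAIEmbeddings())
vectorstore.add_documents(
    [Document(page_content="LangGraph models agents as graphs of nodes and edges.")]
)

@tool
def retrieve(query: str) -> str:
    """Return documents relevant to the query."""
    docs = vectorstore.similarity_search(query, k=3)
    return "\n\n".join(d.page_content for d in docs)

agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=[retrieve])
result = agent.invoke({"messages": [("user", "How does LangGraph model agents?")]})
```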
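The tool-loadout idea, sketched under the same assumptions; the two tools and the registry are illustrative stand-ins for the notebook's indexed math functions.

```python
import math

from langchain_core.documents import Document
from langchain_core.tools import tool
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

@tool
def cosine(x: float) -> float:
    """Compute the cosine of x (radians)."""
    return math.cos(x)

@tool
def square_root(x: float) -> float:
    """Compute the square root of x."""
    return math.sqrt(x)

# Embed only the tool descriptions; the tools themselves stay in a registry.
registry = {t.name: t for t in [cosine, square_root]}
store = InMemoryVectorStore(OpenAIEmbeddings())
store.add_documents(
    [Document(page_content=t.description, metadata={"name": t.name})
     for t in registry.values()]
)

def select_tools(query: str, k: int = 2) -> list:
    # Semantic lookup: return only the tools closest to the query.
    return [registry[d.metadata["name"]] for d in store.similarity_search(query, k=k)]

query = "What is the cosine of 0.5 radians?"
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools(select_tools(query))
response = llm.invoke(query)  # only the relevant tools occupy context
```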
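A sketch of context quarantine, assuming the langgraph-supervisor package; if the notebook wires the supervisor graph by hand, the structure is analogous. The agents and tools here are hypothetical.

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

@tool
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

@tool
def lookup(topic: str) -> str:
    """Return a canned note about a topic (stands in for real retrieval)."""
    return f"Background notes on {topic}."

model = ChatOpenAI(model="gpt-4o-mini")

# Each sub-agent keeps its own message history, quarantined from the others.
math_agent = create_react_agent(model, tools=[add], name="math_expert")
research_agent = create_react_agent(model, tools=[lookup], name="research_expert")

# The supervisor routes requests and sees only each agent's final answer,
# so one agent's intermediate context never leaks into another's window.
app = create_supervisor(
    [math_agent, research_agent],
    model=model,
    prompt="Route each request to the most suitable expert.",
).compile()
```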
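Pruning and summarization share one shape: a small model (GPT-4o-mini, per the README) post-processes text before it re-enters the main context. Both prompts below are illustrative, not the notebooks' wording.

```python
from langchain_openai import ChatOpenAI

# GPT-4o-mini as the cheap post-processing model, per the README.
small_model = ChatOpenAI(model="gpt-4o-mini")

def prune(question: str, context: str) -> str:
    """Pruning: drop passages irrelevant to the question, keep the rest verbatim."""
    prompt = (
        "Delete every passage in the context that is not needed to answer the "
        "question; keep the relevant passages word for word.\n\n"
        f"Question: {question}\n\nContext:\n{context}"
    )
    return small_model.invoke(prompt).content

def summarize(raw: str, max_words: int = 150) -> str:
    """Summarization: compress a tool result while keeping load-bearing facts."""
    prompt = (
        f"Summarize the following in at most {max_words} words, preserving "
        f"names, numbers, and identifiers:\n\n{raw}"
    )
    return small_model.invoke(prompt).content
```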
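Finally, a sketch of both offloading modes: a scratchpad field in the graph state (temporary) and LangGraph's key-value store (persistent across invocations). The namespace and key names are made up.

```python
from typing import TypedDict

from langgraph.store.memory import InMemoryStore

class State(TypedDict):
    messages: list
    scratchpad: str  # temporary notes a node can write now and read later

# Persistent key-value store: survives across graph runs and threads.
store = InMemoryStore()
store.put(("memories", "user-123"), "preferences", {"tone": "concise"})
item = store.get(("memories", "user-123"), "preferences")
print(item.value)  # {'tone': 'concise'}
```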

Maintenance & Community

The provided README does not document maintainers, community channels (such as Discord or Slack), or a public roadmap.

Licensing & Compatibility

The provided README does not state the license under which the repository is distributed.

Limitations & Caveats

The README focuses on demonstrating solutions and does not list limitations, known bugs, or a maturity status. The effectiveness of each technique will also depend on the specific LLM used and the complexity of the task.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 88 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Luis Capelo (cofounder of Lightning AI).

LongLM by datamllab
Self-Extend: LLM context window extension via self-attention
661 stars · Created 1 year ago · Updated 1 year ago

Starred by Yineng Zhang (inference lead at SGLang; research scientist at Together AI) and Lianmin Zheng (coauthor of SGLang and vLLM).

DeepSeek-V3.2-Exp by deepseek-ai
Experimental LLM boosting long-context efficiency
961 stars · Created 1 month ago · Updated 1 month ago