how_to_fix_your_context by langchain-ai

Optimizing LLM context windows for improved performance

Created 3 months ago
452 stars

Top 66.6% on SourcePulse

View on GitHub
Project Summary

This repository addresses a core limitation of Large Language Models (LLMs): context windows are finite, and performance degrades as input length grows. It demonstrates six context engineering techniques that help developers keep LLMs efficient and accurate when handling extensive information, making it a practical reference for AI engineers and researchers who need to manage context effectively.

How It Works

This project uses LangGraph, a framework for building stateful AI applications, to implement and illustrate six context engineering techniques: Retrieval-Augmented Generation (RAG), Tool Loadout, Context Quarantine, Context Pruning, Context Summarization, and Context Offloading. Each technique is presented in a Jupyter notebook that shows how LangGraph's node-and-edge architecture supports complex, stateful agent workflows. That architecture gives granular control over data flow and state management, which is what makes fine-grained context manipulation practical.
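To make the node-and-edge model concrete, here is a minimal, self-contained sketch (illustrative code, not taken from the notebooks): state is a typed dict, nodes are functions that return state updates, and edges wire them together.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    question: str
    answer: str

def answer_node(state: State) -> dict:
    # A real node would call an LLM here; this one just echoes the question.
    return {"answer": f"You asked: {state['question']}"}

builder = StateGraph(State)
builder.add_node("answer", answer_node)  # nodes transform the shared state
builder.add_edge(START, "answer")        # edges define the control flow
builder.add_edge("answer", END)
graph = builder.compile()

print(graph.invoke({"question": "What is context engineering?"}))
```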

Quick Start & Requirements

  • Installation: Clone the repository, create and activate a virtual environment (e.g., using uv venv), and install dependencies with uv pip install -r requirements.txt.
  • Prerequisites: Python 3.9 or higher and the uv package manager.
  • Environment Variables: Set OPENAI_API_KEY and/or ANTHROPIC_API_KEY, depending on which model provider you use (a typical setup cell is sketched after this list).
  • Resources: Links to notebooks are provided within the README for specific technique implementations.
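For the API keys, a setup cell along the following lines is common in LangChain notebooks; the `set_env` helper here is illustrative, not necessarily what the repo defines.

```python
import getpass
import os

def set_env(var: str) -> None:
    # Prompt for a key only if it is not already set in the shell environment.
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

set_env("OPENAI_API_KEY")
set_env("ANTHROPIC_API_KEY")
```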

Highlighted Details

  • RAG: Implements a RAG agent using LangGraph with a retrieval tool, demonstrating effective context augmentation for complex queries (see the first sketch after this list).
  • Tool Loadout: Demonstrates semantic tool selection by indexing math functions in a vector store and dynamically binding only the relevant tools, reducing context confusion (sketched below).
  • Context Quarantine: Uses a LangGraph Supervisor architecture to give each agent in a multi-agent system its own isolated context window, preventing context clash and distraction (sketched below).
  • Context Pruning: Extends RAG with an intelligent pruning step that uses GPT-4o-mini to remove irrelevant content, significantly reducing token usage (e.g., from 25k to 11k tokens; sketched below).
  • Context Summarization: Condenses tool call results or documents into concise summaries with GPT-4o-mini, shrinking context size while preserving essential information (sketched below).
  • Context Offloading: Demonstrates temporary (scratchpad) and persistent (key-value store) context management outside the LLM's immediate context, enabling memory across interactions (sketched below).
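A minimal sketch of the RAG pattern, assuming an OpenAI model and LangChain's in-memory vector store; the corpus, prompts, and retrieval settings in the actual notebook will differ.

```python
from langchain_core.documents import Document
from langchain_core.tools import tool
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langgraph.prebuilt import create_react_agent

# Index a toy corpus; the notebook indexes real documents.
vectorstore = InMemoryVectorStore(OpenAIEmbeddings())
vectorstore.add_documents(
    [Document(page_content="LangGraph models agents as graphs of nodes and edges.")]
)

@tool
def retrieve(query: str) -> str:
    """Return documents relevant to the query."""
    docs = vectorstore.similarity_search(query, k=3)
    return "\n\n".join(d.page_content for d in docs)

agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=[retrieve])
result = agent.invoke({"messages": [("user", "How does LangGraph model agents?")]})
```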
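The tool-loadout idea, sketched under the same assumptions; the two tools and the registry are illustrative stand-ins for the notebook's indexed math functions.

```python
import math

from langchain_core.documents import Document
from langchain_core.tools import tool
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

@tool
def cosine(x: float) -> float:
    """Compute the cosine of x (radians)."""
    return math.cos(x)

@tool
def square_root(x: float) -> float:
    """Compute the square root of x."""
    return math.sqrt(x)

# Embed only the tool descriptions; the tools themselves stay in a registry.
registry = {t.name: t for t in [cosine, square_root]}
store = InMemoryVectorStore(OpenAIEmbeddings())
store.add_documents(
    [Document(page_content=t.description, metadata={"name": t.name})
     for t in registry.values()]
)

def select_tools(query: str, k: int = 2) -> list:
    # Semantic lookup: return only the tools closest to the query.
    return [registry[d.metadata["name"]] for d in store.similarity_search(query, k=k)]

query = "What is the cosine of 0.5 radians?"
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools(select_tools(query))
response = llm.invoke(query)  # only the relevant tools occupy context
```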
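A sketch of context quarantine, assuming the langgraph-supervisor package; if the notebook wires the supervisor graph by hand, the structure is analogous. The agents and tools here are hypothetical.

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

@tool
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

@tool
def lookup(topic: str) -> str:
    """Return a canned note about a topic (stands in for real retrieval)."""
    return f"Background notes on {topic}."

model = ChatOpenAI(model="gpt-4o-mini")

# Each sub-agent keeps its own message history, quarantined from the others.
math_agent = create_react_agent(model, tools=[add], name="math_expert")
research_agent = create_react_agent(model, tools=[lookup], name="research_expert")

# The supervisor routes requests and sees only each agent's final answer,
# so one agent's intermediate context never leaks into another's window.
app = create_supervisor(
    [math_agent, research_agent],
    model=model,
    prompt="Route each request to the most suitable expert.",
).compile()
```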
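Pruning and summarization share one shape: a small model (GPT-4o-mini, per the README) post-processes text before it re-enters the main context. Both prompts below are illustrative, not the notebooks' wording.

```python
from langchain_openai import ChatOpenAI

# GPT-4o-mini as the cheap post-processing model, per the README.
small_model = ChatOpenAI(model="gpt-4o-mini")

def prune(question: str, context: str) -> str:
    """Pruning: drop passages irrelevant to the question, keep the rest verbatim."""
    prompt = (
        "Delete every passage in the context that is not needed to answer the "
        "question; keep the relevant passages word for word.\n\n"
        f"Question: {question}\n\nContext:\n{context}"
    )
    return small_model.invoke(prompt).content

def summarize(raw: str, max_words: int = 150) -> str:
    """Summarization: compress a tool result while keeping load-bearing facts."""
    prompt = (
        f"Summarize the following in at most {max_words} words, preserving "
        f"names, numbers, and identifiers:\n\n{raw}"
    )
    return small_model.invoke(prompt).content
```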
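Finally, a sketch of both offloading modes: a scratchpad field in the graph state (temporary) and LangGraph's key-value store (persistent across invocations). The namespace and key names are made up.

```python
from typing import TypedDict

from langgraph.store.memory import InMemoryStore

class State(TypedDict):
    messages: list
    scratchpad: str  # temporary notes a node can write now and read later

# Persistent key-value store: survives across graph runs and threads.
store = InMemoryStore()
store.put(("memories", "user-123"), "preferences", {"tone": "concise"})
item = store.get(("memories", "user-123"), "preferences")
print(item.value)  # {'tone': 'concise'}
```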

Maintenance & Community

The provided README does not document maintainers, community channels (such as Discord or Slack), or a public roadmap.

Licensing & Compatibility

The provided README does not state the license under which the repository is distributed.

Limitations & Caveats

The README focuses on demonstrating solutions and does not list limitations, known bugs, or a maturity status. The effectiveness of each technique will also depend on the specific LLM used and the complexity of the task.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 88 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Luis Capelo (cofounder of Lightning AI).

LongLM by datamllab
Self-Extend: LLM context window extension via self-attention
661 stars · Created 1 year ago · Updated 1 year ago

Starred by Yineng Zhang (inference lead at SGLang; research scientist at Together AI) and Lianmin Zheng (coauthor of SGLang and vLLM).

DeepSeek-V3.2-Exp by deepseek-ai
Experimental LLM boosting long-context efficiency
961 stars · Created 1 month ago · Updated 1 month ago