Framework for LLM long-sequence processing via MapReduce-inspired divide-and-conquer
LLMxMapReduce is a framework for processing and generating long sequences using large language models (LLMs), inspired by the MapReduce paradigm. It aims to address the challenge of integrating and analyzing information from extensive inputs, enabling LLMs to handle long-to-long generation tasks more effectively. The target audience includes researchers and developers working with LLMs on applications requiring long-form content generation.
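As a rough illustration of the MapReduce-style divide-and-conquer idea mentioned above, the sketch below splits a long input into chunks, processes each chunk independently (map), and then combines the partial results (reduce). It is not the project's code: the function names, the word-based chunking, and the stand-in map and reduce functions are all assumptions chosen to keep the example runnable without an LLM.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Divide step: cut the input into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def map_reduce(
    text: str,
    chunk_size: int,
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
) -> str:
    """Apply map_fn to every chunk, then combine the partial results with reduce_fn."""
    partials = [map_fn(chunk) for chunk in split_into_chunks(text, chunk_size)]
    return reduce_fn(partials)


# Hypothetical stand-ins for LLM calls, used only for illustration.
def first_word_summary(chunk: str) -> str:
    return chunk.split()[0]


def join_summaries(parts: List[str]) -> str:
    return " | ".join(parts)


result = map_reduce("a b c d e f g", 3, first_word_summary, join_summaries)
print(result)  # a | d | g
```

In a real pipeline, the map and reduce functions would presumably wrap model calls; the control flow is the only part this sketch is meant to show.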
How It Works
LLMxMapReduce-V2 employs an entropy-driven convolutional test-time scaling mechanism. This approach, drawing from convolutional neural networks, uses stacked convolutional scaling layers to progressively integrate local features into higher-level global representations. This iterative refinement allows LLMs to better process and synthesize information from extremely large input volumes, improving coherence and informativeness in generated long-form articles.
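The stacking idea described above can be sketched as repeated sliding-window merges: each layer merges overlapping windows of neighbouring items, and layers are applied until a single global result remains. This is a minimal sketch under stated assumptions, not the LLMxMapReduce-V2 implementation. The entropy-driven part is omitted entirely, the window width and stride are arbitrary, and the merge function is a hypothetical stand-in for an LLM call.

```python
from typing import Callable, List, Sequence


def conv_layer(
    items: Sequence[str],
    merge: Callable[[Sequence[str]], str],
    width: int = 3,
    stride: int = 2,
) -> List[str]:
    """Merge each overlapping window of neighbouring items into one item."""
    if width < 2 or not 1 <= stride <= width:
        raise ValueError("need width >= 2 and 1 <= stride <= width")
    if len(items) <= width:
        return [merge(items)]
    # The range bound guarantees that the last window reaches the final item.
    return [
        merge(items[start:start + width])
        for start in range(0, len(items) - width + stride, stride)
    ]


def stack_layers(
    items: Sequence[str],
    merge: Callable[[Sequence[str]], str],
    width: int = 3,
    stride: int = 2,
) -> str:
    """Apply layers until the local items are fused into one global result."""
    layer = list(items)
    if not layer:
        raise ValueError("items must not be empty")
    while len(layer) > 1:
        layer = conv_layer(layer, merge, width, stride)
    return layer[0]


# Hypothetical merge function; a real system would call a model here.
def bracket_merge(window: Sequence[str]) -> str:
    return "(" + "+".join(window) + ")"


print(stack_layers(list("abcde"), bracket_merge))  # ((a+b+c)+(c+d+e))
```

The overlap between windows (item c appears in both first-layer merges) is what lets information from neighbouring regions meet before the final merge, loosely mirroring how receptive fields grow across stacked convolutional layers.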
Quick Start & Requirements
Create and activate a conda environment (conda create -n llm_mr_v2 python=3.11, conda activate llm_mr_v2), install dependencies (pip install -r requirements.txt), and install Playwright browsers (python -m playwright install --with-deps chromium).
Download the NLTK tokenizer data (nltk.download('punkt_tab')).
Set environment variables: OPENAI_API_KEY, OPENAI_API_BASE, GOOGLE_API_KEY, SERP_API_KEY (optional), PROMPT_LANGUAGE (optional, defaults to English).
model_config.json specifies API type (OpenAI/Google) and model names.
Run the pipeline: bash scripts/pipeline_start.sh TOPIC output_file_path.jsonl
Input data: a title field and a papers list (each paper containing title, optional abstract, and txt).
Highlighted Details
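The data format listed under Quick Start (a title plus a list of papers, each with title, optional abstract, and txt) might look like the record built below. This is a sketch only: it assumes the fields describe pipeline input, that one JSON object is written per line of a .jsonl file, and that title and txt are required for every paper. The helper name make_record and the sample values are hypothetical.

```python
import json
from typing import Any, Dict, List


def make_record(title: str, papers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build one record, checking the fields the summary lists as non-optional."""
    for index, paper in enumerate(papers):
        for key in ("title", "txt"):
            if not paper.get(key):
                raise ValueError(f"paper {index} is missing required field '{key}'")
    return {"title": title, "papers": papers}


record = make_record(
    "Example survey topic",
    [
        # "abstract" is optional, so the second paper omits it.
        {"title": "Paper A", "abstract": "Short abstract.", "txt": "Full text of paper A."},
        {"title": "Paper B", "txt": "Full text of paper B."},
    ],
)

# Assumed layout: one JSON object per line.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Check the repository README for the authoritative schema before relying on this layout.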
Maintenance & Community
Developed collaboratively by AI9STARS, OpenBMB, and THUNLP.
Licensing & Compatibility
The repository does not explicitly state a license in the README.
Limitations & Caveats
The project strongly recommends using Gemini Flash models, warning of potential unknown errors with other models. It is not recommended for use with locally deployed models due to high API consumption and concurrency requirements.