MLLM research project for reasoning and reflection via Collective Monte Carlo Tree Search
Mulberry is an open-source project that enhances multimodal large language models (MLLMs) with advanced reasoning and reflection capabilities. It targets researchers and developers looking to improve MLLM performance on complex tasks requiring step-by-step problem-solving, offering a novel approach to generate and leverage reasoning data.
How It Works
Mulberry employs Collective Monte Carlo Tree Search (CoMCTS) to generate step-by-step reasoning and reflection data. CoMCTS leverages the collective knowledge of multiple LLMs to collaboratively explore, identify, and refine effective reasoning paths. This iterative process, involving expansion, simulation, error positioning, backpropagation, and selection, aims to improve the success rate and efficiency of reasoning path searches, ultimately leading to better model performance.
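The iterative loop described above can be sketched in miniature. The code below is an illustrative toy, not the official Mulberry implementation: the `Node` class, the stand-in "models" (plain functions that propose a next reasoning step), and the random scoring function are all assumptions made for demonstration. It shows the shape of a collective search in which several proposers expand each leaf, steps are scored, and values are backpropagated before the best-visited path is selected.

```python
import math
import random

# Toy sketch of a Collective-MCTS-style search (illustrative only; the
# real CoMCTS operates over MLLM-generated reasoning steps and uses
# model-based error positioning rather than a random reward).

class Node:
    def __init__(self, step, parent=None):
        self.step = step          # reasoning-step text at this node
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def ucb(self, c=1.4):
        # Upper-confidence bound used during selection.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )

def comcts(models, evaluate, iterations=50):
    root = Node("question")
    for _ in range(iterations):
        # Selection: descend via UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # Expansion: each model in the collective proposes a next step.
        for model in models:
            node.children.append(Node(model(node.step), parent=node))
        # Simulation + error positioning (stand-in): score each new step;
        # low scores play the role of "this step is likely erroneous".
        for child in node.children:
            reward = evaluate(child.step)
            # Backpropagation: update statistics up to the root.
            cur = child
            while cur is not None:
                cur.visits += 1
                cur.value += reward
                cur = cur.parent
    # Selection of the final trajectory: follow the most-visited children.
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
        path.append(node.step)
    return path

# Toy usage: two "models" that each append a distinct marker token.
models = [lambda s: s + " ->A", lambda s: s + " ->B"]
path = comcts(models, evaluate=lambda s: random.random(), iterations=20)
```

In the real system the reward and error positioning come from the participating MLLMs rather than a random number, and the selected path becomes step-by-step reasoning training data.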
Quick Start & Requirements
The repository provides scripts for inference (infer.py), data construction (data_construction.py), and training (via LLaMA-Factory).
Maintenance & Community
The project is primarily associated with authors from Nanyang Technological University, Tsinghua University, Baidu, and SYSU. Recent updates include the release of evaluation code, models (Mulberry_llama_11b, Mulberry_qwen2vl_7b), and reasoning data.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Users should verify licensing for commercial use or integration with closed-source projects.
Limitations & Caveats
The project acknowledges that hallucinations in intermediate reasoning steps can still occur, even with error detection mechanisms. Smaller models used for error localization may be less effective, and larger models can sometimes exhibit inaccurate localization. Ensuring the correctness of all intermediate steps is noted as a significant challenge requiring costly human verification.