CognitiveKernel-Pro by Tencent

Framework for training deep research agents and foundation models

Created 5 months ago

482 stars

Top 63.6% on SourcePulse

Project Summary

Cognitive Kernel-Pro (CogKernel-Pro) is an open-source framework for training deep research agents and agent foundation models. It provides a reproducible supervised fine-tuning (SFT) recipe that, according to its authors, outperforms reinforcement learning (RL) based models without any RL training. The framework targets researchers and developers building AI agents for complex tasks that involve web browsing, file manipulation, and multimodal interaction.

How It Works

CogKernel-Pro employs a modular agent architecture, allowing different agents (e.g., web agent, file agent) to handle specific tasks. It integrates with various LLMs and VLMs, supporting both local deployments (like vLLM) and cloud APIs (OpenAI, Claude). The framework executes generated Python code directly, necessitating careful sandboxing. It leverages tools like Playwright for web interaction and supports multiple search backends, including Google and DuckDuckGo. The core innovation lies in its SFT training approach, which bypasses the complexities of RL for agent training.
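The modular layout described above can be sketched roughly as follows. This is a minimal illustration, not CogKernel-Pro's actual API: the names SubAgent, MainAgent, web_agent, and file_agent are hypothetical, and the sub-agents are placeholders that return a string instead of driving a browser or parsing files.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# All names here are hypothetical and chosen for illustration only.
@dataclass
class SubAgent:
    name: str
    description: str
    run: Callable[[str], str]


def web_agent(task: str) -> str:
    # Placeholder: a real web agent would drive a browser (e.g. via Playwright).
    return f"[web] handled: {task}"


def file_agent(task: str) -> str:
    # Placeholder: a real file agent would read documents or spreadsheets.
    return f"[file] handled: {task}"


class MainAgent:
    """Keeps a registry of sub-agents and routes each task to one by name."""

    def __init__(self) -> None:
        self.sub_agents: Dict[str, SubAgent] = {}

    def register(self, agent: SubAgent) -> None:
        self.sub_agents[agent.name] = agent

    def dispatch(self, agent_name: str, task: str) -> str:
        if agent_name not in self.sub_agents:
            raise KeyError(f"unknown sub-agent: {agent_name}")
        return self.sub_agents[agent_name].run(task)


main = MainAgent()
main.register(SubAgent("web", "browses pages", web_agent))
main.register(SubAgent("file", "reads local files", file_agent))
result = main.dispatch("web", "find the paper abstract")
print(result)
```

In a real system the choice of sub-agent would come from the model's own plan rather than a hard-coded name; the registry pattern only shows how task-specific agents can sit behind one entry point.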

Quick Start & Requirements

Installation: pip install ... (see README for full list).
Prerequisites: Python 3.12 recommended, bash, zsh (Mac), Node.js (for web server), poppler-utils, libreoffice, ffmpeg. Google Search API key is optional; DuckDuckGo is a free alternative.
Web Server Setup: Requires running ck_web/_web/run_local.sh (Linux) or ck_web/_web/run_local_mac.sh (Mac).
Sandboxing: Strongly recommended due to direct code execution.
Documentation: arXiv paper (arXiv:2508.00414)
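As a rough illustration of why isolation matters when model-generated code is executed, the sketch below runs a snippet in a separate Python process with a timeout and a throwaway working directory. It is an assumption-laden example, not the framework's own mechanism: run_generated_code is a hypothetical helper, and process separation alone is not a sandbox. A container or virtual machine is still needed to restrict file system and network access.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run a code snippet in a child Python process (hypothetical helper).

    This limits run time and keeps the snippet out of the parent process,
    but it does NOT restrict file system or network access.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        # -I starts Python in isolated mode (ignores PYTHON* environment
        # variables and the user site-packages directory).
        return subprocess.run(
            [sys.executable, "-I", str(script)],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
        )


outcome = run_generated_code("print(1 + 1)")
print(outcome.returncode, outcome.stdout.strip())
```

A caller would typically catch subprocess.TimeoutExpired and treat a non-zero return code as a failed step rather than letting either crash the agent loop.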

Highlighted Details

Reported to outperform RL-based models such as WebDancer and WebSailor using SFT alone.
Supports multimodal inputs and outputs.
Includes reflection and self-evaluation capabilities (gpt_judge).
Provides data formats and analysis scripts for trajectory evaluation.

Maintenance & Community

Developed by Tencent AI Lab.
Contact: tianqfang(at)tencent(dot)com.
Paper: arXiv:2508.00414

Licensing & Compatibility

The repository appears to be open-source, but a specific license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking would require clarification.

Limitations & Caveats

The framework explicitly warns that generated Python code is executed directly and that no safety checks are currently in place, which makes sandboxing essential. Setup involves several components (an LLM server and a web server) plus environment variable configuration, so it can be complex. The full SFT dataset is listed as "coming soon."
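Because setup spans several services and environment variables, a small pre-flight check can catch missing configuration before a run starts. The sketch below is illustrative only: the variable names LLM_URL, WEB_IP, and GOOGLE_SEARCH_API_KEY are assumptions rather than names confirmed by the README, and the fallback simply mirrors the statement that the Google key is optional and DuckDuckGo is the free alternative.

```python
import os
from typing import List, Mapping

# Hypothetical variable names, chosen for illustration only.
REQUIRED_VARS = ("LLM_URL", "WEB_IP")
OPTIONAL_SEARCH_KEY = "GOOGLE_SEARCH_API_KEY"


def check_environment(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    return [
        f"missing required variable: {name}"
        for name in REQUIRED_VARS
        if not env.get(name)
    ]


def pick_search_backend(env: Mapping[str, str]) -> str:
    """Use Google only when a key is present; otherwise use DuckDuckGo."""
    return "google" if env.get(OPTIONAL_SEARCH_KEY) else "duckduckgo"


problems = check_environment(os.environ)
backend = pick_search_backend(os.environ)
for line in problems:
    print(line)
print(f"search backend: {backend}")
```

Passing the environment in as a mapping keeps both functions easy to test with a plain dictionary.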
