intellagent by plurai-ai

Framework for agent diagnosis and optimization using simulated interactions

Created 1 year ago

1,168 stars

Top 32.9% on SourcePulse

Project Summary

IntellAgent is a framework for evaluating and optimizing conversational AI agents by simulating thousands of realistic, challenging interactions. It targets developers and researchers seeking to uncover agent blind spots, improve reliability, and enhance user experience before real-world deployment. The core benefit is stress-testing agents to identify and fix failure points through automated scenario generation and detailed performance analysis.

How It Works

The framework decomposes user prompts into a policy graph, samples policies based on real conversation distributions, and generates interaction scenarios. A user agent then simulates these interactions with the target chatbot. Finally, the conversation is critiqued to provide feedback on tested policies, enabling targeted improvements. This multi-agent simulation approach allows for comprehensive stress-testing and identification of edge-case failures.
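The pipeline above can be illustrated with a small toy sketch. This is not IntellAgent's code or API: the graph contents, the function names (`sample_policies`, `build_scenario`, `simulate`, `critique`), and the `echo_bot` target are all hypothetical, and the LLM-driven user agent and critic are replaced by simple stubs so the flow can be run without API keys.

```python
import random
from typing import Callable

# Hypothetical policy graph: nodes are policies, edge weights express how
# likely two policies are to appear in the same conversation.
NODE_WEIGHTS = {"refund": 3.0, "identity_check": 2.0, "escalation": 1.0}
EDGE_WEIGHTS = {
    ("refund", "identity_check"): 2.0,
    ("refund", "escalation"): 1.0,
    ("identity_check", "escalation"): 0.5,
}


def edge_weight(a: str, b: str) -> float:
    """Look up an undirected edge weight; missing edges count as 0."""
    return EDGE_WEIGHTS.get((a, b), EDGE_WEIGHTS.get((b, a), 0.0))


def sample_policies(k: int, rng: random.Random) -> list[str]:
    """Weighted random walk: pick a start policy, then add connected ones."""
    nodes = list(NODE_WEIGHTS)
    chosen = rng.choices(nodes, weights=[NODE_WEIGHTS[n] for n in nodes])
    while len(chosen) < k:
        candidates = [n for n in nodes if n not in chosen]
        weights = [edge_weight(chosen[-1], n) for n in candidates]
        if not candidates or sum(weights) == 0:
            break  # nothing reachable is left
        chosen += rng.choices(candidates, weights=weights)
    return chosen


def build_scenario(policies: list[str]) -> str:
    """Turn the sampled policies into a (very simplified) scenario text."""
    return "User request that touches: " + ", ".join(policies)


def simulate(scenario: str, chatbot: Callable[[str], str],
             turns: int = 2) -> list[tuple[str, str]]:
    """Stub user agent: sends scripted messages and records the replies."""
    dialog = []
    for i in range(turns):
        user_msg = f"[turn {i + 1}] {scenario}"
        dialog.append(("user", user_msg))
        dialog.append(("assistant", chatbot(user_msg)))
    return dialog


def critique(dialog: list[tuple[str, str]],
             policies: list[str]) -> dict[str, bool]:
    """Stub critic: a policy 'passes' if any bot reply mentions it by name."""
    replies = " ".join(text for role, text in dialog if role == "assistant")
    return {p: p in replies for p in policies}


def echo_bot(message: str) -> str:
    """Placeholder target chatbot that simply acknowledges the message."""
    return "Acknowledged: " + message


rng = random.Random(0)
policies = sample_policies(2, rng)
dialog = simulate(build_scenario(policies), echo_bot)
report = critique(dialog, policies)
print(policies, report)
```

In a real run, the scenario generator, user agent, and critic would each be backed by an LLM, and the per-policy report is what points to the behaviours that need fixing.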

Quick Start & Requirements

Install via pip install -r requirements.txt after cloning the repository.
Requires Python >= 3.9.
LLM API keys (OpenAI, Azure, Vertex, Anthropic) must be configured in config/llm_env.yml.
Default cost per sample is ~$0.10, controllable via cost_limit.
Documentation: https://intellagent-doc.plurai.ai/
Quick Start: https://github.com/plurai-ai/intellagent#fire-quickstart
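For budgeting a run, the ~$0.10 per-sample figure above gives a rough estimate. The sketch below assumes cost scales linearly with the number of samples, which is a simplification; the helper names are hypothetical and are not part of IntellAgent, and the actual semantics of cost_limit should be checked in the documentation.

```python
# Approximate default cost per sample, taken from the figure quoted above.
# Stored in cents to avoid floating-point rounding surprises.
COST_PER_SAMPLE_CENTS = 10


def estimated_cost_usd(n_samples: int) -> float:
    """Rough total cost in USD, assuming cost grows linearly with samples."""
    return n_samples * COST_PER_SAMPLE_CENTS / 100


def max_samples(budget_cents: int) -> int:
    """Largest number of samples that fits under the budget (in cents)."""
    return budget_cents // COST_PER_SAMPLE_CENTS


print(estimated_cost_usd(1000))  # rough cost of 1,000 samples
print(max_samples(2500))         # samples that fit in a $25 budget
```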

Highlighted Details

Automatically generates thousands of realistic edge-case scenarios tailored to specific agents.
Simulates diverse user interactions across varying complexity levels.
Provides comprehensive performance evaluations to identify gaps and compare outcomes.
Offers simple integration with existing conversational agents.

Maintenance & Community

Active development with a roadmap including integrations for LangGraph, CrewAI, and AutoGen.
Community support via Discord: https://discord.gg/YWbT87vAau
Newsletter available: https://plurai.substack.com/

Licensing & Compatibility

Licensed under Apache 2.0.
Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project collects basic usage metrics; collection can be disabled by setting PLURAI_DO_NOT_TRACK. Some advanced optimization features are described as available only with premium access.
