Discover and explore top open-source AI tools and projects—updated daily.
YutoTerashimaLLM agent safety and tool-use evaluation
New!
Top 77.7% on SourcePulse
<2-3 sentences summarising what the project addresses and solves, the target audience, and the benefit.> This project provides a reproducible lab for evaluating Large Language Model (LLM) agents as systems, focusing on their complete execution traces, tool usage, policy adherence, and safety outcomes. It targets engineers and researchers needing to assess agent reliability beyond single-message interactions, offering a systematic approach to identify and mitigate risks in complex agent workflows.
How It Works
The lab employs a mock mode by default, allowing for isolated testing and replayability. It integrates with various LLM adapters (OpenAI, Hugging Face, LiteLLM) by normalizing agent traces. A core pipeline records traces, grades tool policy adherence, and applies safety rubrics, culminating in a risk report. This architecture enables detailed analysis of agent behavior, including denied tool counts and latency, within a replayable mock environment.
Quick Start & Requirements
pip install -e ".[dev]" within a Python virtual environment (.venv), followed by python examples/run_mock_eval.py.conda run -n Transformers python scripts/run_experiment.py --device cuda) and a specific Transformers conda environment.Highlighted Details
Maintenance & Community
No specific community links (e.g., Discord, Slack) or details on maintainers/sponsorships were found in the provided text.
Licensing & Compatibility
The license type is not specified in the provided README content.
Limitations & Caveats
Publicly shared failure examples are redacted to avoid exposing sensitive content, though metadata remains for reproducibility. GPU-backed experiments require specific conda environments and hardware. Some experimental configurations, like the GPU MLP, may require further calibration for production use.
3 weeks ago
Inactive
hkust-nlp
openlit