Open-source testing framework for AI & LLM systems
Top 10.7% on sourcepulse
Giskard is an open-source Python framework for evaluating and testing AI systems, covering both LLM-based applications such as RAG agents and traditional ML models. It aims to identify and mitigate risks related to performance, bias, and security vulnerabilities, offering automated scanning and test dataset generation for quality assurance.
How It Works
Giskard automates the detection of issues such as hallucinations, prompt injection, and discrimination by analyzing model outputs against predefined or generated test cases. For RAG applications, its RAG Evaluation Toolkit (RAGET) can automatically generate question-answer pairs and relevant contexts from a knowledge base, enabling detailed evaluation of RAG components like the generator, retriever, and knowledge base itself.
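The workflow described above can be sketched roughly as follows. This is a minimal illustration, not a definitive setup: the answer() function, the "question" feature name, the model and suite names, and the agent descriptions are hypothetical placeholders, and actually running the scan or RAGET generation assumes giskard[llm] is installed and an LLM client (for example an OpenAI API key) is configured. The Giskard imports are done lazily inside the functions so that the wrapper logic can be read and exercised on its own.

```python
"""Sketch: wrap an app for a Giskard scan and build a RAGET test set."""


def answer(question: str) -> str:
    # Hypothetical stand-in for the application under test.
    return f"No answer available for: {question}"


def predict(df):
    # Giskard calls this with a pandas DataFrame and expects one output per row.
    return [answer(q) for q in df["question"]]


def run_scan():
    import giskard  # lazy import: assumes giskard[llm] is installed

    model = giskard.Model(
        model=predict,
        model_type="text_generation",
        name="Demo Q&A agent",  # placeholder name
        description="Answers customer questions about a product.",
        feature_names=["question"],
    )
    # Automated issue detection (hallucination, prompt injection, and so on).
    scan_results = giskard.scan(model)
    # Turn the detected issues into a reusable test suite.
    test_suite = scan_results.generate_test_suite("Demo test suite")
    return scan_results, test_suite


def build_rag_testset(documents, num_questions=10):
    import pandas as pd
    from giskard.rag import KnowledgeBase, generate_testset

    # One row per document chunk; "text" is an arbitrary column name.
    knowledge_base = KnowledgeBase.from_pandas(
        pd.DataFrame({"text": list(documents)}), columns=["text"]
    )
    # Generates question-answer pairs with reference contexts from the knowledge base.
    return generate_testset(
        knowledge_base,
        num_questions=num_questions,
        language="en",
        agent_description="A Q&A agent for the demo knowledge base",
    )
```

Calling run_scan() returns the scan report together with a generated test suite, and build_rag_testset(["first chunk", "second chunk"]) returns a RAGET test set; both calls issue LLM requests, so they are left uncalled here.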
Quick Start & Requirements
Install: pip install "giskard[llm]" -U
Additional packages: langchain, langchain-openai, tiktoken, and pypdf.
Highlighted Details
giskard-vision for computer vision tasks.
giskard.scan() function for automated issue detection and scan_results.generate_test_suite() for creating test suites.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats