Trustworthiness benchmark for large language models (ICML 2024)
Top 56.2% on sourcepulse
TrustLLM is a comprehensive framework for evaluating the trustworthiness of Large Language Models (LLMs), aimed at researchers and developers. It provides a standardized benchmark, an evaluation toolkit, and curated datasets organized around trustworthiness principles spanning eight dimensions, six of which are operationalized in the benchmark, enabling systematic assessment and comparison of LLM performance.
How It Works
TrustLLM establishes a benchmark across six key trustworthiness dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. It utilizes a curated collection of over 30 datasets, many of which are introduced in this work, and employs a mix of automatic and human-in-the-loop evaluation methods. The toolkit facilitates easy integration and evaluation of various LLMs, including those accessible via APIs like Azure OpenAI, Replicate, and DeepInfra.
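As a rough illustration, the sketch below follows the response-generation pattern shown in the project's README: pick a model, a benchmark section, and a dataset path, then generate outputs for later scoring. The model name, data path, and flag values are placeholders, and parameter names may vary between toolkit versions.

# Minimal generation sketch (arguments mirror the README's LLMGeneration example; values are placeholders).
from trustllm.generation.generation import LLMGeneration

run = LLMGeneration(
    model_path="meta-llama/Llama-2-7b-chat-hf",  # placeholder local model; API-hosted models are also supported
    test_type="safety",                          # which benchmark section to generate responses for
    data_path="TrustLLM_dataset/",               # placeholder path to the downloaded benchmark data
    online_model=False,                          # assumption: set True (with the relevant API key) for Replicate/DeepInfra models
    max_new_tokens=512,
    device="cuda:0",
)
run.generation_results()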
Quick Start & Requirements
Install from source by cloning the repository (git clone git@github.com:HowieHwong/TrustLLM.git) and running pip install . from the trustllm_pkg directory. Installation via pip (pip install trustllm) is deprecated.
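Once the package is installed, previously generated outputs can be scored per dimension. The sketch below follows the evaluation pattern in the project's documentation, using safety as an example; module and function names mirror the documented trustllm package, but exact signatures and file paths are assumptions and may differ across versions.

# Minimal evaluation sketch (names follow the documented trustllm usage; treat as illustrative).
from trustllm import config
from trustllm.task import safety
from trustllm.utils import file_process

config.openai_key = "sk-..."  # assumption: some metrics call an OpenAI model as an automatic judge

evaluator = safety.SafetyEval()
jailbreak_data = file_process.load_json("generation_results/jailbreak.json")  # hypothetical output path
print(evaluator.jailbreak_eval(jailbreak_data, eval_type="total"))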
Maintenance & Community
The project accompanies an ICML 2024 paper and is under active development: the v0.3.0 release (April 2024) added new models and features. Contributions are welcomed via pull requests.
Licensing & Compatibility
The code is released under the MIT license, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The README notes ongoing work on "Chinese output evaluation" and "Downstream application evaluation," suggesting these areas may be incomplete or less mature. Several datasets are marked as "first proposed in our benchmark," implying they may still be refined.