TrustLLM by HowieHwong

Trustworthiness benchmark for large language models (ICML 2024)

created 1 year ago
585 stars

Top 56.2% on sourcepulse

Project Summary

TrustLLM is a comprehensive framework for evaluating the trustworthiness of Large Language Models (LLMs), aimed at researchers and developers. It provides a standardized benchmark, evaluation toolkit, and dataset grounded in principles that span eight dimensions of trustworthiness, six of which are benchmarked, enabling systematic assessment and comparison of LLM performance.

How It Works

TrustLLM establishes a benchmark across six key trustworthiness dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. It utilizes a curated collection of over 30 datasets, many of which are introduced in this work, and employs a mix of automatic and human-in-the-loop evaluation methods. The toolkit facilitates easy integration and evaluation of various LLMs, including those accessible via APIs like Azure OpenAI, Replicate, and DeepInfra.
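To make the workflow concrete, the sketch below shows the generation step for one test section using the toolkit's LLMGeneration entry point. The class and argument names follow the usage shown in the repo's README at the time of writing; the model name and paths are placeholders, and the exact signature is an assumption to verify against the current docs.

```python
# Sketch of the generation step (LLMGeneration signature assumed from
# the TrustLLM README; the model name and paths are placeholders).
from trustllm.generation.generation import LLMGeneration

llm_gen = LLMGeneration(
    model_path="meta-llama/Llama-2-7b-chat-hf",  # placeholder local model
    test_type="safety",        # which trustworthiness section to run
    data_path="dataset",       # directory containing the TrustLLM dataset
    online_model=False,        # True when using Replicate/DeepInfra models
    max_new_tokens=512,
    num_gpus=1,
    device="cuda:0",           # GPU strongly recommended for local models
)

llm_gen.generation_results()   # saves model outputs for later evaluation
```

Evaluation then runs per dimension over the saved outputs, combining rule-based metrics with LLM-as-judge scoring (which requires an OpenAI key).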

Quick Start & Requirements

  • Installation: Recommended via GitHub clone (git clone git@github.com:HowieHwong/TrustLLM.git) followed by pip install . from the trustllm_pkg directory; plain pip install trustllm is deprecated.
  • Prerequisites: Python 3.9 is recommended. A GPU is effectively required for efficient generation with locally hosted models; API-hosted models need only the corresponding keys.
  • Resources: The TrustLLM dataset is available on Hugging Face and can also be fetched with the toolkit's download helper (see the sketch after this list). Links to the project website, paper, dataset, data map, and leaderboard are provided.
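As referenced above, a minimal sketch of fetching the dataset through the toolkit's download helper; the module path and save_path argument follow the project docs but should be treated as assumptions to check against the installed version.

```python
# Minimal sketch, assuming the download helper described in the
# TrustLLM docs; verify the module path against your installed version.
from trustllm.dataset_download import download_dataset

download_dataset(save_path="dataset")  # fetches the benchmark datasets locally
```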

Highlighted Details

  • Supports evaluation of 16 mainstream LLMs, including recent models like Llama3 and Mixtral.
  • Integrates with UniGen for dynamic evaluation and supports hosted models via Replicate, DeepInfra, and the Azure OpenAI API (see the configuration sketch after this list).
  • Comprehensive dataset covers misinformation, hallucination, sycophancy, stereotype, disparagement, misuse, and privacy awareness.
  • Includes a public leaderboard for tracking LLM trustworthiness performance.
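For the hosted-model and LLM-as-judge paths mentioned above, API credentials are set on the toolkit's config module. A minimal sketch, assuming the attribute names shown in the repo's documentation (unverified here; Azure OpenAI has its own settings in the docs):

```python
# Minimal sketch, assuming trustllm.config exposes these attributes as
# in the repo docs; the attribute names are assumptions to verify.
from trustllm import config

config.openai_key = "sk-..."    # used for GPT-based automatic evaluation
config.deepinfra_api = "..."    # for models served through DeepInfra
config.replicate_api = "..."    # for models served through Replicate
```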

Maintenance & Community

The project accompanies an ICML 2024 paper and shows active development, with recent releases (v0.3.0, April 2024) adding new models and features. Contributions are welcomed via pull requests.

Licensing & Compatibility

The code is released under the MIT license, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The README lists "Chinese output evaluation" and "Downstream application evaluation" as ongoing work, so these areas are likely less mature or incomplete. Several datasets are described as "first proposed in our benchmark," meaning they have not yet seen external validation and may be refined over time.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 37 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems) and Jerry Liu (cofounder of LlamaIndex).

deepeval by confident-ai

LLM evaluation framework for unit testing LLM outputs

Top 2.0% on sourcepulse · 10k stars · created 2 years ago · updated 16 hours ago