xingjunm
Surveying safety across large models and AI agents
Top 98.5% on SourcePulse
Summary
This repository presents a comprehensive survey of safety research concerning large AI models, including LLMs, VLMs, and diffusion models. It offers a structured taxonomy of safety threats, defense strategies, datasets, and benchmarks, aiming to provide researchers and practitioners with a systematic overview of the field. The survey highlights key trends, identifies open challenges, and serves as a foundational resource for understanding and advancing large model safety.
How It Works
The survey systematically reviews safety research across six model categories: Vision Foundation Models (VFMs), Large Language Models (LLMs), Vision-Language Pre-training (VLP) models, Vision-Language Models (VLMs), Diffusion Models (DMs), and large-model-based Agents. For each category, research is organized into attacks and defenses, detailing ten attack types (e.g., adversarial, backdoor, jailbreak, prompt injection). A two-level taxonomy (Category → Subcategory) classifies attacks and defenses based on threat models or specific subtasks. The methodology involved keyword-based searches followed by manual filtering, resulting in the analysis of 390 technical papers.
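The organization described above can be pictured as a small data structure. The following is a minimal sketch, not code from the repository (which ships none): the `Entry` schema, the function names, the paper titles, and the subcategory labels are all hypothetical and chosen only for illustration. It assumes each paper is filed under a model family, marked as an attack or a defense, and then placed in the two-level Category → Subcategory taxonomy; the keyword screen models only the first pass of the search, since manual filtering cannot be automated here.

```python
from collections import defaultdict
from dataclasses import dataclass

# The six model families named in the survey summary.
MODEL_FAMILIES = ("VFM", "LLM", "VLP", "VLM", "DM", "Agent")
KINDS = ("attack", "defense")


@dataclass(frozen=True)
class Entry:
    """One surveyed paper placed in the taxonomy (hypothetical schema)."""
    title: str
    family: str       # one of MODEL_FAMILIES
    kind: str         # "attack" or "defense"
    category: str     # level 1, e.g. "jailbreak"
    subcategory: str  # level 2, e.g. a threat model or subtask


def keyword_screen(titles, keywords):
    """Keep titles containing any keyword, case-insensitively.

    This is only the automated first pass; manual filtering would follow.
    """
    lowered = [k.lower() for k in keywords]
    return [t for t in titles if any(k in t.lower() for k in lowered)]


def build_taxonomy(entries):
    """Group titles as family -> kind -> category -> subcategory -> [titles]."""
    tree = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    )
    for e in entries:
        if e.family not in MODEL_FAMILIES:
            raise ValueError(f"unknown model family: {e.family}")
        if e.kind not in KINDS:
            raise ValueError(f"unknown kind: {e.kind}")
        tree[e.family][e.kind][e.category][e.subcategory].append(e.title)
    return tree


if __name__ == "__main__":
    # Placeholder entries; titles and subcategories are illustrative only.
    entries = [
        Entry("Placeholder paper A", "LLM", "attack", "jailbreak", "black-box"),
        Entry("Placeholder paper B", "LLM", "attack", "jailbreak", "white-box"),
        Entry("Placeholder paper C", "DM", "defense", "backdoor", "detection"),
    ]
    tree = build_taxonomy(entries)
    print(sorted(tree["LLM"]["attack"]["jailbreak"]))
```

Nesting the grouping this way keeps the attack/defense split and the two taxonomy levels as separate keys, so a lookup such as "all jailbreak attacks on LLMs" is a single path through the tree.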
Quick Start & Requirements
This repository contains a survey paper and associated research findings, not executable code. Therefore, there are no installation or runtime requirements.
Maintenance & Community
A major revision was completed in August 2025, representing the final planned update for this version of the survey. Researchers can submit papers for citation via a provided form to help maintain comprehensiveness.
Licensing & Compatibility
Licensing information is not specified in the provided README.
Limitations & Caveats
The August 2025 revision is the final planned update for this version of the survey, so further substantial revisions may not follow. The survey summarizes key ideas and approaches and omits deep technical details and experimental analyses of individual papers.