File-System-Paper  by hegongshan

Curated list of file system research papers

Created 3 years ago
299 stars

Top 88.9% on SourcePulse

Project Summary

This repository is a curated list of seminal and influential research papers on file systems, targeting researchers, students, and engineers in storage systems and operating systems. It provides a structured overview of key advancements and challenges in local, distributed, and specialized file system designs.

How It Works

The repository categorizes papers across various file system domains, including local file systems (kernel and user-space), distributed file systems (general purpose, big data, HPC, cloud, AI), crash consistency, fragmentation, scalability, data management, metadata, fault tolerance, and hardware optimization. Each entry includes a citation and a link to the PDF, facilitating in-depth study of file system evolution and design principles.
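Because each entry pairs a citation with a PDF link under a category heading, the list lends itself to simple indexing. The following is a minimal sketch, not a tool shipped with the repository: it assumes the README uses Markdown headings (`## ...`) and inline Markdown links (`[label](url)`), which the summary above does not confirm. The function name `index_links`, the sample text, and the `example.org` URLs are all hypothetical.

```python
import re
from collections import defaultdict

# Assumption: links are written as inline Markdown, e.g. [PDF](https://example.org/a.pdf).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")
# Assumption: categories are Markdown headings, e.g. "## Crash Consistency".
HEADING_RE = re.compile(r"^(#+)\s+(.*\S)\s*$")


def index_links(markdown_text):
    """Group (label, url) pairs under the most recent heading seen."""
    index = defaultdict(list)
    section = "(no heading)"
    for line in markdown_text.splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            section = heading.group(2)
            continue
        for label, url in LINK_RE.findall(line):
            index[section].append((label, url))
    return dict(index)


# Hypothetical sample text; these are not real entries from the repository.
SAMPLE = """\
## Crash Consistency
- Hypothetical Paper A. Some Venue, 2020. [PDF](https://example.org/a.pdf)
## Metadata
- Hypothetical Paper B. Some Venue, 2021. [PDF](https://example.org/b.pdf)
"""

if __name__ == "__main__":
    for section, links in index_links(SAMPLE).items():
        print(section, links)
```

If the actual README formats its entries differently, the two regular expressions are the only parts that would need adjusting.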

Quick Start & Requirements

  • An internet connection is required to download the linked PDFs.
  • No software installation is needed; this is a reference list.
  • Links to official documentation or code are provided for specific file systems mentioned (e.g., Linux File System, libfuse, Ceph, HDFS, Lustre).

Highlighted Details

  • Comprehensive coverage from early UNIX file system work (e.g., FFS, vnodes) to modern systems (e.g., Btrfs, EROFS, Ceph, HDFS, Lustre).
  • Dedicated sections for emerging areas like AI for File Systems and File Systems for AI.
  • Includes papers on crucial aspects like crash consistency, fragmentation, multicore scalability, and metadata management.
  • Features surveys and analysis papers for a broader understanding of the field.

Maintenance & Community

  • The repository is maintained by hegongshan.
  • No specific community channels (e.g., Discord, Slack) or roadmap are indicated in the README.

Licensing & Compatibility

  • The repository itself is a list of links to research papers; the license of each paper is set by its publisher.
  • Because the repository contains no code, questions of commercial use or closed-source linking apply only to the individual papers and depend on each paper's licensing terms.

Limitations & Caveats

This repository is a reference list and does not provide code or implementations. The "quick start" amounts to opening the linked PDFs; what a reader gains depends on how closely they study the cited papers.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
