training-plan by CDDSCLab

Curriculum for distributed database research

created 5 years ago
895 stars

Top 41.3% on sourcepulse

Project Summary

This repository lays out the training plan for new students at CDDSCLab. It covers database fundamentals, distributed systems, and AI applications in databases, and it targets aspiring researchers and developers in distributed storage and computing. The goal is to build software engineering skills, programming proficiency (C++, Java, Go), and knowledge of database internals and distributed-systems principles.

How It Works

The plan is a 27-week curriculum. It starts with foundational database concepts (storage, indexing, query processing) and moves on to advanced topics: concurrency control, distributed transactions, and AI integration (LLMs for database tuning). Each module combines theory, a practical implementation project (e.g., B+ Tree, Raft KV, SQL Optimizer), and documentation. The lab provides project frameworks, and students are encouraged to contribute to a shared codebase and maintain a todo-list of ongoing improvements.

Quick Start & Requirements

  • Installation: Clone the repository (git clone git@github.com:ehds/training-plan.git).
  • Prerequisites: Proficiency in C++, Java, or Go is recommended. Familiarity with Git, Docker, and general software engineering practices is beneficial.
  • Submission: Fork the repository, complete assignments within designated folders (e.g., Week1-Databse-Introduction/DongShengHe-Week1), commit changes, and submit a Pull Request to the main lab repository.
  • Resources: Official documentation and project frameworks are provided within the repository structure.

Highlighted Details

  • Covers implementation of core database components: Buffer Pool, B+ Tree Index, SQL Execution Engine, Concurrency Control.
  • Includes practical projects on distributed systems: Raft consensus algorithm for a KV store, distributed transactions (Percolator).
  • Integrates modern AI topics: data mining, NLP, computer vision, and in particular LLMs for database parameter tuning.
  • Emphasizes software engineering best practices: code standards, collaborative development, testing, and documentation.
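
To give a feel for the kind of component listed above, here is a minimal sketch of a buffer pool with LRU eviction. It is an illustration only, not the lab's project framework: the class name BufferPool, its methods, and the use of a plain dict as a stand-in for disk are all assumptions made for this example, and the actual assignments may use a different language and a different replacement policy.

```python
from collections import OrderedDict


class BufferPool:
    """Fixed-capacity page cache with LRU eviction (illustrative sketch)."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk             # hypothetical "disk": dict of page_id -> bytes
        self.frames = OrderedDict()  # page_id -> bytearray, least recently used first
        self.dirty = set()           # pages modified since they were loaded
        self.pins = {}               # page_id -> number of active users

    def fetch(self, page_id):
        """Return the in-memory page, loading it from disk on a miss."""
        if page_id in self.frames:
            self.frames.move_to_end(page_id)  # mark as most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_id] = bytearray(self.disk[page_id])
        self.pins[page_id] = self.pins.get(page_id, 0) + 1
        return self.frames[page_id]

    def unpin(self, page_id, is_dirty=False):
        """Release one pin; a page with zero pins may be evicted."""
        self.pins[page_id] -= 1
        if is_dirty:
            self.dirty.add(page_id)

    def _evict(self):
        """Drop the least recently used unpinned page, writing it back if dirty."""
        for victim in self.frames:  # iteration order is oldest first
            if self.pins.get(victim, 0) == 0:
                if victim in self.dirty:
                    self.disk[victim] = bytes(self.frames[victim])
                    self.dirty.discard(victim)
                del self.frames[victim]
                self.pins.pop(victim, None)
                return  # stop iterating right after mutating the dict
        raise RuntimeError("all frames are pinned")
```

A caller fetches a page, modifies it, and unpins it with is_dirty=True; the write-back to disk happens only when the page is later evicted. Pin counts keep pages that are still in use from being chosen as victims, which is why eviction can fail when every frame is pinned.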

Maintenance & Community

The repository is maintained by CDDSCLab at the University of Electronic Science and Technology of China. Further community interaction details (e.g., Discord/Slack) are not specified in the README.

Licensing & Compatibility

The README does not state a license. Whether the material can be used commercially or linked from closed-source projects is therefore unclear until the maintainers clarify the terms.

Limitations & Caveats

The plan is designed for new students in one specific lab, so external users should expect to adapt it. The README does not specify version requirements for programming languages or tools, which may lead to compatibility issues.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
