awesome-bigdata by oxnr

Curated list of big data frameworks, resources, and tools

Created 11 years ago
13,876 stars

Top 3.6% on SourcePulse

Project Summary

This repository is a curated list of "awesome" resources for big data technologies, covering frameworks, databases, processing engines, and related tools. It serves as a comprehensive reference for engineers, researchers, and practitioners looking to explore or implement solutions within the big data ecosystem.

How It Works

The list is organized into logical categories, such as RDBMS, Frameworks, Distributed Filesystems, Data Models (Key-Map, Key-Value, Graph, Columnar), NewSQL, Time-Series Databases, SQL-like processing, Data Ingestion, Service Programming, Scheduling, Machine Learning, Benchmarking, Security, System Deployment, and Applications. Each entry provides a brief description of the technology.

Quick Start & Requirements

This is a curated list, not a software project. No installation or execution is required.

Highlighted Details

  • Extensive categorization of technologies within the big data landscape.
  • Includes links to papers, books, videos, and other "awesome" lists for deeper learning.
  • Covers a wide range of data models and processing paradigms, from traditional RDBMS to modern stream processing and graph databases.
  • Features sections on machine learning, benchmarking, security, and system deployment relevant to big data.

Maintenance & Community

The list is community-driven, and contributions are welcome. The README does not name specific maintainers or provide community links.

Licensing & Compatibility

The repository's license is not explicitly stated. It is likely permissive (e.g., MIT, CC0), as is common for "awesome" lists. The listed technologies each carry their own licenses.

Limitations & Caveats

As a curated list, its coverage and currency depend on community contributions. Some entries may be outdated or superseded by newer technologies.

Health Check

Last Commit: 7 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

81 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Alexander Wettig (Coauthor of SWE-bench, SWE-agent), and 5 more.

data-juicer by modelscope

0.7%
5k
Data-Juicer: Data processing system for foundation models
Created 2 years ago
Updated 1 day ago
Starred by Mike Krieger (CPO at Anthropic; Cofounder of Instagram), Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), and 25 more.

redis by redis

0.1%
71k
Redis is a versatile data structure server, cache, and query engine
Created 16 years ago
Updated 3 days ago