cloudbreak  by hortonworks

Platform for data analytics and AI deployment on cloud services

created 11 years ago
359 stars

Top 79.1% on sourcepulse

GitHubView on GitHub
Project Summary

Cloudbreak is an integrated analytics and data management platform designed for cloud environments, offering broad data analytics, AI capabilities, and robust data governance. It targets engineers and power users needing to deploy and manage complex data workloads on cloud infrastructure. The platform aims to simplify the deployment and management of distributed data processing systems.

How It Works

Cloudbreak operates as a microservices-based platform, orchestrating the deployment and management of various data services like Hadoop, Spark, and others. It leverages a Cloudbreak Deployer (CBD) tool for local development and setup, which manages Docker containers and configurations. The architecture supports multiple cloud providers and includes services for environment management, data lake provisioning, autoscaling (Periscope), and database management (Redbeams).

Quick Start & Requirements

  • Installation: Local development setup is primarily guided using the cloudbreak-deployer tool.
  • Prerequisites: macOS, Homebrew, Java 21, Docker Desktop for Mac (with at least 6 CPU and 12 GB RAM allocated).
  • Setup: Detailed instructions are provided for configuring environment variables, database migrations, and running individual services via IntelliJ IDEA or command line.
  • Documentation: Official documentation is available at https://docs.cloudera.com/management-console/cloud/index.html.

Highlighted Details

  • Comprehensive local development setup guide using cloudbreak-deployer and IntelliJ IDEA.
  • Support for multiple services including Cloudbreak, Periscope, Datalake, FreeIPA, Redbeams, and Environment.
  • Detailed instructions for running services via command line using Gradle.
  • Database migration management via MyBatis Migrations and the cbd tool.

Maintenance & Community

  • The project is hosted on GitHub, with contribution guidelines provided.
  • Internal Cloudera resources and Slack channels (#eng_cb_dev_internal) are mentioned for support and details.

Licensing & Compatibility

  • The README does not explicitly state the license. Internal Cloudera repositories are referenced, suggesting potential internal or proprietary licensing.

Limitations & Caveats

  • Local development setup is primarily focused on macOS. Linux users may need to manually configure public IP addresses.
  • Requires specific Java versions (21) and significant Docker resource allocation.
  • Internal repository access for build artifacts requires credentials obtained via internal Slack channels.
  • SSL handshake issues may require manual certificate import for certain services.
Health Check
Last commit

2 days ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
1 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.