databricks-sdk-py  by databricks

Python SDK for Databricks Lakehouse development

Created 3 years ago
478 stars

Top 64.0% on SourcePulse

Project Summary

This Python SDK provides a comprehensive interface for interacting with the Databricks Lakehouse platform, targeting developers and data engineers who need to automate Databricks workflows and manage resources programmatically. It aims to simplify Databricks operations by abstracting the underlying REST APIs.

How It Works

The SDK is built on an internal HTTP client that automatically retries failed requests. It exposes two entry points: WorkspaceClient for workspace-level resources and AccountClient for account-level configuration. Authentication supports Databricks native tokens, Azure AD, and GCP credentials, and configuration is resolved through a prioritized lookup order. Long-running operations are managed through a Waiter interface, and paginated API responses are exposed as Python iterators.
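To make the retry behavior concrete, here is a minimal sketch of retrying with exponential backoff using only the standard library. It is not the SDK's internal client: the names TransientError and call_with_retries, the attempt count, and the delay values are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. a rate limit)."""


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Small random jitter so parallel callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise RuntimeError("max_attempts must be at least 1")
```

The sleep function is injectable so the helper can be exercised in tests without real waiting.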

Quick Start & Requirements

  • Install via pip: pip install databricks-sdk
  • Compatible with Python 3.7+ (3.8-3.11 recommended).
  • Databricks Runtime 13.1+ includes a bundled version.
  • Authentication requires a Databricks host and token, or cloud-specific credentials.
  • Official examples are available in the GitHub repository.
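As a rough illustration of the steps above, the sketch below lists cluster names through a WorkspaceClient. It assumes the package is installed and that credentials are supplied through the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables; the helper names cluster_names and main are hypothetical and not part of the SDK.

```python
from typing import Any, List


def cluster_names(client: Any) -> List[str]:
    """Collect cluster names from any client exposing clusters.list().

    Hypothetical helper: it only relies on the iterator returned by
    clusters.list() and the cluster_name attribute of each item.
    """
    return [cluster.cluster_name for cluster in client.clusters.list()]


def main() -> None:
    # Imported here so the helper above stays usable without the SDK installed.
    from databricks.sdk import WorkspaceClient

    # With no arguments, the client resolves credentials from its
    # configuration lookup (assumed here: DATABRICKS_HOST / DATABRICKS_TOKEN).
    workspace = WorkspaceClient()
    for name in cluster_names(workspace):
        print(name)
```

Calling main() against a real workspace prints one cluster name per line; the official examples in the repository remain the authoritative reference.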

Highlighted Details

  • Supports Databricks native, Azure AD, and GCP authentication methods.
  • Provides a Waiter interface for managing long-running operations like cluster creation and job execution.
  • Abstracts API pagination into Python iterators for simplified data retrieval.
  • Includes OAuth Authorization Code flow with PKCE for secure web application integration.
  • Offers dbutils functionality (e.g., dbutils.fs, dbutils.secrets) implemented natively in Python.
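The pagination and waiting patterns in the list above can be sketched generically with the standard library. This is an illustration of the ideas, not the SDK's code: the function names iterate_pages and wait_until, the "items" and "next_page_token" response keys, and the state strings used in the usage note are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional, Sequence


def iterate_pages(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
) -> Iterator[Any]:
    """Yield items across pages, following a next-page token until it is absent."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("items", [])
        token = page.get("next_page_token")
        if not token:
            return


def wait_until(
    get_state: Callable[[], str],
    target: str,
    failed: Sequence[str] = (),
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state until it returns target, a failed state, or the timeout passes."""
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state == target:
            return state
        if state in failed:
            raise RuntimeError(f"operation ended in state {state!r}")
        if clock() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout} seconds")
        sleep(interval)
```

A caller would pass a function that fetches one page of an API response to iterate_pages, and a function that reports the current status of a cluster or job to wait_until, for example with target "RUNNING" and failed states such as "ERROR".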

Maintenance & Community

  • The SDK is in Beta but supported for production use, though its interfaces are expected to change.
  • Feedback is encouraged via GitHub issues.
  • The repository links to the Java and Go SDKs and to cloud-specific documentation.

Licensing & Compatibility

  • The SDK is released under the Apache License 2.0.
  • Compatible with commercial use and closed-source applications.

Limitations & Caveats

  • The SDK is in Beta, and interface stability is not guaranteed, with potential for backward-incompatible changes.
  • Azure SSO with OAuth for local scripts is in early experimental stages.
  • GCP OAuth is not supported at the moment.

Health Check

  • Last commit: 12 hours ago
  • Responsiveness: 1 week
  • Pull requests (30d): 13
  • Issues (30d): 9
  • Star history: 9 stars in the last 30 days