databricks-sdk-py  by databricks

Python SDK for Databricks Lakehouse development

created 3 years ago
463 stars

Top 66.4% on sourcepulse

Project Summary

This Python SDK provides a comprehensive interface for interacting with the Databricks Lakehouse platform, targeting developers and data engineers who need to automate Databricks workflows and manage resources programmatically. It aims to simplify Databricks operations by abstracting the underlying REST APIs.

How It Works

The SDK wraps the Databricks REST APIs behind an internal HTTP client that retries failed requests. It exposes a WorkspaceClient for managing workspace resources and an AccountClient for account-level configuration. Authentication supports Databricks native tokens, Azure AD, and GCP credentials, and configuration is resolved through a prioritized lookup order. Long-running operations are managed via a Waiter interface, and paginated API responses are exposed as Python iterators.
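The pagination behavior described above can be illustrated with a small generator. This is a minimal sketch of the general technique, not the SDK's own implementation: the page shape (`items`, `next_page_token`) and the helpers `iterate_pages` and `make_fake_api` are hypothetical names chosen for the example, and the "API" is an in-memory stand-in.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical page shape used only for this sketch:
# {"items": [...], "next_page_token": "<token>" or None}
Page = Dict[str, Any]


def iterate_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield items one by one, fetching the next page only when needed."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        for item in page.get("items", []):
            yield item
        token = page.get("next_page_token")
        if not token:
            break  # no further pages


def make_fake_api(items: List[Any], page_size: int) -> Callable[[Optional[str]], Page]:
    """Build an in-memory stand-in for a paginated REST endpoint."""
    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token else 0
        end = start + page_size
        next_token = str(end) if end < len(items) else None
        return {"items": items[start:end], "next_page_token": next_token}
    return fetch_page


# The caller sees a plain iterator and never handles page tokens directly.
cluster_names = list(iterate_pages(make_fake_api(["a", "b", "c", "d", "e"], page_size=2)))
```

Because the generator is lazy, a caller that stops iterating early never triggers the remaining page requests.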

Quick Start & Requirements

  • Install via pip: pip install databricks-sdk
  • Compatible with Python 3.7+ (3.8-3.11 recommended).
  • Databricks Runtime 13.1+ includes a bundled version.
  • Authentication requires Databricks host and token, or cloud-specific credentials.
  • Official examples are available in the GitHub repository.
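
As a rough illustration of the host-and-token requirement, the sketch below resolves credentials from explicit arguments first and then from the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, and passes them to `WorkspaceClient`. The helpers `resolve_credentials` and `list_cluster_names` are hypothetical names for this example. The SDK performs its own, more elaborate configuration lookup, so the two-step order here is a simplification, not the SDK's actual algorithm. Running `list_cluster_names` assumes the package is installed and the credentials are valid.

```python
import os
from typing import List, Mapping, Optional, Tuple


def resolve_credentials(
    host: Optional[str] = None,
    token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Prefer explicit arguments, then fall back to environment variables."""
    env = os.environ if env is None else env
    resolved_host = host or env.get("DATABRICKS_HOST")
    resolved_token = token or env.get("DATABRICKS_TOKEN")
    if not resolved_host or not resolved_token:
        raise ValueError("Databricks host and token are both required")
    return resolved_host, resolved_token


def list_cluster_names(host: Optional[str] = None, token: Optional[str] = None) -> List[str]:
    """List cluster names in a workspace (needs databricks-sdk and real credentials)."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from databricks.sdk import WorkspaceClient

    resolved_host, resolved_token = resolve_credentials(host, token)
    w = WorkspaceClient(host=resolved_host, token=resolved_token)
    # clusters.list() returns an iterator over cluster descriptions.
    return [c.cluster_name for c in w.clusters.list()]
```

In practice, calling `WorkspaceClient()` with no arguments lets the SDK discover configuration on its own. The explicit helper only makes the precedence visible.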

Highlighted Details

  • Supports Databricks native, Azure AD, and GCP authentication methods.
  • Provides a Waiter interface for managing long-running operations like cluster creation and job execution.
  • Abstracts API pagination into Python iterators for simplified data retrieval.
  • Includes OAuth Authorization Code flow with PKCE for secure web application integration.
  • Offers dbutils functionality (e.g., dbutils.fs, dbutils.secrets) implemented natively in Python.
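
The idea behind waiting on a long-running operation, such as cluster creation, can be sketched as a polling loop with a timeout. This is an illustrative pattern under stated assumptions, not the SDK's Waiter implementation: `wait_until`, its parameters, and the state strings are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(
    get_state: Callable[[], T],
    is_done: Callable[[T], bool],
    timeout_s: float = 1200.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Poll get_state() until is_done(state) is true or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if is_done(state):
            return state
        if clock() >= deadline:
            raise TimeoutError(f"operation still in state {state!r} after {timeout_s}s")
        sleep(poll_interval_s)


# Simulated operation that reaches its terminal state on the third poll.
states = iter(["PENDING", "PENDING", "RUNNING"])
final_state = wait_until(
    get_state=lambda: next(states),
    is_done=lambda s: s == "RUNNING",
    sleep=lambda _seconds: None,  # skip real sleeping in the demo
)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays. A production waiter would typically also treat failure states as terminal and raise on them.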

Maintenance & Community

  • The SDK is in Beta but supported for production use; interface changes are expected.
  • Feedback is encouraged via GitHub issues.
  • Links to SDKs for Java/Go and cloud-specific documentation are provided.

Licensing & Compatibility

  • The SDK is released under the Apache License 2.0.
  • Compatible with commercial use and closed-source applications.

Limitations & Caveats

  • Because the SDK is in Beta, interface stability is not guaranteed and backward-incompatible changes are possible.
  • Azure SSO with OAuth for local scripts is in early experimental stages.
  • GCP OAuth is not supported at the moment.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 12
  • Issues (30d): 12
  • Star history: 40 stars in the last 90 days
