databricks-industry-solutions: Accelerating medical image processing and AI analysis in the Lakehouse
This project provides a Databricks Lakehouse solution accelerator for large-scale processing of DICOM medical images and related documents. It targets engineers and researchers who need to ingest, index, and analyze DICOM metadata, and to perform AI-driven image segmentation and interactive labeling. It offers streamlined workflows and advanced analytics within a secure environment.
How It Works
The solution ingests DICOM files from cloud storage (ADLS, S3, GCS) via Unity Catalog Volumes, extracts and indexes metadata into Databricks tables, and applies PHI redaction. It integrates the OHIF Viewer for interactive segmentation and labeling, powered by NVIDIA's MONAI for AI-driven segmentation and custom model training. Data can be processed in batch, incremental, or streaming modes, with results accessible via SQL, BI dashboards, and real-time inference endpoints.
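The ingest-and-index step can be illustrated with a minimal sketch. This is not the project's implementation: the function names `is_dicom_file` and `catalog_dicom_files` and the row fields are hypothetical, and the sketch only checks the standard DICOM Part 10 file signature (a 128-byte preamble followed by the bytes `DICM`) rather than parsing any metadata.

```python
import tempfile
from pathlib import Path

DICOM_PREAMBLE_LEN = 128   # DICOM Part 10 files start with a 128-byte preamble
DICOM_MAGIC = b"DICM"      # ...followed by these four bytes


def is_dicom_file(path):
    """Return True if the file carries the DICOM Part 10 'DICM' marker."""
    with open(path, "rb") as fh:
        header = fh.read(DICOM_PREAMBLE_LEN + len(DICOM_MAGIC))
    # A file shorter than 132 bytes yields a short slice and compares unequal.
    return header[DICOM_PREAMBLE_LEN:] == DICOM_MAGIC


def catalog_dicom_files(root):
    """Walk `root` and return one index row (a dict) per DICOM file found.

    The row layout is a hypothetical illustration, not the project's schema.
    """
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and is_dicom_file(path):
            rows.append({
                "path": str(path),
                "size_bytes": path.stat().st_size,
                "extension": path.suffix.lower(),
            })
    return rows


if __name__ == "__main__":
    # Demo with synthetic files: one fake DICOM header and one plain text file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "scan.dcm").write_bytes(b"\x00" * 128 + b"DICM" + b"\x00" * 16)
        (Path(tmp) / "notes.txt").write_text("not an image")
        for row in catalog_dicom_files(tmp):
            print(row)
```

In a Databricks workspace, `root` would presumably be a Unity Catalog Volume path, and the resulting rows could be loaded into a table for SQL access. Both are assumptions about how such a sketch might be wired up, not a description of the accelerator's code.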
Quick Start & Requirements
To set up, clone the repository into a Databricks workspace. Attach a notebook to Serverless Compute or to a cluster running DBR 14.3 LTS or later, and run config/setup.py to install the pixels package. Then run the RUNME notebook as a Databricks job. GPU-enabled compute is recommended for best performance. Official quick-start examples are available in the repository's notebooks.
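Before running setup on a classic cluster, it can help to confirm the runtime meets the DBR 14.3 minimum. The following is a minimal sketch with hypothetical helper names; it assumes the runtime version is available as a string beginning with `major.minor` (for example via the `DATABRICKS_RUNTIME_VERSION` environment variable, which is an assumption about the environment). Serverless Compute may report a differently shaped value, in which case the check returns `None` rather than guessing.

```python
import os
import re

MIN_DBR = (14, 3)  # minimum Databricks Runtime named in the quick start


def parse_runtime_version(raw):
    """Extract (major, minor) from a string such as '14.3' or '15.4.x-scala2.12'.

    Returns None when the string does not begin with 'major.minor'.
    """
    match = re.match(r"(\d+)\.(\d+)", raw or "")
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum_runtime(raw, minimum=MIN_DBR):
    """Return True/False if the version parses, or None if it is unknown."""
    version = parse_runtime_version(raw)
    if version is None:
        return None
    return version >= minimum


if __name__ == "__main__":
    # Assumption: this variable is set on the cluster; it is absent elsewhere.
    raw = os.environ.get("DATABRICKS_RUNTIME_VERSION")
    print("runtime string:", raw, "meets minimum:", meets_minimum_runtime(raw))
```

Returning `None` for unparseable input keeps "too old" distinct from "could not tell", which matters when the notebook is attached to Serverless Compute.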
Highlighted Details
Maintenance & Community
The project is developed by Databricks, with listed contributors from Databricks and Prominence Advisors. No explicit community channels (e.g., Discord, Slack) or public roadmaps are detailed in the provided README.
Licensing & Compatibility
The core dbx.pixels library is provided under a Databricks license. It integrates several third-party libraries with permissive licenses (MIT, Apache-2.0, BSD). While third-party components are broadly compatible, the primary Databricks license terms should be reviewed for specific commercial use or closed-source integration requirements.
Limitations & Caveats
The solution is designed for, and requires, a Databricks workspace environment. NIfTI file format ingestion and robust pixel-level PHI redaction are listed as future roadmap items. Users are responsible for associated Databricks compute costs.