pointblank by posit-dev

Data validation framework for Python

Created 10 months ago
276 stars

Top 93.9% on SourcePulse

Project Summary

Pointblank is a Python data validation framework that checks data quality through a chainable API and generates interactive reports. It targets data scientists, engineers, and analysts who need to find and fix data issues efficiently.

How It Works

Pointblank uses a composable pipeline approach in which users chain validation steps together. It integrates with multiple data backends (Polars, Pandas, DuckDB, etc.) via the Narwhals and Ibis libraries, so the same data quality checks work across them. The framework generates detailed, interactive HTML reports that highlight data anomalies, and it can be configured to run custom actions when failure thresholds are breached.
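The chaining idea described above can be illustrated with a small, library-free sketch. This is not Pointblank's API: `ChainValidator`, `col_not_null`, `col_gt`, `warn_at`, and `on_warning` are hypothetical names invented for this example. It only shows the general pattern of queuing checks, running them in order, and firing an action when a step's failure fraction reaches a threshold.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Row = dict[str, Any]


@dataclass
class StepResult:
    """Outcome of one validation step."""
    name: str
    n_rows: int
    n_failed: int

    @property
    def fraction_failed(self) -> float:
        return self.n_failed / self.n_rows if self.n_rows else 0.0


class ChainValidator:
    """Hypothetical stand-in for a chainable validation plan (not Pointblank's API)."""

    def __init__(
        self,
        rows: list[Row],
        warn_at: float = 0.1,
        on_warning: Optional[Callable[[StepResult], None]] = None,
    ) -> None:
        self.rows = rows
        self.warn_at = warn_at          # failure fraction that triggers the action
        self.on_warning = on_warning    # custom action for threshold breaches
        self._steps: list[tuple[str, Callable[[Row], bool]]] = []
        self.results: list[StepResult] = []

    def _add(self, name: str, check: Callable[[Row], bool]) -> "ChainValidator":
        self._steps.append((name, check))
        return self  # returning self is what makes the calls chainable

    def col_not_null(self, column: str) -> "ChainValidator":
        return self._add(f"{column} is not null", lambda r: r.get(column) is not None)

    def col_gt(self, column: str, value: float) -> "ChainValidator":
        # Missing values count as failures instead of raising a TypeError.
        return self._add(
            f"{column} > {value}",
            lambda r: r.get(column) is not None and r[column] > value,
        )

    def run(self) -> "ChainValidator":
        """Execute every queued step in order and record per-step results."""
        self.results = []
        for name, check in self._steps:
            failed = sum(1 for row in self.rows if not check(row))
            result = StepResult(name, len(self.rows), failed)
            self.results.append(result)
            if self.on_warning and result.fraction_failed >= self.warn_at:
                self.on_warning(result)
        return self


rows = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": 80.0},
    {"id": 3, "amount": None},
    {"id": 4, "amount": 300.0},
]
alerts: list[str] = []

report = (
    ChainValidator(rows, warn_at=0.5, on_warning=lambda r: alerts.append(r.name))
    .col_not_null("amount")
    .col_gt("amount", 100)
    .run()
)

for r in report.results:
    print(f"{r.name}: {r.n_failed}/{r.n_rows} failed")
```

Steps are queued and only executed when `run()` is called, so a plan can be built once and inspected before any data is touched. A real framework would add backend adapters and report rendering on top of this pattern.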

Quick Start & Requirements

Highlighted Details

  • Supports a wide range of data sources including Polars, Pandas, DuckDB, PostgreSQL, MySQL, SQLite, Parquet, PySpark, and Snowflake.
  • Offers YAML configuration for portable and version-controlled validation workflows.
  • Includes a Command Line Interface (CLI) for direct execution of validation tasks and data exploration.
  • Reports can be internationalized into over 20 languages.

Maintenance & Community

Licensing & Compatibility

  • Licensed under the MIT license.
  • Permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The project is actively developed with a roadmap including LLM-powered suggestions and expanded backend support; users should be aware of ongoing feature additions and potential API changes.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 13
  • Issues (30d): 5
  • Star history: 14 stars in the last 30 days
