BigQuery-powered Python DataFrames and ML
Top 97.5% on SourcePulse
BigQuery DataFrames (BigFrames) offers a Pythonic DataFrame and ML API leveraging the BigQuery engine, targeting data scientists and analysts who want to use familiar pandas and scikit-learn interfaces for large-scale data processing and machine learning directly within BigQuery. This allows for seamless migration of pandas workloads and efficient execution of ML tasks without moving data out of BigQuery.
How It Works
BigFrames translates DataFrame operations into BigQuery SQL queries, executing them directly on the BigQuery backend. This approach avoids costly data transfers and leverages BigQuery's distributed processing capabilities for performance. For ML, it integrates with BigQuery ML and supports remote functions for custom model execution, abstracting away the complexities of distributed training and inference.
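The idea of turning chained DataFrame-style calls into a single SQL statement can be illustrated with a toy sketch. This is not BigFrames' actual implementation: the LazyTable class, its methods, and the table name are hypothetical and exist only to show how operations might be recorded first and compiled to SQL later.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical deferred table: records operations, executes nothing."""

    table: str
    columns: Tuple[str, ...] = ("*",)
    filters: Tuple[str, ...] = ()
    row_limit: Optional[int] = None

    def select(self, *cols: str) -> "LazyTable":
        # Return a new object with the projection recorded.
        return replace(self, columns=cols)

    def where(self, condition: str) -> "LazyTable":
        # Accumulate filter conditions; they are AND-ed together in to_sql().
        return replace(self, filters=self.filters + (condition,))

    def head(self, n: int) -> "LazyTable":
        # Record a row limit instead of fetching rows.
        return replace(self, row_limit=n)

    def to_sql(self) -> str:
        # Compile the recorded operations into one SQL string.
        sql = f"SELECT {', '.join(self.columns)} FROM `{self.table}`"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.filters)
        if self.row_limit is not None:
            sql += f" LIMIT {self.row_limit}"
        return sql


# The table name below is a placeholder, not a real dataset.
query = (
    LazyTable("my_project.my_dataset.penguins")
    .select("species", "body_mass_g")
    .where("body_mass_g > 4000")
    .head(10)
    .to_sql()
)
print(query)
```

Because nothing runs until to_sql() is called, the whole chain becomes one query that the warehouse can plan and execute where the data lives, which is the property the paragraph above describes.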
Quick Start & Requirements
pip install --upgrade bigframes
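After installation, a first session might look like the sketch below. It assumes a Google Cloud project with the BigQuery API enabled and working application credentials; the preview_table helper and the project ID are hypothetical, and the public penguins table is used only as an example input.

```python
def preview_table(table_id: str, project_id: str, n: int = 5):
    """Load a BigQuery table as a BigFrames DataFrame and return the first n rows.

    preview_table is a hypothetical helper for illustration. The import is kept
    inside the function so that defining it does not require credentials.
    """
    import bigframes.pandas as bpd

    # Bill queries to this project (assumed to exist and be accessible).
    bpd.options.bigquery.project = project_id

    # read_gbq accepts a table ID or a SQL query; the data stays in BigQuery.
    df = bpd.read_gbq(table_id)

    # Only the small preview is downloaded as a local pandas DataFrame.
    return df.head(n).to_pandas()


# Example call (requires credentials; "my-gcp-project" is a placeholder):
# preview_table("bigquery-public-data.ml_datasets.penguins", "my-gcp-project")
```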
Highlighted Details
Pandas-compatible API (bigframes.pandas) for data manipulation.
Scikit-learn-style API (bigframes.ml) for machine learning tasks.
Maintenance & Community
Feedback: bigframes-feedback@google.com.
Licensing & Compatibility
Limitations & Caveats
Version 2.0 enforces stricter defaults for allow_large_results (now False) and for remote function security, so results exceeding 10 GB and specific service account usage require explicit configuration. Users migrating from pre-2.0 versions must adapt to these changes or pin to an older version.
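Opting in to large results could look like the sketch below. It assumes the 2.x API exposes an allow_large_results argument on to_pandas and a matching bpd.options.compute.allow_large_results setting; verify both against the release you have installed. The fetch_large_result helper is hypothetical.

```python
def fetch_large_result(query: str):
    """Run a query and download the result, explicitly allowing more than 10 GB.

    fetch_large_result is a hypothetical helper; the option names below are
    assumptions about the bigframes 2.x API and should be checked before use.
    """
    import bigframes.pandas as bpd

    df = bpd.read_gbq(query)

    # Per-call opt-in (assumed signature). A session-wide alternative would be:
    #   bpd.options.compute.allow_large_results = True
    return df.to_pandas(allow_large_results=True)
```

Downloading results of this size into local memory is rarely what you want; keeping the data in BigQuery and fetching only aggregates or samples avoids the limit altogether.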