AI code-writing assistant for Pandas dataframes
Sketch is an AI-powered code-writing assistant designed for pandas users, aiming to enhance data analysis workflows by understanding data context for more relevant suggestions. It targets data analysts and engineers seeking to streamline tasks like data cataloging, cleaning, analysis, and visualization without IDE plugin installations.
How It Works
Sketch leverages data sketches, a technique for summarizing large datasets efficiently, to provide context to large language models (LLMs). It summarizes dataframe columns and feeds these statistics into prompts for code generation. The project plans to integrate these sketches directly into custom "data + language" foundation models for improved accuracy.
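The summarize-then-prompt flow described above can be illustrated with a minimal sketch. This is not Sketch's implementation: the function names `summarize_column` and `build_prompt` are hypothetical, the prompt wording is invented, and exact counts stand in for the approximate summaries a real data sketch would compute. Columns are modeled as plain Python lists so the example runs without any third-party packages.

```python
from collections import Counter
from typing import Any


def summarize_column(values: list[Any], top_k: int = 3) -> dict[str, Any]:
    """Compute a small, fixed-size summary of one column (hypothetical helper)."""
    present = [v for v in values if v is not None]
    summary: dict[str, Any] = {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "most_common": [v for v, _ in Counter(present).most_common(top_k)],
    }
    # Add a range only when every non-missing value is numeric.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(present):
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
    return summary


def build_prompt(columns: dict[str, list[Any]], request: str) -> str:
    """Embed per-column summaries in a code-generation prompt (invented wording)."""
    lines = ["The dataframe `df` has these columns:"]
    for name, values in columns.items():
        lines.append(f"- {name}: {summarize_column(values)}")
    lines.append(f"Write pandas code to: {request}")
    return "\n".join(lines)


# Toy data standing in for a dataframe's columns.
columns = {
    "city": ["Oslo", "Lima", "Oslo", None],
    "temp_c": [4.5, 19.0, 6.0, 21.5],
}
prompt = build_prompt(columns, "plot average temp_c per city")
print(prompt)
```

The point of the design is that the prompt grows with the number of columns, not the number of rows, so the model sees column names, value ranges, and typical values without the full dataset being sent.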
Quick Start & Requirements
pip install sketch
For apply functionality, an OpenAI API key is needed (OPENAI_API_KEY environment variable).
Local model configuration uses LAMBDAPROMPT_BACKEND, SKETCH_USE_REMOTE_LAMBDAPROMPT='False', and HF_ACCESS_TOKEN.
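As a rough illustration of the environment variables listed above, the following sketch sets them from Python. The helper name `configure_local_backend` is hypothetical, and the backend name and token are placeholders, not values taken from the project. Only the variable names and the 'False' setting come from the text above. Running this before importing sketch is the safer ordering, on the assumption that the library reads these variables at import time.

```python
import os


def configure_local_backend(backend: str, hf_token: str) -> dict[str, str]:
    """Set the three environment variables named above (hypothetical helper)."""
    settings = {
        # Value taken from the text above.
        "SKETCH_USE_REMOTE_LAMBDAPROMPT": "False",
        # Placeholder values supplied by the caller; not from the project docs.
        "LAMBDAPROMPT_BACKEND": backend,
        "HF_ACCESS_TOKEN": hf_token,
    }
    os.environ.update(settings)
    return settings


# Both arguments are placeholders; substitute a real backend name and token.
settings = configure_local_backend(
    backend="your-backend-name",
    hf_token="your-hugging-face-token",
)
print(sorted(settings))
```

For the OpenAI route, the same pattern would apply to OPENAI_API_KEY, although exporting secrets in the shell or a notebook's secret store is usually preferable to hard-coding them in source.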
Highlighted Details
Accessed through the .sketch extension on pandas dataframes.
ask for data understanding and howto for code generation.
apply enables data generation and feature engineering, built on LambdaPrompt.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is in active development. The apply feature requires an OpenAI API key for full functionality unless local models are configured. The README implies future enhancements to integrate data sketches more deeply with foundation models.