boring-semantic-layer  by boringdata

Python semantic layer for structured data and LLMs

Created 4 months ago
266 stars

Top 96.2% on SourcePulse

Summary

Boring Semantic Layer (BSL) is a lightweight, Ibis-powered semantic layer that sits between structured data sources and Large Language Models (LLMs). Users define data models in Python or YAML, and BSL handles SQL generation, so LLMs can query data through named dimensions and measures instead of writing raw SQL. It targets developers and data scientists who want a flexible, database-agnostic way to expose data for analytics and AI-driven insights.

How It Works

BSL builds on the Ibis library, so it can connect to any database engine Ibis supports (e.g., DuckDB, Snowflake, BigQuery, PostgreSQL). Users define a SemanticModel by specifying tables, dimensions (attributes for filtering and grouping), and measures (aggregations). Each dimension and measure is written as a Python function that returns an Ibis expression, which Ibis compiles into SQL for the underlying database. The result is a single Python API across backends. Models can also carry human-readable descriptions that are exposed to LLMs through the Model Context Protocol (MCP), and query results can be charted directly.
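The idea of resolving named dimensions and measures into a query can be sketched with the standard library alone. This is a toy illustration, not BSL's API: the class ToySemanticModel and its to_sql method are hypothetical, SQL strings stand in for Ibis expressions, and sqlite3 stands in for an Ibis backend.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ToySemanticModel:
    """Hypothetical stand-in for a semantic model; not BSL's actual class."""
    table: str
    dimensions: dict[str, str]  # name -> SQL expression used for grouping
    measures: dict[str, str]    # name -> SQL aggregate expression

    def to_sql(self, dimensions: list[str], measures: list[str]) -> str:
        # Look up each requested name; an unknown name raises KeyError.
        dim_cols = [f"{self.dimensions[d]} AS {d}" for d in dimensions]
        agg_cols = [f"{self.measures[m]} AS {m}" for m in measures]
        sql = f"SELECT {', '.join(dim_cols + agg_cols)} FROM {self.table}"
        if dimensions:
            group_by = ", ".join(self.dimensions[d] for d in dimensions)
            sql += f" GROUP BY {group_by} ORDER BY {group_by}"
        return sql


# Small in-memory table to query (made-up sample rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (origin TEXT, distance INTEGER)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("JFK", 100), ("JFK", 300), ("LAX", 500)],
)

flights = ToySemanticModel(
    table="flights",
    dimensions={"origin": "origin"},
    measures={"flight_count": "COUNT(*)", "avg_distance": "AVG(distance)"},
)

# The caller names what it wants; the model produces the SQL.
sql = flights.to_sql(dimensions=["origin"], measures=["flight_count", "avg_distance"])
rows = conn.execute(sql).fetchall()
print(rows)
```

The caller never writes SQL, only the names of dimensions and measures, which is the property that makes a semantic layer a convenient interface for an LLM.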

Quick Start & Requirements

  • Installation: Basic installation via pip install boring-semantic-layer. Optional extras are available for examples ([examples]), MCP integration ([mcp]), and visualization ([viz-altair], [viz-plotly]).
  • Prerequisites: Requires an Ibis-compatible database. Examples utilize DuckDB. MCP integration is geared towards LLMs like Claude. Visualization requires Altair or Plotly.
  • Setup: Sample data (Parquet files) can be downloaded using curl.
  • Links: Example usage and code are provided in the README and, by implication, in the repository.
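After installing, it can be handy to check which optional pieces are present in the current environment. The sketch below uses only the standard library. The import names in OPTIONAL_MODULES are assumptions (the page lists the pip extras, not the module names they install), and installed_features is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed top-level import names for the core package and its optional extras.
OPTIONAL_MODULES = {
    "core": "boring_semantic_layer",
    "mcp": "mcp",
    "viz-altair": "altair",
    "viz-plotly": "plotly",
}


def installed_features() -> dict[str, bool]:
    """Report which of the assumed modules can be found, without importing them."""
    return {
        feature: find_spec(module) is not None
        for feature, module in OPTIONAL_MODULES.items()
    }


result = installed_features()
for feature, present in result.items():
    print(f"{feature}: {'installed' if present else 'missing'}")
```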

Highlighted Details

  • Ibis-Powered: Offers broad database compatibility by leveraging Ibis's extensive backend support.
  • LLM Integration (MCP): Built-in support for the Model Context Protocol (MCP) exposes semantic models to LLM clients such as Claude.
  • Rich Metadata: Allows defining descriptive metadata for models, dimensions, and measures, improving model documentation and AI interpretability.
  • Flexible Definition: Semantic models can be defined programmatically using Python or declaratively via YAML configuration files.
  • Integrated Visualization: Provides direct charting capabilities from query results using Altair (default) or Plotly backends, supporting auto-detection and custom specifications.
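To make the declarative-definition and metadata points concrete, here is a minimal sketch of a spec that carries descriptions and is rendered as plain text an LLM could read. The field names (table, description, expr) are illustrative and are not BSL's YAML schema. JSON is used in place of YAML only because the Python standard library has no YAML parser, and describe_models is a hypothetical helper.

```python
import json

# Hypothetical declarative spec; the field names are illustrative only.
SPEC = """
{
  "flights": {
    "table": "flights_tbl",
    "description": "One row per scheduled flight",
    "dimensions": {
      "origin": {"expr": "origin", "description": "Departure airport code"}
    },
    "measures": {
      "flight_count": {"expr": "COUNT(*)", "description": "Number of flights"}
    }
  }
}
"""


def describe_models(spec_text: str) -> str:
    """Render model metadata as plain text that could be handed to an LLM."""
    lines = []
    for name, model in json.loads(spec_text).items():
        lines.append(f"model {name}: {model['description']}")
        for kind in ("dimensions", "measures"):
            for field_name, field in model.get(kind, {}).items():
                # kind[:-1] turns "dimensions" into "dimension", "measures" into "measure".
                lines.append(f"  {kind[:-1]} {field_name}: {field['description']}")
    return "\n".join(lines)


summary = describe_models(SPEC)
print(summary)
```

Keeping descriptions next to each definition means the same file documents the model for people and supplies context to an LLM.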

Maintenance & Community

BSL is a collaborative project between xorq-labs and boringdata, and it welcomes feedback and contributions. The README does not list community channels (e.g., Discord, Slack) or major contributors.

Licensing & Compatibility

The project's license is not explicitly stated in the README. This omission represents a significant caveat for adoption, particularly for commercial use or integration into proprietary systems.

Limitations & Caveats

The missing license noted above is the main barrier to adoption. MCP integration is a key feature, but the primary configuration example targets Claude Desktop, which may indicate narrower immediate LLM compatibility. With the Altair backend, exporting charts as PNG or SVG requires additional dependencies.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 3
  • Issues (30d): 7
  • Star History: 27 stars in the last 30 days
