introspect by defog-ai

Service for deep research on internal data

Created 2 years ago

351 stars

Top 79.6% on SourcePulse


1 Expert Loves This Project

Travis Fischer

Founder of Agentic

Project Summary

Defog Introspect is an AI-powered research service designed for structured and unstructured data analysis, enabling users to query databases, CSVs, Excel files, and PDFs, augmented by web search. It targets data analysts and researchers seeking to derive insights from diverse data sources through natural language interaction.

How It Works

An LLM-driven agent with tool-use capabilities orchestrates queries across three primary tools: text_to_sql for structured data, web_search for external context, and pdf_with_citations for document analysis. The agent calls these tools recursively until it has gathered enough information to answer the user's question. The default models are o4-mini for text-to-SQL, gemini-2.0-flash for web search, and claude-3-7-sonnet for both PDF analysis and overall orchestration.
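The tool-use loop described above can be sketched roughly as follows. This is a minimal illustration, not Introspect's actual code: the three tool functions are stubs, choose_next_tool is a hypothetical stand-in for the orchestrating LLM's decision, the DEFAULT_MODELS layout is assumed, and the loop is written iteratively with a step cap rather than recursively.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Default models named in the summary; this dictionary layout is hypothetical.
DEFAULT_MODELS: Dict[str, str] = {
    "text_to_sql": "o4-mini",
    "web_search": "gemini-2.0-flash",
    "pdf_with_citations": "claude-3-7-sonnet",
    "orchestrator": "claude-3-7-sonnet",
}


# Stub tools: each returns a placeholder string instead of calling a model.
def text_to_sql(question: str) -> str:
    return f"[structured-data result for: {question}]"


def web_search(question: str) -> str:
    return f"[web context for: {question}]"


def pdf_with_citations(question: str) -> str:
    return f"[cited PDF passages for: {question}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "text_to_sql": text_to_sql,
    "web_search": web_search,
    "pdf_with_citations": pdf_with_citations,
}


@dataclass
class AgentState:
    question: str
    # (tool name, tool output) pairs gathered so far
    findings: List[Tuple[str, str]] = field(default_factory=list)


def choose_next_tool(state: AgentState) -> Optional[str]:
    """Hypothetical stand-in for the LLM: pick the first unused tool, or None when done."""
    used = {name for name, _ in state.findings}
    for name in TOOLS:
        if name not in used:
            return name
    return None


def run_agent(question: str, max_steps: int = 10) -> AgentState:
    """Call tools until the chooser reports enough information or the step cap is hit."""
    state = AgentState(question)
    for _ in range(max_steps):
        tool = choose_next_tool(state)
        if tool is None:
            break
        state.findings.append((tool, TOOLS[tool](question)))
    return state
```

In a real agent the chooser would be an LLM call that sees the findings so far and either requests another tool or returns a final answer; the step cap guards against a loop that never decides it is done.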

Quick Start & Requirements

Build and run with Docker Compose: docker compose up --build
Requires API keys for OpenAI, Anthropic, and Gemini, configured in a .env file.
Access the application at http://localhost:80.
Demo available at: https://demo.defog.ai/reports (user: admin, pass: admin).
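Because the stack needs three provider keys in .env, a small pre-flight check before running docker compose can catch a missing key early. The sketch below is an assumption-heavy illustration: the variable names OPENAI_API_KEY, ANTHROPIC_API_KEY, and GEMINI_API_KEY are guesses, not confirmed by the summary, so compare them with the repository's own .env template.

```python
from pathlib import Path
from typing import Dict, List

# Assumed variable names; the real names may differ in the repository's .env template.
REQUIRED_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY"]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env_text: str, required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty in the given .env text."""
    values = parse_env(env_text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    print("missing keys:", ", ".join(missing) if missing else "none")
```

Running the script in the project root prints which of the assumed keys still need values; it only inspects the file and never contacts any provider.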

Highlighted Details

Supports a wide range of databases (PostgreSQL, MySQL, BigQuery, Snowflake, etc.) and file formats (CSV, Excel).
Integrates PDF analysis with citation support.
Web search capability provides external context for queries.
Modular design with separate backend (Python) and frontend (JavaScript/TypeScript) components.

Maintenance & Community

Maintained by Defog.ai.
Future plans include user-selectable models and documentation for custom tools and data source integrations.

Licensing & Compatibility

License details are not explicitly stated in the README.

Limitations & Caveats

Documentation is not yet available; the README marks it as "Coming soon".
Users cannot currently select specific LLM models for different tasks via configuration.
Integration with cloud storage services like Google Drive and OneDrive for unstructured data is not yet implemented.

Health Check

Last Commit: 8 months ago

Responsiveness: 1 day


Star History

0 stars in the last 30 days