Python framework for automated website content analysis and structured report generation
This Python framework automates the analysis of enterprise AI case studies from websites or provided URLs. It leverages Claude 3.5 Sonnet for intelligent identification and analysis of AI case studies, and Firecrawl for efficient web scraping and content extraction, producing detailed individual, cross-case, and executive reports.
How It Works
The system has two modes: CSV mode for a specific list of URLs, and Website mode for broader discovery. In Website mode, Firecrawl's /v1/map endpoint discovers links, and /v1/scrape then extracts markdown content and metadata from each page. Claude 3.5 Sonnet then identifies relevant case studies, filters them for enterprise AI relevance, and analyzes strategy, implementation, and business impact in depth.
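The Website-mode flow described above (map, then scrape) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names discover_and_scrape and post_json, the limit parameter, and the API host are assumptions, and the request and response shapes follow Firecrawl's public v1 REST API as commonly documented. The Claude analysis step is left out.

```python
import json
import urllib.request

# Assumed public Firecrawl API host; not stated in the source.
FIRECRAWL_BASE = "https://api.firecrawl.dev"


def post_json(path, payload, api_key):
    """POST a JSON payload to a Firecrawl endpoint and return the parsed reply."""
    request = urllib.request.Request(
        FIRECRAWL_BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def discover_and_scrape(site_url, api_key, post=post_json, limit=5):
    """Hypothetical helper: map a site, then scrape the first `limit` links.

    `post` is injectable so the flow can be exercised without network access.
    """
    # Step 1: /v1/map returns the links discovered on the site.
    mapped = post("/v1/map", {"url": site_url}, api_key)
    links = mapped.get("links", [])[:limit]

    # Step 2: /v1/scrape returns markdown and metadata for each link.
    pages = []
    for link in links:
        scraped = post("/v1/scrape", {"url": link, "formats": ["markdown"]}, api_key)
        data = scraped.get("data", {})
        pages.append(
            {
                "url": link,
                "markdown": data.get("markdown", ""),
                "metadata": data.get("metadata", {}),
            }
        )
    return pages
```

Each returned page dictionary would then be handed to the model for case-study identification and analysis. Keeping the HTTP call behind an injectable function is a design choice made here for testability, not something the source describes.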
Quick Start & Requirements
Install: pip install -r requirements.txt
Requires: ANTHROPIC_API_KEY and FIRECRAWL_API_KEY
Run: python -m src.main
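Since both API keys are required, a quick check before running can save a failed start. The sketch below assumes the keys are read from environment variables, which the source does not confirm; missing_keys is a hypothetical helper, not part of the project.

```python
import os

# Key names come from the requirements above.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "FIRECRAWL_API_KEY")


def missing_keys(env=os.environ):
    """Return the required key names that are unset or empty in `env`."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit("Missing required keys: " + ", ".join(missing))
    print("All required keys are set.")
```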
Highlighted Details
Maintenance & Community
Contributions are welcome. The project is MIT licensed.
Licensing & Compatibility
MIT License. Permissive for commercial use and integration with closed-source projects.
Limitations & Caveats
The system relies heavily on the quality of the Claude 3.5 Sonnet API and Firecrawl's scraping capabilities. Performance and accuracy may vary based on website structure and content. API keys are required for core functionality.
9 months ago
Inactive