This project provides a free, stable, and scalable API service for converting web content into a format suitable for Large Language Models (LLMs) and for performing web searches. It targets developers building LLM-powered agents and RAG systems, offering enhanced input quality and access to real-time information.
How It Works
The service operates via two main endpoints: r.jina.ai for content retrieval and s.jina.ai for web search. r.jina.ai fetches content from any URL and processes it for LLM consumption, including handling JavaScript-heavy Single Page Applications (SPAs) via Puppeteer and headless Chrome. s.jina.ai performs a web search, retrieves the top 5 results, and then applies the r.jina.ai processing to each, providing richer context than typical search engine API snippets.
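As a rough illustration of how the two endpoints are addressed, the sketch below builds request URLs by prefixing the endpoint to the target. The helper names reader_url and search_url are hypothetical, and percent-encoding the search query is an assumption made here so that the result is a valid URL; nothing in this snippet contacts the service.

```python
from urllib.parse import quote

READER_BASE = "https://r.jina.ai/"
SEARCH_BASE = "https://s.jina.ai/"


def reader_url(target_url: str) -> str:
    """Prefix the page URL with the reader endpoint (hypothetical helper)."""
    return READER_BASE + target_url


def search_url(query: str) -> str:
    """Append the query to the search endpoint (hypothetical helper).

    Assumption: the query is percent-encoded so spaces and other
    reserved characters survive as part of the URL path.
    """
    return SEARCH_BASE + quote(query, safe="")


if __name__ == "__main__":
    print(reader_url("https://example.com/article"))
    print(search_url("what is retrieval augmented generation"))
```

Keeping URL construction separate from the HTTP call makes the addressing scheme easy to check without any network access.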
Quick Start & Requirements
Clone the repository (git clone git@github.com:jina-ai/reader.git) and run npm install.
Use https://r.jina.ai/<your_url> for content reading or https://s.jina.ai/<your_query> for web search.
Highlighted Details
s.jina.ai provides the full content of top search results, not just snippets.
Maintenance & Community
The service is actively maintained by Jina AI as a core product. Updates are deployed directly from commits to this repository. Users can report issues with specific URLs.
Licensing & Compatibility
Licensed under Apache-2.0. This license is permissive and generally compatible with commercial use and closed-source linking.
Limitations & Caveats
The build is documented to fail on Node.js versions later than v18. The API is generally stable; it has been hit by DDoS attacks in the past, and reliability has since been improved. Some SPAs may require specific request headers (e.g., x-timeout, x-wait-for-selector) for optimal content capture.
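For SPAs that need the headers mentioned above, a request might be prepared along these lines. This is a minimal sketch using only the Python standard library. The helper names build_reader_request and fetch are hypothetical, and it is an assumption here that x-timeout takes a number of seconds and that x-wait-for-selector takes a CSS selector; check the service documentation before relying on either.

```python
from typing import Optional
from urllib.request import Request, urlopen


def build_reader_request(
    target_url: str,
    timeout_s: Optional[int] = None,
    wait_for_selector: Optional[str] = None,
) -> Request:
    """Prepare a reader request with optional SPA-related headers (hypothetical helper)."""
    headers = {}
    if timeout_s is not None:
        # Assumption: the value is interpreted as seconds.
        headers["x-timeout"] = str(timeout_s)
    if wait_for_selector is not None:
        # Assumption: the value is a CSS selector to wait for before capture.
        headers["x-wait-for-selector"] = wait_for_selector
    return Request("https://r.jina.ai/" + target_url, headers=headers)


def fetch(req: Request, client_timeout: float = 60.0) -> str:
    """Send the prepared request and return the response body as text."""
    with urlopen(req, timeout=client_timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    req = build_reader_request(
        "https://example.com/app", timeout_s=30, wait_for_selector="#content"
    )
    print(req.full_url)
    print(req.header_items())
    # text = fetch(req)  # uncomment to perform the network call
```

Building the Request separately from sending it lets the headers be inspected or logged first. Note that urllib stores header names in capitalized form (for example X-timeout), which is harmless because HTTP header names are case-insensitive.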