ollama by ollama

CLI tool for running LLMs locally

created 2 years ago
148,101 stars

Top 0.0% on sourcepulse

View on GitHub
Project Summary

Ollama provides a streamlined way to download, install, and run large language models (LLMs) locally on macOS, Windows, and Linux. It targets developers and power users seeking to experiment with or integrate various LLMs into their applications without complex setup. The primary benefit is simplified local LLM deployment and management.

How It Works

Ollama acts as a local inference server, downloading quantized LLM weights (typically in GGUF format) and serving them via a REST API. This approach allows users to run powerful models on consumer hardware by leveraging quantization, which reduces model size and computational requirements. It abstracts away the complexities of model loading, GPU acceleration (if available), and API serving, offering a consistent interface across different models.
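
The server listens on localhost:11434 by default. A minimal sketch of a non-streaming completion request against the REST API, assuming the llama3.2 model has already been pulled:

  # Ask the local Ollama server for a single, non-streaming response.
  curl http://localhost:11434/api/generate -d '{
    "model": "llama3.2",
    "prompt": "Why is the sky blue?",
    "stream": false
  }'

The reply is a JSON object whose response field holds the generated text; omitting "stream": false returns a stream of newline-delimited JSON chunks instead.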

Quick Start & Requirements

  • Install: curl -fsSL https://ollama.com/install.sh | sh (Linux) or download the installer from ollama.com (macOS, Windows). The official Docker image ollama/ollama is also available (see the Docker sketch after this list).
  • Prerequisites: Minimum 8GB RAM for 7B models, 16GB for 13B, 32GB for 33B. GPU acceleration is supported but not strictly required.
  • Run: ollama run llama3.2
  • Docs: https://ollama.com/
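
As a minimal sketch of the Docker route mentioned above (CPU-only; the container name and volume are illustrative), assuming Docker is installed:

  # Start the Ollama server in a container, persisting models in a named volume.
  docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

  # Pull and chat with a model inside the running container.
  docker exec -it ollama ollama run llama3.2

GPU use inside Docker additionally requires the appropriate container runtime (for NVIDIA GPUs, the NVIDIA Container Toolkit).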

Highlighted Details

  • Supports a wide range of popular LLMs including Llama 3, Gemma, Mistral, Phi-4, and DeepSeek-R1.
  • Enables model customization via Modelfiles for system prompts and parameters (see the sketch after this list).
  • Offers multimodal capabilities with models like LLaVA.
  • Provides a comprehensive REST API for programmatic interaction.
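
A Modelfile layers a system prompt and generation parameters on top of an existing model. A minimal sketch, assuming llama3.2 is already pulled (the name brief-assistant is only a placeholder):

  # Modelfile
  FROM llama3.2
  PARAMETER temperature 0.7
  SYSTEM "You are a concise assistant that answers in at most three sentences."

  # Build and run the customized model.
  ollama create brief-assistant -f Modelfile
  ollama run brief-assistant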

Maintenance & Community

  • Actively developed, with a large community reflected in the extensive list of third-party integrations and community projects.
  • Community support available via Discord and Reddit.

Licensing & Compatibility

  • The Ollama project itself is released under the MIT license. Model licenses vary by the specific LLM being used.

Limitations & Caveats

  • Performance is highly dependent on local hardware, especially for larger models.
  • While it supports GPU acceleration, specific driver configurations might be necessary for optimal performance.
Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 107
  • Issues (30d): 301

Star History

10,045 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

  • C/C++ library for local LLM inference
  • 84k stars, top 0.4% on sourcepulse
  • created 2 years ago, updated 10 hours ago