ollama by ollama

CLI tool for running LLMs locally

created 2 years ago
148,101 stars

Top 0.0% on sourcepulse

View on GitHub
Project Summary

Ollama provides a streamlined way to download, install, and run large language models (LLMs) locally on macOS, Windows, and Linux. It targets developers and power users seeking to experiment with or integrate various LLMs into their applications without complex setup. The primary benefit is simplified local LLM deployment and management.

How It Works

Ollama acts as a local inference server, downloading quantized LLM weights (typically in GGUF format) and serving them via a REST API. This approach allows users to run powerful models on consumer hardware by leveraging quantization, which reduces model size and computational requirements. It abstracts away the complexities of model loading, GPU acceleration (if available), and API serving, offering a consistent interface across different models.
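
The server listens on localhost:11434 by default. A minimal sketch of a non-streaming completion request against the REST API, assuming the llama3.2 model has already been pulled:

  # Ask the local Ollama server for a single, non-streaming response.
  curl http://localhost:11434/api/generate -d '{
    "model": "llama3.2",
    "prompt": "Why is the sky blue?",
    "stream": false
  }'

The reply is a JSON object whose response field holds the generated text; omitting "stream": false returns a stream of newline-delimited JSON chunks instead.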

Quick Start & Requirements

  • Install: curl -fsSL https://ollama.com/install.sh | sh (Linux) or download the installer from ollama.com (macOS, Windows). The official Docker image ollama/ollama is also available (see the Docker sketch after this list).
  • Prerequisites: Minimum 8GB RAM for 7B models, 16GB for 13B, 32GB for 33B. GPU acceleration is supported but not strictly required.
  • Run: ollama run llama3.2
  • Docs: https://ollama.com/
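
As a minimal sketch of the Docker route mentioned above (CPU-only; the container name and volume are illustrative), assuming Docker is installed:

  # Start the Ollama server in a container, persisting models in a named volume.
  docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

  # Pull and chat with a model inside the running container.
  docker exec -it ollama ollama run llama3.2

GPU use inside Docker additionally requires the appropriate container runtime (for NVIDIA GPUs, the NVIDIA Container Toolkit).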

Highlighted Details

  • Supports a wide range of popular LLMs including Llama 3, Gemma, Mistral, Phi-4, and DeepSeek-R1.
  • Enables model customization via Modelfiles for system prompts and parameters (see the sketch after this list).
  • Offers multimodal capabilities with models like LLaVA.
  • Provides a comprehensive REST API for programmatic interaction.
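
A Modelfile layers a system prompt and generation parameters on top of an existing model. A minimal sketch, assuming llama3.2 is already pulled (the name brief-assistant is only a placeholder):

  # Modelfile
  FROM llama3.2
  PARAMETER temperature 0.7
  SYSTEM "You are a concise assistant that answers in at most three sentences."

  # Build and run the customized model.
  ollama create brief-assistant -f Modelfile
  ollama run brief-assistant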

Maintenance & Community

  • Actively developed, with a large community reflected in the extensive list of third-party integrations and community projects.
  • Community support available via Discord and Reddit.

Licensing & Compatibility

  • The Ollama project itself is released under the MIT license. Model licenses vary by the specific LLM being used.

Limitations & Caveats

  • Performance is highly dependent on local hardware, especially for larger models.
  • While it supports GPU acceleration, specific driver configurations might be necessary for optimal performance.
Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 107
  • Issues (30d): 301

Star History

10,045 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

  • C/C++ library for local LLM inference
  • 84k stars, top 0.4% on sourcepulse
  • created 2 years ago, updated 10 hours ago