mgrep by mixedbread-ai

Semantic CLI for code and document search

Created 3 weeks ago

1,590 stars

Top 26.2% on SourcePulse

Summary

mgrep provides CLI-native semantic search over code, images, and PDFs, covering queries that grep's pattern matching handles poorly. It targets developers and AI agents, offering immediate natural-language querying that reduces agent token usage and improves search relevance.

How It Works

Built on Mixedbread Search, mgrep combines semantic retrieval models with optimized inference. The mgrep watch command continuously indexes a repository in the background, respecting .gitignore, and stores the data in a cloud-backed corpus. Users then query this corpus in natural language, which supports intent-based discovery beyond traditional pattern matching and lets models focus on reasoning.

Quick Start & Requirements

Install with npm install -g @mixedbread/mgrep (requires Node.js; pnpm or bun can be used in place of npm). Authenticate with mgrep login, which opens a browser, or set the MXBAI_API_KEY environment variable. Run mgrep watch in the repository directory to index a project, then search with mgrep "natural language query" [path]. Claude Code integration is installed with mgrep install-claude-code.
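
The quick-start steps can be strung together as a short shell script. This is a minimal sketch that uses only the commands named in this section. The query text and the src path are placeholders, and the run helper, its DRY_RUN switch, and the three function names are hypothetical conveniences, not part of mgrep.

```shell
#!/bin/sh
# Quick-start sketch. DRY_RUN defaults to 1, so every step is printed
# instead of executed. Set DRY_RUN=0 to run the commands for real.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

install_and_login() {
  # Install the CLI globally, then authenticate in the browser.
  # Instead of logging in, you can export MXBAI_API_KEY with your own key.
  run npm install -g @mixedbread/mgrep
  run mgrep login
}

start_indexing() {
  # Run from the repository directory. The watcher keeps running,
  # so in a real session give it its own terminal.
  run mgrep watch
}

search() {
  # $1 is the natural-language query, $2 an optional path.
  if [ -n "${2:-}" ]; then
    run mgrep "$1" "$2"
  else
    run mgrep "$1"
  fi
}

# Preview the whole flow (dry run by default). With DRY_RUN=0, call the
# functions one at a time instead, because the watch step does not return.
install_and_login
start_indexing
search "where are auth tokens validated" src
```

Keeping the watcher in its own function reflects that indexing is continuous, while installation, login, and each search are one-off commands.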

Highlighted Details

  • Semantic, multilingual search; multimodal (audio, video) support is upcoming.
  • Used roughly half as many agent tokens as grep-based workflows in benchmarks, with comparable or better quality.
  • Features first-class coding agent integrations, starting with Claude Code.
  • Cloud-backed stores allow shared querying without re-uploading data.

Maintenance & Community

No specific details on maintainers, community channels, or roadmaps were provided in the README.

Licensing & Compatibility

Licensed under Apache-2.0, offering permissive terms for commercial use and closed-source integration.

Limitations & Caveats

  • Audio/video support is pending.
  • Login issues may require running mgrep logout.
  • mgrep watch can be noisy; consider --store for workspace separation, or pause the watcher.
  • Automated tests are not fully integrated; running pnpm typecheck is recommended before publishing.
  • mgrep complements grep rather than replacing it.

Health Check

  • Last Commit: 2 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 40
  • Issues (30d): 16

Star History

1,616 stars in the last 25 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Simon Willison (Coauthor of Django).

semantra by freedmand

0.2%
3k
CLI tool for semantic document search
Created 2 years ago
Updated 1 year ago
Starred by John Resig (Author of jQuery; Chief Software Architect at Khan Academy), Chenlin Meng (Cofounder of Pika), and 9 more.

clip-retrieval by rom1504

0.1%
3k
CLIP retrieval system for semantic search
Created 4 years ago
Updated 3 months ago
Starred by Chang She (Cofounder of LanceDB), Carol Willing (Core Contributor to CPython, Jupyter), and 11 more.

lancedb by lancedb

0.6%
8k
Embedded retrieval engine for multimodal AI
Created 2 years ago
Updated 2 days ago
Starred by John Resig (Author of jQuery; Chief Software Architect at Khan Academy), Simon Horup Eskildsen (Cofounder of Turbopuffer), and 21 more.

meilisearch by meilisearch

0.2%
55k
Search engine API for integrating AI-powered hybrid search
Created 7 years ago
Updated 2 days ago