What is VecGrep?

VecGrep is a semantic code search tool that works as an MCP (Model Context Protocol) plugin for Claude Code. Instead of grepping 50 files and sending 30,000 tokens to Claude, VecGrep returns the top 8 semantically relevant code chunks (~1,600 tokens). That’s a ~95% token reduction for codebase queries.
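
The arithmetic behind the ~95% figure can be checked with a short sketch. It uses only the example numbers quoted above; the per-chunk size is derived from them rather than measured, and the variable names are illustrative.

```python
# Example figures quoted in the paragraph above (not measurements).
baseline_tokens = 30_000  # grepping ~50 files and sending the results to Claude
returned_chunks = 8       # chunks returned by VecGrep
vecgrep_tokens = 1_600    # approximate total size of those chunks

# Derived values.
tokens_per_chunk = vecgrep_tokens / returned_chunks
reduction = 1 - vecgrep_tokens / baseline_tokens

print(f"{tokens_per_chunk:.0f} tokens per chunk")  # 200 tokens per chunk
print(f"{reduction:.1%} fewer tokens")             # 94.7% fewer tokens
```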

How it works

1. Chunk

Parses source files with tree-sitter to extract semantic units: functions, classes, and methods.

2. Embed

Encodes each chunk locally using all-MiniLM-L6-v2-code-search-512 (384-dim, ~80MB one-time download) via the fastembed ONNX backend (~100ms startup) or PyTorch.

3. Store

Saves embeddings and metadata in LanceDB under ~/.vecgrep/<project_hash>/.

4. Search

Embeds the query with the same model and returns the nearest chunks, using an approximate nearest neighbor (ANN) index (IVF-PQ) to keep search fast on large codebases.
Incremental re-indexing via mtime/size checks skips unchanged files.
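
The four steps above can be illustrated with a toy, standard-library-only sketch. It is not VecGrep's implementation, and every name in it is hypothetical. Each stage uses a deliberately simple stand-in: Python's `ast` module instead of tree-sitter (so it only handles Python source), bag-of-words counts instead of the 384-dim neural embedding, an in-memory list instead of LanceDB, and an exhaustive cosine scan instead of an IVF-PQ index.

```python
import ast
import math
import re
from collections import Counter


def chunk_source(source: str) -> list[dict]:
    """Chunk: extract functions, classes, and methods (ast stands in for tree-sitter)."""
    lines = source.splitlines()
    chunks = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append({"name": node.name, "start_line": node.lineno, "text": text})
    return chunks


def toy_embed(text: str) -> Counter:
    """Embed: sparse word counts (stand-in for the real embedding model)."""
    return Counter(token.lower() for token in re.findall(r"[A-Za-z]+", text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, index: list[tuple[dict, Counter]], k: int = 8) -> list[tuple[float, dict]]:
    """Search: exhaustive scan returning the k best (score, chunk) pairs."""
    q = toy_embed(query)
    scored = [(cosine(q, vec), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


SOURCE = '''
def load_config(path):
    """Read settings from a config file."""
    return open(path).read()

class Cache:
    def evict_oldest(self):
        """Remove the oldest cache entry."""
        self.items.pop(0)
'''

# Store: an in-memory list of (chunk, vector) pairs stands in for the LanceDB table.
index = [(chunk, toy_embed(chunk["text"])) for chunk in chunk_source(SOURCE)]

top = search("read config file", index, k=8)
print(top[0][1]["name"])  # load_config
```

The exhaustive scan compares the query against every stored chunk, which is fine for a handful of chunks. An approximate index such as IVF-PQ trades a little recall for speed by examining only part of the collection, which is why it matters on large codebases.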

Key features

~95% token reduction

Returns only the top semantically relevant chunks instead of entire files.

Fast startup

ONNX backend starts in ~100ms. No PyTorch required by default.

Incremental indexing

Only re-indexes files that have changed since the last run.
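
A minimal sketch of how an mtime/size check can decide which files need re-indexing. This is an assumption about the general technique, not VecGrep's code: the function names and the `seen` dictionary are hypothetical, and a real tool would persist the recorded signatures between runs.

```python
import os
import tempfile


def file_signature(path: str) -> tuple[int, int]:
    """Return (modification time in ns, size in bytes) for a file."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def files_to_reindex(paths: list[str], seen: dict[str, tuple[int, int]]) -> list[str]:
    """Return paths whose signature differs from the recorded one, and record the new one."""
    changed = []
    for path in paths:
        sig = file_signature(path)
        if seen.get(path) != sig:
            changed.append(path)
            seen[path] = sig
    return changed


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")
    with open(path, "w") as f:
        f.write("def a(): pass\n")

    seen = {}  # hypothetical in-memory record of previously indexed files
    first = files_to_reindex([path], seen)   # never seen before: re-index
    second = files_to_reindex([path], seen)  # unchanged: skip

    with open(path, "a") as f:
        f.write("def b(): pass\n")           # size changes, so the signature changes
    third = files_to_reindex([path], seen)

print(len(first), len(second), len(third))  # 1 0 1
```

Comparing mtime and size avoids reading file contents, which keeps the check cheap. The trade-off is that an edit preserving both values would go unnoticed; hashing contents would catch it at a higher cost.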

Works automatically

Claude decides when to call VecGrep — no manual invocation needed.