Introduction

What is VecGrep?

VecGrep is a semantic code search tool that works as an MCP (Model Context Protocol) plugin for Claude Code. Instead of grepping 50 files and sending 30,000 tokens to Claude, VecGrep returns the top 8 semantically relevant code chunks (~1,600 tokens). That’s a ~95% token reduction for codebase queries.
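The ~95% figure follows from the two token counts quoted above. A quick check in Python, using only those numbers (the variable names are illustrative):

```python
# Token counts quoted in the paragraph above.
grep_tokens = 30_000     # grepping ~50 files and sending the results
vecgrep_tokens = 1_600   # top 8 semantically relevant chunks

# Fraction of tokens saved, and the implied average chunk size.
reduction = 1 - vecgrep_tokens / grep_tokens
tokens_per_chunk = vecgrep_tokens / 8

print(f"reduction: {reduction:.1%}")
print(f"average chunk size: {tokens_per_chunk:.0f} tokens")
```

The actual saving on a given query depends on how many files a plain grep would have matched and how large the returned chunks are.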

How it works

Chunk

Parses source files with tree-sitter to extract semantic units — functions, classes, and methods.
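To make the idea of a "semantic unit" concrete, here is a minimal sketch of chunking a Python file into functions, methods, and classes. It uses Python's built-in ast module as a stand-in for tree-sitter, so it only handles Python source and is not VecGrep's implementation. The function name extract_chunks and the chunk fields are assumptions made for illustration.

```python
import ast

def extract_chunks(source: str) -> list[dict]:
    """Return one chunk per function, method, or class found in `source`."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the unit, which is what would be embedded.
                "text": ast.get_source_segment(source, node),
            })
    # ast.walk is breadth-first, so sort to get chunks in file order.
    return sorted(chunks, key=lambda c: c["start_line"])

sample = '''
class Greeter:
    def greet(self, name):
        return f"hello {name}"

def main():
    print(Greeter().greet("world"))
'''

for chunk in extract_chunks(sample):
    print(chunk["kind"], chunk["name"], chunk["start_line"], chunk["end_line"])
```

Chunking at function and class boundaries keeps each embedded unit self-contained, which is why a handful of chunks can stand in for whole files.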

Embed

Encodes each chunk locally using all-MiniLM-L6-v2-code-search-512 (384-dim, ~80MB one-time download). By default the model runs on the fastembed ONNX backend (~100ms startup); PyTorch is available as an alternative backend.
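Once chunks are embedded, a query can be answered by embedding the query text and ranking chunks by vector similarity. The sketch below shows that ranking step with cosine similarity. The hand-written 3-dimensional vectors stand in for real 384-dimensional embeddings, and the chunk names, the top_k helper, and the choice of cosine similarity are assumptions for illustration rather than details confirmed by VecGrep.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunk_vecs, k=8):
    """Return the k (score, chunk_name) pairs most similar to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in chunk_vecs.items()]
    return sorted(scored, reverse=True)[:k]

# Toy embeddings: real ones would come from the embedding model.
chunk_vecs = {
    "parse_config": [0.9, 0.1, 0.0],
    "load_settings": [0.8, 0.3, 0.1],
    "render_html": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]  # stand-in for an embedded query

results = top_k(query_vec, chunk_vecs, k=2)
for score, name in results:
    print(f"{name}: {score:.3f}")
```

This is an exact, brute-force scan. It is fine for small indexes, but it compares the query against every stored vector.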

Store

Saves embeddings and metadata in LanceDB under ~/.vecgrep/<project_hash>/.

Uses an ANN index (IVF-PQ) for fast approximate search on large codebases.

Incremental re-indexing via mtime/size checks skips unchanged files.
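The mtime/size check can be sketched as follows. This is an illustration under assumptions, not VecGrep's code: the helper names, the in-memory dict used as the stored state, and the use of nanosecond mtimes are all hypothetical choices.

```python
import os
import tempfile

def file_signature(path):
    """Cheap change detector: (modification time in ns, size in bytes)."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)

def files_to_reindex(paths, previous):
    """Compare current signatures with those recorded on the previous run.

    Returns the paths whose signature changed (or that are new) together
    with the updated signature map to persist for the next run.
    """
    changed, current = [], {}
    for path in paths:
        sig = file_signature(path)
        current[path] = sig
        if previous.get(path) != sig:
            changed.append(path)
    return changed, current

with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.py")
    b = os.path.join(tmp, "b.py")
    for p in (a, b):
        with open(p, "w") as f:
            f.write("x = 1\n")

    # First run: nothing recorded yet, so every file is indexed.
    first_changed, state = files_to_reindex([a, b], {})

    # Edit only b.py; its size changes, so it is detected as modified.
    with open(b, "w") as f:
        f.write("x = 1\ny = 2\n")
    second_changed, state = files_to_reindex([a, b], state)

print(len(first_changed), len(second_changed))
```

Comparing mtime and size avoids reading or hashing file contents, so the check stays fast even on large repositories. The trade-off is that an edit preserving both values would go unnoticed.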

Key features

~95% token reduction

Returns only the top semantically relevant chunks instead of entire files.

Fast startup

ONNX backend starts in ~100ms. No PyTorch required by default.

Incremental indexing

Only re-indexes files that have changed since the last run.

Works automatically

Claude decides when to call VecGrep — no manual invocation needed.
