AgentSkillsCN

mgrep-code-search

使用mgrep进行语义化代码搜索,以实现高效的代码库探索。当您需要搜索或探索代码库,且代码库中非Git忽略的文件数量超过30个,或存在嵌套的目录结构时,应使用此技能。它提供自然语言语义搜索,可与传统的grep/ripgrep形成互补,帮助您查找功能、理解意图,以及探索陌生的代码。

SKILL.md
--- frontmatter
name: mgrep-code-search
description: Semantic code search using mgrep for efficient codebase exploration. This skill should be used when searching or exploring codebases with more than 30 non-gitignored files and/or nested directory structures. It provides natural language semantic search that complements traditional grep/ripgrep for finding features, understanding intent, and exploring unfamiliar code.

mgrep Code Search

Overview

mgrep is a semantic search tool that enables natural language queries across code, text, PDFs, and images. It is particularly effective for exploring larger or complex codebases where traditional pattern matching falls short.

When to Use This Skill

Use mgrep when:

  • The codebase contains more than 30 non-gitignored files
  • There are nested directory structures
  • Searching for concepts, features, or intent rather than exact strings
  • Exploring an unfamiliar codebase
  • Need to understand "where" or "how" something is implemented

Use traditional grep/ripgrep when:

  • Searching for exact patterns or symbols
  • Regex-based refactoring
  • Tracing specific function or variable names

Quick Start

Indexing

Before searching, start the watcher to index the repository:

bash
bunx @mixedbread/mgrep watch

The watch command indexes the repository and maintains synchronisation with file changes. It respects .gitignore and .mgrepignore patterns.

Searching

bash
bunx @mixedbread/mgrep "your natural language query" [path]

Search Commands

Basic Search

bash
bunx @mixedbread/mgrep "where is authentication configured?"
bunx @mixedbread/mgrep "how do we handle errors in API calls?" src/
bunx @mixedbread/mgrep "database connection setup" src/lib

Search Options

OptionDescription
-m <count>Maximum results (default: 10)
-c, --contentDisplay full result content
-a, --answerGenerate AI-powered synthesis of results
-s, --syncUpdate index before searching
--no-rerankDisable relevance optimisation

Examples with Options

bash
# Get more results
bunx @mixedbread/mgrep -m 25 "user authentication flow"

# Show full content of matches
bunx @mixedbread/mgrep -c "error handling patterns"

# Get an AI-synthesised answer
bunx @mixedbread/mgrep -a "how does the caching layer work?"

# Sync index before searching
bunx @mixedbread/mgrep -s "payment processing" src/services

Workflow

  1. Start watcher (once per session or when files change significantly):

    bash
    bunx @mixedbread/mgrep watch
    
  2. Search semantically:

    bash
    bunx @mixedbread/mgrep "what you're looking for" [optional/path]
    
  3. Refine as needed using path constraints or options:

    bash
    bunx @mixedbread/mgrep -m 20 -c "refined query" src/specific/directory
    

Environment Variables

Configure defaults via environment variables:

VariablePurpose
MGREP_MAX_COUNTDefault result limit
MGREP_CONTENTEnable content display (1/true)
MGREP_ANSWEREnable AI synthesis (1/true)
MGREP_SYNCPre-search sync (1/true)

Important Notes

  • Always use bunx @mixedbread/mgrep to run commands (not npm/npx or direct installation)
  • Run bunx @mixedbread/mgrep watch before searching to ensure the index is current
  • mgrep respects .gitignore patterns automatically
  • Create .mgrepignore for additional exclusions