AgentSkillsCN

local-markdown-search

当用户希望借助 QMD(Query Markup Documents)在本地 Markdown 收藏中进行跨库检索时使用。当您需要为文字稿、笔记、文档或知识库建立全文索引或语义索引时,此技能同样适用。若需设计维基结构,请参考 wiki-ontology-design。

SKILL.md
--- frontmatter
name: local-markdown-search
version: 1.0.0
description: "When the user wants to search across local markdown collections using QMD (Query Markup Documents). Also use when indexing transcripts, notes, documentation, or knowledge bases for full-text or semantic search. For wiki structure design, see wiki-ontology-design."

Local Markdown Search

Search and query local markdown collections with BM25 full-text search, vector semantic search, and LLM re-ranking. Built on QMD (Query Markup Documents) for on-device, privacy-preserving search across transcripts, notes, wikis, and documentation.

Purpose

  • Index markdown files into searchable collections
  • Perform keyword, semantic, or combined searches
  • Retrieve full documents or snippets with context
  • Support research workflows requiring source-grounded answers

When to Use This Skill

  • Setting up search for a new corpus of markdown files
  • Querying transcripts, notes, or documentation
  • Finding specific content across multiple files
  • Building research briefs with source citations
  • Creating searchable archives of knowledge bases

Not for: Web search, database queries, or non-markdown content.


Setup

Prerequisites

  • Bun runtime installed
  • QMD cloned and linked

Installation

bash
git clone https://github.com/tobi/qmd.git ~/qmd
cd ~/qmd
bun install
bun link  # Makes 'qmd' available globally

Paths:

  • Index: ~/.cache/qmd/index.sqlite
  • Models: ~/.cache/qmd/models/ (~2GB, auto-downloaded on first use)

Workflow

Phase 1: Create Collections

Goal: Index markdown files into named, searchable collections.

Step 1: Add a Collection

bash
qmd collection add /path/to/folder --name collection-name --mask "**/*.md"

Parameters:

  • --name: Identifier for the collection (lowercase, hyphenated)
  • --mask: Glob pattern for files to include

Examples:

bash
qmd collection add ~/notes --name my-notes --mask "**/*.md"
qmd collection add ./transcripts --name podcasts --mask "**/polished/*.md"
qmd collection add . --name project --mask "**/*.{md,txt}"

Step 2: Verify Collections

bash
qmd collection list   # Show all collections with stats
qmd status            # Index health and counts

Phase 2: Search

Goal: Find relevant content across indexed collections.

Full-Text Search (BM25)

bash
qmd search "keyword phrase" -n 10
qmd search "fasting cortisol" -c my-collection

Semantic Search (Vector)

Requires embeddings generated first:

bash
qmd embed             # One-time setup (~2GB model download)
qmd vsearch "conceptual query about metabolism"

Combined Search with Reranking

bash
qmd query "Why does this author oppose fasting?"

Search Options:

FlagEffect
-n <num>Number of results (default: 5)
-c <name>Filter to specific collection
--allReturn all matches
--min-score <num>Minimum similarity threshold
--fullOutput full document content
--filesOutput file paths only
--jsonJSON output

Phase 3: Retrieve Documents

Goal: Get full content from search results.

Single Document

bash
qmd get collection-name/path/to/file.md
qmd get collection-name/path/to/file.md -l 100  # First 100 lines
qmd get collection-name/path/to/file.md --from 50 -l 100  # Lines 50-150

Multiple Documents

bash
qmd multi-get "**/*keyword*.md" -l 50
qmd multi-get "**/*.md" --max-bytes 10240

Collection Management

bash
qmd collection list              # List all with stats
qmd collection remove <name>     # Delete collection
qmd collection rename <old> <new>
qmd update                       # Re-index all collections
qmd update --pull                # Git pull then re-index
qmd cleanup                      # Remove orphaned data

Context Annotations

Add metadata to improve search relevance:

bash
qmd context add /path "Description of this folder's content"
qmd context list
qmd context rm /path

MCP Server Integration

For AI agent integration:

bash
qmd mcp    # Start MCP server

Example: Research Workflow

bash
# 1. Create collections
qmd collection add ~/wikis/ray-peat/transcripts --name rp-transcripts --mask "**/polished/*.md"
qmd collection add ~/wikis/ray-peat/newsletters --name rp-newsletters --mask "*.md"

# 2. Generate embeddings for semantic search
qmd embed

# 3. Search for topic
qmd search "fasting cortisol" -n 10 -c rp-transcripts
qmd vsearch "metabolic stress during starvation" -n 10

# 4. Get full documents from top results
qmd get rp-transcripts/jodellefit/polished/2019-06-01-cortisol.md --full

# 5. Multi-get for comprehensive research
qmd multi-get "**/*fasting*.md" -l 100 --md > research-sources.md

Troubleshooting

IssueSolution
Collection not foundRun qmd collection list to verify names
No resultsLower --min-score or check collection mask
Stale resultsRun qmd update to re-index
Slow first searchModels downloading (~2GB one-time)
Vector search failsRun qmd embed first

Related Skills

  • Enhanced by: semantic-chunking-rules
  • Feeds into: multi-source-research, podcast-wiki-pipeline