RAG Search

This skill helps you search processed document databases using semantic similarity and retrieval-time optimizations.

Quick Search

bash

# Basic vector search
uv run processor search ./lancedb "how does the caching work"

# Hybrid search (vector + keyword)
uv run processor search ./lancedb "ConfigParser yaml loading" --hybrid

# Search code
uv run processor search ./lancedb "authentication middleware" --table code_chunks

Available Tables

Table	Content
`text_chunks`	Documents, papers, markdown (default)
`code_chunks`	Source code
`image_chunks`	Figures from papers
`chunks`	Unified table (if created with --table-mode unified)

MCP Server

Start the RAG MCP server for programmatic access:

bash

uv run rag-mcp

Generate a config template:

bash

uv run rag-mcp --config_generate

Configure in Claude Desktop (claude_desktop_config.json):

json

{
  "mcpServers": {
    "rag": {
      "command": "uv",
      "args": ["run", "rag-mcp"],
      "cwd": "/path/to/processor"
    }
  }
}
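If claude_desktop_config.json already lists other MCP servers, the "rag" entry should be merged in rather than pasted over the file. A minimal sketch of that merge using only the standard library; the helper names `add_rag_server` and `update_config_file` are hypothetical, and the location of the config file is left to the caller because it differs by platform:

```python
import json
from pathlib import Path


def add_rag_server(config: dict, cwd: str) -> dict:
    """Return a copy of config with the "rag" MCP server entry added.

    Existing entries under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["rag"] = {"command": "uv", "args": ["run", "rag-mcp"], "cwd": cwd}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, cwd: str) -> None:
    """Read the JSON config at path (if present), add the entry, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_rag_server(config, cwd), indent=2) + "\n")
```

Call `update_config_file(Path(...), "/path/to/processor")` with the actual config path for your system.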

Available MCP Tools

• search - Vector/hybrid search with optimizations
• search_images - Search image chunks
• list_tables - List available tables
• generate_config - Create config template

Retrieval Optimizations

Enable these for better results at the cost of latency:

Optimization	Parameter	Added latency	Best for
Hybrid Search	`hybrid=True`	+10-30ms	Keyword-heavy queries
HyDE	`use_hyde=True`	+200-500ms	Knowledge questions
Reranking	`rerank=True`	+50-200ms	High precision needs
Parent Expansion	`expand_parents=True`	+5-20ms	Broader context
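To get a feel for the trade-off, the ranges in the table can be summed for any combination. This is a rough sketch only: it assumes the overheads are additive, which the table does not state, and `estimate_added_latency_ms` is a hypothetical helper, not part of the tool.

```python
# Added-latency ranges in milliseconds, copied from the table above.
ADDED_LATENCY_MS = {
    "hybrid": (10, 30),
    "use_hyde": (200, 500),
    "rerank": (50, 200),
    "expand_parents": (5, 20),
}


def estimate_added_latency_ms(**flags: bool) -> tuple[int, int]:
    """Return (low, high) added latency for the enabled optimizations.

    Assumption: overheads add up linearly. Real numbers depend on
    hardware, models, and table size.
    """
    unknown = set(flags) - set(ADDED_LATENCY_MS)
    if unknown:
        raise ValueError(f"unknown optimization(s): {sorted(unknown)}")
    enabled = [name for name, on in flags.items() if on]
    low = sum(ADDED_LATENCY_MS[name][0] for name in enabled)
    high = sum(ADDED_LATENCY_MS[name][1] for name in enabled)
    return low, high


if __name__ == "__main__":
    # Hybrid + HyDE + reranking
    print(estimate_added_latency_ms(hybrid=True, use_hyde=True, rerank=True))
```

Under that assumption, enabling hybrid search, HyDE, and reranking together adds roughly 260-730ms, most of it from HyDE.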

Recommended Combinations

Fast search (default): No optimizations - pure vector similarity

Better recall:

python

search(query="...", hybrid=True)

Knowledge questions:

python

search(query="what is...", use_hyde=True, rerank=True)

Code search:

python

search(query="...", table="code_chunks", hybrid=True)

Maximum precision:

python

search(query="...", hybrid=True, use_hyde=True, rerank=True)
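The combinations above can be kept in one place as named presets. A small sketch: the preset names and the `search_kwargs` helper are hypothetical, and only the keyword arguments themselves come from the examples above.

```python
# Keyword arguments for search(), mirroring the combinations listed above.
# The preset names are labels chosen for this sketch.
PRESETS = {
    "fast": {},
    "recall": {"hybrid": True},
    "knowledge": {"use_hyde": True, "rerank": True},
    "code": {"table": "code_chunks", "hybrid": True},
    "precision": {"hybrid": True, "use_hyde": True, "rerank": True},
}


def search_kwargs(query: str, preset: str = "fast") -> dict:
    """Build the keyword arguments to pass to search() for a preset."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset {preset!r}; choose from {sorted(PRESETS)}")
    return {"query": query, **PRESETS[preset]}
```

A call then looks like `search(**search_kwargs("what is...", preset="knowledge"))`.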

Configuration

Edit rag_config.yaml to set defaults:

yaml

# Embedding profiles (must match processor config)
text_profile: "low"
code_profile: "low"
ollama_host: "http://localhost:11434"

# Default search behavior
default_limit: 5
default_hybrid: false

# HyDE settings (uses Claude SDK by default, falls back to Ollama)
hyde:
  enabled: false
  backend: "claude_sdk"  # claude_sdk (default) or ollama
  claude_model: "haiku"  # haiku, sonnet, opus
  ollama_model: "llama3.2:latest"  # fallback

# Reranking settings
reranker:
  enabled: false
  model: "BAAI/bge-reranker-v2-m3"
  top_k: 20
  top_n: 5
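The reranker settings read like a two-stage pipeline: take the first top_k candidates from the initial search, rescore them, and keep the best top_n. That reading of top_k and top_n is an assumption based on common usage, not something this file states. A toy sketch of the idea, with a word-overlap scorer standing in for the real reranker model:

```python
def rerank(query, candidates, score_fn, top_k=20, top_n=5):
    """Rescore the first top_k candidates and return the best top_n.

    candidates are assumed to arrive in initial-retrieval order.
    """
    pool = candidates[:top_k]
    # sorted() is stable, so ties keep their initial-retrieval order.
    scored = sorted(pool, key=lambda text: score_fn(query, text), reverse=True)
    return scored[:top_n]


def word_overlap(query, text):
    """Stand-in scorer: number of query words that also appear in text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


if __name__ == "__main__":
    docs = [
        "cache eviction policy",
        "yaml config loading",
        "how the cache works",
        "auth middleware",
    ]
    print(rerank("how cache works", docs, word_overlap, top_k=3, top_n=2))
```

With these defaults, a smaller top_k means fewer candidates to rescore, which is why lowering it is the usual lever for faster reranking.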

Understanding Results

Results include:

• content: Matched chunk text
• source_file: Original file path
• score: Similarity (0-1, higher is better)
• metadata: Additional fields (section, language, etc.)
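The fields above are enough to post-filter and display results. A minimal sketch, assuming each result is a dict keyed by those field names (the actual return type is not specified here); the helper names, the 0.5 threshold, and the sample data are made up for illustration:

```python
def filter_results(results, min_score=0.5):
    """Keep results scoring at least min_score, best first."""
    kept = [r for r in results if r["score"] >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


def format_result(result, width=60):
    """One-line summary: score, source file, and a whitespace-collapsed snippet."""
    snippet = " ".join(result["content"].split())[:width]
    return f'{result["score"]:.2f}  {result["source_file"]}  {snippet}'


# Made-up sample data in the assumed shape.
SAMPLE_RESULTS = [
    {"content": "Unrelated text.", "source_file": "docs/misc.md",
     "score": 0.31, "metadata": {}},
    {"content": "The cache stores\nparsed configs.", "source_file": "docs/cache.md",
     "score": 0.82, "metadata": {"section": "Caching"}},
]

if __name__ == "__main__":
    for r in filter_results(SAMPLE_RESULTS):
        print(format_result(r))
```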

Troubleshooting

No results found

• Check that the database exists: uv run processor stats ./lancedb
• Verify that the table you are searching has data by naming it explicitly: --table text_chunks
• Try broader query terms

Poor quality results

• Enable hybrid search: --hybrid
• Check that the embedding profiles in rag_config.yaml match the processor config
• Consider reranking: rerank=True

Slow searches

• Disable HyDE if not needed (it adds the most latency, +200-500ms)
• Reduce rerank_top_k (top_k under reranker in rag_config.yaml)
• Check Ollama server performance