semantic-patterns

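Turns user questions into efficient vector search queries for RAG systems. Use it when building semantic search, designing RAG pipelines, writing retrieval code, or debugging poor recall. Covers query expansion, decomposition, HyDE, metadata filtering, multi-hop retrieval, and result aggregation.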

SKILL.md
--- frontmatter
name: semantic-patterns
description: Transform user questions into effective vector search queries for RAG systems. Use when building semantic search, designing RAG pipelines, writing retrieval code, or debugging poor recall. Covers query expansion, decomposition, HyDE, metadata filtering, multi-hop retrieval, and result aggregation.

Semantic Search Query Patterns

Query engineering determines retrieval quality. Raw user questions rarely match how information is stored. These patterns transform user intent into effective vector searches.

Core principle: Combine patterns. A single search often benefits from expansion + filtering + temporal awareness.

Query Patterns

| Pattern | When to Use |
| --- | --- |
| Expansion | User vocabulary differs from document vocabulary; want recall over precision |
| Decomposition | Multi-faceted questions (procedure + legal + technical); cross-domain queries |
| Contextual Rewriting | Multi-turn conversations; pronouns or references to previous context |
| Diagnostic Expansion | User describes a problem; troubleshooting scenarios |
| Self-Ask | Ambiguous queries; high-stakes queries where precision matters |
| HyDE | Vague questions; user doesn't know domain vocabulary; large semantic gap |
| Metadata Pre-filtering | Large corpus; known user context; document lifecycle states |
| Multi-hop Retrieval | Questions about document relationships; hierarchical navigation |
| Temporal Rewriting | Versioned corpus; time-sensitive documents; document lifecycle |
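
As a rough illustration of the Expansion pattern, the sketch below generates query variants by swapping user terms for document terms. The `SYNONYMS` map and the `expand_query` helper are hypothetical names invented for this example; a real system might derive the vocabulary from a glossary or an LLM instead.

```python
from typing import Dict, List

# Hypothetical vocabulary map (user term -> document terms), for illustration only.
SYNONYMS: Dict[str, List[str]] = {
    "fire": ["terminate", "dismiss"],
    "rules": ["policy", "regulation"],
}


def expand_query(query: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the original query plus one variant per single-word substitution."""
    variants = [query]
    words = query.lower().split()
    for i, word in enumerate(words):
        for alt in synonyms.get(word, []):
            # Replace only the word at position i, keep the rest unchanged.
            variant = " ".join(words[:i] + [alt] + words[i + 1:])
            if variant not in variants:
                variants.append(variant)
    return variants
```

Each variant would then be embedded and searched separately, with the result lists merged afterwards. Substituting one word at a time keeps the number of variants linear in the query length rather than exploding combinatorially.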

Pattern Combinations

Complex compliance question:

  1. Decomposition - Break into legal + procedural sub-queries
  2. Metadata filtering - Filter by region and doc_type
  3. Temporal - Ensure current/approved versions
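
Steps 2 and 3 above can be sketched as a filter that runs before any vector scoring. This is a minimal in-memory version under stated assumptions: the `Doc` fields (`family`, `region`, `doc_type`, `status`, `effective`) and the `"approved"` status value are hypothetical, and a production system would more likely push the same conditions into the vector store's own filter syntax.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Set


@dataclass
class Doc:
    doc_id: str
    family: str       # groups all versions of the same document (assumed field)
    region: str
    doc_type: str
    status: str       # e.g. "draft", "approved", "retired" (assumed values)
    effective: date   # date this version takes effect


def prefilter(docs: Iterable[Doc], region: str, doc_types: Set[str], as_of: date) -> List[Doc]:
    """Keep the latest approved version per family that matches region and doc_type."""
    latest: Dict[str, Doc] = {}
    for d in docs:
        # Metadata filtering: region and document type.
        if d.region != region or d.doc_type not in doc_types:
            continue
        # Temporal filtering: only approved versions already in effect.
        if d.status != "approved" or d.effective > as_of:
            continue
        current = latest.get(d.family)
        if current is None or d.effective > current.effective:
            latest[d.family] = d
    return list(latest.values())
```

Only the surviving documents (or their chunks) would be passed to the similarity search for each decomposed sub-query.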

Vague troubleshooting query:

  1. Self-Ask - Generate clarifying questions
  2. Diagnostic Expansion - Cover symptoms + causes + solutions
  3. Query Expansion - Add vocabulary variations
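
One possible shape for the Diagnostic Expansion step is a set of templated queries, one per facet. The `diagnostic_queries` helper and its template wording are assumptions made for this sketch, not a prescribed format.

```python
from typing import Dict


def diagnostic_queries(problem: str) -> Dict[str, str]:
    """Build one search query per diagnostic facet: symptoms, causes, solutions."""
    # Drop surrounding whitespace and trailing punctuation so templates read cleanly.
    p = problem.strip().rstrip(".?!")
    return {
        "symptoms": f"symptoms and error messages related to: {p}",
        "causes": f"common causes of: {p}",
        "solutions": f"how to fix or work around: {p}",
    }
```

Running the three queries separately keeps documents that describe only causes or only fixes from being crowded out by documents that merely repeat the symptom.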

Follow-up in conversation:

  1. Contextual Rewriting - Inject prior context
  2. HyDE - Bridge short question to detailed docs
  3. Multi-hop - Find referenced doc, then search within it
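
The first two steps above might look like the sketch below. It assumes the caller supplies two callables: `llm` (prompt in, text out) and `embed` (text in, vector out). Both are placeholders for whatever model client is in use, and the prompt wording is illustrative only.

```python
from typing import Callable, List, Sequence

LLM = Callable[[str], str]             # assumed interface: prompt -> completion
Embedder = Callable[[str], List[float]]  # assumed interface: text -> vector


def rewrite_with_context(history: Sequence[str], question: str, llm: LLM) -> str:
    """Contextual Rewriting: turn a follow-up into a standalone question."""
    if not history:
        return question  # nothing to resolve against
    prompt = (
        "Rewrite the final question so it is self-contained, resolving "
        "pronouns and references using the conversation.\n\n"
        + "\n".join(history)
        + f"\n\nFinal question: {question}\nStandalone question:"
    )
    return llm(prompt).strip()


def hyde_vector(question: str, llm: LLM, embed: Embedder) -> List[float]:
    """HyDE: embed a hypothetical answer instead of the short question."""
    hypothetical = llm(f"Write a short passage that would answer: {question}")
    return embed(hypothetical)
```

The vector returned by `hyde_vector` is used in place of the question embedding for the similarity search, on the assumption that a passage-shaped text lands closer to stored passages than a terse question does.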

Result Aggregation

When running multiple queries:

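  • Reciprocal Rank Fusion (RRF): Score each chunk by summing 1/(k + rank) over the result sets it appears in, so chunks ranked highly by several queries rise to the top
  • Score normalization: Rescale similarity scores to a common range (for example, min-max per result set), then combine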
  • De-duplication: Remove duplicate chunks by ID before final ranking
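
A minimal sketch of these three steps in plain Python follows. It assumes each result set is a ranked list of chunk IDs (best first); `k=60` is a commonly used RRF constant rather than a value taken from this document, and the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(result_lists: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of chunk IDs; returns IDs sorted by fused score."""
    scores: Dict[str, float] = defaultdict(float)
    for results in result_lists:
        seen = set()
        rank = 0
        for chunk_id in results:
            if chunk_id in seen:
                continue  # de-duplicate within a single result list
            seen.add(chunk_id)
            rank += 1     # ranks are 1-based
            scores[chunk_id] += 1.0 / (k + rank)
    # Keying by chunk ID de-duplicates across lists; ties break alphabetically.
    return sorted(scores, key=lambda c: (-scores[c], c))


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one result set's similarity scores to [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {cid: 1.0 for cid in scores}  # all scores equal
    return {cid: (s - lo) / (hi - lo) for cid, s in scores.items()}
```

RRF needs only ranks, which makes it a reasonable default when the queries come from different retrievers whose raw scores are not comparable. Score normalization keeps more information but depends on each result set having a meaningful score spread.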

Reference

For code examples and implementation details of each pattern, see patterns.md.