AgentSkillsCN

pgvector-search

采用 PGVector HNSW 与 BM25 结合的混合搜索方案,辅以倒数排名融合算法。当您需要实现混合搜索、语义检索与关键词检索相结合、优化向量检索、进行元数据筛选,或在 HNSW 与 IVFFlat 索引之间做出选择时,可使用此技能。

SKILL.md
--- frontmatter
name: pgvector-search
description: Production hybrid search combining PGVector HNSW with BM25 using Reciprocal Rank Fusion. Use when implementing hybrid search, semantic + keyword retrieval, vector search optimization, metadata filtering, or choosing between HNSW and IVFFlat indexes.
context: fork
agent: database-engineer
version: 1.2.0
author: OrchestKit AI Agent Hub
tags: [pgvector-0.8, hybrid-search, bm25, rrf, semantic-search, retrieval, 2026]
user-invocable: false

PGVector Hybrid Search

Production-grade semantic + keyword search using PostgreSQL

Overview

Architecture:

code
Query
  |
[Generate embedding] --> Vector Search (PGVector) --> Top 30 results
  |
[Generate ts_query]  --> Keyword Search (BM25)    --> Top 30 results
  |
[Reciprocal Rank Fusion (RRF)] --> Merge & re-rank --> Top 10 final results

When to use this skill:

  • Building semantic search (RAG, knowledge bases, recommendations)
  • Implementing hybrid retrieval (vector + keyword)
  • Optimizing PGVector performance
  • Working with large document collections (1M+ chunks)

Quick Reference

Search Type Comparison

AspectSemantic (Vector)Keyword (BM25)
QueryEmbedding similarityExact word matches
StrengthsSynonyms, conceptsExact phrases, rare terms
WeaknessesExact matches, technical termsNo semantic understanding
IndexHNSW (pgvector)GIN (tsvector)

Index Comparison

MetricIVFFlatHNSW
Query speed50ms3ms (17x faster)
Index time2 min20 min
Best for< 100k vectors100k+ vectors
Recall@100.85-0.950.95-0.99

Recommendation: Use HNSW for production (scales to millions).

RRF Formula

python
rrf_score = 1/(k + vector_rank) + 1/(k + keyword_rank)  # k=60 (standard)

Database Schema

sql
CREATE TABLE chunks (
    id UUID PRIMARY KEY,
    document_id UUID REFERENCES documents(id),
    content TEXT NOT NULL,
    embedding vector(1024),  -- PGVector
    content_tsvector tsvector GENERATED ALWAYS AS (
        to_tsvector('english', content)
    ) STORED,
    section_title TEXT,
    content_type TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Indexes
CREATE INDEX idx_chunks_embedding ON chunks
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

CREATE INDEX idx_chunks_content_tsvector ON chunks
    USING gin (content_tsvector);

Hybrid Search Query (SQLAlchemy)

python
async def hybrid_search(
    query: str,
    query_embedding: list[float],
    top_k: int = 10
) -> list[Chunk]:
    FETCH_MULTIPLIER = 3  # Fetch 30 for better RRF coverage
    K = 60  # RRF smoothing constant

    # Vector search subquery
    vector_subq = (
        select(Chunk.id,
            func.row_number().over(
                order_by=Chunk.embedding.cosine_distance(query_embedding)
            ).label("vector_rank"))
        .limit(top_k * FETCH_MULTIPLIER)
        .subquery()
    )

    # Keyword search subquery
    ts_query = func.plainto_tsquery("english", query)
    keyword_subq = (
        select(Chunk.id,
            func.row_number().over(
                order_by=func.ts_rank_cd(Chunk.content_tsvector, ts_query).desc()
            ).label("keyword_rank"))
        .where(Chunk.content_tsvector.op("@@")(ts_query))
        .limit(top_k * FETCH_MULTIPLIER)
        .subquery()
    )

    # RRF fusion with FULL OUTER JOIN
    rrf_subq = (
        select(
            func.coalesce(vector_subq.c.id, keyword_subq.c.id).label("chunk_id"),
            (func.coalesce(1.0 / (K + vector_subq.c.vector_rank), 0.0) +
             func.coalesce(1.0 / (K + keyword_subq.c.keyword_rank), 0.0)
            ).label("rrf_score"))
        .select_from(vector_subq.outerjoin(keyword_subq, ..., full=True))
        .order_by("rrf_score DESC")
        .limit(top_k)
        .subquery()
    )

    return await session.execute(
        select(Chunk).join(rrf_subq, Chunk.id == rrf_subq.c.chunk_id)
    )

Common Patterns

Filtered Search

python
results = await hybrid_search(
    query="binary search",
    query_embedding=embedding,
    content_type_filter=["code_block"]
)

Similarity Threshold

python
results = await hybrid_search(query, embedding, top_k=50)
filtered = [r for r in results if (1 - r.vector_distance) >= 0.75][:10]

Multi-Query Retrieval

python
queries = ["machine learning", "ML algorithms", "neural networks"]
all_results = [await hybrid_search(q, embed(q)) for q in queries]
final = deduplicate_and_rerank(all_results)

Performance Tips

  1. Pre-compute tsvector - 5-10x faster than to_tsvector() at query time
  2. Use HNSW index - 17x faster queries than IVFFlat
  3. 3x fetch multiplier - Better RRF coverage (30 results per search for top 10)
  4. Iterative scan for filtered queries - Set hnsw.iterative_scan = 'relaxed_order'
  5. Metadata boosting - +6% MRR with title/path matching

Redis 8 FT.HYBRID Alternative

Redis 8.4 introduced FT.HYBRID with native hybrid search support. Consider this alternative when latency is critical.

FT.HYBRID Command Syntax

redis
FT.HYBRID index query
  VECTOR $embedding AS vec_score
  TEXT "search terms" AS text_score
  COMBINE RRF
  LIMIT 0 10
  RETURN 3 content title score

Comparison: pgvector vs Redis 8 FT.HYBRID

AspectpgvectorRedis 8 FT.HYBRID
Setup complexityMedium (extensions, indexes)Low (single FT.CREATE)
RRF implementationManual SQL with FULL OUTER JOINNative COMBINE RRF
Keyword scoringts_rank_cd (TF-IDF variant)BM25STD (configurable)
Typical latency5-20ms2-5ms
PersistenceACID transactionsAOF/RDB (configurable)
Max datasetBillions (with partitioning)Memory-bound (~100M vectors)
Joins/relationsFull SQL supportLimited (hash/JSON docs)
Operational maturityBattle-testedNew (8.4, 2024)

When to Choose pgvector

  • Already using PostgreSQL - No new infrastructure, single source of truth
  • Need ACID transactions - Atomic updates across search index and relational data
  • Complex joins with relational data - Foreign keys, aggregations, reporting
  • Large datasets - Billions of vectors with table partitioning
  • Compliance requirements - Established PostgreSQL security/audit tooling

When to Choose Redis 8

  • Need sub-5ms latency - Real-time autocomplete, typeahead, live recommendations
  • Caching layer with search - Already using Redis for cache, add search capability
  • Simpler deployment - Single binary, no extensions or complex configuration
  • Ephemeral search indexes - Rebuild from source of truth is acceptable
  • Memory budget allows - Dataset fits comfortably in RAM

References

Detailed Implementation Guides

ReferenceDescriptionUse When
index-strategies.mdHNSW vs IVFFlat, tuning, iterative scansChoosing/optimizing indexes
hybrid-search-rrf.mdRRF algorithm, SQL implementation, debuggingImplementing hybrid search
metadata-filtering.mdPre/post filtering, score boostingImproving relevance

External Resources

Related Skills

  • ai-native-development - Embeddings and vector concepts
  • database-schema-designer - Schema design for vector search

Version: 1.2.0 | Status: Production-ready | Updated: pgvector 0.8.1


Capability Details

hybrid-search-rrf

Keywords: hybrid search, rrf, reciprocal rank fusion, vector bm25, semantic keyword search Solves:

  • How do I combine vector and keyword search?
  • Implement hybrid retrieval with RRF
  • Merge semantic and BM25 results

semantic-search

Keywords: semantic search, vector similarity, embedding, nearest neighbor, cosine distance Solves:

  • How does semantic search work?
  • When to use semantic vs keyword search
  • Semantic search strengths and weaknesses

keyword-search-bm25

Keywords: bm25, full-text search, tsvector, tsquery, keyword search Solves:

  • How does BM25 keyword search work?
  • Implement PostgreSQL full-text search
  • BM25 vs semantic search trade-offs

rrf-algorithm

Keywords: rrf, reciprocal rank fusion, rank-based fusion, score normalization Solves:

  • How does Reciprocal Rank Fusion work?
  • Why use rank instead of scores?
  • RRF smoothing constant (k parameter)

database-schema

Keywords: pgvector schema, chunk table, embedding column, tsvector, generated column Solves:

  • How do I design schema for hybrid search?
  • Store embeddings with vector(1024)
  • Pre-compute tsvector for performance

search-query-implementation

Keywords: hybrid search query, sqlalchemy, vector distance, ts_rank_cd, full outer join Solves:

  • How do I write hybrid search SQL?
  • Implement RRF in SQLAlchemy
  • Use fetch multiplier for better coverage

indexing-strategies

Keywords: pgvector index, hnsw, ivfflat, vector index performance, index tuning Solves:

  • HNSW vs IVFFlat comparison
  • Optimize vector search speed
  • Scale to millions of vectors

pre-computed-tsvector

Keywords: tsvector, gin index, full-text index, pre-computed column, generated column Solves:

  • Optimize keyword search performance
  • 5-10x speedup with indexed tsvector

metadata-filtering

Keywords: metadata filter, faceted search, content type filter, score boosting Solves:

  • Filter search by metadata
  • Boost results by section title
  • Pre-filter by content type

common-patterns

Keywords: filtered search, similarity threshold, multi-query retrieval, search patterns Solves:

  • Filter search by content type
  • Set minimum similarity threshold
  • Implement multi-query retrieval

golden-dataset-testing

Keywords: golden dataset, search evaluation, pass rate, mrr, retrieval testing Solves:

  • Test hybrid search quality
  • Evaluate search with golden queries
  • Calculate pass rate and MRR metrics