Analyze RAG (Retrieval-Augmented Generation) implementations for anti-patterns, performance issues, and best-practice violations.

When to Use

What This Skill Does

Audit Categories

1. Chunking Strategy

2. Embedding Configuration

3. Vector Store Setup

4. Retrieval Pipeline

5. Generation Configuration

6. Production Readiness

How to Run an Audit

Step 1: Discover RAG Code

code

Patterns to search:
- "embedding" OR "embeddings" OR "embed("
- "vector" OR "vectorstore" OR "vector_store"
- "qdrant" OR "pinecone" OR "chroma" OR "weaviate" OR "milvus"
- "chunk" OR "chunking" OR "split" OR "splitter"
- "retriev" OR "search" OR "query"
- "langchain" OR "llamaindex" OR "haystack"
- "openai.embed" OR "cohere.embed" OR "voyageai"
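The pattern list above can be turned into a small scanner. The following is a minimal sketch, assuming a Python codebase; the function names `find_rag_lines` and `scan_repo` and the default `.py` suffix filter are illustrative and not part of the skill itself.

```python
import re
from pathlib import Path

# Search terms taken from the pattern list above. They are escaped so that
# characters such as "(" and "." are matched literally, then joined into one
# case-insensitive regex.
RAG_TERMS = [
    "embedding", "embed(", "vector", "vectorstore", "vector_store",
    "qdrant", "pinecone", "chroma", "weaviate", "milvus",
    "chunk", "split", "retriev", "search", "query",
    "langchain", "llamaindex", "haystack",
    "openai.embed", "cohere.embed", "voyageai",
]
RAG_PATTERN = re.compile("|".join(re.escape(t) for t in RAG_TERMS), re.IGNORECASE)


def find_rag_lines(source: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_term) for the first match on each line."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = RAG_PATTERN.search(line)
        if match:
            hits.append((lineno, match.group(0).lower()))
    return hits


def scan_repo(root: str, suffixes: tuple[str, ...] = (".py",)) -> dict[str, list[tuple[int, str]]]:
    """Walk `root` and collect hits per file. The suffix filter is an assumption."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = find_rag_lines(path.read_text(errors="ignore"))
            if hits:
                results[str(path)] = hits
    return results
```

Broad terms such as "search" and "query" will produce false positives, so the hits are best treated as a shortlist of files to read, not as findings.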

Step 2: Analyze Each Component

Step 3: Generate Report

markdown

# RAG Audit Report

## Summary
- **Files Analyzed**: N
- **Issues Found**: T (C critical, W warnings, S suggestions)
- **Overall Score**: P/100

## Critical Issues
[Issues that will cause failures or severe degradation]

## Warnings
[Issues that impact quality or performance]

## Suggestions
[Optimizations and best practices]

## Detailed Findings

### [Component Name]
**Location**: `path/to/file.py:line`
**Issue**: [Description]
**Impact**: [What goes wrong]
**Fix**: [How to fix with code example]
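If findings are collected programmatically, the template above can be filled in by a small renderer. This is a sketch under assumptions: the `Finding` dataclass, its field names, and `render_report` are hypothetical, and the score is passed in by the caller because the template does not define how it is computed.

```python
from dataclasses import dataclass

# Severity key -> section heading used in the report template.
SECTIONS = {
    "critical": "Critical Issues",
    "warning": "Warnings",
    "suggestion": "Suggestions",
}


@dataclass
class Finding:
    component: str
    location: str   # e.g. "path/to/file.py:12"
    issue: str
    impact: str
    fix: str
    severity: str   # one of the keys in SECTIONS


def render_report(findings: list[Finding], files_analyzed: int, score: int) -> str:
    """Render findings into the markdown layout of the report template."""
    counts = {s: sum(1 for f in findings if f.severity == s) for s in SECTIONS}
    lines = [
        "# RAG Audit Report", "", "## Summary",
        f"- **Files Analyzed**: {files_analyzed}",
        f"- **Issues Found**: {len(findings)} ({counts['critical']} critical, "
        f"{counts['warning']} warnings, {counts['suggestion']} suggestions)",
        f"- **Overall Score**: {score}/100",
    ]
    # One short section per severity, listing the issue titles.
    for severity, heading in SECTIONS.items():
        lines += ["", f"## {heading}"]
        lines += [f"- {f.issue} (`{f.location}`)" for f in findings if f.severity == severity]
    # Full detail for every finding.
    lines += ["", "## Detailed Findings"]
    for f in findings:
        lines += [
            "", f"### {f.component}",
            f"**Location**: `{f.location}`",
            f"**Issue**: {f.issue}",
            f"**Impact**: {f.impact}",
            f"**Fix**: {f.fix}",
        ]
    return "\n".join(lines)
```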

Common Anti-Patterns to Flag

1. No Chunk Overlap

python

# BAD: No overlap causes context loss at boundaries
chunks = text_splitter.split(text, chunk_size=1000, overlap=0)

# GOOD: 10-20% overlap preserves context
chunks = text_splitter.split(text, chunk_size=1000, overlap=150)
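To make the effect of overlap concrete, here is a minimal character-based sliding-window splitter. It is a sketch, not the API of any particular splitter library: `split_with_overlap` is a hypothetical name, and real splitters usually count tokens and respect sentence boundaries.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 150) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` characters.

    Each chunk repeats the last `overlap` characters of the previous one,
    so a sentence cut at a boundary still appears whole in one of the chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

With `chunk_size=1000` and `overlap=150` the window advances 850 characters per step, which gives the 15% overlap used in the GOOD example above.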

2. Hardcoded Top-K

python

# BAD: Fixed top-k regardless of query complexity
results = vectorstore.search(query, k=5)

# GOOD: Configurable top-k plus a score threshold to drop weak matches
results = vectorstore.search(query, k=top_k, score_threshold=min_score)
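Not every vector store accepts a score threshold directly, so the same idea can be applied after the search returns. The sketch below assumes results arrive as `(document, score)` pairs where a higher score means more similar; `RetrievalConfig` and `select_results` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalConfig:
    top_k: int = 10                # upper bound on results passed downstream
    score_threshold: float = 0.7   # minimum similarity to keep a result


def select_results(scored, config: RetrievalConfig = RetrievalConfig()):
    """Keep results at or above the threshold, best first, capped at top_k.

    `scored` is a list of (document, score) pairs; higher score = more similar.
    """
    kept = [(doc, score) for doc, score in scored if score >= config.score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:config.top_k]
```

A useful threshold depends on the embedding model and distance metric, so 0.7 should be read as a starting point to tune, not a recommended value.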

3. No Reranking

python

# BAD: Using raw vector similarity scores only
docs = vectorstore.similarity_search(query, k=5)
context = "\n".join([d.content for d in docs])

# GOOD: Rerank for relevance before using
docs = vectorstore.similarity_search(query, k=20)
reranked = reranker.rerank(query, docs, top_k=5)
context = "\n".join([d.content for d in reranked])
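The retrieve-then-rerank flow can be tried without a model. The sketch below uses a toy lexical-overlap score as a stand-in for a real reranker (typically a cross-encoder or a hosted rerank API); the scoring function is an assumption chosen only to keep the example self-contained.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of `text`."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rerank(query: str, docs: list[str], top_k: int = 5) -> list[str]:
    """Order `docs` by the fraction of query tokens they contain.

    Toy scoring only: a production reranker would score each
    (query, document) pair with a trained model instead.
    """
    query_tokens = _tokens(query)

    def score(doc: str) -> float:
        if not query_tokens:
            return 0.0
        return len(query_tokens & _tokens(doc)) / len(query_tokens)

    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(docs, key=score, reverse=True)[:top_k]
```

The pattern is the same whichever scorer is used: over-fetch candidates from the vector store, rescore them against the query, and keep only the best few for the prompt.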

4. Ignoring Metadata

python

# BAD: Storing only text
vectorstore.add(texts=chunks)

# GOOD: Preserve source metadata for citations
vectorstore.add(
    texts=chunks,
    metadatas=[{"source": doc.name, "page": i, "chunk_id": j} for ...]
)
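One way to build the `texts` and `metadatas` lists in lockstep is sketched below. It assumes the document is already available as a list of page strings; `build_chunks_with_metadata` is a hypothetical helper, and the fixed-size character split is only there to keep the example short.

```python
def build_chunks_with_metadata(source: str, pages: list[str], chunk_size: int = 1000):
    """Split each page into chunks and record where every chunk came from.

    Returns (texts, metadatas), two lists of equal length. Page numbers are
    1-based; chunk_id is the chunk's position across the whole document.
    """
    texts, metadatas = [], []
    for page_no, page_text in enumerate(pages, start=1):
        for start in range(0, len(page_text), chunk_size):
            texts.append(page_text[start:start + chunk_size])
            metadatas.append({
                "source": source,
                "page": page_no,
                "chunk_id": len(texts) - 1,
            })
    return texts, metadatas
```

Because the two lists are built together, index `n` in `metadatas` always describes index `n` in `texts`, which is what citation rendering relies on later.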

5. No Error Handling

python

# BAD: Unhandled failures
response = llm.generate(prompt)

# GOOD: Graceful degradation
try:
    response = llm.generate(prompt)
except RateLimitError:
    response = fallback_response(query)
except Exception as e:
    logger.error(f"Generation failed: {e}")
    response = "I couldn't process your request. Please try again."
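Rate limits are often transient, so retrying with exponential backoff before falling back is a common refinement. The sketch below is one possible shape: `RateLimitError` is a local placeholder for whatever exception the LLM client raises, and `generate_with_retry` and its parameters are hypothetical.

```python
import logging
import time

logger = logging.getLogger("rag")

FALLBACK_MESSAGE = "I couldn't process your request. Please try again."


class RateLimitError(Exception):
    """Placeholder for the provider-specific rate-limit exception."""


def generate_with_retry(generate, prompt, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `generate(prompt)`, retrying on rate limits with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    Any other exception is logged and ends the loop immediately.
    Returns FALLBACK_MESSAGE if no attempt succeeds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return generate(prompt)
        except RateLimitError:
            if attempt == max_attempts:
                break  # out of attempts, fall through to the fallback
            sleep(base_delay * 2 ** (attempt - 1))
        except Exception:
            logger.exception("Generation failed")
            break
    return FALLBACK_MESSAGE
```

The `sleep` parameter exists so tests can inject a fake and avoid real delays.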

6. Context Window Overflow

python

# BAD: Stuffing all retrieved docs without checking
context = "\n".join([doc.content for doc in all_docs])
prompt = f"Context: {context}\nQuestion: {query}"

# GOOD: Respect token limits
max_context_tokens = 3000
context = truncate_to_tokens(all_docs, max_context_tokens)
prompt = f"Context: {context}\nQuestion: {query}"
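The GOOD example calls `truncate_to_tokens` without defining it. A minimal sketch of such a helper follows, assuming documents are plain strings ordered best-first. The whitespace-based `count_tokens` is a crude approximation; in practice the tokenizer of the target model should be passed in as `count`.

```python
def count_tokens(text: str) -> int:
    """Crude token estimate: number of whitespace-separated words."""
    return len(text.split())


def truncate_to_tokens(docs: list[str], max_tokens: int, count=count_tokens) -> str:
    """Join documents in order until adding the next one would exceed the budget.

    Whole documents are kept or dropped; nothing is cut mid-document.
    Assumes `docs` is already sorted from most to least relevant.
    """
    kept, used = [], 0
    for doc in docs:
        cost = count(doc)
        if used + cost > max_tokens:
            break
        kept.append(doc)
        used += cost
    return "\n".join(kept)
```

Dropping whole documents avoids feeding the model a passage that stops mid-sentence, at the cost of leaving some of the budget unused.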

7. Missing Hybrid Search

python

# BAD: Dense-only search misses keyword matches
results = vectorstore.similarity_search(query)

# GOOD: Combine dense + sparse for better recall
dense_results = vectorstore.similarity_search(query, k=10)
sparse_results = bm25.search(query, k=10)
results = reciprocal_rank_fusion(dense_results, sparse_results)
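`reciprocal_rank_fusion` in the example above is not defined. A minimal sketch of the technique follows, assuming each result list is a sequence of hashable document IDs ordered best-first. The constant `k=60` is the value commonly used for RRF and can be tuned.

```python
from collections import defaultdict


def reciprocal_rank_fusion(*rankings, k: int = 60) -> list:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Documents ranked well by several retrievers
    rise to the top; raw similarity scores are never compared directly.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
```

Because only ranks are used, dense cosine scores and BM25 scores do not need to be normalized onto a common scale before fusing.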

8. No Query Preprocessing

python

# BAD: Raw user query to embedding
embedding = embed(user_query)

# GOOD: Clean and optionally expand query
cleaned_query = preprocess(user_query)
# Optional: query expansion for better recall
expanded_queries = expand_query(cleaned_query)
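What `preprocess` and `expand_query` do depends on the application. As one possible sketch, the version below normalizes Unicode and whitespace, then expands the query from a caller-supplied synonym table. The `synonyms` argument is an assumption added for illustration; an LLM-based rewriter is another common choice.

```python
import re
import unicodedata


def preprocess(query: str) -> str:
    """Normalize Unicode (NFKC) and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", query)
    return re.sub(r"\s+", " ", text).strip()


def expand_query(query: str, synonyms: dict[str, list[str]]) -> list[str]:
    """Return extra query variants with known terms swapped for synonyms.

    `synonyms` maps a term to its alternatives, e.g. {"k8s": ["kubernetes"]}.
    Matching is case-insensitive and whole-word only. The original query
    is not included in the result.
    """
    variants = []
    for term, alternatives in synonyms.items():
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if pattern.search(query):
            for alt in alternatives:
                # A function replacement avoids backslash handling in `alt`.
                variants.append(pattern.sub(lambda match, alt=alt: alt, query))
    return variants
```

Each variant can then be embedded and searched separately, with the result lists merged before reranking.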

Reference Resources

Output Format