RAG Pipeline Skill

Name: rag-pipeline
Rating: 87
Author: muhammad-anas35

Quick Start Workflow

When building or maintaining the RAG pipeline:

•Content Ingestion (one-time setup)
  - Read all Docusaurus Markdown files from /docs
  - Chunk the text (800 characters, 200-character overlap)
  - Generate embeddings with OpenAI text-embedding-ada-002
  - Upsert to Qdrant with metadata
•Query Flow (runtime)
  - Receive user question
  - Generate query embedding
  - Search Qdrant (top 5 results, score >= 0.7)
  - Build context from relevant chunks
  - Pass to OpenAI GPT-4 with context
  - Return answer + sources
•Continuous Improvement
  - Monitor search quality (are results relevant?)
  - Adjust chunk size if needed
  - Update score thresholds
  - Add filters for specific chapters
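
The chunking step under Content Ingestion can be sketched as a sliding character window. This is a minimal sketch, assuming plain character offsets rather than sentence or heading boundaries; `chunk_text` is a hypothetical helper name and is not taken from the skill's reference scripts.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # 600 characters with the default parameters
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once this window has reached the end of the text,
        # so no trailing chunk consists only of already-covered overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the defaults, each window starts 600 characters after the previous one, so a 2,000-character page yields three chunks, and the last 200 characters of each chunk reappear at the start of the next.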

Standard Architecture

User Question
    ↓
[Generate Embedding]
    ↓ 
[Search Qdrant]
    ↓
[Extract Top 5 Chunks]
    ↓
[Build Context String]
    ↓
[GPT-4 with Context]
    ↓
AI Answer + Sources
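
The flow in the diagram can be sketched as a single function. This is an illustrative sketch, not the skill's actual query pipeline: `Hit`, `answer_question`, and the three injected callables (`embed`, `search`, `generate`) are hypothetical names that stand in for the OpenAI embedding call, the Qdrant search, and the GPT-4 completion, so the control flow can be read and tested without network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hit:
    """One search result; the field names are illustrative, not Qdrant's schema."""
    text: str
    score: float
    source: str


def answer_question(
    question: str,
    embed: Callable[[str], list[float]],
    search: Callable[[list[float], int], list[Hit]],
    generate: Callable[[str, str], str],
    limit: int = 5,
    score_threshold: float = 0.7,
) -> dict:
    """Run embedding -> search -> context -> generation and return answer plus sources."""
    query_vector = embed(question)

    # Keep only hits at or above the score threshold.
    hits = [h for h in search(query_vector, limit) if h.score >= score_threshold]
    if not hits:
        # Skip the LLM call entirely when nothing relevant was retrieved.
        return {"answer": "No relevant passages were found for this question.", "sources": []}

    # Label each chunk with its source so the answer can cite it.
    context = "\n\n".join(f"[{h.source}]\n{h.text}" for h in hits)
    return {"answer": generate(question, context), "sources": [h.source for h in hits]}
```

Passing the external calls in as functions is a design choice for this sketch only; it keeps the threshold filter and the empty-result branch easy to unit test with stubs.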

Key Parameters

•Chunk size: 800 characters
•Overlap: 200 characters
•Embedding model: text-embedding-ada-002
•LLM: gpt-4 or gpt-3.5-turbo
•Search limit: 5 chunks
•Score threshold: 0.7
•Context budget: ~3000 tokens max for retrieved chunks
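
One way to enforce the ~3000-token limit is to add ranked chunks until an approximate budget is used up. This is a rough sketch: `build_context` is a hypothetical helper, and the 4-characters-per-token ratio is only a rule-of-thumb assumption for English text, not an exact tokenizer count.

```python
def build_context(chunks: list[str], max_tokens: int = 3000, chars_per_token: int = 4) -> str:
    """Join ranked chunks, stopping before the approximate token budget is exceeded."""
    budget = max_tokens * chars_per_token  # approximate budget in characters (assumption)
    separator = "\n\n"
    selected: list[str] = []
    used = 0
    for chunk in chunks:  # chunks are expected in descending relevance order
        cost = len(chunk) + (len(separator) if selected else 0)
        if used + cost > budget:
            break
        selected.append(chunk)
        used += cost
    return separator.join(selected)
```

Under that same assumption, five 800-character chunks come to about 4,000 characters, or roughly 1,000 tokens, so the default search limit fits well inside the limit.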

Best Practices

For Physical AI textbook RAG:

•Preserve code blocks when chunking
•Include chapter/section in metadata
•Cite sources in responses
•Cache embeddings for popular queries
•Log all queries for analytics
•Handle "no results" gracefully
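
The first practice, preserving code blocks when chunking, can be approximated by splitting each Markdown file into prose and code segments before any size-based chunking. This is a minimal sketch under stated assumptions: `split_preserving_code` is a hypothetical helper, only backtick fences are recognized (not tilde fences or indented code), and fences are assumed to be properly closed.

```python
FENCE = "`" * 3  # Markdown code-fence marker, built indirectly to keep this example readable


def split_preserving_code(markdown: str) -> list[str]:
    """Split Markdown into segments so that each fenced code block stays in one piece."""
    segments: list[str] = []
    buffer: list[str] = []
    in_code = False

    def flush() -> None:
        # Emit whatever has been collected so far as one segment.
        if buffer:
            segments.append("\n".join(buffer))
            buffer.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            if not in_code:
                flush()              # close the prose segment before the code block
                buffer.append(line)  # opening fence
                in_code = True
            else:
                buffer.append(line)  # closing fence
                flush()              # emit the whole code block as one segment
                in_code = False
        else:
            buffer.append(line)
    flush()
    return segments
```

Prose segments can then go through the usual size-based chunking, while code segments are kept whole or handled separately if they exceed the chunk size.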

Knowledge Base

Detailed guides available:

•Chunking Strategies → references/chunking.md
•Ingestion Script → references/ingestion-script.md
•Query Pipeline → references/query-pipeline.md
•Context Building → references/context-building.md
•Error Handling → references/error-handling.md