RAG System Administration
Manage the Retrieval-Augmented Generation (RAG) system that powers knowledge base search and question answering. This is an advanced skill for optimizing search quality and managing the vector database.
Overview
The RAG system consists of:
- Vector Store - Stores embeddings for semantic search
- Indexes - Organized collections of embeddings
- Settings - Configuration for chunking, embedding models, and search parameters
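As a rough illustration of what "semantic search over embeddings" means, the sketch below ranks stored chunks by cosine similarity to a query vector. The similarity metric, the in-memory `store` dictionary, and the two-dimensional vectors are all assumptions for illustration; the actual vector store and its metric are not specified here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "vector store": chunk ID -> embedding (made-up 2-D vectors).
store = {"chunk_1": [1.0, 0.0], "chunk_2": [0.6, 0.8], "chunk_3": [0.0, 1.0]}
query = [1.0, 0.0]

# Rank chunk IDs from most to least similar to the query.
ranked = sorted(store, key=lambda cid: cosine_similarity(query, store[cid]), reverse=True)
print(ranked)  # ['chunk_1', 'chunk_2', 'chunk_3']
```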
Core Capabilities
| Tool | Purpose | When to Use |
|---|---|---|
| reindex_collection | Rebuild the entire RAG index | After major changes, new embedding model, or data issues |
| optimize_search | Tune search parameters | When search results aren't relevant enough |
| create_custom_index | Create specialized topic indexes | For focused research areas needing fast retrieval |
| search_custom_index | Search within custom indexes | Query specialized indexes |
| list_custom_indexes | View available custom indexes | Check what indexes exist |
When to Use This Skill
Use RAG administration when the user:
- Reports poor search results or irrelevant answers
- Wants to improve search quality
- Has a large collection that needs optimization
- Wants to create focused indexes for specific topics
- Is changing RAG settings and needs to reindex
Reindexing the Collection
When to Reindex
Reindexing is necessary when:
- The embedding model has changed in settings
- Chunking parameters have changed
- The index appears corrupted
- A significant portion of articles was added or removed
- Search quality has degraded significantly
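The first two triggers can be checked mechanically by comparing the settings the index was built with against the current ones. The sketch below assumes both snapshots are available as plain dictionaries shaped like the rag settings section; `needs_reindex` and `REINDEX_KEYS` are hypothetical names, not part of the tool set.

```python
# Hypothetical helper. Which settings force a reindex is an assumption
# drawn from the list above (embedding model and chunking parameters).
REINDEX_KEYS = ("embedding_model", "chunk_size", "chunk_overlap")


def needs_reindex(indexed_with: dict, current: dict) -> list[str]:
    """Return the names of reindex-relevant settings that differ."""
    return [
        key for key in REINDEX_KEYS
        if indexed_with.get(key) != current.get(key)
    ]


old = {"embedding_model": "text-embedding-3-small", "chunk_size": 1000, "chunk_overlap": 200}
new = {"embedding_model": "text-embedding-3-large", "chunk_size": 1000, "chunk_overlap": 200}

changed = needs_reindex(old, new)
print(changed)  # ['embedding_model']
```

An empty list means none of the tracked settings changed; corruption or degraded quality still has to be judged separately.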
Reindex Workflow
code
Step 1: Check current settings
view_settings(section="rag")
→ Shows current embedding model, chunk size, overlap

Step 2: (Optional) Adjust settings if needed
update_settings(
    section="rag",
    updates={
        "chunk_size": 1000,
        "chunk_overlap": 200,
        "embedding_model": "text-embedding-3-small"
    }
)

Step 3: Run reindex
reindex_collection(
    force=true,      # Force even if index exists
    batch_size=100   # Process in batches
)

Step 4: Monitor progress
get_task_status(task_type="reindex")
Reindex Considerations
- Time: A full reindex can take 10-60 minutes depending on collection size
- Cost: Embeddings are regenerated for all content (API costs apply if using OpenAI)
- Availability: Search keeps working during a reindex, served from the old index
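To make the batch_size parameter concrete, here is a minimal sketch of splitting a collection into batches. How reindex_collection batches internally is not documented here, so `batched` and the article IDs are purely illustrative.

```python
from typing import Iterator, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Illustrative collection of 250 article IDs (made-up data).
article_ids = [f"article-{i}" for i in range(250)]
batches = list(batched(article_ids, batch_size=100))
print([len(b) for b in batches])  # [100, 100, 50]
```

Smaller batches mean more round trips but less work lost if a single batch fails.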
Search Optimization
Tuning Search Parameters
code
optimize_search(
    min_relevance_score=0.7,  # Filter threshold
    max_results=20,           # Maximum results per query
    hybrid_search=true,       # Combine semantic + keyword
    rerank=true               # Use reranking model
)
Understanding Search Parameters
| Parameter | Effect | Recommended |
|---|---|---|
| min_relevance_score | Higher = stricter matching | 0.7 for precision, 0.5 for recall |
| max_results | Limit retrieved chunks | 10-20 for Q&A, 50+ for synthesis |
| hybrid_search | Adds keyword matching | True for technical terms |
| rerank | Re-score top results | True for higher quality |
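The interaction between min_relevance_score and max_results can be sketched as a filter followed by a cut-off. This assumes scores are similarities in [0, 1] where higher is better; the `Hit` class and `apply_search_limits` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    chunk_id: str
    score: float  # assumed: similarity in [0, 1], higher is better


def apply_search_limits(hits: list[Hit], min_relevance_score: float, max_results: int) -> list[Hit]:
    """Drop hits below the threshold, then keep the best max_results."""
    kept = [h for h in hits if h.score >= min_relevance_score]
    kept.sort(key=lambda h: h.score, reverse=True)
    return kept[:max_results]


hits = [Hit("a", 0.91), Hit("b", 0.72), Hit("c", 0.64), Hit("d", 0.55)]
precise = apply_search_limits(hits, min_relevance_score=0.7, max_results=20)
broad = apply_search_limits(hits, min_relevance_score=0.5, max_results=20)
print([h.chunk_id for h in precise])  # ['a', 'b']
print([h.chunk_id for h in broad])    # ['a', 'b', 'c', 'd']
```

With the stricter threshold only two chunks survive, which is why lowering it is the first thing to try when results seem to be missing.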
Diagnosing Search Issues
code
1. Check index status
list_custom_indexes()
→ See what indexes exist and their status

2. Test search
search_articles(query="test query")
→ Check if results are relevant

3. Adjust thresholds
optimize_search(min_relevance_score=0.6)
→ Lower threshold for more results

4. Verify settings
view_settings(section="rag")
→ Confirm configuration is correct
Custom Indexes
Use Cases for Custom Indexes
- Topic Focus: Index only papers on "machine learning" for faster search
- Time Range: Index only recent papers (2023-2024)
- Author Collection: Index papers by specific research groups
- Project Specific: Index papers for a particular research project
Creating a Custom Index
code
create_custom_index(
    name="ml_transformers_2024",
    description="Transformer papers from 2024",
    filter_criteria={
        "tags": ["transformers", "attention"],
        "year_min": 2024
    },
    include_full_text=true
)
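How filter_criteria selects articles is not spelled out here. One plausible reading, sketched below, is that an article qualifies if it carries at least one of the listed tags and its year is at or above year_min. The `matches` function and the article records are hypothetical and only illustrate that assumed semantics.

```python
def matches(article: dict, filter_criteria: dict) -> bool:
    """Assumed semantics: any listed tag matches, and year >= year_min."""
    wanted_tags = set(filter_criteria.get("tags", []))
    if wanted_tags and not wanted_tags & set(article.get("tags", [])):
        return False
    year_min = filter_criteria.get("year_min")
    if year_min is not None and article.get("year", 0) < year_min:
        return False
    return True


# Made-up article records for illustration.
articles = [
    {"title": "A", "tags": ["transformers"], "year": 2024},
    {"title": "B", "tags": ["attention"], "year": 2022},
    {"title": "C", "tags": ["diffusion"], "year": 2024},
]
criteria = {"tags": ["transformers", "attention"], "year_min": 2024}
selected = [a["title"] for a in articles if matches(a, criteria)]
print(selected)  # ['A']
```

If the real tool requires all tags to match instead, the resulting index would be smaller, so it is worth checking the paper count returned after creation.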
Managing Custom Indexes
code
# List all indexes
list_custom_indexes()
→ Shows: main, ml_transformers_2024, protein_research

# Search specific index
search_custom_index(
    index_name="ml_transformers_2024",
    query="efficient attention mechanisms"
)
RAG Settings Reference
Key Settings (via view_settings/update_settings)
json
{
  "rag": {
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 1000,
    "chunk_overlap": 200,
    "max_chunks_per_doc": 50,
    "min_chunk_size": 100,
    "collection_name": "thoth_articles"
  },
  "search": {
    "default_limit": 10,
    "min_relevance": 0.7,
    "hybrid_search": true,
    "rerank": false
  }
}
Changing Embedding Models
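Warning: Changing the embedding model requires a full reindex! Embeddings produced by different models are not comparable, so existing vectors cannot be searched with the new model.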
code
Step 1: Update setting
update_settings(
    section="rag",
    updates={"embedding_model": "text-embedding-3-large"}
)
Step 2: Force reindex
reindex_collection(force=true)
Chunking Strategy
| Collection Type | chunk_size | chunk_overlap | Reasoning |
|---|---|---|---|
| Short papers | 500 | 100 | Smaller chunks for precision |
| Long papers | 1500 | 300 | Larger context per chunk |
| Mixed | 1000 | 200 | Balanced default |
| Dense technical | 800 | 200 | More overlap for context |
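To show how chunk_size and chunk_overlap interact, here is a minimal sliding-window chunker. It measures sizes in characters, which is an assumption: the settings above do not say whether the unit is characters or tokens, and the real splitter may also respect sentence or paragraph boundaries.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# 2600 characters of filler text with a repeating digit pattern.
text = "".join(str(i % 10) for i in range(2600))
chunks = chunk_text(text, chunk_size=1000, chunk_overlap=200)
print([len(c) for c in chunks])  # [1000, 1000, 1000]
```

Each window starts chunk_size - chunk_overlap characters after the previous one, so raising the overlap increases the number of chunks (and embedding cost) for the same text.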
Workflow Examples
Example 1: Improve Search Quality
User: "My search results aren't very relevant"
code
1. Check current configuration
view_settings(section="rag")
→ Current: min_relevance=0.8, hybrid=false
2. Optimize parameters
optimize_search(
    min_relevance_score=0.6,
    hybrid_search=true,
    rerank=true
)
3. Test search
search_articles(query="user's topic")
→ Better results
4. Response:
"I've adjusted your search settings:
- Lowered relevance threshold (0.8 → 0.6) for more results
- Enabled hybrid search for better keyword matching
- Enabled reranking for higher quality top results
Try your search again and let me know if it's improved."
Example 2: Full Reindex After Settings Change
User: "I changed the embedding model, do I need to reindex?"
code
1. Check settings
view_settings(section="rag")
→ embedding_model: text-embedding-3-large
2. Confirm reindex is needed
"Yes, changing the embedding model requires a full reindex.
This will regenerate embeddings for all articles.
Current collection: ~500 papers
Estimated time: 15-20 minutes
Proceed with reindex?"
3. Execute reindex
reindex_collection(
    force=true,
    batch_size=50
)
4. Monitor
get_task_status(task_type="reindex")
→ Progress: 250/500 articles processed
5. Complete
"Reindex complete! All 500 articles now use text-embedding-3-large.
Search should now work with the new embeddings."
Example 3: Create Research Topic Index
User: "I want a focused index just for my reinforcement learning papers"
code
1. Create custom index
create_custom_index(
    name="reinforcement_learning",
    description="Papers on RL, policy gradients, and decision making",
    filter_criteria={
        "tags": ["reinforcement-learning", "rl", "policy-gradient", "q-learning"]
    }
)
→ Index created with 47 papers
2. Confirm
"Created 'reinforcement_learning' index with 47 papers.
To search this focused index:
- Use search_custom_index with index_name='reinforcement_learning'
This will give faster, more focused results for RL queries."
Best Practices
Performance
- Use batch_size of 50-100 for reindexing large collections
- Create custom indexes for frequently searched topics
- Enable reranking only if quality is more important than speed
Quality
- Use hybrid search for technical domains with specific terminology
- Lower the relevance threshold (0.5-0.6) if relevant results are missing
- Increase chunk overlap if context is being lost at chunk boundaries
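How hybrid search merges semantic and keyword results is not described here. Reciprocal rank fusion (RRF) is one widely used way to combine two rankings, shown below as a stand-in rather than as this system's actual method; the document IDs and both rankings are made up.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists; each appearance adds 1 / (k + rank), with rank starting at 1."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


semantic = ["doc_a", "doc_b", "doc_c"]  # made-up vector-search ranking
keyword = ["doc_c", "doc_a", "doc_d"]   # made-up keyword ranking
fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both rankings rise to the top, which is why hybrid search tends to help with exact technical terms that embeddings alone may blur.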
Maintenance
- Reindex after adding 100+ new articles
- Review custom indexes periodically for relevance
- Monitor search quality with user feedback
Troubleshooting
No Results Found
code
1. Check index exists: list_custom_indexes()
2. Lower threshold: optimize_search(min_relevance_score=0.4)
3. Verify articles exist: collection_stats()
4. Force reindex if corrupted: reindex_collection(force=true)
Slow Search
code
1. Check index size: list_custom_indexes()
2. Create focused index for common queries
3. Reduce max_results in optimize_search
4. Disable reranking for speed
Inconsistent Results
code
1. Check for duplicate articles in collection
2. Verify embedding model hasn't changed without reindex
3. Force clean reindex: reindex_collection(force=true)
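The duplicate check in step 1 has no dedicated tool listed above. If article metadata can be exported, a simple approach is to group articles by normalized title, as in the sketch below. The `find_duplicates` function and the `id`/`title` record shape are assumptions for illustration.

```python
from collections import defaultdict


def find_duplicates(articles: list[dict]) -> dict[str, list[str]]:
    """Group article IDs whose titles match after case and whitespace normalization."""
    groups: dict[str, list[str]] = defaultdict(list)
    for article in articles:
        key = " ".join(article["title"].lower().split())
        groups[key].append(article["id"])
    return {title: ids for title, ids in groups.items() if len(ids) > 1}


# Illustrative records; the second title differs only in case and spacing.
articles = [
    {"id": "1", "title": "Attention Is All You Need"},
    {"id": "2", "title": "attention is  all you need"},
    {"id": "3", "title": "Deep Residual Learning"},
]
duplicates = find_duplicates(articles)
print(duplicates)  # {'attention is all you need': ['1', '2']}
```

Title matching misses duplicates with differing titles (for example a preprint and its published version), so treat an empty result as a hint rather than proof.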