Qdrant File Exporter
Extracts every text chunk belonging to a specific PDF from the Qdrant vector database, together with its three-tier contextual embeddings (master, document, and chunk-specific context), and saves the result to a plain text file.
When to Use This Skill
Activate this skill when the user:
- •Wants to extract all chunks from a specific PDF
- •Asks to export, dump, or save chunks from a document
- •Needs to view all chunk content for a file
- •Wants to analyze chunks outside of the vector database
- •Needs to review or inspect the three-tier contextual embeddings
- •Wants to see how chunks were contextualized during loading
- •Says things like "export chunks from [filename]" or "dump all chunks for [file]"
How to Use
Step 1: Get the PDF Filename
Ask the user which PDF file they want to export chunks from. The filename should match exactly what's stored in the Qdrant doc field (e.g., "bcy-26-income-eligibility-and-maximum-psoc-twc.pdf").
If the user doesn't know the exact filename, suggest using the qdrant-chunk-retriever skill first to search for files.
Step 2: Run the Export Script
Execute the Python helper script:
python .claude/skills/qdrant-file-exporter/scripts/export_chunks.py "filename.pdf"
The script will:
- •Connect to Qdrant (tro-child-3-contextual collection)
- •Retrieve all chunks matching the filename
- •Save them to UTIL/[filename]_chunks.txt
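The connect-and-retrieve steps listed above might look roughly like the sketch below. It is an illustration, not the contents of export_chunks.py: the names build_doc_filter, fetch_chunks, and main, the page size, and the assumption that each point's payload carries doc and chunk_index fields are all hypothetical. The scroll call is the qdrant-client method for paging through stored points.

```python
import os


def build_doc_filter(filename):
    """Build a Qdrant filter matching points whose `doc` payload equals filename."""
    # Imported lazily so the paging helper below can be used without qdrant-client.
    from qdrant_client import models

    return models.Filter(
        must=[models.FieldCondition(key="doc", match=models.MatchValue(value=filename))]
    )


def fetch_chunks(client, collection, scroll_filter, page_size=256):
    """Page through every matching point and return payloads sorted by chunk_index."""
    payloads, offset = [], None
    while True:
        points, offset = client.scroll(
            collection_name=collection,
            scroll_filter=scroll_filter,
            limit=page_size,
            offset=offset,
            with_payload=True,
            with_vectors=False,
        )
        payloads.extend(point.payload for point in points)
        if offset is None:  # scroll returns None once the last page is reached
            break
    # Assumption: the loading pipeline stored chunk_index in each payload.
    return sorted(payloads, key=lambda payload: payload.get("chunk_index", 0))


def main(filename):
    """Hypothetical entry point wiring the two helpers together."""
    from qdrant_client import QdrantClient

    client = QdrantClient(
        url=os.environ["QDRANT_API_URL"], api_key=os.environ["QDRANT_API_KEY"]
    )
    chunks = fetch_chunks(client, "tro-child-3-contextual", build_doc_filter(filename))
    print(f"Retrieved {len(chunks)} chunks for {filename}")
```

Paging with scroll rather than a single large request keeps memory bounded for long documents, and sorting at the end restores document order even though Qdrant does not return points in chunk_index order.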
Step 3: Report Results
After the script completes, inform the user:
- •Total number of chunks extracted
- •Output file location
- •File size (if available)
- •That the master, document, and chunk-specific contexts are included
Example:
✅ Extracted 47 chunks from bcy-26-income-eligibility-and-maximum-psoc-twc.pdf
📄 Saved to: UTIL/bcy-26-income-eligibility-and-maximum-psoc-twc_chunks.txt
📊 Includes master context, document context, and chunk-specific contexts
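A report in this shape could be assembled with a small helper like the one below. This is a sketch only: build_report and human_size are hypothetical names, and the "📦 File size" line is an assumed wording for the optional size item, which the example above does not show.

```python
import os


def human_size(num_bytes):
    """Render a byte count with binary units, one decimal place above bytes."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{int(size)} B" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1024


def build_report(filename, chunk_count, output_path):
    """Return the multi-line summary shown to the user after an export."""
    lines = [
        f"✅ Extracted {chunk_count} chunks from {filename}",
        f"📄 Saved to: {output_path}",
    ]
    # File size is optional: include it only when the file exists on disk.
    if os.path.exists(output_path):
        lines.append(f"📦 File size: {human_size(os.path.getsize(output_path))}")
    lines.append("📊 Includes master context, document context, and chunk-specific contexts")
    return "\n".join(lines)
```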
Step 4: Offer Next Steps
Ask the user if they want to:
- •View the exported file
- •Export another PDF
- •Analyze the chunk content
Examples
Example 1: Basic Export
User: "Export all chunks from the income eligibility PDF"
Assistant: "Which PDF would you like to export? Please provide the exact filename."
User: "bcy-26-income-eligibility-and-maximum-psoc-twc.pdf"
Assistant: *Runs export script*
"✅ Extracted 47 chunks from bcy-26-income-eligibility-and-maximum-psoc-twc.pdf
📄 Saved to: UTIL/bcy-26-income-eligibility-and-maximum-psoc-twc_chunks.txt"
Example 2: Export for Analysis
User: "I need to see all the chunks for the PSOC chart document"
Assistant: *Runs export script with bcy-26-psoc-chart-twc.pdf*
"✅ Extracted 12 chunks from bcy-26-psoc-chart-twc.pdf
📄 Saved to: UTIL/bcy-26-psoc-chart-twc_chunks.txt
Would you like me to open the file or analyze the content?"
Error Handling
File Not Found in Qdrant
If the PDF filename doesn't match any documents:
- •Suggest the user check the filename spelling
- •Recommend using qdrant-chunk-retriever to search for available files
- •List similar filenames if possible
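One way to list similar filenames is fuzzy matching with Python's standard difflib module. The sketch below assumes a list of known filenames is already available (for example, gathered through qdrant-chunk-retriever); suggest_filenames and its default limit and cutoff are hypothetical choices.

```python
import difflib


def suggest_filenames(requested, known_filenames, limit=3, cutoff=0.6):
    """Return up to `limit` known filenames that closely resemble `requested`."""
    # get_close_matches orders results from most to least similar.
    return difflib.get_close_matches(requested, known_filenames, n=limit, cutoff=cutoff)


known = [
    "bcy-26-income-eligibility-and-maximum-psoc-twc.pdf",
    "bcy-26-psoc-chart-twc.pdf",
]
# A misspelled request ("eligibilty") still surfaces the intended file first.
suggestions = suggest_filenames("bcy-26-income-eligibilty-and-maximum-psoc-twc.pdf", known)
```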
Connection Errors
If Qdrant connection fails:
- •Check QDRANT_API_URL and QDRANT_API_KEY environment variables
- •Verify the collection name (tro-child-3-contextual) exists
- •Suggest running python LOAD_DB/verify_qdrant.py to check the connection
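The environment-variable check can be done before any network call is attempted. The helper below is a minimal sketch; missing_env_vars is a hypothetical name, and treating an empty value as missing is an assumption.

```python
import os

REQUIRED_VARS = ("QDRANT_API_URL", "QDRANT_API_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Checking the variables first separates configuration mistakes from genuine connectivity problems.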
No Chunks Found
If the file exists but has 0 chunks:
- •Verify the file was loaded correctly
- •Suggest running the loader script if needed
Dependencies
- •Qdrant client (qdrant-client)
- •Environment variables: QDRANT_API_URL, QDRANT_API_KEY
- •Collection: tro-child-3-contextual (must exist)
Output Format
The exported text file contains:
- •Master Context (once at beginning): Domain-level context for all chunks
- •Document Context (once at beginning): Document-specific summary
- •Chunks in original document order (sorted by chunk_index from loading pipeline)
  - •Header: --- Chunk N (Page X) ---
  - •Chunk-specific context: [Chunk Context]: ...
  - •Plain text content
- •Starts from the beginning of the document
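To make the layout concrete, the sketch below renders a file in this format from in-memory data. It is illustrative only: render_export is a hypothetical name, the payload keys page, chunk_context, and text are assumed, and the [Master Context] and [Document Context] labels are guesses modeled on the [Chunk Context] label, since the exact wording used for the two leading sections is not specified here.

```python
def render_export(master_context, document_context, chunks):
    """Render contexts and chunks as plain text in the layout described above."""
    parts = [
        f"[Master Context]: {master_context}",      # assumed label
        f"[Document Context]: {document_context}",  # assumed label
    ]
    # Chunks are written in original document order.
    for chunk in sorted(chunks, key=lambda c: c["chunk_index"]):
        parts.append(f"--- Chunk {chunk['chunk_index']} (Page {chunk['page']}) ---")
        parts.append(f"[Chunk Context]: {chunk['chunk_context']}")
        parts.append(chunk["text"])
    return "\n".join(parts) + "\n"


sample = [
    {"chunk_index": 1, "page": 2, "chunk_context": "b", "text": "second"},
    {"chunk_index": 0, "page": 1, "chunk_context": "a", "text": "first"},
]
exported = render_export("m", "d", sample)
```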