sparkgen-rag

Ingest documents, query knowledge bases, evaluate RAG quality, and manage knowledge bases.

SKILL.md
--- frontmatter
name: sparkgen-rag
description: Ingest documents, query knowledge bases, evaluate RAG quality, and manage KBs
user_invokable: true
auto_invokable: true
auto_invoke_hint: Invoke when the user discusses RAG, knowledge bases, document ingestion, or retrieval
arguments: "<ingest|query|eval|kb-list|kb-add|config> [args]"

SparkGen RAG

Manage the RAG (Retrieval-Augmented Generation) pipeline.

Dynamic Context

Before any action:

  1. Read the knowledge_bases: and rag: sections of config/ai_workflow.yaml
  2. List documents: ls documents/
  3. Check vector index: ls local_data/vectors/ 2>/dev/null
  4. If the server is running: curl -sf http://localhost:8000/v1/rag/knowledge-bases -H "X-API-Key: ${API_KEY:-dev-local-key}"
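
Steps 2 to 4 can also be gathered programmatically. The following is a minimal sketch using only the Python standard library; the function names (`list_names`, `fetch_knowledge_bases`, `gather_context`) are hypothetical and not part of SparkGen, and the shape of the server's JSON response is not assumed. Reading the YAML sections (step 1) is left out because it needs a YAML parser.

```python
import json
import os
import urllib.error
import urllib.request
from pathlib import Path


def list_names(directory):
    """Return sorted entry names, or [] if the directory is missing
    (roughly `ls <dir> 2>/dev/null`)."""
    path = Path(directory)
    if not path.is_dir():
        return []
    return sorted(entry.name for entry in path.iterdir())


def fetch_knowledge_bases(base_url="http://localhost:8000", timeout=2.0):
    """Return the parsed JSON from the knowledge-bases endpoint, or None when
    the server is unreachable or answers with an error (like `curl -sf`)."""
    # Mirrors ${API_KEY:-dev-local-key}: fall back when unset or empty.
    api_key = os.environ.get("API_KEY") or "dev-local-key"
    request = urllib.request.Request(
        f"{base_url}/v1/rag/knowledge-bases",
        headers={"X-API-Key": api_key},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        return None


def gather_context(root=".", probe_server=True):
    """Collect the documents, vector index entries, and (optionally) the
    server's knowledge-base listing into one dictionary."""
    root_path = Path(root)
    return {
        "documents": list_names(root_path / "documents"),
        "vector_indexes": list_names(root_path / "local_data" / "vectors"),
        "server_knowledge_bases": fetch_knowledge_bases() if probe_server else None,
    }
```

Calling `gather_context()` from the project root returns empty lists for missing directories and `None` for the server entry when nothing is listening on port 8000, so the caller can decide which actions are available.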

Actions

Ingest Documents (/sparkgen-rag ingest [--kb name] [--source path])

bash
python -m app.rag.ingest --kb ${KB:-default} --source ${SOURCE:-./documents}

This will:

  • Read documents from the source directory
  • Chunk them according to config/ai_workflow.yaml settings (size, overlap, strategy)
  • Generate embeddings using the configured provider
  • Store vectors in the configured backend (FAISS/Milvus)
  • Report: documents processed, chunks created, time taken
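
To make the chunking step concrete, here is a minimal sketch of a sliding-window splitter. It assumes that `size` and `overlap` are measured in characters; the actual unit used by `app.rag.ingest` (characters or tokens) is not stated here, and the function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, size=500, overlap=50):
    """Split text into windows of `size` characters, where each window
    starts `size - overlap` characters after the previous one."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text, so the
        # tail is not emitted again as a redundant shorter chunk.
        if start + size >= len(text):
            break
    return chunks
```

With `size=500` and `overlap=50`, consecutive chunks share 50 characters, so a sentence that straddles a chunk boundary still appears whole in at least one chunk if it is short enough.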

Query (/sparkgen-rag query "<question>" [--mode standard|self_rag|graphrag] [--kb name])

If the server is running:

bash
curl -s -X POST http://localhost:8000/v1/rag/query \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY:-dev-local-key}" \
  -d '{"question": "<question>", "mode": "<mode>", "knowledge_base": "<kb>"}'

Display the answer, the sources with their relevance scores, and the chunks retrieved.
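
The same request can be sent from Python. This is a sketch with the standard library only; the helper names `build_query_request` and `run_query` are hypothetical, the defaults `mode="standard"` and `knowledge_base="default"` are assumptions, and the response is returned as parsed JSON without assuming its field names.

```python
import json
import os
import urllib.request

# Modes listed in the command signature above.
VALID_MODES = ("standard", "self_rag", "graphrag")


def build_query_request(question, mode="standard", knowledge_base="default",
                        base_url="http://localhost:8000"):
    """Build the POST request for /v1/rag/query without sending it."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    payload = {"question": question, "mode": mode, "knowledge_base": knowledge_base}
    return urllib.request.Request(
        f"{base_url}/v1/rag/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Mirrors ${API_KEY:-dev-local-key}.
            "X-API-Key": os.environ.get("API_KEY") or "dev-local-key",
        },
        method="POST",
    )


def run_query(question, **kwargs):
    """Send the query and return the parsed JSON response."""
    request = build_query_request(question, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Serializing the payload with `json.dumps` escapes quotes and newlines in the question, which is easy to get wrong when the JSON body is written inline in a shell command.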

Evaluate (/sparkgen-rag eval [--kb name])

bash
python -m app.rag.eval --kb ${KB:-default} --config config/rag.yaml

Runs the RAG quality evaluation and reports:

  • Accuracy score
  • Faithfulness score
  • Relevancy score
  • Per-question breakdown
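
As an illustration of how such a report could be assembled, the sketch below averages per-question scores into the three headline metrics. How `app.rag.eval` actually computes or stores its scores is not described here; the record layout (a `question` key plus one numeric value per metric) and the function names are assumptions made for the example.

```python
from statistics import mean

METRICS = ("accuracy", "faithfulness", "relevancy")


def summarize(results):
    """Average each metric over all per-question records."""
    if not results:
        raise ValueError("no evaluation results to summarize")
    return {metric: mean(row[metric] for row in results) for metric in METRICS}


def format_report(results):
    """Render the aggregate scores followed by a per-question breakdown."""
    summary = summarize(results)
    lines = [f"{metric}: {summary[metric]:.2f}" for metric in METRICS]
    lines.append("Per-question breakdown:")
    for row in results:
        scores = ", ".join(f"{metric}={row[metric]:.2f}" for metric in METRICS)
        lines.append(f"  {row['question']}: {scores}")
    return "\n".join(lines)
```

Keeping the per-question records alongside the averages makes it possible to see which questions pull a score down, which a single aggregate number hides.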

List Knowledge Bases (/sparkgen-rag kb-list)

Parse the knowledge_bases: section of config/ai_workflow.yaml and display a table with these columns:

| Name | Description | Source Paths | File Types | Vector Store | Chunks |
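
A possible rendering step is sketched below. It assumes the knowledge_bases: section has already been parsed into a list of dictionaries with the keys used in the knowledge base entry format (`name`, `description`, `source_paths`, `file_types`, `vector_store.backend`). Chunk counts are not stored in that entry, so the sketch takes them from a separate, hypothetical `chunk_counts` mapping and prints `n/a` when a count is unknown.

```python
HEADER = "| Name | Description | Source Paths | File Types | Vector Store | Chunks |"
SEPARATOR = "|---|---|---|---|---|---|"


def render_kb_table(knowledge_bases, chunk_counts=None):
    """Render parsed knowledge-base entries as a pipe-delimited table."""
    chunk_counts = chunk_counts or {}
    lines = [HEADER, SEPARATOR]
    for kb in knowledge_bases:
        name = kb.get("name", "")
        cells = [
            name,
            kb.get("description", ""),
            ", ".join(kb.get("source_paths", [])),
            ", ".join(kb.get("file_types", [])),
            kb.get("vector_store", {}).get("backend", ""),
            str(chunk_counts.get(name, "n/a")),
        ]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```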

Add Knowledge Base (/sparkgen-rag kb-add <name> --source <path> [--description text])

Add a new knowledge base to config/ai_workflow.yaml:

yaml
- name: <name>
  description: "<description>"
  source_paths:
    - <path>
  file_types: [pdf, docx, txt, md]
  auto_ingest: false
  collection: <name>_vectors
  chunking:
    size: 500
    overlap: 50
    strategy: sliding_window
  vector_store:
    backend: faiss
    index_path: ./local_data/vectors/<name>

Then add it to the rag.knowledge_bases list of each agent that should use it.
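
The entry can also be built in code before it is written back to the config file. The sketch below fills in the same defaults as the template above and refuses to add a duplicate name. It assumes `knowledge_bases` is a top-level list in the parsed config; the function names are hypothetical, and loading or saving the YAML file itself is left to a YAML library of your choice.

```python
def new_kb_entry(name, source, description=""):
    """Build a knowledge-base entry using the template defaults."""
    return {
        "name": name,
        "description": description,
        "source_paths": [source],
        "file_types": ["pdf", "docx", "txt", "md"],
        "auto_ingest": False,
        "collection": f"{name}_vectors",
        "chunking": {"size": 500, "overlap": 50, "strategy": "sliding_window"},
        "vector_store": {
            "backend": "faiss",
            "index_path": f"./local_data/vectors/{name}",
        },
    }


def add_kb(config, entry):
    """Append the entry to config["knowledge_bases"], rejecting duplicates."""
    existing = config.setdefault("knowledge_bases", [])
    if any(kb.get("name") == entry["name"] for kb in existing):
        raise ValueError(f"knowledge base {entry['name']!r} already exists")
    existing.append(entry)
    return config
```

Checking for an existing name first avoids two entries sharing one `collection` and `index_path`, since both are derived from the name.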

Config (/sparkgen-rag config)

Show current RAG configuration:

  • Global RAG settings (enabled, default mode, top_k, citations)
  • Reranker settings
  • Self-RAG settings
  • Per-agent RAG overrides
  • Knowledge base details