AgentSkillsCN

ai

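Use this skill when building AI features, integrating LLMs, implementing RAG, working with embeddings, deploying ML models, or doing data science work. It activates when the user mentions OpenAI, Anthropic, Claude, GPT, LLM, RAG, embeddings, vector database, Pinecone, Qdrant, LangChain, LlamaIndex, DSPy, MLflow, fine-tuning, LoRA, QLoRA, model deployment, ML pipeline, feature engineering, or machine learning.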

SKILL.md
--- frontmatter
name: ai
description: Use this skill when building AI features, integrating LLMs, implementing RAG, working with embeddings, deploying ML models, or doing data science. Activates on mentions of OpenAI, Anthropic, Claude, GPT, LLM, RAG, embeddings, vector database, Pinecone, Qdrant, LangChain, LlamaIndex, DSPy, MLflow, fine-tuning, LoRA, QLoRA, model deployment, ML pipeline, feature engineering, or machine learning.

AI/ML Engineering

Build production AI systems with modern patterns and tools.

Quick Reference

The 2026 AI Stack

| Layer | Tool | Purpose |
| --- | --- | --- |
| Prompting | DSPy | Programmatic prompt optimization |
| Orchestration | LangGraph | Stateful multi-agent workflows |
| RAG | LlamaIndex | Document ingestion and retrieval |
| Vectors | Qdrant / Pinecone | Embedding storage and search |
| Evaluation | RAGAS | RAG quality metrics |
| Experiment Tracking | MLflow / W&B | Logging, versioning, comparison |
| Serving | BentoML / vLLM | Model deployment |
| Protocol | MCP | Tool and context integration |

DSPy: Programmatic Prompting

Hand-tuned prompts are brittle and hard to maintain. DSPy treats prompts as optimizable code:

python
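import dspy

# Configure a language model first (assumes DSPy 2.5+; the model name is only an example).
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

class QA(dspy.Signature):
    """Answer questions with short factoid answers."""
    question: str = dspy.InputField()
    answer: str = dspy.OutputField(desc="1-5 words")

# Create module
qa = dspy.Predict(QA)

# Use it
result = qa(question="What is the capital of France?")
print(result.answer)  # e.g. "Paris"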

Optimize with real data:

python
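from dspy.teleprompt import BootstrapFewShot

# Supplied by the caller:
#   exact_match(example, pred, trace=None) -> bool   (the metric)
#   train_data: list of dspy.Example(...).with_inputs("question")
optimizer = BootstrapFewShot(metric=exact_match)
optimized_qa = optimizer.compile(qa, trainset=train_data)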
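
The `exact_match` metric passed to `BootstrapFewShot` is not defined in the snippet. A minimal sketch of what it could look like, assuming the DSPy metric convention `(example, pred, trace=None)` and that both objects expose an `answer` attribute. The normalization rules and the `SimpleNamespace` stand-ins are illustrative assumptions, not part of DSPy:

```python
import re
import string
from types import SimpleNamespace


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def exact_match(example, pred, trace=None) -> bool:
    """Return True when predicted and gold answers are equal after normalization."""
    return normalize(example.answer) == normalize(pred.answer)


# Stand-ins for dspy.Example / dspy.Prediction objects (hypothetical data).
gold = SimpleNamespace(question="What is the capital of France?", answer="Paris")
pred = SimpleNamespace(answer=" paris. ")
print(exact_match(gold, pred))  # True
```

Normalizing before comparing keeps the optimizer from rejecting demonstrations over casing or trailing punctuation alone.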

RAG Architecture (Production)

code
Query → Rewrite → Hybrid Retrieval → Rerank → Generate → Cite
         │              │                │
         v              v                v
    Query expansion  Dense + BM25   Cross-encoder
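
The hybrid retrieval step has to merge the dense and BM25 result lists before reranking. The diagram does not say how; one common option is Reciprocal Rank Fusion (RRF), sketched here under that assumption. The document IDs and the `k=60` constant are illustrative:

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical dense-retriever output
bm25_hits = ["doc_b", "doc_d", "doc_a"]   # hypothetical BM25 output
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF only uses ranks, so it sidesteps the problem that dense similarity scores and BM25 scores live on different scales. The fused list is what a cross-encoder reranker would then reorder.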

LlamaIndex + LangGraph Pattern:

python
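from typing import TypedDict

from llama_index.core import VectorStoreIndex
from langgraph.graph import StateGraph, START, END

class State(TypedDict, total=False):
    question: str
    context: str
    sources: list
    answer: str

# Data layer (LlamaIndex); docs is a list of LlamaIndex Document objects
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine()

# Control layer (LangGraph)
def retrieve(state: State) -> dict:
    response = query_engine.query(state["question"])
    return {"context": response.response, "sources": response.source_nodes}

# generate_answer(state) -> {"answer": ...} is defined elsewhere
graph = StateGraph(State)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate_answer)
graph.add_edge(START, "retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)
app = graph.compile()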

MCP Integration

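Model Context Protocol (MCP) is an open standard for exposing tools and context to LLM applications. With the official Python SDK, a server can be defined via FastMCP:

python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-tools")

# vector_store and format_results are application-specific helpers defined elsewhere.
@mcp.tool()
async def search_docs(query: str) -> str:
    """Search the knowledge base."""
    results = await vector_store.search(query)
    return format_results(results)

if __name__ == "__main__":
    mcp.run()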
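
On the wire, MCP messages are JSON-RPC 2.0, and a client invokes a tool with a `tools/call` request. An SDK client normally builds this message for you; the sketch below only illustrates its shape. The helper name `build_tool_call` and the query text are hypothetical:

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Call the search_docs tool with a hypothetical query.
request = build_tool_call(1, "search_docs", {"query": "refund policy"})
print(request)
```

The `arguments` object mirrors the tool function's parameters, so `query: str` in the server code becomes the `"query"` key here.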

Embeddings (2026)

| Model | Dimensions | Best For |
| --- | --- | --- |
| text-embedding-3-large | 3072 | General purpose |
| BGE-M3 | 1024 | Multilingual RAG |
| Qwen3-Embedding | Flexible | Custom domains |
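
Whichever model is chosen, retrieval usually comes down to comparing a query vector against stored vectors. A minimal sketch of cosine-similarity search in plain Python, assuming toy 3-dimensional vectors in place of real model output. A vector database performs the same comparison with approximate indexes at scale:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors (hypothetical); real models emit the dimensions listed in the table.
corpus = {
    "pricing": [0.9, 0.1, 0.0],
    "refunds": [0.1, 0.9, 0.1],
    "careers": [0.0, 0.1, 0.9],
}
hits = top_k([0.8, 0.2, 0.0], corpus)
print([doc_id for doc_id, _ in hits])  # ['pricing', 'refunds']
```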

Fine-Tuning with LoRA/QLoRA

python
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
)

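# base_model: a transformers model loaded beforehand.
# For QLoRA, load it in 4-bit (e.g. BitsAndBytesConfig(load_in_4bit=True)) before wrapping.
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
# A QLoRA run can fit in ~24GB VRAM (e.g. RTX 4090), depending on model size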
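
To see why this configuration is cheap to train, it helps to count the parameters it adds. Each adapted weight matrix of shape `d_out x d_in` gains two low-rank factors with `r * (d_in + d_out)` parameters in total, and the update is scaled by `lora_alpha / r`. The model shape below (32 layers, 4096x4096 `q_proj` and `v_proj`) is an assumption for illustration. Models with grouped-query attention have a smaller `v_proj`:

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumed model shape (hypothetical): 32 layers, square 4096x4096 projections.
hidden, layers, r, lora_alpha = 4096, 32, 16, 32
per_module = lora_param_count(hidden, hidden, r)
total = per_module * 2 * layers  # two target modules (q_proj, v_proj) per layer
scaling = lora_alpha / r         # factor applied to the low-rank update

print(per_module)  # 131072
print(total)       # 8388608
print(scaling)     # 2.0
```

Under these assumptions the adapters add about 8.4M trainable parameters, a small fraction of a multi-billion-parameter base model.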

MLOps Pipeline

python
import mlflow

# MLflow tracking
mlflow.set_experiment("rag-v2")

with mlflow.start_run():
    mlflow.log_params({"chunk_size": 512, "model": "gpt-4"})
    mlflow.log_metrics({"faithfulness": 0.92, "relevance": 0.88})
    mlflow.log_artifact("prompts/qa.txt")

Evaluation with RAGAS

python
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_precision

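# dataset: evaluation samples with questions, answers, retrieved contexts, and references;
# the exact column names depend on the installed ragas version.
results = evaluate(
    dataset,
    metrics=[faithfulness, answer_relevancy, context_precision],
)
print(results)  # e.g. {'faithfulness': 0.92, 'answer_relevancy': 0.88, ...}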
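
Scores like these are most useful when something acts on them. A minimal sketch of a quality gate that could run in CI, assuming the RAGAS result has already been converted to a plain dict of metric names to floats. The helper `failing_metrics` and the thresholds are hypothetical:

```python
def failing_metrics(scores: dict, thresholds: dict) -> dict:
    """Return {metric: (score, threshold)} for every metric below its threshold.

    A metric missing from `scores` is treated as a failure.
    """
    failures = {}
    for name, minimum in thresholds.items():
        score = scores.get(name)
        if score is None or score < minimum:
            failures[name] = (score, minimum)
    return failures


# Scores copied from the example output above; thresholds are hypothetical.
scores = {"faithfulness": 0.92, "answer_relevancy": 0.88}
thresholds = {"faithfulness": 0.90, "answer_relevancy": 0.90}
print(failing_metrics(scores, thresholds))  # {'answer_relevancy': (0.88, 0.9)}
```

A pipeline could exit non-zero whenever the returned dict is non-empty, blocking a deploy that regresses retrieval or generation quality.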

Vector Database Selection

| DB | Best For | Pricing |
| --- | --- | --- |
| Qdrant | Self-hosted, filtering | 1GB free forever |
| Pinecone | Managed, zero-ops | Free tier available |
| Weaviate | Knowledge graphs | 14-day trial |
| Milvus | Billion-scale | Self-hosted |

Agents

  • ai-engineer - LLM integration, RAG, MCP, production AI
  • mlops-engineer - Model deployment, monitoring, pipelines
  • data-scientist - Analysis, modeling, experimentation
  • ml-researcher - Cutting-edge architectures, paper implementation
  • cv-engineer - Computer vision, VLMs, image processing

Deep Dives

Examples