SDG Discover Flows

Explore and discover the SDG flows available in sdg_hub.

SKILL.md
--- frontmatter
description: Discover and explore available SDG flows from sdg_hub

SDG Discover Flows

Help users find and explore available synthetic data generation flows in sdg_hub.

Core API

python
from sdg_hub import FlowRegistry, Flow

# Discover all registered flows
flows = FlowRegistry.discover_flows()

# Get details about a specific flow
flow_path = FlowRegistry.get_flow_path("flow-id")
flow = Flow.from_yaml(flow_path)

# List flow metadata
print(flow.name)
print(flow.description)
print(flow.blocks)  # List of blocks in the flow

Flow Categories

Math & Reasoning

  • simple-math-qa: Basic math question-answer pairs
  • multi-step-reasoning: Chain-of-thought reasoning problems
  • code-math-interleave: Mixed code and mathematical reasoning

Instruction Following

  • alpaca-style: Alpaca-format instruction-response pairs
  • sharegpt-convert: Convert ShareGPT conversations
  • system-prompt-injection: Add system prompts to existing data

Code Generation

  • code-completion: Code completion examples
  • code-explanation: Code with explanations
  • unit-test-gen: Generate unit tests from code

Discovery Workflow

  1. List all flows: Use FlowRegistry.discover_flows() to see what's available
  2. Inspect flow structure: Load the YAML to understand the block pipeline
  3. Check required inputs: Each flow expects specific dataset columns
  4. Validate model compatibility: Some flows require specific model capabilities

Common Tasks

Find flows by category

python
from sdg_hub import FlowRegistry

flows = FlowRegistry.discover_flows()
math_flows = [f for f in flows if "math" in f.tags]
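The filter above assumes each discovered flow exposes a `tags` collection. As a rough sketch of the same idea applied to every tag at once, the snippet below groups flows by tag. `FlowInfo` and `group_by_tag` are hypothetical stand-ins introduced here so the example runs without sdg_hub installed; the objects sdg_hub actually returns may be shaped differently, and the tag assignments are illustrative only.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FlowInfo:
    """Hypothetical stand-in for a discovered flow; not an sdg_hub class."""
    id: str
    tags: list = field(default_factory=list)


def group_by_tag(flows):
    """Map each tag to the sorted IDs of the flows that carry it."""
    groups = defaultdict(list)
    for flow in flows:
        # Tolerate entries whose tags attribute is missing or None
        for tag in getattr(flow, "tags", None) or []:
            groups[tag].append(flow.id)
    return {tag: sorted(ids) for tag, ids in groups.items()}


# Illustrative data only: flow IDs come from Flow Categories, tags are assumed
flows = [
    FlowInfo("simple-math-qa", ["math"]),
    FlowInfo("code-math-interleave", ["math", "code"]),
    FlowInfo("unit-test-gen", ["code"]),
]
by_tag = group_by_tag(flows)
print(by_tag["math"])
```

Building the index once may be convenient when several categories need to be browsed in the same session.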

Inspect flow requirements

python
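from sdg_hub import Flow, FlowRegistry

# Load the flow definition, then report the columns it consumes and produces
flow = Flow.from_yaml(FlowRegistry.get_flow_path("simple-math-qa"))
print(f"Required columns: {flow.required_columns}")
print(f"Output columns: {flow.output_columns}")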
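Once the required columns are known, they can be compared against a dataset before running anything. The helper below is a minimal sketch of that check in plain Python; `missing_columns` is a hypothetical name rather than part of sdg_hub, and the column names are placeholders to be replaced with `flow.required_columns` and your dataset's actual column names.

```python
def missing_columns(dataset_columns, required_columns):
    """Return the required columns absent from the dataset, in required order."""
    present = set(dataset_columns)
    return [col for col in required_columns if col not in present]


# Placeholder values: substitute flow.required_columns and your dataset's columns
required = ["question", "context"]
available = ["question", "answer"]

missing = missing_columns(available, required)
if missing:
    print(f"Dataset is missing required columns: {missing}")
```

Returning a list rather than raising keeps the check usable both interactively and inside a validation step that decides for itself how to report the problem.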

Preview flow execution

python
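# Dry run on a small sample to preview the pipeline before a full run.
# Assumes `flow` was loaded as shown above and `sample_dataset` is your input dataset.
flow.dry_run(sample_dataset[:5])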

Troubleshooting

| Issue | Solution |
| --- | --- |
| Flow not found | Check spelling, run discover_flows() to see available flows |
| Missing columns | Ensure dataset has all required_columns |
| Block errors | Check individual block configs in YAML |
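For the "Flow not found" case, a misspelled ID can often be caught by comparing it with the known IDs. The sketch below uses the standard library's `difflib.get_close_matches`; `suggest_flow_ids` is a hypothetical helper, not an sdg_hub function, and `known_ids` stands in for whatever list of flow IDs discovery returns in your environment.

```python
import difflib


def suggest_flow_ids(requested, available_ids, limit=3):
    """Return up to `limit` known IDs that closely resemble `requested`."""
    return difflib.get_close_matches(requested, available_ids, n=limit, cutoff=0.6)


# Stand-in list of IDs; in practice, build it from the discovered flows
known_ids = ["simple-math-qa", "multi-step-reasoning", "alpaca-style"]

suggestions = suggest_flow_ids("simple-math-q", known_ids)
print(suggestions)
```

An empty result suggests the ID is not a near miss of anything registered, in which case listing all available flows is the better next step.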

Related Skills

  • /sdg-create-block - Create custom blocks
  • /sdg-run-flow - Execute a flow