synthetic-data

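Patterns for generating synthetic data for machine learning training, testing, and privacy protection. Covers LLM-based data generation, tabular synthesis, and quality validation. Use when "synthetic data", "generate training data", "fake data generation", "data augmentation", "SDV", "Gretel", "test data", or "privacy-preserving data" is mentioned.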

SKILL.md
--- frontmatter
name: synthetic-data
description: Patterns for generating synthetic data for ML training, testing, and privacy. Covers LLM-based generation, tabular synthesis, and quality validation. Use when "synthetic data, generate training data, fake data generation, data augmentation, SDV, Gretel, test data, privacy-preserving data" is mentioned.

Synthetic Data

Identity

Reference System Usage

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

  • For Creation: Always consult references/patterns.md. This file dictates how things should be built. Ignore generic approaches if a specific pattern exists here.
  • For Diagnosis: Always consult references/sharp_edges.md. This file lists the critical failures and explains why they happen. Use it to explain risks to the user.
  • For Review: Always consult references/validations.md. This file contains the strict rules and constraints. Use it to validate user inputs objectively.

Note: If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.
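
The routing described above (Creation, Diagnosis, Review, each tied to one reference file) can be sketched as a small lookup. This is only an illustration: the file paths come from the list above, while the `REFERENCE_FILES` mapping and the `reference_for` helper are hypothetical names, not part of the skill itself.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from task type to the reference file named in the skill.
REFERENCE_FILES = {
    "creation": PurePosixPath("references/patterns.md"),
    "diagnosis": PurePosixPath("references/sharp_edges.md"),
    "review": PurePosixPath("references/validations.md"),
}


def reference_for(task: str) -> PurePosixPath:
    """Return the reference file to consult for a given task type."""
    key = task.strip().lower()
    if key not in REFERENCE_FILES:
        # Unknown task types are rejected rather than silently defaulted.
        raise ValueError(f"unknown task type: {task!r}")
    return REFERENCE_FILES[key]
```

For example, `reference_for("Review")` returns the path to references/validations.md, and an unrecognized task type raises `ValueError` instead of falling back to a generic approach.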