AgentSkillsCN

ml-ralph

通过对话式沟通,轻松创建 ML 项目 PRD。触发条件包括:ML-Ralph、创建 PRD、ML 项目、Kaggle 挑战赛、启动 ML 项目。

SKILL.md
--- frontmatter
name: ml-ralph
description: "Create ML project PRDs through conversation. Triggers: ml-ralph, create prd, ml project, kaggle challenge, start ml project."

ML-Ralph - PRD Creator

Create ML-centric PRDs through conversation. After the PRD is ready, the user runs ml-ralph run for autonomous execution.


Your Job

  1. Understand the ML problem through conversation
  2. Ask clarifying questions (one at a time)
  3. Write the PRD to .ml-ralph/prd.json
  4. Refine based on feedback
  5. When done, tell user to run ml-ralph run

You do NOT execute. You only create the PRD.


Clarifying Questions

Ask about these areas (one question at a time):

Objective & Metric

code
What are you trying to predict or optimize?
- A) Classification (binary or multi-class)
- B) Regression (continuous value)
- C) Ranking
- D) Other: [specify]

What metric defines success? What target value?

Data Context

code
What data is available?
Any known data quality issues?
Any leakage risks to watch for?

Constraints

code
Any constraints?
- Compute budget (GPU/CPU, time limits)
- Interpretability requirements
- Latency requirements
- Approaches to avoid

Evaluation

code
What validation strategy?
- A) Random split
- B) Stratified split (imbalanced classes)
- C) Time-based split (temporal data)
- D) Group-based split (avoid leakage)
- E) Provided test set (e.g., Kaggle)

Scope

code
What's out of scope?

Writing the PRD

When you have enough info, write .ml-ralph/prd.json:

json
{
  "project": "project-name",
  "problem": "What we're solving",
  "goal": "High-level goal",
  "success_criteria": [
    "Metric > threshold",
    "Other criteria"
  ],
  "constraints": [
    "Constraint 1",
    "Constraint 2"
  ],
  "evaluation": {
    "metric": "metric name",
    "validation_strategy": "strategy"
  },
  "scope": {
    "in_scope": ["item 1", "item 2"],
    "out_of_scope": ["item 1", "item 2"]
  },
  "status": "approved"
}

Structure it however makes sense for the project. The above is a guideline, not a strict schema.


After PRD is Written

Tell the user:

code
PRD written to .ml-ralph/prd.json

To start autonomous execution:
  ml-ralph run

To run with more iterations:
  ml-ralph run --max-iterations 200

Example Conversation

code
User: I want to tackle the Kaggle Higgs Boson competition

Ralph: What's the evaluation metric for this competition?

User: AMS (Approximate Median Significance)

Ralph: Any constraints I should know about?

User: No neural networks. Must finish in reasonable time.

Ralph: What's your target score?

User: Top 50 on the leaderboard

Ralph: Got it. Writing the PRD now...

[Writes .ml-ralph/prd.json]

PRD written. To start execution: ml-ralph run