extract-hyperparameters

Extract and document model hyperparameter configurations from papers, laying the groundwork for training runs.

SKILL.md

--- frontmatter

name: extract-hyperparameters
description: "Identify and document model hyperparameters from papers. Use when setting up training configurations."
mcp_fallback: none
category: analysis
tier: 2

Extract Hyperparameters

Locate and document all hyperparameters mentioned in research papers, including learning rates, batch sizes, and model configurations.

When to Use

•Reproducing paper results
•Setting up model training configurations
•Comparing hyperparameter choices across papers
•Planning hyperparameter tuning experiments

Quick Reference

bash

# List the first 20 lines of a paper that mention common hyperparameters
pdftotext paper.pdf - | grep -iE 'learning rate|batch|epochs|weight decay|dropout' | head -20

# Find common hyperparameter assignments in an existing config file
grep -E '\b(lr|batch_size|epochs|momentum|dropout|layers)\s*[=:]' config.py
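
The grep commands above only surface candidate lines; the values still have to be read out. A minimal Python sketch of that next step follows, assuming the paper text has already been extracted (for example with `pdftotext`). The `PATTERNS` table, the `extract_candidates` helper, and the sample sentence are all hypothetical illustrations, and real papers will need patterns adapted to their notation.

```python
import re

# A decimal or scientific-notation number such as 256, 0.1, or 1e-4.
NUMBER = r"([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)"
# Optional connective between a hyperparameter name and its value.
LINK = r"\s*(?:of|=|:|is)?\s*"

# Hypothetical patterns; extend them for the wording a given paper uses.
PATTERNS = {
    "learning_rate": re.compile(r"learning rate" + LINK + NUMBER, re.IGNORECASE),
    "batch_size": re.compile(r"batch size" + LINK + NUMBER, re.IGNORECASE),
    "weight_decay": re.compile(r"weight decay" + LINK + NUMBER, re.IGNORECASE),
    "dropout": re.compile(r"dropout" + LINK + NUMBER, re.IGNORECASE),
    "epochs": re.compile(NUMBER + r"\s*epochs", re.IGNORECASE),
}


def extract_candidates(text):
    """Return {name: [value, ...]} for every pattern that matches the text."""
    found = {}
    for name, pattern in PATTERNS.items():
        values = [float(match) for match in pattern.findall(text)]
        if values:
            found[name] = values
    return found


# Invented sentence for illustration only; it is not taken from any paper.
sample = (
    "We train for 90 epochs with a batch size of 256, "
    "a learning rate of 0.1, and weight decay 1e-4."
)
print(extract_candidates(sample))
```

Treat the result as a list of candidates to verify against the paper by hand: a regex cannot tell a base learning rate from a warm-up or fine-tuning value.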

Workflow

•Find hyperparameter table: Look for a table such as "Table 1" or a section titled "Hyperparameters"
•Document architecture parameters: Layer sizes, activation functions, normalization
•Extract training parameters: Learning rate, batch size, epochs, optimizers
•Note regularization: Dropout, weight decay, batch normalization
•Create configuration file: Translate to implementation format (YAML/JSON/Mojo)
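
The final workflow step can be sketched in Python as follows, writing JSON with the standard library. The `TrainingConfig` class, its field names, and the example values are assumptions made for illustration; this skill does not prescribe a schema. Fields default to `None` so that values a paper does not report stay visibly unset rather than being guessed.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class TrainingConfig:
    """Hypothetical container for hyperparameters extracted from a paper."""

    # None marks a value the paper does not report.
    learning_rate: Optional[float] = None
    batch_size: Optional[int] = None
    epochs: Optional[int] = None
    optimizer: Optional[str] = None
    weight_decay: Optional[float] = None
    dropout: Optional[float] = None
    # Maps a field name to where in the paper the value was found.
    source: dict = field(default_factory=dict)


def to_json(config):
    """Serialize the config to a stable, human-readable JSON string."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)


# Invented values for illustration only.
config = TrainingConfig(
    learning_rate=0.1,
    batch_size=256,
    optimizer="sgd",
    source={"learning_rate": "Table 1", "batch_size": "Section 4.1"},
)
print(to_json(config))
```

Recording the source location next to each value makes it easier to re-check a number later when a reproduction does not match the reported results.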

Output Format

Hyperparameter documentation:

•Model architecture (layers, sizes, activations)
•Training parameters (LR, batch size, epochs)
•Optimizer configuration (type, momentum, decay)
•Regularization settings (dropout, L1/L2)
•Data preprocessing (normalization, augmentation)
•Hardware and numeric precision (e.g., float32 or float64)
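
One way to keep the documentation complete is to check each record against the six categories listed above. The sketch below assumes a record is a plain dictionary keyed by category; the key names, the `missing_sections` helper, and the example record are hypothetical.

```python
# Hypothetical keys mirroring the six documentation categories listed above.
REQUIRED_SECTIONS = (
    "architecture",
    "training",
    "optimizer",
    "regularization",
    "preprocessing",
    "hardware_precision",
)


def missing_sections(record):
    """Return the categories that are absent or empty in a record."""
    return [name for name in REQUIRED_SECTIONS if not record.get(name)]


# Invented record for illustration only.
record = {
    "architecture": {"layers": 4, "activation": "relu"},
    "training": {"learning_rate": 0.1, "batch_size": 256},
    "optimizer": {"type": "sgd", "momentum": 0.9},
    "regularization": {"dropout": 0.5},
    "preprocessing": {},
}
print(missing_sections(record))  # ['preprocessing', 'hardware_precision']
```

An empty result means every category has at least one entry; it does not confirm that the entries themselves are correct.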

References

•See prepare-dataset skill for data configuration
•See train-model skill for training implementation
•See /notes/review/mojo-ml-patterns.md for Mojo configuration patterns