ML Pipeline Workflow
Complete end-to-end MLOps pipeline orchestration from data preparation through model deployment.
Do not use this skill when
- •The task is unrelated to ml pipeline workflow
- •You need a different domain or tool outside this scope
Instructions
- •Clarify goals, constraints, and required inputs.
- •Apply relevant best practices and validate outcomes.
- •Provide actionable steps and verification.
- •If detailed examples are required, open
resources/implementation-playbook.md.
Overview
This skill provides comprehensive guidance for building production ML pipelines that handle the full lifecycle: data ingestion → preparation → training → validation → deployment → monitoring.
Use this skill when
- •Building new ML pipelines from scratch
- •Designing workflow orchestration for ML systems
- •Implementing data → model → deployment automation
- •Setting up reproducible training workflows
- •Creating DAG-based ML orchestration
- •Integrating ML components into production systems
What This Skill Provides
Core Capabilities
- •
Pipeline Architecture
- •End-to-end workflow design
- •DAG orchestration patterns (Airflow, Dagster, Kubeflow)
- •Component dependencies and data flow
- •Error handling and retry strategies
- •
Data Preparation
- •Data validation and quality checks
- •Feature engineering pipelines
- •Data versioning and lineage
- •Train/validation/test splitting strategies
- •
Model Training
- •Training job orchestration
- •Hyperparameter management
- •Experiment tracking integration
- •Distributed training patterns
- •
Model Validation
- •Validation frameworks and metrics
- •A/B testing infrastructure
- •Performance regression detection
- •Model comparison workflows
- •
Deployment Automation
- •Model serving patterns
- •Canary deployments
- •Blue-green deployment strategies
- •Rollback mechanisms
Reference Documentation
See the references/ directory for detailed guides:
- •data-preparation.md - Data cleaning, validation, and feature engineering
- •model-training.md - Training workflows and best practices
- •model-validation.md - Validation strategies and metrics
- •model-deployment.md - Deployment patterns and serving architectures
Assets and Templates
The assets/ directory contains:
- •pipeline-dag.yaml.template - DAG template for workflow orchestration
- •training-config.yaml - Training configuration template
- •validation-checklist.md - Pre-deployment validation checklist
Usage Patterns
Basic Pipeline Setup
python
# 1. Define pipeline stages
stages = [
"data_ingestion",
"data_validation",
"feature_engineering",
"model_training",
"model_validation",
"model_deployment"
]
# 2. Configure dependencies
# See assets/pipeline-dag.yaml.template for full example
Production Workflow
- •
Data Preparation Phase
- •Ingest raw data from sources
- •Run data quality checks
- •Apply feature transformations
- •Version processed datasets
- •
Training Phase
- •Load versioned training data
- •Execute training jobs
- •Track experiments and metrics
- •Save trained models
- •
Validation Phase
- •Run validation test suite
- •Compare against baseline
- •Generate performance reports
- •Approve for deployment
- •
Deployment Phase
- •Package model artifacts
- •Deploy to serving infrastructure
- •Configure monitoring
- •Validate production traffic
Best Practices
Pipeline Design
- •Modularity: Each stage should be independently testable
- •Idempotency: Re-running stages should be safe
- •Observability: Log metrics at every stage
- •Versioning: Track data, code, and model versions
- •Failure Handling: Implement retry logic and alerting
Data Management
- •Use data validation libraries (Great Expectations, TFX)
- •Version datasets with DVC or similar tools
- •Document feature engineering transformations
- •Maintain data lineage tracking
Model Operations
- •Separate training and serving infrastructure
- •Use model registries (MLflow, Weights & Biases)
- •Implement gradual rollouts for new models
- •Monitor model performance drift
- •Maintain rollback capabilities
Deployment Strategies
- •Start with shadow deployments
- •Use canary releases for validation
- •Implement A/B testing infrastructure
- •Set up automated rollback triggers
- •Monitor latency and throughput
Integration Points
Orchestration Tools
- •Apache Airflow: DAG-based workflow orchestration
- •Dagster: Asset-based pipeline orchestration
- •Kubeflow Pipelines: Kubernetes-native ML workflows
- •Prefect: Modern dataflow automation
Experiment Tracking
- •MLflow for experiment tracking and model registry
- •Weights & Biases for visualization and collaboration
- •TensorBoard for training metrics
Deployment Platforms
- •AWS SageMaker for managed ML infrastructure
- •Google Vertex AI for GCP deployments
- •Azure ML for Azure cloud
- •Kubernetes + KServe for cloud-agnostic serving
Progressive Disclosure
Start with the basics and gradually add complexity:
- •Level 1: Simple linear pipeline (data → train → deploy)
- •Level 2: Add validation and monitoring stages
- •Level 3: Implement hyperparameter tuning
- •Level 4: Add A/B testing and gradual rollouts
- •Level 5: Multi-model pipelines with ensemble strategies
Common Patterns
Batch Training Pipeline
yaml
# See assets/pipeline-dag.yaml.template
stages:
- name: data_preparation
dependencies: []
- name: model_training
dependencies: [data_preparation]
- name: model_evaluation
dependencies: [model_training]
- name: model_deployment
dependencies: [model_evaluation]
Real-time Feature Pipeline
python
# Stream processing for real-time features # Combined with batch training # See references/data-preparation.md
Continuous Training
python
# Automated retraining on schedule # Triggered by data drift detection # See references/model-training.md
Troubleshooting
Common Issues
- •Pipeline failures: Check dependencies and data availability
- •Training instability: Review hyperparameters and data quality
- •Deployment issues: Validate model artifacts and serving config
- •Performance degradation: Monitor data drift and model metrics
Debugging Steps
- •Check pipeline logs for each stage
- •Validate input/output data at boundaries
- •Test components in isolation
- •Review experiment tracking metrics
- •Inspect model artifacts and metadata
Next Steps
After setting up your pipeline:
- •Explore hyperparameter-tuning skill for optimization
- •Learn experiment-tracking-setup for MLflow/W&B
- •Review model-deployment-patterns for serving strategies
- •Implement monitoring with observability tools
Related Skills
- •experiment-tracking-setup: MLflow and Weights & Biases integration
- •hyperparameter-tuning: Automated hyperparameter optimization
- •model-deployment-patterns: Advanced deployment strategies