Training on Google Colab Skill

This skill trains models on Google Colab GPUs while keeping code and data in sync with the local project through Google Drive.

When to Use

•Model training needs GPU acceleration
•Training experiments exceed local hardware capabilities
•Long-running training jobs benefit from cloud execution
•Hyperparameter variations need to be tested at scale

Prerequisites

•Google account with Colab Pro (recommended) or Colab Free
•Google Drive with project folder structure
•Training data uploaded to Google Drive
•Local project configured for Drive sync (see docs/reference/colab-pro-setup.md)

Workflow

Step 1: Prepare Training Configuration

Create or update a training config in training/configs/:

yaml

# training/configs/experiment_name.yaml
model:
  name: resnet50
  num_classes: 2
  pretrained: true

training:
  epochs: 100
  batch_size: 32
  learning_rate: 0.001
  optimizer: adam
  
data:
  train_path: /content/drive/MyDrive/traina/data/train
  val_path: /content/drive/MyDrive/traina/data/val
  
augmentation:
  enabled: true
  normalize: false  # Set based on production requirements
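
A quick check for missing keys or non-Drive data paths can catch mistakes before a Colab session is spent on them. The following is a minimal sketch and not part of the project: `load_config` and `validate_config` are hypothetical helper names, the required keys simply mirror the example config above, and loading assumes PyYAML is installed.

```python
# Keys taken from the example config above; adjust if your schema differs.
REQUIRED_KEYS = {
    "model": ["name", "num_classes", "pretrained"],
    "training": ["epochs", "batch_size", "learning_rate", "optimizer"],
    "data": ["train_path", "val_path"],
}

# Prefix of "My Drive" when Drive is mounted at /content/drive.
DRIVE_ROOT = "/content/drive/MyDrive/"


def load_config(path):
    """Read a YAML config file into a dict (assumes PyYAML is installed)."""
    import yaml  # imported lazily so validation works without PyYAML

    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)


def validate_config(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        section_cfg = cfg.get(section)
        if not isinstance(section_cfg, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            if key not in section_cfg:
                problems.append(f"missing key: {section}.{key}")
    # Data paths must point into the mounted Drive to be visible in Colab.
    data_cfg = cfg.get("data")
    if isinstance(data_cfg, dict):
        for key in ("train_path", "val_path"):
            value = data_cfg.get(key)
            if value is not None and not str(value).startswith(DRIVE_ROOT):
                problems.append(f"data.{key} is not under {DRIVE_ROOT}")
    return problems
```

Calling `validate_config(load_config("training/configs/experiment_name.yaml"))` on the example config above would return an empty list.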

Step 2: Sync Code to Google Drive

Before syncing, run a pre-flight check:

•Run a local smoke test (e.g., python smoke_test.py, or set DRY_RUN=True in your notebook).
•Verify that one epoch completes on CPU using only a few batches.
•Sync only after the local check passes.
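
The pre-flight check above could be sketched as follows. This is an illustration only: the contents of smoke_test.py are not shown in this document, so `run_epoch`, `train_step`, and `max_dry_batches` are hypothetical names, and the loop is framework-agnostic rather than tied to any particular training library.

```python
import math
from itertools import islice

DRY_RUN = True  # hypothetical notebook flag, as mentioned in the list above


def run_epoch(train_step, batches, dry_run=False, max_dry_batches=3):
    """Run one epoch; in dry-run mode stop after a few batches.

    train_step is any callable that takes a batch and returns a loss value.
    Returns the list of losses for the batches that were processed.
    """
    if dry_run:
        # Only consume the first few batches so the check finishes quickly.
        batches = islice(batches, max_dry_batches)
    losses = []
    for batch in batches:
        loss = float(train_step(batch))
        if math.isnan(loss):
            raise ValueError("loss became NaN during the smoke test")
        losses.append(loss)
    if not losses:
        raise ValueError("no batches were processed; check the data path")
    return losses
```

In a real project, `train_step` would wrap the forward pass, loss computation, and optimizer step, and `batches` would be the training data loader.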

bash

# From project root
./training/scripts/sync_to_drive.sh

Step 3: Launch Colab Notebook

•Open training/notebooks/colab_training.ipynb in Google Colab
•Connect to GPU runtime (Runtime → Change runtime type → GPU)
•Mount Google Drive:

python

from google.colab import drive
drive.mount('/content/drive')

•Execute training cells
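
Before executing the training cells, it may be worth confirming that the Drive paths are visible and that a GPU is attached. The sketch below is an assumption-laden example rather than part of the notebook: the helper names are hypothetical, and `nvidia-smi` is used as a framework-independent proxy for an NVIDIA GPU being present.

```python
import os
import shutil
import subprocess


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]


def gpu_visible():
    """Return True if nvidia-smi is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    result = subprocess.run([exe], capture_output=True)
    return result.returncode == 0


def check_environment(paths):
    """Raise RuntimeError if a Drive path is missing or no GPU is visible."""
    missing = missing_paths(paths)
    if missing:
        raise RuntimeError(f"not found (is Drive mounted?): {missing}")
    if not gpu_visible():
        raise RuntimeError("no GPU visible; check Runtime > Change runtime type")
```

With the example config from Step 1, a cell placed after the mount could call `check_environment(["/content/drive/MyDrive/traina/data/train", "/content/drive/MyDrive/traina/data/val"])`.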

Step 4: Download Results

After training completes:

bash

# Sync trained models back from Drive
./training/scripts/sync_from_drive.sh \
  --source "drive/MyDrive/traina/experiments/" \
  --dest "training/experiments/"

Configuration Options

Environment Modes

| Mode | Description | Use Case |
| --- | --- | --- |
| Development | Quick iterations, small dataset | Testing configs, debugging |
| Training | Full dataset, GPU acceleration | Production model training |
| Evaluation | Validation/test metrics only | Assessing trained models |
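
One way to make the modes above selectable in code is a small lookup table. This is a hypothetical sketch: the document does not say how the project switches modes, so `Mode`, `ModeSettings`, and `settings_for` are illustrative names, and any field value the table does not state is marked as an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DEVELOPMENT = "development"
    TRAINING = "training"
    EVALUATION = "evaluation"


@dataclass(frozen=True)
class ModeSettings:
    small_dataset: bool     # iterate on a small dataset instead of the full one
    run_training: bool      # False means compute metrics only
    gpu_acceleration: bool  # stated in the table for Training only


# Values follow the modes table where it is explicit; the rest are assumptions.
MODE_SETTINGS = {
    Mode.DEVELOPMENT: ModeSettings(small_dataset=True, run_training=True,
                                   gpu_acceleration=False),  # GPU: assumption
    Mode.TRAINING: ModeSettings(small_dataset=False, run_training=True,
                                gpu_acceleration=True),
    Mode.EVALUATION: ModeSettings(small_dataset=False, run_training=False,
                                  gpu_acceleration=False),  # GPU: assumption
}


def settings_for(name):
    """Look up settings by mode name, ignoring case and surrounding spaces."""
    return MODE_SETTINGS[Mode(name.strip().lower())]
```

For example, `settings_for("Evaluation").run_training` is `False`, so a notebook could skip the training loop and only compute metrics.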

GPU Optimization