SKILL.md

Stata Coding Standards with LOG Documentation

Purpose

This skill defines how to write Stata code with LOG comments that document the expected log output, so that the actual log can be checked line by line against what the code was meant to print.

Core Principles

•Every block that produces output must have a LOG comment
•LOG comments show what will appear in the actual log file
•LOG comments go immediately after the corresponding code block
•Use emojis for visual clarity (✓, ❌)
•Keep error messages to a single concise line

Rules for LOG Comments

Rule 1: Single Display Statement

Every display statement gets a LOG comment showing the exact output. For values that change between runs, such as the current time, show a representative value.

stata

display "Creating output directories..."
// LOG: Creating output directories...

display "Pipeline start time: " c(current_time)
// LOG: Pipeline start time: 14:32:15
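
When the displayed value is computed rather than read from the clock, the LOG comment can state the exact result. A minimal sketch, assuming hypothetical locals first_year, last_year, and n_years that are not part of the pipeline above:

```stata
// Hypothetical locals for illustration only
local first_year = 2015
local last_year = 2020
local n_years = `last_year' - `first_year' + 1
// (no output - local assignments)

display "Years to process: `n_years'"
// LOG: Years to process: 6
```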

Rule 2: If-Else Blocks (Multiple Cases)

Use // case 1 LOG:, // case 2 LOG:, and so on, with one case per possible outcome.

stata

if "`config_file'" == "" {
    display as error "❌ ERROR: Config file required! Usage: stata-mp -b do script.do <config>"
    exit 198
}
else {
    display "✓ Config file: `config_file'"
}
// case 1 LOG: ❌ ERROR: Config file required! Usage: stata-mp -b do script.do <config> (and exits)
// case 2 LOG: ✓ Config file: config_production_2015-2020
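
The same convention extends to blocks with more than two branches. The sketch below assumes a hypothetical local named mode with invented values "production" and "test"; neither is part of the pipeline described here.

```stata
// Hypothetical setting for illustration only
local mode "test"

if "`mode'" == "production" {
    display "✓ Mode: production (full year range)"
}
else if "`mode'" == "test" {
    display "✓ Mode: test (single year)"
}
else {
    display as error "❌ ERROR: Unknown mode: `mode'"
    exit 198
}
// case 1 LOG: ✓ Mode: production (full year range)
// case 2 LOG: ✓ Mode: test (single year)
// case 3 LOG: ❌ ERROR: Unknown mode: <value of mode> (and exits)
```

Numbering the cases in the order the branches appear in the code makes it easy to match each LOG line to its branch.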

Rule 3: Variable Display with Examples

Show resolved variable values in LOG comments.

stata

display "  Creating: ${output_root}"
capture mkdir "${output_root}"
assert _rc == 0 | _rc == 693
// LOG:   Creating: _WorkSpace/1-CMSStore

Rule 4: Loops - Show Iteration Pattern

For loops, show the output of the first iterations and note that the pattern continues.

stata

forvalues year = ${year_start}/${year_end} {
    display ""
    display "########## YEAR `year' ##########"
    // LOG: (blank line)
    // LOG: ########## YEAR 2015 ##########
    // LOG: (blank line)
    // LOG: ########## YEAR 2016 ##########
    // LOG: (continues for each year through ${year_end})
}
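
When the loop bounds are small and known, the LOG comments can list every iteration. A minimal sketch, assuming hypothetical locals first_year and last_year in place of ${year_start} and ${year_end}, plus an invented counter n_done:

```stata
// Hypothetical bounds standing in for ${year_start} and ${year_end}
local first_year = 2015
local last_year = 2017
local n_done = 0

forvalues year = `first_year'/`last_year' {
    local ++n_done
    display "[`n_done'] Processing year `year'"
}
display "✓ Processed `n_done' years"
// LOG: [1] Processing year 2015
// LOG: [2] Processing year 2016
// LOG: [3] Processing year 2017
// LOG: ✓ Processed 3 years
```

A closing summary line such as the last display gives the log one value that is easy to check against the expected iteration count.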

Rule 5: Conditional Blocks (Run Flags)

When execution depends on flags, show both cases.

stata

if ${run_extract} {
    display "STAGE 1: Extract"
    capture noisily do "scripts/extract.do" `year'
}
// case 1 LOG: (skipped - ${run_extract} = 0)
// case 2 LOG: STAGE 1: Extract (followed by output from extract.do)
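
A skipped stage leaves no trace in the log, which can make a skip hard to tell apart from a stage that never ran. One possible variation, sketched here with a hypothetical local run_extract standing in for ${run_extract}, is to print an explicit message in an else branch:

```stata
// Hypothetical flag standing in for ${run_extract}
local run_extract = 0

if `run_extract' {
    display "STAGE 1: Extract"
}
else {
    display "STAGE 1: Extract (skipped)"
}
// case 1 LOG: STAGE 1: Extract
// case 2 LOG: STAGE 1: Extract (skipped)
```

With this variation both cases produce a line, so every stage appears in the log whether or not it ran.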

Rule 6: Foreach Loops

Show example output for the first few iterations, then summarize the rest.

stata

local log_files: dir "${log_dir_cms}" files "${config_name}*.log"
foreach logfile of local log_files {
    copy "${log_dir_cms}/`logfile'" "${log_dir_archive}/`logfile'", replace
    display "  Copied: `logfile'"
}
// LOG:   Copied: config_production_2015-2020-master.log
// LOG:   Copied: config_production_2015-2020-extract-2015.log
// LOG:   (one line per log file copied)
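
If the file list can be empty, the loop prints nothing at all. The sketch below shows one way to document that case; it uses a hypothetical hard-coded list in place of the dir result and only displays names rather than copying files.

```stata
// Hypothetical file list standing in for the result of the dir macro function
local log_files "demo-master.log demo-extract-2015.log"
local n_files : word count `log_files'

if `n_files' == 0 {
    display "  No log files to copy"
}
else {
    foreach logfile of local log_files {
        display "  Found: `logfile'"
    }
    display "  Total: `n_files' file(s)"
}
// case 1 LOG:   No log files to copy
// case 2 LOG:   Found: demo-master.log
// case 2 LOG:   Found: demo-extract-2015.log
// case 2 LOG:   Total: 2 file(s)
```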

Rule 7: Multi-Line Display Blocks

Group related displays together and document them with one block of LOG comments, one LOG line per line of output.

stata

display ""
display "=========================================="
display "Archiving logs to 0-Logging-Store..."
display "=========================================="
// LOG: (blank line)
// LOG: ==========================================
// LOG: Archiving logs to 0-Logging-Store...
// LOG: ==========================================
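
If the same banner layout recurs, it could be wrapped in a small program so that the LOG block follows a single call. This is only a sketch: print_banner is a hypothetical helper name, not something the pipeline is known to define.

```stata
// Hypothetical helper: prints a blank line and a title between two separator lines
capture program drop print_banner
program define print_banner
    args title
    display ""
    display "=========================================="
    display "`title'"
    display "=========================================="
end

print_banner "Archiving logs to 0-Logging-Store..."
// LOG: (blank line)
// LOG: ==========================================
// LOG: Archiving logs to 0-Logging-Store...
// LOG: ==========================================
```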

Rule 8: No Output Blocks

Note when blocks don't produce log output.

stata

capture mkdir "${temp_dir}"
// (no output - capture suppresses display)

quietly log using "${log_file}", replace text
// (no output - quietly suppresses the log header that log using otherwise prints)

local myvar = 5
// (no output - local assignment)
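
The quietly prefix is another common source of silent blocks. A minimal sketch on scratch data, assuming nothing in memory needs to be kept because clear discards it:

```stata
// Scratch data for illustration; this clears whatever is in memory
clear
quietly set obs 5
// (no output - quietly suppresses the message from set obs)

quietly count
// (no output - quietly suppresses display; the count is stored in r(N))

display "Observations: " r(N)
// LOG: Observations: 5
```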

Code Structure Standards

Error Handling - Keep It Short

❌ BAD - Too verbose:

stata

if "`config_file'" == "" {
    display as error "ERROR: Config file required!"
    display as error ""
    display as error "Usage (from code/ directory):"
    display as error "  cd code"
    display as error "  stata-mp -b do script.do <config_name>"
    display as error ""
    display as error "Example:"
    display as error "  stata-mp -b do script.do config_production"
    exit 198
}

✓ GOOD - Concise with emoji:

stata

if "`config_file'" == "" {
    display as error "❌ ERROR: Config file required! Usage: stata-mp -b do script.do <config>"
    exit 198
}
else {
    display "✓ Config file: `config_file'"
}
// case 1 LOG: ❌ ERROR: Config file required! Usage: stata-mp -b do script.do <config> (and exits)
// case 2 LOG: ✓ Config file: config_production

Directory Creation - Show Paths

Always display the path before creating directories, then assert success.

stata

display "  Creating: ${output_root}"
capture mkdir "${output_root}"
assert _rc == 0 | _rc == 693  // 0=created, 693=already exists
// LOG:   Creating: _WorkSpace/1-CMSStore
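
The assertion above accepts return code 693 as "already exists". If that reading seems too optimistic for a given setup, one option is to confirm afterwards that the directory is really there. The sketch below does this with Mata's direxists(); the scratch path under c(tmpdir) and the local dir_ok are hypothetical stand-ins, not part of the pipeline.

```stata
// Hypothetical scratch path; a real pipeline would use ${output_root}
local target "`c(tmpdir)'/log_skill_demo"
display "  Creating: `target'"
capture mkdir "`target'"
// (no output - capture suppresses display)

// Ask Mata whether the directory exists now, whatever mkdir returned
mata: st_local("dir_ok", strofreal(direxists(st_local("target"))))
assert `dir_ok' == 1
display "  ✓ Directory ready"
// LOG:   Creating: <tmpdir>/log_skill_demo
// LOG:   ✓ Directory ready
```

Checking for the directory itself makes the assertion independent of how mkdir reports its failure modes.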

Configuration Loading - Show What's Set

Document which global variables the config file sets.

stata

// Load configuration file
// Sets globals: ${output_root}, ${cms_store}, ${year_start}, ${year_end}, etc.
display "=========================================="
display "CMS DATA PREPARATION PIPELINE"
display "Loading config: `config_file'"
display "=========================================="
// LOG: ==========================================
// LOG: CMS DATA PREPARATION PIPELINE
// LOG: Loading config: config_production_2015-2020
// LOG: ==========================================

// Use a forward slash: a backslash directly before `config_file' would block macro expansion
capture do "config/`config_file'.do"
if _rc != 0 {
    display as error "❌ ERROR: Could not load config/`config_file'.do"
    exit 198
}
// case 1 LOG: (nothing - config loaded successfully)
// case 2 LOG: ❌ ERROR: Could not load config/config_production_2015-2020.do (and exits)
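
A config file can load without error and still leave a global unset. The sketch below is one way to check for that right after loading. The three global statements are stand-ins for what the config file would set, using example values taken from this document, and n_missing is a hypothetical counter.

```stata
// Stand-in values; in the pipeline these come from the config file
global output_root "_WorkSpace/1-CMSStore"
global year_start 2015
global year_end 2020

local n_missing = 0
foreach g in output_root year_start year_end {
    if "${`g'}" == "" {
        display as error "❌ ERROR: Config did not set global `g'"
        local ++n_missing
    }
}
if `n_missing' > 0 {
    exit 198
}
else {
    display "✓ Config globals set: output_root year_start year_end"
}
// case 1 LOG: ❌ ERROR: Config did not set global <name> (one line per missing global, then exits)
// case 2 LOG: ✓ Config globals set: output_root year_start year_end
```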

Assertions After Critical Operations

Use assertions to verify operations succeeded.

stata

capture mkdir "${log_dir}"
assert _rc == 0 | _rc == 693  // Fails if permission denied or other error

capture use "data.dta", clear
assert _rc == 0  // Fails if file not found

Summary Checklist

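When writing Stata code, ensure:

•Every display statement has a LOG comment showing its output
•If-else blocks and run-flag blocks list every outcome as case 1 LOG, case 2 LOG, and so on
•LOG comments show resolved variable values, not macro names
•Loops show the first iterations and summarize the rest
•Blocks that print nothing carry a (no output) note
•Error messages are a single concise line with an emoji marker
•Paths are displayed before directories are created
•Critical operations are followed by an assert on _rc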

Why This Matters

Comprehensive LOG documentation allows you to:

•Verify correctness: Compare actual log output against expected LOG comments
•Debug faster: Know exactly what should appear at each step
•Understand flow: See all possible execution paths (case 1, case 2, etc.)
•Catch errors: Assertions fail immediately if something goes wrong
•Document behavior: Code is self-documenting with clear output expectations
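
To compare a log against its expectations, it helps to pull the LOG comments out of a do-file first. The sketch below is one way to do that with Stata's file commands. It writes a tiny stand-in do-file to a tempfile so it can run on its own; in practice the path of a real do-file would replace demo_do. The names demo_do, fh, and n_log are hypothetical.

```stata
// Build a tiny stand-in do-file so the sketch is self-contained
tempfile demo_do
tempname fh
file open `fh' using "`demo_do'", write text replace
file write `fh' "local myvar = 5" _n
file write `fh' "// LOG: Creating output directories..." _n
file close `fh'

// Print every line that carries an expected-output comment
local n_log = 0
file open `fh' using "`demo_do'", read text
file read `fh' line
while r(eof) == 0 {
    if strpos(`"`macval(line)'"', "// LOG:") > 0 {
        display `"`macval(line)'"'
        local ++n_log
    }
    file read `fh' line
}
file close `fh'
display "Expected-output lines found: `n_log'"
// LOG: // LOG: Creating output directories...
// LOG: Expected-output lines found: 1
```

The macval() wrapper keeps macro references inside the scanned lines, such as ${run_extract}, from being expanded while the file is read.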