Stata Coding Standards with LOG Documentation
Purpose
This skill defines how to write Stata code with comprehensive logging documentation to make it easy to verify that actual results match expected output.
Core Principles
- •Every block that produces output must have a LOG comment
- •LOG comments show what will appear in the actual log file
- •LOG comments go immediately after the corresponding code block
- •Use emojis for visual clarity (✓, ❌)
- •Keep error messages short and concise
Rules for LOG Comments
Rule 1: Single Display Statement
Every display statement gets a LOG comment showing the exact output.
display "Creating output directories..." // LOG: Creating output directories... display "Pipeline start time: " c(current_time) // LOG: Pipeline start time: 14:32:15
Rule 2: If-Else Blocks (Multiple Cases)
Use // case 1 LOG: and // case 2 LOG: for different outcomes.
if "`config_file'" == "" {
display as error "❌ ERROR: Config file required! Usage: stata-mp -b do script.do <config>"
exit 198
}
else {
display "✓ Config file: `config_file'"
}
// case 1 LOG: ❌ ERROR: Config file required! (and exits)
// case 2 LOG: ✓ Config file: config_production_2015-2020
Rule 3: Variable Display with Examples
Show resolved variable values in LOG comments.
display " Creating: ${output_root}"
capture mkdir "${output_root}"
assert _rc == 0 | _rc == 693
// LOG: Creating: _WorkSpace/1-CMSStore
Rule 4: Loops - Show Iteration Pattern
For loops, show what each iteration produces.
forvalues year = ${year_start}/${year_end} {
display ""
display "########## YEAR `year' ##########"
// LOG:
// LOG: ########## YEAR 2015 ##########
// LOG: ########## YEAR 2016 ##########
// LOG: (continues for each year)
}
Rule 5: Conditional Blocks (Run Flags)
When execution depends on flags, show both cases.
if ${run_extract} {
display "STAGE 1: Extract"
capture noisily do "scripts/extract.do" `year'
}
// case 1 LOG: (skipped - ${run_extract} = 0)
// case 2 LOG: STAGE 1: Extract (followed by output from extract.do)
Rule 6: Foreach Loops
Show example output for each iteration.
local log_files: dir "${log_dir_cms}" files "${config_name}*.log"
foreach logfile of local log_files {
copy "${log_dir_cms}/`logfile'" "${log_dir_archive}/`logfile'", replace
display " Copied: `logfile'"
}
// LOG: Copied: config_production_2015-2020-master.log
// LOG: Copied: config_production_2015-2020-extract-2015.log
// LOG: (one line per log file copied)
Rule 7: Multi-Line Display Blocks
Group related displays together with one LOG comment.
display "" display "==========================================" display "Archiving logs to 0-Logging-Store..." display "==========================================" // LOG: (blank line) // LOG: ========================================== // LOG: Archiving logs to 0-Logging-Store... // LOG: ==========================================
Rule 8: No Output Blocks
Note when blocks don't produce log output.
capture mkdir "${temp_dir}"
// (no output - capture suppresses display)
log using "${log_file}", replace text
// (opens log file - no output to console)
local myvar = 5
// (no output - local assignment)
Code Structure Standards
Error Handling - Keep It Short
❌ BAD - Too verbose:
if "`config_file'" == "" {
display as error "ERROR: Config file required!"
display as error ""
display as error "Usage (from code/ directory):"
display as error " cd code"
display as error " stata-mp -b do script.do <config_name>"
display as error ""
display as error "Example:"
display as error " stata-mp -b do script.do config_production"
exit 198
}
✓ GOOD - Concise with emoji:
if "`config_file'" == "" {
display as error "❌ ERROR: Config file required! Usage: stata-mp -b do script.do <config>"
exit 198
}
else {
display "✓ Config file: `config_file'"
}
// case 1 LOG: ❌ ERROR: Config file required! (and exits)
// case 2 LOG: ✓ Config file: config_production
Directory Creation - Show Paths
Always display the path before creating directories, then assert success.
display " Creating: ${output_root}"
capture mkdir "${output_root}"
assert _rc == 0 | _rc == 693 // 0=created, 693=already exists
// LOG: Creating: _WorkSpace/1-CMSStore
Configuration Loading - Show What's Set
Document what global variables the config file sets.
// Load configuration file
// Sets globals: ${output_root}, ${cms_store}, ${year_start}, ${year_end}, etc.
display "=========================================="
display "CMS DATA PREPARATION PIPELINE"
display "Loading config: `config_file'"
display "=========================================="
// LOG: ==========================================
// LOG: CMS DATA PREPARATION PIPELINE
// LOG: Loading config: config_production_2015-2020
// LOG: ==========================================
capture do "config\`config_file'.do"
if _rc != 0 {
display as error "ERROR: Could not load config\`config_file'.do"
exit 198
}
// case 1 LOG: (nothing - config loaded successfully)
// case 2 LOG: ERROR: Could not load config\config_production_2015-2020.do (and exits)
Assertions After Critical Operations
Use assertions to verify operations succeeded.
capture mkdir "${log_dir}"
assert _rc == 0 | _rc == 693 // Fails if permission denied or other error
capture use "data.dta", clear
assert _rc == 0 // Fails if file not found
Summary Checklist
When writing Stata code, ensure:
- • Every display statement has a LOG comment
- • If-else blocks use
case 1 LOG:andcase 2 LOG: - • Variables like ${var} show example resolved values in LOG
- • Loops show iteration patterns in LOG
- • Error messages are short with emojis (❌, ✓)
- • Directory creation displays the full path first
- • Assertions verify critical operations
- • LOG comments appear immediately after their corresponding block
- • Multi-line outputs are documented in LOG comments
- • Blocks with no output are noted
Why This Matters
Comprehensive LOG documentation allows you to:
- •Verify correctness: Compare actual log output against expected LOG comments
- •Debug faster: Know exactly what should appear at each step
- •Understand flow: See all possible execution paths (case 1, case 2, etc.)
- •Catch errors: Assertions fail immediately if something goes wrong
- •Document behavior: Code is self-documenting with clear output expectations