---
name: education-data-source-eada
description: >-
  Equity in Athletics Disclosure Act (EADA) data for college athletics gender
  equity analysis. Use when analyzing athletic participation, coaching staff,
  salaries, expenses, or revenues at colleges/universities, or understanding
  Title IX context in athletics. EADA is NOT Title IX compliance data.
metadata:
  audience: data-analysts
  domain: education-data
---
EADA Data Source Reference
EADA reporting produces the only standardized, publicly available dataset on college athletics participation, coaching, finances, and athletic aid by gender. It covers roughly 2,000 postsecondary institutions and supports gender equity analysis in intercollegiate athletics.
CRITICAL: Value Encoding
EADA data from the Education Data Portal uses integer codes for categorical
variables. Original EADA web tools use string labels; the Portal converts these
to integers. Always verify codes against the codebook (see Truth Hierarchy below).
| Context | ath_classification_code | Missing values |
|---|---|---|
| Portal (integers) | 1 = NCAA DI FBS | -1, -2, -3 |
| Original EADA | String labels | Blank / N/A |
Note: There is no sector column in EADA Portal data. To filter by sector,
join with IPEDS directory data on unitid.
See ./references/variable-definitions.md for complete encoding tables.
What is EADA?
- Collector: U.S. Department of Education (Office of Postsecondary Education)
- Coverage: roughly 2,000 coeducational postsecondary institutions with intercollegiate athletics
- Mandate: Institutions participating in Title IV aid with athletic programs must report
- Frequency: Annual (data publicly available by October 15 each year)
- Available years: 2002–2021 (Portal mirror)
- Primary identifier: unitid (6-digit IPEDS institution ID)
- Content: Athletic participation, coaching staff, salaries, expenses, revenues, and athletic aid, all reported by gender
Reference File Structure
| File | Purpose | When to Read |
|---|---|---|
| title-ix-context.md | Legal framework, gender equity requirements | Understanding policy context |
| data-elements.md | Participation, coaches, salaries, expenses, revenues | Identifying available variables |
| sport-level-data.md | Data available by individual sport | Sport-specific analysis |
| variable-definitions.md | Key variables, codes, special values | Interpreting specific data elements |
| limitations.md | Data quality issues, comparability, self-reporting caveats | Assessing data reliability |
| fetch-patterns.md | Mirror URLs and fetch code patterns | Fetching data |
Decision Trees
What analysis am I conducting?
Research question?
├─ Gender equity overview → Start with participation + aid ratios
│ └─ See ./references/data-elements.md
├─ Coaching disparities → Coach counts + salaries by gender
│ └─ See ./references/data-elements.md (Coaching section)
├─ Financial investment → Expenses + revenues by team gender
│ └─ See ./references/data-elements.md (Financial section)
├─ Sport-specific analysis → Individual sport data
│ └─ See ./references/sport-level-data.md
├─ Title IX compliance assessment → CAUTION: EADA ≠ compliance data
│ └─ See ./references/limitations.md (Critical)
└─ Trend analysis → Year-over-year comparisons
   └─ See ./references/fetch-patterns.md
What variables do I need?
Variable categories?
├─ Participation counts
│ ├─ Unduplicated by gender → `undup_athpartic_men`, `undup_athpartic_women`
│ ├─ Duplicated (sport-level sum) → `athpartic_men`, `athpartic_women`
│ ├─ Coed teams → `athpartic_coed_men`, `athpartic_coed_women`
│ └─ By sport → See ./references/sport-level-data.md
├─ Coaching
│ ├─ Head coaches → `men_fthdcoach_*`, `women_fthdcoach_*` variables
│ ├─ Assistant coaches → `men_ftascoach_*`, `women_ftascoach_*` variables
│ └─ Salaries → `hdcoach_salary_*`, `ascoach_salary_*` variables
├─ Financial
│ ├─ Expenses → `ath_exp_*` variables
│ ├─ Revenues → `ath_rev_*` variables
│ └─ Athletic aid → `ath_stuaid_*` variables
└─ Detailed definitions → See ./references/variable-definitions.md
How do I interpret the data?
Interpretation question?
├─ What counts as "participation"?
│ └─ See ./references/variable-definitions.md
├─ Why don't participation ratios match enrollment?
│ └─ See ./references/limitations.md
├─ Is this institution Title IX compliant?
│ └─ CANNOT determine from EADA data alone
│    └─ See ./references/limitations.md (Critical)
├─ Why are some values missing or zero?
│ └─ See ./references/limitations.md
└─ How do I compare across institutions?
   └─ See ./references/limitations.md (Comparability section)
Quick Reference: Key Metrics
Participation Equity Indicators
| Metric | Calculation | Interpretation |
|---|---|---|
| Female participation ratio | undup_athpartic_women / (undup_athpartic_men + undup_athpartic_women) | Compare to female enrollment ratio |
| Participation gap | Female enrollment % - Female participation % | Positive = underrepresentation |
| Opportunities per student | undup_athpartic_total / enrollment_total | Athletic opportunity rate |
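As a minimal sketch of the first two metrics in the table, the plain-Python helpers below compute the female participation ratio and the participation gap for one institution. The function names and the sample numbers are illustrative assumptions, not part of EADA or the Portal. Negative inputs are treated as the Portal's missing codes and yield None rather than a ratio.

```python
def female_participation_ratio(men, women):
    """Women's share of unduplicated participants, or None if not computable."""
    # Negative values are Portal missing codes (-1, -2, -3), not real counts.
    if men is None or women is None or men < 0 or women < 0:
        return None
    total = men + women
    return women / total if total > 0 else None


def participation_gap(female_enrollment_share, female_participation_share):
    """Enrollment share minus participation share; positive = underrepresentation."""
    if female_enrollment_share is None or female_participation_share is None:
        return None
    return female_enrollment_share - female_participation_share


# Made-up numbers for illustration only (not real EADA or IPEDS values).
ratio = female_participation_ratio(men=300, women=300)
gap = participation_gap(female_enrollment_share=0.5625,
                        female_participation_share=ratio)
print(ratio, gap)
```

The enrollment share has to come from a separate source such as IPEDS, since the calculation compares athletics participation against the student body.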
Financial Equity Indicators
| Metric | Calculation | Notes |
|---|---|---|
| Aid ratio | ath_stuaid_women / (ath_stuaid_men + ath_stuaid_women) | Should approximate participation ratio |
| Per-participant expense | ath_opexp_perpart_men, ath_opexp_perpart_women | Pre-calculated per-participant operating expense |
| Recruiting investment | recruitexp_men, recruitexp_women | Indicator of program investment |
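The aid ratio is most informative next to the participation ratio it is expected to approximate. The sketch below, in plain Python with hypothetical helper names and a made-up row, computes both shares and their difference. It assumes the row uses the column names listed in the tables above.

```python
def share(women, men):
    """Women's share of a two-part total; None if a part is missing or coded."""
    if women is None or men is None or women < 0 or men < 0:
        return None
    total = women + men
    return women / total if total > 0 else None


def aid_vs_participation(row):
    """Return aid share, participation share, and their difference for one row."""
    aid = share(row["ath_stuaid_women"], row["ath_stuaid_men"])
    part = share(row["undup_athpartic_women"], row["undup_athpartic_men"])
    diff = None if aid is None or part is None else aid - part
    return {"aid_share": aid, "participation_share": part, "difference": diff}


# Made-up row for illustration only (not real EADA values).
example = aid_vs_participation({
    "ath_stuaid_women": 1_000_000, "ath_stuaid_men": 3_000_000,
    "undup_athpartic_women": 200, "undup_athpartic_men": 200,
})
print(example)
```

A negative difference means women receive a smaller share of athletic aid than their share of participants. It is a descriptive signal only, not a compliance finding.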
Coaching Equity Indicators
| Metric | Focus | Variables |
|---|---|---|
| Female coaches of women's teams | % female | women_fthdcoach_fem, women_pthdcoach_fem |
| Salary equity | Avg salary comparison | hdcoach_salary_men, hdcoach_salary_women |
Key Identifiers
| ID | Format | Level | Example | Notes |
|---|---|---|---|---|
| unitid | 6-digit integer | Institution | 110635 | Same as IPEDS; primary join key |
| opeid | String | Institution | "00123400" | OPE ID (may be null for early years) |
| year | 4-digit integer | Reporting year | 2021 | Fiscal year ending |
| fips | Integer | State | 6 (California) | Federal FIPS code |
| inst_name | String | Institution | "University of..." | Institution name |
Common Filters
| Filter | Variable | Example Values |
|---|---|---|
| Institution | unitid | 6-digit IPEDS ID |
| Year | year | 2002–2021 |
| State | fips | Integer FIPS code (e.g., 6 = California) |
| Athletic Division | ath_classification_code | Integer codes 1–20 (see below) |
Note: There is no sector column in the EADA Portal data. To filter by institutional sector, join with IPEDS directory data on unitid.
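The join described in the note can be sketched in plain Python as a lookup keyed on unitid. The rows are made up, and the `sector` column name on the IPEDS side is an assumption to check against the actual IPEDS directory file.

```python
# Made-up rows for illustration; real rows come from the EADA and IPEDS files.
eada_rows = [
    {"unitid": 110635, "year": 2021, "ath_classification_code": 1},
    {"unitid": 999999, "year": 2021, "ath_classification_code": 7},
]
# Assumption: the IPEDS directory exposes an integer column named "sector".
ipeds_directory = [{"unitid": 110635, "sector": 1}]

# Build a lookup from institution ID to sector.
sector_by_unitid = {row["unitid"]: row["sector"] for row in ipeds_directory}

# Left join: keep every EADA row; sector is None when unitid has no match.
joined = [{**row, "sector": sector_by_unitid.get(row["unitid"])}
          for row in eada_rows]

# Institutions without a directory match are worth reviewing before filtering.
unmatched = [row["unitid"] for row in joined if row["sector"] is None]
print(unmatched)
```

With dataframes the same idea is a left join on unitid. Keeping unmatched rows visible avoids silently dropping institutions when a sector filter is applied afterwards.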
Athletic Classification Codes
| Code | Division | Code | Division |
|---|---|---|---|
| 1 | NCAA Division I FBS | 12 | NJCAA Division I |
| 2 | NCAA Division I FCS | 13 | NJCAA Division II |
| 3 | NCAA Division I (no football) | 14 | NJCAA Division III |
| 4 | NCAA Division II (with football) | 15 | NCCAA Division I |
| 5 | NCAA Division II (no football) | 16 | NCCAA Division II |
| 6 | NCAA Division III (with football) | 17 | CCCAA |
| 7 | NCAA Division III (no football) | 18 | Independent |
| 8 | Other (check ath_classification_other) | 19 | NWAC |
| 9 | NAIA Division I | 20 | USCAA |
| 10 | NAIA Division II | | |
| 11 | NAIA Division III | | |
Note: In earlier years, code 1 was labeled "NCAA Division I-A" and code 2 "NCAA Division I-AA". The ath_classification_name string column reflects the label used at the time of reporting.
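For convenience, the table above can be carried as a lookup. This is a sketch transcribed from this document, which sits at the lowest priority in the Truth Hierarchy, so treat the labels as a starting point and verify them against the codebook. The names `ATH_CLASSIFICATION`, `NCAA_DIVISION_I`, and `classification_label` are hypothetical.

```python
# Labels transcribed from the classification table in this document.
ATH_CLASSIFICATION = {
    1: "NCAA Division I FBS", 2: "NCAA Division I FCS",
    3: "NCAA Division I (no football)", 4: "NCAA Division II (with football)",
    5: "NCAA Division II (no football)", 6: "NCAA Division III (with football)",
    7: "NCAA Division III (no football)", 8: "Other",
    9: "NAIA Division I", 10: "NAIA Division II", 11: "NAIA Division III",
    12: "NJCAA Division I", 13: "NJCAA Division II", 14: "NJCAA Division III",
    15: "NCCAA Division I", 16: "NCCAA Division II", 17: "CCCAA",
    18: "Independent", 19: "NWAC", 20: "USCAA",
}

# Convenience grouping: all three NCAA Division I codes.
NCAA_DIVISION_I = {1, 2, 3}


def classification_label(code):
    """Return the documented label, or a flag for codes not in the table."""
    return ATH_CLASSIFICATION.get(code, f"Unknown code {code}")


print(classification_label(17), classification_label(99))
```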
Missing Data Codes
| Code | Meaning | When Used |
|---|---|---|
| -1 | Missing/not reported | Data not submitted by institution |
| -2 | Not applicable | Item doesn't apply (e.g., no men's team) |
| -3 | Suppressed | Data suppressed for privacy |
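One way to keep these codes out of arithmetic is to convert them to None while recording why the value was absent. The sketch below is plain Python with hypothetical names (`MISSING_CODES`, `decode_value`) and a made-up row. It assumes the three codes in the table are the only negative sentinels in the data.

```python
# Portal missing-value codes as listed in the table above.
MISSING_CODES = {
    -1: "missing/not reported",
    -2: "not applicable",
    -3: "suppressed",
}


def decode_value(value):
    """Return (clean_value, reason); clean_value is None for a missing code."""
    if value in MISSING_CODES:
        return None, MISSING_CODES[value]
    return value, None


# Made-up row for illustration only.
row = {"unitid": 110635, "undup_athpartic_men": 412, "undup_athpartic_women": -2}
decoded = {name: decode_value(value)[0] for name, value in row.items()}
print(decoded)
```

Keeping the reason alongside the cleaned value preserves the distinction between "not applicable" and "not reported", which matters when interpreting zeros and gaps.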
Data Availability
| Topic | Years Available | Update Frequency |
|---|---|---|
| Institution-level | 2002–2021 | Annual |
| Sport-level | 2002–2021 | Annual |
| Coaching details | 2002–2021 | Annual |
| Financial data | 2002–2021 | Annual |
Note: Some columns (e.g., num_sports, aggregated totals with _all suffix) are null for earlier years (2002) and were added in later reporting cycles. The opeid column is null for 2002.
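Before running a multi-year analysis, it can help to check in which years a column is actually populated. The following plain-Python sketch computes the share of null values per year for one column. The function name and sample rows are made up for illustration, and it assumes nulls are loaded as None.

```python
from collections import defaultdict


def null_share_by_year(rows, column):
    """Fraction of rows per year where `column` is None (or absent)."""
    counts = defaultdict(lambda: [0, 0])  # year -> [null_count, row_count]
    for row in rows:
        tally = counts[row["year"]]
        tally[1] += 1
        if row.get(column) is None:
            tally[0] += 1
    return {year: nulls / total for year, (nulls, total) in sorted(counts.items())}


# Made-up rows for illustration only.
sample = [
    {"year": 2002, "num_sports": None},
    {"year": 2002, "num_sports": None},
    {"year": 2021, "num_sports": 18},
    {"year": 2021, "num_sports": None},
]
coverage = null_share_by_year(sample, "num_sports")
print(coverage)
```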
Example Research Questions
| Question | Key Variables | Reference |
|---|---|---|
| Are women underrepresented in athletics? | undup_athpartic_*, enrollment_* | data-elements.md |
| How much do institutions invest in women's sports? | ath_exp_*, ath_rev_* | data-elements.md |
| Are coaches of women's teams paid fairly? | hdcoach_salary_* | variable-definitions.md |
| Which sports have most female participants? | Sport-level data | sport-level-data.md |
| Has participation equity improved over time? | Multi-year trend | fetch-patterns.md |
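For the trend question in the last row, a minimal plain-Python sketch is shown here. It pools unduplicated counts across institutions within each year and skips rows with coded missing values. The function name and sample rows are hypothetical.

```python
from collections import defaultdict


def female_share_by_year(rows):
    """Pooled women's share of unduplicated participants for each year."""
    totals = defaultdict(lambda: [0, 0])  # year -> [women, men + women]
    for row in rows:
        men = row["undup_athpartic_men"]
        women = row["undup_athpartic_women"]
        # Skip nulls and Portal missing codes (-1, -2, -3).
        if men is None or women is None or men < 0 or women < 0:
            continue
        totals[row["year"]][0] += women
        totals[row["year"]][1] += men + women
    return {y: w / t for y, (w, t) in sorted(totals.items()) if t > 0}


# Made-up rows for illustration only.
rows = [
    {"year": 2020, "undup_athpartic_men": 300, "undup_athpartic_women": 100},
    {"year": 2020, "undup_athpartic_men": -1, "undup_athpartic_women": 50},
    {"year": 2021, "undup_athpartic_men": 200, "undup_athpartic_women": 200},
]
trend = female_share_by_year(rows)
print(trend)
```

Pooling counts weights large programs more heavily than averaging per-institution ratios would. Either choice can be defended, but it should be stated alongside the result.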
Data Access
Datasets for EADA are available via the mirror system. All data fetching uses fetch_from_mirrors() from fetch-patterns.md, with mirrors defined in mirrors.yaml and canonical paths in datasets-reference.md.
Key datasets:
| Dataset | Path | Type | Codebook |
|---|---|---|---|
| Institutional Characteristics | eada/colleges_eada_inst_characteristics | Single | eada/codebook_colleges_eada_inst-characteristics |
EADA naming note: The data path uses inst_characteristics (underscores) while the codebook path uses inst-characteristics (hyphens). Always use the exact paths from datasets-reference.md.
Truth Hierarchy
When interpreting EADA variable definitions and coded values, apply this priority:
| Priority | Source | Rationale |
|---|---|---|
| 1 (highest) | Actual data file (parquet) | What you observe IS the truth |
| 2 | Live codebook (.xls via get_codebook_url()) | Authoritative documentation; may lag |
| 3 (lowest) | This skill's reference docs | Summarized; convenient but may drift |
Use get_codebook_url("eada/codebook_colleges_eada_inst-characteristics") from fetch-patterns.md to construct the codebook download URL.
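Applying the hierarchy in practice means comparing what the data contains with what the documentation lists. The sketch below, in plain Python with hypothetical names, flags classification codes observed in a column that are neither in the documented 1–20 range nor one of the missing codes.

```python
# Codes documented in this skill; the data file and codebook take precedence.
DOCUMENTED_CLASSIFICATION_CODES = set(range(1, 21))
MISSING_CODES = {-1, -2, -3}


def undocumented_codes(observed_values):
    """Sorted codes seen in the data that this document does not describe."""
    observed = {value for value in observed_values if value is not None}
    return sorted(observed - DOCUMENTED_CLASSIFICATION_CODES - MISSING_CODES)


# Made-up column values for illustration only.
unexpected = undocumented_codes([1, 2, 2, 17, 21, -1, None])
print(unexpected)
```

Any code this returns is a cue to consult the live codebook rather than to discard the rows, since the data file ranks above this document.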
Filtering
import polars as pl
# Filter by athletic division (NCAA Division I FBS only)
df_d1_fbs = df.filter(pl.col("ath_classification_code") == 1)
# Exclude coded missing values before calculations
df_clean = df.filter(
    (pl.col("undup_athpartic_men") >= 0) &
    (pl.col("undup_athpartic_women") >= 0)
)
# Note: No `sector` column in EADA data. To filter by sector,
# join with IPEDS directory data on unitid first.
Common Pitfalls
| Pitfall | Issue | Solution |
|---|---|---|
| Including coded missing values | -1, -2, -3 treated as real numbers skew totals and ratios | Filter >= 0 on all numeric columns before aggregation |
| Assuming Title IX compliance | EADA data cannot determine Title IX compliance — it is a disclosure tool, not an enforcement mechanism | Read ./references/limitations.md; use EADA for descriptive analysis only |
| Comparing across institutions naively | Different reporting practices, program sizes, and classification levels make raw comparisons misleading | Normalize by enrollment, filter to same classification, and note caveats |
| Using wrong variable names | Portal variable names differ from EADA source documentation (e.g., undup_athpartic_men not partic_men) | Always verify column names against actual data or codebook; see ./references/variable-definitions.md |
| Self-reported data accuracy | Institutions self-report without independent verification; errors and inconsistencies exist | Cross-check outliers against institution websites or IPEDS data |
| Ignoring zero values | Zero may mean "no team" or "not reported" depending on context | Distinguish between true zeros and missing data using -1/-2 codes |
| Assuming sector column exists | EADA data has no sector column | Join with IPEDS directory on unitid to get sector |
EADA vs. Title IX Compliance
| EADA Data | Title IX Compliance |
|---|---|
| Self-reported | OCR investigation |
| Snapshot (Oct 15) | Continuous obligation |
| Participation counts only | Participation + interest + ability |
| No "laundry list" items | 13+ treatment areas |
| Public disclosure | Enforcement mechanism |
Always read: ./references/limitations.md before drawing compliance conclusions.
Key Limitations Summary
- Self-reported: No independent verification
- Counting methods: Differ from Title IX counting
- Not comprehensive: Misses many equity factors
- Comparability issues: Different reporting practices across institutions
Related Data Sources
| Source | Relationship | When to Use |
|---|---|---|
| education-data-source-ipeds | Complementary institution data | Joining enrollment, demographics, finances via unitid |
| education-data-explorer | Parent discovery skill | Finding available endpoints across all sources |
| education-data-query | Data fetching | Downloading parquet/CSV files from mirrors |
Topic Index
| Topic | Reference File |
|---|---|
| Title IX law | ./references/title-ix-context.md |
| Gender equity requirements | ./references/title-ix-context.md |
| Three-prong test | ./references/title-ix-context.md |
| Participation variables | ./references/data-elements.md |
| Coaching variables | ./references/data-elements.md |
| Salary variables | ./references/data-elements.md |
| Expense variables | ./references/data-elements.md |
| Revenue variables | ./references/data-elements.md |
| Athletic aid | ./references/data-elements.md |
| Sport-specific data | ./references/sport-level-data.md |
| Variable definitions | ./references/variable-definitions.md |
| Integer encoding tables | ./references/variable-definitions.md |
| Data limitations | ./references/limitations.md |
| Self-reporting issues | ./references/limitations.md |
| EADA vs Title IX | ./references/limitations.md |
| Fetch patterns | ./references/fetch-patterns.md |
| Mirror URLs | ./references/fetch-patterns.md |