education-data-source-eada

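Equity in Athletics Disclosure Act (EADA) data for gender equity analysis in college athletics. Use when analyzing athletic participation, coaching staff, salaries, expenses, or revenues at colleges/universities, or when you need to understand the Title IX context in athletics. EADA is not Title IX compliance data.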

SKILL.md
---
name: education-data-source-eada
description: >-
  Equity in Athletics Disclosure Act (EADA) data for college athletics gender
  equity analysis. Use when analyzing athletic participation, coaching staff,
  salaries, expenses, or revenues at colleges/universities, or understanding
  Title IX context in athletics. EADA is NOT Title IX compliance data.
metadata:
  audience: data-analysts
  domain: education-data
---

EADA Data Source Reference

The EADA provides the only standardized, publicly available dataset on college athletics participation, coaching, finances, and athletic aid by gender across ~2,000+ postsecondary institutions, enabling gender equity analysis in intercollegiate athletics.

CRITICAL: Value Encoding

EADA data from the Education Data Portal uses integer codes for categorical variables. Original EADA web tools use string labels; the Portal converts these to integers. Always verify codes against the codebook (see Truth Hierarchy below).

| Context | `ath_classification_code` | Missing values |
|---|---|---|
| Portal (integers) | 1 = NCAA DI FBS | -1, -2, -3 |
| Original EADA | String labels | Blank / N/A |

Note: There is no sector column in EADA Portal data. To filter by sector, join with IPEDS directory data on unitid.

See ./references/variable-definitions.md for complete encoding tables.

What is EADA?

  • Collector: U.S. Department of Education (Office of Postsecondary Education)
  • Coverage: ~2,000+ coeducational postsecondary institutions with intercollegiate athletics
  • Mandate: Institutions participating in Title IV aid with athletic programs must report
  • Frequency: Annual (data publicly available by October 15 each year)
  • Available years: 2002–2021 (Portal mirror)
  • Primary identifier: unitid (6-digit IPEDS institution ID)
  • Content: Athletic participation, coaching staff, salaries, expenses, revenues, and athletic aid — all reported by gender

Reference File Structure

| File | Purpose | When to Read |
|---|---|---|
| title-ix-context.md | Legal framework, gender equity requirements | Understanding policy context |
| data-elements.md | Participation, coaches, salaries, expenses, revenues | Identifying available variables |
| sport-level-data.md | Data available by individual sport | Sport-specific analysis |
| variable-definitions.md | Key variables, codes, special values | Interpreting specific data elements |
| limitations.md | Data quality issues, comparability, self-reporting caveats | Assessing data reliability |
| fetch-patterns.md | Mirror URLs and fetch code patterns | Fetching data |

Decision Trees

What analysis am I conducting?

```
Research question?
├─ Gender equity overview → Start with participation + aid ratios
│   └─ See ./references/data-elements.md
├─ Coaching disparities → Coach counts + salaries by gender
│   └─ See ./references/data-elements.md (Coaching section)
├─ Financial investment → Expenses + revenues by team gender
│   └─ See ./references/data-elements.md (Financial section)
├─ Sport-specific analysis → Individual sport data
│   └─ See ./references/sport-level-data.md
├─ Title IX compliance assessment → CAUTION: EADA ≠ compliance data
│   └─ See ./references/limitations.md (Critical)
└─ Trend analysis → Year-over-year comparisons
    └─ See ./references/fetch-patterns.md
```

What variables do I need?

```
Variable categories?
├─ Participation counts
│   ├─ Unduplicated by gender → `undup_athpartic_men`, `undup_athpartic_women`
│   ├─ Duplicated (sport-level sum) → `athpartic_men`, `athpartic_women`
│   ├─ Coed teams → `athpartic_coed_men`, `athpartic_coed_women`
│   └─ By sport → See ./references/sport-level-data.md
├─ Coaching
│   ├─ Head coaches → `men_fthdcoach_*`, `women_fthdcoach_*` variables
│   ├─ Assistant coaches → `men_ftascoach_*`, `women_ftascoach_*` variables
│   └─ Salaries → `hdcoach_salary_*`, `ascoach_salary_*` variables
├─ Financial
│   ├─ Expenses → `ath_exp_*` variables
│   ├─ Revenues → `ath_rev_*` variables
│   └─ Athletic aid → `ath_stuaid_*` variables
└─ Detailed definitions → See ./references/variable-definitions.md
```

How do I interpret the data?

```
Interpretation question?
├─ What counts as "participation"?
│   └─ See ./references/variable-definitions.md
├─ Why don't participation ratios match enrollment?
│   └─ See ./references/limitations.md
├─ Is this institution Title IX compliant?
│   └─ CANNOT determine from EADA data alone
│       └─ See ./references/limitations.md (Critical)
├─ Why are some values missing or zero?
│   └─ See ./references/limitations.md
└─ How do I compare across institutions?
    └─ See ./references/limitations.md (Comparability section)
```

Quick Reference: Key Metrics

Participation Equity Indicators

| Metric | Calculation | Interpretation |
|---|---|---|
| Female participation ratio | `undup_athpartic_women / (undup_athpartic_men + undup_athpartic_women)` | Compare to female enrollment ratio |
| Participation gap | Female enrollment % - Female participation % | Positive = underrepresentation |
| Opportunities per student | `undup_athpartic_total / enrollment_total` | Athletic opportunity rate |
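
The first two metrics in the table can be sketched in plain Python as follows. The function names and the sample figures are hypothetical; the enrollment share is assumed to come from a separate source such as IPEDS, since it is not an EADA participation variable.

```python
from typing import Optional


def female_participation_ratio(men: int, women: int) -> Optional[float]:
    """Share of unduplicated participants who are women, or None if not computable."""
    if men < 0 or women < 0:  # coded missing values (-1, -2, -3)
        return None
    total = men + women
    return women / total if total > 0 else None


def participation_gap_points(
    female_enrollment_share: float, female_participation_share: float
) -> float:
    """Enrollment share minus participation share, in percentage points."""
    return round((female_enrollment_share - female_participation_share) * 100, 1)


# Hypothetical institution: 300 male and 200 female athletes,
# with women making up 55% of enrollment.
ratio = female_participation_ratio(men=300, women=200)
gap = participation_gap_points(0.55, ratio)
print(ratio, gap)  # 0.4 15.0
```

A positive gap, as here, indicates that women are a smaller share of athletes than of enrolled students.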

Financial Equity Indicators

| Metric | Calculation | Notes |
|---|---|---|
| Aid ratio | `ath_stuaid_women / (ath_stuaid_men + ath_stuaid_women)` | Should approximate participation ratio |
| Per-participant expense | `ath_opexp_perpart_men`, `ath_opexp_perpart_women` | Pre-calculated per-participant operating expense |
| Recruiting investment | `recruitexp_men`, `recruitexp_women` | Indicator of program investment |
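
To check whether the aid ratio "approximates" the participation ratio, one option is to report the difference in percentage points. The sketch below assumes a row is available as a plain mapping keyed by the Portal variable names. The helper name and sample figures are hypothetical, and no threshold for "close enough" is implied.

```python
from typing import Mapping, Optional


def aid_vs_participation_points(row: Mapping[str, float]) -> Optional[float]:
    """Female aid share minus female participation share, in percentage points."""
    needed = (
        "ath_stuaid_men",
        "ath_stuaid_women",
        "undup_athpartic_men",
        "undup_athpartic_women",
    )
    values = [row.get(name) for name in needed]
    # Nulls and coded missing values (-1, -2, -3) make the comparison undefined.
    if any(v is None or v < 0 for v in values):
        return None
    aid_men, aid_women, part_men, part_women = values
    if aid_men + aid_women == 0 or part_men + part_women == 0:
        return None
    aid_ratio = aid_women / (aid_men + aid_women)
    participation_ratio = part_women / (part_men + part_women)
    return round((aid_ratio - participation_ratio) * 100, 1)


# Hypothetical row: women receive 40% of aid but are 45% of participants.
example = {
    "ath_stuaid_men": 1_200_000,
    "ath_stuaid_women": 800_000,
    "undup_athpartic_men": 275,
    "undup_athpartic_women": 225,
}
print(aid_vs_participation_points(example))  # -5.0
```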

Coaching Equity Indicators

| Metric | Focus | Variables |
|---|---|---|
| Female coaches of women's teams | % female | `women_fthdcoach_fem`, `women_pthdcoach_fem` |
| Salary equity | Avg salary comparison | `hdcoach_salary_men`, `hdcoach_salary_women` |

Key Identifiers

| ID | Format | Level | Example | Notes |
|---|---|---|---|---|
| `unitid` | 6-digit integer | Institution | 110635 | Same as IPEDS; primary join key |
| `opeid` | String | Institution | "00123400" | OPE ID (may be null for early years) |
| `year` | 4-digit integer | Reporting year | 2021 | Fiscal year ending |
| `fips` | Integer | State | 6 (California) | Federal FIPS code |
| `inst_name` | String | Institution | "University of..." | Institution name |

Common Filters

| Filter | Variable | Example Values |
|---|---|---|
| Institution | `unitid` | 6-digit IPEDS ID |
| Year | `year` | 2002–2021 |
| State | `fips` | Integer FIPS code (e.g., 6 = California) |
| Athletic Division | `ath_classification_code` | Integer codes 1–20 (see below) |

Note: There is no sector column in the EADA Portal data. To filter by institutional sector, join with IPEDS directory data on unitid.

Athletic Classification Codes

| Code | Division | Code | Division |
|---|---|---|---|
| 1 | NCAA Division I FBS | 12 | NJCAA Division I |
| 2 | NCAA Division I FCS | 13 | NJCAA Division II |
| 3 | NCAA Division I (no football) | 14 | NJCAA Division III |
| 4 | NCAA Division II (with football) | 15 | NCCAA Division I |
| 5 | NCAA Division II (no football) | 16 | NCCAA Division II |
| 6 | NCAA Division III (with football) | 17 | CCCAA |
| 7 | NCAA Division III (no football) | 18 | Independent |
| 8 | Other (check `ath_classification_other`) | 19 | NWAC |
| 9 | NAIA Division I | 20 | USCAA |
| 10 | NAIA Division II | | |
| 11 | NAIA Division III | | |

Note: Code 1 was historically labeled "NCAA Division I-A" and code 2 "NCAA Division I-AA" in earlier years. The ath_classification_name string column reflects the label used at the time of reporting.
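
Because the NCAA categories span seven codes, it can help to collapse them into divisions before grouping. The mapping below is a sketch derived from the table above; the names `NCAA_DIVISION_BY_CODE` and `ncaa_division` are hypothetical, and the codes should be confirmed against the live codebook.

```python
from typing import Optional

# Codes 1-7 are the NCAA categories in the classification table.
NCAA_DIVISION_BY_CODE = {1: "I", 2: "I", 3: "I", 4: "II", 5: "II", 6: "III", 7: "III"}


def ncaa_division(code: Optional[int]) -> Optional[str]:
    """Return the NCAA division for a classification code, or None for non-NCAA codes."""
    return NCAA_DIVISION_BY_CODE.get(code)


codes = [1, 3, 5, 9, 18, -1]
print([ncaa_division(c) for c in codes])  # ['I', 'I', 'II', None, None, None]
```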

Missing Data Codes

| Code | Meaning | When Used |
|---|---|---|
| -1 | Missing/not reported | Data not submitted by institution |
| -2 | Not applicable | Item doesn't apply (e.g., no men's team) |
| -3 | Suppressed | Data suppressed for privacy |

Data Availability

| Topic | Years Available | Update Frequency |
|---|---|---|
| Institution-level | 2002–2021 | Annual |
| Sport-level | 2002–2021 | Annual |
| Coaching details | 2002–2021 | Annual |
| Financial data | 2002–2021 | Annual |

Note: Some columns (e.g., num_sports, aggregated totals with _all suffix) are null for earlier years (2002) and were added in later reporting cycles. The opeid column is null for 2002.

Example Research Questions

| Question | Key Variables | Reference |
|---|---|---|
| Are women underrepresented in athletics? | `undup_athpartic_*`, `enrollment_*` | data-elements.md |
| How much do institutions invest in women's sports? | `ath_exp_*`, `ath_rev_*` | data-elements.md |
| Are coaches of women's teams paid fairly? | `hdcoach_salary_*` | variable-definitions.md |
| Which sports have most female participants? | Sport-level data | sport-level-data.md |
| Has participation equity improved over time? | Multi-year trend | fetch-patterns.md |

Data Access

Datasets for EADA are available via the mirror system. All data fetching uses fetch_from_mirrors() from fetch-patterns.md, with mirrors defined in mirrors.yaml and canonical paths in datasets-reference.md.

Key datasets:

| Dataset | Path | Type | Codebook |
|---|---|---|---|
| Institutional Characteristics | `eada/colleges_eada_inst_characteristics` | Single | `eada/codebook_colleges_eada_inst-characteristics` |

EADA naming note: The data path uses inst_characteristics (underscores) while the codebook path uses inst-characteristics (hyphens). Always use the exact paths from datasets-reference.md.

Truth Hierarchy

When interpreting EADA variable definitions and coded values, apply this priority:

| Priority | Source | Rationale |
|---|---|---|
| 1 (highest) | Actual data file (parquet) | What you observe IS the truth |
| 2 | Live codebook (.xls via `get_codebook_url()`) | Authoritative documentation; may lag |
| 3 (lowest) | This skill's reference docs | Summarized; convenient but may drift |

Use get_codebook_url("eada/codebook_colleges_eada_inst-characteristics") from fetch-patterns.md to construct the codebook download URL.
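
In practice, applying the hierarchy often starts with comparing the codes observed in the data against the documented ones. The sketch below is one way to do that in plain Python. The helper name is hypothetical, and the documented set (1–20 plus the three missing codes) is taken from this skill's own tables, the lowest-priority source.

```python
from typing import Iterable, List, Optional

# Codes documented in this reference: classifications 1-20 and the missing codes.
DOCUMENTED_CLASSIFICATION_CODES = set(range(1, 21))
MISSING_CODES = {-1, -2, -3}


def undocumented_codes(observed: Iterable[Optional[int]]) -> List[int]:
    """Return observed codes that this reference does not document, sorted."""
    seen = {value for value in observed if value is not None}
    return sorted(seen - DOCUMENTED_CLASSIFICATION_CODES - MISSING_CODES)


# Hypothetical values of `ath_classification_code` read from the data file.
observed = [1, 2, 2, 7, 21, -1, None]
print(undocumented_codes(observed))  # [21]
```

Any code returned here means the data disagrees with these docs; per the hierarchy, treat the data as correct and consult the live codebook for its meaning.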

Filtering

```python
import polars as pl

# `df` is assumed to be the EADA institutional characteristics table,
# already loaded as a polars DataFrame.

# Filter by athletic division (NCAA Division I FBS only)
df_d1_fbs = df.filter(pl.col("ath_classification_code") == 1)

# Exclude coded missing values (-1, -2, -3) before calculations;
# rows with nulls in these columns are dropped as well.
df_clean = df.filter(
    (pl.col("undup_athpartic_men") >= 0) &
    (pl.col("undup_athpartic_women") >= 0)
)

# Note: No `sector` column in EADA data. To filter by sector,
# join with IPEDS directory data on unitid first.
```

Common Pitfalls

| Pitfall | Issue | Solution |
|---|---|---|
| Including coded missing values | -1, -2, -3 treated as real numbers skew totals and ratios | Filter `>= 0` on all numeric columns before aggregation |
| Assuming Title IX compliance | EADA data cannot determine Title IX compliance; it is a disclosure tool, not an enforcement mechanism | Read ./references/limitations.md; use EADA for descriptive analysis only |
| Comparing across institutions naively | Different reporting practices, program sizes, and classification levels make raw comparisons misleading | Normalize by enrollment, filter to same classification, and note caveats |
| Using wrong variable names | Portal variable names differ from EADA source documentation (e.g., `undup_athpartic_men` not `partic_men`) | Always verify column names against actual data or codebook; see ./references/variable-definitions.md |
| Self-reported data accuracy | Institutions self-report without independent verification; errors and inconsistencies exist | Cross-check outliers against institution websites or IPEDS data |
| Ignoring zero values | Zero may mean "no team" or "not reported" depending on context | Distinguish between true zeros and missing data using -1/-2 codes |
| Assuming sector column exists | EADA data has no sector column | Join with IPEDS directory on `unitid` to get sector |
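
For the zero-value pitfall, a quick tally of how a column's values split between zeros, missing codes, and real counts can show whether zeros are likely to be genuine. This is a plain-Python sketch with a hypothetical helper name and sample values; the code meanings follow the Missing Data Codes table.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

CODE_LABELS = {-1: "missing", -2: "not_applicable", -3: "suppressed"}


def audit_values(values: Iterable[Optional[int]]) -> Dict[str, int]:
    """Count nulls, coded missing values, zeros, and positive values in a column."""
    counts: Counter = Counter()
    for value in values:
        if value is None:
            counts["null"] += 1
        elif value in CODE_LABELS:
            counts[CODE_LABELS[value]] += 1
        elif value == 0:
            counts["zero"] += 1
        elif value > 0:
            counts["positive"] += 1
        else:
            counts["other_negative"] += 1  # unexpected; check the codebook
    return dict(counts)


# Hypothetical participation column.
sample = [120, 0, -1, -2, 0, None, 85]
print(audit_values(sample))
```

If a column shows zeros but never -2, the zeros may be standing in for "not applicable", which is worth checking before treating them as real counts.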

EADA vs. Title IX Compliance

```
EADA Data                          Title IX Compliance
──────────────────────────────────────────────────────────
Self-reported                      OCR investigation
Snapshot (Oct 15)                  Continuous obligation
Participation counts only          Participation + interest + ability
No "laundry list" items            13+ treatment areas
Public disclosure                  Enforcement mechanism
```

Always read: ./references/limitations.md before drawing compliance conclusions.

Key Limitations Summary

  • Self-reported: No independent verification
  • Counting methods: Differ from Title IX counting
  • Not comprehensive: Misses many equity factors
  • Comparability issues: Different reporting practices across institutions

Related Data Sources

| Source | Relationship | When to Use |
|---|---|---|
| education-data-source-ipeds | Complementary institution data | Joining enrollment, demographics, finances via `unitid` |
| education-data-explorer | Parent discovery skill | Finding available endpoints across all sources |
| education-data-query | Data fetching | Downloading parquet/CSV files from mirrors |

Topic Index

| Topic | Reference File |
|---|---|
| Title IX law | ./references/title-ix-context.md |
| Gender equity requirements | ./references/title-ix-context.md |
| Three-prong test | ./references/title-ix-context.md |
| Participation variables | ./references/data-elements.md |
| Coaching variables | ./references/data-elements.md |
| Salary variables | ./references/data-elements.md |
| Expense variables | ./references/data-elements.md |
| Revenue variables | ./references/data-elements.md |
| Athletic aid | ./references/data-elements.md |
| Sport-specific data | ./references/sport-level-data.md |
| Variable definitions | ./references/variable-definitions.md |
| Integer encoding tables | ./references/variable-definitions.md |
| Data limitations | ./references/limitations.md |
| Self-reporting issues | ./references/limitations.md |
| EADA vs Title IX | ./references/limitations.md |
| Fetch patterns | ./references/fetch-patterns.md |
| Mirror URLs | ./references/fetch-patterns.md |