Engage Indicators
Manage indicators and metrics in the Engage Analytics dbt project at /Volumes/Biliba/github/engage-analytics/dbt.
How can I help?
Ask the user which task they need help with:
- •Get Started (New User Setup) - Install prerequisites, configure dbt, set up database connection
- •Add a New Questionnaire - Generate models for a new form added to the app
- •Update Anonymization - Mark fields as PII/non-PII in questionnaire metadata
- •Create a New Metric - Add a new indicator to the metrics catalog
- •Modify an Existing Metric - Change how an existing metric is calculated
- •Export Data to S3 - Export anonymized or PII data to S3 buckets
- •Understand Existing Indicators - Query and explore current metrics in the system
Getting Started (New User Setup)
Prerequisites
- •Install uv (Python package manager):
curl -LsSf https://astral.sh/uv/install.sh | sh
- •Install direnv (optional, for auto-loading env vars):
# macOS brew install direnv # Add to shell (bash) echo 'eval "$(direnv hook bash)"' >> ~/.bashrc # Add to shell (zsh) echo 'eval "$(direnv hook zsh)"' >> ~/.zshrc
dbt Setup
cd /Volumes/Biliba/github/engage-analytics/dbt # Install dbt-postgres via uv uv tool install dbt-postgres # Install dbt package dependencies uv run dbt deps --profiles-dir .
Database Configuration
Create environment variables for database connection. Choose one method:
Option A: Using direnv (recommended)
cd /Volumes/Biliba/github/engage-analytics/dbt cp .envrc.example .envrc # or edit existing .envrc direnv allow
Option B: Manual export
export DBT_HOST=localhost export DBT_PORT=5432 export DBT_USER=postgres export DBT_PASSWORD=your_password export DBT_DATABASE=airbyte export DBT_SCHEMA=engage_analytics
Edit values to match your PostgreSQL instance.
Verify Setup
cd /Volumes/Biliba/github/engage-analytics/dbt # Test database connection uv run dbt debug --profiles-dir . # Run all models uv run dbt run --profiles-dir .
Data Export Setup (Optional)
If you need to export data to S3:
cd /Volumes/Biliba/github/engage-analytics/dataexport # Install Python dependencies uv sync # Configure environment cp .env.example .env # Edit .env with your database and AWS credentials
Quick Reference
- •Indicator specs:
indicators/engage-indicators.csv - •Metrics catalog:
macros/metrics.sql - •Metrics fact table:
engage_analytics.fct_metrics_long - •Project docs:
docs/metrics.md
Scripts (in repo root):
- •
model_generator.py- Generate questionnaire models (named + anon) - •
metadata_manager.py- Manage questionnaire metadata - •
run_dbt.sh- Run dbt commands with env vars
Data export (in dataexport/):
- •
run_export.sh- Refresh dbt + export to S3 - •
export_to_s3.py- Direct S3 export
For full project structure, see references/project-structure.md. For indicator-to-metric mapping, see references/indicator-mapping.md.
Core Capabilities
1. Understand Existing Indicators
Query current metrics:
SELECT metric_id, description, max(value) as latest_value FROM engage_analytics.fct_metrics_long GROUP BY 1, 2 ORDER BY 1;
Check indicator coverage in docs/metrics.md - maps all 32 CSV indicators to 64 dbt metrics.
2. Create New Indicator
Guided workflow - ask user these questions:
- •What is the indicator name and description?
- •What domain? (System Use, Programmatic, Treatment, Follow-up, Adoption)
- •What module? (mwTool, IPC, SBIRT, SPI, FWS, Planning Next Steps, All)
- •What is the data source?
- •Questionnaire ID (e.g.,
Questionnaire/1613532) - •Task code (e.g.,
040for follow-up) - •Existing model (e.g.,
patient,practitioners)
- •Questionnaire ID (e.g.,
- •What linkId or field contains the data? (for questionnaire-based)
- •What is the data type? (Count, Percent)
- •For percent metrics: what is the numerator and denominator?
- •What disaggregation? (by organization, by practitioner, by month)
Implementation steps:
A. If new source model needed, create in models/metrics/:
-- models/metrics/new_metric_source.sql
-- ABOUTME: [What this model does]
-- ABOUTME: [What indicator it supports]
{{ config(materialized='view') }}
select
subject_patient_id,
organization_id,
-- metric-specific fields
from {{ ref('source_model') }}
where conditions
B. Add to metrics catalog in macros/metrics.sql:
- id: new_metric_name unit: count # or percent grain: day entity_keys: [organization_id] source_model: new_metric_source expression: "count(distinct subject_patient_id)" # for count # OR for percent: # numerator: "count(distinct case when condition then subject_patient_id end)" # denominator: "nullif(count(distinct subject_patient_id), 0)" description: "Human-readable description" version: v1
C. Rebuild metrics:
cd /Volumes/Biliba/github/engage-analytics/dbt uv run dbt run --profiles-dir . --select new_metric_source fct_metrics_long
D. Verify:
SELECT * FROM engage_analytics.fct_metrics_long WHERE metric_id = 'new_metric_name';
3. Add New Questionnaire
When a new form is added to the app:
A. Find questionnaire ID in raw data:
SELECT DISTINCT questionnaire_id FROM engage_analytics_engage_analytics_stg.stg_questionnaire_response ORDER BY 1;
B. Extract metadata using metadata_manager.py:
cd /Volumes/Biliba/github/engage-analytics/dbt python3 metadata_manager.py extract --questionnaire-id NEW_ID
This extracts linkIds and adds them to data/questionnaire_metadata.csv.
C. Review and edit metadata in data/questionnaire_metadata.csv:
- •Set
anon=TRUEfor PII fields (names, DOB, phone, address) - •Set
anon=FALSEfor non-PII fields - •Add readable
labelfor each field
D. Generate models using model_generator.py:
# Generate both named and anonymized models python3 model_generator.py all --table qr_new_form --questionnaire-id NEW_ID # Or generate separately: python3 model_generator.py named --table qr_new_form --questionnaire-id NEW_ID python3 model_generator.py anon --table qr_new_form
E. Build:
uv run dbt seed --profiles-dir . # Reload metadata uv run dbt run --profiles-dir . --select qr_new_form qr_new_form_anon
Manual model creation (if scripts unavailable):
Named model (models/marts/qr_named/qr_new_form.sql):
{{ config(materialized='view') }}
{% set identifiers = ["Questionnaire/NEW_ID"] %}
{% if identifiers|length == 0 %}
select null::text as placeholder where false
{% else %}
{{ build_qr_wide_readable(identifiers, this.name) }}
{% endif %}
Anonymized model (models/marts/qr_anon/qr_new_form_anon.sql):
{{ config(materialized='view') }}
{{ create_anonymized_qr_view('qr_new_form', []) }}
4. Update Anonymization
Edit data/questionnaire_metadata.csv:
- •Set
anon=TRUEfor PII fields (names, DOB, phone, address, SSN, Medicaid) - •Set
anon=FALSEfor non-PII fields
Rebuild anonymized view:
uv run dbt seed --profiles-dir . uv run dbt run --profiles-dir . --select qr_*_anon
5. Test Metrics
Verify metric logic matches source data:
-- Get metric value SELECT metric_id, organization_id, value FROM engage_analytics.fct_metrics_long WHERE metric_id = 'metric_name'; -- Verify against source SELECT organization_id, count(distinct subject_patient_id) FROM engage_analytics.source_model GROUP BY 1;
6. Export Data to S3
The dataexport/ directory contains tools for exporting data to S3 buckets.
Full refresh + export (recommended):
cd /Volumes/Biliba/github/engage-analytics/dataexport ./run_export.sh anon # Anonymized data only ./run_export.sh pii # PII data only ./run_export.sh # Both (default)
This runs:
- •
dbt runto refresh all models - •
export_to_s3.pyto export and upload
Direct export (skip dbt refresh):
cd /Volumes/Biliba/github/engage-analytics/dataexport uv run python export_to_s3.py --type anon uv run python export_to_s3.py --type pii uv run python export_to_s3.py --type both uv run python export_to_s3.py --type both --delete-local # Remove local files after upload
Environment setup - create dataexport/.env with:
# Database DBT_HOST=localhost DBT_PORT=5432 DBT_USER=postgres DBT_PASSWORD=your_password DBT_DATABASE=airbyte # AWS AWS_ACCESS_KEY_ID=AKIA... AWS_SECRET_ACCESS_KEY=your_secret_key AWS_REGION=us-east-1 # S3 Buckets S3_BUCKET_ANON=engage-analytics-exports-anon S3_BUCKET_PII=engage-analytics-exports-pii
S3 file structure:
s3://engage-analytics-exports-anon/
└── 2024/01/12/
└── engage_analytics_export_anon_20240112_143022.zip
What gets exported:
- •Anon:
qr_*_anonviews,patient_anon, resource tables - •PII:
qr_*views,patient, resource tables
Common Patterns
Eligibility from mwTool
-- Extract boolean from mwTool (Questionnaire/1613532) (jsonb_path_query_first(items::jsonb, '$.**.item[*] ? (@.linkId == "flag-name").answer[0].valueBoolean'))::boolean
Acceptance from Planning Next Steps
-- Check acceptance field WHERE planning_next_steps_did_the_client_accept_X = 'true'
Session Tracking
-- Count sessions by intervention type SELECT intervention_type, count(distinct qr_id) FROM engage_analytics.intervention_sessions GROUP BY 1;
Assessment Score Severity
- •PHQ-9: Severe 20-27, Moderate 10-19, Mild 5-9, Minimal 0-4
- •GAD-7: Severe 15-21, Moderate 10-14, Mild 5-9, Minimal 0-4