Personal Genomics Skill v4.4.0
Comprehensive local DNA analysis with 1600+ markers across 30 categories. Privacy-first genetic analysis for AI agents.
🆕 v4.4.0: 1000 GENOMES POPULATION COMPARISON & ANCIENT DNA
- •Transparent population comparison (not black-box percentages)
- •Ancient DNA signal detection (WHG, ANF, Yamnaya, Neanderthal)
- •Interactive dashboard with population frequency visualizations
- •Complete methodology documentation
⚠️ v4.3.0 focuses on ACCURACY AND HONESTY - improved uncertainty handling, PMIDs for all claims, and explicit limitations.
Quick Start
python comprehensive_analysis.py /path/to/dna_file.txt
⚠️ Important Limitations
- •Haplogroups are LOW CONFIDENCE - Consumer arrays cannot reliably call haplogroups. Dedicated Y-DNA/mtDNA testing (FTDNA, YFull) is recommended for accuracy.
- •Ancestry shows ANCIENT SIGNALS, not modern ethnicity - Modern ethnicity percentages are unreliable. Instead, the skill detects signals from well-characterized ancient populations (WHG, Neolithic Farmers, Steppe, Neanderthal, Denisovan).
- •PRS results show RANGES, not point estimates - Polygenic risk scores have wide confidence intervals. Most conditions are 50-80% non-genetic.
- •Every marker has PMIDs - All claims are backed by literature citations linked to PubMed.
Triggers
Activate this skill when user mentions:
- •DNA analysis, genetic analysis, genome analysis
- •23andMe, AncestryDNA, MyHeritage results
- •Pharmacogenomics, drug-gene interactions
- •Medication interactions, drug safety
- •Genetic risk, disease risk, health risk
- •Carrier status, carrier testing
- •VCF file analysis
- •APOE, MTHFR, CYP2D6, BRCA, or other gene names
- •Polygenic risk scores
- •Haplogroups, maternal lineage, paternal lineage
- •Ancestry composition, ethnicity
- •Hereditary cancer, Lynch syndrome
- •Autoimmune genetics, HLA, celiac
- •Pain sensitivity, opioid response
- •Sleep optimization, chronotype, caffeine metabolism
- •Dietary genetics, lactose intolerance, celiac
- •Athletic genetics, sports performance
- •UV sensitivity, skin type, melanoma risk
- •Telomere length, longevity genetics
Supported Files
- •23andMe, AncestryDNA, MyHeritage, FTDNA
- •VCF files (whole genome/exome, .vcf or .vcf.gz)
- •Any tab-delimited rsid format
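To illustrate the tab-delimited rsid format, here is a minimal parsing sketch. It assumes 23andMe-style rows (rsid, chromosome, position, genotype) with `#`-prefixed comment lines, and AncestryDNA-style rows that split the genotype into two allele columns. The function name `parse_rsid_lines` is hypothetical and is not part of this skill's API.

```python
from typing import Dict, Iterable


def parse_rsid_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse tab-delimited genotype rows into {rsid: genotype}.

    Assumed layouts (not confirmed by the skill's source):
      4 columns: rsid, chromosome, position, genotype
      5 columns: rsid, chromosome, position, allele1, allele2
    """
    genotypes: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 4 or fields[0].lower() == "rsid":
            continue  # skip malformed rows and the column-name row
        # Join one genotype column, or two allele columns, into one string.
        genotypes[fields[0]] = "".join(fields[3:5])
    return genotypes


sample = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs429358\t19\t45411941\tCT\n"
    "rs7412\t19\t45412079\tCC\n"
)
genotypes = parse_rsid_lines(sample.splitlines())
```

The resulting dictionary has the same shape as the `genotypes` argument used in the examples throughout this document, although the skill's own loader may normalize the data differently.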
Output Location
~/dna-analysis/reports/
- •agent_summary.json - AI-optimized, priority-sorted
- •full_analysis.json - Complete data
- •report.txt - Human-readable
- •genetic_report.pdf - Professional PDF report
- •dashboard.html - Interactive visualization
New v4.3.0 Features (Accuracy Update)
Honest Haplogroup Reporting
- •LOW CONFIDENCE labels on all haplogroup calls
- •Explicit disclaimer that consumer arrays can't reliably call haplogroups
- •Recommendations for dedicated Y-DNA/mtDNA testing services
- •PMIDs for haplogroup marker sources
Ancient Ancestral Signals (Replaces Modern Ethnicity)
- •Western Hunter-Gatherers (WHG) - Mesolithic Europeans (~15,000-8,000 BP)
- •Early European Farmers (EEF) - Neolithic Anatolians (~10,000-5,000 BP)
- •Steppe Pastoralists - Yamnaya/Bronze Age (~5,000-4,000 BP)
- •Neanderthal Introgression - Archaic human (~50,000-40,000 BP)
- •Denisovan Introgression - Archaic human (high-altitude adaptation)
- •Shows "Signals Detected" not percentages
- •Includes time periods and trait contributions
- •Based on ancient DNA studies with PMIDs
PRS with Uncertainty Ranges
- •Percentile RANGES instead of point estimates
- •Confidence intervals based on marker coverage
- •Explicit interpretation guidance ("likely average", "uncertain", etc.)
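One simple way to turn marker coverage into a percentile range is to widen the interval as coverage drops. The sketch below is illustrative only: the function name `percentile_range`, the linear widening rule, and the label thresholds are all assumptions, not the skill's actual method.

```python
def percentile_range(point_percentile, markers_found, markers_total,
                     min_half_width=5.0, max_extra_width=40.0):
    """Return (low, high, label) for a PRS percentile estimate.

    Hypothetical rule: the interval half-width grows linearly as marker
    coverage falls, starting from min_half_width at full coverage.
    """
    if markers_total <= 0 or not 0 <= markers_found <= markers_total:
        raise ValueError("need 0 <= markers_found <= markers_total and markers_total > 0")
    coverage = markers_found / markers_total
    half_width = min_half_width + max_extra_width * (1.0 - coverage)
    low = max(0.0, point_percentile - half_width)
    high = min(100.0, point_percentile + half_width)
    # Label thresholds are illustrative assumptions.
    if high - low > 50.0:
        label = "uncertain"
    elif low >= 80.0:
        label = "likely elevated"
    elif high <= 20.0:
        label = "likely reduced"
    else:
        label = "likely average"
    return low, high, label


full = percentile_range(50, 100, 100)    # all markers genotyped
sparse = percentile_range(50, 20, 100)   # only 20% of markers genotyped
```

With full coverage the range stays narrow and is labelled "likely average"; with 20% coverage the range spans most of the distribution and is labelled "uncertain".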
PMIDs Throughout
- •Every marker has at least one literature citation
- •Clickable PubMed links in dashboard
- •Methodology & Limitations section in dashboard
Legacy v4.0-4.2 Features
Haplogroup Analysis (indicative only)
- •Mitochondrial DNA (mtDNA) - maternal lineage
- •Y-chromosome - paternal lineage (males only)
- •Migration history context
- •PhyloTree/ISOGG standards
Ancient Ancestry (scientifically grounded)
- •Detection of ancient population signals
- •Based on well-characterized ancient DNA
- •Includes archaic introgression (Neanderthal/Denisovan)
Hereditary Cancer Panel
- •BRCA1/BRCA2 comprehensive
- •Lynch syndrome (MLH1, MSH2, MSH6, PMS2)
- •Other genes (APC, TP53, CHEK2, PALB2, ATM)
- •ACMG-style classification
Autoimmune HLA
- •Celiac (DQ2/DQ8) - celiac disease is very unlikely if both are absent
- •Type 1 Diabetes
- •Ankylosing spondylitis (HLA-B27)
- •Rheumatoid arthritis, lupus, MS
Pain Sensitivity
- •COMT Val158Met
- •OPRM1 opioid receptor
- •SCN9A pain signaling
- •TRPV1 capsaicin sensitivity
- •Migraine susceptibility
PDF Reports
- •Professional format
- •Physician-shareable
- •Executive summary
- •Detailed findings
- •Disclaimers included
New v4.1.0 Features
Medication Interaction Checker
from markers.medication_interactions import check_medication_interactions

result = check_medication_interactions(
    medications=["warfarin", "clopidogrel", "omeprazole"],
    genotypes=user_genotypes
)
# Returns critical/serious/moderate interactions with alternatives
- •Accepts brand or generic names
- •CPIC guidelines integrated
- •PubMed citations included
- •FDA warning flags
Sleep Optimization Profile
from markers.sleep_optimization import generate_sleep_profile

profile = generate_sleep_profile(genotypes)
# Returns ideal wake/sleep times, coffee cutoff, etc.
- •Chronotype (morning/evening preference)
- •Caffeine metabolism speed
- •Personalized timing recommendations
Dietary Interaction Matrix
from markers.dietary_interactions import analyze_dietary_interactions

diet = analyze_dietary_interactions(genotypes)
# Returns food-specific guidance
- •Caffeine, alcohol, saturated fat, lactose, gluten
- •APOE-specific diet recommendations
- •Bitter taste perception
Athletic Performance Profile
from markers.athletic_profile import calculate_athletic_profile

profile = calculate_athletic_profile(genotypes)
# Returns power/endurance type, recovery profile, injury risk
- •Sport suitability scoring
- •Training recommendations
- •Injury prevention guidance
UV Sensitivity Calculator
from markers.uv_sensitivity import generate_uv_sensitivity_report

uv = generate_uv_sensitivity_report(genotypes)
# Returns skin type, SPF recommendation, melanoma risk
- •Fitzpatrick skin type estimation
- •Vitamin D synthesis capacity
- •Melanoma risk factors
Natural Language Explanations
from markers.explanations import generate_plain_english_explanation

explanation = generate_plain_english_explanation(
    rsid="rs3892097", gene="CYP2D6", genotype="GA",
    trait="Drug metabolism", finding="Poor metabolizer carrier"
)
- •Plain-English summaries
- •Research variant flagging
- •PubMed links
Telomere & Longevity
from markers.advanced_genetics import estimate_telomere_length

telomere = estimate_telomere_length(genotypes)
# Returns relative estimate with appropriate caveats
- •TERT, TERC, OBFC1 variants
- •Longevity associations (FOXO3, APOE)
Data Quality
- •Call rate analysis
- •Platform detection
- •Confidence scoring
- •Quality warnings
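Call rate is the fraction of markers in the file that received a genotype call. A minimal sketch, assuming no-calls are encoded as `--` (23andMe-style) or `00` (AncestryDNA-style); the set of no-call codes, the function names, and the 95% warning threshold are assumptions, not the skill's documented behavior.

```python
# Assumed no-call encodings; other platforms may use different codes.
NO_CALL_CODES = {"--", "00", ""}


def call_rate(genotypes):
    """Return the fraction of markers with a usable genotype call."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes.values() if g not in NO_CALL_CODES)
    return called / len(genotypes)


def quality_warning(rate, threshold=0.95):
    """Return a warning string when the call rate is below the threshold."""
    if rate < threshold:
        return f"Low call rate ({rate:.1%}); results may be incomplete"
    return None


rate = call_rate({"rs1": "AA", "rs2": "--", "rs3": "CT", "rs4": "GG"})
warning = quality_warning(rate)
```

In this toy input three of four markers are called, so the rate is 0.75 and a warning is produced.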
Export Formats
- •Genetic counselor clinical export
- •Apple Health compatible
- •API-ready JSON
- •Integration hooks
Marker Categories (21 total)
- •Pharmacogenomics (159) - Drug metabolism
- •Polygenic Risk Scores (277) - Disease risk
- •Carrier Status (181) - Recessive carriers
- •Health Risks (233) - Disease susceptibility
- •Traits (163) - Physical/behavioral
- •Haplogroups (44) - Lineage markers
- •Ancestry (124) - Population informative
- •Hereditary Cancer (41) - BRCA, Lynch, etc.
- •Autoimmune HLA (31) - HLA associations
- •Pain Sensitivity (20) - Pain/opioid response
- •Rare Diseases (29) - Rare conditions
- •Mental Health (25) - Psychiatric genetics
- •Dermatology (37) - Skin and hair
- •Vision & Hearing (33) - Sensory genetics
- •Fertility (31) - Reproductive health
- •Nutrition (34) - Nutrigenomics
- •Fitness (30) - Athletic performance
- •Neurogenetics (28) - Cognition/behavior
- •Longevity (30) - Aging markers
- •Immunity (43) - HLA and immune
- •Ancestry AIMs (24) - Admixture markers
Agent Integration
The agent_summary.json provides:
{
  "critical_alerts": [],
  "high_priority": [],
  "medium_priority": [],
  "pharmacogenomics_alerts": [],
  "apoe_status": {},
  "polygenic_risk_scores": {},
  "haplogroups": {
    "mtDNA": {"haplogroup": "H", "lineage": "maternal"},
    "Y_DNA": {"haplogroup": "R1b", "lineage": "paternal"}
  },
  "ancestry": {
    "composition": {},
    "admixture": {}
  },
  "hereditary_cancer": {},
  "autoimmune_risk": {},
  "pain_sensitivity": {},
  "lifestyle_recommendations": {
    "diet": [],
    "exercise": [],
    "supplements": [],
    "avoid": []
  },
  "drug_interaction_matrix": {},
  "data_quality": {}
}
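An agent might read this file and walk the alert lists from most to least urgent. The sketch below uses only the top-level keys shown above. The ordering of `pharmacogenomics_alerts` relative to `high_priority` is an assumption, and the helper names `load_summary` and `collect_alerts` are hypothetical.

```python
import json
from pathlib import Path

# Assumed urgency order; only the key names come from the schema above.
PRIORITY_KEYS = [
    "critical_alerts",
    "pharmacogenomics_alerts",
    "high_priority",
    "medium_priority",
]

DEFAULT_PATH = Path.home() / "dna-analysis" / "reports" / "agent_summary.json"


def load_summary(path=DEFAULT_PATH):
    """Load agent_summary.json from the documented output location."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def collect_alerts(summary):
    """Flatten the alert lists into (source_key, alert) pairs, most urgent first."""
    alerts = []
    for key in PRIORITY_KEYS:
        for item in summary.get(key, []):
            alerts.append((key, item))
    return alerts


# Inline example so the sketch runs without a report on disk.
example = {"medium_priority": ["m1"], "critical_alerts": ["c1"]}
ordered = collect_alerts(example)
```

Missing keys are treated as empty lists, so the same code works on partial summaries.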
Critical Findings (Always Alert User)
Pharmacogenomics
- •DPYD variants - 5-FU/capecitabine FATAL toxicity risk
- •HLA-B*57:01 - Abacavir hypersensitivity
- •HLA-B*15:02 - Carbamazepine-induced SJS/TEN (allele found mainly in East and Southeast Asian populations)
- •MT-RNR1 - Aminoglycoside-induced deafness
Hereditary Cancer
- •BRCA1/BRCA2 pathogenic - Breast/ovarian cancer syndrome
- •Lynch syndrome genes - Colorectal/endometrial cancer
- •TP53 pathogenic - Li-Fraumeni syndrome (multi-cancer)
Disease Risk
- •APOE ε4/ε4 - ~12x Alzheimer's risk
- •Factor V Leiden - Thrombosis risk, contraceptive implications
- •HLA-B27 - Ankylosing spondylitis susceptibility (OR ~70)
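APOE status is conventionally derived from two SNPs: the C allele of rs429358 marks ε4 and the T allele of rs7412 marks ε2, with ε3 carrying neither. The sketch below assumes unphased plus-strand genotypes and ignores the very rare ε1 haplotype. The function name `apoe_genotype` is hypothetical and may not match how the skill fills `apoe_status`.

```python
def apoe_genotype(rs429358, rs7412):
    """Infer the APOE diplotype from unphased plus-strand genotypes.

    Assumes the rare ε1 haplotype (C at rs429358 with T at rs7412 on the
    same chromosome) is absent, so a double heterozygote is read as ε2/ε4.
    """
    calls = rs429358 + rs7412
    if len(rs429358) != 2 or len(rs7412) != 2 or not set(calls) <= {"C", "T"}:
        return "undetermined"  # no-call or unexpected allele
    e4 = rs429358.count("C")  # each C at rs429358 marks one ε4 allele
    e2 = rs7412.count("T")    # each T at rs7412 marks one ε2 allele
    if e4 + e2 > 2:
        return "undetermined"  # only explainable by the rare ε1 haplotype
    alleles = ["ε2"] * e2 + ["ε3"] * (2 - e2 - e4) + ["ε4"] * e4
    return "/".join(alleles)


print(apoe_genotype("CT", "CC"))  # ε3/ε4
print(apoe_genotype("CC", "CC"))  # ε4/ε4
```

Returning "undetermined" for no-calls and impossible combinations keeps a bad read from being reported as a high-risk genotype.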
Carrier Status
- •CFTR - Cystic fibrosis (carrier frequency ~1 in 25 in Europeans)
- •HBB - Sickle cell (carrier frequency ~1 in 12 in African Americans)
- •HEXA - Tay-Sachs (carrier frequency ~1 in 30 in Ashkenazi Jews)
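The carrier frequencies above translate into per-pregnancy risk with simple arithmetic: for an autosomal recessive condition both parents must be carriers, and each then passes on the variant with probability 1/2. A minimal sketch, assuming both partners come from the same population, mate at random, and are not themselves affected; the function name is hypothetical.

```python
from fractions import Fraction


def affected_child_risk(carrier_prob_a, carrier_prob_b):
    """Probability that a child inherits two recessive variants.

    Each carrier parent transmits the variant with probability 1/2,
    so two carrier parents have a 1/4 chance per pregnancy.
    """
    return carrier_prob_a * carrier_prob_b * Fraction(1, 4)


# Neither partner tested, both with a 1-in-25 prior (cystic fibrosis example).
untested = affected_child_risk(Fraction(1, 25), Fraction(1, 25))
print(untested)  # 1/2500

# One partner is a confirmed carrier, the other is untested.
one_known = affected_child_risk(Fraction(1), Fraction(1, 25))
print(one_known)  # 1/100
```

The second case shows why a confirmed carrier result is usually followed by testing the partner: the prior risk rises from 1 in 2500 to 1 in 100.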
Usage Examples
Basic Analysis
from comprehensive_analysis import main

main()  # Uses command line args
Haplogroup Analysis
from markers.haplogroups import analyze_haplogroups

result = analyze_haplogroups(genotypes)
print(result["mtDNA"]["haplogroup"])  # e.g., "H"
Ancestry
from markers.ancestry_composition import get_ancestry_summary

ancestry = get_ancestry_summary(genotypes)
Cancer Panel
from markers.cancer_panel import analyze_cancer_panel

cancer = analyze_cancer_panel(genotypes)
if cancer["pathogenic_variants"]:
    print("ALERT: Pathogenic variants detected")
Generate PDF
from pdf_report import generate_pdf_report

pdf_path = generate_pdf_report(analysis_results)
Export for Genetic Counselor
from exports import generate_genetic_counselor_export

clinical = generate_genetic_counselor_export(results, "clinical.json")
Privacy
- •All analysis runs locally
- •Zero network requests
- •No data leaves the machine
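One way to spot-check the zero-network claim from Python is to make socket creation fail while the analysis runs. This is a sketch, not a guarantee: the `no_network` helper is hypothetical, and it only catches code that goes through Python's `socket.socket`, not subprocesses or native extensions that open connections themselves.

```python
import socket
from contextlib import contextmanager


@contextmanager
def no_network():
    """Make any attempt to create a Python socket raise inside the block."""
    original = socket.socket

    def blocked(*args, **kwargs):
        raise RuntimeError("network access attempted during analysis")

    socket.socket = blocked
    try:
        yield
    finally:
        socket.socket = original  # always restore the real class


# Usage: wrap the analysis call; any socket creation raises RuntimeError.
with no_network():
    pass  # e.g. call the analysis entry point here
```

If the wrapped call completes without a `RuntimeError`, no Python-level socket was opened during that run.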
Limitations
- •Consumer arrays genotype only a small fraction of the genome (~0.1%) and miss most rare variants
- •Results are probabilistic, not deterministic
- •Not a medical diagnosis
- •Most conditions 50-80% non-genetic
- •Consult healthcare providers for medical decisions
- •Negative hereditary cancer result does NOT rule out cancer syndrome
- •Haplogroup resolution limited without WGS
When to Recommend Genetic Counseling
- •Any pathogenic hereditary cancer variant
- •APOE ε4/ε4 genotype
- •Multiple critical pharmacogenomic findings
- •Carrier status with reproduction implications
- •High-risk autoimmune HLA types with symptoms
- •Results causing significant user distress
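Some of these triggers can be checked mechanically against the agent summary. The sketch below covers three of them. The nested field names (`pathogenic_variants` inside `hereditary_cancer`, `genotype` inside `apoe_status`, `severity` on each pharmacogenomic alert) are assumptions about the summary's shape, and `counseling_reasons` is a hypothetical helper. Symptoms, reproductive plans, and distress still need to come from the conversation.

```python
def counseling_reasons(summary, user_distressed=False):
    """Return the reasons, if any, to recommend genetic counseling.

    Field names below are assumed; adjust them to the real summary layout.
    """
    reasons = []

    cancer = summary.get("hereditary_cancer", {})
    if cancer.get("pathogenic_variants"):
        reasons.append("pathogenic hereditary cancer variant")

    apoe = summary.get("apoe_status", {})
    if apoe.get("genotype") == "ε4/ε4":
        reasons.append("APOE ε4/ε4 genotype")

    pgx = summary.get("pharmacogenomics_alerts", [])
    critical = [a for a in pgx
                if isinstance(a, dict) and a.get("severity") == "critical"]
    if len(critical) >= 2:
        reasons.append("multiple critical pharmacogenomic findings")

    if user_distressed:
        reasons.append("results causing significant distress")

    return reasons


example = {"hereditary_cancer": {"pathogenic_variants": ["BRCA1"]}}
reasons = counseling_reasons(example)
```

An empty list means none of the automated checks fired, not that counseling is unnecessary.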