tooluniverse-disease-research

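Generate comprehensive disease research reports using 100+ ToolUniverse tools. The report is written as a detailed Markdown file and progressively updated with findings across 10 research dimensions. All information carries source citations and evidence grading. Use this skill when a user asks about a disease or syndrome, or needs a systematic disease analysis.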

SKILL.md
--- frontmatter
name: tooluniverse-disease-research
description: Generate comprehensive disease research reports using 100+ ToolUniverse tools. Creates a detailed markdown report file and progressively updates it with findings from 10 research dimensions. All information includes source references and evidence grading. Use when users ask about diseases, syndromes, or need systematic disease analysis.

ToolUniverse Disease Research

Generate a comprehensive, detailed disease research report with full source citations and evidence grading. The report is created as a markdown file and progressively updated during research.

KEY PRINCIPLES:

  1. Report-first approach - Create report file FIRST, then populate progressively
  2. Evidence grading - Grade all claims by evidence strength (T1-T4)
  3. Citation requirements - Every fact must have inline source attribution
  4. Mandatory completeness - All sections must exist, even if "limited data"
  5. Disease disambiguation - Resolve EFO/ICD/UMLS IDs before research

Evidence Grading System (MANDATORY)

CRITICAL: Grade every claim by evidence strength for disease research.

Evidence Tiers for Disease Research

| Tier | Symbol | Criteria | Examples |
|------|--------|----------|----------|
| T1 | ★★★ | Causal evidence, clinical trials, FDA approval | Mendelian gene mutations, Phase 3 trials |
| T2 | ★★☆ | Functional validation, large cohort studies | GWAS + functional follow-up, N>1000 cohorts |
| T3 | ★☆☆ | Association only, small studies, computational | GWAS without replication, case reports |
| T4 | ☆☆☆ | Review mention, database annotation, prediction | Review articles, text-mined associations |

Apply in Report

markdown
### 3.1 Causal Genes (Mendelian)
Mutations in *PARK2* cause autosomal recessive juvenile Parkinson's [★★★: OMIM:602544, >500 families].
*LRRK2* G2019S is the most common genetic cause [★★★: PMID:15541309].

### 3.2 GWAS Associations
rs356219 at *SNCA* is associated with PD risk (OR=1.3) [★★☆: PMID:19915575, GWAS + replication].
rs6430538 shows association in European populations [★☆☆: PMID:xxxxx, single GWAS].

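The bracketed citations above follow a fixed pattern, so they can be generated rather than typed by hand. A minimal sketch, assuming a hypothetical helper `format_citation` (not part of ToolUniverse) that maps a tier label from the table to its symbol:

```python
# Hypothetical helper, not part of ToolUniverse: maps tiers T1-T4 to the
# star symbols from the evidence table and builds an inline citation.
TIER_SYMBOLS = {"T1": "★★★", "T2": "★★☆", "T3": "★☆☆", "T4": "☆☆☆"}


def format_citation(tier, source, note=None):
    """Return an inline citation such as '[★★★: PMID:15541309]'."""
    if tier not in TIER_SYMBOLS:
        raise ValueError(f"Unknown evidence tier: {tier!r}")
    parts = [source] + ([note] if note else [])
    return f"[{TIER_SYMBOLS[tier]}: {', '.join(parts)}]"


print(format_citation("T1", "PMID:15541309"))
print(format_citation("T2", "PMID:19915575", "GWAS + replication"))
```

Rejecting unknown tiers keeps an ungraded claim from slipping into the report with a blank symbol.
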
When to Use

Apply when the user:

  • Asks about any disease, syndrome, or medical condition
  • Needs comprehensive disease intelligence
  • Wants a detailed research report with citations
  • Asks "what do we know about [disease]?"

Core Workflow: Report-First Approach

DO NOT show the search process to the user. Instead:

  1. Create report file first - Initialize {disease_name}_research_report.md
  2. Research each dimension - Use all relevant tools
  3. Update report progressively - Write findings to file after each dimension
  4. Include citations - Every fact must reference its source tool
code
User: "Research Parkinson's disease"

Agent Actions (internal, not shown to user):
1. Create "parkinsons_disease_research_report.md" with template
2. Research DIM 1 → Update Identity section
3. Research DIM 2 → Update Clinical section
4. ... continue for all 10 dimensions
5. Present final report to user

Report Template

Create this file structure at the start:

markdown
# Disease Research Report: {Disease Name}

**Report Generated**: {date}
**Disease Identifiers**: (to be filled)

---

## Executive Summary

(Brief 3-5 sentence overview - fill after all research complete)

---

## 1. Disease Identity & Classification

### Ontology Identifiers
| System | ID | Source |
|--------|-----|--------|
| EFO | | |
| ICD-10 | | |
| UMLS CUI | | |
| SNOMED CT | | |

### Synonyms & Alternative Names
- (list with source)

### Disease Hierarchy
- Parent: 
- Subtypes:

**Sources**: (list tools used)

---

## 2. Clinical Presentation

### Phenotypes (HPO)
| HPO ID | Phenotype | Description | Source |
|--------|-----------|-------------|--------|

### Symptoms & Signs
- (list with source)

### Diagnostic Criteria
- (from literature/MedlinePlus)

**Sources**: (list tools used)

---

## 3. Genetic & Molecular Basis

### Associated Genes
| Gene | Score | Ensembl ID | Evidence | Source |
|------|-------|------------|----------|--------|

### GWAS Associations
| SNP | P-value | Odds Ratio | Study | Source |
|-----|---------|------------|-------|--------|

### Pathogenic Variants (ClinVar)
| Variant | Clinical Significance | Condition | Source |
|---------|----------------------|-----------|--------|

**Sources**: (list tools used)

---

## 4. Treatment Landscape

### Approved Drugs
| Drug | ChEMBL ID | Mechanism | Phase | Target | Source |
|------|-----------|-----------|-------|--------|--------|

### Clinical Trials
| NCT ID | Title | Phase | Status | Intervention | Source |
|--------|-------|-------|--------|--------------|--------|

### Treatment Guidelines
- (from literature)

**Sources**: (list tools used)

---

## 5. Biological Pathways & Mechanisms

### Key Pathways
| Pathway | Reactome ID | Genes Involved | Source |
|---------|-------------|----------------|--------|

### Protein-Protein Interactions
- (tissue-specific networks)

### Expression Patterns
| Tissue | Expression Level | Source |
|--------|------------------|--------|

**Sources**: (list tools used)

---

## 6. Epidemiology & Risk Factors

### Prevalence & Incidence
- (from literature)

### Risk Factors
| Factor | Evidence | Source |
|--------|----------|--------|

### GWAS Studies
| Study | Sample Size | Findings | Source |
|-------|-------------|----------|--------|

**Sources**: (list tools used)

---

## 7. Literature & Research Activity

### Publication Trends
- Total publications (5 years): 
- Current year: 
- Trend: 

### Key Publications
| PMID | Title | Year | Citations | Source |
|------|-------|------|-----------|--------|

### Research Institutions
- (from OpenAlex)

**Sources**: (list tools used)

---

## 8. Similar Diseases & Comorbidities

### Similar Diseases
| Disease | Similarity Score | Shared Genes | Source |
|---------|-----------------|--------------|--------|

### Comorbidities
- (from literature/clinical data)

**Sources**: (list tools used)

---

## 9. Cancer-Specific Information (if applicable)

### CIViC Variants
| Gene | Variant | Evidence Level | Clinical Significance | Source |
|------|---------|----------------|----------------------|--------|

### Molecular Profiles
- (biomarkers)

### Targeted Therapies
| Therapy | Target | Evidence | Source |
|---------|--------|----------|--------|

**Sources**: (list tools used)

---

## 10. Drug Safety & Adverse Events

### Drug Warnings
| Drug | Warning Type | Description | Source |
|------|--------------|-------------|--------|

### Clinical Trial Adverse Events
| Trial | Drug | Adverse Event | Frequency | Source |
|-------|------|---------------|-----------|--------|

### FAERS Reports
- (FDA adverse event data)

**Sources**: (list tools used)

---

## References

### Data Sources Used
| Tool | Query | Section |
|------|-------|---------|

### Database Versions
- OpenTargets: (version/date)
- ClinVar: (version/date)
- GWAS Catalog: (version/date)

Research Protocol

Step 1: Initialize Report

python
import re
from datetime import datetime

def create_report_file(disease_name):
    """Create the initial report file from the template and return its filename."""
    # Build a filesystem-safe slug: "Parkinson's disease" -> "parkinsons_disease"
    name = disease_name.lower().replace("'", "").replace("’", "")
    slug = re.sub(r"[^a-z0-9]+", "_", name).strip("_")
    filename = f"{slug}_research_report.md"
    
    template = f"""# Disease Research Report: {disease_name}

**Report Generated**: {datetime.now().strftime('%Y-%m-%d %H:%M')}
**Disease Identifiers**: Pending research...

---

## Executive Summary

*Research in progress...*

---

## 1. Disease Identity & Classification
*Researching...*

## 2. Clinical Presentation
*Pending...*

[... rest of template ...]
"""
    
    # UTF-8 so the evidence symbols (★, ☆) written later are stored correctly
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(template)
    
    return filename

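The function above abbreviates most of the template. One way to generate a placeholder for every section is sketched below; the section titles are copied from the Report Template, while `SECTION_TITLES` and `build_section_placeholders` are hypothetical names introduced only for illustration.

```python
# Section titles copied from the Report Template; the names in this block
# are illustrative and not part of the skill or of ToolUniverse.
SECTION_TITLES = [
    "Disease Identity & Classification",
    "Clinical Presentation",
    "Genetic & Molecular Basis",
    "Treatment Landscape",
    "Biological Pathways & Mechanisms",
    "Epidemiology & Risk Factors",
    "Literature & Research Activity",
    "Similar Diseases & Comorbidities",
    "Cancer-Specific Information (if applicable)",
    "Drug Safety & Adverse Events",
]


def build_section_placeholders(titles=SECTION_TITLES):
    """Return numbered headings, each followed by a placeholder marker."""
    blocks = []
    for n, title in enumerate(titles, start=1):
        # Section 1 is researched first, so it starts as "*Researching...*"
        marker = "*Researching...*" if n == 1 else "*Pending...*"
        blocks.append(f"## {n}. {title}\n{marker}")
    return "\n\n".join(blocks) + "\n"
```

The returned string could stand in for the numbered sections of the template, so that every one of the 10 sections exists from the first write.
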
Step 2: Research Each Dimension with Citations

For EACH piece of information, track:

  • Tool name that provided the data
  • Parameters used in the query
  • Timestamp of the query
python
def research_with_citations(tu, disease_name, report_file):
    """Research and update report with full citations"""
    
    references = []  # Track all sources
    
    # === DIMENSION 1: Identity ===
    section_start = len(references)  # index of this section's first reference
    
    # Get EFO ID
    efo_result = tu.tools.OSL_get_efo_id_by_disease_name(disease=disease_name)
    efo_id = efo_result.get('efo_id')
    references.append({
        'tool': 'OSL_get_efo_id_by_disease_name',
        'params': {'disease': disease_name},
        'section': 'Identity'
    })
    
    # Get ICD codes
    icd_result = tu.tools.icd_search_codes(query=disease_name, version="ICD10CM")
    references.append({
        'tool': 'icd_search_codes',
        'params': {'query': disease_name, 'version': 'ICD10CM'},
        'section': 'Identity'
    })
    
    # Get UMLS
    umls_result = tu.tools.umls_search_concepts(query=disease_name)
    references.append({
        'tool': 'umls_search_concepts',
        'params': {'query': disease_name},
        'section': 'Identity'
    })
    
    # Defaults, so the report update below still works when no EFO ID was found
    efo_term, children = None, None
    
    # Get synonyms from EFO
    if efo_id:
        obo_id = efo_id.replace('_', ':')  # e.g. "EFO_0000249" -> "EFO:0000249"
        efo_term = tu.tools.ols_get_efo_term(obo_id=obo_id)
        references.append({
            'tool': 'ols_get_efo_term',
            'params': {'obo_id': obo_id},  # log the value actually sent
            'section': 'Identity'
        })
        
        # Get subtypes
        children = tu.tools.ols_get_efo_term_children(obo_id=obo_id, size=20)
        references.append({
            'tool': 'ols_get_efo_term_children',
            'params': {'obo_id': obo_id, 'size': 20},
            'section': 'Identity'
        })
    
    # UPDATE REPORT FILE with Identity section
    update_report_section(report_file, 'Identity', {
        'efo_id': efo_id,
        'icd_codes': icd_result,
        'umls': umls_result,
        'synonyms': efo_term.get('synonyms', []) if efo_term else [],
        'subtypes': children or []
    }, references[section_start:])  # Only the references added for this section
    
    # === DIMENSION 2: Clinical ===
    # ... continue for all dimensions

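Step 2 asks for a timestamp per query, but the reference dictionaries built in `research_with_citations` carry only the tool, parameters, and section. A small sketch of a logging helper that adds the timestamp; `record_call` is a hypothetical name, and the ISO 8601 UTC format is an assumption rather than something the skill prescribes.

```python
from datetime import datetime, timezone


def record_call(references, tool, params, section):
    """Append one citation record, including a UTC timestamp, and return it."""
    entry = {
        "tool": tool,
        "params": dict(params),  # copy so later mutation does not alter the log
        "section": section,
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
    references.append(entry)
    return entry
```

Each `references.append({...})` call could then become, for example, `record_call(references, 'icd_search_codes', {'query': disease_name, 'version': 'ICD10CM'}, 'Identity')`.
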
Step 3: Update Report File After Each Dimension

python
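# Report headings from the template, keyed by the short section name
SECTION_HEADINGS = {
    'Identity': '## 1. Disease Identity & Classification',
    'Clinical': '## 2. Clinical Presentation',
    # ... one entry per report section
}

def update_report_section(filename, section_name, data, sources):
    """Update a specific section in the report file"""
    
    # Read current file
    with open(filename, 'r', encoding='utf-8') as f:
        content = f.read()
    
    # Format section content with citations
    if section_name == 'Identity':
        section_content = format_identity_section(data, sources)
    elif section_name == 'Clinical':
        section_content = format_clinical_section(data, sources)
    # ... etc
    else:
        raise ValueError(f"Unknown section: {section_name}")
    
    # Replace the placeholder with actual content. The template marks
    # section 1 "*Researching...*" and the remaining sections "*Pending...*".
    heading = SECTION_HEADINGS[section_name]
    for marker in ('*Researching...*', '*Pending...*'):
        content = content.replace(f"{heading}\n{marker}", section_content)
    
    # Write back
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(content)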


def format_identity_section(data, sources):
    """Format Identity section with proper citations"""
    
    source_list = ', '.join([s['tool'] for s in sources])
    
    return f"""## 1. Disease Identity & Classification

### Ontology Identifiers
| System | ID | Source |
|--------|-----|--------|
| EFO | {data['efo_id']} | OSL_get_efo_id_by_disease_name |
| ICD-10 | {data['icd_codes']} | icd_search_codes |
| UMLS CUI | {data['umls']} | umls_search_concepts |

### Synonyms & Alternative Names
{format_list_with_source(data['synonyms'], 'ols_get_efo_term')}

### Disease Subtypes
{format_list_with_source(data['subtypes'], 'ols_get_efo_term_children')}

**Sources**: {source_list}
"""

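`format_identity_section` calls `format_list_with_source`, which is not defined anywhere in this document. A minimal sketch of what such a helper might look like, assuming the items are plain strings (real tool results may be dictionaries that need a field extracted first) and assuming a `[Source: ...]` suffix as the citation style:

```python
def format_list_with_source(items, source):
    """Render items as markdown bullets, each tagged with its source tool."""
    if not items:
        # Keep the section present even when the tool returned nothing
        return f"- Limited data available [Source: {source}]"
    return "\n".join(f"- {item} [Source: {source}]" for item in items)
```

The empty-list branch follows the "mandatory completeness" principle: the heading stays in the report and states that data is limited.
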
Complete Tool Usage by Section

Section 1: Identity (use ALL of these)

python
# Required tools - use all
tu.tools.OSL_get_efo_id_by_disease_name(disease=disease_name)
tu.tools.OpenTargets_get_disease_id_description_by_name(diseaseName=disease_name)
tu.tools.ols_search_efo_terms(query=disease_name)
tu.tools.ols_get_efo_term(obo_id=efo_id)
tu.tools.ols_get_efo_term_children(obo_id=efo_id, size=30)
tu.tools.umls_search_concepts(query=disease_name)
tu.tools.umls_get_concept_details(cui=cui)
tu.tools.icd_search_codes(query=disease_name, version="ICD10CM")
tu.tools.snomed_search_concepts(query=disease_name)

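When every tool in a list must be called, one failing call should not abort the whole dimension. The wrapper below is a sketch under that assumption: `call_tool` is a hypothetical helper, and it takes the tool function as an argument so it makes no claim about how ToolUniverse itself reports errors.

```python
def call_tool(references, section, tool_name, tool_fn, **params):
    """Call one tool, log the attempt, and return None instead of raising."""
    record = {"tool": tool_name, "params": params, "section": section, "ok": True}
    try:
        result = tool_fn(**params)
    except Exception as exc:  # broad on purpose: a failed tool is logged, not fatal
        record["ok"] = False
        record["error"] = str(exc)
        result = None
    references.append(record)
    return result
```

A call might then read `call_tool(references, "Identity", "umls_search_concepts", tu.tools.umls_search_concepts, query=disease_name)`, with a `None` result written to the report as limited data.
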
Section 2: Clinical Presentation (use ALL of these)

python
tu.tools.OpenTargets_get_associated_phenotypes_by_disease_efoId(efoId=efo_id)
tu.tools.get_HPO_ID_by_phenotype(query=symptom)  # for each key symptom
tu.tools.get_phenotype_by_HPO_ID(id=hpo_id)  # for top phenotypes
tu.tools.MedlinePlus_search_topics_by_keyword(term=disease_name, db="healthTopics")
tu.tools.MedlinePlus_get_genetics_condition_by_name(condition=disease_slug)
tu.tools.MedlinePlus_connect_lookup_by_code(cs=icd_oid, c=icd_code)

Section 3: Genetics (use ALL of these)

Evidence tier guide: Mendelian genes = ★★★, replicated GWAS = ★★☆, single GWAS = ★☆☆

python
# Disease-gene associations (★★☆ to ★★★)
tu.tools.OpenTargets_get_associated_targets_by_disease_efoId(efoId=efo_id)
tu.tools.OpenTargets_target_disease_evidence(efoId=efo_id, ensemblId=gene_id)  # for top genes

# Clinical variants (★★★)
tu.tools.clinvar_search_variants(condition=disease_name, max_results=50)
tu.tools.clinvar_get_variant_details(variant_id=vid)  # for top variants
tu.tools.clinvar_get_clinical_significance(variant_id=vid)

# GWAS associations (★★☆ if replicated, ★☆☆ if single study)
tu.tools.gwas_search_associations(disease_trait=disease_name, size=50)
tu.tools.gwas_get_variants_for_trait(disease_trait=disease_name, size=50)
tu.tools.gwas_get_associations_for_trait(disease_trait=disease_name, size=50)
tu.tools.gwas_get_studies_for_trait(disease_trait=disease_name, size=30)
tu.tools.GWAS_search_associations_by_gene(gene_name=gene)  # for top genes

# Variant details (★★★ for population data)
tu.tools.gnomad_get_variant_frequency(variant=variant)  # for key variants
tu.tools.gnomad_get_gene_constraints(gene_symbol=gene)  # constraint scores
tu.tools.dbsnp_get_variant_by_rsid(rsid=rs_id)  # dbSNP details

# Deep GWAS analysis
tu.tools.gwas_get_snp_by_id(snp_id=rs_id)  # individual SNP details
tu.tools.gwas_get_snps_for_gene(gene_symbol=gene)  # GWAS SNPs at gene locus
tu.tools.gwas_search_snps(query=disease_name)  # SNP-level search

Section 4: Treatment (use ALL of these)

python
tu.tools.OpenTargets_get_associated_drugs_by_disease_efoId(efoId=efo_id, size=100)
tu.tools.OpenTargets_get_drug_chembId_by_generic_name(drugName=drug)  # for each drug
tu.tools.OpenTargets_get_drug_mechanisms_of_action_by_chemblId(chemblId=chembl_id)
tu.tools.search_clinical_trials(condition=disease_name, pageSize=50)
tu.tools.get_clinical_trial_descriptions(nct_ids=nct_list)
tu.tools.get_clinical_trial_conditions_and_interventions(nct_ids=nct_list)
tu.tools.get_clinical_trial_eligibility_criteria(nct_ids=nct_list)
tu.tools.get_clinical_trial_outcome_measures(nct_ids=nct_list)
tu.tools.extract_clinical_trial_outcomes(nct_ids=nct_list)
tu.tools.GtoPdb_list_diseases(name=disease_name)
tu.tools.GtoPdb_get_disease(disease_id=gtopdb_id)

Section 5: Pathways (use ALL of these)

python
tu.tools.Reactome_get_diseases()
tu.tools.Reactome_map_uniprot_to_pathways(id=uniprot_id)  # for top genes
tu.tools.Reactome_get_pathway(stId=pathway_id)  # for key pathways
tu.tools.Reactome_get_pathway_reactions(stId=pathway_id)
tu.tools.humanbase_ppi_analysis(gene_list=top_genes, tissue=relevant_tissue)
tu.tools.gtex_get_expression_by_gene(gene=gene)  # for top genes
tu.tools.HPA_get_protein_expression(gene=gene)
tu.tools.geo_search_datasets(query=disease_name)

Section 6: Literature (use ALL of these)

python
tu.tools.PubMed_search_articles(query=f'"{disease_name}"', limit=100)
tu.tools.PubMed_search_articles(query=f'"{disease_name}" AND epidemiology', limit=50)
tu.tools.PubMed_search_articles(query=f'"{disease_name}" AND mechanism', limit=50)
tu.tools.PubMed_search_articles(query=f'"{disease_name}" AND treatment', limit=50)
tu.tools.PubMed_get_article(pmid=pmid)  # for top 10 articles
tu.tools.PubMed_get_related(pmid=key_pmid)
tu.tools.PubMed_get_cited_by(pmid=key_pmid)
tu.tools.OpenTargets_get_publications_by_disease_efoId(efoId=efo_id)
tu.tools.openalex_search_works(query=disease_name, limit=50)
tu.tools.europe_pmc_search_abstracts(query=disease_name, limit=50)
tu.tools.semantic_scholar_search_papers(query=disease_name, limit=50)

Section 7: Similar Diseases

python
tu.tools.OpenTargets_get_similar_entities_by_disease_efoId(efoId=efo_id, threshold=0.3, size=30)

Section 8: Cancer-Specific (if cancer)

python
tu.tools.civic_search_diseases(limit=100)
tu.tools.civic_search_genes(query=gene, limit=20)  # for cancer genes
tu.tools.civic_get_variants_by_gene(gene_id=civic_gene_id, limit=50)
tu.tools.civic_get_variant(variant_id=vid)
tu.tools.civic_get_evidence_item(evidence_id=eid)
tu.tools.civic_search_therapies(limit=100)
tu.tools.civic_search_molecular_profiles(limit=50)

Section 9: Pharmacology

python
tu.tools.GtoPdb_get_targets(target_type=type, limit=50)  # GPCR, ion channel, etc
tu.tools.GtoPdb_get_target(target_id=tid)  # for disease-relevant targets
tu.tools.GtoPdb_get_target_interactions(target_id=tid)
tu.tools.GtoPdb_search_interactions(approved_only=True)
tu.tools.GtoPdb_list_ligands(ligand_type="Approved")

Section 10: Safety (use ALL of these)

python
tu.tools.OpenTargets_get_drug_warnings_by_chemblId(chemblId=cid)  # for each drug
tu.tools.OpenTargets_get_drug_blackbox_status_by_chembl_ID(chemblId=cid)
tu.tools.extract_clinical_trial_adverse_events(nct_ids=nct_list)
tu.tools.FAERS_count_reactions_by_drug_event(drug=drug_name, event=event)
tu.tools.AdverseEventPredictionQuestionGenerator(disease_name=disease, drug_name=drug)

Citation Format with Evidence Grading

Every piece of data MUST include its source AND evidence tier. Use this format:

In Tables

markdown
| Gene | Score | Evidence | Source |
|------|-------|----------|--------|
| APOE | 0.92 | ★★★ (causal) | OpenTargets_get_associated_targets_by_disease_efoId |
| APP | 0.88 | ★★★ (Mendelian) | OpenTargets_get_associated_targets_by_disease_efoId |
| CLU | 0.45 | ★★☆ (GWAS) | GWAS Catalog |

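Tables in this format can be rendered from a list of dictionaries. The sketch below assumes each row is a dictionary keyed by column name; `render_table` is a hypothetical helper, not a ToolUniverse function.

```python
def render_table(columns, rows):
    """Render a markdown table; `rows` is a list of dicts keyed by column name."""
    def cell(value):
        # Escape pipes so a value cannot break the table layout
        return str(value).replace("|", "\\|")

    header = "| " + " | ".join(columns) + " |"
    divider = "|" + "|".join("-" * (len(c) + 2) for c in columns) + "|"
    body = ["| " + " | ".join(cell(r.get(c, "")) for c in columns) + " |" for r in rows]
    return "\n".join([header, divider, *body])
```

Missing keys render as empty cells, so a row without a score still produces a well-formed line.
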
In Lists

markdown
- Memory loss [★★★: OpenTargets_get_associated_phenotypes_by_disease_efoId, core feature]
- Cognitive decline [★★★: MedlinePlus_get_genetics_condition_by_name, diagnostic criterion]
- Sleep disturbance [★☆☆: association studies, not diagnostic]

In Prose with Evidence Grades

markdown
The disease affects approximately 6.5 million Americans [★★★: CDC epidemiology data].
APOE ε4 increases risk 3-15 fold [★★★: PMID:8346443, replicated in 100+ cohorts].
A recent single-center study suggests microbiome involvement [★☆☆: PMID:xxxxx, N=50].

Per-Section Evidence Summary

Include at section end:

markdown
---
**Evidence Quality for Section 3 (Genetics)**:
- Causal/Mendelian (T1): 5 genes
- Replicated GWAS (T2): 23 loci
- Single GWAS (T3): 45 associations
- Mention/Predicted (T4): 12
---

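The summary block can be computed from the tier assigned to each claim in a section. A minimal sketch, assuming the tiers were collected as a list of "T1" to "T4" strings; the labels are taken from the genetics example above and would need adjusting for other sections.

```python
from collections import Counter

# Labels from the Section 3 example; other sections may need different wording.
TIER_LABELS = {
    "T1": "Causal/Mendelian",
    "T2": "Replicated GWAS",
    "T3": "Single GWAS",
    "T4": "Mention/Predicted",
}


def evidence_summary(section_title, tiers):
    """Count graded claims per tier and render the end-of-section summary."""
    counts = Counter(tiers)
    lines = ["---", f"**Evidence Quality for {section_title}**:"]
    for tier, label in TIER_LABELS.items():
        lines.append(f"- {label} ({tier}): {counts.get(tier, 0)}")
    lines.append("---")
    return "\n".join(lines)
```

Tiers with no claims are still listed with a count of 0, which makes gaps in the evidence visible.
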
References Section

At the end of the report, include complete tool usage log:

markdown
## References

### Tools Used
| # | Tool | Parameters | Section | Items Retrieved |
|---|------|------------|---------|-----------------|
| 1 | OSL_get_efo_id_by_disease_name | disease="Alzheimer disease" | Identity | 1 |
| 2 | ols_get_efo_term | obo_id="EFO:0000249" | Identity | 1 |
| 3 | OpenTargets_get_associated_targets_by_disease_efoId | efoId="EFO_0000249" | Genetics | 245 |
| ... | ... | ... | ... | ... |

### Data Retrieved Summary
- Total tools used: 45
- Total API calls: 78
- Sections completed: 10/10

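The tool log can be rendered directly from the `references` list collected during research. The sketch below assumes each entry has the `tool`, `params`, and `section` keys used in Step 2; it omits the "Items Retrieved" column because those entries do not record a count. `render_tool_log` is a hypothetical name.

```python
import json


def render_tool_log(references):
    """Render the tool usage log as a numbered markdown table."""
    lines = [
        "### Tools Used",
        "| # | Tool | Parameters | Section |",
        "|---|------|------------|---------|",
    ]
    for n, ref in enumerate(references, start=1):
        # json.dumps quotes strings and leaves numbers bare: obo_id="EFO:0000249", size=20
        params = ", ".join(
            f"{key}={json.dumps(value, ensure_ascii=False)}"
            for key, value in ref["params"].items()
        )
        lines.append(f"| {n} | {ref['tool']} | {params} | {ref['section']} |")
    return "\n".join(lines)
```
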
Progressive Update Pattern

After researching EACH dimension, immediately update the report file:

python
# After each dimension's research completes:

# 1. Read current report
with open(report_file, 'r') as f:
    report = f.read()

# 2. Replace placeholder with formatted content
report = report.replace(
    "## 3. Genetic & Molecular Basis\n*Pending...*",
    formatted_genetics_section
)

# 3. Write back immediately
with open(report_file, 'w') as f:
    f.write(report)

# 4. Continue to next dimension

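Because the report is rewritten after every dimension, an interrupted write could leave a truncated file. One possible hardening, offered as a sketch rather than as part of the skill, is to write to a temporary file and swap it in with `os.replace`; `replace_placeholder` is a hypothetical helper name.

```python
import os
import tempfile


def replace_placeholder(report_path, placeholder, section_text):
    """Swap one placeholder for finished text; return False if it was absent."""
    with open(report_path, "r", encoding="utf-8") as f:
        report = f.read()
    if placeholder not in report:
        return False  # nothing to update; the caller can log or raise
    updated = report.replace(placeholder, section_text, 1)
    # Write to a temp file in the same directory, then swap it into place,
    # so an interrupted run does not leave a half-written report.
    directory = os.path.dirname(os.path.abspath(report_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(updated)
    os.replace(tmp_path, report_path)
    return True
```

Returning `False` when the placeholder is missing also surfaces a silent failure mode of plain `str.replace`, which leaves the report unchanged without any warning.
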
Final Report Quality Checklist

Before presenting to user, verify:

  • All 10 sections have content (or marked as "No data available")
  • Every data point has a source citation
  • Executive summary reflects key findings
  • References section lists all tools used
  • Tables are properly formatted
  • No placeholder text remains

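Some of these checks can be automated before the report is presented. The sketch below covers only the mechanical items (section headings, leftover placeholders, a References heading); citation coverage and summary quality still need review. `check_report` and the `PLACEHOLDERS` tuple are hypothetical, with the markers taken from the templates in this document.

```python
import re

# Placeholder strings that appear in the templates in this document
PLACEHOLDERS = ("*Researching...*", "*Pending...*", "(to be filled)", "Research in progress")


def check_report(text, expected_sections=10):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # Numbered headings look like "## 3. Genetic & Molecular Basis"
    found = {int(n) for n in re.findall(r"^## (\d+)\. ", text, flags=re.MULTILINE)}
    missing = sorted(set(range(1, expected_sections + 1)) - found)
    if missing:
        problems.append(f"missing sections: {missing}")
    for marker in PLACEHOLDERS:
        if marker in text:
            problems.append(f"placeholder text remains: {marker}")
    if "## References" not in text:
        problems.append("missing References section")
    return problems
```
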
Example Output Structure

For "Alzheimer's Disease" research, the final report should be 2000+ lines with:

  • Section 1: 5+ ontology IDs, 10+ synonyms, disease hierarchy
  • Section 2: 20+ phenotypes with HPO IDs, symptoms list
  • Section 3: 50+ genes with scores, 30+ GWAS associations, 100+ ClinVar variants
  • Section 4: 20+ drugs, 50+ clinical trials with details
  • Section 5: 10+ pathways, PPI network, expression data
  • Section 6: 100+ publications, citation analysis, institution list
  • Section 7: 15+ similar diseases with similarity scores
  • Section 8: (if cancer) variants, evidence items
  • Section 9: Pharmacological targets and interactions
  • Section 10: Drug warnings, adverse events

Total: Detailed report with 500+ individual data points, each with source citation.


Tool Reference

See TOOLS_REFERENCE.md for complete tool documentation. See EXAMPLES.md for sample reports.