Deep Research: Multi-Model Comparative Analysis

Query up to 3 deep research models (OpenAI o3-deep-research, Perplexity sonar-deep-research, Gemini deep-research-pro) in parallel, then produce a comparative assessment highlighting agreements, disagreements, and unique insights.

Setup Check

The skill requires at least one API key. Check ~/.claude/.env:

First-Time Setup

If no config exists, create it:

bash

cat > ~/.claude/.env << 'ENVEOF'
# Deep Research API Configuration
# All keys are optional — skill works with any subset

# OpenAI (o3-deep-research via Responses API)
OPENAI_API_KEY=

# Perplexity (sonar-deep-research via Chat Completions API)
PERPLEXITY_API_KEY=

# Gemini (deep-research-pro via Interactions API)
GEMINI_API_KEY=
ENVEOF

chmod 600 ~/.claude/.env
echo "Config created at ~/.claude/.env"
echo "Add at least one API key for deep research."

DO NOT stop if the config doesn't exist. Create it and tell the user to add keys.

Research Execution

Step 1: Run the deep research script

CRITICAL: This script takes 2-10 minutes. It runs blocking — do NOT use run_in_background. This skill runs in a forked context, so blocking is correct.

Create the output directory and run the script:

bash

# timeout must exceed script's internal 600s timeout
# Output stays project-local at .claude/research/ so implementer can reference it
RESEARCH_DIR=".claude/research/DeepResearch_[SafeTopic]_[YYYY-MM-DD]"
mkdir -p "$RESEARCH_DIR" && \
python3 ~/.claude/skills/deep-research/scripts/deep_research.py "$ARGUMENTS" \
  --output-dir "$RESEARCH_DIR" 2>&1

Set timeout: 660000 on the Bash tool call (script's 600s timeout + 60s buffer).

The script will:

•Detect which API keys are configured
•Launch available providers in parallel
•Poll async providers (OpenAI, Gemini) until complete
•Write raw_results.json to the output directory
•Print progress to stderr

IMPORTANT: Deep research models take 2-10 minutes per provider. The script handles all polling internally. Do NOT interrupt it.

Step 2: Read and parse the results

Use the Read tool to read raw_results.json from the output directory. The file is a JSON object:

json

{
  "topic": "the research topic",
  "provider_count": 3,
  "success_count": 3,
  "results": [
    {
      "provider": "openai",
      "success": true,
      "report": "full report text...",
      "citations": [{"url": "...", "title": "..."}],
      "model": "o3-deep-research-2025-06-26",
      "elapsed_seconds": 145.3,
      "error": null
    },
    ...
  ]
}

Step 3: If any providers failed, optionally supplement with WebSearch to fill gaps.

Synthesis: Produce the Comparative Report

Read ALL provider reports carefully. Then produce a report in this structure:

markdown

# Deep Research Report: [Topic]

## Executive Summary
[3-5 sentence overview of the key findings across all models]

## Individual Model Reports

### OpenAI (o3-deep-research) — [elapsed]s
[Condensed key findings from OpenAI's report — preserve the important facts,
remove redundant prose. 200-400 words.]

### Perplexity (sonar-deep-research) — [elapsed]s
[Condensed key findings from Perplexity's report. 200-400 words.]

### Gemini (deep-research-pro) — [elapsed]s
[Condensed key findings from Gemini's report. 200-400 words.]

## Comparative Assessment

### Points of Agreement
[Claims made by 2+ models — these are highest confidence findings]

### Points of Disagreement
[Claims where models contradict each other — note which model says what]

### Unique Insights
[Findings that only one model reported — interesting but lower confidence]

### Confidence Assessment
| Finding | OpenAI | Perplexity | Gemini | Confidence |
|---------|--------|------------|--------|------------|
| [key claim 1] | ✓ | ✓ | ✓ | High |
| [key claim 2] | ✓ | ✓ | — | Medium |
| [key claim 3] | — | — | ✓ | Low |

### Source Quality Comparison
| Provider | Citations | Report Length | Depth |
|----------|-----------|-------------|-------|
| OpenAI | [n] sources | [n] words | [assessment] |
| Perplexity | [n] sources | [n] words | [assessment] |
| Gemini | [n] sources | [n] words | [assessment] |

## References

**CRITICAL: The report must be verifiable.** Include a numbered references section at the end using citations from all providers. Every factual claim in the report should be traceable to a source.

Build the references list by:
1. Collecting all citation URLs from the `citations` arrays in the JSON results
2. Deduplicating by URL (multiple providers may cite the same source)
3. Numbering them sequentially
4. Using inline reference numbers `[1]`, `[2]` etc. throughout the report body to link claims to sources

Format:
[1] Title or description — URL
[2] Title or description — URL
...

If a provider (like Gemini) returns no structured citations, note that its claims are unsourced and lower confidence. Prefer citing claims that have URLs backing them.

Adaptation rules:

•If only 1 provider succeeded: Skip comparative sections, note limited analysis
•If only 2 providers succeeded: Pairwise comparison instead of tri-model
•If 0 providers succeeded: Report the errors and suggest checking API keys

Save All Outputs

The output directory was already created in Step 1. The script already wrote raw_results.json there.

Write these additional files to the same directory:

File	Contents
`report.md`	Your comparative synthesis (the report above)
`openai.md`	OpenAI's full report text (from `results[].report` where provider=openai)
`perplexity.md`	Perplexity's full report text
`gemini.md`	Gemini's full report text

Only write provider files for providers that succeeded. The raw individual reports are often 5-40K chars — preserve them in full, don't truncate.

Tell the user where the reports were saved and list the files.

Graceful Degradation

Keys Available	Behavior
0	Error with setup instructions
1	Single provider report, note limited comparison
2	Pairwise comparison
3	Full tri-model comparison

After the Report

End with:

code

---
Deep Research complete.
- Providers: [n]/3 succeeded
- Total research time: [sum of elapsed]s
- Report saved to: .claude/research/DeepResearch_[Topic]_[Date]/report.md

Want me to dig deeper into any specific finding?