AgentSkillsCN

research-synthesis-workflow

系统化地整合多方研究资源,通过科学的归纳与分析,提炼出连贯的洞见与可付诸行动的知识。

SKILL.md
--- frontmatter
name: research-synthesis-workflow
description: Systematic methodology for gathering, analyzing, and synthesizing research from multiple sources into coherent insights and actionable knowledge.
license: MIT

Research Synthesis Workflow

This skill provides a systematic methodology for conducting research, synthesizing findings from multiple sources, and producing actionable knowledge artifacts.

Core Competencies

  • Source Evaluation: Assessing credibility, relevance, and bias
  • Information Extraction: Systematic note-taking and annotation
  • Synthesis Methods: Thematic analysis, meta-analysis, framework building
  • Knowledge Artifacts: Reports, literature reviews, decision frameworks

Research Workflow Overview

code
┌──────────────────────────────────────────────────────────────┐
│                    Research Synthesis Workflow                │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  1. SCOPE         2. GATHER         3. EXTRACT              │
│  ┌─────────┐      ┌─────────┐      ┌─────────┐              │
│  │ Define  │─────▶│ Find    │─────▶│ Capture │              │
│  │ Question│      │ Sources │      │ Insights│              │
│  └─────────┘      └─────────┘      └─────────┘              │
│       │                                  │                   │
│       │          5. PRODUCE         4. SYNTHESIZE           │
│       │          ┌─────────┐      ┌─────────┐              │
│       └─────────▶│ Create  │◀─────│ Connect │              │
│                  │ Artifact│      │ Themes  │              │
│                  └─────────┘      └─────────┘              │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Phase 1: Scope Definition

Research Question Framework

Transform vague topics into answerable questions:

TypePatternExample
ExploratoryWhat is X? How does X work?What is vector search?
ComparativeHow does X compare to Y?PostgreSQL vs. Neo4j for graphs?
EvaluativeIs X effective for Y?Is RAG effective for technical docs?
CausalWhat causes X? What are effects of X?What causes LLM hallucinations?
PrescriptiveHow should we implement X?How to design a RAG pipeline?

Scope Boundaries

Define explicitly:

  • In scope: Topics to cover
  • Out of scope: Adjacent topics to exclude
  • Depth: Survey (broad) vs. deep-dive (narrow)
  • Time bounds: Cut-off dates for sources
  • Source types: Academic, industry, primary data

Example Scope Document

markdown
## Research Scope: Vector Database Selection

### Research Question
Which vector database best fits our production RAG system
requiring <50ms latency at 10M+ vectors?

### In Scope
- Pinecone, Weaviate, Milvus, Qdrant, pgvector
- Latency benchmarks at scale
- Cost analysis (cloud vs self-hosted)
- Operational complexity

### Out of Scope
- General-purpose databases with vector extensions
- Sub-million vector use cases
- Academic/research-only systems

### Success Criteria
Recommendation with supporting evidence for 2-3 top candidates

Phase 2: Source Gathering

Source Quality Assessment

Evaluate each source on:

CriterionHigh QualityLow Quality
AuthorityExpert author, peer-reviewedAnonymous, no credentials
CurrencyRecent, updatedOutdated, no dates
AccuracyCitations, verifiableUnsupported claims
PurposeInform, educateSell, persuade
CoverageComprehensiveSuperficial

Source Types and Uses

code
Primary Sources (original)
├── Research papers
├── Official documentation
├── Benchmark data
└── Expert interviews

Secondary Sources (analysis)
├── Review articles
├── Technical blogs
├── Industry reports
└── Book chapters

Tertiary Sources (summaries)
├── Wikipedia
├── Textbooks
└── Encyclopedias

Search Strategies

Keyword expansion:

  • Start: "vector database performance"
  • Expand: "approximate nearest neighbor", "HNSW benchmark", "embedding search latency"

Citation chaining:

  • Forward: Who cites this paper?
  • Backward: What does this paper cite?

Author tracking:

  • Find key researchers, follow their work

Source Documentation

For each source, capture:

markdown
## Source: [Title]
- **URL/DOI**:
- **Author(s)**:
- **Date**:
- **Type**: [paper/blog/docs/report]
- **Quality Score**: [1-5]
- **Relevance**: [high/medium/low]
- **Key Topics**:
- **Notes**:

Phase 3: Information Extraction

Structured Note-Taking

Use consistent templates for extraction:

markdown
## Claim: [Specific assertion]
- **Source**: [reference]
- **Evidence**: [supporting data/reasoning]
- **Strength**: [strong/moderate/weak]
- **My Assessment**: [agree/disagree/uncertain]
- **Related Claims**: [links to other notes]

Evidence Classification

TypeDescriptionWeight
EmpiricalMeasured data, experimentsHigh
AnalyticalLogical derivationMedium-High
AnecdotalCase studies, examplesMedium
Expert OpinionAuthority statementsMedium
TheoreticalModel predictionsMedium-Low

Contradiction Tracking

When sources disagree:

markdown
## Conflict: [Topic]

### Position A: [Claim]
- Sources: [list]
- Evidence: [summary]

### Position B: [Claim]
- Sources: [list]
- Evidence: [summary]

### Analysis
- Methodological differences:
- Context differences:
- Possible resolution:
- My conclusion:

Phase 4: Synthesis

Thematic Analysis

  1. Code individual insights with tags
  2. Cluster related codes into themes
  3. Review themes for coherence
  4. Define each theme clearly
  5. Relate themes to research question
code
Codes                    Themes                 Findings
├─ fast queries     ─┐
├─ low latency      ─┼── Performance      ─┬── Theme 1: Performance
├─ high throughput  ─┘                     │   varies significantly
├─ managed service  ─┐                     │   by workload type
├─ self-hosted      ─┼── Deployment      ─┼── Theme 2: Cloud vs
├─ kubernetes       ─┘                     │   self-hosted tradeoff
├─ pricing tiers    ─┐                     │
├─ compute costs    ─┼── Economics       ─┴── Theme 3: Total cost
├─ hidden costs     ─┘                         drives final choice

Framework Building

Create decision frameworks from synthesis:

markdown
## Vector Database Selection Framework

### Decision Tree
1. Scale requirement?
   - <1M vectors → pgvector (simplicity)
   - 1M-100M vectors → Continue to 2
   - >100M vectors → Milvus/Weaviate (distributed)

2. Operational capacity?
   - Limited DevOps → Pinecone (managed)
   - Strong DevOps → Continue to 3

3. Cost sensitivity?
   - Budget constrained → Qdrant (open source)
   - Budget flexible → Evaluate all options

### Comparison Matrix
| Criterion      | Weight | Pinecone | Milvus | Qdrant |
|----------------|--------|----------|--------|--------|
| Latency        | 30%    | 4        | 5      | 4      |
| Scalability    | 25%    | 5        | 5      | 4      |
| Operations     | 20%    | 5        | 3      | 4      |
| Cost           | 15%    | 2        | 4      | 5      |
| Features       | 10%    | 4        | 5      | 4      |
| **Weighted**   |        | **4.0**  | **4.4**| **4.2**|

Phase 5: Knowledge Artifact Production

Artifact Types

FormatPurposeAudience
Executive SummaryQuick decision supportLeadership
Technical ReportDetailed analysisEngineers
Literature ReviewAcademic synthesisResearchers
Decision FrameworkStructured evaluationDecision makers
Reference GuideQuick lookupPractitioners

Structure Templates

Executive Summary (1-2 pages):

  1. Context and question
  2. Key findings (3-5 bullets)
  3. Recommendation
  4. Risks and considerations

Technical Report (5-20 pages):

  1. Executive summary
  2. Background and scope
  3. Methodology
  4. Findings by theme
  5. Analysis and discussion
  6. Recommendations
  7. Appendices (data, sources)

Quality Checklist

Before finalizing:

  • Research question answered?
  • All claims supported by evidence?
  • Contradictions addressed?
  • Limitations acknowledged?
  • Actionable recommendations?
  • Sources properly cited?
  • Appropriate for audience?

Best Practices

Avoiding Bias

  • Seek disconfirming evidence actively
  • Include multiple perspectives
  • Note your priors and update them
  • Separate observation from interpretation
  • Document methodology for transparency

Managing Scope Creep

  • Return to research question frequently
  • Park interesting tangents in "Future Research"
  • Time-box each phase
  • Define "good enough" criteria upfront

Iteration

Research is rarely linear:

  • New sources may require scope adjustment
  • Synthesis may reveal gaps requiring more gathering
  • Artifacts may need multiple drafts

References

  • references/evaluation-rubrics.md - Source quality scoring guides
  • references/synthesis-methods.md - Detailed synthesis techniques
  • references/artifact-templates.md - Document templates and examples