Research Storage Skill

This skill defines conventions for storing and retrieving research data collected by dokhak agents. Research files are cached to enable reuse and reduce redundant web searches.

Quick Reference for Agents

Agent	Uses This Skill For
researcher	Directory resolution, research.md writing, multi-tier lookup
research-collector	summary.md, sources.md writing to `.research/init/`
writer	Reading research files (read-only)
structure-designer	Reading init research (read-only)

Standard Loading Pattern

All agents should reference this skill for:

•Normalization functions: normalizeChapter, normalizeSection, generateSlug
•Multi-tier directory resolution: Handling legacy naming inconsistencies
•File format templates: research.md, sources.md, summary.md

code

Read("skills/research-storage/SKILL.md")

Directory Structure

code

project-root/
├── .research/                          # Research cache directory
│   ├── init/                           # /init command research
│   │   ├── summary.md                  # Structured research summary
│   │   └── sources.md                  # Source registry with reliability
│   │
│   └── sections/                       # /write command section research
│       ├── 01-1-introduction/
│       │   ├── research.md             # Section research results
│       │   └── sources.md              # Section sources
│       ├── 01-2-core-concepts/
│       │   ├── research.md
│       │   └── sources.md
│       └── {chapter}-{section}-{slug}/
│           ├── research.md
│           └── sources.md

Naming Convention

Section Directory Pattern (CANONICAL)

Format: {chapter}-{section}-{slug}

Component	Format	Canonical Example	Non-canonical (avoid)
chapter	Zero-padded 2 digits	`01`, `02`, `10`	`1`, `2`
section	Unpadded integer (NO zero-padding)	`1`, `2`, `3`	`01`, `02`
slug	Kebab-case lowercase	`core-concepts`	`Core-Concepts`

Canonical Examples:

•Chapter 1, Section 2, "Core Concepts" → 01-2-core-concepts ✓
•Chapter 3, Section 1, "Getting Started" → 03-1-getting-started ✓
•Chapter 10, Section 3, "Advanced Patterns" → 10-3-advanced-patterns ✓

Non-canonical (may exist from legacy/inconsistency):

•1-2-core-concepts (chapter not padded)
•01-02-core-concepts (section padded)
•01-2-Core-Concepts (slug not lowercase)

Normalization Functions

CRITICAL: All agents MUST use these normalization functions to ensure consistency.

normalizeChapter(chapter)

Converts any chapter format to canonical 2-digit zero-padded string.

code

Input: "1" or "01" or 1 or "001"
Output: "01" (always 2-digit zero-padded string)

Process:
1. Convert to integer: parseInt(chapter, 10)
2. Zero-pad to 2 digits: String(n).padStart(2, '0')

Examples:
- "1" → "01"
- "01" → "01"
- "10" → "10"
- 1 → "01"
- "001" → "01"
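The two steps above can be sketched in JavaScript as follows. Rejecting non-numeric or negative input is an assumption; the skill does not say how invalid chapters should be handled.

```javascript
// Sketch of normalizeChapter. The RangeError for invalid input is an
// assumption, not something this skill specifies.
function normalizeChapter(chapter) {
  const n = parseInt(String(chapter), 10);
  if (Number.isNaN(n) || n < 0) {
    throw new RangeError(`Invalid chapter: ${chapter}`);
  }
  // padStart only adds characters, so chapters of 100 or more keep all digits.
  return String(n).padStart(2, "0");
}

console.log(normalizeChapter("001")); // 01
```

Because `padStart` never truncates, a chapter such as 100 would come back as "100" rather than a 2-digit string.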

normalizeSection(section)

Converts any section format to a canonical unpadded integer string.

code

Input: "1" or "01" or 1
Output: "1" (unpadded integer string)

Process:
1. Convert to integer: parseInt(section, 10)
2. Convert to string: String(n)

Examples:
- "1" → "1"
- "01" → "1"
- "3" → "3"
- "03" → "3"
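A minimal JavaScript sketch of the same process is shown here. As with the chapter helper, the error thrown for invalid input is an assumption.

```javascript
// Sketch of normalizeSection: parse to an integer, then return it unpadded.
// Throwing on invalid input is an assumption, not part of this skill.
function normalizeSection(section) {
  const n = parseInt(String(section), 10);
  if (Number.isNaN(n) || n < 0) {
    throw new RangeError(`Invalid section: ${section}`);
  }
  return String(n);
}

console.log(normalizeSection("03")); // 3
```

A section number of 10 or higher yields a two-digit result (for example "12"), since the function only strips leading zeros.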

generateSlug(title)

Converts title to canonical kebab-case slug.

code

Input: Any title string
Output: Lowercase kebab-case slug

Process:
1. Convert to lowercase: title.toLowerCase()
2. Replace spaces with hyphens: replace(/\s+/g, '-')
3. Remove special characters (keep a-z, 0-9, -): replace(/[^a-z0-9-]/g, '')
4. Collapse multiple hyphens: replace(/-+/g, '-')
5. Trim leading/trailing hyphens: replace(/^-|-$/g, '')

Examples:
- "Core Concepts" → "core-concepts"
- "What is React?" → "what-is-react"
- "Setup & Installation" → "setup-installation"
- "  Multiple   Spaces  " → "multiple-spaces"
- "C++ Programming" → "c-programming"
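The five steps above translate directly into a chain of string operations. This is a sketch that applies them in the listed order.

```javascript
// Sketch of generateSlug, applying the five documented steps in order.
function generateSlug(title) {
  return title
    .toLowerCase()                // 1. lowercase
    .replace(/\s+/g, "-")         // 2. whitespace runs become hyphens
    .replace(/[^a-z0-9-]/g, "")   // 3. drop everything except a-z, 0-9, -
    .replace(/-+/g, "-")          // 4. collapse repeated hyphens
    .replace(/^-|-$/g, "");       // 5. trim a leading/trailing hyphen
}

console.log(generateSlug("Setup & Installation")); // setup-installation
```

Step 3 keeps only ASCII letters and digits, so a title made entirely of other characters produces an empty slug. Callers may want a fallback for that case; this skill does not define one.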

buildCanonicalPath(chapter, section, title)

Builds the canonical directory path.

code

Input: chapter, section, title
Output: ".research/sections/{canonical_chapter}-{canonical_section}-{canonical_slug}/"

Process:
1. canonical_chapter = normalizeChapter(chapter)
2. canonical_section = normalizeSection(section)
3. canonical_slug = generateSlug(title)
4. return ".research/sections/{canonical_chapter}-{canonical_section}-{canonical_slug}/"

Example:
- buildCanonicalPath("1", "02", "Core Concepts")
- → ".research/sections/01-2-core-concepts/"
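Putting the three helpers together, the path builder could look like the sketch below. The helpers are restated in compact form, without input validation, so the block runs on its own.

```javascript
// Compact restatements of the three normalization helpers (no validation).
function normalizeChapter(chapter) {
  return String(parseInt(String(chapter), 10)).padStart(2, "0");
}
function normalizeSection(section) {
  return String(parseInt(String(section), 10));
}
function generateSlug(title) {
  return title
    .toLowerCase()
    .replace(/\s+/g, "-")
    .replace(/[^a-z0-9-]/g, "")
    .replace(/-+/g, "-")
    .replace(/^-|-$/g, "");
}

// Joins the canonical parts into the section research directory path.
function buildCanonicalPath(chapter, section, title) {
  const c = normalizeChapter(chapter);
  const s = normalizeSection(section);
  const slug = generateSlug(title);
  return `.research/sections/${c}-${s}-${slug}/`;
}

console.log(buildCanonicalPath("1", "02", "Core Concepts"));
// .research/sections/01-2-core-concepts/
```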

File Path Generation

For /init Research

code

.research/init/summary.md
.research/init/sources.md

For Section Research

code

.research/sections/{chapter}-{section}-{slug}/research.md
.research/sections/{chapter}-{section}-{slug}/sources.md

Example: Section 1.2 "Core Concepts"

code

.research/sections/01-2-core-concepts/research.md
.research/sections/01-2-core-concepts/sources.md

File Format Templates

summary.md (for /init)

markdown

# Research Summary

> Generated: {YYYY-MM-DD}
> Topic: {topic}
> Domain: {domain}

## Key Concepts

### {Concept 1}

- **Definition**: {clear definition}
- **Importance**: {why it matters}
- **Source**: [{source name}]({url})

### {Concept 2}

- **Definition**: {clear definition}
- **Importance**: {why it matters}
- **Source**: [{source name}]({url})

## Learning Path

1. **Prerequisites**: {comma-separated list}
2. **Fundamentals**: {comma-separated list}
3. **Core Skills**: {comma-separated list}
4. **Advanced Topics**: {comma-separated list}

## Current Trends ({current_year})

- {trend 1 with source link}
- {trend 2 with source link}

## Domain-Specific Information

{domain-specific sections based on domain-profiles skill}

sources.md (for both /init and sections)

markdown

# Source Registry

> Section: {section_id or "init"}
> Generated: {YYYY-MM-DD}

## Primary Sources (High Reliability)

| Source | URL   | Type          | Last Verified |
| ------ | ----- | ------------- | ------------- |
| {name} | {url} | Official Docs | {YYYY-MM-DD}  |
| {name} | {url} | Official Docs | {YYYY-MM-DD}  |

## Secondary Sources (Medium Reliability)

| Source | URL   | Type     | Notes   |
| ------ | ----- | -------- | ------- |
| {name} | {url} | Tutorial | {notes} |
| {name} | {url} | Blog     | {notes} |

## Rejected Sources

| Source | Reason            |
| ------ | ----------------- |
| {name} | Outdated (year)   |
| {name} | Unreliable author |

research.md (for sections)

markdown

# Research: {Section Title}

> Section: {chapter}.{section} {title}
> Target Pages: {N}p
> Generated: {YYYY-MM-DD}

## Scope

{Brief description of what this section covers}

## Key Concepts

### {Concept 1}

- **Definition**: {definition}
- **Source**: [{name}]({url})

### {Concept 2}

- **Definition**: {definition}
- **Source**: [{name}]({url})

## Code Examples

### {Example Title}

```{language}
{code}
```

> Source: [{name}]({url})

## Common Pitfalls

1. **{Pitfall 1}**: {description}
   - **Cause**: {why it happens}
   - **Solution**: {how to avoid}

2. **{Pitfall 2}**: {description}
   - **Cause**: {why it happens}
   - **Solution**: {how to avoid}

## Practical Insights

- {insight 1 with source link}
- {insight 2 with source link}

## Subtopic Coverage

| Subtopic | Status   | Source            |
| -------- | -------- | ----------------- |
| {name}   | Complete | [{source}]({url}) |
| {name}   | Partial  | [{source}]({url}) |
| {name}   | Missing  | -                 |

Directory Resolution Strategy

When locating research directories, use a multi-tier search to handle naming inconsistencies introduced by legacy data or by different generation sources.

Why Multi-Tier Search?

Research directories may have been created with inconsistent naming:

Inconsistency Type	Example Mismatch
Chapter padding	`1-2-intro` vs `01-2-intro`
Section padding	`01-02-intro` vs `01-2-intro`
Slug case	`01-2-Core-Concepts` vs `01-2-core-concepts`
Slug special chars	`01-2-whats-new?` vs `01-2-whats-new`
Combined inconsistency	`1-02-What's New?` vs `01-2-whats-new`

Multi-Tier Search Algorithm

⚠️ CRITICAL: Glob Returns Files Only

Glob does NOT return directories. All patterns MUST end with a filename (e.g., /research.md).

Pattern	Result
`.research/sections/*9-1*`	❌ Empty (matches directory, not returned)
`.research/sections/*9-1*/research.md`	✅ Returns file path
`.research/sections/*9-1*/*`	✅ Returns all files in matching dirs

Execute tiers in order. Stop at first successful match.

Tier 1: Canonical Exact Match (Primary)

Search using fully normalized canonical path.

code

canonical_chapter = normalizeChapter(chapter)  # "1" → "01"
canonical_section = normalizeSection(section)  # "02" → "2"
canonical_slug = generateSlug(title)           # "Core Concepts" → "core-concepts"

Glob(".research/sections/{canonical_chapter}-{canonical_section}-{canonical_slug}/research.md")

Example: Glob(".research/sections/01-2-core-concepts/research.md")

Tier 2: Canonical Chapter-Section, Any Slug

If Tier 1 fails, search with canonical chapter-section but wildcard slug.

code

Glob(".research/sections/{canonical_chapter}-{canonical_section}-*/research.md")

Example: Glob(".research/sections/01-2-*/research.md")

This catches slug variations like Core-Concepts, core_concepts, etc.

Tier 3: Non-Padded Chapter Variation

If Tier 2 fails, try without chapter zero-padding (legacy compatibility).

code

raw_chapter = String(parseInt(chapter, 10))  # "01" → "1"

Glob(".research/sections/{raw_chapter}-{canonical_section}-*/research.md")

Example: Glob(".research/sections/1-2-*/research.md")

Tier 4: Flexible Pattern Match (Last Resort)

If all tiers above fail, match on the section number and the first keyword of the slug.

code

first_keyword = generateSlug(title).split('-')[0]  # "core-concepts" → "core"

Glob(".research/sections/*-{canonical_section}-*{first_keyword}*/research.md")

Example: Glob(".research/sections/*-2-*core*/research.md")
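To see the four tiers side by side, the sketch below builds all four glob patterns for one input. `buildTierPatterns` is a hypothetical helper name, and the normalization is inlined so the block stands alone.

```javascript
// Hypothetical helper: returns the Tier 1-4 glob patterns, in order.
function buildTierPatterns(chapter, section, title) {
  const rawChapter = String(parseInt(String(chapter), 10)); // "01" -> "1"
  const c = rawChapter.padStart(2, "0");                    // "1"  -> "01"
  const s = String(parseInt(String(section), 10));          // "02" -> "2"
  const slug = title
    .toLowerCase()
    .replace(/\s+/g, "-")
    .replace(/[^a-z0-9-]/g, "")
    .replace(/-+/g, "-")
    .replace(/^-|-$/g, "");
  const keyword = slug.split("-")[0];
  const base = ".research/sections";
  return [
    `${base}/${c}-${s}-${slug}/research.md`,     // Tier 1: exact canonical
    `${base}/${c}-${s}-*/research.md`,           // Tier 2: any slug
    `${base}/${rawChapter}-${s}-*/research.md`,  // Tier 3: unpadded chapter
    `${base}/*-${s}-*${keyword}*/research.md`,   // Tier 4: flexible match
  ];
}

console.log(buildTierPatterns("1", "02", "Core Concepts").join("\n"));
```

Every tier uses the canonical section number, so as written none of these patterns matches a directory whose section is zero-padded, such as `01-02-intro`.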

Resolution Output Format

Return resolution result in XML format:

xml

<directory_resolution>
  <input>
    <chapter>{original_chapter}</chapter>
    <section>{original_section}</section>
    <title>{original_title}</title>
  </input>
  <canonical>
    <chapter>{normalized_chapter}</chapter>
    <section>{normalized_section}</section>
    <slug>{normalized_slug}</slug>
    <path>{canonical_path}</path>
  </canonical>
  <resolution status="FOUND|NEW">
    <resolved_path>{matched_path OR canonical_path}</resolved_path>
    <existing>{true|false}</existing>
    <match_tier>{1|2|3|4|new}</match_tier>
  </resolution>
</directory_resolution>

Resolution Logic Summary

code

function resolveResearchDirectory(chapter, section, title):
  # Normalize inputs
  c = normalizeChapter(chapter)
  s = normalizeSection(section)
  slug = generateSlug(title)
  canonical = ".research/sections/{c}-{s}-{slug}/"

  # Tier 1: Exact canonical
  result = Glob("{canonical}research.md")
  if result: return { path: canonical, existing: true, tier: 1 }

  # Tier 2: Canonical chapter-section, any slug
  result = Glob(".research/sections/{c}-{s}-*/research.md")
  if result: return { path: parent(result[0]), existing: true, tier: 2 }

  # Tier 3: Non-padded chapter
  raw_c = String(parseInt(chapter, 10))
  result = Glob(".research/sections/{raw_c}-{s}-*/research.md")
  if result: return { path: parent(result[0]), existing: true, tier: 3 }

  # Tier 4: Flexible pattern
  keyword = slug.split('-')[0]
  result = Glob(".research/sections/*-{s}-*{keyword}*/research.md")
  if result: return { path: parent(result[0]), existing: true, tier: 4 }

  # No match - use canonical for new directory
  return { path: canonical, existing: false, tier: "new" }
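The "first match wins" control flow can be sketched in JavaScript as below. This is a variation on the pseudocode, not a replacement: `resolveFromPatterns` is a hypothetical name, it takes the tier patterns as an ordered array, and `glob` is an injected function (an assumption) that maps a pattern to an array of matching file paths, standing in for the agent's Glob tool.

```javascript
// Hypothetical variant of the resolution loop. `glob` is an injected
// stand-in for the Glob tool: (pattern) => array of matching file paths.
function resolveFromPatterns(patterns, canonicalDir, glob) {
  for (let i = 0; i < patterns.length; i++) {
    const matches = glob(patterns[i]);
    if (matches.length > 0) {
      // Glob returns files, so strip the filename to get the directory.
      const file = matches[0];
      const dir = file.slice(0, file.lastIndexOf("/") + 1);
      return { path: dir, existing: true, tier: i + 1 };
    }
  }
  // No tier matched: fall back to the canonical path for a new directory.
  return { path: canonicalDir, existing: false, tier: "new" };
}

// Usage with a fake glob that only knows one legacy directory.
const legacy = ".research/sections/1-2-Core-Concepts/research.md";
const fakeGlob = (pattern) =>
  pattern === ".research/sections/1-2-*/research.md" ? [legacy] : [];
const result = resolveFromPatterns(
  [
    ".research/sections/01-2-core-concepts/research.md",
    ".research/sections/01-2-*/research.md",
    ".research/sections/1-2-*/research.md",
    ".research/sections/*-2-*core*/research.md",
  ],
  ".research/sections/01-2-core-concepts/",
  fakeGlob
);
console.log(result.tier, result.path); // 3 .research/sections/1-2-Core-Concepts/
```

Injecting `glob` keeps the loop testable without touching the file system. When several directories match a wildcard tier, this sketch takes the first result, as the pseudocode does.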

Usage Patterns

Checking Existing Research (UPDATED)

IMPORTANT: Do NOT rely on a single exact-path Glob. Use the multi-tier resolution strategy above.

code

# OLD (may miss existing research due to naming inconsistency)
Glob(".research/sections/{chapter}-{section}-{slug}/research.md")

# NEW (walks Tiers 1-4, then falls back to the canonical path)
resolution = resolveResearchDirectory(chapter, section, title)
existing_research = resolution.existing
research_dir = resolution.path

Returns resolved directory path and existence status.

Reading Research Files

When consuming research, read files directly in agent context:

code

Read(".research/init/summary.md")
Read(".research/sections/01-2-core-concepts/research.md")

Saving Research Results

Agents should Write files following the templates above:

code

Write(".research/init/summary.md", content)
Write(".research/sections/01-2-core-concepts/research.md", content)

Agent-Specific Guidelines

research-collector Agent

•Outputs to: .research/init/summary.md, .research/init/sources.md
•Creates the directory if it does not exist
•Returns confirmation only: research_saved:.research/init/

researcher Agent

•Outputs to: .research/sections/{id}/research.md, .research/sections/{id}/sources.md
•Checks existing research via the Subtopic Coverage table
•Appends to the existing file when coverage is partial
•Returns confirmation only: research_saved:{output_dir}
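One way the Subtopic Coverage check might be done is to scan the table in an existing research.md for rows that are not yet Complete. The sketch below assumes the table follows the research.md template exactly (a `## Subtopic Coverage` heading followed by a pipe table). `findIncompleteSubtopics` is a hypothetical helper name.

```javascript
// Hypothetical helper: lists subtopics whose status is Partial or Missing.
// Assumes the "## Subtopic Coverage" pipe table from the research.md template.
function findIncompleteSubtopics(markdown) {
  const lines = markdown.split("\n");
  const start = lines.findIndex((l) => l.trim() === "## Subtopic Coverage");
  if (start === -1) return [];
  const incomplete = [];
  for (const line of lines.slice(start + 1)) {
    const trimmed = line.trim();
    if (trimmed.startsWith("## ")) break;     // reached the next section
    if (!trimmed.startsWith("|")) continue;   // not a table row
    const cells = trimmed.split("|").slice(1, -1).map((c) => c.trim());
    if (cells.length < 2) continue;
    const [name, status] = cells;
    // Header and separator rows fall through: their status cell is neither value.
    if (status === "Partial" || status === "Missing") {
      incomplete.push({ name, status });
    }
  }
  return incomplete;
}

const sample = [
  "## Subtopic Coverage",
  "",
  "| Subtopic | Status   | Source |",
  "| -------- | -------- | ------ |",
  "| Hooks    | Complete | [docs](https://example.com) |",
  "| Context  | Partial  | [blog](https://example.com) |",
  "| Portals  | Missing  | -      |",
].join("\n");
console.log(findIncompleteSubtopics(sample).map((s) => s.name)); // [ 'Context', 'Portals' ]
```

An empty result suggests the cached research can be reused as is. A non-empty one gives the researcher the subtopics still to be appended.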

Consumer Agents (structure-designer, writer)

•Receive file paths in prompt
•Read files directly in their own context
•Do not modify research files

XML Output Schemas

Standardized XML schemas for agent communication. All agents should use these formats for consistency.

Research Result Schema

Used by research-collector and researcher agents:

xml

<research_result domain="{technology|history|science|arts|general}" status="OK|PARTIAL|ERROR">
  <summary>
    <sources_count>{N}</sources_count>
    <concepts_count>{N}</concepts_count>
    <output_path>{path}</output_path>
    <generated>{YYYY-MM-DD}</generated>
  </summary>

  <authoritative_sources>
    - [Source Name](url) - {reliability: high|medium}
  </authoritative_sources>

  <key_concepts>
    - **{Term}**: {Definition}
  </key_concepts>

  <learning_path>
    1. Prerequisites: {list}
    2. Fundamentals: {list}
    3. Core Skills: {list}
    4. Advanced: {list}
  </learning_path>

  <!-- Domain-specific sections as per domain-profiles -->
</research_result>

Directory Resolution Schema

Used by researcher agent for path resolution:

xml

<directory_resolution>
  <input>
    <chapter>{original_chapter}</chapter>
    <section>{original_section}</section>
    <title>{original_title}</title>
  </input>
  <canonical>
    <chapter>{normalized_chapter}</chapter>
    <section>{normalized_section}</section>
    <slug>{normalized_slug}</slug>
    <path>{canonical_path}</path>
  </canonical>
  <resolution status="FOUND|NEW">
    <resolved_path>{matched_path OR canonical_path}</resolved_path>
    <existing>{true|false}</existing>
    <match_tier>{1|2|3|4|new}</match_tier>
  </resolution>
</directory_resolution>
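A resolution result can be serialized into this schema with a small renderer such as the sketch below. Escaping `&`, `<`, and `>` is an assumption added here so that titles like "Setup & Installation" stay well-formed; the schema itself does not mention escaping. The `input`, `canonical`, and `result` objects are hypothetical shapes that mirror the schema fields.

```javascript
// Escape characters that would break XML element content (an assumption;
// the schema does not specify escaping rules).
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical renderer: `input`, `canonical`, and `result` are plain
// objects mirroring the schema fields.
function renderDirectoryResolution(input, canonical, result) {
  const status = result.existing ? "FOUND" : "NEW";
  return [
    "<directory_resolution>",
    "  <input>",
    `    <chapter>${escapeXml(input.chapter)}</chapter>`,
    `    <section>${escapeXml(input.section)}</section>`,
    `    <title>${escapeXml(input.title)}</title>`,
    "  </input>",
    "  <canonical>",
    `    <chapter>${escapeXml(canonical.chapter)}</chapter>`,
    `    <section>${escapeXml(canonical.section)}</section>`,
    `    <slug>${escapeXml(canonical.slug)}</slug>`,
    `    <path>${escapeXml(canonical.path)}</path>`,
    "  </canonical>",
    `  <resolution status="${status}">`,
    `    <resolved_path>${escapeXml(result.path)}</resolved_path>`,
    `    <existing>${result.existing}</existing>`,
    `    <match_tier>${result.tier}</match_tier>`,
    "  </resolution>",
    "</directory_resolution>",
  ].join("\n");
}

const xml = renderDirectoryResolution(
  { chapter: "1", section: "02", title: "Setup & Installation" },
  { chapter: "01", section: "2", slug: "setup-installation",
    path: ".research/sections/01-2-setup-installation/" },
  { path: ".research/sections/01-2-Setup/", existing: true, tier: 2 }
);
console.log(xml);
```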

Subtopic Coverage Schema

Used within research files to track coverage:

xml

<subtopic_coverage>
  <subtopic name="{name}" status="Complete|Partial|Missing">
    <source>{url or "pending"}</source>
  </subtopic>
</subtopic_coverage>

Status Values Reference

Status	Context	Meaning
OK	research_result	All subtopics covered, sufficient sources
PARTIAL	research_result	Some subtopics missing or incomplete
ERROR	research_result	Critical failure (e.g., no sources found)
FOUND	directory_resolution	Existing research directory located
NEW	directory_resolution	No existing research, use canonical path
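As a rough illustration of how the three `research_result` statuses relate, the sketch below derives one from subtopic statuses and a source count. The thresholds are assumptions: the table does not define "sufficient sources", so this treats zero sources as ERROR and any incomplete subtopic as PARTIAL. `deriveResearchStatus` is a hypothetical helper name.

```javascript
// Hypothetical helper. Assumptions: zero sources is the critical failure,
// and any subtopic that is not "Complete" makes the result PARTIAL.
function deriveResearchStatus(subtopicStatuses, sourcesCount) {
  if (sourcesCount === 0) return "ERROR";
  const allComplete = subtopicStatuses.every((s) => s === "Complete");
  return allComplete ? "OK" : "PARTIAL";
}

console.log(deriveResearchStatus(["Complete", "Partial"], 4)); // PARTIAL
```

With an empty subtopic list and at least one source, `every` returns true and the sketch reports OK. A stricter rule may be preferable in practice.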

Error Handling

Scenario	Action
`.research/` directory missing	Auto-create on first write
Research file not found	Conduct fresh research
Read failure	Log warning, conduct fresh research
Write failure	Report error, do not update task.md

.gitignore Recommendation

Research files are regenerable and should typically be ignored:

gitignore

# Research cache (regenerable)
.research/
