Curator Notes Processing

Process the human curator's monthly notes file into pipeline-ready content and editorial signals.

Quick Start

•Check for workspace/curator_notes_YYYY-MM.md (or known variants)
•If no file exists, skip this phase entirely -- pipeline continues normally
•Parse and classify every item using the link type taxonomy
•Filter: internal links become editorial signals; public links get visited
•Output processed items + editorial signals
•Items feed into Phase 3 curation alongside Phase 1C discoveries

Inputs

•
Curator notes file (optional): workspace/curator_notes_YYYY-MM.md
- •May also appear as: workspace/<month>.md, workspace/<Month>.md, workspace/Jan.md, etc.
- •Check for any .md file in workspace/ that looks like a brain dump of links
•Phase 1C Discoveries: For cross-reference (avoid duplicates)

Output

•workspace/curator_notes_processed_YYYY-MM.md -- Items in universal extraction format, ready for Phase 3
•workspace/curator_notes_editorial_signals_YYYY-MM.md -- Priority hints, theme suggestions, internal context

Core Workflow

Step 0: Locate the Notes File

Check these paths in order:

•workspace/curator_notes_YYYY-MM.md (canonical)
•workspace/<Month>.md (e.g., workspace/Jan.md, workspace/February.md)
•Any .md file in workspace/ containing 3+ URLs that isn't a pipeline output

If no file found: SKIP. Output nothing. Pipeline continues.

Step 1: Parse and Classify

For each line/item in the notes file, classify into one of these types:

Type	Detection Pattern	Action
GitHub Changelog	`github.blog/changelog/`	PROCESS: visit URL, extract item
GitHub Blog	`github.blog/news-insights/` or `github.blog/ai-and-ml/`	PROCESS: visit URL, extract item
VS Code Release Notes	`code.visualstudio.com/updates/`	PROCESS: visit URL, extract features
Microsoft DevBlog	`devblogs.microsoft.com/`	PROCESS: visit if GitHub-specific
Community resource	`github.com/<user>/`, personal sites, awesome-lists	PROCESS: visit URL, extract description
VS Code extension	`marketplace.visualstudio.com/`	PROCESS: visit URL, extract description
YouTube video	`youtube.com/watch`, `youtu.be/`	PROCESS: visit, extract title and context
Event registration	`registration.goldcast.io/`, `github.com/resources/events/`	PROCESS: extract event details
Named feature hint	Text line without URL referencing a product name	SIGNAL: flag as editorial priority
Internal link	Internal collaboration URLs (chat, docs, private repos, project boards)	SIGNAL: translate topic to public reference
Industry content	`hbr.org/`, `substack.com/`, `medium.com/`, `martinfowler.com/`, analyst reports	SIGNAL: framing context only
Competitor content	`blog.cloudflare.com/`, `sonarsource.com/`, etc.	SIGNAL: competitive context only
Text note (vague)	Bare text, no URL, no product name	SIGNAL: note for context

Step 2: Filter and Route

PROCESS items (visit, extract, output as discoveries):

•GitHub Changelog, Blog, VS Code, DevBlog, Community, Extensions, Videos, Events
•Visit each URL and extract: title, date, description, enterprise relevance

SIGNAL items (output as editorial signals):

•Internal links: identify the topic and note "watch for public announcement about [topic]"
•Industry content: summarize the thesis for framing context
•Named feature hints: flag as lead section candidates
•Competitor content: note competitive positioning opportunity

Step 3: Cross-Reference with Phase 1C

For each PROCESS item, check if Theme 1C discoveries already cover it:

•Already covered: Mark as "REINFORCEMENT" (human agreed this is important -- boost priority in Phase 3)
•Not covered: Mark as "ADDITION" (unique content from curator -- must include in Phase 3)

Step 4: Visit and Extract

For each PROCESS item not already covered by Phase 1C:

•Fetch the URL
•Extract: title, date, description (2-3 sentences), enterprise relevance (1-10)
•Format in universal extraction format
•Assign category from the newsletter taxonomy

Step 5: Write Outputs

Processed items (workspace/curator_notes_processed_YYYY-MM.md):

markdown

## Curator Notes: Processed Items

### Additions (not in Phase 1C)
- **[Item Title]** -- Description. - [Label](URL)
  - Source: Curator notes
  - Enterprise Relevance: N/10

### Reinforcements (also in Phase 1C -- boost priority)
- **[Item Title]** -- Already in discoveries. Curator explicitly flagged.

Editorial signals (workspace/curator_notes_editorial_signals_YYYY-MM.md):

markdown

## Editorial Signals from Curator Notes

### Lead Section Candidates
- [Feature hint]: Curator named this explicitly. Consider for lead.

### Priority Boosts
- [Topic]: Curator linked N items about this topic. Strong signal.

### Framing Context
- [Industry article summary]: Useful for newsletter introduction framing.

### Internal Signals (not linkable)
- [Internal thread topic]: Something is happening around [X]. Watch for public announcement.

Integration with Pipeline

•Phase 1C -> Phase 1.5 (this skill) -> Phase 3
•Phase 3 (content-curation) reads BOTH Phase 1C discoveries AND curator notes processed items
•Editorial signals inform Phase 3's lead section decision and item prioritization
•If a named feature hint matches a Phase 1C discovery, it gets 2.0x priority boost

Link Type Survival Intelligence

From analysis of 10 benchmark cycles (see reference/curator-notes-intelligence.md):

Type	Survival Rate	Notes
Named feature hints	100%	Always become sections/leads
Community resources	100%	Pipeline can't find these
Team member content	100%	Personal curation uniqueness
GitHub Changelog	67%	May be superseded
Microsoft DevBlog	50%	Only if GitHub-specific
YouTube videos	25%	Only major conference sessions
Internal links	0%	Translate to public references
Industry blogs	0%	Framing context only
Analyst reports	0%	Never linked directly

Reference

•Curator Notes Intelligence -- Full analysis from 10 benchmark cycles
•Content Format Spec -- Output formatting rules

Done When

• Notes file located (or confirmed absent)
• All items classified by type
• Public URLs visited and extracted
• Cross-referenced with Phase 1C discoveries
• Processed items file exists (if notes file existed)
• Editorial signals file exists (if notes file existed)
• No internal links leaked to processed items output