AgentSkillsCN

research-paper-study

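A comprehensive ML/research paper study workflow: summarize papers from an arXiv URL or a local PDF, hold interactive Q&A discussions, and generate Hugo blog posts. Supports output in both English and Korean. Use when the user wants to (1) summarize and study a research paper from arXiv or a PDF; (2) discuss the paper's findings interactively; (3) write a blog post with detailed analysis and figures; or (4) request a summary in Korean.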

SKILL.md
--- frontmatter
name: research-paper-study
description: Comprehensive ML/research paper study workflow - summarize papers (from arXiv URL or local PDF), facilitate interactive discussion with Q&A, and generate Hugo blog posts. Supports both English and Korean output. Use when user wants to (1) summarize/study a research paper from arXiv or PDF, (2) discuss paper findings interactively, (3) create blog post with detailed analysis and figures, or (4) requests Korean language output for summaries.

Research Paper Study

Complete workflow for studying ML and research papers: summarization, interactive discussion, study-note compilation (with Obsidian-style image embeds), and Hugo blog post generation.

Quick Start

Fetch from arXiv

bash
python scripts/fetch_arxiv.py "https://arxiv.org/abs/1706.03762"
# Or just the ID:
python scripts/fetch_arxiv.py "1706.03762"

Extract from Local PDF

bash
python scripts/extract_pdf_text.py paper.pdf extracted-text.txt

Extract Images from PDF

bash
python scripts/extract_pdf_images.py paper.pdf output-dir/ --prefix arxiv-id --min-size-kb 20

Compile Study Notes

bash
python scripts/compile_study_notes.py arxiv-metadata.json --lang ko --output study-notes.md

Complete Workflow

Phase 1: Acquire Paper

From arXiv:

  1. Run scripts/fetch_arxiv.py with URL or ID
  2. Downloads PDF and extracts metadata (title, authors, abstract, year)
  3. Outputs: {arxiv-id}.pdf and {arxiv-id}-metadata.json
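
The internals of scripts/fetch_arxiv.py are not shown here, so the following is only a minimal sketch of how a URL or bare ID might be reduced to an arXiv ID and mapped to the two output filenames. The helper names `normalize_arxiv_id` and `output_paths` are hypothetical, and only new-style identifiers (such as 1706.03762) are handled.

```python
import re

# New-style arXiv identifiers: YYMM.NNNN or YYMM.NNNNN, optionally followed by a version (v2).
# Assumption: old-style IDs such as "hep-th/9901001" are out of scope for this sketch.
_ARXIV_ID = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")


def normalize_arxiv_id(url_or_id: str) -> str:
    """Return the arXiv ID (keeping any version suffix) found in a URL or ID string."""
    match = _ARXIV_ID.search(url_or_id.strip())
    if match is None:
        raise ValueError(f"No arXiv identifier found in {url_or_id!r}")
    return match.group(0)


def output_paths(arxiv_id: str) -> tuple[str, str]:
    """Build the PDF and metadata filenames described in the steps above."""
    return f"{arxiv_id}.pdf", f"{arxiv_id}-metadata.json"
```

Both call styles from the Quick Start resolve to the same ID, so downstream filenames stay consistent regardless of how the paper was specified.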

From Local PDF:

  1. If user provides PDF path, use scripts/extract_pdf_text.py
  2. Ask user for metadata (title, authors, year, link) or extract from filename
  3. Create metadata JSON manually for consistency
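
For a local PDF, the metadata file could be created along these lines. This is a sketch only: the field names (`title`, `authors`, `abstract`, `year`, `link`) mirror the metadata listed above, but the exact schema expected by the other scripts is an assumption, and `build_metadata` and `write_metadata` are hypothetical helpers.

```python
import json
from pathlib import Path


def build_metadata(title, authors, year, link, abstract=""):
    """Assemble a metadata dict; the key names are assumed, not confirmed by the scripts."""
    return {
        "title": title,
        "authors": list(authors),
        "abstract": abstract,
        "year": int(year),
        "link": link,
    }


def write_metadata(metadata: dict, path: str) -> None:
    """Write UTF-8 JSON; ensure_ascii=False keeps Korean text readable in the file."""
    Path(path).write_text(
        json.dumps(metadata, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

Writing the same shape for local PDFs and arXiv downloads lets the later phases treat both sources identically.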

Phase 2: Summarize Paper

Use sub-agent for deep reading:

  1. Spawn an isolated session with a larger-context model (recommended: sonnet or gpt)

  2. Provide extracted text + metadata

  3. Check USER.md for user role/preferences to determine technical depth

  4. Generate a technical summary using the following prompt:

    code
    Summarize this research paper with technical depth. Include:
    - Problem statement with complexity/efficiency analysis
    - Key contributions with architectural details
    - Methodology: equations, hyperparameters, design choices
    - Results: concrete metrics, ablation studies, comparisons
    - Computational complexity analysis where relevant
    - Limitations and future work
    
    [Language: English/Korean based on user preference]
    
    Do NOT include phrases like "ML Engineer perspective" or "for technical audience" 
    in the summary title or section headers. Just write technically by default.
    
  5. Return structured summary to main session

Summary structure:

  • Problem definition (include bottlenecks, complexity issues)
  • Key contributions (specific innovations with details)
  • Methodology (equations, architectures, hyperparameters)
  • Experimental results (tables, ablations, comparisons)
  • Computational analysis (complexity, efficiency)
  • Limitations and future directions

Phase 3: Interactive Discussion

  1. Present FULL summary to user:

    • Show the complete summary generated by sub-agent
    • Do NOT summarize or condense it further
    • Do NOT show only "highlights" or "key points"
    • The user wants to read the entire technical analysis
  2. User reads and asks questions

  3. Answer questions conversationally

  4. Track Q&A pairs for compilation:

    • Store each question and answer
    • Maintain order of discussion
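
One way to track Q&A pairs in order is a small in-memory log that can later be rendered in the "### Q:" / "**A:**" layout used in the output examples. The class name `DiscussionLog` and its methods are hypothetical; this is a sketch, not the skill's actual storage mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class DiscussionLog:
    """Keeps question/answer pairs in the order they were discussed."""

    pairs: list[tuple[str, str]] = field(default_factory=list)

    def add(self, question: str, answer: str) -> None:
        # Appending preserves discussion order.
        self.pairs.append((question.strip(), answer.strip()))

    def to_markdown(self, heading: str = "Discussion") -> str:
        """Render the log as a markdown section with one Q/A block per pair."""
        lines = [f"## {heading}"]
        for question, answer in self.pairs:
            lines.append(f"### Q: {question}")
            lines.append(f"**A:** {answer}")
            lines.append("")  # blank line between pairs
        return "\n".join(lines).rstrip() + "\n"
```

For Korean output, the heading could be passed as `to_markdown("토론 및 질의응답")` to match the Korean template.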

Example questions to anticipate:

  • "What's novel about this approach?"
  • "How does this compare to [other method]?"
  • "What are the practical applications?"
  • "What experiments did they run?"

Phase 4: Extract Important Figures

If figures are needed:

  1. Run scripts/extract_pdf_images.py
  2. Filter by size (default --min-size-kb 20)
  3. Show extracted images with page numbers
  4. Ask user which figures to keep (e.g., "1, 3, 5")

Naming convention:

  • {arxiv-id}-fig{N}.{ext}
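
The selection and naming steps could be sketched as below. `parse_selection` and `figure_filename` are hypothetical helpers, and the assumption is that the user replies with figure numbers separated by commas or spaces.

```python
def parse_selection(reply: str) -> list[int]:
    """Parse a reply such as "1, 3, 5" into sorted, de-duplicated figure numbers."""
    numbers = set()
    for token in reply.replace(",", " ").split():
        if token.isdecimal():  # silently skip anything that is not a plain number
            numbers.add(int(token))
    return sorted(numbers)


def figure_filename(arxiv_id: str, n: int, ext: str) -> str:
    """Apply the {arxiv-id}-fig{N}.{ext} convention; a leading dot in ext is tolerated."""
    return f"{arxiv_id}-fig{n}.{ext.lstrip('.')}"
```

Ignoring non-numeric tokens is a design choice for leniency; a stricter variant could raise an error and ask the user again.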

Phase 5: Compile Study Notes

When user requests compilation (e.g., "save this" or "create study notes"):

  1. Use appropriate template:

    • English: references/hugo-post-template-en.md
    • Korean: references/hugo-post-template-ko.md
  2. Fill template with:

    • Metadata (from Phase 1)
    • Study date: Use date +"%Y-%m-%d" to get current date (NOT hardcoded)
    • Summary (from Phase 2)
    • Q&A discussion (from Phase 3)
    • User insights (ask if not provided)
    • Image embeds (from Phase 4, if applicable)
  3. Image placement:

    • Do NOT place all figures at the end in a separate section
    • Embed figures inline at contextually appropriate locations
    • Examples:
      • Architecture diagrams → within "Methodology" section when describing the architecture
      • Attention mechanism diagrams → when explaining attention details
      • Result graphs → within "Results" section
    • Use Obsidian wiki-link format: ![[attachments/papers/image.png]]
    • Add descriptive captions under each figure
  4. Reference links:

    • For cited papers in "References" section, add hyperlinks to arXiv/DOI
    • Format: [Paper Title](https://arxiv.org/abs/XXXX.XXXXX)
  5. Section naming:

    • Use "인사이트" NOT "개인적 인사이트" for Korean
    • Use "Insights" NOT "Personal Insights" for English
  6. Generate output filename: YYYY-MM-DD-paper-title-slug.md
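
A possible way to build the YYYY-MM-DD-paper-title-slug.md filename in Python, using the current date rather than a hardcoded one. The slug follows the rules listed in Phase 6 (lowercase, hyphens, ASCII only, roughly 60 characters); `slugify` and `notes_filename` are hypothetical names, and titles with no ASCII characters would yield an empty slug in this sketch.

```python
import re
import unicodedata
from datetime import date
from typing import Optional


def slugify(title: str, max_len: int = 60) -> str:
    """Lowercase, ASCII-only, hyphen-separated slug, truncated to max_len characters."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return slug[:max_len].rstrip("-")


def notes_filename(title: str, study_date: Optional[date] = None) -> str:
    """Prefix the slug with the study date; defaults to today's date."""
    study_date = study_date or date.today()
    return f"{study_date.isoformat()}-{slugify(title)}.md"
```

Passing `study_date` explicitly is mainly useful for tests or for backdating notes; normal use would rely on the default.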

Phase 6: Generate Hugo Blog Post

After compiling the study notes, generate a Hugo blog post.

Consult references/hugo-integration.md for detailed guidelines.

  1. Determine blog root:

    • Default: /Users/gil-yoonhee/VSCodeProject/moripiri.github.io
    • Or check current working directory
  2. Create slug from paper title:

    • Lowercase, hyphens for spaces, ASCII only
    • Example: "Attention Is All You Need" → attention-is-all-you-need
    • Max ~60 chars
  3. Create page bundle directory:

    bash
    mkdir -p content/posts/<slug>/images
    
  4. Copy extracted images:

    • Source: paper-images-<arxiv-id>/ or workspace
    • Destination: content/posts/<slug>/images/
    • Rename descriptively if needed:
      • 2512.24601-fig1.png → figure1-architecture.png
      • 2512.24601-fig3.png → figure2-main-results.png
  5. Generate frontmatter:

    yaml
    ---
    title: "<Paper Title> 요약"  # or "Summary" for English
    date: <YYYY-MM-DD>
    draft: true
    tags: ["Paper Review", "<topic-tag>"]
    ShowToc: true
    description: "<one-line summary>"
    ---
    
  6. Add AI notice (MUST be at top of content):

    markdown
    > **🤖 AI Summary Notice**
    > 이 글은 AI(Claude)가 논문을 읽고 작성한 요약입니다. 부정확한 내용이 있을 수 있으니, 정확한 정보는 원문을 참고해주세요.
    
    <!--more-->
    
    **저자:** [Author names]  
    **발행년도:** [Publication year]  
    **링크:** [arXiv/DOI URL]
    
  7. Convert markdown format and paste content:

    • Convert Obsidian wiki-links → Hugo image syntax
    • ![[attachments/papers/image.png]] → <p align="center"><img src="images/image.png" alt="description"></p>
    • IMPORTANT: Keep ALL other content AS-IS from research-paper-study summary
    • DO NOT simplify or restructure the technical content
    • Preserve: complexity analysis, equations, tables, ablations, detailed insights
  8. Write to file:

    • Save as content/posts/<slug>/index.md
  9. Start Hugo server (optional):

    bash
    cd /Users/gil-yoonhee/VSCodeProject/moripiri.github.io && hugo server -D
    
  10. Confirm to user:

    • Post location: content/posts/<slug>/index.md
    • Preview URL: http://localhost:1313/posts/<slug>/
    • Status: draft (set draft: false when ready)
    • Images copied: list filenames
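
The wiki-link conversion in step 7 could be done with a regular expression, as in the sketch below. `wiki_links_to_hugo` is a hypothetical helper. It assumes the image keeps its original filename (if files were renamed in step 4, the mapping would need to be applied as well), and it uses the text after an optional `|` in the wiki-link, or else the filename stem, as the alt text.

```python
import html
import re
from pathlib import PurePosixPath

# Matches ![[path/to/image.png]] and ![[path/to/image.png|alt text]].
_WIKI_IMAGE = re.compile(r"!\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def wiki_links_to_hugo(markdown: str, image_dir: str = "images") -> str:
    """Replace Obsidian image embeds with centered <img> tags; other text is untouched."""

    def replace(match: re.Match) -> str:
        filename = PurePosixPath(match.group(1).strip()).name
        alt = (match.group(2) or PurePosixPath(filename).stem).strip()
        return (
            f'<p align="center"><img src="{image_dir}/{filename}" '
            f'alt="{html.escape(alt, quote=True)}"></p>'
        )

    return _WIKI_IMAGE.sub(replace, markdown)
```

Because only the embed pattern is rewritten, equations, tables, and the rest of the technical content pass through unchanged, which matches the requirement to keep everything else as-is.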

Language Support

Default: English

Korean mode: Triggered when:

  • User explicitly requests Korean (e.g., "한국어로", "write in Korean")
  • USER.md indicates Korean preference
  • Use --lang ko flag for scripts
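
The trigger conditions above might be approximated with a simple keyword check, as in this sketch. `choose_language` and the hint list are hypothetical, and plain substring matching is crude (it would also fire on a sentence like "do not use Korean"), so an agent would normally rely on its own reading of the request and USER.md rather than on this heuristic alone.

```python
# Assumed hints; "korean" is matched case-insensitively, "한국어" covers "한국어로" etc.
_KOREAN_HINTS = ("korean", "한국어")


def choose_language(request: str, user_md: str = "") -> str:
    """Return "ko" if the request or USER.md text mentions Korean, else the default "en"."""
    for text in (request, user_md):
        lowered = text.lower()
        if any(hint in lowered for hint in _KOREAN_HINTS):
            return "ko"
    return "en"
```

The returned value can be passed straight to the scripts, for example as the argument to `--lang`.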

What gets translated:

  • Summary content
  • Q&A responses
  • Template headers and labels
  • Output markdown file

What stays in English:

  • Paper metadata (title, authors, as published)
  • Technical terms (when appropriate)
  • arXiv links and identifiers

Dependencies

Python packages required:

  • requests (arXiv API)
  • pdfplumber (PDF text extraction)
  • PyMuPDF (PDF image extraction)

Install if missing:

bash
pip install requests pdfplumber PyMuPDF
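
To check which packages are missing before running the scripts, something like the following could be used. `missing_packages` is a hypothetical helper; note that PyMuPDF is imported under the module name `fitz`, so the pip name and the import name differ.

```python
from importlib.util import find_spec

# pip package name -> importable module name.
REQUIRED = {"requests": "requests", "pdfplumber": "pdfplumber", "PyMuPDF": "fitz"}


def missing_packages(required: dict[str, str] = REQUIRED) -> list[str]:
    """Return the pip names of packages whose modules cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]
```

An empty list means all three dependencies are importable; otherwise the returned names can be passed to `pip install`.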

Output Example

See references/hugo-post-template-en.md and references/hugo-post-template-ko.md for full templates.

Hugo blog posts are generated as page bundles:

code
content/posts/<slug>/
├── index.md (frontmatter + content)
└── images/
    ├── figure1-architecture.png
    └── figure2-main-results.png

English output:

markdown
# Attention Is All You Need

**Authors:** Vaswani et al.
**Year:** 2017
**Link:** https://arxiv.org/abs/1706.03762
**Studied:** 2024-02-02

## Summary
[Summary content]

## Discussion
### Q: What's the key innovation?
**A:** The Transformer architecture eliminates recurrence...

Korean output:

markdown
# Attention Is All You Need

**저자:** Vaswani et al.
**발행년도:** 2017
**링크:** https://arxiv.org/abs/1706.03762
**학습일:** 2024-02-02

## 요약
[요약 내용]

## 토론 및 질의응답
### Q: 핵심 혁신은 무엇인가요?
**A:** Transformer 아키텍처는 순환 구조를 제거하고...

Advanced Usage

Batch processing:

  • Process multiple papers in sequence
  • Generate comparison notes across papers

Custom templates:

  • User can provide custom template in workspace
  • Override default templates if found

Integration with other tools:

  • Use hugo server -D for live preview while editing
  • Use git to commit and push blog posts
  • Generate multiple posts in batch for paper comparison series

Troubleshooting

PDF extraction issues:

  • Some PDFs are image-based (scanned) and require OCR before text can be extracted
  • Alternative: use the arXiv LaTeX source if available

arXiv rate limits:

  • Be respectful with API calls
  • Add delays if fetching multiple papers
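
A minimal sketch of spacing out requests when fetching several papers. `fetch_all` is a hypothetical helper that takes the actual fetch function as a parameter, and the 3-second default is an assumption; check arXiv's current API terms for the recommended interval.

```python
import time
from typing import Callable, Iterable


def fetch_all(
    ids: Iterable[str],
    fetch: Callable[[str], object],
    delay_seconds: float = 3.0,  # assumed default, not taken from this document
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Call fetch(id) for each ID, pausing between consecutive calls."""
    results = []
    for index, arxiv_id in enumerate(ids):
        if index > 0:
            sleep(delay_seconds)  # no pause before the first request
        results.append(fetch(arxiv_id))
    return results
```

The `sleep` parameter exists so the pause can be replaced in tests; in normal use the default `time.sleep` applies.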

Large papers:

  • May hit context limits during summarization
  • Use chunking strategy or higher context model
  • Focus on abstract + key sections only
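
One simple chunking strategy is fixed-size character windows with a small overlap, so that sentences cut at a boundary still appear whole in one of the chunks. The sketch below is an assumption about how chunking could be done, not the skill's own method; `chunk_text` and its default sizes are hypothetical and should be tuned to the model's context window.

```python
def chunk_text(text: str, max_chars: int = 12000, overlap: int = 500) -> list[str]:
    """Split text into windows of at most max_chars, each overlapping the previous one."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("require max_chars > 0 and 0 <= overlap < max_chars")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so consecutive chunks share some context
    return chunks
```

Each chunk can be summarized separately and the partial summaries merged afterwards; splitting on section headings instead of raw character counts would be a natural refinement.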