GitHub Code Search
Fetch real-world code examples from GitHub through intelligent multi-search orchestration.
Prerequisites
Required:
- •gh CLI installed and authenticated (`gh auth login --web`)
- •Node.js + pnpm
Validation:
gh auth status # Should show authenticated
Usage
cd plugins/knowledge-work/skills/gh-code-search
pnpm install # First time only
pnpm search "your query here"
When to Use
Invoke this skill when the user requests:
- •Real-world examples ("how do people implement X?")
- •Library usage patterns ("show me examples of workflow.dev usage")
- •Implementation approaches ("claude code skills doing github search")
- •Architectural patterns ("repository pattern in TypeScript")
- •Best practices for specific frameworks/libraries
Examples requiring nuanced multi-search:
- •"Find React hooks" → Need queries for `useState`/`useEffect` (imports), `function use` (definitions), `const use =` (arrow functions)
- •"Express error handling middleware" → Queries for `(req, res, next) =>` (signatures), `app.use` (registration), `function(err,` (error handlers)
- •"How do people use XState?" → Queries for `useMachine`, `createMachine`, `interpret` (actual function names, not "state machine")
- •"GitHub Actions for TypeScript projects" → Queries for `.yml` workflow files + `actions/setup-node` + `tsc` or `pnpm build` (actual commands)
How It Works
Division of responsibilities:
Script Responsibilities (Tool Implementation)
- •Authenticates with GitHub (gh CLI token)
- •Executes single search query (Octokit API, pagination, text-match)
- •Ranks by quality (stars, recency, code structure)
- •Fetches top 10 files in parallel
- •Extracts factual data (imports, syntax patterns, metrics)
- •Returns clean markdown with code + metadata + GitHub URLs for all files
The script is a single-query tool. Claude orchestrates multiple invocations.
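The source does not say how the script weighs stars, recency, and code structure when ranking. As a minimal sketch only, one way to combine the three signals is shown below; the `Candidate` shape, the `structureScore` field, the weights, and the three-year decay window are all assumptions made for illustration, not the script's actual behavior.

```typescript
// Hypothetical candidate shape; the real script's fields are not documented here.
interface Candidate {
  repo: string;
  path: string;
  stars: number;
  pushedAt: string; // ISO 8601 timestamp of the last push (assumed recency signal)
  structureScore: number; // assumed heuristic in [0, 1]
}

// Illustrative weights only; they sum to 1 so scores stay in [0, 1].
const WEIGHTS = { stars: 0.5, recency: 0.3, structure: 0.2 };
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function qualityScore(c: Candidate, now: Date): number {
  // Log-scale stars so very popular repos do not dominate; saturates near 100k stars.
  const stars = Math.min(Math.log10(c.stars + 1) / 5, 1);
  // Linear decay to zero over three years since the last push.
  const ageDays = (now.getTime() - new Date(c.pushedAt).getTime()) / MS_PER_DAY;
  const recency = Math.min(1, Math.max(0, 1 - ageDays / (3 * 365)));
  return (
    WEIGHTS.stars * stars +
    WEIGHTS.recency * recency +
    WEIGHTS.structure * c.structureScore
  );
}

function rankTop(candidates: Candidate[], limit = 10, now: Date = new Date()): Candidate[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...candidates]
    .sort((a, b) => qualityScore(b, now) - qualityScore(a, now))
    .slice(0, limit);
}
```

The default `limit` of 10 mirrors the "top 10 files" the script fetches; everything else is a placeholder to be replaced by whatever scoring the script really uses.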
Claude's Orchestration Responsibilities
Claude executes a multi-phase workflow:
- •Query Strategy Generation: Craft 3-5 targeted search queries considering:
  - •Language context (TypeScript vs JavaScript)
  - •File type nuances (tsx vs ts for React, config files vs implementations)
  - •Specificity variants (broad → narrow)
  - •Related terminology expansion
- •Sequential Search Execution: Run tool multiple times, adapting queries based on intermediate results
- •Result Aggregation: Combine all code files, deduplicate by repo+path, preserve GitHub URLs
- •Pattern Analysis: Extract common imports, architectural styles, implementation patterns
- •Comprehensive Summary: Synthesize findings with trade-offs, recommendations, key code highlights, and GitHub URLs for ALL meaningful files
Claude's Orchestration Workflow
Step 1: Analyze Request & Generate Queries (BLOCKING)
Analyze user request to determine:
- •Primary language/framework (TypeScript, Python, Go, etc.)
- •Specific patterns sought (hooks, middleware, configs, actions)
- •File type variations needed (tsx vs ts, yaml vs json)
Generate 3-5 targeted queries considering nuances:
Example 1: "Find React hooks"
- •Query 1: `useState useEffect language:typescript extension:tsx` (matches import statements, hook usage in components)
- •Query 2: `function use language:typescript extension:ts` (matches hook definitions like `function useFetch()`)
- •Query 3: `const use = language:typescript` (matches arrow function hooks like `const useAuth = () =>`)
Example 2: "Express error handling"
- •Query 1: `(req, res, next) => language:javascript` (matches middleware function signatures)
- •Query 2: `app.use express` (matches Express middleware registration)
- •Query 3: `function(err, req, res, next) language:javascript` (matches error handler signatures with 4 params)
Example 3: "XState state machines in React"
- •Query 1: `useMachine language:typescript extension:tsx` (matches React hook usage like `const [state, send] = useMachine(machine)`)
- •Query 2: `createMachine language:typescript` (matches machine definitions and imports)
- •Query 3: `interpret xstate language:typescript` (matches service creation like `const service = interpret(machine)`)
Rationale for each query:
- •Match actual code patterns (function names, import statements, signatures)
- •NOT human-friendly search terms or documentation phrases
- •Different file extensions capture different use cases (tsx for components, ts for utilities)
- •Specific function/variable names find concrete implementations
- •Signature patterns match common code structures (arrow functions, function declarations)
- •Language/extension filters prevent noise from unrelated languages
CHECKPOINT: Verify 3-5 queries generated before proceeding
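The query shapes in the examples above (literal code tokens followed by `language:` and `extension:` qualifiers) and the 3-5 query checkpoint can be expressed as a small helper. This is a sketch only: `QuerySpec`, `buildQuery`, and `checkpoint` are hypothetical names, not part of the skill's script.

```typescript
// Hypothetical description of one planned query.
interface QuerySpec {
  terms: string; // literal code tokens, e.g. "useMachine"
  language?: string; // e.g. "typescript"
  extension?: string; // e.g. "tsx"
  rationale: string; // why this query is worth running (reused in the summary)
}

// Assemble the search string: code tokens first, then qualifiers.
function buildQuery(spec: QuerySpec): string {
  const parts = [spec.terms.trim()];
  if (spec.language) parts.push(`language:${spec.language}`);
  if (spec.extension) parts.push(`extension:${spec.extension}`);
  return parts.join(" ");
}

// Enforce the Step 1 checkpoint: 3-5 distinct queries before searching.
function checkpoint(specs: QuerySpec[]): string[] {
  if (specs.length < 3 || specs.length > 5) {
    throw new Error(`Expected 3-5 queries, got ${specs.length}`);
  }
  const queries = specs.map(buildQuery);
  if (new Set(queries).size !== queries.length) {
    throw new Error("Queries must be distinct");
  }
  return queries;
}
```

For the XState example, `buildQuery({ terms: "useMachine", language: "typescript", extension: "tsx", rationale: "React hook usage" })` produces the same string as Query 1 above.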
Step 2: Execute Searches Sequentially
For each query (in order):
cd plugins/knowledge-work/skills/gh-code-search
pnpm search "query text here"
After each search:
- •Note result count - Too many? Too few?
- •Check quality indicators - Stars, recency, code structure scores
- •Identify patterns - Are results converging on specific approaches?
- •Adapt next query if needed:
- •Too many results → Narrow scope (add more filters)
- •Too few results → Broaden (remove restrictive filters)
- •Found strong pattern → Search for similar variations
Track which queries succeed - Note for summary which strategies were effective
Early stopping: If the first 2-3 queries yield 30+ high-quality results, the remaining queries are optional
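The adaptation rules above can be sketched as plain functions. The numeric thresholds for "too few" and "too many" are assumptions (the source gives none), and `chooseAdjustment`, `broaden`, `narrow`, and `canStopEarly` are hypothetical helper names.

```typescript
type Adjustment = "narrow" | "broaden" | "keep";

// Thresholds are assumptions; the source only says "too many" and "too few".
function chooseAdjustment(resultCount: number, tooFew = 3, tooMany = 100): Adjustment {
  if (resultCount < tooFew) return "broaden";
  if (resultCount > tooMany) return "narrow";
  return "keep";
}

const QUALIFIER = /^(extension|language|path|filename):\S+$/;

// Broaden by dropping the last qualifier (the most restrictive one in the examples).
function broaden(query: string): string {
  const tokens = query.split(/\s+/).filter(Boolean);
  for (let i = tokens.length - 1; i >= 0; i--) {
    const token = tokens[i];
    if (token !== undefined && QUALIFIER.test(token)) {
      tokens.splice(i, 1);
      break;
    }
  }
  return tokens.join(" ");
}

// Narrow by appending a qualifier, unless it is already present.
function narrow(query: string, qualifier: string): string {
  const tokens = query.split(/\s+/).filter(Boolean);
  if (tokens.indexOf(qualifier) === -1) tokens.push(qualifier);
  return tokens.join(" ");
}

// Early stopping: at least two queries run and 30+ high-quality results so far.
function canStopEarly(highQualityCounts: number[], minQueries = 2, target = 30): boolean {
  const total = highQualityCounts.reduce((sum, n) => sum + n, 0);
  return highQualityCounts.length >= minQueries && total >= target;
}
```

Dropping qualifiers from the end works here because the example queries place `extension:` after `language:`, so the narrowest filter is removed first.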
Step 3: Aggregate Results
Combine all code files from all searches:
- •Collect files from each search output with their GitHub URLs
- •Deduplicate by `repository.full_name + file_path`
- •Keep highest-scored version if duplicates exist
- •Note which queries found the same file (indicates strong relevance)
- •Maintain diversity - Ensure multiple repositories represented (avoid 10 files from one repo)
- •Track provenance - Remember which query found each file (useful for analysis)
- •Preserve GitHub URLs - Maintain full GitHub URLs for every file to include in summary
Result: Unified list of 15-30 unique, high-quality code files with GitHub URLs
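A minimal sketch of the aggregation step follows: deduplicate on repository plus path, keep the highest score, record which queries found each file, and cap files per repository for diversity. The `FoundFile` and `AggregatedFile` shapes and the default cap of 3 are assumptions; the script's real output format may differ.

```typescript
// Hypothetical per-search result record.
interface FoundFile {
  repo: string; // repository.full_name, e.g. "owner/name"
  path: string; // file path within the repo
  url: string; // full GitHub URL, preserved for the summary
  score: number; // quality score reported for this file
  query: string; // the query that surfaced it (provenance)
}

interface AggregatedFile {
  repo: string;
  path: string;
  url: string;
  score: number;
  foundBy: string[]; // every query that returned this file
}

function aggregate(results: FoundFile[], maxPerRepo = 3): AggregatedFile[] {
  const byKey = new Map<string, AggregatedFile>();
  for (const file of results) {
    const key = `${file.repo}:${file.path}`;
    const existing = byKey.get(key);
    if (existing === undefined) {
      byKey.set(key, { repo: file.repo, path: file.path, url: file.url, score: file.score, foundBy: [file.query] });
      continue;
    }
    // Duplicate: remember the extra query and keep the highest-scored version.
    if (existing.foundBy.indexOf(file.query) === -1) existing.foundBy.push(file.query);
    if (file.score > existing.score) {
      existing.score = file.score;
      existing.url = file.url;
    }
  }

  // Best files first, then enforce the per-repository cap.
  const sorted = Array.from(byKey.values()).sort((a, b) => b.score - a.score);
  const perRepo = new Map<string, number>();
  const kept: AggregatedFile[] = [];
  for (const file of sorted) {
    const count = perRepo.get(file.repo) ?? 0;
    if (count >= maxPerRepo) continue;
    perRepo.set(file.repo, count + 1);
    kept.push(file);
  }
  return kept;
}
```

A file whose `foundBy` list has several entries was returned by more than one query, which the list above treats as a sign of strong relevance.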
Step 4: Analyze Patterns
Extract factual patterns across all code files:
Import Analysis:
- •List most common imports (group by library)
- •Identify standard library vs third-party dependencies
- •Note version-specific patterns (e.g., React 18 hooks)
Architectural Patterns:
- •Functional vs class-based implementations
- •Custom hooks vs inline logic
- •Middleware chains vs single-handler
- •Configuration patterns (env vars, config files, inline)
Code Structure:
- •Function signatures (parameters, return types)
- •Error handling approaches (try/catch, error middleware, Result types)
- •Testing patterns (unit vs integration, mocking strategies)
- •Documentation styles (JSDoc, inline comments, README)
Language Distribution:
- •TypeScript vs JavaScript split
- •Strict mode usage
- •Type definition patterns
DO NOT editorialize - Extract facts, not opinions. Analysis comes in Step 5.
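As an illustration of the import analysis, the sketch below pulls module specifiers out of JavaScript/TypeScript source with a regular expression and counts how many files use each library. It is a rough heuristic under stated assumptions (it can be fooled by imports inside comments or strings), and `extractImports`, `libraryOf`, and `countLibraries` are hypothetical names rather than functions from the script.

```typescript
// Collect module specifiers from ES `import` statements and `require()` calls.
function extractImports(source: string): string[] {
  const pattern =
    /import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]|require\(\s*['"]([^'"]+)['"]\s*\)/g;
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1] ?? match[2];
    if (specifier) found.add(specifier);
  }
  return Array.from(found);
}

// Map a specifier to its package name; relative and absolute paths are not libraries.
function libraryOf(specifier: string): string | null {
  if (specifier.startsWith(".") || specifier.startsWith("/")) return null;
  const parts = specifier.split("/");
  if (specifier.startsWith("@")) return parts.slice(0, 2).join("/"); // scoped package
  return parts[0] ?? null;
}

// Count, per library, the number of files that import it (each file counted once).
function countLibraries(sources: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const source of sources) {
    const libs = new Set<string>();
    for (const specifier of extractImports(source)) {
      const lib = libraryOf(specifier);
      if (lib !== null) libs.add(lib);
    }
    libs.forEach((lib) => counts.set(lib, (counts.get(lib) ?? 0) + 1));
  }
  return counts;
}
```

The resulting counts map directly onto the "Used in [N/total] files" lines of the summary template.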
Step 5: Generate Comprehensive Summary
Output format (exact structure):
````markdown
# GitHub Code Search: [User Query Topic]

## Search Strategy Executed

Ran [N] targeted queries:

1. `query text` - [brief rationale] → [X results]
2. `query text` - [brief rationale] → [Y results]
3. ...

**Total unique files analyzed:** [N]

---

## Pattern Analysis

### Common Imports

- `library-name` - Used in [N/total] files
- `another-lib` - Used in [M/total] files

### Architectural Styles

- **Functional Programming** - [N] files use pure functions, immutable patterns
- **Object-Oriented** - [M] files use classes, inheritance
- **Hybrid** - [K] files mix both approaches

### Implementation Patterns

- **Pattern 1 Name**: [Description with prevalence]
- **Pattern 2 Name**: [Description with prevalence]

---

## Approaches Found

### Approach 1: [Name]

**Repos:** [repo1], [repo2]

**Characteristics:**

- [Key trait 1]
- [Key trait 2]

**Example:** [repo/file_path:line_number](github_url)

```language
[relevant code snippet - NOT just first 40 lines]
```

### Approach 2: [Name]

[Similar structure]

## Trade-offs

| Approach | Pros | Cons | Best For |
|---|---|---|---|
| Approach 1 | [pros] | [cons] | [use case] |
| Approach 2 | [pros] | [cons] | [use case] |

## Recommendations

**For your use case** ([infer from user's context]):

1. **Primary recommendation:** [Approach name]
   - **Why:** [Rationale based on analysis]
   - **Implementation:** [High-level steps]
   - **References:** [repo/file_path:line_number](github_url)
2. **Alternative:** [Another approach]
   - **When to use:** [Scenarios where this is better]
   - **References:** [repo/file_path:line_number](github_url)

## Key Code Sections

### [Concept 1]

**Source:** [repo/file_path:line_range](github_url)

```language
[Specific relevant code - imports, key function, not arbitrary truncation]
```

**Why this matters:** [Brief explanation]

### [Concept 2]

**Source:** [repo/file_path:line_range](github_url)

```language
[Specific relevant code]
```

**Why this matters:** [Brief explanation]

## All GitHub Files Analyzed

**IMPORTANT:** Include GitHub URLs for ALL meaningful files found across all searches.

List every unique file analyzed (15-30 files), grouped by repository:

### [repo-owner/repo-name] ([stars]⭐)

- [file_path](github_url) - [language] - [brief description of what this file demonstrates]
- [file_path](github_url) - [language] - [brief description]

### [repo-owner/repo-name2] ([stars]⭐)

- [file_path](github_url) - [language] - [brief description]

**Format:** Direct GitHub blob URLs with line numbers where relevant (e.g., `https://github.com/owner/repo/blob/main/path/file.ts#L10-L50`)
````
**Summary characteristics:**

- **Factual** - Based on extracted data, not assumptions
- **Actionable** - Clear recommendations with reasoning
- **Contextualized** - References specific code locations with line numbers
- **Balanced** - Shows trade-offs, not just "best practice"
- **Comprehensive** - Covers patterns, approaches, trade-offs, recommendations
- **Accessible** - GitHub URLs for ALL meaningful files so users can explore full source code

---

### Step 6: Save Results to File

**After generating the comprehensive summary, persist it for future reference:**

1. **Generate timestamp:**
   - Invoke `timestamp` skill to get deterministic YYYYMMDDHHMMSS format
   - Example: `20250110143052`
2. **Sanitize query for filename:**
   - Convert user's original query to kebab-case slug
   - Rules: lowercase, spaces → hyphens, remove special chars, max 50 chars
   - Example: "React hooks useState" → "react-hooks-usestate"
3. **Construct file path:**
   - Directory: `docs/research/github/`
   - Format: `<timestamp>-<sanitized-query>.md`
   - Full path: `docs/research/github/20250110143052-react-hooks-usestate.md`
4. **Save using Write tool:**
   - Content: Full comprehensive summary from Step 5
   - Ensures persistence across sessions
   - User can reference past research
5. **Log saved location:**
   - Inform user where file was saved
   - Example: "Research saved to docs/research/github/20250110143052-react-hooks-usestate.md"

**Why save:**

- Comprehensive summaries represent significant analysis work (30-150s of API calls)
- Users may want to reference patterns/trade-offs later
- Builds searchable knowledge base of GitHub research
- Avoids re-running expensive queries for same topics

---

## Error Handling

**Common issues:**

- **Auth errors:** Prompt user to run `gh auth login --web`
- **Rate limits:** Show remaining quota and reset time. If hit during multi-search, stop gracefully with partial results
- **Network failures:** Continue with partial results, note which queries failed
- **No results for query:** Note in summary, adjust subsequent queries to be broader
- **All queries return same files:** Note low diversity, recommend broader initial queries

**Graceful degradation:** Partial results are acceptable. Complete the summary based on available data.

---

## Limitations

- GitHub API rate limit: 5,000 req/hr authenticated for core REST requests; the code search endpoint has a separate, lower per-minute limit (multi-search uses more quota)
- Each tool invocation fetches top 10 results only
- Skips files >100KB
- Sequential execution takes longer than single query (10-30s per query)
- Provides factual data, not conclusions (Claude interprets patterns)
- Deduplication assumes exact path matches (renamed files treated as unique)

---

## Typical Execution Time

**Per query:** 10-30 seconds depending on:

- Number of results (100 max)
- File sizes
- Network latency
- API rate limits

**Full workflow (3-5 queries):** 30-150 seconds

**Optimization:** If first queries yield sufficient results, skip remaining queries

---

## Integration Notes

**Example: User asks "find Claude Code skills doing github search"**

```bash
# 1. Claude analyzes: Skills use SKILL.md with frontmatter, likely in .claude/skills/

# 2. Claude generates queries:
#    - "filename:SKILL.md github search" (matches SKILL.md files with "github search" text)
#    - "octokit.rest.search language:typescript" (matches actual Octokit API usage)
#    - "gh api language:typescript path:skills" (matches gh CLI usage in skills)

# 3. Claude executes sequentially:
cd plugins/knowledge-work/skills/gh-code-search
pnpm search "filename:SKILL.md github search"
# [analyze results]
pnpm search "octokit.rest.search language:typescript"
# [analyze results]
pnpm search "gh api language:typescript path:skills"

# 4. Claude aggregates, deduplicates, analyzes patterns
# 5. Claude generates comprehensive summary with trade-offs and recommendations
```
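The Step 6 file-naming rules (timestamp prefix, kebab-case slug capped at 50 characters, `docs/research/github/` directory) can be sketched as follows. `formatTimestamp` is only a stand-in for the `timestamp` skill and assumes UTC, which the source does not specify; `slugify` and `researchPath` are likewise hypothetical names.

```typescript
// Kebab-case slug: lowercase, drop special chars, spaces → hyphens, max 50 chars.
function slugify(query: string, maxLength = 50): string {
  return query
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // remove special characters
    .replace(/[\s-]+/g, "-") // collapse whitespace and hyphen runs into one hyphen
    .slice(0, maxLength)
    .replace(/^-+|-+$/g, ""); // no leading or trailing hyphen, even after truncation
}

// Stand-in for the `timestamp` skill: YYYYMMDDHHMMSS, assuming UTC.
function formatTimestamp(date: Date): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  return (
    String(date.getUTCFullYear()) +
    pad(date.getUTCMonth() + 1) +
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds())
  );
}

// Full path in the format <timestamp>-<sanitized-query>.md.
function researchPath(query: string, date: Date): string {
  return `docs/research/github/${formatTimestamp(date)}-${slugify(query)}.md`;
}
```

With the query "React hooks useState" and a date of 2025-01-10 14:30:52 UTC, `researchPath` yields `docs/research/github/20250110143052-react-hooks-usestate.md`, matching the example in Step 6.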
Testing
Validate workflow with diverse queries:
- •Simple: "React hooks" (should generate 3+ queries, combine results)
- •Complex: "GitHub Actions TypeScript project setup" (requires config + implementation queries)
- •Ambiguous: "state management" (should generate queries for Redux, Zustand, XState, Context)
Check quality:
- •Deduplication works (no repeated files in summary)
- •Diverse repositories (not all from one repo)
- •Summary includes trade-offs and recommendations
- •Code snippets are relevant (not arbitrary truncations)