Codebase exploration
Performs thorough, structured codebase exploration to build deep understanding before coding work begins.
Scope
Covers:
- Exploring project structure, architecture, and directory layout
- Tracing execution paths from entry points through core logic
- Mapping import dependencies and identifying central components
- Enumerating public API surfaces and CLI commands
- Discovering configuration mechanisms (`pyproject.toml`, env vars, config files)
- Mapping test coverage to source modules
- Producing a structured summary of findings
Does not cover:
- Modifying code or configuration files
- Applying coding conventions (see `/python-style`)
- Writing commit messages (see `/commit`)
- Creating or modifying skill files (see `/skill-design`)
Workflow
You MUST follow these steps when this skill is invoked.
Step 1: Determine project size
Quickly assess the project to select the appropriate exploration tier. Run a file count and directory listing to make this determination before proceeding.
| Tier | Indicators | Approach |
|---|---|---|
| Small | Single package, < 10 source files, no subpackages | Single-pass exploration |
| Medium | Multiple packages or 10-50 source files | Sequential four-phase exploration |
| Large | Monorepo, more than 50 source files, multiple entry points, MCP server | Parallel subagent exploration |
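The file count behind this decision can be sketched in Python. This is a minimal sketch under stated assumptions, not a prescribed tool: `count_source_files`, `select_tier`, and the `SKIPPED_DIRS` list are hypothetical names, test files are assumed not to count as source files, and a project with exactly 50 files is assumed to fall in the medium tier. The sketch covers only the file and package counts; indicators such as a monorepo layout, multiple entry points, or an MCP server still need a manual look at the directory listing.

```python
from pathlib import Path

# Directories assumed to hold no first-party source (adjust per project).
SKIPPED_DIRS = {".git", ".venv", "venv", "node_modules", "build", "dist", "__pycache__", "tests"}


def count_source_files(root: Path, suffix: str = ".py") -> int:
    """Counts files ending in `suffix` under root, ignoring skipped directories."""
    return sum(
        1
        for path in root.rglob(f"*{suffix}")
        if path.is_file() and not SKIPPED_DIRS.intersection(path.relative_to(root).parts)
    )


def select_tier(source_files: int, packages: int = 1) -> str:
    """Maps file and package counts onto the small, medium, and large tiers."""
    if source_files > 50:
        return "large"
    if source_files >= 10 or packages > 1:
        return "medium"
    return "small"
```

For example, `select_tier(count_source_files(Path(".")))` gives a first estimate that can then be raised to a higher tier if the directory listing shows a monorepo or several entry points.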
Step 2: Execute exploration
Follow the approach for the determined tier.
Small projects: Execute all four exploration phases yourself in a single pass. Combine phases where appropriate to keep exploration concise.
Medium projects: Execute all four exploration phases sequentially, giving each phase focused attention. Use the Task tool with `subagent_type: Explore` for any phase that requires reading many files.
Large projects: Launch 2-3 Explore subagents in parallel using the Task tool with `subagent_type: Explore`. Assign each subagent a different focus area:
- Subagent 1, structure and entry points: Phase 1 (feature discovery, including configuration) and the test coverage mapping from Phase 4
- Subagent 2, architecture and dependencies: Phase 2 (code flow tracing) and Phase 3 (architecture analysis, including import mapping and central component identification)
- Subagent 3, API surface and quality: public API enumeration from Phase 1, plus the error handling patterns and technical debt indicators from Phase 4
Synthesize the subagent findings into a unified summary.
Step 3: Present findings
Present the structured summary following the output format below. Do NOT make code changes during exploration. Wait for user direction before proceeding.
Exploration phases
Every exploration follows four phases regardless of project size. For small projects, phases may be combined. For large projects, phases may be distributed across parallel subagents.
Phase 1: Feature discovery
Identify the project's entry points, boundaries, and configuration.
- Entry points: CLI commands, API endpoints, main functions, GUI entry points. Check `pyproject.toml` for `[project.scripts]` and `[project.entry-points]` sections.
- Configuration mechanisms: `pyproject.toml` settings, environment variables, CLI argument defaults, YAML/JSON/TOML config files, `.env` files, and dataclass-based configuration objects.
- Public API surface: Classes, functions, and constants exported from `__init__.py` files. Note which modules use `__all__` to restrict exports.
- CLI command signatures: For CLI-based projects, document each command's name, arguments, options, and purpose.
Phase 2: Code flow tracing
Trace execution paths from entry points through the codebase layers.
- Call chain tracing: Starting from each entry point identified in Phase 1, follow the call chain through to core logic. Document the path as a sequence of `file:function` references.
- Data transformations: At each step in the call chain, note what data is passed, transformed, or returned.
- Layer identification: Identify the abstraction layers the call chain passes through (e.g., CLI → validation → business logic → data access → output).
- Side effects: Note where the code interacts with external systems (file I/O, network calls, subprocess invocations, environment variable reads).
Phase 3: Architecture analysis
Analyze the structural relationships between components.
- Import/dependency mapping: For each source module, identify what it imports from other project modules. Note the direction of dependencies (which modules depend on which).
- Central component identification: Identify which files or classes are imported by the most other modules. These are the architecturally central components that changes will most likely ripple through.
- Design patterns: Identify patterns in use (dataclasses, protocols, factories, decorators, context managers, abstract base classes, etc.).
- Cross-cutting concerns: Note how logging, error handling, configuration, and validation are handled across the codebase.
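Import mapping and central component identification can be sketched with `ast` and a counter. The names `project_imports` and `central_modules` are hypothetical, the input is assumed to be a mapping from dotted module names to source text, and relative imports (`from . import x`) are skipped for brevity, so treat the counts as a lower bound.

```python
import ast
from collections import Counter


def project_imports(source: str, project_package: str) -> set[str]:
    """Returns the project-internal modules imported (absolutely) by one module."""
    imported: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # Not an import, or a relative import (skipped in this sketch).
        imported.update(
            name for name in names
            if name == project_package or name.startswith(project_package + ".")
        )
    return imported


def central_modules(sources: dict[str, str], project_package: str) -> list[tuple[str, int]]:
    """Ranks modules by how many other modules import them, highest first."""
    counts: Counter = Counter()
    for module, source in sources.items():
        for target in project_imports(source, project_package) - {module}:
            counts[target] += 1
    return counts.most_common()
```

Modules at the top of the ranking are the ones to highlight as central in the import dependency map.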
Phase 4: Implementation details
Examine specifics that inform future modifications.
- Test coverage mapping: For each source module, identify the corresponding test file(s). Note any source modules that lack test coverage.
- Key algorithms and data structures: Document non-trivial algorithms, important data structures (dataclasses, TypedDicts, NamedTuples), and their locations.
- Error handling patterns: How errors are raised, caught, and reported to the user. Note whether the project uses custom exception classes, logging, or a console utility.
- Technical debt and complexity: Areas with high complexity, TODO/FIXME comments, known limitations, `# type: ignore` suppressions, or `# noqa` markers.
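Test coverage mapping can be started from file names alone. The sketch below assumes the common `test_<module>.py` naming convention, which a given project may not follow; `map_tests` is a hypothetical helper. A `None` value marks a module with no matching test file, which is a candidate gap to confirm by reading the tests rather than proof of missing coverage.

```python
from pathlib import Path
from typing import Optional


def map_tests(source_dir: Path, test_dir: Path) -> dict[str, Optional[str]]:
    """Maps each source module to a test_<module>.py file if one exists, else None."""
    test_files = {path.name: path for path in test_dir.rglob("test_*.py")}
    mapping: dict[str, Optional[str]] = {}
    for module in sorted(source_dir.rglob("*.py")):
        if module.name == "__init__.py":
            continue  # Package markers rarely have dedicated tests.
        match = test_files.get(f"test_{module.name}")
        key = module.relative_to(source_dir).as_posix()
        mapping[key] = match.relative_to(test_dir).as_posix() if match else None
    return mapping
```

The result translates directly into the source module to test file table in the output format.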
Output format
Present findings using the following structure. Include all sections for medium and large projects. For small projects, omit sections that do not apply.
Required sections
- Project purpose: 1-2 sentence summary
- Entry points and CLI commands: Table of entry points with locations and descriptions
- Key components: Table of components with locations and purposes
- Call chain summary: Entry point → layer → core logic flow for primary paths
- Import dependency map: Which modules depend on which, with central components highlighted
- Public API surface: Exported classes, functions, and constants
- Configuration: All configuration mechanisms discovered
- Test coverage: Source module → test file mapping, noting gaps
- Notable patterns: Design patterns, conventions, and cross-cutting concerns
- Areas of concern: Technical debt, complexity hotspots, missing coverage
Example output
## Project purpose

Provides a reimplemented suite2p library for neural imaging analysis with multi-day cell tracking capabilities for the Sun Lab at Cornell University.

## Entry points and CLI commands

| Entry Point | Location | Description |
|---|---|---|
| `sl-suite2p` | pyproject.toml scripts | Main CLI entry point |
| `run` | cli.py:run | Executes single-day processing pipeline |
| `run-multiday` | cli.py:run_multiday | Executes multi-day tracking pipeline |
| `mcp` | cli.py:mcp | Starts the MCP server |

## Key components

| Component | Location | Purpose |
|---|---|---|
| Pipeline | src/sl_suite2p/pipeline.py | Main single-day processing pipeline orchestration |
| Multi-day Tracking | src/sl_suite2p/multiday/ | Cross-session cell tracking and alignment |
| Registration | src/sl_suite2p/registration/ | Image registration and motion correction |
| Detection | src/sl_suite2p/detection/ | Cell detection algorithms |
| Configuration | src/sl_suite2p/configuration/ | Pipeline configuration management |
| MCP Server | src/sl_suite2p/mcp/ | AI agent integration via MCP |

## Call chain summary

`sl-suite2p run` → `cli.py:run` → `pipeline.py:run_pipeline` → `registration/register.py:register` → `detection/detect.py:detect_cells` → `extraction/extract.py:extract_signals` → `classification/classify.py:classify_cells` → writes output to `suite2p/` directory.

## Import dependency map

Central components (imported by 4 or more modules):

- `configuration/pipeline_config.py`: imported by pipeline, registration, detection, extraction
- `utils/io.py`: imported by all processing modules
- `types.py`: imported by all modules for shared type definitions

Dependency direction: cli → pipeline → processing modules → utils/io + configuration.

## Public API surface

Exported from `__init__.py` via `__all__`:

- `run_pipeline(config: PipelineConfig) -> PipelineResult`
- `PipelineConfig` (dataclass)
- `PipelineResult` (dataclass)

## Configuration

| Mechanism | Location | Purpose |
|---|---|---|
| `PipelineConfig` | configuration/pipeline_config.py | Dataclass with all pipeline settings |
| CLI arguments | cli.py | Override config values at runtime |
| YAML config | User-provided path | Full pipeline configuration file |

## Test coverage

| Source Module | Test File | Coverage |
|---|---|---|
| pipeline.py | tests/test_pipeline.py | Yes |
| registration/register.py | tests/test_registration.py | Yes |
| detection/detect.py | tests/test_detection.py | Yes |
| mcp/server.py | (none) | Gap |

## Notable patterns

- Numba-accelerated computation for image processing
- Configuration dataclasses with validation
- MyPy strict mode with full type annotations
- Processing modules follow a consistent register → detect → extract → classify pipeline

## Areas of concern

- `mcp/server.py` lacks test coverage
- Large refactoring effort from original suite2p codebase still in progress
- Multi-day tracking module has high cyclomatic complexity
- Several `# type: ignore` suppressions in registration module
Related skills
| Skill | Relationship |
|---|---|
| `/python-style` | Provides Python coding conventions when exploration reveals Python code |
| `/cpp-style` | Provides C++ coding conventions when exploration reveals C++ code |
| `/csharp-style` | Provides C# coding conventions when exploration reveals C# code |
| `/readme-style` | Provides README conventions when exploration reveals README issues |
| `/commit` | Should be invoked after completing code changes informed by the exploration |
| `/skill-design` | Provides skill conventions when exploration reveals skill files |
Proactive behavior
Invoke at session start to ensure full context before making changes. Prevents blind modifications and ensures understanding of existing patterns.
Do NOT make code changes during exploration. Present findings and wait for user direction.
Verification checklist
You MUST verify the exploration output against this checklist before presenting it to the user.
Exploration Output Compliance:
- [ ] Project purpose summarized (1-2 sentences)
- [ ] Entry points identified with locations (pyproject.toml scripts, CLI commands)
- [ ] Key components identified with locations and purposes
- [ ] Call chains traced from entry points through core logic (file:function references)
- [ ] Import dependencies mapped with central components highlighted
- [ ] Public API surface enumerated (exported classes, functions, constants)
- [ ] Configuration mechanisms documented (pyproject.toml, env vars, config files, dataclasses)
- [ ] Test files mapped to source modules with coverage gaps noted
- [ ] Design patterns and cross-cutting concerns documented
- [ ] Areas of concern noted (technical debt, complexity, missing coverage)
- [ ] Output uses structured format (headings, tables, lists)
- [ ] No code modifications were made during exploration
- [ ] Exploration depth matches project size tier (small/medium/large)