Codebase Search
When to use this skill
- •Finding specific functions or classes
- •Tracing function calls and dependencies
- •Understanding code structure and architecture
- •Finding usage examples
- •Identifying code patterns
- •Locating bugs or issues
- •Code archaeology (understanding legacy code)
- •Impact analysis before changes
Instructions
Step 1: Understand what you're looking for
Feature implementation:
- •Where is feature X implemented?
- •How does feature Y work?
- •What files are involved in feature Z?
Bug location:
- •Where is this error coming from?
- •What code handles this case?
- •Where is this data being modified?
API usage:
- •How is this API used?
- •Where is this function called?
- •What are examples of using this?
Configuration:
- •Where are settings defined?
- •How is this configured?
- •What are the config options?
Step 2: Choose search strategy
Semantic search (for conceptual questions):
code
Use when: You understand what you're looking for conceptually Examples: - "How do we handle user authentication?" - "Where is email validation implemented?" - "How do we connect to the database?" Benefits: - Finds relevant code by meaning - Works with unfamiliar codebases - Good for exploratory searches
Grep (for exact text/patterns):
code
Use when: You know exact text or patterns Examples: - Function names: "def authenticate" - Class names: "class UserManager" - Error messages: "Invalid credentials" - Specific strings: "API_KEY" Benefits: - Fast and precise - Works with regex patterns - Good for known terms
Glob (for file discovery):
code
Use when: You need to find files by pattern Examples: - "**/*.test.js" (all test files) - "**/config*.yaml" (config files) - "src/**/*Controller.py" (controllers) Benefits: - Quickly find files by type - Discover file structure - Locate related files
Step 3: Search workflow
1. Start broad, then narrow:
code
Step 1: Semantic search "How does authentication work?" Result: Points to auth/ directory Step 2: Grep in auth/ for specific function Pattern: "def verify_token" Result: Found in auth/jwt.py Step 3: Read the file File: auth/jwt.py Result: Understand implementation
2. Use directory targeting:
code
# Start without target (search everywhere) Query: "Where is user login implemented?" Target: [] # Refine with specific directory Query: "Where is login validated?" Target: ["backend/auth/"]
3. Combine searches:
code
# Find where feature is implemented Semantic: "user registration flow" # Find all files involved Grep: "def register_user" # Find test files Glob: "**/*register*test*.py" # Understand the implementation Read: registration.py, test_registration.py
Step 4: Common search patterns
Find function definition:
bash
# Python grep -n "def function_name" --type py # JavaScript grep -n "function functionName" --type js grep -n "const functionName = " --type js # TypeScript grep -n "function functionName" --type ts grep -n "export const functionName" --type ts # Go grep -n "func functionName" --type go # Java grep -n "public.*functionName" --type java
Find class definition:
bash
# Python grep -n "class ClassName" --type py # JavaScript/TypeScript grep -n "class ClassName" --type js,ts # Java grep -n "public class ClassName" --type java # C++ grep -n "class ClassName" --type cpp
Find class/function usage:
bash
# Python
grep -n "ClassName(" --type py
grep -n "function_name(" --type py
# JavaScript
grep -n "new ClassName" --type js
grep -n "functionName(" --type js
Find imports/requires:
bash
# Python grep -n "from.*import.*ModuleName" --type py grep -n "import.*ModuleName" --type py # JavaScript grep -n "import.*from.*module-name" --type js grep -n "require.*module-name" --type js # Go grep -n "import.*package-name" --type go
Find configuration:
bash
# Config files
glob "**/*config*.{json,yaml,yml,toml,ini}"
# Environment variables
grep -n "process\\.env\\." --type js
grep -n "os\\.environ" --type py
# Constants
grep -n "^[A-Z_]+\\s*=" --type py
grep -n "const [A-Z_]+" --type js
Find TODO/FIXME:
bash
grep -n "TODO|FIXME|HACK|XXX" -i
Find error handling:
bash
# Python grep -n "try:|except|raise" --type py # JavaScript grep -n "try|catch|throw" --type js # Go grep -n "if err != nil" --type go
Step 5: Advanced techniques
Trace data flow:
code
1. Find where data is created Semantic: "Where is user object created?" 2. Search for variable usage Grep: "user\\." with context lines 3. Follow transformations Read: Files that modify user 4. Find where it's consumed Grep: "user\\." in relevant files
Find all callsites of a function:
code
1. Find function definition
Grep: "def process_payment"
Result: payments/processor.py:45
2. Find all imports of that module
Grep: "from payments.processor import"
Result: Multiple files
3. Find all calls to the function
Grep: "process_payment\\("
Result: All callsites
4. Read each callsite for context
Read: Each file with context
Understand a feature end-to-end:
code
1. Find API endpoint Semantic: "Where is user registration endpoint?" Result: routes/auth.py 2. Trace to controller Read: routes/auth.py Find: Calls to AuthController.register 3. Trace to service Read: controllers/auth.py Find: Calls to UserService.create_user 4. Trace to database Read: services/user.py Find: Database operations 5. Find tests Glob: "**/*auth*test*.py" Read: Test files for examples
Find related files:
code
1. Start with known file Example: models/user.py 2. Find imports of this file Grep: "from models.user import" 3. Find files this imports Read: models/user.py Note: Import statements 4. Build dependency graph Map: All related files
Impact analysis:
code
Before changing function X:
1. Find all callsites
Grep: "function_name\\("
2. Find all tests
Grep: "test.*function_name" -i
3. Check related functionality
Semantic: "What depends on X?"
4. Review each usage
Read: Each file using function
5. Plan changes
Document: Impact and required updates
Step 6: Search optimization
Use appropriate context:
bash
# See surrounding context grep -n "pattern" -C 5 # 5 lines before and after grep -n "pattern" -B 3 # 3 lines before grep -n "pattern" -A 3 # 3 lines after
Case sensitivity:
bash
# Case insensitive grep -n "pattern" -i # Case sensitive (default) grep -n "Pattern"
File type filtering:
bash
# Specific type grep -n "pattern" --type py # Multiple types grep -n "pattern" --type py,js,ts # Exclude types grep -n "pattern" --glob "!*.test.js"
Regex patterns:
bash
# Any character: . grep -n "function.*Name" # Start of line: ^ grep -n "^class" # End of line: $ grep -n "TODO$" # Optional: ? grep -n "function_name_?()" # One or more: + grep -n "[A-Z_]+" # Zero or more: * grep -n "import.*" # Alternatives: | grep -n "TODO|FIXME" # Groups: () grep -n "(get|set)_user" # Escape special chars: \ grep -n "function\(\)"
Best practices
- •Start with semantic search: For unfamiliar code or conceptual questions
- •Use grep for precision: When you know exact terms
- •Combine multiple searches: Build understanding incrementally
- •Read surrounding context: Don't just look at matching lines
- •Check file history: Use
git blamefor context - •Document findings: Note important discoveries
- •Verify assumptions: Read actual code, don't assume
- •Use directory targeting: Narrow scope when possible
- •Follow the data: Trace data flow through the system
- •Check tests: Tests often show usage examples
Common search scenarios
Scenario 1: Understanding a bug
code
1. Find error message Grep: "exact error message" 2. Find where it's thrown Read: File with error 3. Find what triggers it Semantic: "What causes X error?" 4. Find related code Grep: Related function names 5. Check tests Glob: "**/*test*.py" Look: For related test cases
Scenario 2: Learning a new codebase
code
1. Find entry point Semantic: "Where does the application start?" Common files: main.py, index.js, app.py 2. Find main routes/endpoints Grep: "route|endpoint|@app\\." 3. Find data models Semantic: "Where are data models defined?" Common: models/, entities/ 4. Find configuration Glob: "**/*config*" 5. Read README and docs Read: README.md, docs/
Scenario 3: Refactoring preparation
code
1. Find all usages Grep: "function_to_change" 2. Find tests Grep: "test.*function_to_change" 3. Find dependencies Semantic: "What does X depend on?" 4. Check imports Grep: "from.*import.*X" 5. Document scope List: All affected files
Scenario 4: Adding a feature
code
1. Find similar features Semantic: "How is similar feature implemented?" 2. Find where to add code Semantic: "Where should new feature go?" 3. Check patterns Read: Similar implementations 4. Find tests to emulate Glob: Test files for similar features 5. Check documentation Grep: "TODO.*new feature" -i
Tools integration
Git integration:
bash
# Who changed this line? git blame filename # History of a file git log -p filename # Find when function was added git log -S "function_name" --source --all # Find commits mentioning X git log --grep="feature name"
IDE integration:
- •Use "Go to Definition" for quick navigation
- •Use "Find References" for usage
- •Use "Find in Files" for broad search
- •Use symbol search for classes/functions
Documentation:
- •Check inline comments
- •Look for docstrings
- •Read README files
- •Check architecture docs
Troubleshooting
No results found:
- •Check spelling and case sensitivity
- •Try semantic search instead of grep
- •Broaden search scope (remove directory target)
- •Try different search terms
- •Check if files are in .gitignore
Too many results:
- •Add directory targeting
- •Use more specific patterns
- •Filter by file type
- •Use exact phrases (quotes)
Wrong results:
- •Be more specific in query
- •Use grep instead of semantic for exact terms
- •Add context to semantic queries
- •Check file types