Sensitive Content Scanner
Core Purpose
Examine files for sensitive content that should be sanitized before sharing publicly. This is a safety check to run before publishing repos, sharing code snippets, exporting configurations, or otherwise sharing files publicly.
When to Use
- Before pushing a repo public
- Before sharing code/config snippets
- Before exporting skills, dotfiles, or configs for others
- When reviewing what might leak from a file set
- As part of an automated sanitization workflow
Invocation
/sensitive-content-scanner [path] - Scan a file or directory
/sensitive-content-scanner - Prompts for a path when none is given
Examples:
- /sensitive-content-scanner ~/.claude/skills/
- /sensitive-content-scanner ./my-project/
- /sensitive-content-scanner ./config.md
Personal Context File (Optional)
For better detection of YOUR specific sensitive content, create ~/.claude/sensitive-content-context.md:
    # Sensitive Content Context

    ## Personal Identifiers
    - Full name: [Your Name]
    - Usernames: [github-handle, twitter-handle, etc.]
    - Email patterns: [yourname@, your.name@]
    - Company/team name: [Company Name]

    ## Private Paths (patterns to flag)
    - ~/Library/Mobile Documents/com~apple~CloudDocs/
    - /Users/[username]/
    - Any path containing: [folder names that are private]

    ## Private URLs (domains/patterns to flag)
    - notion.site (personal workspaces)
    - [internal-tool].company.com
    - Private GitHub org: github.com/[private-org]/

    ## Known Secrets Patterns
    - Project-specific API key prefixes
    - Internal service names

    ## Business/Proprietary Terms
    - Client names: [list]
    - Internal project codenames: [list]
    - Confidential terms: [list]
If this file exists, the scanner will use it for personalized detection. If not, it uses generic patterns.
Detection Categories
1. Credentials & Secrets (CRITICAL)
Always flag - these should never be shared:
| Pattern | Examples |
|---|---|
| API keys | sk-..., AKIA..., ghp_..., xoxb-... |
| Private keys | -----BEGIN RSA PRIVATE KEY----- |
| Passwords | password=, passwd:, pwd= |
| Tokens | token=, bearer ..., auth_token |
| Connection strings | postgres://user:pass@, mongodb+srv:// |
| AWS credentials | aws_access_key_id, aws_secret_access_key |
| Environment secrets | .env files with real values |
Severity: CRITICAL - Block sharing until resolved
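As a rough illustration of how the credential patterns in this table might be expressed, here is a minimal Python sketch. The pattern names, token lengths, and character classes are assumptions that approximate commonly seen formats; they are not the scanner's actual rules, and `find_credentials` is a hypothetical helper.

```python
import re

# Hypothetical pattern set; lengths and character classes are approximations.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "slack_bot_token": re.compile(r"\bxoxb-[A-Za-z0-9-]{10,}"),
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[=:]\s*\S+"),
    # scheme://user:pass@host style connection strings
    "url_with_credentials": re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@/]+@"),
}


def find_credentials(text):
    """Return (line_number, pattern_name) pairs for every pattern hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in CREDENTIAL_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

Patterns this loose will produce false positives (for example, `password=` in documentation), which is acceptable for a CRITICAL category where a human confirms each hit.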
2. Personal Identifiers (HIGH)
Flag for review - may need anonymization:
| Pattern | Examples |
|---|---|
| Email addresses | user@domain.com |
| Phone numbers | +1-555-..., (555) 123-4567 |
| Names (from context file) | Your name, family names |
| Usernames (from context file) | Social handles, login names |
| Physical addresses | Street addresses, ZIP codes |
Severity: HIGH - Review and anonymize or confirm OK to share
3. Private URLs (HIGH)
Flag - often reveal private resources:
| Pattern | Examples |
|---|---|
| Notion URLs | *.notion.site/*, notion.so/*/... |
| Google Docs (private) | docs.google.com/document/d/... |
| Internal tools | *.internal.*, *.corp.*, localhost:* |
| Private repos | github.com/[private-org]/... |
| Figma/design files | Private design URLs |
| Calendar/meeting links | Zoom personal rooms, Calendly |
Severity: HIGH - Replace with placeholder or remove
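One way to apply host patterns like those above is to extract URLs, parse out the hostname, and compare it against a list of globs. The sketch below assumes a hypothetical `PRIVATE_HOST_GLOBS` list and `find_private_urls` helper; the real scanner may match URLs differently.

```python
import re
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical glob list derived from the table; extend it from the context file.
PRIVATE_HOST_GLOBS = [
    "*.notion.site", "notion.so", "www.notion.so",
    "docs.google.com", "*.internal.*", "*.corp.*", "localhost",
]
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def find_private_urls(text):
    """Return URLs whose hostname matches any private-host glob."""
    flagged = []
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:  # malformed URL, e.g. a broken IPv6 literal
            continue
        if any(fnmatchcase(host, glob) for glob in PRIVATE_HOST_GLOBS):
            flagged.append(url)
    return flagged
```

Matching on the parsed hostname rather than the raw string avoids flagging a public URL that merely mentions `localhost` in its path or query.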
4. Local Paths (MEDIUM)
Flag - reveal system structure and username:
| Pattern | Examples |
|---|---|
| Home directory | /Users/username/, /home/username/ |
| iCloud paths | ~/Library/Mobile Documents/com~apple~CloudDocs/ |
| Dropbox/OneDrive | ~/Dropbox/, ~/OneDrive/ |
| App-specific paths | ~/Library/Application Support/[App]/ |
| Windows user paths | C:\Users\username\ |
Severity: MEDIUM - Replace with generic placeholder like ~/path/to/... or [YOUR_PATH]
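A minimal sketch of the placeholder replacement for home-directory paths follows. Replacing only the username segment with `[username]` is one possible choice, made here as an assumption; `redact_home_paths` is a hypothetical helper, not part of the skill.

```python
import re

# Matches the home-directory prefix (group 1) and the username segment (group 2).
HOME_PATH_RE = re.compile(r"(/Users/|/home/|[A-Za-z]:\\Users\\)([^/\\\s]+)")


def redact_home_paths(text, placeholder="[username]"):
    """Replace the username segment of macOS, Linux, and Windows home paths."""
    # A function replacement avoids backslash-escape handling in the placeholder.
    return HOME_PATH_RE.sub(lambda m: m.group(1) + placeholder, text)
```

For example, `redact_home_paths("/Users/alice/notes.md")` yields `/Users/[username]/notes.md`, while paths already written with `~` are left unchanged.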
5. Infrastructure & Security (MEDIUM-HIGH)
Flag - could enable attacks or reveal architecture:
| Pattern | Examples |
|---|---|
| Internal IPs | 10.x.x.x, 192.168.x.x, 172.16-31.x.x |
| Internal hostnames | *.internal, *.local, *.corp |
| Database hosts | Specific DB server addresses |
| Cloud resource IDs | AWS account IDs, GCP project IDs |
| CI/CD specifics | Internal Jenkins/GitHub Actions URLs |
Severity: MEDIUM-HIGH depending on context
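The internal IP ranges in the table correspond to the private IPv4 blocks 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. A possible way to check candidates, using Python's standard `ipaddress` module, is sketched below; `find_internal_ips` is a hypothetical helper name.

```python
import ipaddress
import re

IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
INTERNAL_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def find_internal_ips(text):
    """Return dotted-quad strings that fall inside a private IPv4 range."""
    found = []
    for candidate in IPV4_RE.findall(text):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:  # not a valid address, e.g. 999.1.1.1
            continue
        if any(addr in net for net in INTERNAL_NETS):
            found.append(candidate)
    return found
```

Validating with `ipaddress` rather than regex alone handles the 172.16-31 boundary correctly and discards strings that only look like addresses.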
6. Business & Proprietary (MEDIUM)
Flag if context file specifies - varies by situation:
| Pattern | Examples |
|---|---|
| Client names | (from context file) |
| Internal project names | Codenames, internal product names |
| Pricing/financial | Revenue figures, pricing tiers |
| Strategy content | Competitive analysis, roadmaps |
| Internal comms | Slack channel names, team names |
Severity: MEDIUM - Requires judgment
Scan Protocol
Phase 1: Load Context
- Check for ~/.claude/sensitive-content-context.md
- If exists: load personal patterns and terms
- If not: use generic detection only, note limitations
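The context-loading phase could be sketched roughly as follows, assuming the context file follows the template layout (`## ` headings with `- ` bullet items beneath them). The function names `parse_context` and `load_context` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def parse_context(text: str) -> dict:
    """Group '- ' bullet items under their nearest '## ' heading."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


def load_context(path: Optional[Path] = None) -> Optional[dict]:
    """Return parsed sections, or None when the file is absent (generic mode)."""
    if path is None:
        path = Path.home() / ".claude" / "sensitive-content-context.md"
    if not path.is_file():
        return None
    return parse_context(path.read_text(encoding="utf-8"))
```

Returning `None` rather than an empty dict lets the caller distinguish "no context file" (note the limitation in the report) from "context file with no entries".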
Phase 2: File Discovery
- If path is a file: scan that file
- If path is a directory:
  - Find all text-based files (md, txt, json, yaml, yml, toml, js, ts, py, sh, etc.)
  - Respect .gitignore if present
  - Skip binary files, node_modules, .git, etc.
- Report file count and types found
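The discovery steps above might look something like this in Python. The extension allowlist and skip list mirror the examples given, but both sets are assumptions, and this sketch does not implement .gitignore handling, which would need a dedicated matcher or a call out to git.

```python
import os
from collections import Counter
from pathlib import Path

# Assumed allowlist and skip list, taken from the examples in the protocol.
TEXT_EXTENSIONS = {".md", ".txt", ".json", ".yaml", ".yml", ".toml",
                   ".js", ".ts", ".py", ".sh"}
SKIP_DIRS = {".git", "node_modules"}


def discover_files(root):
    """Return a sorted list of text-like files under root (or root itself)."""
    root = Path(root)
    if root.is_file():
        return [root]
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() in TEXT_EXTENSIONS:
                found.append(path)
    return sorted(found)


def count_by_type(paths):
    """Count files per extension for the 'file count and types' report line."""
    return Counter(p.suffix.lower() for p in paths)
```

Filtering by extension is a cheap stand-in for "skip binary files"; a stricter version could also sniff the first bytes of each file.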
Phase 3: Pattern Scanning
For each file, scan for all detection categories. Use:
- Regex patterns for structured data (emails, IPs, API keys)
- Context file terms for personal/business content
- Heuristics for paths and URLs
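To show how regex patterns and context-file terms can feed the same list of findings, here is a small sketch. The `Finding` record, the email regex, and the category and severity assigned to context terms are all illustrative assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    line: int
    category: str
    severity: str
    match: str


EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_text(text, context_terms=()):
    """Scan line by line with one regex pattern plus case-insensitive term lookup."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in EMAIL_RE.finditer(line):
            findings.append(Finding(lineno, "Personal Identifiers", "HIGH", m.group()))
        lowered = line.lower()
        for term in context_terms:
            if term.lower() in lowered:
                findings.append(Finding(lineno, "Business & Proprietary", "MEDIUM", term))
    return findings
```

Recording the line number with each finding is what allows the report to point at the exact location in a file.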
Phase 4: Findings Report
Generate structured report:
    # Sensitive Content Scan Report

    **Path scanned:** [path]
    **Files scanned:** [count]
    **Scan date:** [timestamp]

    ## Summary

    | Severity | Count | Action Required         |
    | -------- | ----- | ----------------------- |
    | CRITICAL | X     | Must fix before sharing |
    | HIGH     | X     | Review and sanitize     |
    | MEDIUM   | X     | Consider sanitizing     |
    | LOW      | X     | Informational           |

    ## CRITICAL Findings

    ### [Category]: [Brief description]

    **File:** `path/to/file.md`
    **Line:** [line number]
    **Content:** `[snippet with sensitive part highlighted]`
    **Risk:** [Why this is sensitive]
    **Suggested fix:** [How to sanitize]

    ---

    [Repeat for each finding, grouped by severity]

    ## Recommendations

    1. [Prioritized action items]
    2. [...]

    ## Files Cleared

    These files contained no detected sensitive content:

    - [list of clean files]
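The summary table in this template can be produced by counting findings per severity. The sketch below assumes findings are reduced to a list of severity strings; `summary_table` is a hypothetical helper.

```python
from collections import Counter

SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
ACTIONS = {
    "CRITICAL": "Must fix before sharing",
    "HIGH": "Review and sanitize",
    "MEDIUM": "Consider sanitizing",
    "LOW": "Informational",
}


def summary_table(severities):
    """Render the report's summary table from a list of severity labels."""
    counts = Counter(severities)
    rows = ["| Severity | Count | Action Required |", "| --- | --- | --- |"]
    for level in SEVERITIES:
        # Counter returns 0 for levels with no findings.
        rows.append(f"| {level} | {counts[level]} | {ACTIONS[level]} |")
    return "\n".join(rows)
```

Iterating over a fixed severity order keeps empty levels in the table, so a clean scan still shows explicit zero counts.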
Output Modes
Default: Full Report
Complete findings with context and suggestions.
Summary Mode
/sensitive-content-scanner [path] --summary
Just the summary table and critical findings.
JSON Mode (for automation)
/sensitive-content-scanner [path] --json
Machine-readable output for piping to other tools.
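The JSON schema is not specified here, so the following is only one plausible shape, sketched under that assumption: a summary of counts per severity plus the list of findings. The field names and `findings_to_json` are hypothetical.

```python
import json
from collections import Counter


def findings_to_json(path_scanned, findings):
    """Serialize findings (dicts with at least a 'severity' key) to JSON."""
    counts = Counter(f["severity"] for f in findings)
    payload = {
        "path_scanned": path_scanned,
        "summary": {
            level: counts.get(level, 0)
            for level in ("CRITICAL", "HIGH", "MEDIUM", "LOW")
        },
        "findings": findings,
    }
    return json.dumps(payload, indent=2)
```

A consumer could then gate a publish step on `summary["CRITICAL"] == 0` without parsing the human-readable report.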
Integration with Other Skills
This skill is designed to be called by other skills/workflows:
Example - called by sync-skills-public:
1. sync-skills-public invokes sensitive-content-scanner on ~/.claude/skills/
2. Scanner returns findings
3. sync-skills-public uses findings to guide sanitization
4. Scanner re-runs on output to verify clean
When called programmatically, return structured findings that the calling skill can act on.
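The scan, sanitize, re-scan cycle described in the example can be pictured as a small loop. This is a sketch only: `scan` and `sanitize` stand in for whatever the calling skill provides, and the `max_rounds` cap is an assumption added to guarantee termination.

```python
def sanitize_until_clean(text, scan, sanitize, max_rounds=3):
    """Alternate scanning and sanitizing; return (text, is_clean)."""
    for _ in range(max_rounds):
        findings = scan(text)
        if not findings:
            return text, True
        text = sanitize(text, findings)
    # Final verification pass after the last sanitization round.
    return text, not scan(text)
```

Returning the clean flag alongside the text lets the caller refuse to publish when findings survive every round.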
Limitations
- Cannot detect everything: Novel patterns or context-dependent sensitivity may be missed
- False positives: Some patterns (like example UUIDs) may flag incorrectly
- Requires context for personal content: Without the context file, personal names/terms won't be detected
- No semantic understanding: Can't detect if content is "confidential" without explicit markers
Always do a final human review before sharing anything publicly.
Quick Reference
| I want to... | Command |
|---|---|
| Scan a directory | /sensitive-content-scanner ./path/ |
| Scan a single file | /sensitive-content-scanner ./file.md |
| Quick summary only | /sensitive-content-scanner ./path/ --summary |
| Check my skills before export | /sensitive-content-scanner ~/.claude/skills/ |
| Set up personal detection | Create ~/.claude/sensitive-content-context.md |