Security Scrubbing Knowledge

LLM Provider API Keys

Key Patterns by Provider

Provider	Pattern	Environment Variable
OpenAI	`sk-...`, `sk-proj-...`, `sk-svcacct-...`	`OPENAI_API_KEY`
Anthropic	`sk-ant-...`	`ANTHROPIC_API_KEY`
Google AI	`AIza[0-9A-Za-z_-]{35}`	`GOOGLE_API_KEY`, `GEMINI_API_KEY`
Cohere	40-char alphanumeric	`COHERE_API_KEY`
Mistral	32-char alphanumeric	`MISTRAL_API_KEY`
Groq	`gsk_...`	`GROQ_API_KEY`
Perplexity	`pplx-...`	`PERPLEXITY_API_KEY`
Hugging Face	`hf_...`	`HF_TOKEN`, `HUGGINGFACE_API_KEY`
Replicate	`r8_...`	`REPLICATE_API_TOKEN`
Together AI	64-char hex	`TOGETHER_API_KEY`
xAI (Grok)	varies	`XAI_API_KEY`
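
The prefixes in the table can be used to guess which provider a leaked token belongs to. The following is a minimal sketch, not a definitive detector: `PROVIDER_PREFIXES` and `guess_provider` are hypothetical names, and only the prefixes listed above are used. Keys without a distinctive prefix (Cohere, Mistral, Together AI, xAI) cannot be classified this way.

```python
from typing import Optional

# Prefixes copied from the table above. Order matters: the more specific
# "sk-ant-" must be checked before the generic "sk-" used by OpenAI.
PROVIDER_PREFIXES = [
    ("sk-ant-", "Anthropic"),
    ("sk-proj-", "OpenAI"),
    ("sk-svcacct-", "OpenAI"),
    ("sk-", "OpenAI"),
    ("AIza", "Google AI"),
    ("gsk_", "Groq"),
    ("pplx-", "Perplexity"),
    ("hf_", "Hugging Face"),
    ("r8_", "Replicate"),
]


def guess_provider(token: str) -> Optional[str]:
    """Return the provider whose prefix matches `token`, or None if no prefix matches."""
    for prefix, provider in PROVIDER_PREFIXES:
        if token.startswith(prefix):
            return provider
    return None
```

Checking `sk-ant-` before `sk-` is the one design choice that matters here; without it every Anthropic key would be reported as an OpenAI key.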

Secure API Key Usage

Never do this:

python

# BAD - hardcoded key ends up in version control and logs
from openai import OpenAI
client = OpenAI(api_key="sk-abc123...")

Do this instead:

python

# GOOD - environment variable (raises KeyError if the variable is unset)
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

Or use dotenv:

python

from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()  # Loads variables from a .env file into the environment
client = OpenAI()  # Auto-reads OPENAI_API_KEY
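
When several keys are required, it can help to fail early with a clear message instead of a bare `KeyError`. A minimal sketch, assuming a hypothetical helper named `require_env`; it reports only the variable name and never echoes the value.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Name the missing variable, but never print any part of a secret.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

Usage would look like `api_key = require_env("OPENAI_API_KEY")` at program start-up.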

Other Common Secret Patterns

Service	Pattern	Example
AWS Access Key	`AKIA[0-9A-Z]{16}`	`AKIAIOSFODNN7EXAMPLE`
AWS Secret Key	40-char base64	`wJalrXUtnFEMI/K7MDENG...`
GitHub PAT (classic)	`ghp_[a-zA-Z0-9]{36}`	`ghp_xxxx...`
GitHub PAT (fine-grained)	`github_pat_[0-9a-zA-Z_]{82}`	`github_pat_...`
GitLab Token	`glpat-[0-9a-zA-Z-]{20}`	`glpat-xxxx...`
Stripe Live	`sk_live_[a-zA-Z0-9]{24}`	`sk_live_...`
Stripe Test	`sk_test_[a-zA-Z0-9]{24}`	`sk_test_...`
Slack Token	`xox[baprs]-...`	`xoxb-...`
Slack Webhook	`https://hooks.slack.com/...`	webhook URL
Twilio	`SK[0-9a-fA-F]{32}`	`SK...`
SendGrid	`SG\.[a-zA-Z0-9_-]{22}\.[a-zA-Z0-9_-]{43}`	`SG....`
Discord Bot	`[MN][A-Za-z0-9]{23,}...`	Bot token
Firebase	`AAAA[A-Za-z0-9_-]{7}:...`	FCM key
Supabase	`sbp_[a-f0-9]{40}`	`sbp_...`
JWT	`eyJ...` (3 base64 segments)	`eyJhbGc...`
Private Key	`-----BEGIN...PRIVATE KEY-----`	PEM format
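
The fully specified patterns in the table can be compiled into a small line scanner. This is a sketch under stated assumptions, not the scrub plugin's actual implementation: `SECRET_PATTERNS` and `scan_text` are hypothetical names, rows with elided patterns are left out, and the private-key regex is one assumed way to fill the `...` in the PEM header.

```python
import re

# Only rows whose pattern is fully specified in the table are included.
SECRET_PATTERNS = {
    "AWS Access Key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub (ghp_)": re.compile(r"ghp_[a-zA-Z0-9]{36}"),
    "GitHub (github_pat_)": re.compile(r"github_pat_[0-9a-zA-Z_]{82}"),
    "GitLab Token": re.compile(r"glpat-[0-9a-zA-Z-]{20}"),
    "Stripe Live": re.compile(r"sk_live_[a-zA-Z0-9]{24}"),
    "Stripe Test": re.compile(r"sk_test_[a-zA-Z0-9]{24}"),
    "Twilio": re.compile(r"SK[0-9a-fA-F]{32}"),
    "Supabase": re.compile(r"sbp_[a-f0-9]{40}"),
    # Assumption: any uppercase words may sit between BEGIN and PRIVATE KEY.
    "Private Key": re.compile(r"-----BEGIN[A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) pairs; the secret itself is never returned."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Returning only the line number and pattern name keeps the scanner's own output safe to log or share.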

Credential Guard

The scrub plugin includes a PreToolUse hook that blocks Claude from reading files containing credentials.

How It Works

•Hook intercepts Read operations
•Scans target file for 40+ credential patterns
•Blocks read if secrets detected
•Allows if file is in allowlist
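
The decision flow above can be sketched as a single function. This is an illustration only, not the plugin's actual hook code: `guard_decision` and both callables are hypothetical, and checking the allowlist before scanning is an assumed ordering that yields the same outcome as the steps listed.

```python
from typing import Callable


def guard_decision(
    path: str,
    content: str,
    is_allowlisted: Callable[[str], bool],
    contains_secret: Callable[[str], bool],
) -> str:
    """Return "allow" or "block" for a Read of `path` with the given `content`."""
    if is_allowlisted(path):
        return "allow"  # trusted file: no scan needed
    if contains_secret(content):
        return "block"  # secret found in a file that is not allowlisted
    return "allow"
```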

Allowlist

Add trusted files to ~/.claude/plugins/scrub/.guard-allowlist:

code

# Allow test fixtures
tests/fixtures/
*.test.js
.env.example
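
How the plugin interprets these entries is not specified here. The sketch below shows one plausible reading, with assumed semantics labeled in the code: lines starting with `#` are comments, entries ending in `/` are directory prefixes, and everything else is a glob matched against the file name or the whole path. `parse_allowlist` and `is_allowlisted` are hypothetical names.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def parse_allowlist(text: str) -> list[str]:
    """Keep non-empty lines that are not comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def is_allowlisted(path: str, entries: list[str]) -> bool:
    """Assumed semantics: 'dir/' is a path prefix, anything else is a glob."""
    posix = PurePosixPath(path)
    for entry in entries:
        if entry.endswith("/"):
            if str(posix).startswith(entry):
                return True
        elif fnmatch(posix.name, entry) or fnmatch(str(posix), entry):
            return True
    return False
```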

Commands

bash

/scrub:guard status      # Check if enabled
/scrub:guard on          # Enable guard
/scrub:guard off         # Disable guard
/scrub:guard allowlist   # Manage allowlist

Where Secrets Hide

•Configuration files: .env, config.js, settings.py
•Test files: Hardcoded test credentials
•Git history: Removed from HEAD but still in commits
•Logs: Accidentally logged tokens
•Comments: "Temporary" credentials
•CI/CD files: .github/workflows/, Jenkinsfile
•Notebooks: Jupyter notebooks with inline keys
•Docker files: Dockerfile, docker-compose.yml

PII Categories

Identifiable Data

•Direct identifiers: Name, SSN, passport number
•Contact info: Email, phone, address
•Network info: IP address, MAC address
•Financial: Credit card, bank account
•Biometric: Fingerprints, face data

Log Sanitization

Before sharing logs:

•Remove IP addresses
•Redact email addresses
•Mask user IDs
•Strip session tokens
•Remove query parameters with PII
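
The checklist above can be approximated with a few regular expressions. This is a rough sketch, assuming IPv4 addresses only and a hypothetical set of sensitive parameter names (`session`, `token`, `user_id`, `email`); real logs will need patterns tuned to the application.

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# Hypothetical parameter names; adjust to whatever your application logs.
SENSITIVE_PARAM = re.compile(r"(?i)\b(session|token|user_id|email)=([^&\s]+)")


def sanitize_log_line(line: str) -> str:
    """Replace common PII and session data with fixed placeholders."""
    # Redact key=value pairs first so their values are not matched again below.
    line = SENSITIVE_PARAM.sub(lambda m: f"{m.group(1)}=[REDACTED]", line)
    line = EMAIL.sub("[EMAIL]", line)
    line = IPV4.sub("[IP]", line)
    return line
```

The IPv4 pattern is deliberately loose and will also mask dotted version strings such as `1.2.3.4`, which is usually an acceptable trade-off when sharing logs.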

Git History Security

When to Scrub History

•Accidentally committed credentials
•Pushed .env file
•Included private keys
•Added database dumps with PII

Tools

•git-filter-repo (recommended): Fast, safe, feature-rich
•git filter-branch: Built-in but slow, and discouraged by Git's own documentation in favor of git-filter-repo
•BFG Repo-Cleaner: Java-based, good for large repos
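
As an illustration of the recommended tool, the sketch below builds a `git filter-repo` command that removes given paths from all history, using its `--invert-paths` and `--path` options. The helper names are hypothetical, and the rewrite is destructive, so it is assumed to run on a fresh clone.

```python
import subprocess


def build_filter_repo_cmd(paths: list[str]) -> list[str]:
    """Build a `git filter-repo` command that drops `paths` from every commit."""
    cmd = ["git", "filter-repo", "--invert-paths"]
    for path in paths:
        cmd += ["--path", path]
    return cmd


def scrub_paths(repo_dir: str, paths: list[str]) -> None:
    """Run the rewrite inside `repo_dir`. Destructive: use a fresh clone."""
    subprocess.run(build_filter_repo_cmd(paths), cwd=repo_dir, check=True)
```

Separating command construction from execution makes it possible to review or log the exact command before history is rewritten.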

Post-Scrub Checklist

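•Rotate ALL exposed credentials first; rewriting history does not un-leak a secret
•Force-push all rewritten branches
•Delete and recreate any tags that pointed at rewritten commits
•Notify all collaborators to re-clone rather than pull
•Clear CI/CD caches
•GitHub: Contact support to purge cached views and dangling commits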

File Permissions

Secure Defaults

File Type	Recommended	Why
Private keys	600	Owner only
.env files	600	Contains secrets
Scripts	755	Read and execute for all, write for owner
Config files	644	Read for all, write for owner
Directories	755	List and traverse for all, write for owner

Dangerous Permissions

•World-writable (o+w): Anyone can modify
•World-readable keys: Credentials exposed
•SetUID on scripts: Privilege escalation risk
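
These checks can be automated with the standard `os` and `stat` modules. A minimal sketch for POSIX systems, assuming a hypothetical helper `permission_problems`; the `is_secret` flag marks files such as private keys and .env files that should be owner-only.

```python
import os
import stat


def permission_problems(path: str, is_secret: bool = False) -> list[str]:
    """List risky permission bits set on `path` (POSIX only)."""
    mode = os.stat(path).st_mode
    problems = []
    if mode & stat.S_IWOTH:
        problems.append("world-writable")
    # Secrets should be owner-only (e.g. 600): no group or other bits at all.
    if is_secret and mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("secret accessible by group or others")
    if mode & stat.S_ISUID:
        problems.append("setuid bit set")
    return problems
```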

.gitignore Security

Must-Have Entries

gitignore

# Environment (keep the value-free template tracked)
.env
.env.*
!.env.example

# Keys and certificates
*.pem
*.key
*.p12
*.pfx
*.keystore
id_rsa
id_ed25519

# Credentials
credentials.json
secrets.json
*secret*
*credential*

# Logs
*.log
logs/

Best Practices

•Never commit secrets - Use environment variables
•Use credential guard - Block secrets from reaching AI
•Pre-commit hooks - Scan before commit
•Least privilege - Minimal file permissions
•Rotate regularly - Even if not leaked
•Audit trails - Know who accessed what
•Defense in depth - Multiple protection layers
•.env.example - Document required vars without values
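
The pre-commit practice can be sketched as a script that scans only the lines being added in the staged diff. This is an illustration under stated assumptions: the function names are hypothetical, and only two fully specified patterns from this document are included.

```python
import re
import subprocess
import sys

# Two fully specified patterns from this document; extend as needed.
PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"ghp_[a-zA-Z0-9]{36}"),
]


def added_lines_with_secrets(diff_text: str) -> list[str]:
    """Return added lines ('+' prefix) of a unified diff that match a pattern."""
    hits = []
    for line in diff_text.splitlines():
        # Skip the "+++ b/file" header; keep real additions.
        if line.startswith("+") and not line.startswith("+++"):
            if any(p.search(line) for p in PATTERNS):
                hits.append(line[1:])
    return hits


def main() -> int:
    """Return 1 (block the commit) if any staged addition looks like a secret."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = added_lines_with_secrets(diff)
    if hits:
        print(f"Blocked: {len(hits)} staged line(s) look like secrets.", file=sys.stderr)
        return 1
    return 0
```

To use it as a hook, the script would be saved as an executable `.git/hooks/pre-commit` file that ends with `sys.exit(main())`; a non-zero exit status makes Git abort the commit.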