LLM Security Attacks
Scope
Use this skill when working on:
- •Prompt injection attacks and defenses
- •LLM jailbreaking techniques
- •Training data extraction
- •Model output manipulation
- •AI safety bypasses
Common LLM Vulnerabilities (Cheat Sheet)
Prompt Injection
- •Direct injection (user prompt manipulation)
- •Indirect injection (via external data sources)
- •System prompt extraction
- •Role-play attacks
- •Encoding/obfuscation bypasses
Jailbreaking
- •DAN (Do Anything Now) prompts
- •Character roleplay escapes
- •Multi-turn manipulation
- •Token smuggling
- •Crescendo attacks
Data Extraction
- •Training data memorization extraction
- •PII leakage from context
- •System prompt disclosure
- •API key/secret extraction
- •Model architecture probing
Model Manipulation
- •Output steering
- •Hallucination exploitation
- •Bias amplification
- •Harmful content generation
OWASP LLM Top 10 Reference
- •LLM01 - Prompt Injection
- •LLM02 - Insecure Output Handling
- •LLM03 - Training Data Poisoning
- •LLM04 - Model Denial of Service
- •LLM05 - Supply Chain Vulnerabilities
- •LLM06 - Sensitive Information Disclosure
- •LLM07 - Insecure Plugin Design
- •LLM08 - Excessive Agency
- •LLM09 - Overreliance
- •LLM10 - Model Theft
Where to Add Links in README
- •Prompt injection tools/research:
AI Security & Attacks → Prompt Injection - •Jailbreak techniques:
AI Security & Attacks → Model Security - •Data extraction research:
AI Security & Attacks → Privacy & Extraction - •Defense tools:
AI Security & Attacks → Model Security - •CTFs/challenges:
AI Security Starter Pack → CTFs / Practice
Notes
Keep additions:
- •AI/LLM security focused
- •Non-duplicated URLs
- •Minimal structural changes
Data Source
For detailed and up-to-date resources, fetch the complete list from:
code
https://raw.githubusercontent.com/gmh5225/awesome-ai-security/refs/heads/main/README.md
Use this URL to get the latest curated links when you need specific tools, papers, or resources not covered in this skill.