AgentSkillsCN

red-teaming

适用于规划或开展对抗性红队演练,以检验组织在传统渗透测试之外的检测、响应与韧性能力。涵盖红队与渗透测试的区别、对手模拟框架(MITRE ATT&CK、Cyber Kill Chain)、紫队协作、C2 框架,以及 AI/LLM 红队测试。 适用场景:红队测试、对手模拟、MITRE ATT&CK、Cyber Kill Chain、紫队协作、C2 框架、假设已遭入侵、攻击路径映射、检测测试、AI 红队测试、LLM 红队测试、桌面演练。 不适用场景:自动化漏洞扫描(应使用安全测试相关技能)、初始漏洞发现(应使用渗透测试相关技能)、设计阶段的威胁建模(应使用威胁建模相关技能)。

SKILL.md
--- frontmatter
name: red-teaming
description: |
    Use when planning or conducting adversarial red team engagements that test an organization's detection, response, and resilience capabilities beyond traditional penetration testing. Covers red team vs pen test distinctions, adversary simulation frameworks (MITRE ATT&CK, Cyber Kill Chain), purple teaming, C2 frameworks, and AI/LLM red teaming.
    USE FOR: red teaming, adversary simulation, MITRE ATT&CK, Cyber Kill Chain, purple teaming, C2 frameworks, assumed breach, attack path mapping, detection testing, AI red teaming, LLM red teaming, tabletop exercises
    DO NOT USE FOR: automated vulnerability scanning (use security-testing), initial vulnerability discovery (use penetration-testing), threat modeling during design (use threat-modeling)
license: MIT
metadata:
  displayName: "Red Teaming"
  author: "Tyler-R-Kendrick"
compatibility: claude, copilot, cursor

Red Teaming

Overview

Red teaming is an adversarial assessment that goes beyond penetration testing to evaluate an organization's detection, response, and resilience capabilities against realistic attack scenarios. While pen testing asks "can we get in?", red teaming asks "can you detect us, stop us, and recover?"

Red teams operate with stealth, simulate real-world threat actors (APTs, insider threats, ransomware groups), and test the full kill chain — from initial access through lateral movement to objective completion. The goal is not just to find vulnerabilities, but to expose gaps in people, processes, and technology.

Important: Red team engagements require explicit written authorization from organizational leadership. Only a small group (the "white team" or "trusted agents") should know the engagement is occurring. Unauthorized adversarial activity is illegal.

Key References

TitleAuthor(s)Focus
Red Team Development and OperationsJoe Vest & James TubbervillePlanning and executing red team engagements
The Hacker Playbook 3Peter KimRed team field manual with practical TTPs
Adversarial Tradecraft in CybersecurityDan BorgesOffensive and defensive techniques for red/blue teams
MITRE ATT&CK FrameworkMITRE CorporationComprehensive knowledge base of adversary tactics and techniques
Cyber Kill ChainLockheed MartinSeven-stage model of cyberattack progression

Red Team vs Pen Test vs Vulnerability Assessment

DimensionVulnerability AssessmentPenetration TestRed Team
GoalFind known vulnerabilitiesExploit vulnerabilities to prove impactTest detection and response against realistic attacks
ScopeBroad, automated scanningDefined targets and systemsObjective-based (e.g., "exfiltrate customer database")
StealthNoneMinimalHigh — evade detection, mimic real threat actors
DurationHours to daysDays to weeksWeeks to months
Blue team awarenessYesUsually yesNo (except white team)
Tests people/processNoLimitedYes — primary focus
MethodologyScan and reportOWASP/PTES/OSSTMMMITRE ATT&CK, Cyber Kill Chain
OutputVulnerability list with severitiesExploit proof with business impactDetection gaps, response failures, attack narratives

Adversary Simulation Frameworks

MITRE ATT&CK

The industry-standard knowledge base of adversary tactics, techniques, and procedures (TTPs) based on real-world observations.

TacticDescriptionExample Techniques
ReconnaissanceGathering information about the targetActive scanning, search open websites, gather victim identity info
Resource DevelopmentEstablishing infrastructure for the attackAcquire infrastructure, develop capabilities, establish accounts
Initial AccessGetting into the target environmentPhishing, exploit public-facing application, supply chain compromise
ExecutionRunning adversary-controlled codeCommand/scripting interpreter, exploitation for client execution
PersistenceMaintaining access across restartsBoot/logon autostart, scheduled tasks, account manipulation
Privilege EscalationGaining higher-level permissionsExploitation for privilege escalation, access token manipulation
Defense EvasionAvoiding detectionObfuscated files, indicator removal, masquerading
Credential AccessStealing credentialsOS credential dumping, brute force, credentials from password stores
DiscoveryUnderstanding the environmentNetwork service scanning, system information discovery
Lateral MovementMoving through the environmentRemote services, exploitation of remote services, lateral tool transfer
CollectionGathering data of interestData from local system, email collection, screen capture
Command and ControlCommunicating with compromised systemsApplication layer protocol, encrypted channel, proxy
ExfiltrationStealing dataExfiltration over C2 channel, exfiltration over web service
ImpactDisrupting, destroying, or manipulating systemsData encrypted for impact (ransomware), defacement, data destruction

Cyber Kill Chain (Lockheed Martin)

A linear model of cyberattack progression:

code
1. Reconnaissance ──► 2. Weaponization ──► 3. Delivery ──► 4. Exploitation
         │                                                         │
         │              7. Actions on ◄── 6. Command ◄── 5. Installation
         │                 Objectives       & Control
         └──────────────────────────────────────────────────────────┘
                          Defender can break the chain at any stage
StageAttacker ActivityDefender Response
ReconnaissanceResearch targets, scan for vulnerabilitiesMonitor for scanning, minimize public exposure
WeaponizationCreate exploit payload, craft phishing lureThreat intelligence, signature updates
DeliverySend phishing email, exploit web appEmail filtering, web application firewall, user training
ExploitationExecute exploit, trigger vulnerabilityPatching, endpoint detection, application hardening
InstallationInstall backdoor, establish persistenceEDR, application allowlisting, integrity monitoring
Command & ControlEstablish C2 channel, beaconNetwork monitoring, DNS filtering, egress controls
Actions on ObjectivesExfiltrate data, deploy ransomware, pivotDLP, segmentation, incident response, backups

Purple Teaming

Purple teaming is the collaborative integration of red team (attack) and blue team (defense) to maximize learning and improve detection coverage.

AspectTraditional Red TeamPurple Team
CollaborationAdversarial — red hides from blueCooperative — red and blue work together
GoalFind gaps in detection and responseImprove detection and response in real-time
ProcessRed attacks, writes report, blue reads laterRed executes technique, blue tunes detection, iterate
Speed of improvementSlow (report → fix cycle)Fast (immediate feedback loop)
Best forMature security programsBuilding and validating detection capabilities

Purple Team Exercise Flow

  1. Select ATT&CK technique to test (e.g., T1059 — Command and Scripting Interpreter)
  2. Red team executes the technique in a controlled environment
  3. Blue team observes — did alerts fire? Were logs generated? Was the activity detected?
  4. Gap analysis — identify what was missed and why
  5. Detection engineering — blue team writes or tunes detection rules
  6. Re-execute — red team runs the technique again to verify detection
  7. Document — record the detection coverage and any remaining gaps
  8. Repeat for the next technique

C2 Frameworks

Command and Control (C2) frameworks provide the infrastructure for maintaining access to compromised systems during engagements. These tools must only be used in authorized engagements.

FrameworkTypeKey FeaturesBest For
Cobalt StrikeCommercialBeacon payload, Malleable C2 profiles, team server, industry standardProfessional red teams, APT simulation
SliverOpen source (BishopFox)Cross-platform implants, mutual TLS, DNS/HTTP/WireGuard C2Modern open-source alternative to Cobalt Strike
HavocOpen sourceDemon agent, sleep obfuscation, token manipulationAdvanced evasion techniques
MythicOpen sourceMulti-agent support, containerized, extensibleFlexible multi-platform C2
Brute RatelCommercialBadger payload, EDR evasion focused, syscall-level evasionEDR bypass testing
CalderaOpen source (MITRE)ATT&CK-aligned automated adversary emulationAutomated purple team exercises

Attack Path Mapping

ToolPurposeType
BloodHoundActive Directory attack path visualizationOpen source
PlumHoundBloodHound report automationOpen source
PingCastleAD security assessment and risk scoringFree (community)
ADExplorerAD snapshot and analysisFree (Sysinternals)

AI / LLM Red Teaming

AI systems require specialized red teaming to test for vulnerabilities unique to language models and autonomous agents.

ConcernTest ApproachTools
Prompt injectionCraft adversarial inputs to override system instructionsGarak, Promptfoo
Data extractionAttempt to extract training data or PII from model responsesManual probing, ARTKIT
System prompt leakageTry to reveal hidden system prompts or instructionsManual, Promptfoo
JailbreakingBypass safety guardrails to produce harmful contentGarak, DeepTeam
Excessive agencyTest whether agent takes unauthorized actionsScenario-based testing, sandboxed execution
Hallucination exploitationTrick model into generating convincing false informationDomain-specific fact-checking probes
Multi-turn manipulationGradually shift model behavior across conversation turnsARTKIT (multi-turn framework)

AI Red Teaming Tools

ToolMaintainerFocus
GarakOWASP100+ attack modules for LLM vulnerability scanning
ARTKITMulti-turn adversarial interaction framework
PromptfooPrompt testing, comparison, and red teaming
DeepTeamComprehensive LLM red team framework
CounterfitMicrosoftAdversarial ML attack framework (classical + generative AI)
Caldera + MITRE ATLASMITREATT&CK-aligned adversarial ML tactics

Tabletop Exercises

Tabletop exercises simulate cyber incidents in a discussion-based format without touching live systems. They test incident response plans, communication, and decision-making.

Common Scenarios

ScenarioTestsParticipants
Ransomware attackContainment, communication, ransom decision, recoveryExecutive, IT, Legal, Comms
Data breachBreach notification, forensics, regulatory responseSecurity, Legal, Compliance, Comms
Insider threatDetection, investigation, HR/legal coordinationSecurity, HR, Legal, IT
Supply chain compromiseVendor assessment, containment, customer notificationSecurity, Engineering, Legal, Comms
Business email compromiseFraud detection, financial controls, employee trainingFinance, Security, Executive

Running a Tabletop

  1. Define scenario and objectives — what are you testing?
  2. Identify participants — include leadership, not just technical staff
  3. Prepare injects — time-sequenced events that escalate the scenario
  4. Facilitate discussion — walk through decisions at each stage
  5. Document gaps — record where plans broke down, who was unclear on their role
  6. Produce action items — concrete improvements with owners and deadlines
  7. Schedule follow-up — run the exercise again after improvements are implemented

Best Practices

  • Distinguish red teaming from pen testing — pen tests find vulnerabilities; red teams test your ability to detect and respond to real attacks.
  • Start with purple teaming if your security program is still maturing — get collaborative value before going adversarial.
  • Map engagements to MITRE ATT&CK to measure detection coverage systematically against known adversary techniques.
  • Use assumed breach as a starting point for red team engagements — skip initial access and focus on lateral movement and detection gaps.
  • Conduct tabletop exercises quarterly to test incident response without the cost and risk of live red team engagements.
  • Rotate between internal and external red teams — external teams bring fresh TTPs; internal teams bring organizational knowledge.
  • Debrief collaboratively — red team findings should lead to blue team improvements, not blame.
  • Invest in detection engineering based on red team results — every missed detection is an opportunity to build a new rule.
  • Test AI systems with specialized AI red teaming — traditional pen testing tools do not cover prompt injection, jailbreaking, or excessive agency.
  • Treat red team reports as highly confidential — they contain detailed attack playbooks specific to your organization.