Attack Mutator - v1.0.0
Test Case Mutation and Variation Generator inspired by ZeroLeaks Mutator agent.
Creates variations of test cases, prompts, and inputs using multiple transformation techniques to achieve better coverage and bypass detection.
Core Concept
When a test case or approach partially succeeds, generate variations that:
- •Preserve the core intent
- •Vary the surface form
- •Bypass different filters/checks
- •Maximize coverage
Usage
bash
/mutate "test input or prompt" /mutate --types semantic,encoding "test case" /mutate --best-of-n 10 "critical test"
Mutation Types
Semantic Mutations
| Type | Description | Example |
|---|---|---|
paraphrase | Rephrase with same meaning | "check if valid" -> "verify validity" |
synonym_swap | Replace with synonyms | "analyze" -> "examine" |
formality_shift | Change formality level | "please check" -> "verify" |
perspective_shift | Change viewpoint | "you should" -> "one should" |
question_to_command | Convert question | "can you?" -> "do this" |
command_to_question | Convert command | "do this" -> "can you?" |
Encoding Mutations
| Type | Description | Output |
|---|---|---|
base64_wrap | Base64 encode | Decode: dGVzdA== |
rot13_wrap | ROT13 cipher | [ROT13] grfg |
unicode_homoglyph | Cyrillic substitution | tеst (Cyrillic 'е') |
zero_width_inject | Invisible characters | test (ZWSP) |
leetspeak | Letter/number swap | t3st |
Visual Mutations
| Type | Description | Output |
|---|---|---|
ascii_art_embed | ASCII art keywords | ╔══╗ TEST ╚══╝ |
case_variation | Random case | tEsT |
word_split | Add spaces | t e s t |
reverse_embed | Reverse text | (read: tset) |
Mutation Algorithm
python
def generate_mutations(original, count=5, types=None):
"""
Generate variations of input.
Args:
original: Original input/test case
count: Number of variations
types: Specific mutation types (optional)
Returns:
MutatorOutput with scored mutations
"""
mutations = []
# Select mutation types
types_to_use = types or select_mutation_types(original)
# Generate programmatic mutations
programmatic = generate_programmatic_mutations(original, types_to_use)
mutations.extend(programmatic)
# Generate semantic mutations (via LLM)
if count > len(programmatic):
semantic = generate_semantic_mutations(original, count - len(programmatic))
mutations.extend(semantic)
# Score and rank
scored = score_mutations(mutations, original)
best = scored[0].prompt if scored else original
return {
"original": original,
"mutations": scored,
"best_mutation": best,
"reasoning": f"Generated {len(mutations)} mutations"
}
Encoding Utilities
typescript
const encodingUtils = {
// Base64 encoding
toBase64: (text: string): string => {
return Buffer.from(text).toString("base64");
},
// ROT13 cipher
toRot13: (text: string): string => {
return text.replace(/[a-zA-Z]/g, char => {
const base = char <= "Z" ? 65 : 97;
return String.fromCharCode(
((char.charCodeAt(0) - base + 13) % 26) + base
);
});
},
// Leetspeak conversion
toLeetspeak: (text: string): string => {
const leetMap = {
a: ["4", "@"], e: ["3"], i: ["1", "!"],
o: ["0"], s: ["5", "$"], t: ["7", "+"]
};
return text.split("").map(char => {
const variants = leetMap[char.toLowerCase()];
return variants ? variants[Math.floor(Math.random() * variants.length)] : char;
}).join("");
},
// Unicode homoglyphs
toUnicodeHomoglyphs: (text: string): string => {
const homoglyphs = {
a: "а", c: "с", e: "е", o: "о", // Cyrillic
p: "р", x: "х", y: "у"
};
return text.split("").map(char =>
Math.random() > 0.5 && homoglyphs[char] ? homoglyphs[char] : char
).join("");
},
// Zero-width character injection
insertZeroWidth: (text: string): string => {
const zeroWidth = ["\u200B", "\u200C", "\u200D", "\uFEFF"];
return text.split("").map(char =>
Math.random() > 0.7
? char + zeroWidth[Math.floor(Math.random() * zeroWidth.length)]
: char
).join("");
}
};
Scoring Algorithm
python
def score_mutations(mutations, original):
"""
Score mutations by effectiveness potential.
Factors:
- Semantic preservation (must keep meaning)
- Surface novelty (should be different)
- Length appropriateness (not too long/short)
"""
scored = []
for mutation in mutations:
score = mutation.expected_effectiveness
# Calculate similarity to original
similarity = jaccard_similarity(mutation.prompt, original)
# Bonus for novelty
score += (1 - similarity) * 0.2
# Penalty for too different (lost meaning)
if similarity < 0.3:
score *= 0.8
# Length reasonableness
length_ratio = len(mutation.prompt) / len(original)
if 0.5 < length_ratio < 3:
score += 0.1
scored.append({
**mutation,
"final_score": min(1, max(0, score))
})
return sorted(scored, key=lambda x: x["final_score"], reverse=True)
Best-of-N Generation
Generate multiple variations and select the best:
python
def best_of_n(base_prompt, n=10):
"""
Generate N variations, return the best.
This is the Best-of-N jailbreaking technique adapted
for test case variation.
"""
result = generate_mutations(base_prompt, n)
return {
"variations": [m["prompt"] for m in result["mutations"]],
"scores": [m["final_score"] for m in result["mutations"]],
"best": result["best_mutation"]
}
Semantic Mutation via LLM
yaml
semantic_mutation_prompt: |
Create variations of this input that preserve the core intent
but vary in:
- Phrasing and vocabulary
- Tone and formality
- Sentence structure
- Framing (question vs command)
- Perspective (first/second/third person)
Original: "{original}"
Generate {count} natural-sounding variations.
Rate each variation's expected effectiveness (0-1).
Integration with Ralph Loop
Mutation is used during validation and testing:
yaml
Step 6: EXECUTE-WITH-SYNC
└── 6b. IMPLEMENT
└── Generate test cases
└── MUTATE test cases for coverage
Step 7: VALIDATE
└── 7c. ADVERSARIAL-CODE
└── Use MUTATE to vary attack vectors
Invocation
yaml
Task:
subagent_type: "attack-mutator"
model: "sonnet"
prompt: |
ORIGINAL: "Test authentication with invalid token"
COUNT: 10
TYPES: ["semantic", "encoding", "visual"]
BEST_OF_N: true
Generate mutations for broader test coverage.
Output Format
json
{
"original": "Test authentication with invalid token",
"mutations": [
{
"prompt": "Verify auth fails with malformed JWT",
"mutation_type": "paraphrase",
"final_score": 0.85
},
{
"prompt": "VGVzdCBhdXRoIHdpdGggaW52YWxpZA==",
"mutation_type": "base64_wrap",
"final_score": 0.72
},
{
"prompt": "Tеst аuthеnticаtion with invаlid tokеn",
"mutation_type": "unicode_homoglyph",
"final_score": 0.68
}
],
"best_mutation": "Verify auth fails with malformed JWT",
"reasoning": "Paraphrase preserves intent with fresh approach"
}
Effectiveness Estimates
Default effectiveness ratings by mutation type:
| Type | Effectiveness | Use Case |
|---|---|---|
paraphrase | 0.60 | General purpose |
unicode_homoglyph | 0.65 | Bypass text filters |
ascii_art_embed | 0.70 | Visual bypass |
zero_width_inject | 0.60 | Break string matching |
base64_wrap | 0.55 | Encoding bypass |
rot13_wrap | 0.50 | Simple obfuscation |
synonym_swap | 0.50 | Subtle variation |
leetspeak | 0.45 | Character variation |
case_variation | 0.30 | Minimal change |
CLI Commands
bash
# Basic mutation ralph mutate "Test input validation" # Specify types ralph mutate --types semantic,encoding "Test case" # Best-of-N ralph mutate --best-of-n 15 "Critical security test" # Output to file ralph mutate "Input" --output mutations.json # Batch mutation ralph mutate --batch test-cases.txt --output mutations/
Use Cases
1. Test Coverage Expansion
bash
# Original test "User can login with valid credentials" # Mutations for broader coverage "Authentication succeeds with correct password" "Valid user credentials grant access" "Login endpoint accepts legitimate auth"
2. Edge Case Discovery
bash
# Original input "email@domain.com" # Encoding mutations to test validation "ZW1haWxAZG9tYWluLmNvbQ==" # Base64 "еmail@dоmain.cоm" # Unicode homoglyphs "email@domain.com" # Zero-width spaces
3. Bypass Testing
bash
# Original blocked input "<script>alert(1)</script>" # Mutations to test filter "<script>alert(1)</script>" # Zero-width "<scrіpt>alert(1)</scrіpt>" # Cyrillic 'і' "PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==" # Base64
Best Practices
- •Preserve Intent: Mutations must keep the original meaning
- •Vary Surface: Same meaning, different form
- •Score Wisely: Not all mutations are equal
- •Test Coverage: Use mutations to expand test cases
- •Combine Types: Mix semantic and encoding for best results
Attribution
Mutation patterns adapted from ZeroLeaks Mutator agent architecture (FSL-1.1-Apache-2.0).