
aidefence

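AI manipulation defense system with prompt injection detection, adversarial input scanning, PII detection, and threat analysis. Use it when protecting AI applications from prompt injection attacks, scanning inputs for manipulation, identifying PII in prompts, implementing security middleware for LLM pipelines, or hardening agent systems against adversarial inputs.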

SKILL.md
--- frontmatter
name: "aidefence"
description: "AI Manipulation Defense System with prompt injection detection, adversarial input scanning, PII detection, and threat analysis. Use when protecting AI applications from prompt injection attacks, scanning inputs for manipulation, detecting PII in prompts, implementing security middleware for LLM pipelines, or hardening agent systems against adversarial inputs."

AIDefence

Production-ready AI security middleware for protecting applications from prompt injection, manipulation attacks, adversarial inputs, PII exposure, and other threats targeting LLM-based systems.

Quick Reference

| Task | Code |
|------|------|
| Install | `npx aidefence@latest` |
| Import | `import { AIDefence, Scanner } from 'aidefence';` |
| Create | `const defence = new AIDefence({ level: 'strict' });` |
| Scan | `const result = await defence.scan(input);` |
| Check PII | `const piiResult = await defence.hasPII(text);` |
| Middleware | `app.use(defence.middleware());` |

Installation

Hub install (recommended): `npx neural-trader@latest` includes this package.
Standalone: `npx aidefence@latest`
See the Installation Guide for the full ecosystem.

Key API

AIDefence

The main defense system class.

typescript
import { AIDefence } from 'aidefence';

const defence = new AIDefence({
  level: 'strict',
  enablePII: true,
  enableInjection: true,
});

Constructor Options:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `level` | `string` | `'standard'` | Security level: `'relaxed'`, `'standard'`, `'strict'`, `'paranoid'` |
| `enablePII` | `boolean` | `true` | Enable PII detection |
| `enableInjection` | `boolean` | `true` | Enable injection detection |
| `enableAdversarial` | `boolean` | `true` | Enable adversarial input detection |
| `enableEncoding` | `boolean` | `true` | Detect encoding-based attacks |
| `customRules` | `Rule[]` | `[]` | Custom detection rules |
| `allowList` | `string[]` | `[]` | Allowed patterns (bypass checks) |
| `blockList` | `string[]` | `[]` | Always-blocked patterns |
| `maxInputLength` | `number` | `100000` | Maximum input length |
| `logThreats` | `boolean` | `true` | Log detected threats |
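
The option table can be mirrored as a local TypeScript type so that a misspelled level or option name fails at compile time. The sketch below is an assumption-laden illustration: `DefenceOptions`, `Level`, and `optionsFor` are hypothetical names defined here, not exports of the package, and `customRules` is left out because the shape of `Rule` is not documented in this file. The per-environment values are an example policy only.

```typescript
// Hypothetical local types mirroring the documented constructor options.
// They are NOT imported from 'aidefence'; the package's own types may differ.
type Level = 'relaxed' | 'standard' | 'strict' | 'paranoid';

interface DefenceOptions {
  level?: Level;
  enablePII?: boolean;
  enableInjection?: boolean;
  enableAdversarial?: boolean;
  enableEncoding?: boolean;
  allowList?: string[];
  blockList?: string[];
  maxInputLength?: number;
  logThreats?: boolean;
}

// Pick options per deployment environment (illustrative policy, not a recommendation).
function optionsFor(env: 'development' | 'production'): DefenceOptions {
  if (env === 'production') {
    // Stricter level and a tighter input cap than the documented default of 100000.
    return { level: 'strict', maxInputLength: 20000, logThreats: true };
  }
  // Development: documented default level, threat logging switched off.
  return { level: 'standard', logThreats: false };
}
```

If the constructor accepts this shape, the result could be passed straight through, for example `new AIDefence(optionsFor('production'))`.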

Methods:

| Method | Returns | Description |
|--------|---------|-------------|
| `scan(input)` | `Promise<ScanResult>` | Comprehensive input scan |
| `isSafe(input)` | `Promise<boolean>` | Quick safety check |
| `hasPII(text)` | `Promise<PIIResult>` | Detect PII in text |
| `sanitize(input)` | `Promise<string>` | Remove threats from input |
| `analyze(input)` | `Promise<ThreatAnalysis>` | Detailed threat analysis |
| `middleware()` | `RequestHandler` | Express/Connect middleware |
| `addRule(rule)` | `void` | Add custom detection rule |
| `getStats()` | `DefenceStats` | Get scan statistics |

Scanner

Lower-level scanner for specific threat types.

typescript
import { Scanner } from 'aidefence';

const scanner = new Scanner({
  detectors: ['injection', 'jailbreak', 'encoding'],
});

const threats = await scanner.detect(input);

Constructor Options:

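| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `detectors` | `string[]` | all | Active detectors |
| `threshold` | `number` | `0.7` | Detection confidence threshold |
| `maxLength` | `number` | `100000` | Max input length |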
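
To make the `threshold` option concrete, here is a minimal sketch of how a confidence threshold typically filters detector output. The `Detection` shape and `applyThreshold` helper are hypothetical and defined locally; the structure of what `scanner.detect()` returns is not documented here, and whether the library compares with `>=` or `>` is an assumption.

```typescript
// Hypothetical shape of a single detection; not taken from the package.
interface Detection {
  type: string;       // detector name, e.g. 'injection'
  confidence: number; // score in the range [0, 1]
}

// Keep only detections whose confidence meets the threshold.
// 0.7 matches the documented default; the inclusive comparison is an assumption.
function applyThreshold(detections: Detection[], threshold = 0.7): Detection[] {
  return detections.filter((d) => d.confidence >= threshold);
}

const raw: Detection[] = [
  { type: 'injection', confidence: 0.92 },
  { type: 'jailbreak', confidence: 0.55 },
  { type: 'encoding', confidence: 0.7 },
];

// With the default threshold, the 0.55 jailbreak candidate is dropped.
const reported = applyThreshold(raw);
```

Lowering the threshold surfaces more candidates at the cost of more false positives; raising it does the opposite.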

Available Detectors:

| Detector | Description |
|----------|-------------|
| `'injection'` | Prompt injection attempts |
| `'jailbreak'` | Jailbreak/system prompt extraction |
| `'encoding'` | Base64/hex/unicode evasion |
| `'pii'` | Personally identifiable information |
| `'toxicity'` | Toxic/harmful content |
| `'adversarial'` | Adversarial perturbations |
| `'exfiltration'` | Data exfiltration attempts |
| `'delimiter'` | Delimiter injection attacks |
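
As an illustration of what "encoding evasion" means for the `'encoding'` detector, the sketch below decodes hex runs found in the input and re-checks the decoded text against a phrase list. This is not the package's implementation: `SUSPICIOUS`, `decodeHex`, and `containsHiddenPhrase` are hypothetical names, the phrase list is a single example, and only hex is handled (Base64 and Unicode tricks are out of scope here).

```typescript
// Illustrative phrase list; a real detector would use far richer signals.
const SUSPICIOUS = ['ignore previous instructions'];

// Decode a string of hex digit pairs into text, one character per pair.
function decodeHex(hex: string): string {
  let out = '';
  for (let i = 0; i + 1 < hex.length; i += 2) {
    out += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
  }
  return out;
}

// True if the raw input, or any run of 16+ hex digits inside it once decoded,
// contains a suspicious phrase (case-insensitive).
function containsHiddenPhrase(input: string): boolean {
  const candidates = [input];
  const runs = input.match(/(?:[0-9a-fA-F]{2}){8,}/g) || [];
  for (const run of runs) {
    candidates.push(decodeHex(run));
  }
  return candidates.some((text) =>
    SUSPICIOUS.some((phrase) => text.toLowerCase().indexOf(phrase) !== -1),
  );
}
```

The point of the sketch is the two-pass structure: scanning only the raw string would miss a payload that becomes readable after decoding.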

PIIDetector

Specialized PII detection.

typescript
import { PIIDetector } from 'aidefence';

const detector = new PIIDetector({
  types: ['email', 'phone', 'ssn', 'credit-card'],
  action: 'redact',
});

const result = await detector.scan(text);

PII Types:

| Type | Description |
|------|-------------|
| `'email'` | Email addresses |
| `'phone'` | Phone numbers |
| `'ssn'` | Social Security Numbers |
| `'credit-card'` | Credit card numbers |
| `'address'` | Physical addresses |
| `'name'` | Person names |
| `'ip'` | IP addresses |
| `'api-key'` | API keys and tokens |
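
To show what an `action: 'redact'` step might do conceptually, here is a small regex-based sketch covering two of the types above. It is an assumption-heavy illustration, not how `PIIDetector` works internally: `PATTERNS`, `RedactionResult`, and `redact` are hypothetical names, and the patterns are deliberately simple (the SSN pattern, for instance, only matches the dashed `NNN-NN-NNNN` form).

```typescript
// Illustrative patterns only; production PII detection needs more robust rules.
const PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
};

interface RedactionResult {
  text: string;                   // input with matches replaced by placeholders
  counts: Record<string, number>; // number of matches per PII type
}

// Replace every match of the requested types with a [TYPE] placeholder.
function redact(input: string, types: string[] = ['email', 'ssn']): RedactionResult {
  const counts: Record<string, number> = {};
  let text = input;
  for (const type of types) {
    const pattern = PATTERNS[type];
    if (!pattern) continue; // type not covered by this sketch: skip it
    counts[type] = 0;
    text = text.replace(pattern, () => {
      counts[type] += 1;
      return `[${type.toUpperCase()}]`;
    });
  }
  return { text, counts };
}

const sample = redact('Mail jane@example.com, SSN 123-45-6789.');
// sample.text is 'Mail [EMAIL], SSN [SSN].'
```

Returning per-type counts alongside the redacted text makes it easy to log what was removed without logging the sensitive values themselves.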

Common Patterns

Express Middleware

typescript
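import { AIDefence } from 'aidefence';
import express from 'express';

// Placeholder for your LLM client; not part of aidefence.
declare const llm: { generate(prompt: string): Promise<string> };

const app = express();
const defence = new AIDefence({ level: 'strict' });

// Parse JSON bodies first so req.body is populated for later handlers.
app.use(express.json());
// Scan incoming requests before they reach route handlers.
app.use(defence.middleware());

app.post('/api/chat', async (req, res) => {
  // Input already scanned by middleware
  const response = await llm.generate(req.body.message);
  res.json({ response });
});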
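
For readers who want to see roughly what a scanning middleware does, the following sketch builds one by hand on top of an `isSafe(input)` check. It is a conceptual stand-in, not the behavior of `defence.middleware()`: `guard`, `SafetyChecker`, and the minimal `Req`/`Res`/`Next` types are hypothetical and defined locally so the block runs without Express, and the choice of HTTP 400 and the `message` body field are assumptions.

```typescript
// Minimal structural types so the sketch runs without Express installed.
interface Req { body?: { message?: string } }
interface Res {
  status(code: number): Res;
  json(payload: unknown): void;
}
type Next = () => void;

// Anything exposing isSafe(input), as listed in the method table.
interface SafetyChecker {
  isSafe(input: string): Promise<boolean>;
}

// Hypothetical guard: pass safe requests on, reject unsafe ones with HTTP 400.
function guard(checker: SafetyChecker) {
  return async (req: Req, res: Res, next: Next): Promise<void> => {
    const message = req.body?.message ?? '';
    if (await checker.isSafe(message)) {
      next(); // hand over to the next handler in the chain
    } else {
      res.status(400).json({ error: 'Input rejected by security scan' });
    }
  };
}
```

Because `guard` depends only on the `SafetyChecker` interface, it can be exercised in tests with a stub and, assuming an `AIDefence` instance satisfies that interface, mounted as `app.use(guard(defence))`.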

Pre-LLM Input Validation

typescript
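import { AIDefence } from 'aidefence';

// Placeholder for your LLM client; not part of aidefence.
declare const llm: { generate(prompt: string): Promise<string> };

const defence = new AIDefence({ level: 'standard' });

// Scan user input and call the LLM only when no threats are found.
async function safeLLMCall(userInput: string): Promise<string> {
  const result = await defence.scan(userInput);
  if (!result.safe) {
    // Log the detected threat types, but return a generic message to the user.
    console.log(`Blocked: ${result.threats.map(t => t.type).join(', ')}`);
    return 'Your input was blocked for security reasons.';
  }
  return llm.generate(userInput);
}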

PII Redaction Pipeline

typescript
import { AIDefence } from 'aidefence';

const defence = new AIDefence({ enablePII: true });

// userMessage is the raw user text, defined elsewhere.
const piiResult = await defence.hasPII(userMessage);
let textForLLM = userMessage; // falls back to the original when no PII is found
if (piiResult.detected) {
  textForLLM = await defence.sanitize(userMessage);
  console.log(`Redacted ${piiResult.items.length} PII items`);
}
// Use textForLLM for the LLM call

RAN DDD Context

Bounded Context: Security

References