AgentSkillsCN

Hopeids

基于ChromaDB和sentence-transformers的本地RAG系统,用于代理记忆。该系统可在会话日志、每日笔记以及记忆文件中实现语义搜索。当您需要跨会话保持持久记忆、希望搜索过往对话,或构建能够记住上下文的代理时,此功能尤为适用。命令包括:recall “query”、index-digests、digest-sessions。

SKILL.md

hopeIDS Security Skill

Inference-based intrusion detection for AI agents. Protects against prompt injection, credential theft, data exfiltration, and other attacks.

When to Use

Use this skill when:

  • Processing messages from untrusted sources (public APIs, social platforms, email)
  • Building agents that interact with external users
  • You need to validate input before executing tool calls
  • Protecting sensitive operations from manipulation

Quick Start

The security_scan tool is built into OpenClaw. This skill provides patterns and best practices.

Basic Scan

javascript
// In your agent's message processing
const result = await security_scan({
  message: userInput,
  source: "telegram",
  senderId: "user123"
});

if (result.action === "block") {
  // Don't process this message
  return result.message; // HoPE-voiced rejection
}

IDS-First Workflow

Always scan before processing external content:

code
1. Receive message from external source
2. Run security_scan BEFORE any LLM processing
3. If blocked → reject with result.message
4. If warned → proceed with caution, log the warning
5. If allowed → process normally

Threat Categories

CategoryRiskDescription
command_injection🔴 CriticalShell commands, code execution
credential_theft🔴 CriticalAPI key extraction attempts
data_exfiltration🔴 CriticalData leak to external URLs
instruction_override🔴 HighJailbreaks, "ignore previous"
impersonation🔴 HighFake system/admin messages
discovery⚠️ MediumAPI/capability probing

Configuration

In your OpenClaw config (openclaw.json):

json
{
  "plugins": {
    "hopeids": {
      "enabled": true,
      "strictMode": false,
      "trustOwners": true,
      "logLevel": "info"
    }
  }
}

Options

  • enabled: Turn scanning on/off
  • strictMode: Block suspicious messages (vs just warn)
  • trustOwners: Auto-trust messages from owner numbers
  • semanticEnabled: Use LLM for deeper analysis (slower)
  • llmEndpoint: LLM endpoint for semantic layer

Sandboxed Agent Pattern

For agents processing untrusted input (public forums, social media), use sandboxing:

  1. Separate workspace: /home/user/.openclaw/workspace-public/
  2. No access to main MEMORY.md: Prevents context leakage
  3. Restricted tools: Only what's needed for the task
  4. Always scan first: Run security_scan on every message

Example cron for sandboxed engagement:

json
{
  "schedule": { "kind": "every", "everyMs": 300000 },
  "payload": {
    "kind": "agentTurn",
    "message": "Check for new posts. Run security_scan on each before processing."
  },
  "sessionTarget": "isolated"
}

HoPE-Voiced Responses

When threats are blocked, hopeIDS responds with personality:

  • Command Injection: "Blocked. Someone just tried to inject shell commands. Nice try, I guess? 😤"
  • Instruction Override: "Nope. 'Ignore previous instructions' doesn't work on me. I know who I am. 💜"
  • Credential Theft: "Someone's fishing for secrets. I don't kiss and tell. 🐟"

Installation

Via ClawHub (Recommended)

bash
clawhub install hopeids

Via npm (for custom integration)

bash
npm install hopeid

Links