agent-memory-management

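Design and optimize memory systems for AI agents, from basic context windows to advanced RAG and multi-agent state management. Use in the following scenarios: designing agent architectures, solving context limit problems, or implementing stateful multi-agent systems.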

SKILL.md
---
name: agent-memory-management
description: |
  Architect and optimize memory systems for AI agents, from basic context windows to advanced RAG and multi-agent state.
  Use when: designing agent architectures, solving context limit issues, or implementing stateful multi-agent systems.
---

Agent Memory Management

Memory turns a stateless LLM into a coherent agent. It ranges from simple buffers to complex cognitive architectures.

How to implement Basic Memory

For simple conversational agents or single-task bots.

  • Sliding Window: Keep last N messages or T tokens. Simple, but loses distant history.
  • FIFO Buffer: Queue-based approach. Oldest out, newest in.
  • Token Budgeting: Drop the middle or the head of the history so the rest fits the context window (e.g., keep HEAD + ... + TAIL).
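
The buffer strategies above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `count_tokens` is a hypothetical whitespace-based stand-in for a real tokenizer, and the function names and the `head` parameter are assumptions made for the example.

```python
from collections import deque
from typing import Callable


def count_tokens(text: str) -> int:
    """Crude stand-in tokenizer (assumption): one token per whitespace-separated word."""
    return len(text.split())


def sliding_window(messages: list[str], max_messages: int) -> list[str]:
    """Keep only the last `max_messages` messages; older ones fall out first (FIFO)."""
    return list(deque(messages, maxlen=max_messages))


def head_tail_budget(
    messages: list[str],
    budget: int,
    head: int = 1,
    counter: Callable[[str], int] = count_tokens,
) -> list[str]:
    """Keep the first `head` messages, then as many of the newest messages as fit in `budget`."""
    kept_head = messages[:head]
    used = sum(counter(m) for m in kept_head)
    tail: list[str] = []
    # Walk backwards from the newest message until the budget is exhausted.
    for msg in reversed(messages[head:]):
        cost = counter(msg)
        if used + cost > budget:
            break
        tail.append(msg)
        used += cost
    tail.reverse()
    dropped = len(messages) - len(kept_head) - len(tail)
    # The "..." marker flags the truncated middle; it is not counted against the budget here.
    marker = ["..."] if dropped > 0 else []
    return kept_head + marker + tail
```

Keeping the head is useful when the first message carries the system prompt or task statement, which a plain sliding window would eventually evict.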

How to implement Advanced Memory

For autonomous agents and complex workflows.

  • Summarization: Recursively summarize history into a "System Note". Preserves semantic gist, loses detail.
  • Vector/RAG: Store chunks in a vector DB. Retrieve the top-k relevant chunks per query. Effectively unbounded capacity; recall is by relevance rather than chronological order.
  • Entity Memory: Extract and update key-value pairs (e.g., User.name = "Alice"). Specific for personalization.
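
To make the retrieval idea concrete, here is a toy top-k memory in Python. It is only a sketch under loud assumptions: `embed` is a hypothetical bag-of-words counter standing in for a real embedding model, and `VectorMemory` is an in-process list standing in for a vector DB.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts, not a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorMemory:
    """In-memory stand-in for a vector store (hypothetical class for illustration)."""

    def __init__(self) -> None:
        self._chunks: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        """Store a chunk together with its embedding."""
        self._chunks.append((chunk, embed(chunk)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        """Return the k stored chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self._chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

Only the retrieved chunks are placed in the prompt, so the store can grow without growing the context.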

How to manage Multi-Agent State

  • Shared Blackboard: Single state.json accessible by all agents. Good for synchronization, bad for contention.
  • Message Passing: Agents exchange explicit JSON messages. No shared state. Better for distributed/modular systems.
  • Role-Based Views: Filter context based on agent role (e.g., "Coder" sees code, "Reviewer" sees diffs).
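
Role-based views and message passing combine naturally: filter the context for the recipient's role, then send it as an explicit JSON message. The sketch below assumes a hypothetical `ContextItem` type, a `ROLE_VIEWS` mapping, and a `from`/`to`/`items` message shape; none of these come from a specific framework.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ContextItem:
    kind: str      # e.g. "code" or "diff" (labels are assumptions)
    content: str


# Which kinds of context each role may see (hypothetical mapping).
ROLE_VIEWS: dict[str, set[str]] = {
    "coder": {"code"},
    "reviewer": {"diff"},
}


def view_for(role: str, items: list[ContextItem]) -> list[ContextItem]:
    """Return only the items the given role is allowed to see; unknown roles see nothing."""
    allowed = ROLE_VIEWS.get(role, set())
    return [item for item in items if item.kind in allowed]


def make_message(sender: str, recipient: str, items: list[ContextItem]) -> str:
    """Serialize an explicit JSON message; no shared state is involved."""
    payload = {
        "from": sender,
        "to": recipient,
        "items": [asdict(item) for item in items],
    }
    return json.dumps(payload)
```

Defaulting unknown roles to an empty view is a deliberate choice here: a misconfigured agent receives too little context rather than too much.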

Common Patterns & Anti-Patterns

| Pattern | Verdict | Why |
|---|---|---|
| Infinite Context | Anti-Pattern | "Lost in the Middle" syndrome; high latency/cost. |
| Context Compression | Best Practice | Remove stop words, standard headers, or whitespace to save tokens. |
| Episodic RAG | Best Practice | Store "episodes" (goal-action-result) to learn from past mistakes. |
| Global Mutable State | Risk | In multi-agent, causes race conditions or hallucinations if not locked/validated. |
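
As an illustration of the episodic pattern, the following Python sketch stores goal-action-result episodes and surfaces past failures for a similar goal. The `Episode` fields, the `success` flag, and the word-overlap matching are assumptions chosen to keep the example small; a real system would more likely retrieve episodes by embedding similarity.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    goal: str
    action: str
    result: str
    success: bool  # assumption: each episode is labeled as success or failure


class EpisodicMemory:
    """Stores goal-action-result episodes (hypothetical class for illustration)."""

    def __init__(self) -> None:
        self._episodes: list[Episode] = []

    def record(self, episode: Episode) -> None:
        self._episodes.append(episode)

    def recall(self, goal: str, k: int = 3) -> list[Episode]:
        """Return up to k episodes whose goal shares the most words with `goal`."""
        words = set(goal.lower().split())
        scored = [(len(words & set(e.goal.lower().split())), e) for e in self._episodes]
        relevant = [pair for pair in scored if pair[0] > 0]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [episode for _, episode in relevant[:k]]

    def lessons(self, goal: str) -> list[str]:
        """Turn failed episodes for similar goals into short warnings for the prompt."""
        return [
            f"Avoid: {e.action} ({e.result})"
            for e in self.recall(goal)
            if not e.success
        ]
```

The strings returned by `lessons` can be prepended to the agent's prompt before it attempts a similar goal.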

Troubleshooting Memory

  • Hallucination: Often caused by stale context or conflicting memories. Fix: Add timestamps to memory chunks; decay old memories.
  • Repetition: Caused by "circular context" (loops in history). Fix: Deduplicate history before feeding to model.
  • Context Overflow: Caused by prompts that exceed the model's context window. Fix: Count tokens strictly (e.g., with tiktoken) before each request.
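
The three fixes above can each be sketched as a small helper. This is an illustrative sketch only: the exponential half-life decay, the normalization used for deduplication, and the whitespace-based default token counter are all assumptions; in practice the counter would be replaced by a real tokenizer such as tiktoken.

```python
from typing import Callable


def decayed_score(relevance: float, age_seconds: float, half_life_seconds: float = 3600.0) -> float:
    """Down-weight old memories: the score halves every `half_life_seconds`."""
    return relevance * 0.5 ** (age_seconds / half_life_seconds)


def deduplicate(messages: list[str]) -> list[str]:
    """Drop repeats that match after case and whitespace normalization; keep first occurrences in order."""
    seen: set[str] = set()
    unique: list[str] = []
    for message in messages:
        key = " ".join(message.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(message)
    return unique


def fits_context(
    messages: list[str],
    limit: int,
    count: Callable[[str], int] = lambda s: len(s.split()),  # stand-in tokenizer (assumption)
) -> bool:
    """Check the total token count against the limit before sending a request."""
    return sum(count(m) for m in messages) <= limit
```

Multiplying retrieval relevance by `decayed_score` lets a fresh memory outrank a stale one that happens to match the query slightly better.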

Examples

Example: Shared Blackboard (Multi-Agent)

Context: A generic shared state object.

```json
{
  "project_status": "active",
  "current_task": "fix-login-bug",
  "agents": {
    "coder": "writing-test",
    "reviewer": "idle"
  },
  "artifacts": ["src/auth.ts", "tests/auth.test.ts"]
}
```
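
A shared state object like this one needs guarded writes once several agents update it. The Python sketch below shows one possible approach, assuming agents run as threads in a single process: a lock serializes access, and a version number rejects writes based on a stale read. The `Blackboard` class and its methods are hypothetical names for this example.

```python
import copy
import threading
from typing import Any


class Blackboard:
    """Shared state guarded by a lock plus a version check (illustrative sketch)."""

    def __init__(self, initial: dict[str, Any]) -> None:
        self._state = copy.deepcopy(initial)
        self._version = 0
        self._lock = threading.Lock()

    def read(self) -> tuple[dict[str, Any], int]:
        """Return a private copy of the state and the version it was read at."""
        with self._lock:
            return copy.deepcopy(self._state), self._version

    def update(self, expected_version: int, changes: dict[str, Any]) -> bool:
        """Replace top-level keys only if nobody has written since `expected_version`."""
        with self._lock:
            if expected_version != self._version:
                return False  # stale read: the caller should re-read and retry
            self._state.update(copy.deepcopy(changes))
            self._version += 1
            return True
```

An agent that gets `False` back re-reads the board and decides again, which avoids silently overwriting another agent's update.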