vllm-async-loading-debug

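Debug vLLM async KV loading and PegaFlow connector behavior. Use this skill when investigating async KV loads, WAITING_FOR_REMOTE_KVS states, load/save intents, connector metadata flow, prefetch behavior, or preemption interactions in vLLM scheduler/worker code.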

SKILL.md
--- frontmatter
name: vllm-async-loading-debug
description: Debug vLLM async KV loading and PegaFlow connector behavior. Use when investigating async KV loads, WAITING_FOR_REMOTE_KVS states, load/save intents, connector metadata flow, prefetch behavior, or preemption interactions in vLLM scheduler/worker code.

vLLM Async Loading Debug

Overview

Trace the async KV loading path between the vLLM scheduler and the worker-side connector, including the prefetch and preemption edge cases.

Workflow

1) Confirm async-load symptom

  • Record request IDs and statuses: WAITING, RUNNING, WAITING_FOR_REMOTE_KVS, PREEMPTED.
  • Check async-load logs and whether prefetch is in progress.
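
To make the status record above easier to read, the following is a minimal sketch of a log scanner that builds a per-request status timeline and flags requests whose last observed status is WAITING_FOR_REMOTE_KVS. The log line format (`req_id=... status=...`) and both helper names are assumptions made for illustration, not something vLLM or PegaFlow is known to emit; adapt the pattern to the log lines you actually see.

```python
import re
from collections import defaultdict

# Hypothetical log format: "... req_id=<id> status=<STATUS> ...".
# This is an assumption; adjust the pattern to your real log lines.
STATUS_RE = re.compile(r"req_id=(?P<req>\S+)\s+status=(?P<status>[A-Z_]+)")


def collect_status_timeline(lines):
    """Return {request_id: [status, ...]} with consecutive repeats collapsed."""
    timeline = defaultdict(list)
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue  # not a status line
        history = timeline[match["req"]]
        if not history or history[-1] != match["status"]:
            history.append(match["status"])
    return dict(timeline)


def stuck_in_remote_wait(timeline):
    """Request IDs whose last observed status is WAITING_FOR_REMOTE_KVS."""
    return sorted(
        req for req, history in timeline.items()
        if history[-1] == "WAITING_FOR_REMOTE_KVS"
    )
```

A request that appears in `stuck_in_remote_wait` at the end of a capture is a reasonable starting point for steps 2 and 3.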

2) Follow scheduler-side decisions

  • python/pegaflow/connector/scheduler.py: prefetch query + LoadIntent creation.
  • .project-plans/scheduler.py (private): load_kv_async gating and WAITING_FOR_REMOTE_KVS transitions.
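
As a mental model for the gating described above, here is a toy transition function. It is a simplified sketch, not the scheduler's actual code: the `Status` enum and `next_status` are hypothetical names, and the rule that a request returns to WAITING once its load is reported finished is an assumption to verify against the scheduler source.

```python
from enum import Enum, auto


class Status(Enum):
    WAITING = auto()
    WAITING_FOR_REMOTE_KVS = auto()
    RUNNING = auto()


def next_status(status, *, load_kv_async=False, finished_recving=False):
    """Toy model of the async-load gate (illustrative only)."""
    if status is Status.WAITING and load_kv_async:
        # The connector asked for an async load: park the request.
        return Status.WAITING_FOR_REMOTE_KVS
    if status is Status.WAITING_FOR_REMOTE_KVS:
        # Stay parked until the worker reports the load as finished.
        return Status.WAITING if finished_recving else status
    if status is Status.WAITING:
        # No async load requested: the request can be scheduled.
        return Status.RUNNING
    return status
```

When a request never leaves WAITING_FOR_REMOTE_KVS, the question to ask under this model is whether the finished signal ever reached the scheduler.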

3) Follow worker-side lifecycle

  • python/pegaflow/connector/worker.py: start_load_kv() calls engine_client.load() and tracks PyLoadState.
  • get_finished() polls is_ready() and emits finished_recving.
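
The start-then-poll lifecycle above can be sketched in isolation as follows. `FakeLoadState` is a hypothetical stand-in for PyLoadState, and `LoadTracker` is an illustrative class, not the connector's real implementation; only the shape (register a load, poll `is_ready()`, report finished request IDs once) is taken from the steps above.

```python
class FakeLoadState:
    """Hypothetical stand-in for PyLoadState: ready after N polls."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    def is_ready(self):
        self._remaining -= 1
        return self._remaining <= 0


class LoadTracker:
    """Illustrative tracker for in-flight loads, keyed by request ID."""

    def __init__(self):
        self._pending = {}

    def start_load(self, req_id, state):
        # Remember the load handle so later polls can check it.
        self._pending[req_id] = state

    def get_finished(self):
        """Poll each pending load once; return the finished request IDs."""
        done = {rid for rid, state in self._pending.items() if state.is_ready()}
        for rid in done:
            # Report each request as finished exactly once.
            del self._pending[rid]
        return done
```

Usage: after `tracker.start_load("a", FakeLoadState(1))`, the first `tracker.get_finished()` call returns `{"a"}` and later calls return an empty set. A load that is started but never shows up in the finished set points at the engine side rather than the scheduler.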

4) Check preemption and retries

  • .project-plans/scheduler.py (private): _preempt_request() and reset_prefix_cache() behavior.
  • invalid_block_ids handling can trigger recompute or failure based on kv_load_failure_policy.
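
To reason about which requests are affected when a load reports invalid blocks, a small sketch like the one below can help. The function name, the `request_blocks` mapping, and the policy strings `"recompute"` and `"fail"` are assumptions for illustration; check the accepted values of kv_load_failure_policy in your vLLM version before relying on them.

```python
def handle_invalid_blocks(invalid_block_ids, request_blocks, policy="recompute"):
    """Return {req_id: action} for every request that uses an invalid block.

    request_blocks maps request ID -> iterable of block IDs (hypothetical shape).
    policy is assumed to be either "recompute" or "fail".
    """
    if policy not in ("recompute", "fail"):
        raise ValueError(f"unknown policy: {policy!r}")
    invalid = set(invalid_block_ids)
    actions = {}
    for req_id, blocks in request_blocks.items():
        if invalid.intersection(blocks):
            # Only requests that touch an invalid block are affected.
            actions[req_id] = policy
    return actions
```

Requests that share a prefix block with a failed load show up here even if their own load succeeded, which is worth checking when a failure seems to spread across requests.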

Reference Files

  • Read references/async-loading.md for the full call flow and log markers.