vLLM Async Loading Debug
Overview
Trace the async KV loading path between the vLLM scheduler and the worker connectors, including prefetch and preemption edges.
Workflow
1) Confirm async-load symptom
- Record request IDs and statuses: `WAITING`, `RUNNING`, `WAITING_FOR_REMOTE_KVS`, `PREEMPTED`.
- Check async-load logs and whether prefetch is in progress.
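One way to make the recorded statuses actionable is to keep a per-request timeline and flag requests whose latest status has been `WAITING_FOR_REMOTE_KVS` for too long. The sketch below assumes you have already extracted `(timestamp, request_id, status)` tuples from your logs by some means; the tuple layout, the helper names `build_timelines` and `stuck_in_remote_kvs`, and the threshold are all hypothetical and not part of vLLM or pegaflow.

```python
from collections import defaultdict

# The four statuses named in this workflow step.
STATUSES = {"WAITING", "RUNNING", "WAITING_FOR_REMOTE_KVS", "PREEMPTED"}


def build_timelines(events):
    """Group (timestamp_seconds, request_id, status) tuples by request ID.

    The tuple layout is an assumption for illustration; adapt it to
    whatever your own log extraction produces.
    """
    timelines = defaultdict(list)
    for ts, req_id, status in sorted(events):
        if status not in STATUSES:
            raise ValueError(f"unexpected status {status!r} for {req_id}")
        timelines[req_id].append((ts, status))
    return dict(timelines)


def stuck_in_remote_kvs(timelines, now, threshold_seconds):
    """Return request IDs whose latest status is WAITING_FOR_REMOTE_KVS
    and has not changed for more than threshold_seconds."""
    stuck = []
    for req_id, timeline in timelines.items():
        last_ts, last_status = timeline[-1]
        if last_status == "WAITING_FOR_REMOTE_KVS" and now - last_ts > threshold_seconds:
            stuck.append(req_id)
    return sorted(stuck)


events = [
    (0.0, "req-a", "WAITING"),
    (0.1, "req-a", "WAITING_FOR_REMOTE_KVS"),
    (0.2, "req-b", "WAITING_FOR_REMOTE_KVS"),
    (0.5, "req-b", "RUNNING"),
]
timelines = build_timelines(events)
stuck = stuck_in_remote_kvs(timelines, now=10.0, threshold_seconds=5.0)
```

Here `req-a` never leaves `WAITING_FOR_REMOTE_KVS`, so it is the one to chase through the scheduler and worker steps that follow.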
2) Follow scheduler-side decisions
- `python/pegaflow/connector/scheduler.py`: prefetch query + `LoadIntent` creation.
- `.project-plans/scheduler.py` (private): `load_kv_async` gating and `WAITING_FOR_REMOTE_KVS` transitions.
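When reading the gating code, it can help to hold a simplified model of the transitions in mind. The sketch below is a toy state machine, not the scheduler's actual logic: it assumes a waiting request is parked in `WAITING_FOR_REMOTE_KVS` when `load_kv_async` is true and returns to `WAITING` once the worker reports it as finished receiving. The `Status` enum and `next_status` function are hypothetical names introduced for illustration.

```python
from enum import Enum, auto


class Status(Enum):
    WAITING = auto()
    WAITING_FOR_REMOTE_KVS = auto()
    RUNNING = auto()


def next_status(status, load_kv_async, finished_recving):
    """Toy transition function (an assumption, not the real scheduler).

    - WAITING: park the request if an async load was requested,
      otherwise let it run.
    - WAITING_FOR_REMOTE_KVS: stay parked until the load is reported
      finished, then go back to WAITING to be scheduled normally.
    """
    if status is Status.WAITING:
        return Status.WAITING_FOR_REMOTE_KVS if load_kv_async else Status.RUNNING
    if status is Status.WAITING_FOR_REMOTE_KVS:
        return Status.WAITING if finished_recving else status
    return status


# Walk one request through an async load.
s = Status.WAITING
s = next_status(s, load_kv_async=True, finished_recving=False)   # parked
parked = s
s = next_status(s, load_kv_async=False, finished_recving=False)  # still parked
s = next_status(s, load_kv_async=False, finished_recving=True)   # load done
resumed = s
s = next_status(s, load_kv_async=False, finished_recving=False)  # scheduled
final = s
```

A request that stays in the parked state across many scheduler steps, under this model, points at the worker never reporting completion rather than at the gating decision itself.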
3) Follow worker-side lifecycle
- `python/pegaflow/connector/worker.py`: `start_load_kv()` calls `engine_client.load()` and tracks `PyLoadState`.
- `get_finished()` polls `is_ready()` and emits `finished_recving`.
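The worker-side lifecycle described above (register a load, poll it, emit finished request IDs) can be sketched as a small self-contained model. This is not the connector's real code: `FakeLoadState` is a hypothetical stand-in for `PyLoadState`, `ToyWorker` is a hypothetical class, and the real `start_load_kv()` and `get_finished()` signatures may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class FakeLoadState:
    """Hypothetical stand-in for PyLoadState: becomes ready after N polls."""

    def __init__(self, polls_until_ready: int) -> None:
        self._remaining = polls_until_ready

    def is_ready(self) -> bool:
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True


@dataclass
class ToyWorker:
    """Toy model of the lifecycle; signatures are assumptions."""

    pending: dict[str, FakeLoadState] = field(default_factory=dict)

    def start_load_kv(self, req_id: str, state: FakeLoadState) -> None:
        # The real connector would call engine_client.load() here and
        # keep the returned load state; this model just stores it.
        self.pending[req_id] = state

    def get_finished(self) -> set[str]:
        # Poll every pending load once; report and forget the ready ones.
        finished_recving = {
            rid for rid, state in self.pending.items() if state.is_ready()
        }
        for rid in finished_recving:
            del self.pending[rid]
        return finished_recving


worker = ToyWorker()
worker.start_load_kv("req-a", FakeLoadState(polls_until_ready=0))
worker.start_load_kv("req-b", FakeLoadState(polls_until_ready=2))
first = worker.get_finished()
second = worker.get_finished()
third = worker.get_finished()
```

In this model a request that is never reported finished is one whose load state never becomes ready, which is a useful thing to confirm in the real worker logs before suspecting the scheduler.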
4) Check preemption and retries
- `.project-plans/scheduler.py` (private): `_preempt_request()` and `reset_prefix_cache()` behavior.
- `invalid_block_ids` handling can trigger recompute or failure based on `kv_load_failure_policy`.
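To reason about what the failure policy implies for a given request, a minimal sketch of the decision follows. It assumes two policy values, written here as the strings `"recompute"` and `"fail"`, and assumes that recompute keeps only the tokens in blocks before the first invalid one. Both are assumptions for illustration, and `handle_invalid_blocks` is a hypothetical helper, so check the actual scheduler code for the real behavior.

```python
def handle_invalid_blocks(block_ids, invalid_block_ids, block_size, policy):
    """Return (action, usable_tokens) for one request.

    Assumed semantics (not confirmed against the real scheduler):
    - no invalid blocks: everything loaded is usable
    - "fail": the request is failed outright
    - "recompute": tokens from the first invalid block onward are
      recomputed, so only the blocks before it stay usable
    """
    bad_positions = [i for i, b in enumerate(block_ids) if b in invalid_block_ids]
    if not bad_positions:
        return ("ok", len(block_ids) * block_size)
    if policy == "fail":
        return ("fail", 0)
    if policy == "recompute":
        return ("recompute", bad_positions[0] * block_size)
    raise ValueError(f"unknown policy {policy!r}")


# Request owns four blocks of 16 tokens; block 12 failed to load.
recompute_result = handle_invalid_blocks([10, 11, 12, 13], {12}, 16, "recompute")
fail_result = handle_invalid_blocks([10, 11, 12, 13], {12}, 16, "fail")
clean_result = handle_invalid_blocks([10, 11, 12, 13], set(), 16, "recompute")
```

Under these assumptions the recompute case keeps the first two blocks (32 tokens) and redoes the rest, which is the behavior to compare against when a retried request shows an unexpected number of computed tokens.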
Reference Files
- Read `references/async-loading.md` for full call flow and log markers.