ComfyUI Troubleshooting Guide

Error Diagnosis Strategy

When a workflow fails, follow this systematic approach:

•Get the error: Use get_history to retrieve the execution result with full traceback
•Check logs: Use get_logs with keyword filters like "error", "warning", "traceback"
•Identify the failing node: The history response includes the node_id and node_type that failed
•Cross-reference inputs: Use get_node_info to verify the failing node's expected input schema
•Check models: Use list_local_models to verify all referenced model files exist

Out of Memory (OOM)

Error Pattern

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate X MiB.
GPU 0 has a total capacity of 24.00 GiB of which X MiB is free.

Or:

RuntimeError: CUDA error: out of memory

Root Cause

The GPU does not have enough VRAM to hold the model weights, intermediate tensors, and latent images simultaneously. Common triggers:

•High resolution images (2048x2048+)
•Multiple models loaded simultaneously
•FP32 precision models on limited VRAM
•Video generation (LTXV, AnimateDiff) with many frames
•Large batch sizes
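
To see why resolution and batch size matter so much, it can help to compute the size of the latent tensor the sampler works on. The sketch below assumes an SD-style VAE with 8x spatial downscaling and 4 latent channels (the case for SD 1.5 and SDXL; other model families differ). `latent_shape` and `element_count` are illustrative helpers, not ComfyUI functions.

```python
import math


def latent_shape(width: int, height: int, batch_size: int = 1,
                 channels: int = 4, downscale: int = 8) -> tuple:
    """Return (batch, channels, latent_height, latent_width) for a pixel size.

    channels=4 and downscale=8 are assumptions that fit SD 1.5 / SDXL.
    """
    return (batch_size, channels, height // downscale, width // downscale)


def element_count(shape: tuple) -> int:
    """Number of values stored in a tensor of the given shape."""
    return math.prod(shape)


base = element_count(latent_shape(1024, 1024))
large = element_count(latent_shape(2048, 2048))
batched = element_count(latent_shape(1024, 1024, batch_size=4))

print(latent_shape(1024, 1024))  # (1, 4, 128, 128)
print(large / base)              # 4.0
print(batched / base)            # 4.0
```

Doubling both sides of the image quadruples the latent, the same as a batch of four. Intermediate tensors inside the model scale with it, and attention layers can grow faster than linearly with the number of latent positions.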

Fixes (in order of preference)

•Reduce resolution: Drop to the model's native resolution (512x512 for SD 1.5, 1024x1024 for SDXL and Flux)
•Use FP8/FP16 quantized models: FP8 Flux models use ~12GB vs ~24GB for FP16
  •Search for FP8 variants: search_models("flux fp8") or search_models("sdxl fp8")
•Use --lowvram flag: ComfyUI CLI flag that offloads model parts to CPU during inference
•Free VRAM between generations: ComfyUI normally manages VRAM automatically, but restarting it clears memory that was leaked or left allocated
•Use tiled VAE decoding: For high-resolution images, tile the VAE decode step
  •Node: VAEDecodeTiled instead of VAEDecode
  •Breaks the image into tiles, decodes each separately, and stitches them together
•Reduce batch size: Set batch_size to 1 in EmptyLatentImage
•Avoid multiple models: Don't load two full checkpoints simultaneously — use one checkpoint and LoRAs instead
•For LTXV/video: Always use FP8 quantized video models on 24GB cards

VRAM Estimates

Model	FP32	FP16	FP8
SD 1.5	~4GB	~2GB	~1GB
SDXL	~12GB	~6GB	~3GB
Flux Dev	~48GB	~24GB	~12GB
Flux Schnell	~48GB	~24GB	~12GB
LTXV	~20GB+	~10GB+	~6GB
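
The figures in this table follow roughly from parameter count times bytes per parameter. The sketch below reproduces the Flux row under the assumption of about 12 billion parameters. It counts weights only, so real usage is higher once activations, text encoders, and the VAE are loaded. `weight_gb` is an illustrative helper.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_gb(param_count: float, dtype: str) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


FLUX_PARAMS = 12e9  # assumption: roughly 12 billion parameters

for dtype in ("fp32", "fp16", "fp8"):
    print(dtype, weight_gb(FLUX_PARAMS, dtype))
# fp32 48.0
# fp16 24.0
# fp8 12.0
```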

Device Mismatch

Error Pattern

RuntimeError: Expected all tensors to be on the same device, but found at least
two devices, cuda:0 and cpu!

Root Cause

A tensor on the CPU is being combined with a tensor on the GPU. This usually happens when:

•A custom node doesn't properly move tensors to the correct device
•Model offloading placed parts of the model on CPU
•A node produces CPU tensors while downstream expects GPU tensors

Fixes

•Check whether the error comes from a specific custom node; if so, update it to a version that fixes device handling, or replace it
•If using --lowvram or --cpu, note that some custom nodes do not support offloading; run once without the flag to confirm whether it is the cause
•Restart ComfyUI to reset device state

Missing Nodes

Error Pattern

Cannot find node class 'NodeClassName'

Or in the execution response:

"error": {"type": "node_not_found", "message": "Cannot find node class 'X'"}

Root Cause

The workflow references a node type that is not installed. This happens when:

•A custom node pack is not installed
•A custom node pack is installed but failed to load (import error)
•The node was renamed or removed in a pack update

Fixes

•Search for the node pack:
```
search_custom_nodes("NodeClassName")
```
•Install via ComfyUI Manager or the registry
•Check logs for import errors:
```
get_logs(keyword="import")
get_logs(keyword="error")
```
Import errors often reveal missing Python dependencies
•Install missing Python dependencies: If the custom node requires a pip package:
```bash
pip install missing-package
```
•Restart ComfyUI after installing any custom node — nodes are loaded at startup
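
Before queueing a workflow, it can be useful to list every node type it references and compare that against what the server reports as installed. The sketch below assumes the workflow is in ComfyUI's API format (a dict of node ID to `{"class_type": ..., "inputs": ...}`) and that the set of available node types has been gathered separately. `find_missing_node_types` is a hypothetical helper.

```python
def find_missing_node_types(workflow: dict, available: set) -> dict:
    """Map each missing class_type to the IDs of the nodes that use it."""
    missing = {}
    for node_id, node in workflow.items():
        class_type = node.get("class_type")
        if class_type not in available:
            missing.setdefault(class_type, []).append(node_id)
    return missing


# Example data; "SomeCustomNode" stands in for a node from an uninstalled pack.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple", "inputs": {}},
    "2": {"class_type": "SomeCustomNode", "inputs": {}},
    "3": {"class_type": "KSampler", "inputs": {}},
}
available = {"CheckpointLoaderSimple", "KSampler", "VAEDecode"}

print(find_missing_node_types(workflow, available))
# {'SomeCustomNode': ['2']}
```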

NaN Tensor Errors

Error Pattern

RuntimeError: Input contains NaN

Or images come out as solid gray/noise with NaN warnings in logs.

Root Cause

Numerical instability during the diffusion process. Common triggers:

•CFG scale too high: Values above 15-20 can cause numerical overflow
•Corrupted model weights: Damaged download or incompatible merge
•FP16 overflow: Some operations overflow at half precision
•Incompatible LoRA: A LoRA trained for a different base model
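
The FP16 overflow trigger is easy to demonstrate without a GPU. Half precision cannot represent finite values above 65504, and once a value becomes infinite, later arithmetic such as inf minus inf yields NaN. The sketch below uses only the standard library (`struct` format `e` is IEEE half precision). `fits_in_fp16` is an illustrative helper, not part of ComfyUI or PyTorch.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE half-precision value


def fits_in_fp16(value: float) -> bool:
    """True if a finite value can be stored in half precision without overflow."""
    if not math.isfinite(value):
        return False
    try:
        struct.pack("<e", value)  # raises OverflowError when out of range
    except OverflowError:
        return False
    return True


print(fits_in_fp16(7.0))       # True
print(fits_in_fp16(FP16_MAX))  # True
print(fits_in_fp16(70000.0))   # False

# Once an overflow has produced infinity, NaN follows quickly:
overflowed = float("inf")
print(math.isnan(overflowed - overflowed))  # True
```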

Fixes

•Lower CFG: Try CFG 7.0 for SD 1.5/SDXL, 1.0 for Flux
•Use FP32 VAE: Some VAEs produce NaN in FP16. Switch to vae-ft-mse-840000-ema-pruned.safetensors (FP32)
•Remove LoRAs: Test without LoRAs to isolate the cause
•Re-download the model: Compare the file's hash against the one published with the model; a mismatch indicates a corrupted download
•Check LoRA compatibility: Ensure the LoRA matches the base model family

Dtype Mismatches

Error Pattern

RuntimeError: expected scalar type Float but found Half

Or:

RuntimeError: expected scalar type Half but found Float

Or:

RuntimeError: Input type (float) and bias type (c10::Half) should be the same

Root Cause

A model component expects one precision (FP32/FP16) but receives another. Most common with:

•VAE precision mismatch (FP16 model + FP32 VAE or vice versa)
•Mixed-precision LoRAs
•Custom nodes that force a specific dtype

Fixes

•Use a separate VAE: Load an explicit FP32 VAE instead of the checkpoint's built-in VAE
  •Node: VAELoader with vae-ft-mse-840000-ema-pruned.safetensors
•Match precision: If the model is FP16, use FP16-compatible nodes throughout
•Force FP32 VAE decode: Some node packs offer VAEDecodeFP32 nodes
•Check ComfyUI launch flags: --force-fp32 forces everything to FP32 (uses more VRAM)

CLIP Token Overflow

Error Pattern

No explicit error — the prompt is silently truncated at 77 tokens, and details mentioned late in the prompt are ignored.

Symptoms

•Later parts of long prompts have no effect on the image
•Adding more descriptive text doesn't change the output
•Removing early tokens suddenly makes later tokens work

Fixes

•Use BREAK token: Split the prompt at natural boundaries:

subject description, pose, clothing, setting
BREAK
lighting, style, quality, camera angle

•Use CLIPTextEncodeSDXL: SDXL has two text encoders, and this node takes a separate prompt for each (text_g and text_l), so a long description can be divided between them
•Prioritize important tokens: Put the most important descriptors first
•Use fewer filler words: Remove articles and prepositions where possible
•Use embeddings: Condense complex concepts into single tokens with textual inversions
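
A quick way to decide where to split a prompt is to estimate how long each segment is. The sketch below splits on BREAK and counts words and punctuation marks as a rough proxy. The real CLIP tokenizer often splits a word into several tokens, so treat the count as a floor rather than an exact figure. Both helper names are hypothetical.

```python
import re

TOKEN_LIMIT = 77  # per-chunk limit described above


def split_on_break(prompt: str) -> list:
    """Split a prompt into segments at the BREAK keyword."""
    return [part.strip() for part in re.split(r"\bBREAK\b", prompt) if part.strip()]


def rough_token_count(text: str) -> int:
    """Count words and punctuation marks; a floor on the real token count."""
    return len(re.findall(r"\w+|[^\w\s]", text))


prompt = (
    "subject description, pose, clothing, setting "
    "BREAK lighting, style, quality, camera angle"
)

for segment in split_on_break(prompt):
    count = rough_token_count(segment)
    status = "ok" if count < TOKEN_LIMIT else "likely too long"
    print(count, status, "|", segment)
```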

Black Images

Error Pattern

No error in the execution — the workflow "succeeds" but produces completely black or near-black images.

Root Causes and Fixes

Cause	Diagnosis	Fix
`denoise = 0`	Check KSampler inputs	Set denoise to 1.0 for txt2img, 0.5-0.8 for img2img
`cfg = 0`	Check KSampler inputs	Set CFG to 7.0 (SD 1.5), 1.0 (Flux)
`steps = 0`	Check KSampler inputs	Set steps to 20+ (standard) or 4+ (turbo)
Wrong VAE	VAE doesn't match model	Use the correct VAE for the model family
Empty prompt	CLIPTextEncode has empty text	Add a text prompt
Wrong scheduler	Incompatible scheduler/sampler combo	Try `"normal"` scheduler with `"euler"` sampler
Seed collision	Extremely rare	Change the seed value
FP16 VAE overflow	VAE decode produces black	Use FP32 VAE or VAEDecodeTiled

Quick Diagnostic Checklist

•Check denoise > 0 (should be 1.0 for txt2img)
•Check cfg > 0 (should be 7.0 for SD 1.5, 1.0 for Flux)
•Check steps > 0 (should be 20 for standard, 4 for turbo)
•Verify the positive prompt is not empty
•Try a different seed
•Try a known-working sampler/scheduler combo: euler + normal
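
The first four checks can be automated against a workflow before it is queued. The sketch below assumes the workflow is in ComfyUI's API format and that the sampler is a standard KSampler whose positive input links to a CLIPTextEncode node. Values supplied through connections rather than literals are skipped. `check_black_image_causes` is a hypothetical helper.

```python
def check_black_image_causes(workflow: dict) -> list:
    """Return human-readable problems found in KSampler settings."""
    problems = []
    for node_id, node in workflow.items():
        if node.get("class_type") != "KSampler":
            continue
        inputs = node.get("inputs", {})

        # Literal numeric inputs must be greater than zero.
        for name in ("denoise", "cfg", "steps"):
            value = inputs.get(name)
            if isinstance(value, (int, float)) and value <= 0:
                problems.append(f"node {node_id}: {name} = {value}, expected > 0")

        # Follow the positive conditioning link back to its text encoder.
        positive = inputs.get("positive")
        if isinstance(positive, list) and positive:
            source = workflow.get(positive[0], {})
            text = source.get("inputs", {}).get("text")
            if (source.get("class_type") == "CLIPTextEncode"
                    and isinstance(text, str) and not text.strip()):
                problems.append(f"node {positive[0]}: positive prompt is empty")
    return problems


workflow = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["4", 1]}},
    "3": {"class_type": "KSampler",
          "inputs": {"steps": 20, "cfg": 7.0, "denoise": 0, "positive": ["6", 0]}},
}

for problem in check_black_image_causes(workflow):
    print(problem)
# node 3: denoise = 0, expected > 0
# node 6: positive prompt is empty
```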

Connection Type Errors

Error Pattern

Output type 'IMAGE' doesn't match input type 'LATENT'

Or:

Required input 'model' of type 'MODEL' but got connection of type 'CLIP'

Root Cause

Connecting the wrong output slot of a node to an incompatible input. Often caused by using the wrong output index.

Fixes

•Check output indices: Use get_node_info to verify the exact output order
  •CheckpointLoaderSimple outputs: 0=MODEL, 1=CLIP, 2=VAE
  •For a CheckpointLoaderSimple with node ID "1", ["1", 0] gives MODEL and ["1", 1] gives CLIP; an off-by-one index connects the wrong type
•Verify connection format: ["nodeId", outputIndex] — node ID is a string, index is an integer

•Check data type flow: Ensure the pipeline follows the correct type chain:

MODEL → KSampler
CLIP → CLIPTextEncode → CONDITIONING → KSampler
LATENT → KSampler → LATENT → VAEDecode → IMAGE
VAE → VAEDecode, VAEEncode
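
The connection format and output-index rules above can be checked mechanically. The sketch below validates the ["nodeId", outputIndex] shape and looks up the type a CheckpointLoaderSimple connection delivers, using the output order listed above. Both helpers are hypothetical, and output orders for other nodes would need to come from get_node_info.

```python
# Output order of CheckpointLoaderSimple, as listed above.
CHECKPOINT_OUTPUTS = ("MODEL", "CLIP", "VAE")


def is_connection(value) -> bool:
    """True if value has the form ["nodeId", outputIndex]."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], int)
        and not isinstance(value[1], bool)  # bool is a subclass of int
    )


def check_link(connection, expected_type: str, output_types=CHECKPOINT_OUTPUTS):
    """Return None if the link delivers expected_type, else an error string."""
    if not is_connection(connection):
        return f"malformed connection: {connection!r}"
    node_id, index = connection
    if not 0 <= index < len(output_types):
        return f"node {node_id} has no output {index}"
    actual = output_types[index]
    if actual != expected_type:
        return f"expected {expected_type} but output {index} is {actual}"
    return None


print(check_link(["1", 0], "MODEL"))  # None
print(check_link(["1", 1], "MODEL"))  # expected MODEL but output 1 is CLIP
print(check_link([1, 0], "MODEL"))    # malformed connection: [1, 0]
```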

Model Loading Errors

Error Pattern

FileNotFoundError: [Errno 2] No such file or directory: 'models/checkpoints/model.safetensors'

Or:

SafetensorError: Error reading file: invalid header

Or:

RuntimeError: PytorchStreamReader failed reading zip archive

Root Causes

•File not found: Model file doesn't exist at the referenced path
•Corrupted download: Incomplete or damaged file
•Wrong format: File is not a valid safetensors/pickle/checkpoint format

Fixes

•Verify the model exists: list_local_models(model_type="checkpoints")
•Check the exact filename: Model names in workflows must match the filename exactly (case-sensitive)

•Re-download: If hash mismatch or corruption:

download_model(url="...", target_subfolder="checkpoints")

•Check file size: A 1KB safetensors file is clearly corrupted — re-download
•Verify subfolder: Models must be in the correct subfolder (checkpoints/, loras/, vae/, etc.)
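
An "invalid header" error can often be confirmed without loading the model. A safetensors file begins with an 8-byte little-endian length followed by that many bytes of JSON, so a short standard-library check can tell a truncated or mislabeled file from a plausible one. The sketch below checks the header only, not the tensor data, and its function names are illustrative.

```python
import json
import os
import struct


def check_header(data: bytes, file_size: int) -> str:
    """Validate the leading bytes of a safetensors file."""
    if file_size < 8 or len(data) < 8:
        return "too small to hold a safetensors header"
    (header_len,) = struct.unpack("<Q", data[:8])
    if header_len > file_size - 8:
        return "declared header is longer than the file (truncated or corrupted)"
    try:
        header = json.loads(data[8:8 + header_len])
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return "header is not valid JSON (corrupted or not safetensors)"
    if not isinstance(header, dict):
        return "header is not a JSON object"
    tensors = [name for name in header if name != "__metadata__"]
    return f"ok: {len(tensors)} tensor entries"


def check_file(path: str) -> str:
    """Read only the header portion of a file and validate it."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(8)
        if len(head) < 8:
            return check_header(head, size)
        (header_len,) = struct.unpack("<Q", head)
        body = f.read(min(header_len, size))  # never read past the file
    return check_header(head + body, size)
```

Calling `check_file` on a model path returns a short status string; anything other than a string starting with "ok" suggests re-downloading.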

Torch / CUDA Version Errors

Error Pattern

RuntimeError: CUDA error: no kernel image is available for execution on the device

Or:

ImportError: cannot import name 'xxx' from 'torch'

Or:

AssertionError: Torch not compiled with CUDA enabled

Root Cause

PyTorch and CUDA version incompatibility, usually after:

•Updating PyTorch without matching CUDA toolkit
•Installing a custom node that downgrades/changes PyTorch
•Using pip install that pulls a CPU-only PyTorch

Fixes

•Check current versions:

get_system_stats()  # Shows PyTorch version and CUDA version

•Verify CUDA availability: In Python: torch.cuda.is_available()
•Reinstall PyTorch with CUDA: Visit pytorch.org for the correct install command matching your CUDA version
•Pin PyTorch version: After fixing, avoid running pip install commands that might change PyTorch
•Use ComfyUI's bundled venv: ComfyUI Desktop ships with a pre-configured Python environment
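
When checking versions by hand, the following sketch gathers the relevant facts in one place. Run it with the same Python interpreter that ComfyUI uses, otherwise it reports on a different environment. `torch_cuda_report` is an illustrative helper. A `built_with_cuda` value of None indicates a CPU-only PyTorch build.

```python
def torch_cuda_report() -> dict:
    """Summarize the PyTorch install visible to this interpreter."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # Whether a usable CUDA device is visible right now.
        "cuda_available": torch.cuda.is_available(),
    }


for key, value in torch_cuda_report().items():
    print(f"{key}: {value}")
```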

ComfyUI Desktop vs CLI Differences

Key Differences

Aspect	ComfyUI Desktop	ComfyUI CLI
Default port	8000	8188
Python	Embedded (bundled)	System/venv Python
Install location	`AppData/Local/Programs/ComfyUI/`	Wherever you cloned it
Custom nodes	`Documents/ComfyUI/custom_nodes/`	`./custom_nodes/` in repo
Models	`Documents/ComfyUI/models/`	`./models/` in repo
Config	`extra_model_paths.yaml` for shared paths	Same
Updates	Auto-updater in the app	`git pull`

Common Issues

•Wrong port: MCP tools default to 8188 — if using Desktop, configure for port 8000
•Path confusion: Desktop separates user data from application files
•Custom node pip installs: Desktop's embedded Python may not be on PATH — install within the venv
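
If it is unclear which port a local install is using, a plain TCP probe of both defaults settles it quickly. The sketch below only tests whether something is listening; it does not confirm that the listener is ComfyUI. `open_ports` is a hypothetical helper.

```python
import socket


def open_ports(candidates=(8188, 8000), host="127.0.0.1", timeout=0.5) -> list:
    """Return the candidate ports that accept a TCP connection."""
    found = []
    for port in candidates:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:  # refused, timed out, or unreachable
            pass
    return found


print(open_ports())  # [8188] suggests CLI defaults, [8000] suggests Desktop
```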

Error-Specific Debugging Commands

Workflow Failed — Get Details

get_history()                           # Most recent execution
get_history(prompt_id="abc-123")        # Specific execution

The response includes:

•status.status_str: "success" or "error"
•status.messages: Timestamped execution messages
•outputs: Node outputs (images, etc.)
•Error traceback for failed nodes
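
Pulling the failing node out of a history entry might look like the sketch below. The entry shape is an assumption: it takes the `status.status_str` and `status.messages` fields listed above and further assumes each message is an `[event_name, payload]` pair, with errors reported under an `execution_error` event carrying `node_id`, `node_type`, and `exception_message`. Verify against a real response before relying on it.

```python
def summarize_history_entry(entry: dict) -> dict:
    """Extract the status and, if present, the failing node from one entry."""
    status = entry.get("status", {})
    summary = {"status": status.get("status_str"), "failed_node": None}
    for message in status.get("messages", []):
        # Assumed message shape: [event_name, payload_dict]
        if len(message) == 2 and message[0] == "execution_error":
            payload = message[1]
            summary["failed_node"] = {
                "node_id": payload.get("node_id"),
                "node_type": payload.get("node_type"),
                "message": payload.get("exception_message"),
            }
    return summary


# Hand-written example entry, not captured from a real server.
entry = {
    "status": {
        "status_str": "error",
        "messages": [
            ["execution_start", {}],
            ["execution_error", {
                "node_id": "3",
                "node_type": "KSampler",
                "exception_message": "CUDA out of memory",
            }],
        ],
    },
    "outputs": {},
}

print(summarize_history_entry(entry))
```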

Check Server Health

get_system_stats()    # GPU info, VRAM, Python/PyTorch versions
get_queue()           # Running and pending jobs
get_logs(max_lines=50, keyword="error")  # Recent error logs

Verify Node Availability

get_node_info(node_type="KSampler")              # Check specific node
get_node_info(node_type="ControlNetApply")        # Verify custom nodes loaded

Verify Models

list_local_models(model_type="checkpoints")       # Installed checkpoints
list_local_models(model_type="loras")             # Installed LoRAs
list_local_models(model_type="controlnet")        # Installed ControlNets

Quick Reference: Error to Fix

Error Message (partial)	Most Likely Fix
`CUDA out of memory`	Reduce resolution, use FP8 model, `--lowvram`
`Expected all tensors to be on the same device`	Update custom node, restart ComfyUI
`Cannot find node class`	Install the node pack, restart ComfyUI
`Input contains NaN`	Lower CFG, use FP32 VAE, remove LoRAs
`expected scalar type Float but found Half`	Use FP32 VAE, or `--force-fp32`
`No such file or directory` (model)	Check filename, re-download model
`invalid header` (safetensors)	Re-download — file is corrupted
`CUDA error: no kernel image`	Reinstall PyTorch with matching CUDA version
Black images, no error	Check denoise > 0, cfg > 0, steps > 0, prompt not empty
Image looks garbled/noisy	Wrong model+VAE combo, wrong sampler settings
`Connection refused` on port 8188	ComfyUI not running, or using Desktop (port 8000)
`Prompt outputs failed validation`	Node inputs don't match schema — check `get_node_info`
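
The table above can also be applied programmatically, for example to annotate a traceback with a first suggestion. The sketch below is a plain substring lookup over a subset of the rows. The patterns are shortened so that both OOM message variants match, and `suggest_fix` is a hypothetical helper.

```python
# (substring to look for, suggested fix), taken from the table above.
FIXES = [
    ("out of memory", "Reduce resolution, use FP8 model, --lowvram"),
    ("expected all tensors to be on the same device",
     "Update custom node, restart ComfyUI"),
    ("cannot find node class", "Install the node pack, restart ComfyUI"),
    ("input contains nan", "Lower CFG, use FP32 VAE, remove LoRAs"),
    ("expected scalar type", "Use FP32 VAE, or --force-fp32"),
    ("no such file or directory", "Check filename, re-download model"),
    ("invalid header", "Re-download; file is corrupted"),
    ("no kernel image", "Reinstall PyTorch with matching CUDA version"),
    ("connection refused", "ComfyUI not running, or using Desktop (port 8000)"),
    ("prompt outputs failed validation",
     "Node inputs don't match schema; check get_node_info"),
]


def suggest_fix(error_text: str):
    """Return the first matching fix, or None if nothing matches."""
    lowered = error_text.lower()
    for pattern, fix in FIXES:
        if pattern in lowered:
            return fix
    return None


print(suggest_fix("RuntimeError: CUDA error: out of memory"))
# Reduce resolution, use FP8 model, --lowvram
print(suggest_fix("RuntimeError: expected scalar type Half but found Float"))
# Use FP32 VAE, or --force-fp32
```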