---
name: model-compatibility
description: Model family compatibility matrix covering loaders, resolutions, samplers, CFG, VAE, ControlNet, and LoRA compatibility for SD 1.5, SDXL, Flux, SD3, and video models
globs:
  - "**/*.json"
---

ComfyUI Model Compatibility Matrix

Stable Diffusion 1.5 (SD 1.5)

Overview

The original widely-adopted Stable Diffusion model. Huge ecosystem of fine-tunes, LoRAs, ControlNets, and embeddings. Still the most compatible and lightweight model family.

Configuration

| Parameter | Value |
| --- | --- |
| Loader | CheckpointLoaderSimple |
| Native Resolution | 512x512 |
| Supported Resolutions | 512x512, 512x768, 768x512, 768x768 (some fine-tunes) |
| VAE | Built-in or external (vae-ft-mse-840000-ema-pruned.safetensors) |
| CLIP | Single CLIP-L (output index 1 from checkpoint) |
| Text Encoder Node | CLIPTextEncode |
| CFG Range | 7-12 (typical: 7.5) |
| Negative Prompt | Yes, very important for quality |
| Steps | 20-30 (standard samplers) |
| Sampler | All standard samplers: euler, euler_ancestral, dpmpp_2m, dpmpp_sde, ddim |
| Scheduler | normal, karras |
| Denoise | 1.0 (txt2img), 0.5-0.8 (img2img) |
| VRAM (FP16) | ~2-3GB |

Workflow Pattern

code
CheckpointLoaderSimple → MODEL(0), CLIP(1), VAE(2)
  CLIP(1) → CLIPTextEncode (positive) → CONDITIONING
  CLIP(1) → CLIPTextEncode (negative) → CONDITIONING
EmptyLatentImage (width=512, height=512) → LATENT
KSampler (cfg=7.5, steps=20, sampler="euler", scheduler="normal") → LATENT
VAEDecode → IMAGE
SaveImage
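
The pattern above can be written out as an API-format workflow, the JSON shape ComfyUI uses when a workflow is exported for the API: each node has a `class_type` and `inputs`, and a link is a `[node_id, output_index]` pair. The sketch below is illustrative only; the node ids, the checkpoint filename, the prompts, and the `sd15_txt2img` helper are assumptions, not part of the original skill.

```python
import json

def sd15_txt2img(ckpt_name, positive, negative, seed=0):
    """Build an API-format graph for the SD 1.5 txt2img pattern (hypothetical helper)."""
    return {
        # Outputs: MODEL(0), CLIP(1), VAE(2)
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 512, "height": 512, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 20, "cfg": 7.5,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "sd15"}},
    }

# The checkpoint filename is a placeholder; use a file from models/checkpoints.
workflow = sd15_txt2img("my_sd15_model.safetensors",
                        "a lighthouse at dusk", "blurry, low quality")
print(json.dumps(workflow["5"], indent=2))
```

Swapping in an external VAE would mean adding a VAELoader node and pointing the `vae` input of VAEDecode at it instead of `["1", 2]`.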

VAE Notes

  • Most SD 1.5 checkpoints have a built-in VAE, but it's often mediocre
  • Recommended: Use external vae-ft-mse-840000-ema-pruned.safetensors for better color accuracy
  • Load via VAELoader node and connect to VAEDecode
  • FP16 VAE can produce NaN on some images — FP32 VAE is more stable

ControlNet Compatibility

SD 1.5 has the largest ControlNet ecosystem:

| ControlNet | Model File Pattern | Notes |
| --- | --- | --- |
| Canny | control_v11p_sd15_canny | Edge detection |
| Depth | control_v11f1p_sd15_depth | Depth map |
| OpenPose | control_v11p_sd15_openpose | Skeleton/pose |
| Scribble | control_v11p_sd15_scribble | Hand-drawn lines |
| Lineart | control_v11p_sd15_lineart | Clean lines |
| Softedge | control_v11p_sd15_softedge | Soft edges (HED) |
| Normal | control_v11p_sd15_normalbae | Normal maps |
| Seg | control_v11p_sd15_seg | Semantic segmentation |
| Tile | control_v11f1e_sd15_tile | Tile/upscale guidance |
| Inpaint | control_v11p_sd15_inpaint | Inpainting guidance |
| IP-Adapter | ip-adapter_sd15 | Image prompt |

LoRA Compatibility

  • SD 1.5 LoRAs ONLY work with SD 1.5 base models
  • Format: .safetensors in models/loras/
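  • Loader: LoraLoader node, placed right after the checkpoint loader. It takes the checkpoint's MODEL and CLIP and passes patched MODEL and CLIP on to KSampler and CLIPTextEncode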
  • Strength range: 0.5-1.0 (higher can cause artifacts)
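
As a rough illustration of where LoraLoader sits, the sketch below splices one into an API-format graph by re-pointing every consumer of the checkpoint's MODEL (output 0) and CLIP (output 1) at the LoRA node. The `insert_lora` helper, the node ids, and the filenames are hypothetical; only the LoraLoader input names (`lora_name`, `strength_model`, `strength_clip`, `model`, `clip`) come from the node itself.

```python
import copy

def insert_lora(graph, lora_name, strength=0.8, ckpt_id="1", lora_id="lora"):
    """Return a copy of the graph with a LoraLoader between checkpoint and consumers."""
    patched = copy.deepcopy(graph)
    # Re-point consumers of MODEL(0) and CLIP(1); VAE(2) links are left alone.
    for node in patched.values():
        inputs = node["inputs"]
        for key, value in inputs.items():
            if value in ([ckpt_id, 0], [ckpt_id, 1]):
                inputs[key] = [lora_id, value[1]]
    # Added after the loop so the LoRA node's own inputs still read from the checkpoint.
    patched[lora_id] = {
        "class_type": "LoraLoader",
        "inputs": {"lora_name": lora_name,
                   "strength_model": strength, "strength_clip": strength,
                   "model": [ckpt_id, 0], "clip": [ckpt_id, 1]},
    }
    return patched

# Trimmed-down graph: only the inputs relevant to the rewiring are shown.
graph = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "my_sd15_model.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a portrait", "clip": ["1", 1]}},
    "3": {"class_type": "KSampler", "inputs": {"model": ["1", 0]}},
    "4": {"class_type": "VAEDecode", "inputs": {"vae": ["1", 2]}},
}
patched = insert_lora(graph, "my_style_lora.safetensors", strength=0.8)
print(patched["3"]["inputs"]["model"])
```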

SDXL (Stable Diffusion XL)

Overview

Major upgrade from SD 1.5 with dual CLIP encoders, higher native resolution, and better prompt understanding. Includes Turbo and Lightning variants for fast generation.

Configuration — SDXL 1.0 (Base)

| Parameter | Value |
| --- | --- |
| Loader | CheckpointLoaderSimple |
| Native Resolution | 1024x1024 |
| Supported Resolutions | 1024x1024, 832x1216, 1216x832, 896x1152, 1152x896, 768x1344, 1344x768 |
| VAE | Built-in (SDXL has good integrated VAE) |
| CLIP | Dual CLIP: CLIP-L + CLIP-G |
| Text Encoder Node | CLIPTextEncode (unified) or CLIPTextEncodeSDXL (separate G/L) |
| CFG Range | 5-10 (typical: 7.0) |
| Negative Prompt | Yes, moderately important |
| Steps | 20-40 |
| Sampler | euler, euler_ancestral, dpmpp_2m, dpmpp_sde |
| Scheduler | normal, karras |
| Denoise | 1.0 (txt2img), 0.5-0.8 (img2img) |
| VRAM (FP16) | ~6-7GB |
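
One way to use the supported-resolution list is to snap an arbitrary target size to the closest listed aspect ratio. This is a minimal sketch, assuming "closest" means the smallest difference in log aspect ratio; the `nearest_sdxl_resolution` helper is hypothetical.

```python
import math

# Resolutions copied from the SDXL configuration table.
SDXL_RESOLUTIONS = [
    (1024, 1024), (832, 1216), (1216, 832),
    (896, 1152), (1152, 896), (768, 1344), (1344, 768),
]

def nearest_sdxl_resolution(width, height):
    """Pick the listed SDXL resolution whose aspect ratio is closest to width:height."""
    target = width / height
    # Comparing log ratios treats "too wide" and "too tall" symmetrically.
    return min(SDXL_RESOLUTIONS,
               key=lambda wh: abs(math.log((wh[0] / wh[1]) / target)))

print(nearest_sdxl_resolution(1920, 1080))
print(nearest_sdxl_resolution(2, 3))
```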

Configuration — SDXL Turbo

| Parameter | Value |
| --- | --- |
| Loader | CheckpointLoaderSimple |
| Resolution | 512x512 (optimized for lower res) |
| CFG | 1.0-2.0 |
| Steps | 1-4 |
| Sampler | euler_ancestral |
| Scheduler | normal |
| Negative Prompt | Minimal or empty |
| Denoise | 1.0 |

Configuration — SDXL Lightning

| Parameter | Value |
| --- | --- |
| Loader | CheckpointLoaderSimple + LoraLoader (Lightning LoRA) |
| Resolution | 1024x1024 |
| CFG | 1.0-2.0 |
| Steps | 2, 4, or 8 (match the Lightning variant: 2-step, 4-step, 8-step) |
| Sampler | euler |
| Scheduler | sgm_uniform |
| Negative Prompt | Empty or minimal |
| Special | Requires the Lightning LoRA that matches the step count |

SDXL Refiner

The optional SDXL refiner model does a second pass to improve fine details:

code
CheckpointLoaderSimple (base) → KSamplerAdvanced (steps=25, start_at_step=0, end_at_step=20, return_with_leftover_noise=enable)
CheckpointLoaderSimple (refiner) → KSamplerAdvanced (steps=25, start_at_step=20, end_at_step=25, add_noise=disable)

  • Both passes use KSamplerAdvanced with start_at_step and end_at_step; the LATENT output of the base pass feeds the refiner pass
  • Typically run the base for 80% of steps, refiner for the last 20%
  • Refiner checkpoint: sd_xl_refiner_1.0.safetensors
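
The 80/20 split is simple arithmetic on the step count. The sketch below computes the `start_at_step` and `end_at_step` values for both passes; the `refiner_split` helper and its `base_fraction` parameter are hypothetical names used for illustration.

```python
def refiner_split(total_steps, base_fraction=0.8):
    """Return start/end steps for the base and refiner KSamplerAdvanced passes."""
    if total_steps < 2:
        raise ValueError("need at least 2 steps to split between base and refiner")
    if not 0.0 < base_fraction < 1.0:
        raise ValueError("base_fraction must be strictly between 0 and 1")
    # Step at which sampling hands over from base to refiner,
    # clamped so that each model gets at least one step.
    switch = max(1, min(total_steps - 1, round(total_steps * base_fraction)))
    return {
        "base": {"steps": total_steps, "start_at_step": 0, "end_at_step": switch},
        "refiner": {"steps": total_steps, "start_at_step": switch,
                    "end_at_step": total_steps},
    }

split = refiner_split(25)
print(split["base"], split["refiner"])
```

Both passes keep the same total `steps` value so that they share one noise schedule; only the start and end positions differ.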

Workflow Pattern

code
CheckpointLoaderSimple → MODEL(0), CLIP(1), VAE(2)
  CLIP(1) → CLIPTextEncode (positive) → CONDITIONING
  CLIP(1) → CLIPTextEncode (negative) → CONDITIONING
EmptyLatentImage (width=1024, height=1024) → LATENT
KSampler (cfg=7.0, steps=25, sampler="dpmpp_2m", scheduler="karras") → LATENT
VAEDecode → IMAGE
SaveImage

ControlNet Compatibility

SDXL ControlNets are separate from SD 1.5 ControlNets:

| ControlNet | Model File Pattern | Notes |
| --- | --- | --- |
| Canny | control-lora-canny-rank256 or diffusers_xl_canny | Often LoRA-based |
| Depth | control-lora-depth-rank256 or diffusers_xl_depth | |
| T2I-Adapter | t2i-adapter-*-sdxl | Lighter alternative to ControlNet |
| IP-Adapter | ip-adapter_sdxl | Image prompt adapter |
| InstantID | instantid-* | Face-specific |

LoRA Compatibility

  • SDXL LoRAs ONLY work with SDXL base models — NOT with SD 1.5
  • Same LoraLoader node as SD 1.5
  • Lightning LoRAs are SDXL LoRAs that enable few-step generation

Flux (Flux.1)

Overview

Black Forest Labs' model with a T5-XXL text encoder. Produces high-quality images without negative prompts. Available in schnell (fast) and dev (quality) variants.

Configuration — Flux Schnell

| Parameter | Value |
| --- | --- |
| Loader | CheckpointLoaderSimple (single-file) or DualCLIPLoader + UNETLoader + VAELoader (split) |
| Native Resolution | 1024x1024 (flexible aspect ratios) |
| Supported Resolutions | Flexible: 512x512 to 2048x2048, any aspect ratio |
| VAE | Separate Flux VAE (ae.safetensors), NOT shared with SD models |
| CLIP | T5-XXL + CLIP-L via DualCLIPLoader |
| Text Encoder Node | CLIPTextEncode (single combined) |
| CFG | 1.0 (MUST be 1.0; higher values cause severe artifacts) |
| Negative Prompt | NONE (ignored at CFG 1.0; feed an empty prompt if the sampler node requires the input) |
| Steps | 4 |
| Sampler | euler |
| Scheduler | simple or sgm_uniform |
| Denoise | 1.0 |
| VRAM (FP16) | ~24GB (FP8: ~12GB) |

Configuration — Flux Dev

Same as Schnell except:

| Parameter | Value |
| --- | --- |
| Steps | 20-50 (typical: 30) |
| Scheduler | sgm_uniform |
| VRAM (FP16) | ~24GB (FP8: ~12GB) |
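
The ~24GB and ~12GB figures line up with weights-only arithmetic. The sketch below assumes roughly 12 billion parameters for the Flux.1 diffusion model (a figure not stated in this document) and counts weights only: text encoders, the VAE, and activations need additional memory. The `weight_gb` helper is hypothetical.

```python
def weight_gb(params_billion, bytes_per_param):
    """Weights-only memory in decimal GB: parameter count times bytes per parameter."""
    return params_billion * 1e9 * bytes_per_param / 1e9

FLUX_PARAMS_BILLION = 12  # assumption: approximate Flux.1 parameter count

print(weight_gb(FLUX_PARAMS_BILLION, 2))  # 16-bit weights use 2 bytes per parameter
print(weight_gb(FLUX_PARAMS_BILLION, 1))  # FP8 weights use 1 byte per parameter
```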

Loading Methods

Method 1: Single Checkpoint (simplest; needs an all-in-one checkpoint file that bundles the text encoders and VAE)

code
CheckpointLoaderSimple (ckpt_name="flux1-schnell.safetensors")
  → MODEL(0), CLIP(1), VAE(2)

Method 2: Split Components (recommended for FP8)

code
UNETLoader (unet_name="flux1-schnell-fp8.safetensors") → MODEL
DualCLIPLoader (clip_name1="t5xxl_fp16.safetensors", clip_name2="clip_l.safetensors", type="flux") → CLIP
VAELoader (vae_name="ae.safetensors") → VAE

CRITICAL Rules

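  • CFG MUST be 1.0: the Flux models are distilled, so guidance is handled inside the model rather than by classifier-free guidance in the sampler. For Flux Dev, guidance strength is set with the FluxGuidance conditioning node
  • No negative prompt: at CFG 1.0 the negative conditioning has no effect. KSampler still declares a negative input, so feed it an empty-prompt CLIPTextEncode rather than a real negative prompt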
  • Separate VAE required — Flux uses its own VAE (ae.safetensors), not SD VAEs
  • FP8 strongly recommended for 24GB cards — FP16 Flux barely fits in 24GB VRAM
  • T5-XXL encoder can be loaded in FP8 to save additional VRAM
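
The CFG rule lends itself to an automated check over a workflow JSON file. The following is a minimal sketch under stated assumptions: the workflow is in API format, and a graph is treated as Flux when it contains a DualCLIPLoader with `type="flux"`, which is a heuristic that misses single-checkpoint Flux workflows. The `check_flux_rules` helper is hypothetical.

```python
def check_flux_rules(graph):
    """Return warnings for an API-format graph that looks like a Flux workflow."""
    # Heuristic detection: split-loaded Flux uses DualCLIPLoader with type="flux".
    is_flux = any(
        node["class_type"] == "DualCLIPLoader"
        and node["inputs"].get("type") == "flux"
        for node in graph.values()
    )
    warnings = []
    if not is_flux:
        return warnings
    for node_id, node in graph.items():
        if node["class_type"] in ("KSampler", "KSamplerAdvanced"):
            cfg = node["inputs"].get("cfg")
            if cfg != 1.0:
                warnings.append(f"node {node_id}: cfg={cfg}, expected 1.0 for Flux")
    return warnings

# Trimmed-down example graph with a wrong CFG value.
bad_graph = {
    "1": {"class_type": "DualCLIPLoader",
          "inputs": {"clip_name1": "t5xxl_fp16.safetensors",
                     "clip_name2": "clip_l.safetensors", "type": "flux"}},
    "2": {"class_type": "KSampler", "inputs": {"cfg": 7.0, "steps": 4}},
}
print(check_flux_rules(bad_graph))
```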

Workflow Pattern

code
UNETLoader (flux fp8) → MODEL
DualCLIPLoader (t5xxl + clip_l, type="flux") → CLIP
VAELoader (ae.safetensors) → VAE

CLIPTextEncode (positive prompt) → CONDITIONING
  (no negative prompt text needed; an empty CLIPTextEncode can feed KSampler's negative input)

EmptyLatentImage (width=1024, height=1024) → LATENT

KSampler (cfg=1.0, steps=4, sampler="euler", scheduler="simple") → LATENT
VAEDecode (vae from VAELoader) → IMAGE
SaveImage

ControlNet Compatibility

Flux ControlNets are model-specific:

| ControlNet | Notes |
| --- | --- |
| Flux ControlNet (Canny) | Specific Flux-compatible ControlNet |
| Flux ControlNet (Depth) | Specific Flux-compatible ControlNet |
| InstantX ControlNets | Community Flux ControlNets |
| Flux IP-Adapter | Image prompt for Flux |

SD 1.5 and SDXL ControlNets do NOT work with Flux.

LoRA Compatibility

  • Flux LoRAs ONLY work with Flux models
  • Typically loaded via LoraLoader, the same as for SD models
  • Flux LoRA ecosystem is smaller than SD 1.5/SDXL but growing
  • Some Flux LoRAs require specific trigger words

Stable Diffusion 3 / 3.5 (SD3)

Overview

Stability AI's next-generation model with three text encoders (CLIP-L, CLIP-G, and T5-XXL). Better prompt adherence and longer prompt support via T5-XXL.

Configuration

| Parameter | Value |
| --- | --- |
| Loader | CheckpointLoaderSimple, or split loading with TripleCLIPLoader for the text encoders |
| Native Resolution | 1024x1024 |
| VAE | Built-in (integrated) |
| CLIP | Triple: CLIP-L + CLIP-G + T5-XXL |
| Text Encoder Node | CLIPTextEncode or CLIPTextEncodeSD3 |
| CFG Range | 4-7 (typical: 5.0) |
| Negative Prompt | Minimal (SD3 needs very little negative guidance) |
| Steps | 20-30 |
| Sampler | euler, dpmpp_2m |
| Scheduler | sgm_uniform, normal |
| Denoise | 1.0 (txt2img) |
| Shift | Set with the shift parameter of the ModelSamplingSD3 node |
| VRAM (FP16) | ~12GB (without T5-XXL: ~6GB) |

Triple CLIP Loading

code
CheckpointLoaderSimple → MODEL(0), CLIP(1), VAE(2)

Or load the text encoders separately:

code
TripleCLIPLoader (clip_l + clip_g + t5xxl) → CLIP
DualCLIPLoader (clip_l + clip_g, type="sd3") → CLIP   (alternative without T5-XXL, lower VRAM)

Key Differences from SD 1.5/SDXL

  • Much better text rendering capabilities
  • Handles spatial relationships better ("cat on the left, dog on the right")
  • T5-XXL enables very long, detailed prompts (no 77-token limit concern)
  • Lower CFG values (4-7 vs 7-12)
  • Minimal negative prompting needed
  • The shift parameter (set on the ModelSamplingSD3 node) adjusts the noise schedule

ControlNet Compatibility

  • SD3-specific ControlNets are limited
  • Check for SD3-compatible community ControlNets
  • SD 1.5 and SDXL ControlNets do NOT work with SD3

LTXV (Video Models)

Overview

Latent video diffusion models for text-to-video and image-to-video generation. Very VRAM-intensive.

Configuration

| Parameter | Value |
| --- | --- |
| Loader | Special video checkpoint loader (varies by node pack) |
| Resolution | 512x512 or 768x768 per frame (depends on model) |
| Frames | 16-64 (depends on VRAM) |
| FPS | 8-24 |
| VRAM | 20GB+ FP16, ~6-10GB FP8 |
| Key Warning | Can OOM on 24GB VRAM; always use FP8 quantized models |

VRAM Management

  • Always use FP8 quantized models on 24GB cards
  • Reduce frame count if OOM persists
  • Lower resolution helps significantly
  • Close other GPU-using applications
  • Consider --lowvram flag for ComfyUI

Cross-Family Compatibility Rules

LoRA Compatibility

LoRAs are model-family specific and are NOT interchangeable:

| LoRA Trained For | Works With | Does NOT Work With |
| --- | --- | --- |
| SD 1.5 | SD 1.5 and its fine-tunes | SDXL, Flux, SD3 |
| SDXL | SDXL and its fine-tunes | SD 1.5, Flux, SD3 |
| Flux | Flux models only | SD 1.5, SDXL, SD3 |
| SD3 | SD3/3.5 models only | SD 1.5, SDXL, Flux |

Using a LoRA with the wrong base model will produce garbage images or errors.
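
Since the rule is "same family or nothing", a pre-flight check is easy to sketch. The example below assumes each LoRA has already been tagged with its family (for instance by the folder it is stored in); the family labels, the filenames, and the `incompatible_loras` helper are hypothetical.

```python
FAMILIES = ("sd15", "sdxl", "flux", "sd3")

def incompatible_loras(base_family, loras):
    """Return the names of LoRAs whose family differs from the base model's family.

    `loras` maps a LoRA filename to the family it was trained for.
    """
    if base_family not in FAMILIES:
        raise ValueError(f"unknown model family: {base_family}")
    return sorted(name for name, family in loras.items() if family != base_family)

loras = {
    "style_a.safetensors": "sd15",
    "style_b.safetensors": "sdxl",
    "lightning_4step.safetensors": "sdxl",
}
print(incompatible_loras("sdxl", loras))
```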

ControlNet Compatibility

ControlNets are also model-family specific:

| ControlNet Trained For | Works With | Does NOT Work With |
| --- | --- | --- |
| SD 1.5 (v1.1 series) | SD 1.5 base + fine-tunes | SDXL, Flux, SD3 |
| SDXL | SDXL base + fine-tunes | SD 1.5, Flux, SD3 |
| Flux | Flux models only | SD 1.5, SDXL, SD3 |
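
ControlNet filenames often hint at the family. The sketch below guesses the family from the file patterns listed in this document (`sd15`, `_xl_`, `sdxl`, `control-lora-`) plus the assumption that Flux ControlNets carry "flux" in their name. It is a naming heuristic, not a reliable detector, and the `guess_controlnet_family` helper is hypothetical.

```python
def guess_controlnet_family(filename):
    """Guess the model family of a ControlNet file from its name, or return None."""
    name = filename.lower()
    if "flux" in name:  # assumption: Flux ControlNets include "flux" in the filename
        return "flux"
    if "sd15" in name:  # e.g. control_v11p_sd15_canny
        return "sd15"
    # e.g. diffusers_xl_canny, t2i-adapter-*-sdxl, control-lora-canny-rank256
    if "sdxl" in name or "_xl_" in name or name.startswith("control-lora-"):
        return "sdxl"
    return None

for f in ("control_v11p_sd15_canny.safetensors",
          "control-lora-depth-rank256.safetensors",
          "unknown_model.safetensors"):
    print(f, "->", guess_controlnet_family(f))
```

When the guess is `None`, the safer course is to check the model card rather than try the file against an arbitrary base model.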

VAE Compatibility

| VAE | Compatible Models | Notes |
| --- | --- | --- |
| vae-ft-mse-840000-ema-pruned | SD 1.5 family | Best external VAE for SD 1.5 |
| SDXL built-in VAE | SDXL family | Good quality, no external needed |
| sdxl_vae.safetensors | SDXL family | External SDXL VAE option |
| ae.safetensors (Flux VAE) | Flux only | Required for Flux, incompatible with SD |
| SD3 built-in VAE | SD3 family | Integrated, no external needed |

Rule: Never mix VAEs across model families. An SD 1.5 VAE decoding Flux latents will produce garbage.

Embedding/Textual Inversion Compatibility

| Embedding Type | Compatible Models |
| --- | --- |
| SD 1.5 embeddings | SD 1.5 family only |
| SDXL embeddings | SDXL family only |
| Flux/SD3 | Generally don't use traditional embeddings |

Sampler/Scheduler Compatibility

Most samplers work across all models, but some combinations are optimal:

| Model | Best Sampler | Best Scheduler | Notes |
| --- | --- | --- | --- |
| SD 1.5 | euler_ancestral, dpmpp_2m | karras, normal | All standard samplers work |
| SDXL | dpmpp_2m, euler | karras, normal | Same as SD 1.5 |
| SDXL Turbo | euler_ancestral | normal | Must use 1-4 steps |
| SDXL Lightning | euler | sgm_uniform | Must match step count to LoRA |
| Flux Schnell | euler | simple | 4 steps only |
| Flux Dev | euler | sgm_uniform | 20-50 steps |
| SD3 | euler, dpmpp_2m | sgm_uniform, normal | Lower CFG needed |
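
The table can be folded into a small presets lookup that returns KSampler settings per model. This is a sketch, not a canonical mapping: the preset keys are made-up labels, the `ksampler_inputs` helper is hypothetical, and where the document gives a range (for example SD3 steps of 20-30) one value inside that range was picked.

```python
# Sampler and scheduler come from the table above; steps and cfg come from
# the per-model configuration sections of this document.
SAMPLER_PRESETS = {
    "sd15": {"sampler_name": "dpmpp_2m", "scheduler": "karras", "steps": 20, "cfg": 7.5},
    "sdxl": {"sampler_name": "dpmpp_2m", "scheduler": "karras", "steps": 25, "cfg": 7.0},
    "sdxl_turbo": {"sampler_name": "euler_ancestral", "scheduler": "normal", "steps": 4, "cfg": 1.0},
    "sdxl_lightning": {"sampler_name": "euler", "scheduler": "sgm_uniform", "steps": 4, "cfg": 1.0},
    "flux_schnell": {"sampler_name": "euler", "scheduler": "simple", "steps": 4, "cfg": 1.0},
    "flux_dev": {"sampler_name": "euler", "scheduler": "sgm_uniform", "steps": 30, "cfg": 1.0},
    "sd3": {"sampler_name": "euler", "scheduler": "sgm_uniform", "steps": 25, "cfg": 5.0},
}

def ksampler_inputs(model_key, **overrides):
    """Return a copy of the preset for model_key with any overrides applied."""
    if model_key not in SAMPLER_PRESETS:
        raise ValueError(f"no preset for model: {model_key}")
    settings = dict(SAMPLER_PRESETS[model_key])
    settings.update(overrides)
    return settings

print(ksampler_inputs("flux_dev", steps=20))
```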

Quick Decision Guide

Choosing a Model

| Use Case | Recommended Model | Why |
| --- | --- | --- |
| Maximum ecosystem/community support | SD 1.5 | Most LoRAs, ControlNets, embeddings |
| High quality, good prompt following | SDXL | Best balance of quality and ecosystem |
| Fastest generation | SDXL Turbo/Lightning | 1-8 steps |
| Best prompt understanding | Flux Dev | T5-XXL encoder, natural language |
| Fast + good quality | Flux Schnell | 4 steps, no negative needed |
| Text in images | SD3.5 | Best text rendering |
| Low VRAM (<6GB) | SD 1.5 | Smallest memory footprint |
| Video generation | LTXV / AnimateDiff | Dedicated video pipelines |

Choosing Resolution

| Model | Minimum | Recommended | Maximum (before OOM on 24GB) |
| --- | --- | --- | --- |
| SD 1.5 | 256x256 | 512x512 | 768x768 |
| SDXL | 512x512 | 1024x1024 | 1536x1536 |
| Flux (FP8) | 512x512 | 1024x1024 | 2048x2048 |
| Flux (FP16) | 512x512 | 1024x1024 | 1024x1024 (tight) |
| SD3 | 512x512 | 1024x1024 | 1536x1536 |

Going below the recommended resolution produces blurry/low-quality results. Going above the maximum risks OOM errors or quality degradation (tiling artifacts).
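
The table can be turned into a rough sanity check on a requested size. The sketch below compares total pixel counts rather than exact dimensions, because the table lists only square sizes, and it allows a 10% margin below the recommended area so that non-square sizes such as 832x1216 on SDXL are not flagged. Both choices, the model keys, and the `classify_resolution` helper are assumptions made for illustration.

```python
# (minimum, recommended, maximum) sizes copied from the table above.
RESOLUTION_LIMITS = {
    "sd15": ((256, 256), (512, 512), (768, 768)),
    "sdxl": ((512, 512), (1024, 1024), (1536, 1536)),
    "flux_fp8": ((512, 512), (1024, 1024), (2048, 2048)),
    "flux_fp16": ((512, 512), (1024, 1024), (1024, 1024)),
    "sd3": ((512, 512), (1024, 1024), (1536, 1536)),
}

def classify_resolution(model, width, height, tolerance=0.9):
    """Classify a size as below_minimum, below_recommended, above_maximum, or ok."""
    minimum, recommended, maximum = RESOLUTION_LIMITS[model]
    pixels = width * height
    if pixels < minimum[0] * minimum[1]:
        return "below_minimum"
    if pixels > maximum[0] * maximum[1]:
        return "above_maximum"
    # Tolerance keeps slightly smaller non-square sizes in the "ok" band.
    if pixels < tolerance * recommended[0] * recommended[1]:
        return "below_recommended"
    return "ok"

print(classify_resolution("sdxl", 832, 1216))
print(classify_resolution("sd15", 1024, 1024))
```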