Why ComfyUI?

ComfyUI is the power user's choice for AI image generation. Unlike simpler interfaces, its node-based workflow gives you direct control over each stage of the generation pipeline.

💡 What you'll learn

How to set up ComfyUI on a cloud GPU, create advanced workflows with ControlNet and IP-Adapter, and automate generation via API.

Setup on GPUBrazil

Spin up an RTX 4090 instance on GPUBrazil, then:

# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install PyTorch with CUDA
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install ComfyUI dependencies
pip install -r requirements.txt

# Run ComfyUI
python main.py --listen 0.0.0.0 --port 8188

Access via http://your-server-ip:8188
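
Before launching, it can be worth confirming that the PyTorch build installed above actually sees the GPU. The following is a minimal sketch; `check_environment` is a hypothetical helper name, and it only reports what it finds rather than fixing anything.

```python
import importlib.util


def check_environment():
    """Report whether PyTorch is installed and whether it can see a CUDA device."""
    report = {"torch_installed": False, "cuda_available": False, "gpu_name": None}
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment

    import torch  # imported lazily so the check also runs without PyTorch

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    print(check_environment())
```

If `cuda_available` comes back `False` on a GPU instance, the usual suspect is a CPU-only PyTorch wheel or a driver mismatch.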

Installing Models

# SDXL base model
cd models/checkpoints
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

# SDXL VAE (better colors)
cd ../vae
wget https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors

# ControlNet models
cd ../controlnet
wget https://huggingface.co/lllyasviel/sd_control_collection/resolve/main/diffusers_xl_canny_full.safetensors
wget https://huggingface.co/lllyasviel/sd_control_collection/resolve/main/diffusers_xl_depth_full.safetensors
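
The commands above only work if each file lands in the matching subfolder of `models`. As a rough sketch, the check below lists which of those files are still missing; `MODEL_SOURCES`, `plan_downloads`, and `missing_models` are names invented for this example, and the ComfyUI root path is whatever you cloned into.

```python
from pathlib import Path
from urllib.parse import urlparse

# (URL, subfolder under ComfyUI/models) pairs copied from the wget commands above
MODEL_SOURCES = [
    ("https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors", "checkpoints"),
    ("https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors", "vae"),
    ("https://huggingface.co/lllyasviel/sd_control_collection/resolve/main/diffusers_xl_canny_full.safetensors", "controlnet"),
    ("https://huggingface.co/lllyasviel/sd_control_collection/resolve/main/diffusers_xl_depth_full.safetensors", "controlnet"),
]


def plan_downloads(comfy_root):
    """Return (url, destination_path) pairs for every model file."""
    root = Path(comfy_root)
    plan = []
    for url, subfolder in MODEL_SOURCES:
        filename = Path(urlparse(url).path).name
        plan.append((url, root / "models" / subfolder / filename))
    return plan


def missing_models(comfy_root):
    """Return destination paths that do not exist on disk yet."""
    return [dest for _, dest in plan_downloads(comfy_root) if not dest.exists()]


if __name__ == "__main__":
    for path in missing_models("ComfyUI"):
        print(f"missing: {path}")
```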

Essential Custom Nodes

# Install ComfyUI Manager (essential!)
# Run from the ComfyUI root directory
cd custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager.git

# IP-Adapter
git clone https://github.com/cubiq/ComfyUI_IPAdapter_plus.git

# ControlNet Preprocessors
git clone https://github.com/Fannovel16/comfyui_controlnet_aux.git

# AnimateDiff
git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved.git

# Restart ComfyUI to load nodes
# Then use ComfyUI Manager to install dependencies

Basic SDXL Workflow

The fundamental workflow connects these nodes:

  1. Load Checkpoint: Load SDXL model
  2. CLIP Text Encode: Convert prompt to embeddings
  3. Empty Latent Image: Create blank canvas
  4. KSampler: Run diffusion process
  5. VAE Decode: Convert latent to image
  6. Save Image: Output result
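
// Workflow in API format (save as workflow.json without this comment line)
// Links are written as ["source node ID", output index]
{
  "1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
  "2": {"class_type": "CLIPTextEncode", "inputs": {"text": "your prompt here", "clip": ["1", 1]}},
  "3": {"class_type": "CLIPTextEncode", "inputs": {"text": "ugly, blurry", "clip": ["1", 1]}},
  "4": {"class_type": "EmptyLatentImage", "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
  "5": {"class_type": "KSampler", "inputs": {
    "model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0], "latent_image": ["4", 0],
    "seed": 0, "steps": 25, "cfg": 7.5, "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
  "6": {"class_type": "VAEDecode", "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
  "7": {"class_type": "SaveImage", "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}}
}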
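
When a workflow is exported in API format, my understanding is that each node is keyed by its ID and every connection is stored as `[source_node_id, output_index]`. Hand-edited workflows break easily when a link points at a node that no longer exists, so a small sanity check like the sketch below can help. `find_dangling_links` is a hypothetical helper, and the link shape it tests for is an assumption about that format.

```python
def find_dangling_links(workflow):
    """Return (node_id, input_name, missing_source_id) for links to unknown nodes."""
    problems = []
    for node_id, node in workflow.items():
        for name, value in node.get("inputs", {}).items():
            # Assumed link shape: a two-item list [source_node_id, output_index]
            is_link = (
                isinstance(value, list)
                and len(value) == 2
                and isinstance(value[0], str)
                and isinstance(value[1], int)
            )
            if is_link and value[0] not in workflow:
                problems.append((node_id, name, value[0]))
    return problems


example = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "your prompt here", "clip": ["1", 1]}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},  # node "5" is missing
}
print(find_dangling_links(example))  # [('6', 'samples', '5')]
```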

ControlNet for Precise Control

ControlNet lets you guide generation with reference images:

Canny Edge Detection

Preserves edges/outlines from a reference image:

# Add these nodes:
1. Load Image (reference)
2. Canny Edge Detector (preprocessor)
3. Load ControlNet Model
4. Apply ControlNet
5. Connect to KSampler
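
The five steps above can be sketched as an edit to an API-format workflow dict. Treat this as an illustration under assumptions: the class names (`LoadImage`, `Canny`, `ControlNetLoader`, `ControlNetApply`) and their input names are what I believe ComfyUI's built-in nodes use, the built-in `Canny` node stands in for the preprocessor from the custom node pack, the threshold values are placeholders, and node IDs are assumed to be numeric strings. Compare against an export from your own install before relying on it. `add_canny_controlnet` is a hypothetical helper name.

```python
import copy


def add_canny_controlnet(workflow, image_name, positive_id, sampler_id,
                         controlnet_name="diffusers_xl_canny_full.safetensors",
                         strength=0.8):
    """Return a copy of `workflow` with a Canny ControlNet wired into the sampler."""
    wf = copy.deepcopy(workflow)
    next_id = max(int(k) for k in wf) + 1  # assumes numeric string IDs
    load_id, canny_id, cn_id, apply_id = (str(next_id + i) for i in range(4))

    # 1. Load the reference image (must already be in ComfyUI's input folder)
    wf[load_id] = {"class_type": "LoadImage", "inputs": {"image": image_name}}
    # 2. Extract edges from it
    wf[canny_id] = {"class_type": "Canny",
                    "inputs": {"image": [load_id, 0],
                               "low_threshold": 0.4, "high_threshold": 0.8}}
    # 3. Load the ControlNet model
    wf[cn_id] = {"class_type": "ControlNetLoader",
                 "inputs": {"control_net_name": controlnet_name}}
    # 4. Apply it to the positive conditioning
    wf[apply_id] = {"class_type": "ControlNetApply",
                    "inputs": {"conditioning": [positive_id, 0],
                               "control_net": [cn_id, 0],
                               "image": [canny_id, 0],
                               "strength": strength}}
    # 5. Feed the controlled conditioning into the sampler
    wf[sampler_id]["inputs"]["positive"] = [apply_id, 0]
    return wf
```

The function copies the workflow rather than editing it in place, so the same base graph can be reused for runs with and without ControlNet.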

Depth Control

Maintains 3D structure and perspective:

# Nodes needed:
1. Load Image
2. Depth Estimator (MiDaS or Zoe)
3. Load ControlNet (depth model)
4. Apply ControlNet

OpenPose for Characters

Control character poses precisely:

# Pose workflow:
1. Load reference image with pose
2. OpenPose Detector
3. Load ControlNet (openpose model)
4. Apply with strength 0.8-1.0

💡 ControlNet Tips

Start with strength 0.7-0.8. Higher values follow the control more strictly but may reduce creativity. Combine multiple ControlNets for complex control.
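
Combining several ControlNets amounts to chaining them: each one takes the conditioning produced by the previous one. Below is a minimal sketch for API-format workflows, assuming the loader and control-image nodes already exist in the graph, that `ControlNetApply` is the right class name with these input names, and that node IDs are numeric strings. `chain_controlnets` is a hypothetical helper.

```python
import copy


def chain_controlnets(workflow, positive_id, sampler_id, controls):
    """Chain ControlNets onto the positive conditioning.

    `controls` is a list of (controlnet_loader_id, image_node_id, strength).
    """
    wf = copy.deepcopy(workflow)
    next_id = max(int(k) for k in wf) + 1  # assumes numeric string IDs
    conditioning = [positive_id, 0]
    for loader_id, image_id, strength in controls:
        apply_id = str(next_id)
        next_id += 1
        wf[apply_id] = {
            "class_type": "ControlNetApply",
            "inputs": {
                "conditioning": conditioning,  # output of the previous link in the chain
                "control_net": [loader_id, 0],
                "image": [image_id, 0],
                "strength": strength,
            },
        }
        conditioning = [apply_id, 0]
    # The sampler receives the conditioning from the last ControlNet in the chain
    wf[sampler_id]["inputs"]["positive"] = conditioning
    return wf
```

For example, `controls=[("10", "11", 0.8), ("12", "13", 0.7)]` would apply a first ControlNet at strength 0.8 and a second at 0.7, where the IDs refer to loader and image nodes in your own graph.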

IP-Adapter for Style Transfer

IP-Adapter transfers style from reference images without training:

# IP-Adapter workflow:
1. Load SDXL checkpoint
2. Load IP-Adapter model
3. Load CLIP Vision model
4. Load reference style image
5. IPAdapter Apply node
   - weight: 0.7 (style influence)
   - noise: 0.3 (variation)
6. Connect to KSampler

# Great for:
- Consistent character generation
- Style transfer
- Brand consistency

Face-Specific IP-Adapter

# For consistent faces:
1. Use IP-Adapter FaceID model
2. Load face reference
3. InstantID nodes for best results
4. Combine with ControlNet pose

AnimateDiff for Video

Generate AI animations from text prompts:

# AnimateDiff workflow:
1. Load Checkpoint
2. Load AnimateDiff Model (motion module)
3. AnimateDiff Loader
4. Standard prompt encoding
5. AnimateDiff Sampler
6. AnimateDiff Combine (to video)

# Settings:
- Frames: 16-32 for short clips
- Motion scale: 1.0 default
- Context length: 16

โš ๏ธ VRAM Requirements

AnimateDiff needs significant VRAM. For 16 frames at 512x512, expect ~12GB usage. Use an RTX 4090 or better for comfortable generation.

Batch Processing

Generate multiple images efficiently:

# Method 1: Batch in EmptyLatentImage
- Set batch_size to 4
- Generates 4 images per run

# Method 2: Queue multiple prompts
- Use ComfyUI API
- Queue workflow with different seeds

# Method 3: Prompt scheduling
- Use different prompts per frame
- Great for animations
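
Methods 1 and 2 can be prepared in plain Python before anything is sent to the server. The sketch below assumes an API-format workflow dict in which the sampler node has a `seed` input and the latent node has a `batch_size` input; `seed_variants` and `set_batch_size` are hypothetical helper names, and the node IDs you pass depend on your own export.

```python
import copy
import random


def set_batch_size(workflow, latent_id, batch_size):
    """Method 1: return a copy that generates `batch_size` images per run."""
    wf = copy.deepcopy(workflow)
    wf[latent_id]["inputs"]["batch_size"] = batch_size
    return wf


def seed_variants(workflow, sampler_id, count, base_seed=None):
    """Method 2: return `count` copies of the workflow, each with its own seed."""
    rng = random.Random(base_seed)  # pass base_seed for reproducible seed lists
    variants = []
    for _ in range(count):
        wf = copy.deepcopy(workflow)
        wf[sampler_id]["inputs"]["seed"] = rng.randrange(2**32)
        variants.append(wf)
    return variants
```

Each returned dict can then be queued separately through the API, so the server works through them one after another.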

API Automation

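import json
import time
import uuid
from urllib.parse import urlencode

import requests

SERVER = "http://your-server:8188"
CLIENT_ID = str(uuid.uuid4())  # identifies this client to the server

def queue_prompt(workflow):
    """Queue an API-format workflow and return the server's JSON response"""
    response = requests.post(
        f"{SERVER}/prompt",
        json={
            "prompt": workflow,
            "client_id": CLIENT_ID
        }
    )
    response.raise_for_status()
    return response.json()

def get_images(prompt_id):
    """Return view URLs for every image produced by a finished prompt"""
    response = requests.get(f"{SERVER}/history/{prompt_id}")
    response.raise_for_status()
    history = response.json()

    images = []
    for node_id, output in history[prompt_id]["outputs"].items():
        if "images" in output:
            for img in output["images"]:
                # Pass subfolder and type along with the filename,
                # and URL-encode all three
                query = urlencode({
                    "filename": img["filename"],
                    "subfolder": img.get("subfolder", ""),
                    "type": img.get("type", "output"),
                })
                images.append(f"{SERVER}/view?{query}")

    return images

# Load a workflow exported in API format
with open("workflow.json") as f:
    workflow = json.load(f)

# Modify the positive prompt (node "2" in the basic workflow above;
# check the node ID in your own export)
workflow["2"]["inputs"]["text"] = "A cyberpunk city at night"

# Queue the job
result = queue_prompt(workflow)
prompt_id = result["prompt_id"]

# Poll until the prompt shows up in the history; give up after 10 minutes
deadline = time.time() + 600
while time.time() < deadline:
    history = requests.get(f"{SERVER}/history/{prompt_id}").json()
    if prompt_id in history:
        break
    time.sleep(1)
else:
    raise TimeoutError(f"Prompt {prompt_id} did not finish in time")

# Get images
images = get_images(prompt_id)
print(f"Generated images: {images}")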

Performance Optimization

Memory Optimization

# Launch flags for VRAM management:
python main.py --lowvram           # Aggressive memory saving
python main.py --normalvram        # Normal memory management (overrides automatic lowvram)
python main.py --gpu-only          # Keep everything on GPU (fast, needs VRAM)

# For best performance on 24GB:
python main.py --highvram

Speed Optimization

# Enable xformers (faster attention)
pip install xformers

# Use fp16 precision
# Set in node settings or via --force-fp16

# Batch similar operations
# Queue multiple jobs instead of one at a time

Useful Workflows

1. Product Photography

2. Consistent Characters

3. Architecture Visualization

Run ComfyUI in the Cloud

No local GPU? Run ComfyUI on GPUBrazil's RTX 4090s from $0.40/hr.

Get $5 Free Credit →

Troubleshooting

CUDA Out of Memory

Missing Custom Nodes

Slow Generation

Conclusion

ComfyUI is one of the most powerful tools for AI image generation. The learning curve is steeper than with simple interfaces, but the node-based approach gives you far more control over the result.

Start with basic workflows, then gradually add ControlNet and IP-Adapter as you get comfortable. Deploy on GPUBrazil for cloud-based generation without hardware investment.