Key strengths
- Interaction-aware removal — removes not just the object, but all physical interactions it caused on the scene (shadows, reflections, falling objects)
- Object removal, not single-frame patching — produces coherent motion and lighting across the entire clip
- Two-pass refinement — Pass 2 provides superior temporal stability (fewer jitters and flashes) compared to Pass 1 alone, especially on longer cuts or textured backgrounds
Limitations: Unclear masks, chaotic motion, or targets that dominate the frame may still produce suboptimal results — prompting cannot fix fundamentally wrong segmentation.
VOID Video Inpainting Workflow
1. Download Workflow
Update your ComfyUI to the latest version, then go to Workflow -> Browse Templates and find “VOID: Video Inpainting” under the Utility category.
Download JSON Workflow File
Download workflow
Run on Comfy Cloud
Open in cloud
2. Download Models
All models are hosted on the Comfy-Org VOID model repository.
Diffusion Models — the core two-pass inpainting model:
- void_pass2.safetensors — Refinement pass, better temporal stability
- void_pass1.safetensors — Primary pass
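Before opening the workflow, it can help to confirm that both checkpoints are in place. The snippet below is a minimal sketch, not part of the template: the file names come from the list above, while the `ComfyUI/models/diffusion_models` folder and the helper name `missing_models` are assumptions made for illustration, so adjust the path to match your install.

```python
from pathlib import Path

# File names are taken from the model list above.
EXPECTED_MODELS = ("void_pass1.safetensors", "void_pass2.safetensors")


def missing_models(model_dir, expected=EXPECTED_MODELS):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location; change this to wherever your install keeps diffusion models.
    folder = Path("ComfyUI/models/diffusion_models")
    missing = missing_models(folder)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Both VOID passes found in", folder)
```

An empty list from `missing_models` means both passes are present in the folder you pointed it at.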
3. Using the Workflow
Inputs:
- Source video — Load a video via the `Load Video` node (place it in the ComfyUI `input/` folder)
- Positive prompt (inpaint fill) — Describe the scene after removal. Focus on what remains and how it looks, not on what was removed
  - Example: `empty kitchen counter, daylight, tiles visible`
- Negative prompt — Optional anti-artifact list; can be left empty
- SAM3 object prompt — A short label for what to mask out. SAM3 uses semantic understanding to create a segmentation mask for the target object.
  - Example: `person in blue jacket`, `red cup on table`
  - Max tokens for SAM3 prompts is 32. To prompt multiple subjects separately, separate with commas and use `:N` to specify the max objects detected per prompt: `eye:2, window panels:4`

| Prompt | Role |
|---|---|
| SAM3 object | What is removed (SAM3 creates the mask via semantic segmentation) |
| Positive (inpaint) | How the hole is filled across time |
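To make the multi-subject SAM3 syntax concrete, here is a small Python sketch that splits a prompt such as `eye:2, window panels:4` into labels and per-label object limits. It only illustrates the comma and `:N` convention described above. It is not the parser used by the workflow, and the names `Sam3Target` and `parse_sam3_prompt` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Sam3Target(NamedTuple):
    label: str
    max_objects: Optional[int]  # None when no ":N" suffix was given


def parse_sam3_prompt(text: str) -> List[Sam3Target]:
    """Split a comma-separated SAM3 prompt into (label, max_objects) pairs."""
    targets = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # skip empty entries such as a trailing comma
        label, sep, count = part.rpartition(":")
        if sep and label.strip() and count.strip().isdigit():
            targets.append(Sam3Target(label.strip(), int(count)))
        else:
            # No valid ":N" suffix: keep the whole entry as the label.
            targets.append(Sam3Target(part, None))
    return targets


for target in parse_sam3_prompt("eye:2, window panels:4"):
    print(target.label, target.max_objects)
```

The sketch does not check the 32-token limit, because that depends on the tokenizer SAM3 uses.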
Learn about Subgraph
This workflow uses Subgraph nodes for modular video processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.
Additional Notes
- Mask quality matters — a clean, tight mask around the target object produces the best results
- Prompt writing tip — describe the scene as it should appear naturally after removal, not the removal itself
- Use negative prompt only when you see repeating defects (watermarks, blur, extra limbs)
- Two-pass workflow — the template runs Pass 1 then Pass 2 automatically; you can also run just Pass 1 for faster iterations during testing