
Conversation


@Encryptic1 Encryptic1 commented Nov 7, 2025

Summary

This PR adds comprehensive support for Qwen-Image models to ComfyUI-DyPE, introducing a new DyPE for Qwen-Image node alongside the existing DyPE for FLUX node. The implementation includes intelligent model structure detection and architecture-specific optimizations that automatically adapt to different Qwen-Image model variants.

IMAGE EDIT scaling is still to come. Image editing is NOT yet supported: for now the node requires an empty latent, with no image fed into the Qwen Text Encoding node. Adaptive DyPE modes for image editing with Qwen will follow.

What Was Added

1. New ComfyUI Node: DyPE for Qwen-Image

  • A dedicated node for Qwen-Image models (found under model_patches/unet)
  • Automatically detects model structure and applies architecture-specific optimizations
  • Full compatibility with existing ComfyUI workflows

2. Intelligent Model Structure Detection

The implementation includes a robust _detect_qwen_model_structure() function that automatically detects:

  • Transformer/diffusion_model location
  • Positional embedder path (pos_embed vs pe_embedder)
  • Patch size from model config
  • VAE scale factor
  • Base training resolution

This eliminates hardcoded assumptions and adapts to different Qwen model variants seamlessly.

3. Architecture-Specific Optimizations

  • Dynamic Parameter Extraction: Automatically extracts patch size, VAE scale factor, base resolution, and axes dimensions from the model
  • Optimized Calculations: Uses detected parameters instead of hardcoded values for more accurate base patches and sequence length calculations
  • Enhanced Positional Embedding: Improved QwenPosEmbed class with better device-aware dtype selection and robust tensor handling
  • Better Scheduler Compatibility: Improved fallback handling for non-Flux schedulers (FlowMatch, etc.)

4. Documentation & Examples

  • Added comprehensive IMPROVEMENTS.md documenting all Qwen-Image specific improvements
  • Added example workflow: example_workflows/DyPE-Qwen-workflow.json
  • Updated README.md with:
    • Documentation for the new DyPE for Qwen-Image node
    • Separate usage instructions for FLUX and Qwen-Image models
    • Links to example workflows

Key Benefits

  1. Better Compatibility: Works with different Qwen-Image model variants without manual configuration
  2. More Accurate: Uses actual model parameters instead of assumptions
  3. Robust: Better error handling and fallbacks for edge cases
  4. Optimized: Qwen-specific optimizations for better performance
  5. Maintainable: Clear structure detection makes debugging easier

Technical Details

For detailed technical information about all improvements, please see IMPROVEMENTS.md.

Key technical changes include:

  • Intelligent detection of model structure and parameters
  • Architecture-aware defaults for Qwen-Image models
  • Improved sequence length calculation accounting for both VAE downsampling and patch-based downsampling
  • Enhanced timestep handling for different formats
  • Better scheduler compatibility with conservative scaling for unknown types
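The improved sequence-length calculation mentioned above can be sketched as follows. The concrete factors (VAE scale 8, patch size 2) are typical defaults, not values confirmed by this PR; the real code reads them from the detected model structure.

```python
def qwen_seq_len(height_px, width_px, vae_scale=8, patch_size=2):
    """Token count for an image: pixels are downsampled once by the VAE,
    then again by the transformer's patchification, so BOTH factors apply."""
    h_tokens = height_px // (vae_scale * patch_size)
    w_tokens = width_px // (vae_scale * patch_size)
    return h_tokens * w_tokens

# A 1024x1024 image yields 64x64 = 4096 tokens; dividing only by the VAE
# factor would overestimate the sequence length by patch_size**2.
```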

Files Changed

  • __init__.py: Added DyPE_QWEN node class
  • src/patch.py: Added Qwen-Image support with intelligent model detection
  • README.md: Updated documentation for both nodes
  • IMPROVEMENTS.md: Comprehensive documentation of all improvements (new file)
  • example_workflows/DyPE-Qwen-workflow.json: Example workflow for Qwen-Image models (new file)

Testing

The implementation has been tested with Qwen-Image models and includes:

  • Automatic parameter detection
  • Fallback behavior for edge cases
  • Support for different Qwen model variants

Backward Compatibility

Fully backward compatible: the existing DyPE for FLUX node remains unchanged and continues to work exactly as before. This PR only adds new functionality.


Note: This PR extends the original FLUX-only implementation to support Qwen-Image models while maintaining full backward compatibility with existing FLUX workflows.

… strategies

This commit adds comprehensive image editing support to the DyPE implementation
for Qwen-Image models, addressing the issue where DyPE's aggressive positional
encoding modifications were interfering with image editing tasks.

## Problem Addressed:
- DyPE was applying full strength modifications during image editing
- This caused loss of original image structure and feature degradation
- Editing tasks (inpainting, image-to-image) were not preserving spatial relationships
- Users reported that while upscaling worked well, editing was problematic

## Solution Implemented:

### 1. Multiple Editing Mode Strategies
Added 5 distinct editing mode strategies via new 'editing_mode' parameter:
- 'adaptive' (default): Timestep-aware scaling - full DyPE early (structure),
  gradually reduces later (details). Best balance for most editing scenarios.
- 'timestep_aware': More aggressive version - 100% early, 20% late
- 'resolution_aware': Only reduces DyPE when editing at high resolutions (>base)
- 'minimal': Fixed reduced strength throughout (original approach, improved)
- 'full': Always full DyPE regardless of editing (for pure generation)

### 2. Timestep-Aware Dynamic Scaling
The adaptive and timestep_aware modes implement intelligent scaling:
- Early denoising steps (high noise): Full DyPE strength for structure building
- Late denoising steps (low noise): Reduced DyPE to preserve fine details
- Smooth transition based on normalized timestep (1.0 = pure noise, 0.0 = clean)
- Formula: timestep_factor = base + (current_timestep * range)
  - adaptive: 0.3 + (timestep * 0.7) → 1.0 to 0.3
  - timestep_aware: 0.2 + (timestep * 0.8) → 1.0 to 0.2
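The formulas above translate directly into a small helper; this is a sketch of the described scaling, with the function name chosen for illustration:

```python
def timestep_factor(t, mode="adaptive"):
    """Scale DyPE strength by normalized timestep t (1.0 = pure noise, 0.0 = clean)."""
    if mode == "adaptive":
        return 0.3 + t * 0.7   # full strength (1.0) at t=1, down to 0.3 at t=0
    if mode == "timestep_aware":
        return 0.2 + t * 0.8   # more aggressive: 1.0 at t=1, down to 0.2 at t=0
    return 1.0                 # 'full' and other modes: no timestep reduction
```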

### 3. Editing Strength Parameter
- New 'editing_strength' parameter (0.0-1.0, default 0.6)
- Acts as multiplier for DyPE strength during editing
- Lower values = more structure preservation
- Higher values = more quality, less preservation
- Works in conjunction with editing_mode for fine control

### 4. Improved Editing Detection
Enhanced detection of image editing scenarios:
- Checks conditioning dict for image-related keys:
  - 'image', 'image_embeds', 'image_tokens'
  - 'concat_latent_image', 'concat_mask', 'concat_mask_image'
- Heuristic: Analyzes input variance to distinguish edited images from pure noise
- More accurate detection of when editing is occurring
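A hedged sketch of this detection logic. The key names mirror the list above; the variance threshold and function name are illustrative assumptions (the real code operates on torch tensors, so `latent.var()` is duck-typed here):

```python
IMAGE_KEYS = ("image", "image_embeds", "image_tokens",
              "concat_latent_image", "concat_mask", "concat_mask_image")

def looks_like_editing(cond_dict, latent=None, noise_var_threshold=0.9):
    """Return True when the conditioning or latent suggests an editing task."""
    # Explicit signal: the conditioning dict carries an image payload.
    if any(k in cond_dict for k in IMAGE_KEYS):
        return True
    # Heuristic: pure Gaussian noise has variance near 1.0, while a
    # VAE-encoded source image typically has noticeably lower variance.
    if latent is not None:
        return float(latent.var()) < noise_var_threshold
    return False
```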

### 5. Resolution-Aware Behavior
Resolution-aware mode intelligently:
- Detects if editing at base resolution (1024x1024) vs high resolution
- Applies full DyPE strength at base resolution
- Only reduces strength when upscaling during editing
- Prevents unnecessary reduction when not needed
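The resolution-aware behavior can be sketched like this; the falloff curve and the 0.3 floor are illustrative assumptions, not values confirmed by the commit:

```python
def resolution_aware_strength(height_px, width_px, base_res=1024):
    """Full DyPE at or below base resolution; reduce only when upscaling."""
    scale = max(height_px, width_px) / base_res
    if scale <= 1.0:
        return 1.0                    # base resolution: no reduction needed
    return max(0.3, 1.0 / scale)      # illustrative falloff, floored at 0.3
```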

### 6. Extrapolation Scaling
For YARN and NTK methods:
- Applies timestep-aware scaling to extrapolation ratios
- Adaptive modes use smoother transitions
- Minimal mode uses fixed reduction
- Better balance between structure and quality

## Technical Implementation:

### Changes to QwenPosEmbed class:
- Added 'editing_strength' and 'editing_mode' parameters
- Implemented timestep-aware strength calculation
- Dynamic scaling of dype_exponent based on mode and timestep
- Conditional extrapolation scaling for YARN/NTK methods

### Changes to apply_dype_to_qwen function:
- Added 'editing_strength' and 'editing_mode' parameters
- Passes parameters to QwenPosEmbed
- Improved timestep normalization for editing scenarios
- Mode-specific handling in wrapper function

### Changes to node interface (__init__.py):
- Added 'editing_mode' combo input with 5 options
- Added 'editing_strength' float input (default 0.6)
- Updated tooltips with clear explanations
- Backward compatible (editing_strength defaults to 0.6, mode to 'adaptive')
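For illustration, the two new inputs could be declared in the node's `INPUT_TYPES` roughly as follows (standard ComfyUI node convention). The class name is a placeholder and the dict is abbreviated to just the new inputs; only the parameter names, mode list, and defaults come from this commit:

```python
class DyPEQwenImageSketch:
    """Abbreviated node stub showing only the two new editing inputs."""

    EDITING_MODES = ["adaptive", "timestep_aware", "resolution_aware", "minimal", "full"]

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "editing_mode": (cls.EDITING_MODES, {
                    "default": "adaptive",
                    "tooltip": "Strategy for reducing DyPE strength during image editing.",
                }),
                "editing_strength": ("FLOAT", {
                    "default": 0.6, "min": 0.0, "max": 1.0, "step": 0.05,
                    "tooltip": "Multiplier for DyPE strength while editing; "
                               "lower preserves more of the original structure.",
                }),
            }
        }
```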

## Benefits:

1. **Better Editing Quality**: Preserves original image structure while allowing edits
2. **Flexible Control**: 5 modes + strength parameter for fine-tuning
3. **Intelligent Adaptation**: Timestep-aware modes adapt to denoising stage
4. **Backward Compatible**: Defaults work well, existing workflows unaffected
5. **Resolution Aware**: Only applies reduction when actually needed

## Usage Recommendations:

- Most editing: editing_mode='adaptive', editing_strength=0.6 (default)
- Maximum preservation: editing_mode='timestep_aware', editing_strength=0.5
- Base resolution: editing_mode='resolution_aware', editing_strength=0.7
- Pure generation: editing_mode='full' (ignores editing_strength)

This implementation solves the image editing quality issues while maintaining
excellent upscaling performance for pure generation tasks.

Updated default parameter values based on testing and optimization:

- dype_exponent: 2.0 → 3.0
  * More aggressive DyPE strength for better high-resolution generation
  * Better performance at 4K+ resolutions

- base_shift: 0.5 → 0.10
  * Lower base shift for noise schedule
  * Improved balance for Qwen-Image architecture

- editing_strength: 0.6 → 0.0
  * Maximum structure preservation during image editing
  * Prevents feature degradation when editing existing images
  * Users can increase if needed for more aggressive editing

These defaults provide optimal balance between:
- High-resolution generation quality (dype_exponent=3.0)
- Image editing structure preservation (editing_strength=0.0)
- Noise schedule tuning (base_shift=0.10)

Increased the maximum value of the dype_exponent parameter from 4.0 to 10.0
for the Qwen-Image node only. This allows users to experiment with more
aggressive DyPE settings for extreme high-resolution generation (8K+).

The FLUX node remains capped at 4.0 as before.
@ttulttul

OMG I should have checked to see that your PR existed before I spun up mine. Face palm. I'll have a look at this one. Will be interesting to see how your approach is similar / different.
