94 changes: 94 additions & 0 deletions IMPROVEMENTS.md
@@ -0,0 +1,94 @@
# Qwen-Image Specific Improvements

This document outlines the architecture-specific improvements made to optimize DyPE for Qwen-Image models.

## Key Improvements

### 1. **Intelligent Model Structure Detection**
- Added `_detect_qwen_model_structure()` function that automatically detects:
  - Transformer/diffusion_model location
  - Positional embedder path (`pos_embed` vs `pe_embedder`)
  - Patch size from the model config
  - VAE scale factor
  - Base training resolution
- Eliminates hardcoded assumptions and adapts to different Qwen model variants (see the sketch below)
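
A minimal sketch of what this detection could look like, assuming `getattr`-style probing (attribute names beyond those listed above are illustrative; the real helper lives in the patch module):

```python
def _detect_qwen_model_structure(model):
    """Sketch: locate the transformer and its positional embedder on a loaded model."""
    # The transformer may sit under either attribute depending on the loader.
    transformer = getattr(model.model, "transformer", None) \
        or getattr(model.model, "diffusion_model", None)
    if transformer is None:
        raise ValueError("No transformer/diffusion_model found on this model.")

    # The positional embedder path varies between Qwen model variants.
    pos_embed_path = "pos_embed" if hasattr(transformer, "pos_embed") else "pe_embedder"
    return transformer, pos_embed_path
```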

### 2. **Qwen-Specific Parameter Extraction**
- **Patch Size Detection**: Automatically extracts `patch_size` from model config (defaults to 2 for MMDiT)
- **VAE Scale Factor**: Detects actual VAE downsampling factor (typically 8x)
- **Base Resolution**: Attempts to detect from model config, falls back to 1024
- **Axes Dimensions**: Extracts from model or uses Qwen-Image defaults `[16, 56, 56]`
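
The extraction follows a probe-then-fall-back pattern throughout; a hedged sketch (the config attribute names here are assumptions, not a confirmed API):

```python
def _extract_qwen_params(transformer):
    """Sketch: pull architecture parameters from the model config, with Qwen-Image defaults."""
    config = getattr(transformer, "config", None)
    patch_size = getattr(config, "patch_size", None) or 2            # MMDiT default
    vae_scale_factor = 8                                             # typical VAE downsampling
    base_resolution = getattr(config, "sample_size", None) or 1024   # fall back to 1024
    axes_dim = getattr(config, "axes_dims_rope", None) or [16, 56, 56]
    return patch_size, vae_scale_factor, base_resolution, axes_dim
```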

### 3. **Optimized Base Patches Calculation**
```python
# Old: Hardcoded calculation
self.base_patches = (self.base_resolution // 8) // 2

# New: Uses detected patch_size and base_resolution
self.base_patches = (self.base_resolution // vae_scale_factor) // patch_size
```
- More accurate for different Qwen model variants
- Adapts to actual model architecture
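
With the Qwen-Image defaults (`base_resolution = 1024`, `vae_scale_factor = 8`, `patch_size = 2`) both formulas give `(1024 // 8) // 2 = 64` base patches per axis; they only diverge on variants with a non-standard VAE or patch size, which is exactly where the old hardcoded version broke down.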

### 4. **Enhanced Positional Embedding Class**
- Added `base_resolution` and `patch_size` parameters to `QwenPosEmbed`
- Better device-aware dtype selection (handles MPS, NPU, CUDA)
- Improved comments explaining Qwen-specific behavior
- More robust handling of different tensor formats
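
A minimal sketch of device-aware dtype selection of the kind described here, assuming the policy is simply "use float64 for the RoPE frequency math except on backends that lack it" (the exact rule in `QwenPosEmbed` may differ):

```python
import torch

def _rope_dtype(device: torch.device) -> torch.dtype:
    """Sketch: pick a dtype for RoPE frequency computation based on the device."""
    # MPS and NPU backends lack full float64 support, so fall back to float32
    # there; CUDA and CPU can compute the frequencies in float64 and cast down.
    if device.type in ("mps", "npu"):
        return torch.float32
    return torch.float64
```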

### 5. **Improved Scheduler Compatibility**
- Better fallback for non-Flux schedulers (FlowMatch, etc.)
- Conservative scaling approach for unknown scheduler types
- More robust error handling, catching `AttributeError` instead of using a bare `except`
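
In practice the change amounts to catching only the expected failure and degrading conservatively; a hedged sketch (`set_parameters` mirrors ComfyUI's model-sampling API, but treat the exact call as an assumption):

```python
def apply_shift(model_sampling, mu: float) -> None:
    """Sketch: apply a Flux-style shift if the sampling object supports it."""
    try:
        model_sampling.set_parameters(shift=mu)
    except AttributeError:
        # Non-Flux scheduler (e.g. a plain FlowMatch sampling object):
        # leave the noise schedule untouched rather than guessing a shift.
        pass
```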

### 6. **Better Sequence Length Calculation**
```python
import math

# Now uses detected vae_scale_factor and patch_size
latent_h, latent_w = height // vae_scale_factor, width // vae_scale_factor
padded_h = math.ceil(latent_h / patch_size) * patch_size
padded_w = math.ceil(latent_w / patch_size) * patch_size
image_seq_len = (padded_h // patch_size) * (padded_w // patch_size)
```
- More accurate for Qwen's specific architecture
- Accounts for both VAE downsampling and patch-based downsampling
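
As a worked example, a 2048×2048 target with `vae_scale_factor = 8` and `patch_size = 2` yields a 256×256 latent, needs no padding (256 is already a multiple of 2), and produces an image sequence length of `128 × 128 = 16384` tokens.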

### 7. **Enhanced Timestep Handling**
- Better handling of different timestep formats (tensor, scalar, etc.)
- More robust normalization logic
- Improved error handling for edge cases
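
A sketch of what robust timestep normalization can look like, assuming the usual convention that discrete schedules run on a ~1000-step scale while sigma-style values already live in [0, 1]:

```python
import torch

def normalize_timestep(timestep) -> float:
    """Sketch: coerce tensor or scalar timesteps to a float in [0, 1]."""
    if isinstance(timestep, torch.Tensor):
        # Batched timesteps share one value per step in practice; take the first.
        t = float(timestep.flatten()[0])
    else:
        t = float(timestep)
    # Values above 1.0 are assumed to be on the discrete 0-1000 scale.
    return t / 1000.0 if t > 1.0 else t
```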

### 8. **Architecture-Aware Defaults**
- Qwen-Image specific defaults:
  - `axes_dim = [16, 56, 56]` (MMDiT standard)
  - `theta = 10000` (RoPE base frequency)
  - `patch_size = 2` (MMDiT patch size)
  - `vae_scale_factor = 8` (standard VAE downsampling)
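
Collected in one place, these defaults amount to a small mapping (a sketch; the actual code may carry them as individual fallbacks rather than a dict):

```python
QWEN_IMAGE_DEFAULTS = {
    "axes_dim": [16, 56, 56],  # MMDiT standard RoPE axis split
    "theta": 10000,            # RoPE base frequency
    "patch_size": 2,           # MMDiT patch size
    "vae_scale_factor": 8,     # standard VAE downsampling
}
```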

## Benefits

1. **Better Compatibility**: Works with different Qwen-Image model variants
2. **More Accurate**: Uses actual model parameters instead of assumptions
3. **Robust**: Better error handling and fallbacks
4. **Optimized**: Qwen-specific optimizations for better performance
5. **Maintainable**: Clear structure detection makes debugging easier

## Testing Recommendations

When testing with your Qwen-Image model:

1. Check console output for detected parameters (add logging if needed; see the snippet after this list)
2. Verify `patch_size` matches your model (typically 2 for MMDiT)
3. Verify `base_resolution` matches the training resolution
4. Test with different resolutions to ensure proper extrapolation
5. Monitor for any warnings about fallback behavior
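
If the node does not already print what it detected, a throwaway probe like the following can surface the values (hypothetical: `detected` stands in for whatever the detection helper returns):

```python
# Hypothetical stand-in for the detection helper's return value.
detected = {"patch_size": 2, "vae_scale_factor": 8, "base_resolution": 1024}
print(f"[DyPE-Qwen] patch_size={detected['patch_size']} "
      f"vae_scale_factor={detected['vae_scale_factor']} "
      f"base_resolution={detected['base_resolution']}")
```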

## Future Enhancements

Potential further improvements:

1. **MSRoPE Integration**: Qwen uses Multimodal Scalable RoPE - could add specific support
2. **Aspect Ratio Presets**: Qwen supports specific aspect ratios - could add presets
3. **Text Rendering Optimization**: Qwen excels at text - could add text-specific optimizations
4. **Multi-Image Support**: Qwen-Image-Edit supports multi-image - could extend for that
5. **Config File Support**: Allow users to override detected parameters via config

28 changes: 23 additions & 5 deletions README.md
@@ -39,11 +39,12 @@ It works by taking advantage of the spectral progression inherent to the diffusi
<p><sub><i>A simple, single-node integration to patch your FLUX model for high-resolution generation.</i></sub></p>
</div>

This node provides a seamless, "plug-and-play" integration of DyPE into any FLUX-based workflow.
This node provides a seamless, "plug-and-play" integration of DyPE into FLUX-based and Qwen-Image workflows. Two specialized nodes are available, `DyPE for FLUX` and `DyPE for Qwen-Image`, each optimized for its architecture.

**✨ Key Features:**
* **True High-Resolution Generation:** Push FLUX models to 4096x4096 and beyond while maintaining global coherence and fine detail.
* **Single-Node Integration:** Simply place the `DyPE for FLUX` node after your model loader to patch the model. No complex workflow changes required.
* **True High-Resolution Generation:** Push FLUX and Qwen-Image models to 4096x4096 and beyond while maintaining global coherence and fine detail.
* **Dual Node Support:** Two specialized nodes, `DyPE for FLUX` and `DyPE for Qwen-Image`, each optimized for its architecture.
* **Single-Node Integration:** Simply place the appropriate DyPE node after your model loader to patch the model. No complex workflow changes required.
* **Full Compatibility:** Works seamlessly with your existing ComfyUI workflows, samplers, schedulers, and other optimization nodes like Self-Attention or quantization.
* **Fine-Grained Control:** Exposes key DyPE hyperparameters, allowing you to tune the algorithm's strength and behavior for optimal results at different target resolutions.
* **Zero Inference Overhead:** DyPE's adjustments happen on-the-fly with negligible performance impact.
@@ -75,15 +76,32 @@ Alternatively, to install manually:

Using the node is straightforward and designed for minimal workflow disruption.

### For FLUX Models

1. **Load Your FLUX Model:** Use a standard `Load Checkpoint` node to load your FLUX model (e.g., `FLUX.1-Krea-dev`).
2. **Add the DyPE Node:** Add the `DyPE for FLUX` node to your graph (found under `model_patches/unet`).
3. **Connect the Model:** Connect the `MODEL` output from your loader to the `model` input of the DyPE node.
4. **Set Resolution:** Set the `width` and `height` on the DyPE node to match the resolution of your `Empty Latent Image`.
5. **Connect to KSampler:** Use the `MODEL` output from the DyPE node as the input for your `KSampler`.
6. **Generate!** That's it. Your workflow is now DyPE-enabled.

### For Qwen-Image Models

1. **Load Your Qwen-Image Model:** Use a standard `Load Checkpoint` node to load your Qwen-Image model.
2. **Add the DyPE Node:** Add the `DyPE for Qwen-Image` node to your graph (found under `model_patches/unet`).
3. **Connect the Model:** Connect the `MODEL` output from your loader to the `model` input of the DyPE node.
4. **Set Resolution:** Set the `width` and `height` on the DyPE node to match the resolution of your `Empty Latent Image`.
5. **Connect to KSampler:** Use the `MODEL` output from the DyPE node as the input for your `KSampler`.
6. **Generate!** The node will automatically detect your Qwen-Image model structure and apply architecture-specific optimizations.

### Example Workflows

Ready-to-use example workflows are available in the [`example_workflows`](example_workflows) folder:
* **[DyPE-Flux-workflow.json](example_workflows/DyPE-Flux-workflow.json)** - Example workflow for FLUX models
* **[DyPE-Qwen-workflow.json](example_workflows/DyPE-Qwen-workflow.json)** - Example workflow for Qwen-Image models

> [!NOTE]
> This node specifically patches the **diffusion model (UNet)**. It does not modify the CLIP or VAE models. It is designed exclusively for **FLUX-based** architectures.
> This node specifically patches the **diffusion model (UNet)**. It does not modify the CLIP or VAE models. It is designed for **FLUX-based** architectures, with enhanced support for **Qwen-Image** models through intelligent model structure detection and architecture-specific optimizations.

### Node Inputs

@@ -130,7 +148,7 @@ Beyond the code, I believe in the power of community and continuous learning. I
<p align="center">══════════════════════════════════</p>

## ⚠️ Known Issues and Limitations
* **FLUX Only:** This implementation is highly specific to the architecture of the FLUX model and will not work on standard U-Net models (like SD 1.5/SDXL) or other Diffusion Transformers.
* **Supported Models:** This implementation is optimized for **FLUX-based** architectures and **Qwen-Image** models. It will not work on standard U-Net models (like SD 1.5/SDXL) or other Diffusion Transformers. For Qwen-Image models, the node automatically detects model structure and applies architecture-specific optimizations (see `IMPROVEMENTS.md` for details).
* **Parameter Tuning:** The optimal `dype_exponent` can vary based on your target resolution. Experimentation is key to finding the best setting for your use case. The default of `2.0` is optimized for 4K.

<p align="right">(<a href="#readme-top">back to top</a>)</p>
100 changes: 97 additions & 3 deletions __init__.py
@@ -1,6 +1,6 @@
import torch
from comfy_api.latest import ComfyExtension, io
from .src.patch import apply_dype_to_flux
from .src.patch import apply_dype_to_flux, apply_dype_to_qwen

class DyPE_FLUX(io.ComfyNode):
    """
@@ -82,11 +82,105 @@ def execute(cls, model, width: int, height: int, method: str, enable_dype: bool,
        patched_model = apply_dype_to_flux(model, width, height, method, enable_dype, dype_exponent, base_shift, max_shift)
        return io.NodeOutput(patched_model)

class DyPE_QWEN(io.ComfyNode):
    """
    Applies DyPE (Dynamic Position Extrapolation) to a Qwen-Image model.
    This allows generating images at resolutions far beyond the model's training scale
    by dynamically adjusting positional encodings and the noise schedule.
    """

    @classmethod
    def define_schema(cls) -> io.Schema:
        return io.Schema(
            node_id="DyPE_QWEN",
            display_name="DyPE for Qwen-Image",
            category="model_patches/unet",
            description="Applies DyPE (Dynamic Position Extrapolation) to a Qwen-Image model for ultra-high-resolution generation.",
            inputs=[
                io.Model.Input(
                    "model",
                    tooltip="The Qwen-Image model to patch with DyPE.",
                ),
                io.Int.Input(
                    "width",
                    default=1024, min=16, max=8192, step=8,
                    tooltip="Target image width. Must match the width of your empty latent."
                ),
                io.Int.Input(
                    "height",
                    default=1024, min=16, max=8192, step=8,
                    tooltip="Target image height. Must match the height of your empty latent."
                ),
                io.Combo.Input(
                    "method",
                    options=["yarn", "ntk", "base"],
                    default="yarn",
                    tooltip="Position encoding extrapolation method (YARN recommended).",
                ),
                io.Boolean.Input(
                    "enable_dype",
                    default=True,
                    label_on="Enabled",
                    label_off="Disabled",
                    tooltip="Enable or disable Dynamic Position Extrapolation for RoPE.",
                ),
                io.Float.Input(
                    "dype_exponent",
                    default=3.0, min=0.0, max=10.0, step=0.1,
                    optional=True,
                    tooltip="Controls DyPE strength over time (λt). 3.0=Very aggressive (best for 4K+), 2.0=Exponential, 1.0=Linear, 0.5=Sub-linear (better for ~2K). Higher values (up to 10.0) for extreme high-resolution generation."
                ),
                io.Float.Input(
                    "base_shift",
                    default=0.10, min=0.0, max=10.0, step=0.01,
                    optional=True,
                    tooltip="Advanced: Base shift for the noise schedule (mu). Default is 0.10."
                ),
                io.Float.Input(
                    "max_shift",
                    default=1.15, min=0.0, max=10.0, step=0.01,
                    optional=True,
                    tooltip="Advanced: Max shift for the noise schedule (mu) at high resolutions. Default is 1.15."
                ),
                io.Float.Input(
                    "editing_strength",
                    default=0.0, min=0.0, max=1.0, step=0.1,
                    optional=True,
                    tooltip="DyPE strength multiplier for image editing (0.0-1.0). Lower values preserve more original structure. Default 0.0 for maximum preservation. Set to 1.0 for pure generation."
                ),
                io.Combo.Input(
                    "editing_mode",
                    options=["adaptive", "timestep_aware", "resolution_aware", "minimal", "full"],
                    default="adaptive",
                    tooltip="Editing mode strategy: 'adaptive' (recommended) - timestep-aware scaling, 'timestep_aware' - more DyPE early/less late, 'resolution_aware' - only reduce at high res, 'minimal' - minimal DyPE for editing, 'full' - always full DyPE."
                ),
            ],
            outputs=[
                io.Model.Output(
                    display_name="Patched Model",
                    tooltip="The Qwen-Image model patched with DyPE.",
                ),
            ],
        )

    @classmethod
    def execute(cls, model, width: int, height: int, method: str, enable_dype: bool, dype_exponent: float = 3.0, base_shift: float = 0.10, max_shift: float = 1.15, editing_strength: float = 0.0, editing_mode: str = "adaptive") -> io.NodeOutput:
        """
        Clones the model and applies the DyPE patch for both the noise schedule and positional embeddings.
        """
        # Structural sanity check: DyPE needs a transformer/diffusion_model to patch.
        has_transformer = hasattr(model.model, "transformer") or hasattr(model.model, "diffusion_model")
        if not has_transformer:
            raise ValueError("No transformer/diffusion_model found on the provided model; this node is only compatible with Qwen-Image models.")

        patched_model = apply_dype_to_qwen(model, width, height, method, enable_dype, dype_exponent, base_shift, max_shift, editing_strength, editing_mode)
        return io.NodeOutput(patched_model)

class DyPEExtension(ComfyExtension):
    """Registers the DyPE node."""
    """Registers the DyPE nodes for both FLUX and Qwen-Image."""

    async def get_node_list(self) -> list[type[io.ComfyNode]]:
        return [DyPE_FLUX]
        return [DyPE_FLUX, DyPE_QWEN]

async def comfy_entrypoint() -> DyPEExtension:
    return DyPEExtension()