feat(z-image): add sigma schedules, aspect ratios, saveinfo, shift, MCF sampler;#353
Conversation
|
@terribilissimo Thanks for the PR! I'll have a look at this tomorrow |
|
Thanks!
Best regards,
A.
|
|
A while back there was talk about integrating alternative schedulers/samplers via an external package — that's more or less what I tried doing with mflux-schedulers, a fork of @anthonywu's work. My goal was mainly to explore different aesthetics rather than performance gains, bringing over some samplers/schedulers from ComfyUI to try them on mflux. |
Yes, I can confirm that. |
You are making a solid point. This is my overall feeling as well (I personally never switch the scheduler), but I also see how more optionality is nice for advanced users. At the same time, it is now much more trivial to just have an agent build this in 5 minutes on a local install of the project and get whatever tweaks one wants (though, of course, that is tedious to maintain over time).
Another thing I found tricky when doing (semi)automatic model porting via coding agents is that they often want to go in and tweak the "magic numbers" in the schedulers in order to match the reference implementations more closely. With more options, I could see agents getting confused more easily. It is easy to get lost in these smaller details and in how the numbers should be set across models (since doing more "agentic ports", I have started to lose track of what we actually have now and whether these magic numbers are general or model-specific; e.g. I couldn't really defend our current implementation here, except that it seemed to "look good" when I last checked).
I'm favouring simplicity over complexity, both in terms of the public CLI/API and for internal stuff. This project is still maintained as a side/hobby project with limited resources. Probably a separate package like what @azrahello and @anthonywu did is still the best approach.
Regarding the other CLI flags, I think the aspect ratio flag seems very interesting, and it would auto-calculate dimensions based on partial information like one would intuitively expect. On this topic more generally, I think the current CLI is due for a refresh pretty soon. I have some ideas for how it could look but have not gotten around to a writeup. What would be super helpful to me is if we could start some kind of new discussion (e.g. in a new issue; edit: I added this now, #357) about how a modern CLI should work and look.
Then we could discuss ideas there, the pros and cons of each design choice etc., because I think once we nail that, implementation is trivial (just give it to Claude Code, Cursor or Codex and it will build that for us). I'm thinking of things like naming and behaviour and what really feels intuitive from a user POV. I'll outline a few things off the top of my head, very unstructured, sorry :) :
There are many combinations here, and I feel like this job can be done better if we take a holistic approach to the CLI and redesign it from scratch, now that we know so much more about what the project is compared to how the CLI just evolved naturally when we only had one model. Earlier, this kind of "design work" would have felt very daunting to even start tackling (mostly because it was so hard to even know what we supported in the first place), but now with agents I think this is a lot easier to get started with. In the end, though, we humans have to agree on what makes sense to have, and therein lies the real work. |
|
I added this issue now #357 |
That's the point of my other PR (I swear, concocted before reading your comment! :] ). Such capability is already there, it seemed wasteful not to leverage it. And it's fun to play with!
That's a very good idea indeed.
I understand, and maybe those additional schedulers are an unnecessary redundancy (I seldom try different ones too, tbh). The ability to adjust the shift, however, is quite useful (at least to me): many times I have succeeded in making a scene less coarse or less "plasticky" just by varying the shift. But it's up to you to decide. In any case, thanks for reviewing my PR! :) |
…CF; remove --metadata
- --shift: override automatic sigma shift (mu) for both schedulers
- --mcf-max-change: MCF sampler that clamps per-step latent changes
- --cosine / --karras / --exponential: alternative sigma schedules
- --aspect: aspect ratio presets with auto-dimension computation
- --saveinfo: descriptive filenames encoding generation parameters
- Remove --metadata JSON sidecar flag (EXIF metadata always embedded)
- Remove export_json_metadata from all CLI entry points
- All features apply to mflux-generate-z-image-turbo and mflux-generate-z-image
- When no fork-specific flags are used, behavior is identical to upstream
e630a6d to
17b08f8
Compare
Ah, totally forgot that I had overlooked this for Flux2 🤦‍♂️, of course we should have them there too :) I'll merge your other PR right away! Thanks very much for noticing and fixing this. |










Note for Filip: Feel free to pick what you like from this PR and discard the rest. It's perfectly good to discard this PR and make another with just the things you want to integrate (if any).
Update: This PR now also includes aspect-ratio-preserving dimension scaling for img2img workflows.
When a reference image is provided via --image-path, you can specify output dimensions (or just one of them) as scale factors:
Changes (in dimension_resolver.py):
When only one dimension has a non-unity ScaleFactor (e.g. --height 1.2x) and the other is at auto/default, the scale is propagated to both dimensions, preserving the reference image's aspect ratio.
When one dimension is absolute pixels and the other is unspecified, the missing dimension is computed from the reference image's aspect ratio.
Also enabled supports_dimension_scale_factor=True for mflux-generate-z-image (was already enabled for turbo).
All values snap to multiples of 16. Behavior is unchanged when no reference image is provided.
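The two rules above can be sketched roughly as follows. This is an illustration only, not the code in dimension_resolver.py: the names `snap16`, `parse_dim` and `resolve_dimensions` are hypothetical, and rounding to the nearest multiple of 16 (rather than rounding down) is an assumption.

```python
import re


def snap16(value: float) -> int:
    """Round to the nearest multiple of 16 (assumed rounding mode), minimum 16."""
    return max(16, int(round(value / 16)) * 16)


def parse_dim(text):
    """Classify a CLI value: '1.2x' is a scale factor, '768' is pixels, None is auto."""
    if text is None or text == "auto":
        return ("auto", None)
    match = re.fullmatch(r"(\d+(?:\.\d+)?)x", text)
    if match:
        return ("scale", float(match.group(1)))
    return ("px", int(text))


def resolve_dimensions(ref_w, ref_h, width=None, height=None):
    """Hypothetical resolver: returns (width, height) for a reference image size."""
    w_kind, w_val = parse_dim(width)
    h_kind, h_val = parse_dim(height)

    # Rule 1: a single scale factor is propagated to the other dimension.
    if w_kind == "scale" and h_kind == "auto":
        h_kind, h_val = "scale", w_val
    elif h_kind == "scale" and w_kind == "auto":
        w_kind, w_val = "scale", h_val

    # Rule 2: a single absolute size derives the other from the reference aspect ratio.
    if w_kind == "px" and h_kind == "auto":
        h_kind, h_val = "px", w_val * ref_h / ref_w
    elif h_kind == "px" and w_kind == "auto":
        w_kind, w_val = "px", h_val * ref_w / ref_h

    def finalize(kind, val, ref):
        if kind == "scale":
            return snap16(ref * val)
        if kind == "px":
            return snap16(val)
        return snap16(ref)  # auto: keep the reference size

    return finalize(w_kind, w_val, ref_w), finalize(h_kind, h_val, ref_h)
```

With a 1024x768 reference image, passing only `height="1.2x"` scales both sides by 1.2 before snapping, and passing only `width="512"` yields a height derived from the 4:3 reference ratio.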
--shift: override automatic sigma shift (mu) for both schedulers
--mcf-max-change: MCF sampler that clamps per-step latent changes
--cosine / --karras / --exponential: alternative sigma schedules
--aspect: aspect ratio presets with auto-dimension computation
--saveinfo: descriptive filenames encoding generation parameters
All features apply to mflux-generate-z-image-turbo and mflux-generate-z-image
NOTE: when no fork-specific flags are used, behavior is identical to upstream
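As a rough idea of how the flags listed above could be declared, here is a minimal argparse sketch. It is not the code in parsers.py: the program name, the `sigma_schedule` destination and the `build_parser` helper are assumptions made for illustration. Only the flag names and the preset list come from this PR.

```python
import argparse

# Preset names as listed in this PR's description.
ASPECT_PRESETS = ["1:1", "4:3", "3:4", "3:2", "2:3", "16:9", "9:16",
                  "18:9", "9:18", "21:9", "9:21"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; every new flag defaults to 'off' so behavior is opt-in."""
    parser = argparse.ArgumentParser(prog="z-image-flags-sketch")
    parser.add_argument("--shift", type=float, default=None,
                        help="Override the automatic sigma shift (mu)")
    parser.add_argument("--mcf-max-change", type=float, default=None,
                        help="Clamp per-step latent changes")

    # The three alternative schedules exclude each other; they share one destination.
    schedules = parser.add_mutually_exclusive_group()
    for name in ("cosine", "karras", "exponential"):
        schedules.add_argument(f"--{name}", dest="sigma_schedule",
                               action="store_const", const=name)

    parser.add_argument("--aspect", choices=ASPECT_PRESETS, default=None)
    parser.add_argument("--saveinfo", action="store_true")
    return parser
```

Parsing an empty argument list leaves every new option at `None` or `False`, which is one simple way to keep the default path identical to upstream.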
Summary (about the rest)
This PR adds several quality-of-life enhancements to the Z-Image and Z-Image Turbo
pipelines, ported from our zima.py CUDA/PyTorch scripts. All features are
opt-in — when none of the new flags are used, behavior is identical to upstream.
New CLI Flags
Noise Schedule Control
--shift <float>: Override the automatic sigma shift (mu) value. By default mu is computed from image dimensions. Higher values push the noise schedule towards higher noise levels.
--cosine: Smooth S-curve sigma schedule (more steps at high/low noise)
--karras: Karras schedule with rho=7 (concentrates steps on fine details)
--exponential: Log-spaced sigmas between sigma_max and sigma_min
(Only one of the three can be used at a time; mutually exclusive group)
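For readers unfamiliar with these schedules, the sketch below shows commonly used formulas in plain Python. It is not the PR's `_generate_base_sigmas()`: the function names, the default `sigma_min`/`sigma_max` values and the exact cosine form are assumptions, and the shift formula is the exponential time shift used by flow-matching schedulers in other libraries, which may differ from what mflux applies.

```python
import math


def linspace(start, stop, n):
    """n evenly spaced values from start to stop inclusive."""
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def karras_sigmas(n, sigma_min=0.002, sigma_max=1.0, rho=7.0):
    """Karras et al. spacing: interpolate linearly in sigma**(1/rho) space."""
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    return [(hi + t * (lo - hi)) ** rho for t in linspace(0.0, 1.0, n)]


def exponential_sigmas(n, sigma_min=0.002, sigma_max=1.0):
    """Log-spaced sigmas between sigma_max and sigma_min."""
    return [math.exp(v) for v in linspace(math.log(sigma_max), math.log(sigma_min), n)]


def cosine_sigmas(n, sigma_min=0.002, sigma_max=1.0):
    """Half-cosine S-curve (assumed form): small steps near both ends."""
    span = sigma_max - sigma_min
    return [sigma_min + span * 0.5 * (1 + math.cos(math.pi * t))
            for t in linspace(0.0, 1.0, n)]


def apply_shift(sigmas, mu):
    """Exponential time shift; requires 0 < sigma <= 1. Larger mu raises every sigma."""
    e = math.exp(mu)
    return [e / (e + (1 / s - 1)) for s in sigmas]
```

With `mu = 0` the shift is the identity, and increasing `mu` moves intermediate sigmas closer to 1, which matches the description of higher values pushing the schedule towards higher noise levels.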
Sampling
--mcf-max-change <float>: MCF (Mean Change Factor) sampler that clamps per-step latent changes to prevent sudden jumps. Typical: 0.05–0.50, min/max 0.01/1.00.
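The description above does not spell out how the clamp is computed, so the following is only one plausible reading, sketched on plain Python lists: take an ordinary Euler step and, if the mean absolute change of the latents exceeds the limit, rescale the whole update. The function name and the rescaling rule are assumptions, not the PR's implementation.

```python
def euler_step_with_clamp(latents, velocity, sigma, sigma_next, max_change):
    """Hypothetical clamped Euler step.

    latents, velocity: flat lists of floats of equal length.
    max_change: upper bound on the mean absolute per-element update.
    """
    dt = sigma_next - sigma
    delta = [dt * v for v in velocity]

    # Mean absolute change of this step across all latent elements.
    mean_abs = sum(abs(d) for d in delta) / len(delta)

    # Rescale the whole update uniformly so its direction is preserved.
    if mean_abs > max_change:
        scale = max_change / mean_abs
        delta = [d * scale for d in delta]

    return [x + d for x, d in zip(latents, delta)]
```

Under this reading, a large `max_change` such as 1.00 rarely triggers and leaves the Euler step untouched, while small values damp every step.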
Convenience
--aspect <preset>: Aspect ratio presets (1:1, 4:3, 3:4, 3:2, 2:3, 16:9, 9:16, 18:9, 9:18, 21:9, 9:21). If combined with only --width or --height, the missing dimension is auto-computed and rounded to multiples of 16.
--saveinfo: Save images with descriptive filenames encoding timestamp, seed, steps, LoRA, scheduler, and sigma schedule info (convenient: you won't have to look at the EXIF or JSON for quick reproducibility).
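The --aspect computation can be illustrated with a short sketch. The helper names are hypothetical, rounding to the nearest multiple of 16 is an assumption, and the case where neither --width nor --height is given is left out because its behavior is not described here.

```python
def snap16(value: float) -> int:
    """Round to the nearest multiple of 16 (assumed rounding mode), minimum 16."""
    return max(16, int(round(value / 16)) * 16)


def dims_from_aspect(aspect: str, width=None, height=None):
    """Compute the missing dimension from a 'W:H' preset such as '16:9'."""
    ratio_w, ratio_h = (int(part) for part in aspect.split(":"))
    if width is not None and height is None:
        height = width * ratio_h / ratio_w
    elif height is not None and width is None:
        width = height * ratio_w / ratio_h
    elif width is None and height is None:
        # Not covered by this sketch: the PR text only describes the one-dimension case.
        raise ValueError("give --width or --height together with --aspect")
    return snap16(width), snap16(height)
```

For example, `dims_from_aspect("16:9", width=1024)` gives a 1024x576 image, and presets that do not divide evenly, such as 21:9, land on the nearest multiple of 16.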
Modified Files
src/mflux/cli/parser/parsers.py: New args, aspect ratio dict, sigma_schedule resolution; removed --metadata
src/mflux/models/common/config/config.py: sigma_schedule property
src/mflux/models/common/schedulers/linear_scheduler.py: _generate_base_sigmas()
src/mflux/models/common/schedulers/flow_match_euler_discrete_scheduler.py: same
src/mflux/models/z_image/variants/z_image.py: New params in generate_image()
src/mflux/models/z_image/cli/z_image_turbo_generate.py: Passthrough + saveinfo
src/mflux/models/z_image/cli/z_image_generate.py: Same
Removed export_json_metadata=args.metadata (all CLI entry points)
Testing
All features tested on Apple Silicon (M-series) with both mflux-generate-z-image-turbo and mflux-generate-z-image. Baseline output is pixel-identical to upstream when no new flags are used. See TESTING-SCRIPT.txt for the full test matrix (18 tests).
Checklist
mx.array,mx.cos,mx.linspace)