
@loci-dev loci-dev commented Jan 7, 2026

Mirrored from leejet/stable-diffusion.cpp#1180

I once wrote that "three TinySD models should be enough" (#939), but I changed my mind after a user's request in December 2025 (#603) and because SDXS-512 is so incredibly fast. (That makes SDXS very handy for my sdcpp-on-android project.)

The main challenge for me was that SDXS does not use AutoEncoderKL as its primary VAE (as most other models do) but AutoEncoderTiny; see also https://huggingface.co/IDKiro/sdxs-512-dreamshaper . I also hope that including SDXS in sd.cpp will convince @IDKiro to release his SDXS-1024 to the public one day.

@loci-dev loci-dev temporarily deployed to stable-diffusion-cpp-prod January 7, 2026 08:44 — with GitHub Actions Inactive
@loci-agentic-ai

Explore the complete analysis inside the Version Insights

I've successfully generated a summary report for your project. The report shows a comprehensive performance analysis comparing the base version (aaf16831-eb1a-11f0-81f2-dbb430499cb5) to the target version (4e61d151-eba5-11f0-81f2-dbb430499cb5) for the stable-diffusion.cpp repository (Pull Request #12).

Key Highlights:

  • Top Performance Change: The std::_Hashtable::begin function shows a dramatic 310% throughput improvement
  • Mixed Results: Some functions gained throughput at the cost of response time, while others improved response time with reduced throughput
  • Primary Impact Areas: Most changes affect C++ STL container operations (vectors, hashtables, red-black trees)
  • Affected Binaries: Both sd-server and sd-cli show performance changes

The report includes detailed metrics for the top 10 functions by performance change, along with key observations and recommendations for further investigation.

@loci-dev loci-dev temporarily deployed to stable-diffusion-cpp-prod January 8, 2026 12:47 — with GitHub Actions Inactive
@loci-agentic-ai

Explore the complete analysis inside the Version Insights

I've generated a comprehensive summary report for your project. The analysis shows significant performance regressions in Pull Request #12 for the stable-diffusion.cpp repository.

Key Highlights:

⚠️ Critical Issues Identified:

  • Response times increased by up to 226.5% in some functions
  • STL vector and iterator operations are severely impacted
  • Both sd-server and sd-cli binaries are affected

🎯 Most Affected Areas:

  • Vector operations (especially vector::end())
  • Iterator functions
  • Memory allocation routines

💡 Main Recommendation:
Consider investigating compiler optimization changes or reverting PR #12 until the root cause of these performance regressions is identified and resolved.

The report includes detailed metrics, affected components, and actionable recommendations for addressing these performance issues.

@loci-dev loci-dev force-pushed the master branch 5 times, most recently from 3d97fa6 to fd3def8 on January 13, 2026 18:14