[BUG] OOM when clickhouse is slow and a lot of insert queries are sent #428
Comments
Hi there, this is a tricky issue. I don't really see a good outcome here other than using a rate limiter for your inserter and handling the back pressure at the data producer level. Other options would be to add much more memory to your chproxy, to bypass chproxy for data insertion, or to make ClickHouse faster 😅 No miracle will happen here.
Yes, indeed a tricky issue. A much stricter rate limit in front of chproxy is in place now. Still, imho a program running into OOM is a bug :) Adding more resources just moves the point where the OOM happens. To solve this bug I think completely different memory management would be needed, but yes, it's not trivial, as not all connections need the same amount of memory.
Unfortunately, we (Contentsquare) don't use chproxy to insert data. This feature was implemented by the previous maintainers (Vertamedia) and we don't maintain it anymore.
@mga-chka - I've experienced the same issue the author described: chproxy hits OOM under heavy INSERT load with large batches. I ran some tests and can shed some light on the nature of this bug. It seems the issue was introduced by changes in the 1.22.0 release: 1.21.0 is stable in our environment, but 1.22 is OOM-killed ~10-20 seconds after the workload starts. At least two changes probably introduced this bug: #299 and #296. To test this, I built a custom binary from the 1.22 sources with those changes reverted, and it is stable under our load, whereas the original 1.22 binary and the latest version are both OOM-killed. One possible root cause: it may be inefficient to load every incoming request body into memory for a possible retry, because bodies can be very large for INSERT-like workloads.
Describe the bug
We regularly see the following issue:
To Reproduce
Expected behavior
No OOM. Better memory handling. Cancel connections or let them wait before running OOM.
Environment information
chproxy v1.26.2
ClickHouse 24.3.2.23