High Memory Usage #40
Comments
Sadly, I don't think this is actionable for our little project. Honestly, plowing through 3.7 TB of data was never my intended use case for this. The problem is likely not the size of the data but the sheer number of files. Performance updates over the years have brought some data into memory so that we don't have to fetch it from our SQLite database all the time. That makes things much faster but requires more memory. I think that if we removed this optimization, you'd get lower memory usage, but you would wait a veeery long time for the checksums to calculate, making the tool useless regardless.

If a process using too much memory is causing your entire box to crash, you should check what's up with that. It's a userspace application; this shouldn't happen no matter what it does.
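To make the trade-off described above concrete, here is a minimal sketch of the two lookup strategies. The table name, columns, and function names are hypothetical and chosen only for illustration; this is not bitrot's actual schema or code.

```python
import sqlite3

# Hypothetical schema for illustration only; not bitrot's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checksums (path TEXT PRIMARY KEY, sha1 TEXT)")
conn.executemany(
    "INSERT INTO checksums VALUES (?, ?)",
    [(f"file{i}.bin", f"{i:040x}") for i in range(1000)],
)


def lookup_in_database(path):
    """One query per file: little memory, but a database round trip each time."""
    row = conn.execute(
        "SELECT sha1 FROM checksums WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else None


# Load every row up front: fast lookups, but memory grows with the file count.
cache = dict(conn.execute("SELECT path, sha1 FROM checksums"))


def lookup_in_memory(path):
    """Dictionary lookup against the preloaded cache."""
    return cache.get(path)
```

With over a million files, the dictionary in the second approach holds one entry per file, which is where the extra memory would go under this assumed design.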
Thanks for the reply, really appreciate it.
Totally understand, no problems at all.
Unfortunately, it is a feature of Linux-based systems. When the system runs out of RAM, the kernel starts killing processes, hopefully the right ones. On headless boxes where the web GUIs and other system access run in Docker containers, those containers can end up being killed, so the box crashes (at least as far as the user is concerned) and has to be power cycled to recover.

Thanks again for the software. I think I have a solution (as of this afternoon), which is to give the box some swap space. That is essentially writing the SQLite database back to disk, but in a much less efficient manner. Anyway, I think it will solve the problem, and speed is not a concern to me (if it takes a week to run, that is fine), so it should be OK. Keep up the great work.
I'm aware of the OOM killer. Big companies like Facebook and Google disable it in their fleets because it is unpredictable. You can do it, too:
Hi there,
Love this library. I just found it and it seems to work exactly as I want, except for one issue: it ran my box out of memory and caused it to crash.
I'm trying to check a fairly large batch of files (about 3.7 TB, or 1,206,600 files) and bitrot really chews through the RAM. All up it seems to need 4.3 GB of RAM to run, which does seem like a lot.
I'd prefer not to split my checks into multiple smaller sets if at all possible, but obviously I can't have my system crashing.
Any ideas on what I can do to fix this issue?
My system is:
AMD64
Debian Stretch
Python 3.8
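For scale, the figures quoted in this report can be turned into a rough per-file memory cost. This is only back-of-the-envelope arithmetic, assuming decimal gigabytes and that all of the reported RAM is attributable to per-file bookkeeping, which may not be the case.

```python
# Figures quoted in the report; decimal gigabytes are an assumption.
ram_bytes = 4.3 * 10**9
file_count = 1_206_600

# Average memory attributed to each tracked file.
bytes_per_file = ram_bytes / file_count
print(f"{bytes_per_file:.0f} bytes per file")
```

That works out to roughly 3.5 KB per file, which suggests the usage scales with the number of files rather than with the 3.7 TB of data.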