mLab failures are non-graceful, need better error handling #4188
Comments
(I personally don't have any familiarity at all with the db interface code, or even where all the interconnected parts are located. Would be great to have some expert eyes on this to get the ball rolling at least.)
@unsoluble thanks for creating an issue. I don't have a full mLab db so I can't test this.
Oh hey, I didn't notice #4004 there; maybe this is moot? I guess we'll see once 0.11 rolls out and gets used in the wild, but if I'm reading those changes correctly we might not have this problem any more.
To make the code more resilient against database-full errors, a key need is the Nightscout console output from the moment it crashes due to a full database. I know that when it happens, I'm focused on getting Nightscout operational again rather than troubleshooting. It's amazing how much we depend on something we didn't know anything about a year ago! If anybody is willing to help by providing log contents, it helps to have already gone through the steps to get to the console output before the event happens. On Heroku, you can view the console output by installing the free Papertrail add-on in the Heroku dashboard. You can also get to the output using the Heroku command line interface (CLI), though the CLI is more difficult to set up on a computer. Since #4004, Nightscout hasn't crashed on us or asked for an API_SECRET when the database fills up, but that doesn't mean there aren't other paths that could cause a crash that we just haven't hit yet. 😄 I have on my list of things to do at some point to create a test application that fills up each collection individually, to test how Nightscout responds to the resulting database errors. My goal is to implement the items below over time.
After having debugged a few crashed instances, it looks like part of the problem is that most Nightscout users run an old release that also doesn't handle the Mongo connection breaking. So figuring out how to get users to update would also be good.
I successfully tested filling the From what I can tell,
@danamlewis and I pointed @Dave9111 to some documentation on these problems. Gitter discussion starts here: https://gitter.im/nightscout/cgm-remote-monitor?at=5c324b145ec8fe5a85100157
I tested NS with a full
Regardless of the status of this issue, it might be user-friendly to add a task that detects how close to full the database is and issues a warning when the available space falls below a threshold, so the user can try to fix it before it runs out. A graceful failure is always valuable, especially to developers, but preventing the failure in the first place would be even better. Maybe even two thresholds and two warnings, one a week away and the other a 24-hour warning. Or, if it is too hard to predict, then maybe a 90% full warning followed by a 99% full warning.
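To make the two-threshold idea concrete, here is a minimal sketch of the 90% / 99% variant. It is not Nightscout code: the function name, the shape of the return value, and the idea of counting `dataSize + indexSize` against the quota are all assumptions for illustration. `dataSize` and `indexSize` are fields reported by MongoDB's `dbStats` command; how the hosting provider actually measures usage against its quota would need to be checked.

```javascript
// Hypothetical helper (not part of Nightscout): classify how full the
// database is, given a dbStats-like object and the plan's quota in bytes.
// Assumption: usage is approximated as dataSize + indexSize.
function storageWarningLevel(stats, quotaBytes, thresholds = { warn: 0.90, urgent: 0.99 }) {
  const usedBytes = (stats.dataSize || 0) + (stats.indexSize || 0);
  const fraction = usedBytes / quotaBytes;

  // Check the stricter threshold first so 99% never reports as a mere warning.
  if (fraction >= thresholds.urgent) {
    return { level: 'urgent', fraction };
  }
  if (fraction >= thresholds.warn) {
    return { level: 'warn', fraction };
  }
  return { level: 'ok', fraction };
}

// Example: 460 MB used out of an assumed 500 MB quota is 92% full.
const result = storageWarningLevel({ dataSize: 400e6, indexSize: 60e6 }, 500e6);
console.log(result.level); // prints "warn"
```

A periodic task could call something like this and surface the level to the user. The time-based variant (a week away, 24 hours away) would additionally need a growth-rate estimate, which this sketch does not attempt.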
When a user's free-tier mLab storage becomes full (and possibly on other mLab-related failures), the Nightscout frontend does not currently handle this gracefully; the typical symptom is a repeated request for the user's API_SECRET.
We should investigate the error handling here, and ideally present an actionable suggestion to the user (along the lines of "it looks like your mLab is full, here's how to empty it"). Or at the very least fail more silently.
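As a rough illustration of the "actionable suggestion" idea, the sketch below maps a database error to a user-facing message. Everything here is hypothetical: the function name, the return shape, and especially the assumption that a full database surfaces as an error whose message contains "quota exceeded". The real error text should be confirmed from console output captured during an actual failure.

```javascript
// Hypothetical helper (not part of Nightscout): turn a database error into
// something the frontend could show instead of re-prompting for API_SECRET.
// Assumption: a full database produces an error message containing
// "quota exceeded"; verify this against real logs before relying on it.
function describeStorageError(err) {
  const text = String((err && err.message) || err || '');

  if (/quota exceeded/i.test(text)) {
    return {
      kind: 'database-full',
      userMessage: 'It looks like your database is full. Free up space and reload.'
    };
  }

  // Anything unrecognized stays generic rather than guessing at a cause.
  return {
    kind: 'unknown',
    userMessage: 'A database error occurred. Check the server log for details.'
  };
}

console.log(describeStorageError(new Error('quota exceeded')).kind); // prints "database-full"
```

Keeping the classification in one small function would make it easy to add further cases (for example, a broken connection) once real error output is available.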