Multiple metagrid node outage #729
Comments
This was a tricky one, and the resiliency issue has become intertwined with Nimbus services, specifically for node status at the moment. I will keep this open until node status is restored, as a reminder that we have a workaround in place that disables that functionality.
So there's a Nimbus dependency (not even the SOLR index) with https://esgf-node.cels.anl.gov/ and https://esgf-node.ornl.gov, ouch. It would be great to remove that dependency ASAP; these DDoS/heavy requests bringing down the LLNL index are becoming repetitive.
I was curious where we are today, so I checked: https://esgf-node.ornl.gov/search https://esgf-node.cels.anl.gov/search
@znichollscr I believe this is the current estimate for
With Katharina out sick this week, there has been no movement on the DKRZ upgrade.
It seems the LLNL ESGF outage this morning was concurrent with outages at ANL and ORNL, though DKRZ was functioning.
Just thinking aloud: is there a metagrid config that needs changing to provide more cross-node resilience to outages?