-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fix signal handler race condition on metrics.write() #2893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix signal handler race condition on metrics.write() #2893
Conversation
e2dcfee
to
b4b5513
Compare
For the second commit I doubt this is the best approach. I am hardly familiar with RUST, but adding acquire/release a lock, on operations done on the hotpath is not OK. Maybe you can take the second commit out, and we fix just the issue with intermittent test. |
There are no locks added. These are just atomic instruction flags that prevent the CPU from reordering loads of a value before stores on the same atomic. More details about this: https://doc.rust-lang.org/nomicon/atomics.html#hardware-reordering |
src/logger/src/metrics.rs
Outdated
let snapshot = self.count(); | ||
let res = serializer.serialize_u64(snapshot as u64 - self.1.load(Ordering::Acquire) as u64); | ||
|
||
if res.is_ok() { | ||
self.1.store(snapshot, Ordering::Relaxed); | ||
self.1.store(snapshot, Ordering::Release); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is correct. If we can serialize from multiple threads, I think snapshot
can be different on each one and the value that ends up being assigned to self.1
in the end (self.1.store(snapshot, Ordering::Release);
) won't necessarily be the most recent one. So we can end up in the situation where self.0 < self.1.
A solution would be to have another atomic bool that is marking whether there is a serialization in progress, and if there is, don't do anything. Although this wouldn't guarantee the serialization of the latest values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.0
is incremented atomically from any thread. If we have Relaxed semantics, it may have different values on multiple threads when serialising.
If using Acquire semantics, like we do in self.count()
, we can be sure that any writes to self.0
with Release
semantics (like we do in add()) are going to be executed and visible before the load.
I believe that the scenario you are describing is possible prior to this PR, due to the Relaxed ordering.
A solution would be to have another atomic bool that is marking whether there is a serialization in progress, and if there is, don't do anything. Although this wouldn't guarantee the serialization of the latest values.
This is essentially simulating a mutex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anyway, I believe that the usage of the metrics system in signal handlers needs a redesign with thread-safety in mind.
Since this second commit is pretty controversial and may be introducing some performance degradation on ARM, I will remove it from this PR to unblock it and I will open a new issue for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would something like the following be possible ?
+-----------------------------------+-----------------------------------+
| Thread 1 | Thread 2 |
+-----------------------------------+-----------------------------------+
| let snapshot = self.count(); -> 5 | ... |
+-----------------------------------+-----------------------------------+
| ... | inc metric -> 6 |
+-----------------------------------+-----------------------------------+
| ... | let snapshot = self.count(); -> 6 |
+-----------------------------------+-----------------------------------+
| ... | store snapshot in self.1 -> 6 |
+-----------------------------------+-----------------------------------+
| store snapshot in self.1 -> 5 | |
+-----------------------------------+-----------------------------------+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this is not possible since we only write the metrics from the VMM thread or the signal handler.
In the signal handler we now only use StoreMetrics so this would not be possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the type is not thread safe than Sync
should not be implemented for it. Otherwise we're breaking the Rust safety principles and Rust cannot protect and benefit of the "fearless concurrency" anymore. The problem seems to be that we're doing more things than we should be doing in a context of the signal handler, so maybe a solution would be to simplify the signal handler. This will come at the expense of maybe metrics being missed, but it is arguably better than opening potentially other safety problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have discussed removing logging and metrics systems from the signal handlers.
One idea is to use specialised files for dumping abrupt exit information from signal and panic handlers (much like a core dump).
This is a separate feature and will be worked on in the future. It requires careful thought and more investment.
This PR addresses the issue surgically, since it is creating much trouble for us with intermittent test failures.
We are not introducing problems with this PR, we are applying a patch over the known occurrence of this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a patch I think it's ok. Do we have an issue for doing a proper fix ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created an issue for it: #2899
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem sufficient to me because if I had to add a new metric I can imagine that it would be very easy to miss these comments. I wouldn't check the comments for the Metrics.write() or serialize() methods. It would be the same even if the serialize() method would be unsafe.
I realised that the signal handler generation macro is hardcoded to use the store
function, which is only defined for StoreMetrics
. It would require a separate function in order to add an IncMetric to a signal handler, like it is for SIGPIPE. This should be even more evident in code reviews
b4b5513
to
ee8c9d6
Compare
The METRICS.write() method operates on the false assumption that it is only being called from one thread (VMM). This is not true since it is called from the signal handlers as well (which may be executed on any thread). The race condition appears due to the fact that metrics state is mutated during serialization for SharedIncMetrics (old value gets assigned the new value after the difference is serialized). The METRICS.write() method only holds the lock when writing to the file, but there is no lock held while serialising the metrics to JSON. VMM THREAD (1/MINUTE) VCPU THREAD (SIGNAL HANDLER) new_val=1 dif=new_val-old_val(=1) old_val=new_val (=1) dif=new_val-old_val (=0) write(dif) (=0) exit() Because of the exit(), the VMM thread no longer gets a chance to write the metric. We now use SharedStoreMetric for deadly signal metrics, which does not have this type of race condition because it is not mutated on serialization. It also does not make sense to have a SharedIncMetric for deadly signal metrics, which can only ever be 0 or 1. This was the root-cause of the intermittent failure in test_custom_seccomp.py::test_failing_fiter Signed-off-by: alindima <[email protected]>
ee8c9d6
to
f7b7813
Compare
Reason for This PR
FIxes the root cause of intermittent failure of
test_custom_seccomp.py::test_failing_fiter
View commit messages.
Description of Changes
View commit messages.
rust-vmm
.License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license.
PR Checklist
[Author TODO: Meet these criteria.]
[Reviewer TODO: Verify that these criteria are met. Request changes if not]
git commit -s
).unsafe
code is properly documented.CHANGELOG.md
.