Skip to content

Deadlock in the timeline handling of the memory store. #508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
poljar opened this issue Feb 28, 2022 · 3 comments
Closed

Deadlock in the timeline handling of the memory store. #508

poljar opened this issue Feb 28, 2022 · 3 comments
Assignees

Comments

@poljar
Copy link
Contributor

poljar commented Feb 28, 2022

#486 introduces timeline handling/storing for our state stores. Sadly the MemoryStore implementation has a deadlock. You might have noticed CI running for ages and timing out.

The exact line that deadlocks seems to be here:

data.event_id_to_position.insert(event_id, data.start_position);

Commenting it out seems to get things going, though the client::test::room_timeline now fails.

There seems to be some trickery going on which confuses the borrow checker. Namely data from the previous line will be either coming from self.room_timeline from here:

} else if let Some(mut data) = self.room_timeline.get_mut(room) {

Or from, again from self.room_timeline, over here:

self.room_timeline.get_mut(room).unwrap()

The unwrap and the whole if/let block can be replaced with a self.timeline.entry() call without any change to the semantics:

diff --git a/crates/matrix-sdk-base/src/store/memory_store.rs b/crates/matrix-sdk-base/src/store/memory_store.rs
index ef173df72..e9775f768 100644
--- a/crates/matrix-sdk-base/src/store/memory_store.rs
+++ b/crates/matrix-sdk-base/src/store/memory_store.rs
@@ -284,10 +284,9 @@ impl MemoryStore {
                 info!("Save new timeline batch from messages response for {}", room);
             }
 
-            let data = if timeline.limited {
+            if timeline.limited {
                 info!("Delete stored timeline for {} because the sync response was limited", room);
                 self.room_timeline.remove(room);
-                None
             } else if let Some(mut data) = self.room_timeline.get_mut(room) {
                 if !timeline.sync && Some(&timeline.start) != data.end.as_ref() {
                     // This should only happen when a developer adds a wrong timeline
@@ -311,29 +310,19 @@ impl MemoryStore {
                 if delete_timeline {
                     info!("Delete stored timeline for {} because of duplicated events", room);
                     self.room_timeline.remove(room);
-                    None
                 } else if timeline.sync {
                     data.start = timeline.start.clone();
-                    Some(data)
                 } else {
                     data.end = timeline.end.clone();
-                    Some(data)
                 }
-            } else {
-                None
-            };
+            }
 
-            let mut data = &mut *if let Some(data) = data {
-                data
-            } else {
-                let data = TimelineData {
+            let mut data =
+                self.room_timeline.entry(room.to_owned()).or_insert_with(|| TimelineData {
                     start: timeline.start.clone(),
                     end: timeline.end.clone(),
                     ..Default::default()
-                };
-                self.room_timeline.insert(room.to_owned(), data);
-                self.room_timeline.get_mut(room).unwrap()
-            };
+                });
 
             // Create a copy of the events if the stream created via `room_timeline()` isn't
             // fully consumed

Now the borrow checker seems to kick in and we get multiple warnings of multiple mutable borrows of data. This is because, right after we get the data object out of the room_timeline map we mutably borrow it over here:

let data_events = Arc::make_mut(&mut data.events);

I'm not sure why we need to create a copy, or if this Arc::make_mut was meant to replace proper locking, but this code doesn't seem to work the way it was intended.

@poljar
Copy link
Contributor Author

poljar commented Feb 28, 2022

cc @jsparber

@jsparber
Copy link
Contributor

jsparber commented Mar 1, 2022

I'm not sure why we need to create a copy, or if this Arc::make_mut was meant to replace proper locking, but this code doesn't seem to work the way it was intended.

The copy is needed so that the stream created with room_timline() can be sure that the items in data.events aren't changed till the stream is fully consumed.

@gnunicorn gnunicorn self-assigned this Mar 1, 2022
@poljar
Copy link
Contributor Author

poljar commented Mar 3, 2022

The deadlock was fixed by #509. Closing.

@poljar poljar closed this as completed Mar 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants