Skip to content

Deadlock in the timeline handling of the memory store. #508

Closed
@poljar

Description

@poljar

#486 introduces timeline handling/storing for our state stores. Sadly the MemoryStore implementation has a deadlock. You might have noticed CI running for ages and timing out.

The exact line that deadlocks seems to be here:

data.event_id_to_position.insert(event_id, data.start_position);

Commenting it out seems to get things going, though the client::test::room_timeline now fails.

There seems to be some trickery going on which confuses the borrow checker. Namely data from the previous line will be either coming from self.room_timeline from here:

} else if let Some(mut data) = self.room_timeline.get_mut(room) {

Or from, again from self.room_timeline, over here:

self.room_timeline.get_mut(room).unwrap()

The unwrap and the whole if/let block can be replaced with a self.timeline.entry() call without any change to the semantics:

diff --git a/crates/matrix-sdk-base/src/store/memory_store.rs b/crates/matrix-sdk-base/src/store/memory_store.rs
index ef173df72..e9775f768 100644
--- a/crates/matrix-sdk-base/src/store/memory_store.rs
+++ b/crates/matrix-sdk-base/src/store/memory_store.rs
@@ -284,10 +284,9 @@ impl MemoryStore {
                 info!("Save new timeline batch from messages response for {}", room);
             }
 
-            let data = if timeline.limited {
+            if timeline.limited {
                 info!("Delete stored timeline for {} because the sync response was limited", room);
                 self.room_timeline.remove(room);
-                None
             } else if let Some(mut data) = self.room_timeline.get_mut(room) {
                 if !timeline.sync && Some(&timeline.start) != data.end.as_ref() {
                     // This should only happen when a developer adds a wrong timeline
@@ -311,29 +310,19 @@ impl MemoryStore {
                 if delete_timeline {
                     info!("Delete stored timeline for {} because of duplicated events", room);
                     self.room_timeline.remove(room);
-                    None
                 } else if timeline.sync {
                     data.start = timeline.start.clone();
-                    Some(data)
                 } else {
                     data.end = timeline.end.clone();
-                    Some(data)
                 }
-            } else {
-                None
-            };
+            }
 
-            let mut data = &mut *if let Some(data) = data {
-                data
-            } else {
-                let data = TimelineData {
+            let mut data =
+                self.room_timeline.entry(room.to_owned()).or_insert_with(|| TimelineData {
                     start: timeline.start.clone(),
                     end: timeline.end.clone(),
                     ..Default::default()
-                };
-                self.room_timeline.insert(room.to_owned(), data);
-                self.room_timeline.get_mut(room).unwrap()
-            };
+                });
 
             // Create a copy of the events if the stream created via `room_timeline()` isn't
             // fully consumed

Now the borrow checker seems to kick in and we get multiple warnings of multiple mutable borrows of data. This is because, right after we get the data object out of the room_timeline map we mutably borrow it over here:

let data_events = Arc::make_mut(&mut data.events);

I'm not sure why we need to create a copy, or if this Arc::make_mut was meant to replace proper locking, but this code doesn't seem to work the way it was intended.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions