WIP/RFC: shift_remove and friends #558

Open
wants to merge 18 commits into
base: main
Changes from 5 commits
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -8,7 +8,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
## [Unreleased]

### Added

- Added `shift_remove`, `shift_remove_entry`, `shift_remove_full`, `shift_remove_index` to `IndexMap`.
- Added `bytes::Buf` and `bytes::BufMut` implementations for `Vec`.
- Added `format` macro.
- Added `String::from_utf16`.
228 changes: 198 additions & 30 deletions src/index_map.rs
@@ -302,43 +302,65 @@ where
    where
        F: FnMut(&mut K, &mut V) -> bool,
    {
        const INIT: Option<Pos> = None;

        self.entries
            .retain_mut(|entry| keep(&mut entry.key, &mut entry.value));

        if self.entries.len() < self.indices.len() {
            for index in self.indices.iter_mut() {
                *index = INIT;
            }
            self.after_removal();
        }
    }

    fn shift_remove_index(&mut self, index: usize) -> Option<(K, V)> {
        if index >= self.entries.len() {
            return None;
        }

        let bucket = self.entries.remove(index);

        self.after_removal();

        Some((bucket.key, bucket.value))
    }

    fn shift_remove_found(&mut self, _probe: usize, found: usize) -> (K, V) {
        let entry = self.entries.remove(found);

        self.after_removal(); /* Todo: pass probe if this starts taking an index parameter */

        (entry.key, entry.value)
    }

            for (index, entry) in self.entries.iter().enumerate() {
                let mut probe = entry.hash.desired_pos(Self::mask());
                let mut dist = 0;

                probe_loop!(probe < self.indices.len(), {
                    let pos = &mut self.indices[probe];

                    if let Some(pos) = *pos {
                        let entry_hash = pos.hash();

                        // robin hood: steal the spot if it's better for us
                        let their_dist = entry_hash.probe_distance(Self::mask(), probe);
                        if their_dist < dist {
                            Self::insert_phase_2(
                                &mut self.indices,
                                probe,
                                Pos::new(index, entry.hash),
                            );
                            break;
                        }
                    } else {
                        *pos = Some(Pos::new(index, entry.hash));
    // Todo: Should this take in a parameter to allow it to only process the moved
    // elements?
Member

Yes. Technically it's O(n) in both cases, but one is O(len()) while the other is O(len() / 2).
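
To make the two costs concrete, here is a small sketch. The helper names `moved_entries` and `average_moved` are hypothetical and not part of this PR. It assumes that removing the entry at position `i` only moves the entries after `i`, so averaged over all positions roughly half of the entries need their stored index rewritten, while a full rebuild touches every remaining entry.

```rust
/// Number of entries whose position changes when the entry at `removed` is
/// taken out of an ordered list of `len` entries: everything after it moves
/// down by one. (Hypothetical helper, for illustration only.)
fn moved_entries(len: usize, removed: usize) -> usize {
    assert!(removed < len);
    len - removed - 1
}

/// Average of `moved_entries` over every possible removal position.
fn average_moved(len: usize) -> f64 {
    let total: usize = (0..len).map(|i| moved_entries(len, i)).sum();
    total as f64 / len as f64
}
```

For `len = 8`, a full rebuild reprocesses the 7 remaining entries, whereas `average_moved(8)` is 3.5.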

Contributor Author

Done (see subsequent push). I don't fully understand the hashing logic, so I may have missed a subtlety.

Contributor Author

This is wrong, fix incoming.

Member

Please also add another test for the wrong logic.

Contributor Author

I've added a test for the wrong logic. However, I'm having a hard time figuring out what the correct logic should be.

I can replace only the affected entries' slots in `indices` with `None`, and restrict the recalculation to only the affected entries. But that fails too, because we would also need to rewrite "unaffected" entries that would have "preferred" a slot that has since been vacated. (There's no longer a mismatched entry in that slot, so a probe finds it empty, concludes that no value is present, and the lookup fails.)

I can revert the "optimization" of after_removal to again re-process everything, and the test then passes. But we lose the putative optimization opportunity.
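
The failure mode can be reproduced with a toy table. This is a minimal sketch under stated assumptions, not the code in this PR: `ToyIndex` is a hypothetical linear-probing table where a key's preferred slot is `key % CAP`, with no robin hood stealing and no separate entries list.

```rust
const CAP: usize = 8;

/// Toy open-addressing index (hypothetical): each slot holds a key, and a
/// key's preferred slot is `key % CAP`. Collisions probe linearly forward.
struct ToyIndex {
    slots: [Option<u32>; CAP],
}

impl ToyIndex {
    fn new() -> Self {
        Self { slots: [None; CAP] }
    }

    /// Returns `false` if the table is full.
    fn insert(&mut self, key: u32) -> bool {
        let start = key as usize % CAP;
        for offset in 0..CAP {
            let probe = (start + offset) % CAP;
            if self.slots[probe].is_none() {
                self.slots[probe] = Some(key);
                return true;
            }
        }
        false
    }

    fn contains(&self, key: u32) -> bool {
        let start = key as usize % CAP;
        for offset in 0..CAP {
            match self.slots[(start + offset) % CAP] {
                Some(found) if found == key => return true,
                Some(_) => continue,
                // An empty slot ends the search: the key would have landed here.
                None => return false,
            }
        }
        false
    }
}

/// Returns whether key 9 is found before and after resetting only slot 1.
fn lookup_after_partial_reset() -> (bool, bool) {
    let mut index = ToyIndex::new();
    index.insert(1); // lands in its preferred slot 1
    index.insert(9); // also prefers slot 1, so it is displaced to slot 2
    let before = index.contains(9);
    index.slots[1] = None; // reset only the "affected" slot
    let after = index.contains(9);
    (before, after)
}
```

Here `lookup_after_partial_reset()` returns `(true, false)`: key 9 is still stored in slot 2, but the probe stops at the now-empty slot 1 and never reaches it.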

Contributor Author

As a test, I'm going to try the re-hashing logic from remove_found. It doesn't have the "robin hood" logic in it, but when I tried just leaving that logic intact I got an infinite loop. (Or I can abandon the efficiency quest for now.)

Contributor Author

I put the logic from `retain` back inline; I don't need to solve the problem of arbitrary removals, just a single removal. I switched to having just `CoreMap::shift_remove_found`, which now mirrors `remove_found`, plus an extra step to "shift" the appropriate index values after fixing up the removed slot. All `IndexMap::shift_*` operations are now implemented in terms of `shift_remove_found`. However, this leaves moderate code duplication between `remove_found` and `shift_remove_found`.
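
The extra "shift" step can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code in the push: slots are modeled as a plain `[Option<usize>]` holding entry positions, the names `shift_indices` and `shift_remove_sketch` are hypothetical, and the probe-chain fix-up that `remove_found` performs for the vacated slot is deliberately left out.

```rust
/// After the entry at `found` has been removed from the ordered entries list,
/// every slot that pointed past `found` must point one position earlier.
/// (Hypothetical helper; slots are simplified to bare entry positions.)
fn shift_indices(indices: &mut [Option<usize>], found: usize) {
    for slot in indices.iter_mut() {
        if let Some(index) = slot {
            if *index > found {
                *index -= 1;
            }
        }
    }
}

/// Removes `entries[found]`, preserving the order of the remaining entries.
/// The probe-chain repair for the vacated slot is NOT modeled here.
fn shift_remove_sketch(
    entries: &mut Vec<&'static str>,
    indices: &mut [Option<usize>],
    found: usize,
) -> &'static str {
    // Vacate the slot that pointed at the removed entry.
    for slot in indices.iter_mut() {
        if *slot == Some(found) {
            *slot = None;
        }
    }
    // `Vec::remove` shifts the tail down by one, like the `shift_*` methods.
    let removed = entries.remove(found);
    shift_indices(indices, found);
    removed
}
```

The slot pass visits every slot, so the operation stays O(n) overall, but only the slots pointing past `found` are rewritten and nothing is re-hashed.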

    fn after_removal(&mut self) {
        const INIT: Option<Pos> = None;

        // Clear every slot, then re-insert all remaining entries.
        for index in self.indices.iter_mut() {
            *index = INIT;
        }

        for (index, entry) in self.entries.iter().enumerate() {
            let mut probe = entry.hash.desired_pos(Self::mask());
            let mut dist = 0;

            probe_loop!(probe < self.indices.len(), {
                let pos = &mut self.indices[probe];

                if let Some(pos) = *pos {
                    let entry_hash = pos.hash();

                    // robin hood: steal the spot if it's better for us
                    let their_dist = entry_hash.probe_distance(Self::mask(), probe);
                    if their_dist < dist {
                        Self::insert_phase_2(&mut self.indices, probe, Pos::new(index, entry.hash));
                        break;
                    }
                } else {
                    *pos = Some(Pos::new(index, entry.hash));
                    break;
                }
                dist += 1;
            });
        }
    }

@@ -1231,6 +1253,152 @@ where
            .map(|(probe, found)| self.core.remove_found(probe, found).1)
    }

    /// Remove the key-value pair at position `index` and return it.
    ///
    /// Like [`Vec::remove`], the pair is removed by shifting all of the
    /// elements that follow it. This maintains the remaining elements'
    /// relative insertion order, but is a more expensive operation.
    ///
    /// Return `None` if `index` is not in `0..len()`.
    ///
    /// Computes in *O*(n) time (average).
    ///
    /// # Examples
    ///
    /// ```
    /// use heapless::index_map::FnvIndexMap;
    ///
    /// let mut map = FnvIndexMap::<_, _, 8>::new();
    /// map.insert(3, "a").unwrap();
    /// map.insert(2, "b").unwrap();
    /// map.insert(1, "c").unwrap();
    /// let removed = map.shift_remove_index(1);
    /// assert_eq!(removed, Some((2, "b")));
    /// assert_eq!(map.len(), 2);
    ///
    /// let mut iter = map.iter();
    /// assert_eq!(iter.next(), Some((&3, &"a")));
    /// assert_eq!(iter.next(), Some((&1, &"c")));
    /// assert_eq!(iter.next(), None);
    /// ```
    pub fn shift_remove_index(&mut self, index: usize) -> Option<(K, V)> {
        self.core.shift_remove_index(index)
    }

    /// Remove the key-value pair equivalent to `key` and return it and
    /// the index it had.
    ///
    /// Like [`Vec::remove`], the pair is removed by shifting all of the
    /// elements that follow it. This maintains the remaining elements'
    /// relative insertion order, but is a more expensive operation.
    ///
    /// Return `None` if `key` is not in map.
    ///
    /// Computes in **O(n)** time (average).
    ///
    /// # Examples
    ///
    /// ```
    /// use heapless::index_map::FnvIndexMap;
    ///
    /// let mut map = FnvIndexMap::<_, _, 8>::new();
    /// map.insert(3, "a").unwrap();
    /// map.insert(2, "b").unwrap();
    /// map.insert(1, "c").unwrap();
    /// let removed = map.shift_remove_full(&2);
    /// assert_eq!(removed, Some((1, 2, "b")));
    /// assert_eq!(map.len(), 2);
    /// assert_eq!(map.shift_remove_full(&2), None);
    ///
    /// let mut iter = map.iter();
    /// assert_eq!(iter.next(), Some((&3, &"a")));
    /// assert_eq!(iter.next(), Some((&1, &"c")));
    /// assert_eq!(iter.next(), None);
    /// ```
    pub fn shift_remove_full<Q>(&mut self, key: &Q) -> Option<(usize, K, V)>
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        self.find(key).map(|(probe, found)| {
            let (k, v) = self.core.shift_remove_found(probe, found);
            (found, k, v)
        })
    }

    /// Remove and return the key-value pair equivalent to `key`.
    ///
    /// Like [`Vec::remove`], the pair is removed by shifting all of the
    /// elements that follow it. This maintains the remaining elements'
    /// relative insertion order, but is a more expensive operation.
    ///
    /// Return `None` if `key` is not in map.
    ///
    /// Computes in **O(n)** time (average).
    ///
    /// # Examples
    ///
    /// ```
    /// use heapless::index_map::FnvIndexMap;
    ///
    /// let mut map = FnvIndexMap::<_, _, 8>::new();
    /// map.insert(3, "a").unwrap();
    /// map.insert(2, "b").unwrap();
    /// map.insert(1, "c").unwrap();
    /// let removed = map.shift_remove_entry(&2);
    /// assert_eq!(removed, Some((2, "b")));
    /// assert_eq!(map.len(), 2);
    /// assert_eq!(map.shift_remove_entry(&2), None);
    ///
    /// let mut iter = map.iter();
    /// assert_eq!(iter.next(), Some((&3, &"a")));
    /// assert_eq!(iter.next(), Some((&1, &"c")));
    /// assert_eq!(iter.next(), None);
    /// ```
    pub fn shift_remove_entry<Q>(&mut self, key: &Q) -> Option<(K, V)>
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        self.shift_remove_full(key).map(|(_idx, k, v)| (k, v))
    }

    /// Remove the key-value pair equivalent to `key` and return
    /// its value.
    ///
    /// Like [`Vec::remove`], the pair is removed by shifting all of the
    /// elements that follow it, preserving their relative order.
    /// **This perturbs the index of all of those elements!**
    ///
    /// Return `None` if `key` is not in map.
    ///
    /// Computes in **O(n)** time (average).
    ///
    /// # Examples
    ///
    /// ```
    /// use heapless::index_map::FnvIndexMap;
    ///
    /// let mut map = FnvIndexMap::<_, _, 8>::new();
    /// map.insert(3, "a").unwrap();
    /// map.insert(2, "b").unwrap();
    /// map.insert(1, "c").unwrap();
    /// let removed = map.shift_remove(&2);
    /// assert_eq!(removed, Some("b"));
    /// assert_eq!(map.len(), 2);
    /// assert_eq!(map.shift_remove(&2), None);
    ///
    /// let mut iter = map.iter();
    /// assert_eq!(iter.next(), Some((&3, &"a")));
    /// assert_eq!(iter.next(), Some((&1, &"c")));
    /// assert_eq!(iter.next(), None);
    /// ```
    pub fn shift_remove<Q>(&mut self, key: &Q) -> Option<V>
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        self.shift_remove_full(key).map(|(_idx, _k, v)| v)
    }

    /// Retains only the elements specified by the predicate.
    ///
    /// In other words, remove all pairs `(k, v)` for which `f(&k, &mut v)` returns `false`.