
Commit 28cca28

Improve make_hash function
The `make_hash` function prevents hashes of non-empty buckets from colliding with `EMPTY_HASH = 0u64`. Ideally this function also preserves the uniform distribution of hashes and is cheap to compute. The new implementation reduces the input hash size by one bit, simply by setting the most significant bit. This prevents output hashes from colliding with `EMPTY_HASH` and preserves the uniform distribution over the remaining 63 bits. Moreover, the new function is simpler (no comparisons, just an OR) and, under the same assumption as the old function (only the least significant bits contribute to the bucket index), causes no additional collisions.
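The two properties claimed above can be checked in isolation. The following is a minimal sketch in current Rust, not the code from this commit: `make_safe` and `bucket_index` are hypothetical stand-ins that work on a raw `u64` instead of a `Hasher`, and the power-of-two capacity is an assumption made to model "only the least significant bits contribute to the bucket index".

```rust
/// Value reserved for empty buckets (`EMPTY_HASH` in the commit message).
const EMPTY_HASH: u64 = 0;

/// Most significant bit of a 64-bit word.
const MSB: u64 = 0x8000_0000_0000_0000;

/// Hypothetical stand-in for the new `make_hash` body: unconditionally
/// set the MSB, so the result can never equal `EMPTY_HASH`.
fn make_safe(raw: u64) -> u64 {
    MSB | raw
}

/// Hypothetical bucket-index computation. Assumes `capacity` is a power
/// of two, so only the low bits of the hash select the bucket.
fn bucket_index(hash: u64, capacity: u64) -> u64 {
    debug_assert!(capacity.is_power_of_two());
    hash & (capacity - 1)
}
```

For any capacity below 2^63, `bucket_index(make_safe(h), capacity)` equals `bucket_index(h, capacity)`, because the OR only touches bit 63, which the index mask never reads.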
1 parent fc2ba13 commit 28cca28

1 file changed: 6 additions, 8 deletions

src/libstd/collections/hash/table.rs

Lines changed: 6 additions & 8 deletions
@@ -1,4 +1,4 @@
-// Copyright 2014 The Rust Project Developers. See the COPYRIGHT
+// Copyright 2014-2015 The Rust Project Developers. See the COPYRIGHT
 // file at the top-level directory of this distribution and at
 // http://rust-lang.org/COPYRIGHT.
 //
@@ -139,13 +139,11 @@ impl SafeHash {
 /// This function wraps up `hash_keyed` to be the only way outside this
 /// module to generate a SafeHash.
 pub fn make_hash<Sized? T: Hash<S>, S, H: Hasher<S>>(hasher: &H, t: &T) -> SafeHash {
-    match hasher.hash(t) {
-        // This constant is exceedingly likely to hash to the same
-        // bucket, but it won't be counted as empty! Just so we can maintain
-        // our precious uniform distribution of initial indexes.
-        EMPTY_BUCKET => SafeHash { hash: 0x8000_0000_0000_0000 },
-        h => SafeHash { hash: h },
-    }
+    // We need to avoid 0u64 in order to prevent collisions with
+    // EMPTY_HASH. We can maintain our precious uniform distribution
+    // of initial indexes by unconditionally setting the MSB,
+    // effectively reducing 64-bits hashes to 63 bits.
+    SafeHash { hash: 0x8000_0000_0000_0000 | hasher.hash(t) }
 }
 
 // `replace` casts a `*u64` to a `*SafeHash`. Since we statically
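
To compare the removed and added bodies side by side, here is a rough sketch in current Rust. It is not the code from `table.rs`: `old_make_safe`, `new_make_safe`, and `low_bits` are hypothetical helpers that take a raw `u64` in place of `hasher.hash(t)`, and `low_bits` is an assumed model of how a table derives an index from the low bits of a hash.

```rust
/// Value reserved for empty buckets.
const EMPTY_HASH: u64 = 0;

/// Most significant bit of a 64-bit word.
const MSB: u64 = 0x8000_0000_0000_0000;

/// Sketch of the removed body: only the reserved value is remapped,
/// which costs a comparison on every call.
fn old_make_safe(raw: u64) -> u64 {
    match raw {
        EMPTY_HASH => MSB,
        h => h,
    }
}

/// Sketch of the added body: a single OR, no comparison.
fn new_make_safe(raw: u64) -> u64 {
    MSB | raw
}

/// Hypothetical index helper: keeps the lowest `bits` bits of `hash`.
/// Assumes `bits < 64`.
fn low_bits(hash: u64, bits: u32) -> u64 {
    hash & ((1u64 << bits) - 1)
}
```

Both versions map a raw hash of 0 to `0x8000_0000_0000_0000`. For every other input they can differ only in bit 63, so `low_bits` returns the same value for both whenever `bits` is at most 63.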
