-
Notifications
You must be signed in to change notification settings - Fork 656
no-std select! #1912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
no-std select! #1912
Conversation
@@ -18,18 +18,15 @@ pub fn shuffle<T>(slice: &mut [T]) { | |||
|
|||
/// Return a value from `0..n`. | |||
fn gen_index(n: usize) -> usize { | |||
(random() % n as u64) as usize | |||
random() % n | |||
} | |||
|
|||
/// Pseudorandom number generator based on [xorshift*]. | |||
/// | |||
/// [xorshift*]: https://en.wikipedia.org/wiki/Xorshift#xorshift* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since different pointer widths use different algorithms it would be good to move this comment to them. It looks like it's currently xorshift64*
, xorshift32
and non-standard xorshift16
(is there a reference for the constants chosen for this? the xorshift paper only appears to have tables for 32 and 64).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this should definitely be documented and a conscious decision made for each. I picked something random just to get the concept up to show, and to discuss whether plain xorshift32 is "good enough" (it even feels like xorshift64*
is overkill but I left it as-is considering it's the existing implementation and unlikely to matter on 64bit targets anyway). The most relevant discussion of 16bit variants I've come across is here, my target platform in mind here is msp430 (are there any other 16bit rust targets?).
fn rng() -> usize { | ||
static RNG: AtomicUsize = AtomicUsize::new(prng_seed()); | ||
|
||
// Preemption here can cause multiple threads to observe the same state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is actually fine-- multiple threads can't be polling the same future at once, so all this means is that the ordering of two unrelated select!
s running at the same time will be the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, though I think I framed the issue poorly in the comment. While I don't think it's a big deal because you will always eventually make progress, and the problem seems somewhat convoluted and only really scales if you have tons of threads, what I believe can happen is:
rng state = 1
thread1: load state = 1
thread2: load state = 1
thread2: shuffle + store state = 2
thread2: load state = 2
thread2: shuffle + store state = 3
thread1: shuffle + store state = 2
thread2: load state = 2 // oh no state was clobbered and the sequence jumped backward from thread2's PoV
thread2: shuffle + store state = 3
Different idea, which is less prescriptive: Offer users a mechanism to pass a random function to select as an additional parameter, and we just call that one. That would work on all platforms. Only on Benefit here is that users on no-std don't have to pay the overhead for atomics and a static variable if not required. We can then also add The downside of this is that |
This would be really nice, yeah.
I believe that would be
While I personally don't care much for my own purposes/interests and would be fine with that api alone, it does seem beneficial to provide a default/fallback implementation if possible. |
While this is less of a blocker now that |
@@ -39,16 +36,75 @@ fn random() -> u64 { | |||
hasher.write_usize(COUNTER.fetch_add(1, Ordering::Relaxed)); | |||
seed = hasher.finish(); | |||
} | |||
seed | |||
seed as usize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, it turns out that DefaultHasher
doesn't actually involve any randomness, so you could replace this with 13646096770106105413
(for the current algorithm backing it).
Alternatively, this could be changed to use RandomState
instead of DefaultHasher
so that there is actually some randomness injected. That should probably be a separate PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, yeah without randomness there's presumably little point in using a hasher at all, the seed generator may as well just be reduced to COUNTER.fetch_add(1)
and give out sequential seeds at that point.
(note the linked fn returns a new seed per thread to initialize TLS with so replacing that literal line with a constant isn't equivalent, unless you meant I should change the const fn underneath to match)
`select!()` is the only macro that requires std due to implementation details, so allow everything else through.
Targets without libstd (and thus thread-local storage) now use global xorshift state for the select order, which has been weakened for <64bit targets.
DefaultHasher is deterministic, so there's little point in using it.
While #1891 adds minimal support for the
async-await
feature flag onno_std
platforms by enabling the trivial macros and pieces that can work without modification,select!
and the rng implementation detail behind it remain the odd one out. This PR aims to address that.To reiterate a few points on why changes are necessary and what design decisions need to be made here, armv6m and some other archs only have
AtomicUsize::{load, store}
rather than full CAS or double width atomics. This requires reducing the xorshift state width depending on target pointer size and also being okay with the state potentially repeating in the cases where calls overlap during the load+store and clobber each other. Given that the randomness requirements here are rather weak (and just in place to prevent executor time/scheduling exhaustion), I believe this shouldn't be much of a problem in practice even under pathological workloads (alsono_std
systems will likely only have one global single-threaded executor anyway?).This is more of an RFC than a draft to pitch whether something along these lines could be accepted. Some potential points I can come up with against it:
no_std
futures are currently unsupported by stable rust and thus effort shouldn't be made to support themno_std
specific logicselect_biased!()
macro won't need any special handling and will work as-is on std and nostd alike while meeting many of the same needs