Add pick generator specialized for indexed sequences #874
lambdani wants to merge 2 commits into typelevel:main
@lambdani just to clarify: is the suggested behavior different from this one?

    def shuffledPick[T](n: Int, seq: IndexedSeq[T]): Gen[collection.Seq[T]] = {
      Arbitrary.arbitrary[Long].flatMap { seed =>
        val shuffledSeq = new scala.util.Random(seed).shuffle(seq)
        Gen.pick(n, shuffledSeq)
      }
    }

I mean, I realize it works in a different way, but I'm not sure whether there's a difference in the results the two would produce.
No, it should have the same behavior. The only difference should be asymptotic efficiency (O(k log k) vs O(n)). But I don't know whether it's faster in practice for enough use cases, or, even if it is, whether it's worth the added complexity. I could try to write some benchmarks, but it's OK if you think it's not worth it :-).
Added tests to check red-black tree invariants.
rossabaker left a comment
Thanks! A quick benchmark would be interesting to see approximately what size collection is the break-even point.
     *
     * The elements are guaranteed to be permuted in random order.
     */
    def indexedPick[T](n: Int, l: IndexedSeq[T]): Gen[collection.Seq[T]] = {
Names that sort close to their relatives improve discoverability: how about pickIndexed?
    /** A generator that randomly picks a given number of elements from an IndexedSeq
     *
     * The elements are guaranteed to be permuted in random order.
     */
A quick comment on the runtime improvement over pick would be helpful. Perhaps also that it doesn't repeat elements.
Hi! Would there be interest in a pick generator specialized for IndexedSeqs? When choosing k elements from a sequence with n elements, the idea is to choose an index in the inclusive range [0,n-1], then another one in [0,n-2], and so on up to [0,n-k]. These indices must then be translated to the whole range [0,n-1] while avoiding repetitions. For this, one can use a modified version of an order statistic tree that selects the i-th non-negative integer not present in the tree.
This should pick k elements in O(k log k) time, using O(k) extra space for the tree. Additionally, the elements should be permuted in random order.
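The index translation described above can be sketched roughly as follows. This is not the code in this PR. It uses `scala.util.Random` instead of `Gen` to stay self-contained, and it replaces the order statistic tree with a linear scan over a `TreeSet`, so each lookup costs O(k) rather than O(log k). The names `PickIndexedSketch`, `ithMissing`, and `pickIndexed` are hypothetical and only for illustration.

```scala
import scala.collection.immutable.TreeSet
import scala.util.Random

object PickIndexedSketch {
  // Returns the i-th (0-based) non-negative integer that is NOT in `taken`.
  // TreeSet iterates in ascending order, so every taken value at or below the
  // running candidate pushes the candidate up by one.
  // Assumption: a linear scan stands in for the order statistic tree here.
  def ithMissing(i: Int, taken: TreeSet[Int]): Int =
    taken.foldLeft(i)((idx, t) => if (t <= idx) idx + 1 else idx)

  // Picks k distinct elements of `seq`, in random order.
  def pickIndexed[T](k: Int, seq: IndexedSeq[T], rnd: Random): Vector[T] = {
    val n = seq.length
    require(k >= 0 && k <= n, s"cannot pick $k elements out of $n")
    var taken = TreeSet.empty[Int]
    val out = Vector.newBuilder[T]
    for (j <- 0 until k) {
      val r = rnd.nextInt(n - j)      // uniform in [0, n-1-j]
      val idx = ithMissing(r, taken)  // translate to an unused index in [0, n-1]
      taken += idx
      out += seq(idx)
    }
    out.result()
  }
}
```

For example, `PickIndexedSketch.pickIndexed(3, Vector("a", "b", "c", "d"), new Random(1L))` returns three distinct elements of the vector. Only the k chosen indices are stored, which is where the O(k) extra space comes from.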
The names are horrible but I couldn't come up with better ones. Any help with that would be appreciated if you think it's worth adding this generator to ScalaCheck. What do you think?