From 5f69acd3b7cbb72a5f0f42878d9cbc17796dbe84 Mon Sep 17 00:00:00 2001
From: Giorgos Stamatelatos
Date: Sat, 4 Jul 2020 14:52:49 +0300
Subject: [PATCH] Note about determinism

ChaoSampling is not deterministic because it uses TreeSet.descendingIterator, which arbitrarily breaks ties.
---
 src/main/java/gr/james/sampling/package-info.java | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/main/java/gr/james/sampling/package-info.java b/src/main/java/gr/james/sampling/package-info.java
index 4244d11..f398f92 100644
--- a/src/main/java/gr/james/sampling/package-info.java
+++ b/src/main/java/gr/james/sampling/package-info.java
@@ -38,6 +38,12 @@
  * the implementation table below. Implementations may also define certain restrictions on the values of {@code weight}
  * and violations will result in {@link gr.james.sampling.IllegalWeightException}. The weight ranges are also available
  * in the table.
+ *

Determinism

+ * Certain implementations rely on elements of the JRE that may break ties arbitrarily, for example
+ * {@link java.util.PriorityQueue} and {@link java.util.TreeSet}. As a side effect, weighted algorithms are not
+ * deterministic either, because they typically rely on these data structures. The effect is most pronounced in
+ * {@link gr.james.sampling.ChaoSampling}, where, in the presence of ties, the same seed and the same weighted elements
+ * can still produce different samples.
 *

Precision

 * Many implementations have an accumulating state which causes the precision of the algorithms to degrade as the stream
 * becomes bigger. An example might be a variable state which strictly increases or decreases as elements are read from