KPI calculated even if too little data supplied #11
Comments
Sorry for the delay! I have been quite busy recently. TL;DR: Everything works as expected, AFAIU. The number of data points you need depends on the bound (upper/lower) that you pick for a given percentile (except for the median, of course). To understand why, let's keep your example: P=99, C=95.
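The asymmetry can be illustrated with a minimal sketch. It assumes the standard order-statistics argument for one-sided bounds on a percentile, which may differ from what TriScale actually implements: the largest of n independent samples lies at or above the P-th percentile with probability 1 - p^n, and the smallest lies at or below it with probability 1 - (1 - p)^n. The helper name `min_samples` is hypothetical and is not part of TriScale.

```python
import math


def min_samples(percentile: float, confidence: float, bound: str) -> int:
    """Smallest n for which a one-sided bound on the percentile exists.

    Hypothetical helper (not a TriScale function). Assumes i.i.d. samples
    and the usual order-statistics argument; percentile and confidence are
    given in percent, strictly between 0 and 100.
    """
    p = percentile / 100.0
    c = confidence / 100.0
    if not (0.0 < p < 1.0 and 0.0 < c < 1.0):
        raise ValueError("percentile and confidence must be in (0, 100)")
    if bound == "upper":
        # The maximum fails to bound the percentile from above
        # with probability p**n.
        q = p
    elif bound == "lower":
        # The minimum fails to bound the percentile from below
        # with probability (1 - p)**n.
        q = 1.0 - p
    else:
        raise ValueError("bound must be 'upper' or 'lower'")
    # Solve q**n <= 1 - c for the smallest integer n (at least 1).
    return max(1, math.ceil(math.log(1.0 - c) / math.log(q)))


print(min_samples(99, 95, "upper"))  # 299
print(min_samples(99, 95, "lower"))  # 1
```

Under these assumptions, P=99 and C=95 need 299 samples for an upper bound but a single sample for a lower bound, which would explain why the lower bound returns a number on a small data set while the upper bound returns NaN. For the median, both bounds need the same number of samples.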
Thanks a lot for a very detailed and enlightening explanation. It makes a lot of sense. I tunnel-visioned, assuming they had the same requirements. Regarding the side-note: You mean in
No need to apologize, I am grateful for you taking the time!
Ah yes, you're right. You'll need to go back to the The two-sided option is reliable. JSYK, I opened a PR ages ago to include this
I looked and played a bit with the two-sided option. I modified
The lower and upper bounds I get are as follows:
I am struggling to reconcile my understanding of CIs and "bounds", the terms in TriScale, and the data I am seeing. I was conflicted, so I added some statements behind the bounds. Could I ask you to comment, clarify, or confirm?
This might be an issue in TriScale, or me misunderstanding a use-case.
TL;DR: `analysis_kpi()` returns a valid value when too few data points are supplied, if the "unintuitive" bound is selected (upper for percentile < 50, and vice versa).
Background: The intuitive way to calculate a KPI is to specify a bound which gives us the "worst case" (upper when percentile > 50, and vice versa). This allows us to make "performance is at least X" statements. However, I was thinking there was information in the other bound as well. This would show the width of the CI, and we could learn whether the given metric varies a lot between runs. The first example that comes to mind is industrial scenarios, where not only the maximum latency is interesting, but also its variability.
With this background, I was routinely calling `analysis_kpi()` twice, once with bound set to upper and once with lower. Doing this, I noticed I would get a valid value when the "unintuitive" bound was selected (upper for percentile < 50, and vice versa), even if I had too little data.
Example with too little data:
With bound set to "upper", the KPI correctly returns NaN. With bound set to "lower", a number is returned.