Does k6 deal with the coordinated omission problem?
Gil Tene (Azul Systems) has argued convincingly [1] (slides [2]) that monitoring tools get latency measurement wrong: averaging and the misuse of percentiles hide sudden spikes in the reported timings.
He argues that percentiles simply aren't useful because, statistically, most users issue enough requests to experience response times at or above the 99.99th percentile. All percentiles lie; the only truly useful and realistic number is, in fact, the 100th percentile.
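One way to make the percentile argument concrete is a quick back-of-the-envelope calculation. The sketch below assumes requests are independent (a simplification, not something the talk guarantees) and uses two hypothetical helper names, probAtLeastOneAbove and requestsNeeded, to show how quickly a tail percentile becomes the common experience once a user makes many requests.

```javascript
// Probability that at least one of n independent requests is slower than
// the given percentile (percentile is in percent, e.g. 99 or 99.99).
function probAtLeastOneAbove(percentile, n) {
  const p = percentile / 100;
  return 1 - Math.pow(p, n);
}

// Smallest number of requests for which that probability reaches `target`.
function requestsNeeded(percentile, target) {
  const p = percentile / 100;
  return Math.ceil(Math.log(1 - target) / Math.log(p));
}

// A session of 100 requests sees a worse-than-p99 response about 63% of the time.
console.log(probAtLeastOneAbove(99, 100).toFixed(3)); // 0.634

// Requests needed before a worse-than-p99.99 response is more likely than not
// (a little under 7,000).
console.log(requestsNeeded(99.99, 0.5));
```

Under this independence assumption, the tail is rare per request but common per user, which is why reporting only a median or p95 can look much healthier than what users actually see.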
He also argues that the most revealing way to show latency is with a complete histogram.
I haven't used k6 yet, but I've been looking for a good load-testing tool, and intend to try it out.
This used to be an issue with older k6 versions, but since v0.27.0 we have arrival-rate executors [1] that should address the biggest issue of coordinated omission. And we've always measured the max value of all metrics, even when they are not exported to some external output and we just show the end-of-test summary.
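For anyone who wants to try this, here is a minimal sketch of a script using the constant-arrival-rate executor. The scenario name, the rate and VU numbers, and the target URL are all placeholders chosen for illustration; they would need tuning for a real test.

```javascript
import http from 'k6/http';

export const options = {
  scenarios: {
    // Scenario name is arbitrary.
    open_model: {
      executor: 'constant-arrival-rate',
      rate: 50,            // start 50 iterations ...
      timeUnit: '1s',      // ... every second, independent of response times
      duration: '1m',
      preAllocatedVUs: 20, // VUs created before the test starts
      maxVUs: 200,         // headroom so slow responses do not throttle the rate
    },
  },
};

export default function () {
  // Placeholder target; replace with the system under test.
  http.get('https://example.com/');
}
```

The point of the arrival-rate model is that new iterations keep starting on schedule even while earlier ones are still waiting, so a stall in the target shows up as many slow requests rather than as a quiet gap in the data. If maxVUs is too low for the stall, k6 cannot start the scheduled iterations, so it is worth leaving generous headroom.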
It has been a while since I watched these Gil Tene talks, so I might be missing something, but I think the only remaining task we have is to adopt something like his HDR histogram library [2]. And that's mostly for performance reasons, though we'll probably play around with the correction logic as well.
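For readers unfamiliar with the correction logic mentioned here, the following is a rough standalone sketch of the idea behind HdrHistogram's expected-interval correction: when a response takes longer than the interval at which requests were supposed to be sent, the samples that a blocked load generator never got to record are filled in. This is not k6 or HdrHistogram code; the function names and the sample data are made up for illustration.

```javascript
// Expand one measured latency into the samples a constant-rate load
// generator would have recorded had it not been blocked waiting.
function correctedSamples(latencyMs, expectedIntervalMs) {
  const samples = [latencyMs];
  if (expectedIntervalMs <= 0) return samples;
  for (
    let missing = latencyMs - expectedIntervalMs;
    missing >= expectedIntervalMs;
    missing -= expectedIntervalMs
  ) {
    samples.push(missing);
  }
  return samples;
}

// Nearest-rank percentile over an unsorted array; p is in percent (0 < p <= 100).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical closed-loop run: 99 fast responses and one 1000 ms stall,
// with the test intending to send one request every 10 ms.
const rawLatencies = [...Array(99).fill(1), 1000];
const corrected = rawLatencies.flatMap((ms) => correctedSamples(ms, 10));

console.log(percentile(rawLatencies, 90)); // 1
console.log(percentile(corrected, 90));    // 810
```

In the uncorrected data the one-second stall is a single outlier and p(90) stays at 1 ms. After back-filling the requests that should have been sent during the stall, the same percentile jumps to 810 ms, which is much closer to what clients arriving at a steady rate would have experienced.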
k6 allows the user to choose [1] which metrics are relevant for the particular test. By default, it displays max (equivalent to p(100)), p(95), p(90), min, med, and avg. Users can specify other values, such as p(99.995).
It's also possible to create completely custom metrics[2] to track whatever is relevant to the user.
k6 allows the user to change almost all aspects of execution, tracking, and reporting.
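As a small illustration of both points, the sketch below sets the summaryTrendStats option to add a tail percentile to the end-of-test summary and defines a custom Trend metric. The metric name time_to_first_byte and the target URL are placeholders picked for this example.

```javascript
import http from 'k6/http';
import { Trend } from 'k6/metrics';

// Custom metric; the second argument marks the values as durations.
const ttfb = new Trend('time_to_first_byte', true);

export const options = {
  // Statistics shown for every trend metric in the end-of-test summary.
  summaryTrendStats: ['min', 'med', 'avg', 'p(95)', 'p(99.995)', 'max'],
};

export default function () {
  // Placeholder target; replace with the system under test.
  const res = http.get('https://example.com/');
  // Record time spent waiting for the first byte of the response.
  ttfb.add(res.timings.waiting);
}
```

Keeping max in the list alongside a high percentile makes it easy to see how far the worst case sits from the tail that the percentile reports.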
[1] https://www.youtube.com/watch?v=lJ8ydIuPFeU
[2] https://www.azul.com/files/HowNotToMeasureLatency_LLSummit_N...