Replying to myself with an update. The hypothesis of a few users here is correct, hard as it is to believe. The latency spike happens because, even with just the 50 clients of the benchmark, it is possible to touch all the huge pages composing the process within a single event loop iteration. This is why I was observing the initial spike and nothing after it. This seemed unrealistic to me with 50 clients, but then I remembered that one of the Redis optimizations is to try to serve multiple requests for the same client in the same event loop cycle if many are queued. So this is what happens: 50 clients x N queued requests = enough requests to touch all the memory pages, or at least a very significant percentage of them, so the process blocks for a long time once and never again.
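
To make the "50 clients x N queued requests" arithmetic concrete, here is a rough back-of-the-envelope sketch. It is not taken from Redis: it assumes every request touches one uniformly random huge page, and the 2 GB process size, 2 MB huge page size, and values of N are made-up illustrative numbers.

```python
def expected_fraction_touched(num_pages: int, clients: int, queued_per_client: int) -> float:
    """Expected fraction of distinct pages hit in one event loop iteration.

    Assumption (not from the original post): each request touches exactly
    one page, chosen uniformly at random among num_pages.
    """
    accesses = clients * queued_per_client
    # Probability that a given page is missed by every access is
    # (1 - 1/num_pages) ** accesses; the complement is the touched fraction.
    return 1.0 - (1.0 - 1.0 / num_pages) ** accesses


if __name__ == "__main__":
    # Hypothetical sizes: a 2 GB process made of 2 MB huge pages.
    pages = (2 * 1024) // 2  # 1024 huge pages
    for n in (1, 10, 100):
        frac = expected_fraction_touched(pages, clients=50, queued_per_client=n)
        print(f"N={n:>3}: about {frac:.1%} of huge pages touched")
```

Under these assumptions, one request per client touches only about 5% of the pages, while 100 queued requests per client touch more than 99% of them in a single iteration, which would be consistent with one long stall and nothing afterwards.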