Is there a document that lists out Redis best practices like "Redis is very fast as long as you use O(1) and O(log_N) commands"?
Sure, it's probably all obvious things, but it would be nice to have a checklist to skim over, to be sure I haven't forgotten any major consideration when designing a new system.
Hello nxb (but this comment also addresses jnewland's post). The documentation lists the time complexity of each command, and for certain commands like KEYS there is an explicit warning about not using them in production.
However, the best tool for debugging whether something is wrong with a given deployment is the LATENCY command. Once latency monitoring is enabled, calling LATENCY DOCTOR makes Redis create an automatic report about all the possible sources of latency in a given application/setup, by collecting and analyzing live data.
The first section of the report is a TL;DR synopsis listing everything of major importance.
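For reference, enabling latency monitoring and asking for the report looks like this from redis-cli (the 100 ms threshold here is just an example value; "command" is one of the built-in latency event names):

```
127.0.0.1:6379> CONFIG SET latency-monitor-threshold 100
OK
127.0.0.1:6379> LATENCY DOCTOR
127.0.0.1:6379> LATENCY HISTORY command
```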
In general the "slow log" feature, the latency doctor and its reporting abilities, the software watchdog, and the documentation, which explicitly provides the time complexity of each command, should be quite a good set of tools and information for avoiding problems. Obviously, as with any popular tool, there is a percentage of users that will not read the docs and will just deploy it. This is unavoidable, but at least when they run into issues we try to provide them with the tools they need.
Those tools are certainly useful! In the recent production incident I mentioned, we were able to quickly use SLOWLOG to determine which calls were causing the latency spikes. Thanks deeply for providing all of these useful tools and docs.
> there is a percentage of users that will not read the doc and just deploy it
I wonder what this percentage is? I'd wager that it is higher than you might have anticipated, especially given the other comments on this thread.
As an engineer on a team ultimately responsible for the availability of a production service, it's my responsibility to ensure that the percentage of engineers that know the latency side effects of any Redis calls they make is near 100%. In the presence of such variable latency, any means of making that variability more obvious to all users of Redis would be a positive step towards happy users and operators.
I understand your point of view. Unfortunately I think Redis in this regard is a tool where it's hard to help: I try to provide documentation, but it's the kind of tool that looks superficially easy to use, yet you need some understanding in order to really use it effectively and deploy it safely. This is partly because it 1) uses uncommon tradeoffs and 2) is a set of commands without a predefined use case, so there are tons of things to invent, on both the good and the bad side :-)
This line jumped out to me as well. After a recent production incident determined to have been heavily influenced by several thousand O(N) commands being called against a list several orders of magnitude larger than normal, I'm thinking about this a lot.
I'm confident many other users of Redis are using O(N) operations in production against small datasets without knowledge of how much latency will be introduced by those operations when the dataset grows. This is exactly the kind of situation that makes me immediately skeptical when I find Redis in an emergent system design.
I'm considering what initially felt like a draconian means of remediation: using rename-command to rename all O(N) commands to BIG_O_N_$COMMAND to ensure everyone using them knows the possible impact and to allow for easy detection during code review and/or Redis latency spikes.
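For what it's worth, that's a one-liner per command in redis.conf via the rename-command directive; a sketch (the command list here is illustrative, not a complete inventory of O(N) commands):

```
rename-command KEYS BIG_O_N_KEYS
rename-command SMEMBERS BIG_O_N_SMEMBERS
rename-command LRANGE BIG_O_N_LRANGE
rename-command SORT BIG_O_N_SORT
```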
The more I think about this approach, the more I feel that this should be the default mode of operation for Redis in production. SREs around the world would collectively save decades of time if every engineer writing Redis queries knew this fact by heart:
> Redis is very fast as long as you use O(1) and O(log_N) commands. You are free to use O(N) commands but be aware that it’s not the case we optimized for, be prepared for latency spikes.
> "Redis is very fast as long as you use O(1) and O(log_N) commands"
Well, like you say, to me that particular practice would be pretty obvious. More helpful would be a page that shows the time and space complexity of the various commands, akin to the following page for Python:
Edit: Yeah, like the Redis documentation, lol. I think it says a lot for Redis that I almost never need to visit the docs, other than the single page that lists all the commands (back when the site was red was the last time I actually needed to look at it). Once you know the commands, quickly reading the changelogs provides me with all I need. Anyway, all that to say I had no idea Redis already provided this complexity information; I don't remember them having it back when I first learned Redis (although maybe they did). Good to know.
"Time complexity: O(N+M*log(M)) where N is the number of elements in the list or set to sort, and M the number of returned elements. When the elements are not sorted, complexity is currently O(N) as there is a copy step that will be avoided in next releases."
As I was reading through, the back of my mind was saying “Nice heuristics but now you’re adding uncertainty. The behavior of DEL could vary dramatically and users would not know why.”
I was relieved that antirez recognized this as a semantic change and gave it a new name. Very thoughtful.
Perhaps there is something to be learned from GC algorithms in this case?
The whole point of Redis is to run it on powerful single-threaded machines. I've lost count of how many times 'experts' threw it on multi-core beasts of systems and were surprised when a single-core machine that cost half as much destroyed their provisioned setup.
Agreed. Worth noting it's always better to have at least two cores per Redis process if persistence is enabled, since there is the saving process that will burn a whole core from time to time.
I am probably missing something here, but if a delete happens asynchronously, wouldn't the key still be available?
What happens if you check if that key still exists in a different operation as you're deleting it?
Also, how slow is an operation to rename it before you delete it?
The key is removed from the main hashtable and is thus not accessible by the normal code path. It's then appended to the internal list for deletion. That's basically just moving a pointer around.
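A toy Python sketch of that mechanism (my own illustration, not Redis internals; names like TinyStore are made up): the key is popped from the main dict in O(1), the value is parked on a free list, and a background thread does the actual reclamation, so lookups never see the unlinked key.

```python
import threading
from collections import deque

class TinyStore:
    """Toy model of lazy free: UNLINK detaches the value from the main
    hashtable (an O(1) pointer move) and a background thread does the
    actual, possibly O(N), reclamation."""
    def __init__(self):
        self.data = {}               # the "main hashtable"
        self.free_list = deque()     # values pending reclamation
        self.cv = threading.Condition()
        self.reaper = threading.Thread(target=self._reap, daemon=True)
        self.reaper.start()

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)    # never consults the free list

    def unlink(self, key):
        value = self.data.pop(key, None)  # O(1): just unlink the pointer
        if value is not None:
            with self.cv:
                self.free_list.append(value)
                self.cv.notify()
        return value is not None

    def _reap(self):
        while True:
            with self.cv:
                while not self.free_list:
                    self.cv.wait()
                value = self.free_list.popleft()
            del value                # drop the last reference off-thread

store = TinyStore()
store.set("biglist", list(range(1_000_000)))
store.unlink("biglist")
print(store.get("biglist"))  # None: the key is gone immediately
```

The point of the sketch is the GET path: it only touches the main dict, so whether the background reclamation has finished is invisible to readers.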
He kind of buried the lede: he intends to implement thread-per-connection:
"... it is finally possible to implement threaded I/O in Redis, so that different clients are served by different threads. This means that we’ll have a global lock only when accessing the database, but the clients read/write syscalls and even the parsing of the command the client is sending, can happen in different threads. This is a design similar to memcached, and one I look forward to implement and test."
Absolutely not, it's the memcached model. There are N threads (like 4 or 8, or one per core) serving all the clients in a multiplexed way, like what happens now with a single thread.
If we go the extra mile and also implement background slow operations, we'll send them to another "ops" thread, block the client, then resurrect the client and send the result when it's available. Like we currently do for blocking operations such as BLPOP.
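If I read that right, a rough Python sketch of the model (purely illustrative, not Redis code): parsing happens independently in each I/O thread, while a single global lock serializes access to the keyspace.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Toy sketch of the proposed model: a few I/O threads parse commands in
# parallel, but one global lock guards the database, so data access
# stays serialized.
db = {}
db_lock = threading.Lock()

def handle_command(raw):
    op, key, *rest = raw.split()   # "parsing" is lock-free, per thread
    with db_lock:                  # global lock only around the database
        if op == "SET":
            db[key] = rest[0]
            return "OK"
        if op == "GET":
            return db.get(key)

with ThreadPoolExecutor(max_workers=4) as pool:  # like N I/O threads
    list(pool.map(handle_command, [f"SET k{i} v{i}" for i in range(100)]))
    results = list(pool.map(handle_command, [f"GET k{i}" for i in range(100)]))

print(results[0], results[99])  # v0 v99
```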
It's a nightmare to do perf/scale testing and capacity planning with a backend whose performance characteristics change with the dataset and the operations being performed, and in a fuzzy way at that, due to 'lazy'. I like the memcached way better than this, simply because its API operations are all deterministic.
As specified in the article, DEL remains a blocking DEL. It's application code that calls UNLINK when a background release of memory is needed because of low-latency requirements. However, 50% of the blog post explains how the implementation avoids creating problems, by making sure objects are released faster than new ones can be allocated, in order to avoid undetermined memory behavior[1]. Anyway, memcached is a lovely product; I hope it will be developed more in the future so that users have a choice. But in the context of our discussion, there is no equivalent in memcached for this feature, since its objects are composed of a single allocation. You get the same if you just use the Redis SET / GET / DEL commands.
[1] This is also handled in the LRU code. If there is nothing to free but there are objects in the free list, the server waits for the background thread to make progress before continuing. For expiration and eviction of keys there will be explicit options where the user specifies whether server-deleted objects should be deleted in a lazy or blocking way, with the old plain blocking behavior as the default.
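A rough sketch of that waiting behavior (my reading of the description, illustrative only, not actual Redis code): if the eviction path finds nothing left to free but the background free list is non-empty, it backs off and waits for the lazy-free thread to make progress.

```python
import threading
import time

# Toy backpressure model: a background thread reclaims objects from a
# free list while the "eviction" path waits until memory is available.
free_list = ["obj"] * 5
free_list_lock = threading.Lock()

def lazy_free_worker():
    while True:
        with free_list_lock:
            if not free_list:
                return
            free_list.pop()        # reclaim one object
        time.sleep(0.01)           # pretend freeing takes time

def evict_until_memory_available():
    t = threading.Thread(target=lazy_free_worker)
    t.start()
    # Nothing evictable in the keyspace: wait on the background thread
    # instead of failing the allocation.
    while True:
        with free_list_lock:
            if not free_list:
                break
        time.sleep(0.005)          # back off and let the reaper run
    t.join()
    return "memory available"

msg = evict_until_memory_available()
print(msg)
```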
memcache operations are mostly constant time, but their behavior is not what I would call deterministic, due to the slab allocator. A long-running memcached will not behave the same as a fresh memcached, because pages are not reclaimed from a particular slab class.
Say you have a memcached that has been running and only storing 80 byte objects. The 80 byte slab class has all of the memory pages, and the other slab classes each have one page. Now your key size increases (for instance, because it includes the uid) and your objects fit into the 100 byte slab class. You will only be able to ever store 10,000 of these objects while the rest of your memory allocation is effectively unused.
Not an obvious problem to diagnose. You can only fix it by restarting memcached (and losing everything) or (very slowly) migrating pages one at a time to the larger slab class. This anecdote is based on a true story.
So your SET operation is still constant time, but the behavior is not deterministic. redis doesn't have this type of issue. I wish that I hadn't wasted so many years struggling with memcache.
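The page-stranding anecdote above can be sketched numerically (a toy model with made-up page counts; real memcached internals differ):

```python
# Toy slab-allocator sketch: pages are permanently assigned to the slab
# class that first claims them, so a workload shift can strand almost
# all memory in the wrong class.
PAGE_SIZE = 1_048_576          # 1 MiB pages
TOTAL_PAGES = 64

class SlabClass:
    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.pages = 0

    def capacity(self):
        # how many objects this class can ever hold
        return self.pages * (PAGE_SIZE // self.chunk_size)

slab80, slab100 = SlabClass(80), SlabClass(100)

# Phase 1: only 80-byte objects arrive, so the 80-byte class grabs
# nearly every page; the 100-byte class gets a single page.
slab80.pages = TOTAL_PAGES - 1
slab100.pages = 1

# Phase 2: keys grow and objects now need the 100-byte class, but
# pages are never reclaimed from slab80, so capacity stays at one page.
print(slab100.capacity())  # 10485: ~10k objects despite 64 MiB of RAM
```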
I have a question. UNLINK will be more responsive than DELETE, as it will not block and will run in the background. But will subsequent requests be faster than with DELETE? How will it behave if immediately followed by a GET?
For the background processing, I think it keeps a list of elements to delete, so when a GET is received it would check against that list too. The overhead would remain while incoming queries are being processed and the deletion is not finished yet.
DELETE followed by GET takes x sec (mostly due to DELETE).
// In a GET after DELETE, the time is not affected.
UNLINK followed by GET1, GET2, ... takes y1, y2, ... sec.
// Will y1 > y2 > ... until the deletion finishes?
Is this correct? It looks like a trade-off, improving current latency at the cost of later ones. (I feel it is worth it.)