> In a world where things like Dynamo/Cassandra or Spanner/Cockroach exist, manually-sharded DB solutions are pretty much entirely obsolete.
Not really. Cassandra is a write-optimised, slow, network- and IO-heavy beast that's a pain in the arse to administer. We replaced a 6-node m4.4xlarge cluster with a single db.m4.2xlarge on Postgres.
You need to pick your DB to match your data, not the other way around.
Spanner and Cassandra really shouldn't be in the same sentence. They are optimized for very different use cases. The "obsolete" part of that quote does apply to Spanner, TiDB, and CockroachDB in my experience. I haven't used Yugabyte, but the other sharded databases, including Vitess (TiDB is what Vitess is trying to be), actually make life harder.
I'm not sure what "more proven" exactly means. If you mean it is not as well known and used in the US? Yes, that's true, because the core developers are Chinese. If you mean it hasn't been used at the same scale in production? False, just completely false.
At my current job, we have reached a scale and use case that requires either manually sharding the database or using a distributed one: 20 TB including indexes, 10-20 tables with >1B rows, and spikes of up to 10K transactions/s. We put Vitess, TiDB, CockroachDB, and Spanner head-to-head, including running production-scale load tests. Spanner won out because it is far superior to them all for geo-replication and sysadmin. But TiDB was a close second because it just works, scales, and is fast. Vitess, on the other hand, was extremely buggy and has a very long, undocumented list of unsupported SQL features. Yes, Vitess has better press and is better known, but it is an inferior technology imo. TiDB is already what Vitess claims to be.
20TB of data and 10k/s of transactions is not a large cluster. There are Vitess clusters with petabytes of data and 100s of millions of users doing millions of QPS.
The largest Vitess cluster we know of runs on 10s of thousands of servers. So "False, just completely false" is dramatic.
I didn't claim that I am dealing with a gigantic workload (total QPS is a lot higher btw). Rather I claimed we are at the point where a distributed database is a good idea if the business wants to continue to scale without major database trouble. I gave the specs so others can understand the parameters of the evaluation. If Vitess didn't perform well at that scale, I can't imagine the pain at Slack and Twitter scale. I don't envy them at all.
Most orgs, including mine, also have the experience of a Cassandra cluster simply melting down. That was more common ten years ago, but it is pretty much chaos when it happens.
Of the non-SQL distributed databases, I found FoundationDB to be by far the most robust and lowest-overhead.
However, at least through version 6, it had absolutely terrible, service-destroying behavior if any of the nodes became low on storage: transaction throughput drops through the floor, and since transactions are also used for recovery and cluster expansion... Since it does a very nice job auto-sharding, if you're not paying attention this tends to happen to multiple nodes around the same time.
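The practical takeaway is to alert on storage headroom per node well before FoundationDB's own low-space behavior kicks in. Here is a minimal, hypothetical sketch in Python using only the standard library; the mount points and the 15% threshold are illustrative assumptions, not FoundationDB defaults:

```python
import shutil

def low_storage_nodes(mount_points, min_free_fraction=0.15):
    """Return the mounts whose free space is below min_free_fraction.

    min_free_fraction is an illustrative alert threshold (assumption),
    chosen to fire long before the database itself starts throttling.
    """
    flagged = []
    for mount in mount_points:
        usage = shutil.disk_usage(mount)
        if usage.free / usage.total < min_free_fraction:
            flagged.append(mount)
    return flagged

# Example: check the data volumes on one host (paths are hypothetical).
# Feed the result into whatever paging/alerting system you already run.
if __name__ == "__main__":
    print(low_storage_nodes(["/"]))
```

Because auto-sharding spreads data evenly, all nodes tend to approach the threshold together, so the useful signal is the first node to cross it, not the cluster average.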
I wish it would get more attention though. It is a really amazing tool.