> In a world where things like Dynamo/Cassandra or Spanner/Cockroach exist, manually-sharded DB solutions are pretty much entirely obsolete.
Not really. Cassandra is a write-optimised, slow, network- and IO-heavy beast that's a pain in the arse to administer. We replaced a 6-node m4.4xlarge cluster with a single db.m4.2xlarge on Postgres.
You need to pick your DB to match your data, not the other way around.
Spanner and Cassandra really shouldn't be in the same sentence. They are optimized for very different use cases. The "obsolete" part of that quote does apply to Spanner, TiDB, and CockroachDB in my experience. I haven't used Yugabyte, but the other sharded databases, including Vitess (TiDB is what Vitess is trying to be), actually make life harder.
I'm not sure what "more proven" exactly means. If you mean it is not as well known and used in the US? Yes, that's true, because the core developers are Chinese. If you mean it hasn't been used at the same scale in production? False, just completely false.
At my current job, we have reached a scale and use case that requires either manually sharding the database or using a distributed one: 20 TB including indexes, 10-20 tables with >1B rows, and spikes of up to 10K transactions/s. We put Vitess, TiDB, CockroachDB, and Spanner head-to-head, including running production-scale load tests. Spanner won out because it is far superior to them all for geo-replication and sysadmin. But TiDB was a close second because it just works, scales, and is fast. Vitess, on the other hand, was extremely buggy and has a very long, undocumented list of unsupported SQL features. Yes, Vitess has better press and is better known, but it is an inferior technology imo. TiDB is already what Vitess claims to be.
20TB of data and 10k/s of transactions is not a large cluster. There are Vitess clusters with petabytes of data and 100s of millions of users doing millions of QPS.
The largest Vitess cluster we know of runs on 10s of thousands of servers. So "False, just completely false" is dramatic.
I didn't claim that I am dealing with a gigantic workload (total QPS is a lot higher btw). Rather I claimed we are at the point where a distributed database is a good idea if the business wants to continue to scale without major database trouble. I gave the specs so others can understand the parameters of the evaluation. If Vitess didn't perform well at that scale, I can't imagine the pain at Slack and Twitter scale. I don't envy them at all.
Most orgs, including mine, also have the experience of a Cassandra cluster simply melting down. That was more common ten years ago, but it is pretty much chaos when it happens.
Of the non-SQL distributed databases, I found FoundationDB to be by far the most robust and lowest-overhead.
However, at least through version 6, it had absolutely terrible, service-destroying behavior if any of the nodes became low on storage: transaction throughput drops through the floor, and since transactions are also used for recovery and cluster expansion... Since it does a very nice job auto-sharding, if you're not paying attention this tends to happen to multiple nodes around the same time.
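The practical takeaway is to alert on storage headroom per node well before FoundationDB's own low-space behavior kicks in. Here is a minimal, hypothetical sketch in Python using only the standard library; the mount points and the 15% threshold are illustrative assumptions, not FoundationDB defaults:

```python
import shutil

def low_storage_nodes(mount_points, min_free_fraction=0.15):
    """Return the mounts whose free space is below min_free_fraction.

    min_free_fraction is an illustrative alert threshold (assumption),
    chosen to fire long before the database itself starts throttling.
    """
    flagged = []
    for mount in mount_points:
        usage = shutil.disk_usage(mount)
        if usage.free / usage.total < min_free_fraction:
            flagged.append(mount)
    return flagged

# Example: check the data volumes on one host (paths are hypothetical).
# Feed the result into whatever paging/alerting system you already run.
if __name__ == "__main__":
    print(low_storage_nodes(["/"]))
```

Because auto-sharding spreads data evenly, all nodes tend to approach the threshold together, so the useful signal is the first node to cross it, not the cluster average.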
I wish it would get more attention though. It is a really amazing tool.