My experience with Datomic was that it is not appropriate for a system with performance requirements. We developed a system using Redis while the data team was developing the data service on Datomic. When we tried to integrate, the whole system slowed to a crawl and we needed to do a rewrite.
My advice is to develop with Datomic from the start and not separate it out into a service. Any other database will be so much faster that you won't know how bad your performance is until it's too late.
Another piece of advice would be to seriously consider if you need immutability in your database. If it's not a hard requirement, I would not use Datomic.
> ...it is not appropriate for a system that has performance requirements
This is an unreasonably simplistic approach to assessing performance. A more useful approach would be to describe which workloads Datomic may or may not be suited for. Having used Datomic (On-Prem) for 2 years in production, here's my take on it:
1. For a system that writes a lot (eventually leading to 10 billion datoms or having high peaks of write throughput), Datomic is not a great fit, because it has a single writer thread and does quite a bit of indexing.
2. Reads are horizontally scalable, and because Datomic's semantics allow for pervasive and reliable caching, reads can handle a huge load with low latency for OLTP-style queries.
3. Datalog is relatively slow for aggregations, and offers no facilities for trading accuracy for speed: if you need big, low-latency aggregations, you should offload them to a specialized store like Elasticsearch. This is especially easy to do with Datomic, because Datomic makes it trivial to implement change detection, in real time if needed (see the Log API and txReportQueue).
4. When writes are overwhelmed and become unavailable, reads stay available, which is an awesome situation to be in.
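To make point 3 concrete, here's a rough Python sketch of the change-detection pattern: a consumer drains a queue of transaction reports and forwards each transaction's changes to an external search index. The queue, the datom shape, and the index are all stand-ins for illustration (Datomic's real analogue is txReportQueue); none of this is Datomic's actual API.

```python
import queue

tx_report_queue = queue.Queue()
search_index = {}  # stand-in for e.g. an Elasticsearch index

def index_tx_report(report):
    """Apply one transaction's datoms to the external index."""
    for entity, attr, value, added in report["datoms"]:
        doc = search_index.setdefault(entity, {})
        if added:
            doc[attr] = value        # assertion: set the attribute
        else:
            doc.pop(attr, None)      # retraction: remove it

# Simulate two transactions arriving on the queue.
tx_report_queue.put({"datoms": [(1, "name", "Alice", True)]})
tx_report_queue.put({"datoms": [(1, "name", "Alice", False),
                                (1, "name", "Alicia", True)]})

while not tx_report_queue.empty():
    index_tx_report(tx_report_queue.get())

print(search_index)  # {1: {'name': 'Alicia'}}
```

Because each report carries exactly the facts asserted and retracted in one transaction, the index stays consistent without polling or diffing the database.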
Having built a variety of products with Datomic under performance requirements, I can only disagree. Queries in particular are insanely fast compared to other databases.
I don't know what your data team was doing, but they must have done it wrong.
My advice is to develop without a data team from the start ;)
Datomic is interesting. It is somewhat similar to object databases like ZODB. I use an object database called Durus (spiritual derivative of ZODB). Both ZODB and Datomic make use of aggressive caching in the application. That is a huge performance boost for applications that do a lot of reading and little writing. ZODB essentially puts the indexes in the application, rather than having the DB layer take care of indexes. Again, I understand this is similar to how Datomic works.
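A toy read-through cache illustrates the idea from the paragraph above: reads hit an in-process cache first and only fall through to storage on a miss. The class and names here are invented for illustration, not any real ZODB or Datomic API.

```python
class CachingStore:
    """Read-through cache in front of a slower, durable storage layer."""

    def __init__(self, storage):
        self.storage = storage   # slow, durable layer (a dict stands in)
        self.cache = {}          # fast in-application cache
        self.misses = 0

    def read(self, key):
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = self.storage[key]
        return self.cache[key]

store = CachingStore({"user/1": {"name": "Alice"}})
store.read("user/1")
store.read("user/1")  # served from memory, no second storage hit
print(store.misses)   # 1
```

The reason this works so well in Datomic's case is immutability: a cached segment can never be stale, so the cache needs no invalidation logic at all.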
Developing an application on top of ZODB vs on top of an SQL database leads to a very different data model design. You can't just swap one database layer out for the other and expect things to work well. I'm not surprised if a team who is used to developing on top of SQL-like DBs did not have good luck when moving to Datomic. It is just a very different model.
The main advantage is that you can do time-consistent queries against any past state of the entire database without any read locks. This frees resources and eliminates the need for timestamp tables.
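A minimal sketch of why this works: with an append-only log of (tx, entity, attribute, value, added?) facts, any past state can be rebuilt by replaying the log up to a transaction, with no locks, since the log is immutable. This is a conceptual illustration, not Datomic's storage format.

```python
log = []

def transact(tx, datoms):
    """Append one transaction's facts to the immutable log."""
    log.extend((tx, e, a, v, added) for (e, a, v, added) in datoms)

def as_of(tx):
    """Rebuild the entity map as of transaction `tx`."""
    state = {}
    for t, e, a, v, added in log:
        if t > tx:
            break  # ignore facts from later transactions
        if added:
            state.setdefault(e, {})[a] = v
        else:
            state.get(e, {}).pop(a, None)
    return state

transact(1, [(100, "email", "a@old.example", True)])
transact(2, [(100, "email", "a@old.example", False),
             (100, "email", "a@new.example", True)])

print(as_of(1))  # {100: {'email': 'a@old.example'}}
print(as_of(2))  # {100: {'email': 'a@new.example'}}
```

A real implementation would index rather than replay, but the point stands: past states are plain data, so "as-of" reads are ordinary queries rather than a special locking mode.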
Then there is Datomic's caching model in combination with its local, data-based query engine. The vast majority of critical queries read only from memory. Data that is fetched doesn't block other consumers.
What RDBMS are you comparing to? Pretty much everything is MVCC these days, so writers don't block readers and readers don't block writers. Mature RDBMSs are also fairly highly optimized, written in C with critical paths hand-written in assembly, so again it would be interesting to learn for what workload Datomic is significantly faster. The biggest performance penalty in a modern RDBMS is GC of snapshots, so if your design is append-only anyway, that will make a modern RDBMS more performant too.
I don't care so much about optimized tight loops written in assembly, rather about the ability to scale nowadays. Datomic uses databases like the ones you describe, e.g. Postgres, as its storage, so it would be foolish to claim it beats their performance at the bare-metal level. Instead, Datomic's architecture and information model significantly reduce the overhead of designing and implementing applications that provide insane performance. I won't argue that you couldn't hand-rewrite every Datomic application against its underlying storage database and get more performance out of it, if you get your caching and coordination right. It would take you much longer though (I'd guess tenfold at least), likely have some very hard-to-find bugs, and the result won't be as easy to extend. With Datomic, I get memory-speed performance out of the box for the heavy hitters, and so much more, that it would take some very uncommon requirements for me to choose something else nowadays.
Well, I would buy the developer-productivity argument, but most applications have a mix of reporting requirements that are generally extremely hard to implement on anything "distributed". Also, in a distributed system you are either running a consensus algorithm (Paxos, Raft, etc.), which will definitely not be "insane performance", or it will have issues with consistency.
Datomic uses distributed storage, but writes are coordinated by a single instance (the transactor). Reads don't block writers, and immutability allows querying consistent snapshots. Does that address your concern?
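The arrangement described above can be sketched as a toy single-writer design: all writes funnel through one writer thread, while each committed state is an immutable snapshot that readers can hold onto without locks. This is purely illustrative; the names and structure are invented, not Datomic's implementation.

```python
import queue
import threading

write_queue = queue.Queue()
snapshots = [{}]  # history of immutable snapshots; last one is current

def transactor():
    """Single writer: applies queued writes one at a time, in order."""
    while True:
        item = write_queue.get()
        if item is None:
            break  # shutdown sentinel
        key, value = item
        # Build a new snapshot instead of mutating the old one, so
        # readers holding an earlier snapshot keep a consistent view.
        snapshots.append({**snapshots[-1], key: value})

t = threading.Thread(target=transactor)
t.start()
write_queue.put(("balance", 100))
write_queue.put(("balance", 150))
write_queue.put(None)
t.join()

print(snapshots[1])   # {'balance': 100} -- still readable after later writes
print(snapshots[-1])  # {'balance': 150}
```

Serializing writes through one thread gives total write order without consensus, which is the trade-off the parent comment is pointing at: no Paxos/Raft overhead on the write path, at the cost of a single write bottleneck.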