
In the last 2 decades in the industry as well I've never lost data with MongoDB, Riak or Cassandra but have with Oracle, DB2 and PostgreSQL. After all, databases are just software, and there will always be bugs. Some people just get tripped up by different ones.

And you are woefully ignorant to think the RDBMS is the right choice for 99% of projects. Especially since you think that the 1% of remaining users are purely worried about scalability. Hint: think about the schema problems associated with storing auto generated features from deep learning models.



>In the last 2 decades in the industry as well I've never lost data with MongoDB, Riak or Cassandra but have with Oracle, DB2 and PostgreSQL

Yet every test proves otherwise. Also, use Google to see how people have lost data with MongoDB. Mongo is not considered a serious piece of technology by any scientist or engineer I know. Postgres, though, is universally considered an engineering marvel.

>Hint: think about the schema problems associated with storing auto generated features from deep learning models.

Hint: The problem you mentioned? Even less than 1%

Calling me ignorant doesn't change reality you know.


Is data loss something inherent to NoSQL tech, or just the result of poor implementations?

If it's the latter, why haven't there been any reliable NoSQL implementations?

Perhaps it's well suited to non-transactional, low-fi data?


NoSQL DBs usually target distributed environments.

So... enter CAP theorem. There's no free lunch. People think we can simply throw away half a century's worth of science because JSON and schemaless are teh awesome derp derp.

Implementation is surely an issue, if you take into account that the MongoDB guys had to acquire another company [1] in order to overcome their abysmal write performance. And yet there were people and benchmarks trying to tell us that Mongo was faster than the RDBMS alternatives. All this circa 2009-2012.

You know what's faster than everything? Writing to /dev/null ;)

Anyway, depending on your use case, there might be a NoSQL store out there that fills your needs and actually delivers what it claims. But it's hard to sift through all the ad-driven, buzzword-ridden infomercials that get thrown around by start-up companies in the DB domain.

Also, DBs are like filesystems; even if the math/science is correct, it needs at least a decade of proven track record before you can say that it works as advertised.

[1] http://www.informationweek.com/software/information-manageme...


> NoSQL DBs usually target distributed environments. So... enter CAP theorem.

Surely FB is not running MySQL on a single machine. Perhaps I am misunderstanding you, but saying SQL DBs don't face the issues of distribution seems a little strange.

Distribution comes into picture from shape and size of the data not data saving/retrieval techniques. yea?


FB and all big companies are a very bad example. They have a ton of resources and usually don't use vanilla products, since they have the engineering capacity to support their own forked versions, e.g. see their own version of PHP.

Also, distributing reads is easy; writes... not so much. NoSQL systems usually offer distributed writes with the caveat of eventual consistency. RDBMS have referential integrity and other constraints which by definition cannot migrate into a distributed environment. Or at least there's no one-size-fits-all solution.
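
To make "eventual consistency" a bit more concrete, here is a minimal sketch, assuming a toy key-value store with two replicas and last-write-wins conflict resolution. The Replica class, its methods and the logical timestamps are all made up for illustration; this is not how any particular NoSQL product implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy key-value replica with last-write-wins (hypothetical, not a real DB API)."""
    name: str
    data: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, ts):
        # Keep the incoming value only if it is newer than what we already hold.
        current = self.data.get(key)
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Anti-entropy pass: replay the other replica's entries locally.
        for key, (ts, value) in other.data.items():
            self.write(key, value, ts)


a, b = Replica("a"), Replica("b")

# Two clients write to different replicas before any replication happens.
a.write("balance", 100, ts=1)
b.write("balance", 250, ts=2)
stale = (a.read("balance"), b.read("balance"))      # replicas disagree here

# Once the replicas exchange state, both settle on the newest write.
a.sync_from(b)
b.sync_from(a)
converged = (a.read("balance"), b.read("balance"))  # replicas agree again

print(stale, converged)
```

Between the two writes and the sync, a reader can see either value depending on which replica it hits. That window is the caveat, and last-write-wins silently drops the older write.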

> Distribution comes into picture from shape and size of the data not data saving/retrieval techniques. yea?

Most definitely not. It has nothing to do with the shape and size of the data. Also, there's no such thing as "distribution" in our context, only "distributed", from "distributed computing" [1], and it has everything to do with data saving and retrieval :)

[1] https://en.wikipedia.org/wiki/Distributed_computing


>RDBMS have referential integrity and other constraints which by definition cannot migrate into a distributed environment.

so,

Use an RDBMS if your data can be handled by a single machine (or you have the resources of FB)? The '99% of people need an RDBMS' argument boils down to: 99% of people have data that can be handled by a single-machine RDBMS.

Is that a good conclusion?


The single machine shouldn't be the deciding factor.

If your application is like most apps (far more reads than writes), then you can easily distribute the read load across multiple machines. If you have more writes than reads (quite rare, but still), then scaling an RDBMS will be challenging.

In this case, if eventual consistency is something you can live with, a NoSQL store might be best for you.
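
For the read-heavy case, a rough sketch of read/write splitting, assuming one primary and a couple of read replicas. The Router class and the node names are hypothetical; in real setups this usually lives in the driver, a proxy, or the app's data layer, and it has to account for replication lag.

```python
import itertools


class Router:
    """Send SELECTs to read replicas round-robin, everything else to the primary."""

    def __init__(self, primary, replicas):
        # Assumes at least one replica is supplied.
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._replicas)
        # Writes (and anything we don't recognise) go to the single primary.
        return self.primary


router = Router("primary", ["replica-1", "replica-2"])
queries = [
    "SELECT * FROM users",
    "INSERT INTO users VALUES (1)",
    "select name from users",
    "UPDATE users SET name = 'x'",
]
targets = [router.route(q) for q in queries]
print(targets)
```

Reads spread across as many replicas as you add, while every write still lands on one node, which is why the write-heavy case is the hard one.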



