BeastieBoy

FreeBSD, Lisp, Emacs, PostgreSQL & co.

When shall I use MongoDB?

When to pick MongoDB over a relational database can be a tricky question. Luckily, I'm here to help with a quick FAQ. Guaranteed 100% bias free.

Does MongoDB guarantee data persistence?

MongoDB is not ACID-compliant, which means that data can be lost. You should only pick MongoDB when you intend to store data you don't mind losing.

Does MongoDB scale well?

With MongoDB, data is stored duplicated. This means that indexes, when they are used, cannot be as effective as those of a relational database. Also, since data is duplicated, databases tend to occupy more disk space than they would with a relational database. However, MongoDB scales through sharding, which means that data is spread across multiple servers to both balance the load and provide some redundancy. So if scaling is limited on one server, one can always add more servers to cope with the load.

As for writing, MongoDB offers two approaches. With the MMAPv1 engine, it locks a collection while a document is being saved, disallowing concurrent writes to different documents in that collection. This in part comes from the fact that MongoDB lacks transaction isolation (ACIDity). The second option involves the new default engine, WiredTiger, which relies on the optimistic concurrency model, assuming very low data contention. When a conflict occurs, transactions are rolled back and need to be run again. This is fine if transactions deal with different pieces of data, and very costly when multiple writes happen at the same time on the same document.
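To make the cost of contention concrete, here is a minimal sketch of optimistic concurrency in plain Python. It is not WiredTiger's implementation and does not use any MongoDB API: VersionedStore, WriteConflict, increment and the interfere hook are all hypothetical names made up for the illustration, and the "concurrent writer" is simulated in a single thread.

```python
from dataclasses import dataclass, field


class WriteConflict(Exception):
    """Raised when a document changed between our read and our write."""


@dataclass
class VersionedStore:
    # Hypothetical in-memory store: document id -> (version, value).
    docs: dict = field(default_factory=dict)

    def read(self, doc_id):
        # Unknown documents start at version 0 with value 0.
        return self.docs.get(doc_id, (0, 0))

    def write(self, doc_id, expected_version, value):
        current_version, _ = self.read(doc_id)
        if current_version != expected_version:
            # Someone else wrote first: our work is thrown away.
            raise WriteConflict(doc_id)
        self.docs[doc_id] = (expected_version + 1, value)


def increment(store, doc_id, interfere=None, max_retries=5):
    """Read-modify-write with retry; returns the number of attempts used."""
    for attempt in range(1, max_retries + 1):
        version, value = store.read(doc_id)
        if interfere is not None:
            interfere(attempt)  # simulated concurrent writer (assumption)
        try:
            store.write(doc_id, version, value + 1)
            return attempt
        except WriteConflict:
            continue  # roll back and run the whole thing again
    raise RuntimeError("too much contention on " + doc_id)


store = VersionedStore()

# No contention: the write goes through on the first attempt.
quiet_attempts = increment(store, "a")


def rival(attempt):
    # A rival writer sneaks in during the first two attempts only.
    if attempt <= 2:
        version, value = store.read("b")
        store.write("b", version, value + 10)


# Contention on the same document: two attempts are wasted.
contended_attempts = increment(store, "b", interfere=rival)
print(quiet_attempts, contended_attempts)
```

Every failed attempt repeats the read and the computation, which is why the model is cheap when writers touch different documents and expensive when they pile up on the same one.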

Is MongoDB more flexible than old cranky relational databases?

MongoDB lets developers store data as documents, that is, without any defined schema. Within a collection, documents can actually look very different from one another. This means the shape of the data doesn't have to be anticipated from the beginning, and fields can be added later. However, as a result, there is no way to alter a collection as a whole in one go: each record has to be fetched, edited, and saved back to the database. On large datasets, this can take hours. One should also ask whether their scenario really calls for going to production without a clear idea of what the data looks like.
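The fetch, edit, save loop described above can be sketched as follows. This is only an illustration under loud assumptions: a plain Python list of dicts stands in for a collection, and add_default_field is a hypothetical helper, not a MongoDB driver call.

```python
def add_default_field(collection, field_name, default):
    """Walk every document and add field_name where it is missing.

    Returns the number of documents that had to be rewritten.
    """
    touched = 0
    for index, document in enumerate(collection):
        if field_name not in document:
            updated = dict(document)       # fetch a copy of the record
            updated[field_name] = default  # edit it
            collection[index] = updated    # save it back
            touched += 1
    return touched


# Documents in the same "collection" with very different shapes.
users = [
    {"name": "alice", "email": "alice@example.org"},
    {"name": "bob"},
    {"login": "carol", "age": 41},
]

touched = add_default_field(users, "email", None)
print(touched, users)
```

The work grows with the number of documents, since each one has to be inspected individually before anything can be said about the collection as a whole.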

Does MongoDB make life easier for application developers?

Since MongoDB allows application developers to start working without thinking about the schema or the data itself, things are indeed easy at first. However, since the approach revolves only around data storage and not the data itself (shape, type, constraints, etc.), developers have to write the validation part themselves, on the application side. This means more code to write, maintain, and debug.
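As a rough idea of what that application-side validation looks like, here is a small sketch. The validate_user function and the "name" and "age" fields are invented for the example; a relational schema would typically express the same rules declaratively, with column types, NOT NULL and CHECK constraints.

```python
def validate_user(document):
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []

    name = document.get("name")
    if not isinstance(name, str) or not name:
        errors.append("name must be a non-empty string")

    age = document.get("age")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or age < 0:
        errors.append("age must be a non-negative integer")

    return errors


good = validate_user({"name": "alice", "age": 30})
wrong_type = validate_user({"name": "bob", "age": "41"})
all_wrong = validate_user({"name": "", "age": -1})
print(good, wrong_type, all_wrong)
```

Every rule of this kind has to be written, tested and kept in sync by hand in each application that writes to the collection.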

Is MongoDB the quickest engine with flexible, modern JSON data?

MongoDB makes JSON and binary JSON its sole data format and tries to be effective at storing it and reading it back. However, relational databases can also manage JSON and binary JSON data, and can be pretty efficient with it. Actually, PostgreSQL outperformed MongoDB in a series of benchmarks. Benchmarks being benchmarks, one can consider that, at the very least, it's not clear which of MongoDB and PostgreSQL is the more efficient. However, it is clear that with PostgreSQL, one also gets to benefit from everything else it has to offer: ACIDity, data validation, indexes, etc.

To sum up

One should use MongoDB when they want to store a moderate quantity of data they don't mind losing, don't know much about, and don't mind locking for basic maintenance tasks. It's stunning how much the cutting edge of 2017 resembles the cutting edge of 1997, when startups were trying to circumvent the shortcomings of MyISAM by cobbling together piles of Perl code, ending up with half-working data processing pipelines.