In a recent talk featured at MongoDB Live 2021, Apostrophe CTO Tom Boutell shared the journey of finding the best fit between a database and a CMS. The important words there are "best fit." This is not a post about what the best database is, or the best CMS for that matter (though we do have some thoughts on that). It is a story about what happens when you make a decision based on the right criteria.
Sometimes you make the wrong choice
Plain and simple, we chose the wrong database back in 2009. The problem was that we did not know how to evaluate our choices at the time. If you don't plan well, it can take a long time to realize that you've made the wrong choice, and unraveling it can be painful. During this time, we were rolling out Apostrophe as our own open source CMS. Choosing a database was one of the trickier calls we had to make at the start. Having listened to our users and the open source community, we knew that some set of people would not be happy no matter which database we picked. In the end, exhaustion, familiarity and popularity became the hidden criteria behind our choice. Certainly not how you want to make a decision.
The right selection criteria
Sure, we made progress. We made the initial selection work for our needs, but we knew that we needed a better match. So when Node.js appeared on the scene, we seized the opportunity to make bold changes. To do that, we needed to choose a database based on the right criteria.
Use case: does this database fit our needs specifically?
Open source: this was our initial requirement, but what was the underlying need? (See sustainability section below)
Low tooling: selecting a database with the right features out of the box reduces the need for tools that dress it up as something else.
Sustainability: will the database stick around, be priced fairly and be familiar enough to developers for them to adopt? MongoDB's Server Side Public License (SSPL) helps ensure that, along with their Atlas offering.
Speed: is it fast enough to serve pages in a hurry?
Security: will it be simple to keep secure?
Defining our use case
Nested widgets: pages are composed of a tree of user-editable widgets and columns of widgets ("areas"). The database should directly understand these data structures.
Trees of documents: allow us to represent relationships between pages.
Localization: allows translation of websites.
A big factor in our early success with MongoDB was that a page and its nested widgets make up a single document. Compounding that benefit, MongoDB actually understands that these are sub-documents.
In a page document, an area such as `main` contains an `items` array. "Areas" like `main` allow the user to add as many "widgets" to this part of the page as they see fit. And since MongoDB allows nested data structures, we can even create "layout widgets" that contain additional, nested areas and widgets.
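As a rough illustration of that nesting, here is a minimal sketch of what such a document could look like. Only the names `main` and `items` come from the text above; every other field name (`type`, `content`, `left`, `right`, `src`) and the `widgetTypes` helper are hypothetical and are not taken from Apostrophe's actual schema.

```javascript
// Hypothetical page document. Only `main` and `items` are named in the
// article; all other property names are illustrative assumptions.
const page = {
  _id: 'home',
  title: 'Home',
  main: {
    items: [
      { type: 'rich-text', content: '<p>Welcome</p>' },
      {
        // A "layout widget": a widget that contains further areas.
        type: 'two-column',
        left: { items: [{ type: 'image', src: '/cat.jpg' }] },
        right: { items: [{ type: 'rich-text', content: '<p>Caption</p>' }] }
      }
    ]
  }
};

// Collect the type of every widget in an area, descending into any
// widget property that itself looks like an area (has an `items` array).
function widgetTypes(area) {
  const types = [];
  for (const widget of area.items) {
    types.push(widget.type);
    for (const value of Object.values(widget)) {
      if (value && typeof value === 'object' && Array.isArray(value.items)) {
        types.push(...widgetTypes(value));
      }
    }
  }
  return types;
}

console.log(widgetTypes(page.main));
// [ 'rich-text', 'two-column', 'image', 'rich-text' ]
```

Because the whole tree lives in one document, a single read returns everything needed to render the page, and a plain recursive walk like the one above is enough to visit every widget.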
Rounding out our use case criteria was a simple way to display page trees. It's true that we could have chosen a graph database to do that natively. But in the end we found it was simple to represent the tree just by including a materialized path property, as well as level and rank properties. With this simple representation, even if a race condition were somehow to occur, the failure modes are much safer than those of the nested set algorithm we were previously applying. And it's still easy for developers to make sense of what they're seeing.
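To make the materialized path idea concrete, here is a small sketch. The property names `path`, `level` and `rank` follow the description above, but the sample values, the path format and all function names are assumptions made for illustration, not Apostrophe's actual implementation.

```javascript
// Hypothetical page records. Each one stores its full path from the root,
// its depth (`level`) and its position among its siblings (`rank`).
const pages = [
  { title: 'Home', path: '/', level: 0, rank: 0 },
  { title: 'About', path: '/about', level: 1, rank: 0 },
  { title: 'Team', path: '/about/team', level: 2, rank: 1 },
  { title: 'History', path: '/about/history', level: 2, rank: 0 },
  { title: 'Contact', path: '/contact', level: 1, rank: 1 }
];

// Prefix shared by the paths of every descendant of `page`.
function childPrefix(page) {
  return page.path.endsWith('/') ? page.path : page.path + '/';
}

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// A MongoDB-style query object matching every descendant of `page`:
// an anchored prefix match on `path`, plus a deeper `level`.
function descendantsQuery(page) {
  return {
    path: new RegExp('^' + escapeRegExp(childPrefix(page))),
    level: { $gt: page.level }
  };
}

// In-memory equivalent for direct children only, ordered by rank.
function childrenOf(allPages, parent) {
  const prefix = childPrefix(parent);
  return allPages
    .filter((p) => p.level === parent.level + 1 && p.path.startsWith(prefix))
    .sort((a, b) => a.rank - b.rank);
}

const about = pages[1];
console.log(childrenOf(pages, about).map((p) => p.title)); // [ 'History', 'Team' ]
console.log(descendantsQuery(about).path.test('/about/team')); // true
console.log(descendantsQuery(about).path.test('/about-us')); // false
```

Each record can be read on its own: the path says where the page sits, the level says how deep, and the rank says where it falls among its siblings. A page that is updated incorrectly ends up misplaced, but the other records stay readable.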
As long-time open source developers, we carefully evaluated MongoDB's Server Side Public License (SSPL) upon release. While we are committed to open source for Apostrophe, we did not want to base our decision-making process solely on whether the underlying database met a strict standard of open source. What we really needed was a database that would be around for a long time. After deliberation, we found that the SSPL strikes the right balance between freedom, driving value for the community and making sure the company behind the database remains healthy for the long run.
A small digression here into Atlas, MongoDB's fully managed cloud database. We're big fans and we use it for most projects of significant scale. While the free, downloadable community edition is useful for a single-server project, we find that the pricing for larger projects on Atlas is quite reasonable and well worth the investment to achieve better scalability, durability and availability while reducing operational overhead.
MongoDB delivers very high performance to begin with, due to the lack of SQL parsing and the design of the WiredTiger storage engine. But being able to represent an entire page as a single document while retaining the meaning of the data structures inside pushes performance much higher in practice because we're not relying on many layers of SQL joins just to obtain the current version of the current page.
When it comes to security, we have found MongoDB to be an immediate relief. Because queries are actual data structures and are not parsed as strings, there is no query string for an attacker to splice input into, so attacks of the SQL injection kind just don't apply. This is not to say that the system is infallible. Denial of service attacks are still possible if features like regular expressions are misused. But the main thing to remember here is to apply MongoDB's features as intended. MongoDB is safe when its features are used correctly. And even when they aren't, it's much safer than SQL.
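As one illustration of "using features as intended," here is a hedged sketch of building a regular expression query from untrusted input. The function names, the `title` field and the 100-character limit are all assumptions for the example. It guards against two things: input that arrives as a parsed object (for instance from a JSON request body) where a string was expected, and regular expression syntax supplied by the user.

```javascript
// Reject anything that is not a plain string, so that an object such as
// { $gt: '' } cannot be passed through as a query operator.
function requireString(input) {
  if (typeof input !== 'string') {
    throw new TypeError('expected a string');
  }
  return input;
}

// Escape characters that have special meaning in a regular expression,
// so user input is matched literally rather than interpreted as a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: a case-insensitive "title starts with" query.
// The length cap is an arbitrary illustrative limit.
function titlePrefixQuery(userInput) {
  const text = requireString(userInput).slice(0, 100);
  return { title: new RegExp('^' + escapeRegExp(text), 'i') };
}

const query = titlePrefixQuery('(a+)+$');
console.log(query.title.test('(a+)+$ and other patterns')); // true
console.log(query.title.test('aaaa')); // false
```

The result is still an ordinary query object, so it can be passed to the driver or merged with other criteria like any other query.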
Other deciding factors
Integrated Search - MongoDB's text queries can be mixed and matched with other criteria, right in the same query object. That means they are composable and you're never trying to create a painful join between two different kinds of databases.
Composability - MongoDB's query objects can be composed directly into larger queries using operators such as `$or`, without string concatenation. This allows separate functions to contribute permissions checks, type checks and range checks to the same query without overhead.
Aggregation - While we are not a big data company, we still use MongoDB's aggregation features for jobs like creating a more powerful replacement for the `distinct` method, one that can return counts for all the distinct values of a property.
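The three points above can be sketched together. This is an illustrative example, not Apostrophe's code: the helper names and the fields `visibility`, `editorIds`, `type`, `publishedAt` and `tags` are hypothetical. The operators `$and`, `$or`, `$text`, `$match`, `$group`, `$sum` and `$sort` are standard MongoDB features.

```javascript
// Each helper returns a plain query fragment; none of them builds strings.
function permissionClause(user) {
  return { $or: [{ visibility: 'public' }, { editorIds: user._id }] };
}

function typeClause(type) {
  return { type };
}

function dateRangeClause(from, to) {
  return { publishedAt: { $gte: from, $lt: to } };
}

// Text search is just another fragment in the same query object.
// (It assumes a text index exists on the collection.)
function textClause(search) {
  return { $text: { $search: search } };
}

// Combine independent fragments so that all of them must match.
function and(...clauses) {
  return { $and: clauses };
}

// A "distinct with counts" pipeline: one output document per distinct
// value of `property`, with the number of matching documents.
function distinctCountsPipeline(match, property) {
  return [
    { $match: match },
    { $group: { _id: '$' + property, count: { $sum: 1 } } },
    { $sort: { _id: 1 } }
  ];
}

const query = and(
  permissionClause({ _id: 'user1' }),
  typeClause('article'),
  dateRangeClause(new Date('2021-01-01'), new Date('2022-01-01')),
  textClause('coffee')
);

const pipeline = distinctCountsPipeline(query, 'tags');
console.log(JSON.stringify(pipeline, null, 2));

// With the official Node.js driver, these could be used as:
//   await collection.find(query).toArray();
//   await collection.aggregate(pipeline).toArray();
```

Each helper knows only about its own concern, and the caller decides how to combine them. The same composed query also serves as the `$match` stage of the counting pipeline.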
Good choices, great technology
So that's the story of how MongoDB allowed us to push the performance of ApostropheCMS up, push the complexity of our code down, and welcome our developers to the promised land of writing one language all day. If you have CMS tasks, you can avoid some code switching of your own by using a CMS that's native to your preferred database.