Mongo Data Modeling
Designing a Mongo database is fun in its own way, because it lets us understand data relationships more intuitively. This is my opinionated stab at how to approach Mongo database design for any application.
Documents are the basic unit of record in Mongo. They are schemaless and can have any type of fields and structure. Listed below are the most common terms.
- Documents - schemaless records made up of field and value pairs
- Collections - groups of documents
- Databases - groups of collections
These are the things to consider while designing a database for an application.
1. Flexible Schema
- Documents in a single collection do not need to have the same set of fields.
- The data type of a field can differ across documents within a collection.
However, in real-world applications we usually need some structure in the documents, and this can be achieved by defining validation rules for insert and update operations.
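To make the flexible schema idea concrete, here is a minimal sketch in Python. It does not talk to Mongo at all: plain dictionaries stand in for documents, a list stands in for a collection, and the `RULES` table and `is_valid` helper are hypothetical names invented for this illustration. In Mongo itself, such rules would be declared as validation rules on the collection rather than in application code.

```python
# Plain dicts stand in for documents; a list stands in for a collection.
# Both documents are allowed to coexist even though their fields differ
# and "age" has a different type in each one.
users = [
    {"_id": 1, "name": "Ada", "age": 36},
    {"_id": 2, "name": "Linus", "age": "unknown", "email": "linus@example.com"},
]

# Hypothetical rule set: required field name -> expected Python type.
RULES = {"name": str, "age": int}


def is_valid(doc, rules=RULES):
    """Return True if every required field is present with the expected type."""
    return all(
        field in doc and isinstance(doc[field], expected)
        for field, expected in rules.items()
    )


# Check each document against the rules, as a validator would on insert or update.
valid_flags = [is_valid(doc) for doc in users]
print(valid_flags)
```

The second document fails the check only because a rule was added; without rules, both shapes are equally acceptable to the collection.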
2. Document Structure
Designing data models for Mongo depends on the structure of the document and how the application represents relationships between data. Related data can be handled in a couple of ways.
- Embedded Data
- References
2.1 Embedded Data
In this type of design, related data is stored within a single document structure.
- Use it whenever there is a "contains" relationship between entities.
- Use it for one-to-many relationships where the child documents are viewed in the context of the parent document.
- It gives better read performance, because related data can be retrieved in a single database operation.
- It makes it possible to update related data in a single atomic write operation.
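As a rough illustration of the embedded model, the sketch below uses a plain Python dictionary in place of a Mongo document. The blog post and comment fields, and the `add_comment` helper, are assumptions made up for this example, not part of any Mongo API.

```python
# A parent document with its child records embedded in an array
# (hypothetical field names).
post = {
    "_id": 1,
    "title": "Mongo Data Modeling",
    "comments": [
        {"author": "ana", "text": "Nice overview"},
    ],
}


def add_comment(doc, author, text):
    """Append a comment to the parent document.

    Because the comments live inside the parent, this touches exactly one
    document, which is what makes a single atomic write possible in Mongo.
    """
    doc["comments"].append({"author": author, "text": text})
    return doc


add_comment(post, "ben", "Thanks")

# Reading the post also returns every comment: no second lookup is needed.
comment_authors = [c["author"] for c in post["comments"]]
print(comment_authors)
```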
2.2 References (Referential data)
In this type of design, the documents store the relationships between data by including links or references. There are two ways to store references.
- Manual references
- DBRefs
2.2.1 Manual References
With manual references, you save the _id field of one document in another document as a reference. Your application can then run a second query to return the related data. These references are simple and sufficient for most use cases.
Consider inserting two documents: one into a places collection, and one into a people collection that stores the _id of the first document in its places_id field.
Then, when a query returns the document from the people collection you can, if needed, make a second query for the document referenced by the places_id field in the places collection.
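The two inserts and the follow-up lookup described above can be sketched as follows. This is a simulation under stated assumptions, not driver code: dictionaries keyed by _id stand in for the places and people collections, `insert_one` is a hypothetical helper written for this example, and the names "Broadway Center" and "Erin" are illustrative values.

```python
import uuid

# Dicts keyed by _id stand in for the places and people collections.
places = {}
people = {}


def insert_one(collection, doc):
    """Store a copy of doc under its _id, generating an _id if it has none."""
    doc = dict(doc)
    doc.setdefault("_id", uuid.uuid4().hex)
    collection[doc["_id"]] = doc
    return doc["_id"]


# Insert the place first, then reuse its _id as a manual reference.
place_id = insert_one(places, {"name": "Broadway Center"})
person_id = insert_one(people, {"name": "Erin", "places_id": place_id})

# First query: fetch the person.
person = people[person_id]
# Second query: follow the places_id reference into the places collection.
place = places[person["places_id"]]
print(person["name"], "->", place["name"])
```

The cost of this model is visible in the last two steps: the related data takes a second lookup, which the embedded model avoids.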
2.2.2 DBRefs
DBRefs are references from one document to another that use the value of the referenced document's _id field, its collection name, and, optionally, its database name. By including these names, DBRefs allow documents located in multiple collections to be more easily linked with documents from a single collection.
For example, a DBRef can point to a document in the creators collection of the users database that has ObjectId("5126bc054aed4daf9e2ab772") in its _id field.
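A sketch of such a DBRef, and of how an application might resolve it, is shown below. The `$ref`, `$id` and `$db` field names are the ones DBRefs use; everything else is an assumption for illustration: nested dictionaries stand in for databases and collections, the ObjectId is represented as a plain string, the creator's name is invented, and `resolve` is a hypothetical helper.

```python
# Databases -> collections -> documents, modelled as nested dicts.
# The ObjectId is kept as a plain string in this simulation.
databases = {
    "users": {
        "creators": {
            "5126bc054aed4daf9e2ab772": {
                "_id": "5126bc054aed4daf9e2ab772",
                "name": "example creator",  # invented value
            },
        }
    }
}

# A document holding a DBRef:
#   $ref = collection name, $id = referenced _id, $db = optional database name.
doc = {
    "_id": "example-doc",
    "creator": {
        "$ref": "creators",
        "$id": "5126bc054aed4daf9e2ab772",
        "$db": "users",
    },
}


def resolve(dbref, default_db):
    """Look up the referenced document, falling back to default_db if $db is absent."""
    db_name = dbref.get("$db", default_db)
    return databases[db_name][dbref["$ref"]][dbref["$id"]]


creator = resolve(doc["creator"], default_db="users")
print(creator["name"])
```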
Let's review the concepts we've covered so far. We now know how to organise data in Mongo as documents, which is a good starting point for arriving at a database design for any product idea.
- Documents are schemaless, and we can enforce some structure by adding schema validation for all insert and update operations.
- Documents can embed related data or hold references to one or more other documents, depending on the size of the related data and the frequency of operations on it.
There are many other important topics that I have not covered here, such as indexes, transactions, sharding, capped collections and time to live (TTL). In my next post, I would like to cover CRUD operations and transactions.