Last Updated: February 25, 2016

MongoDB Schema Design: Embedded vs References

When designing schemas in MongoDB, especially the first few times, a quick checklist is useful for deciding whether or not to use embedded documents in a one-to-many relation.

So here is the checklist:

Type of query: ask yourself which model best represents the most frequent queries you'll need to run. If every time you fetch the parent document you always (or very often) need all of its child documents, the answer is nested.
Data model lifecycle: think about the life cycle of the container document and its content. Does it make sense for the child documents to still exist after the parent document is deleted? If the answer is "no", nested is the way to go.
Snapshots: another factor that should affect your choice is whether the nested item is a snapshot of something that happened at a given time. Suppose you're working with a "Receipt" object that contains a list of purchased products: these are copied into the receipt as nested documents, because you're storing information tied to a specific time, such as the product cost. So if your child documents are snapshots, pick a nested representation.
Direct access: how often do you need to access the nested document directly, without caring about its container? If you need this often, go for a flat design with references.
Number of nested objects: a MongoDB document has a size limit of 16MB (quite a large amount of data). If your subcollection can grow without limit, go flat.
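
To make the checklist a bit more concrete, here is a minimal sketch of the two shapes using plain Python dictionaries in place of real MongoDB documents. All collection and field names (product, receipt_embedded, post, comments, post_id) are made up for illustration, and the size check uses the JSON length as a rough stand-in for the real BSON size, so treat it as an approximation only.

```python
import copy
import json

# Hypothetical catalog entry; field names are illustrative only.
product = {"_id": "p1", "name": "Espresso cup", "price": 4.50}

# Nested design: the receipt stores a copy of each product as it was
# at purchase time, so the child document behaves as a snapshot.
receipt_embedded = {
    "_id": "r1",
    "date": "2016-02-25",
    "items": [copy.deepcopy(product)],
}

# Flat design with references: each child lives in its own collection
# and points back to its parent by id.
post = {"_id": "post1", "title": "Schema design"}
comments = [
    {"_id": "c1", "post_id": "post1", "text": "Nice checklist"},
    {"_id": "c2", "post_id": "post1", "text": "Thanks"},
]

# A later price change in the catalog does not touch the snapshot.
product["price"] = 5.00
snapshot_price = receipt_embedded["items"][0]["price"]

# Direct access: a child can be found without knowing its container.
comment_c2 = next(c for c in comments if c["_id"] == "c2")

# Rough check against the 16MB document limit (JSON length is only
# an approximation of the BSON size MongoDB actually measures).
MAX_DOC_BYTES = 16 * 1024 * 1024
approx_bytes = len(json.dumps(receipt_embedded).encode("utf-8"))
fits = approx_bytes < MAX_DOC_BYTES
```

In the nested case the receipt keeps the price it was created with, while in the flat case a comment can be looked up by its own id, which is the trade-off the "Snapshots" and "Direct access" points describe.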

Rule of thumb: if the amount of data to transfer doesn't hurt your client's experience and the number of subdocuments has an upper bound, go nested; otherwise go flat.
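
The rule of thumb can be written down as a tiny helper, if that helps to remember it. This is only a sketch: the function name choose_layout and its two boolean parameters are hypothetical, and deciding what counts as an acceptable payload or a bounded number of subdocuments is still up to you.

```python
def choose_layout(payload_ok_for_client: bool, child_count_bounded: bool) -> str:
    """Return "nested" only when both conditions of the rule of thumb hold."""
    if payload_ok_for_client and child_count_bounded:
        return "nested"
    return "flat"


# Example: a receipt with a handful of items versus an ever-growing comment list.
receipt_choice = choose_layout(payload_ok_for_client=True, child_count_bounded=True)
comments_choice = choose_layout(payload_ok_for_client=True, child_count_bounded=False)
```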

Thanks to Gabriele Lana for sharing this with me :)