Flexing NoSQL: MongoDB in review
MongoDB shines with broad programming language support, SQL-like queries, and out-of-the-box scaling
MongoDB: JSON document store
JSON is an extremely understandable format. Humans can easily read it (as opposed to XML, for example) and machines can efficiently parse it. A document in Mongo representing a business card, for example, would look something like this:
{
  "_id" : ObjectId("4efb731168ee6a18692d86cd"),
  "name" : "Andrew Glover",
  "cell" : "703-555-5555",
  "fax" : "703-555-0555",
  "address" : "29210 Corporate Dr, Suite 100, Anywhere USA"
}
The _id attribute in the document above represents a primary key in Mongo. Like a relational database, Mongo can index data and enforce uniqueness on data attributes. By default, the _id attribute is indexed; you can also index other individual fields or a combination of them (for example, the name and address). Additionally, when defining an index, you can specify that its value be unique.
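As a rough sketch of how such indexes might be declared from Python, the helper below assumes the PyMongo driver's create_index call and a collection handle for the business cards. The function name ensure_card_indexes and the choice of indexed fields are illustrative assumptions, not something prescribed by Mongo itself.

```python
# Sort direction for index keys; PyMongo exposes the same value as pymongo.ASCENDING.
ASCENDING = 1


def ensure_card_indexes(cards):
    """Create example indexes on a business-card collection.

    `cards` is assumed to be a PyMongo Collection (or any object with a
    compatible create_index method). The _id index already exists by default.
    """
    # Single-field index to speed up lookups by name.
    by_name = cards.create_index([("name", ASCENDING)])
    # Compound index on name + address; unique=True rejects a second
    # document with the same (name, address) pair.
    by_name_address = cards.create_index(
        [("name", ASCENDING), ("address", ASCENDING)], unique=True
    )
    return by_name, by_name_address
```

Whether a name-and-address pair should really be unique is a modeling decision; it is used here only because those are the fields mentioned above.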
Mongo, however, doesn't provide foreign key constraints or triggers. Documents in Mongo are free to refer to each other. For example, a document in a contact_log collection could refer back to a business card's _id above, thus providing a foreign-key-like link. But there is currently no way to specify corresponding actions to be taken should a related document be removed, such as removing all referencing documents as well, which you can do in a typical RDBMS. You can, of course, add this sort of logic in application code, and triggers are planned for a future release.
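The application-side cleanup described above could look roughly like the following. This is a minimal sketch assuming PyMongo's delete_many and delete_one calls; the collection name business_cards and the card_id field that contact_log documents use to point at a card are hypothetical.

```python
def remove_card_with_logs(db, card_id):
    """Delete a business card and every contact_log entry that refers to it.

    `db` is assumed to behave like a PyMongo Database, so db["name"] yields
    a collection. Returns (cards_removed, logs_removed).
    """
    # Remove the referencing documents first, so that a failure between the
    # two calls leaves a card without logs rather than logs without a card.
    logs_removed = db["contact_log"].delete_many({"card_id": card_id}).deleted_count
    # Then remove the card itself, matched on its primary key.
    cards_removed = db["business_cards"].delete_one({"_id": card_id}).deleted_count
    return cards_removed, logs_removed
```

The two deletes are separate operations, so the database does not guarantee that both happen together the way an ON DELETE CASCADE rule in an RDBMS would.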
JSON documents in Mongo do not force particular data types on attribute values. That is, there is no need to define upfront the format of a particular attribute. The data can be a string, an integer, or even an object type, provided it makes sense. By default, data types in Mongo documents include string, integer, boolean, double, array, date, object ID (which you can see in action in the business card example above), binary data (similar to a blob), and regular expression, although support for these latter types varies by driver.
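To make the list of types concrete, here is a small Python sketch of a single document that mixes them. It assumes the PyMongo driver's usual mapping from Python values to stored types; every field other than name is invented for illustration, and other drivers may map types differently.

```python
import re
from datetime import datetime, timezone

# One document mixing several value types. The comments name the Mongo type
# each Python value is assumed to be stored as when saved through PyMongo.
card = {
    "name": "Andrew Glover",                              # string
    "visits": 3,                                          # integer
    "active": True,                                       # boolean
    "rating": 4.5,                                        # double
    "phones": ["703-555-5555", "703-555-0555"],           # array
    "added": datetime(2012, 1, 1, tzinfo=timezone.utc),   # date
    "photo": b"\x89PNG",                                  # binary data
    "cell_pattern": re.compile(r"^703-"),                 # regular expression
}


def describe_types(document):
    """Return a mapping of field name to the Python type name of its value."""
    return {field: type(value).__name__ for field, value in document.items()}
```

Nothing in the collection declares these types ahead of time; another document could just as well store visits as a string.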
The freedom from rigid data definition is where a document store shines. Taking the business card example a bit further, the same collection could include this document:
{
  "_id" : ObjectId("4efb73a868ee6a18692d86ce"),
  "name" : "Mark Smith",
  "cell" : "301-555-5555",
  "address" : "23 Corporation Way, Anywhere USA",
  "twitter" : "msmith"
}
Note the subtle but important differences between the two. While both documents clearly represent business cards, the first includes a fax number and the second includes a Twitter handle instead.
Take a moment to think through how this same entity would be represented in an RDBMS. In this case, a business card table would need columns for both fax and Twitter, even though many rows would not have any data in those fields. Furthermore, altering a table's definition after the fact can be problematic, especially when that table contains a large amount of data. Thus, in some cases, a document store's freedom of data definition permits a high degree of variance in rapidly evolving data collections. In essence, a document store can permit data agility.
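The contrast can be sketched in a few lines of Python. The relational half uses the standard library's sqlite3 module as a stand-in for a typical RDBMS, with a hypothetical business_card table; the document half uses plain dictionaries in place of a real Mongo collection.

```python
import sqlite3

# Relational version: the table must declare both optional columns up front.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE business_card ("
    "name TEXT, cell TEXT, fax TEXT, address TEXT, twitter TEXT)"
)
conn.executemany(
    "INSERT INTO business_card (name, cell, fax, address, twitter) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Andrew Glover", "703-555-5555", "703-555-0555",
         "29210 Corporate Dr, Suite 100, Anywhere USA", None),
        ("Mark Smith", "301-555-5555", None,
         "23 Corporation Way, Anywhere USA", "msmith"),
    ],
)
# Each row comes back with None (SQL NULL) where the card lacks that attribute.
rows = conn.execute(
    "SELECT name, fax, twitter FROM business_card ORDER BY name"
).fetchall()

# Document version: each card carries only the attributes it actually has.
cards = [
    {"name": "Andrew Glover", "cell": "703-555-5555", "fax": "703-555-0555",
     "address": "29210 Corporate Dr, Suite 100, Anywhere USA"},
    {"name": "Mark Smith", "cell": "301-555-5555",
     "address": "23 Corporation Way, Anywhere USA", "twitter": "msmith"},
]
# Sorted field names per card, showing that the two shapes differ.
fields_per_card = [sorted(card) for card in cards]
```

Adding a third kind of contact detail later would mean an ALTER TABLE in the relational version, while the document version would simply gain a new key on the cards that need it.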