NoSQL showdown: MongoDB vs. Couchbase
MongoDB edges Couchbase Server with richer querying and indexing options, as well as superior ease of use
You can define a text index on multiple string fields, but there can be only a single text index per collection, and indexes do not store word proximity information (that is, how close words are to one another, which can affect how matches are weighted). In addition, the text index is fully consistent: when you update data, the index is also updated.
Ease-of-use features have been added to version 2.4 as well. For example, you can now define a "capped array" as a data element, which works sort of like an ordered circular buffer. If, for example, you're keeping track of the top 10 entries in a blog, using a capped array will allow you to add new entries, and (based on the specified ordering) previous entries will be removed to cap the array at 10 or whatever number you specify.
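The capped-array behavior can be modeled in a few lines. In MongoDB 2.4 it is expressed through the $push update operator with its $each, $sort, and $slice modifiers; the sketch below only imitates those semantics in plain Python and does not talk to a database. The function name push_capped and the sample blog entries are hypothetical.

```python
from operator import itemgetter


def push_capped(array, new_items, sort_key, cap):
    """Model of a capped array: add items, sort ascending, keep the last `cap`."""
    merged = sorted(array + list(new_items), key=itemgetter(sort_key))
    # Keeping the tail of an ascending sort retains the highest-ranked entries.
    return merged[-cap:] if cap > 0 else []


# Hypothetical "top entries" list, capped at 3 instead of 10 for brevity.
top_posts = [
    {"title": "a", "views": 50},
    {"title": "b", "views": 80},
    {"title": "c", "views": 65},
]
top_posts = push_capped(top_posts, [{"title": "d", "views": 70}], "views", 3)
titles = [post["title"] for post in top_posts]
```

Adding entry "d" pushes the lowest-ranked entry "a" out of the array, so the cap holds without any separate cleanup step.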
MongoDB 2.4 also has improved geospatial capabilities. For example, you can now perform polygon operations, which would allow you to determine if two regions overlap. The spherical model used in 2.4 is improved too; it now takes into account the fact that the earth is not perfectly spherical, so distance calculations are more accurate.
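For context on what a purely spherical model computes, here is a sketch of the classic haversine great-circle distance. It is not MongoDB's implementation; it is the simple baseline that treats the earth as a perfect sphere, and the mean radius constant is an assumption of this sketch. An ellipsoidal model would return slightly different distances.

```python
import math

# Mean Earth radius in kilometers; a single radius is itself an approximation.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two (longitude, latitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator.
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
```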
In Couchbase Server, the mapreduce operation's primary job is to provide a structured query and information aggregation capability on the documents in the database. In MongoDB, mapreduce can be used not only for querying and aggregating results, but as a general-purpose data processing tool. Just as a mapreduce operation executes within a given bucket in Couchbase Server, mapreduce executes within a given collection in a MongoDB database. As in Couchbase Server, [...]
You can filter the documents passed into the map function via comparison operators, or you can limit them to a specific number. This allows you to create what amounts to an incremental mapreduce operation. Initially, you run mapreduce over the entire collection. For subsequent executions, you add a query condition that selects only newly added documents. From there, set the output of mapreduce to be a separate collection, and configure your code so that the new results are merged into the existing results.
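The incremental pattern can be sketched in plain Python, without a database. In MongoDB the same idea maps onto the mapReduce command's query option and an out target that merges or re-reduces into an existing collection. Everything below (the function names, the id-based "newly added" filter, the page-view documents) is a hypothetical illustration, not MongoDB's API.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn, query=None):
    """Run map over the docs that match `query`, then reduce the values per key."""
    emitted = defaultdict(list)
    for doc in docs:
        if query is None or query(doc):
            for key, value in map_fn(doc):
                emitted[key].append(value)
    return {key: reduce_fn(key, values) for key, values in emitted.items()}


def merge_results(existing, fresh, reduce_fn):
    """Fold freshly computed results into the stored ones by re-reducing."""
    merged = dict(existing)
    for key, value in fresh.items():
        merged[key] = reduce_fn(key, [merged[key], value]) if key in merged else value
    return merged


def map_views(doc):
    yield doc["page"], doc["views"]


def sum_values(key, values):
    return sum(values)


# Initial run over the whole (hypothetical) collection.
log = [{"id": 1, "page": "home", "views": 3}, {"id": 2, "page": "about", "views": 1}]
totals = map_reduce(log, map_views, sum_values)
last_seen = 2

# Later run: only documents added since the previous run are mapped.
log += [{"id": 3, "page": "home", "views": 4}, {"id": 4, "page": "docs", "views": 2}]
fresh = map_reduce(log, map_views, sum_values, query=lambda d: d["id"] > last_seen)
totals = merge_results(totals, fresh, sum_values)
```

Merging this way only gives the same answer as a full rerun when the reduce function can safely be applied to its own output, as summation can.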
Further speed/size trade-offs are possible by choosing whether the intermediate results (the output of the map function, sent to the [...] map function. But as there is no writing to disk, the processing is faster.
You have to be careful with long-running mapreduce operations, because their execution involves lengthy locks. As mentioned earlier, the system has built-in facilities to mitigate this. For example, the read lock on the input collection is yielded every 100 documents. The MongoDB documentation describes the various locks that must be considered -- as well as mechanisms to relieve the possible problems.
Management access with the MongoDB database goes through the interactive [...] use <databasename>. But that command doesn't check for the presence of the specific database; if you mistype it and proceed to enter documents into that database, you might not know what's going on until you've put a whole lot of documents into the wrong place. The same goes for collections within databases.
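One way to guard against the mistyped-name problem in application code is to check the requested name against the databases that already exist before writing anything. The sketch below is a hypothetical helper, not part of MongoDB or its shell. The list of existing names is hard-coded here; in practice it could come from a driver call such as PyMongo's MongoClient.list_database_names().

```python
import difflib


def resolve_database(name, existing_names):
    """Return `name` if it is a known database; otherwise fail with a hint."""
    if name in existing_names:
        return name
    # Suggest the closest existing name, if any is reasonably similar.
    suggestions = difflib.get_close_matches(name, existing_names, n=1)
    hint = f" Did you mean {suggestions[0]!r}?" if suggestions else ""
    raise ValueError(f"Database {name!r} does not exist.{hint}")


# Hypothetical database names for illustration.
existing = ["inventory", "analytics"]
selected = resolve_database("inventory", existing)

try:
    resolve_database("inventroy", existing)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

The same check applies to collection names within a database, since a mistyped collection is created just as silently.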
Other useful command-line utilities include mongostat, which returns information concerning the number of operations -- inserts, updates, deletes, and so on -- within a specific time period. The mongotop utility likewise returns statistical information on a MongoDB instance, this time focusing on a specific collection. You can see the amount of time spent reading or writing in the collection, for instance.
In addition, 10gen offers the free, cloud-based MongoDB Monitoring Service (MMS), which provides a monitoring dashboard for MongoDB installations. Built on the SaaS model, MMS requires you to run a small agent on your MongoDB cluster that communicates with the management system.
10gen's MongoDB Monitoring Service shows statistics -- in this case, for a replica set -- but management of the database is done from the command line.