Tech - NOSQL Tutorial - Part 2 (Overview)


I started looking at various NoSQL databases like HBase, CouchDB and MongoDB. I finally settled on MongoDB, mainly because it was easy to set up on Windows. The name 'Mongo' is derived from 'humongous', an interesting name! For more details, you can refer to http://www.mongodb.org. As it says on the website, it is very easy to set up and highly scalable. Here is my first set of statistics -
  • Time to download the software (64 bit edition) - 2 minutes
  • Time to setup the database - 2 minutes
  • Time to get the database up and running - 1 minute
And this is where I found the database to be refreshing. It is quick even to set up, let alone to process data! Overall, I found these to be the key features of this database -
  • Light footprint. Easy to set up, as either a single or a replicated instance, and easy to administer.
  • Schema-free data storage. In other words, there is no need to define any data structure up front.
  • Incredibly quick for reads and writes.
  • MapReduce (with JavaScript). A quick shell sketch follows below.
  • Very easy to scale horizontally.
  • RESTful APIs available to access the database.
I will cover some of these features in separate posts.
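
As a quick illustration of the schema-free storage and the JavaScript-based MapReduce, here is a minimal mongo shell sketch (the collection and field names are made up for the example):

    // Schema-free storage: documents in the same collection can have different fields,
    // and no structure needs to be declared before inserting.
    db.people.insert({ name: "Alice", city: "Pune" });
    db.people.insert({ name: "Bob", city: "Pune", age: 34, tags: ["admin", "dba"] });

    // MapReduce in JavaScript: count people per city.
    var mapFn = function () {
        emit(this.city, 1);            // key = city, value = 1 per document
    };
    var reduceFn = function (key, values) {
        return Array.sum(values);      // add up the 1s for each city
    };
    db.people.mapReduce(mapFn, reduceFn, { out: "people_per_city" });

    // The counts land in the output collection.
    db.people_per_city.find();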


Here is the next set of statistics, around some complementary software.
  • RockMongo - a web-based (PHP) administration console. Time to download - 1 minute. Time to configure - 1 minute.
  • MongoLive - a web-based (Chrome app) monitoring tool. Time to download - 1 minute. Time to configure - 1 minute.
Of course, there are many other tools available. The default console client, the mongo shell, is still the best tool to access the database. And with these web-based tools, it's such a change from the big and slow clients required by some of the RDBMS databases. You can refer to the MongoDB website for help and tutorials.
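
To give a flavour of the shell, here is the kind of one-liner session that covers much of what the web consoles show (the database name here is just an example):

    // From the mongo shell - the same information the web tools surface.
    show dbs                  // databases on this server
    use testdb                // switch to a database
    db.stats()                // storage size, object counts, index sizes
    db.serverStatus()         // server-wide counters: connections, ops, memory
    db.currentOp()            // operations currently in progress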


One of the key things I wanted to try out was the performance of the database, so I had to get a 'meaty' set of data. I generated my own dummy data with about 20 million rows and 10 columns.
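
My data set was generated outside the database and then imported, but purely as an illustration, a smaller set of similar dummy documents can be generated straight from the mongo shell (the collection and field names are invented for the example):

    // Insert dummy documents with a lookup key and a handful of value columns.
    // (Only 100,000 here - a 20 million row set is better generated and bulk imported.)
    for (var i = 0; i < 100000; i++) {
        db.dummy.insert({
            key: i,                                    // the lookup key
            col1: "value-" + i,
            col2: Math.floor(Math.random() * 1000),    // some random numbers
            col3: Math.random(),
            created: new Date()
        });
    }
    db.dummy.count();                                  // sanity check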


I used the default mongoimport tool to import the data; it took about 3 hours. Next, I created the indexes, which took another 5 hours, a bit more than I expected. All set now to test it. I also built a web page (using PHP) that queries the data by its key. The time it took to access the data was 0.0083 seconds! Of course, this is with one query running at a time. Wait for the next post on performance testing - it's MongoDB vs SQL Server 2008!
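
For reference, the import and indexing steps look roughly like this - the file name, database name and field names are placeholders, assuming a CSV file with a header line and a column called 'key':

    // Import a CSV file with a header line into the 'dummy' collection.
    // (Run from the operating system prompt, not the mongo shell:)
    //     mongoimport --db testdb --collection dummy --type csv --headerline --file dummy_data.csv

    // Then, from the mongo shell, create the index and query by key:
    db.dummy.ensureIndex({ key: 1 });              // index on the lookup column
    db.dummy.find({ key: 1234567 });               // the kind of lookup the PHP page runs
    db.dummy.find({ key: 1234567 }).explain();     // confirm the index is used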

Click here for - Part 3.
