muSOAing for 8/30/10 – NoSQL

As data and information proliferate, so does the NoSQL movement. NoSQL basically deals with data that is not necessarily stored in a traditional relational database. With the spread of grid-based and MPP architectures, techniques such as running a query in parallel on each node of a distributed data store are now the norm.

Of these, Map/Reduce seems to have the largest mind share. Hadoop and Hadoop-based platforms make extensive use of it. Similar grid-based architectures from commercial vendors like AsterData and GreenPlum have enhanced implementations that marry the best features of SQL with Map/Reduce to come up with SQL-MR.

If you thought SQL was cool, then wait till you see what Map/Reduce can do. It is really SQL on steroids. It is text processing elevated to the nth degree, supported by a truly parallel processing engine that can execute at lightning speed across many nodes and collate the results for you. To put it in plain terms, it is kinda like the Google search results you get based on the keywords you submit. In fact, it was Google that invented and pioneered Map/Reduce and later published the algorithm. BigTable, often mentioned in the same breath, is Google's distributed storage system rather than a Map/Reduce implementation.
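To make the idea a bit more concrete, here is a minimal sketch of the Map/Reduce pattern as a word count in plain Python. It is only an illustration and not how Hadoop or Google actually implement it: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example, each string in the list stands in for the slice of data held by one node, and a thread pool stands in for the cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group emitted values by key, as a framework would between the phases."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # Assumption: each chunk plays the role of one node's local data,
    # and the thread pool plays the role of the cluster running the map step.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, chunks))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


chunks = ["big data is big", "data is everywhere"]
counts = word_count(chunks)
print(counts)
```

The map step never needs to see more than its own chunk, which is why it can be spread across as many nodes as you have. Only the grouped results have to be brought together for the reduce step.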

I feel that this field is going to create, or has already created, a whole new career group, and in future you are increasingly going to see requirements for positions like “Big Data Architect”, “Hadoop Programmer” etc. Data and information are only going to increase, and volumes are expected to reach the zettabytes-per-day stage very soon, if they have not already. So there are going to be challenges in all areas of information management: transferring, storing, mining and analyzing. So are you ready for Big Data?


