Archive for June, 2011

muSOAing for 6/17/11 – Hadoop Metrics

June 17, 2011

One big area of interest for me is Hadoop metrics. These are the numbers you can see on the default web page on port 50030, and they are also available to you through the JobClient API. As of this writing, the whole metrics infrastructure is undergoing a sea change, as are the APIs in general. The APIs are changing for both Hadoop and HBase.
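As a rough illustration of the web-page route, here is a minimal Python sketch that pulls the raw status page from port 50030. The host name in the usage line is hypothetical, and the page name `jobtracker.jsp` is an assumption based on 0.20-era releases, so adjust both for your cluster. The helper names `jobtracker_url` and `fetch_status_page` are mine, not part of Hadoop.

```python
from urllib.request import urlopen

# Default JobTracker web UI port mentioned in the post.
JOBTRACKER_WEB_PORT = 50030


def jobtracker_url(host, port=JOBTRACKER_WEB_PORT, page="jobtracker.jsp"):
    """Build the URL of a JobTracker status page.

    The default page name is an assumption; other releases may differ.
    """
    return "http://{}:{}/{}".format(host, port, page)


def fetch_status_page(host, timeout=10):
    """Fetch the JobTracker summary page and return its HTML as text."""
    with urlopen(jobtracker_url(host), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_status_page("jobtracker.example.com")` against a live cluster would hand back the same HTML the browser shows. For anything beyond eyeballing it, the JobClient API is the better route, since scraping HTML breaks whenever the page layout changes.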

This is a good sign. It says that adoption of these tools is increasing, and as people use these products they are demanding more features. It seems at this point that the primary users of these platforms come from the data warehousing (DW) camp. Their interest lies in using alternate platforms to perform the same DW and BI tasks that they have been doing with traditional relational database infrastructures. Moving to a new platform like Hadoop also requires a different mindset, which means unlearning a few of the concepts associated with traditional platforms. It will also mean coming out of your comfort zone and not expecting Hadoop+HBase to behave exactly like your Oracle data warehouse. This is a bit like switching from Windows to Mac. Things will only get better; you just have to get used to it.

I think I have digressed a bit. I started off with Hadoop metrics but then wandered off into some pontification.