muSOAing for 11/24/11 – LINUXization of Hadoop

November 24, 2011

I was tempted to call it the Balkanization of BigData, with Hortonworks being the Slovenia, Cloudera the Serbia and Apache’s distribution probably the Croatia. But that would be totally inappropriate. There will be no Balkanization, but there certainly will be LINUXization. At least these three versions will be pre-eminent, and which one people will opt for is anybody’s guess. There are really two camps in the Hadoop world: those who favor RPM distributions and those who are die-hard tarball fans. I belong to the latter camp.

I think tarballs are the way to go, and anyone in the Hadoop release business should definitely support both formats. Once you have the process nailed, tarballs are the easiest to install. They lend themselves to super fast distribution and deployment with tools like Puppet, or even your own homegrown tools built with shell and Tcl/Expect scripts. Best of all, you can do a Hadoop install without root access, and that is the best perk that tarballs offer.
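To make the no-root point concrete, here is a minimal Python sketch of unpacking a release tarball under a directory the current user owns. The function name `install_tarball`, the `~/opt` prefix and the tarball file name in the usage note are all hypothetical, and the sketch assumes the archive has a single top-level directory. It is an illustration of the idea, not a description of how any particular distribution ships its installer.

```python
import os
import tarfile
from pathlib import Path


def install_tarball(tarball: str, prefix: str) -> Path:
    """Unpack a release tarball under a user-owned prefix; return the install dir.

    Hypothetical helper: nothing here needs root as long as the current
    user can write to `prefix`.
    """
    prefix_path = Path(prefix).expanduser().resolve()
    prefix_path.mkdir(parents=True, exist_ok=True)

    with tarfile.open(tarball, "r:*") as archive:
        members = archive.getmembers()

        # Refuse member names that would land outside the prefix.
        for member in members:
            target = (prefix_path / member.name).resolve()
            if os.path.commonpath([str(prefix_path), str(target)]) != str(prefix_path):
                raise ValueError(f"unsafe path in archive: {member.name}")

        # Assumption: the release unpacks into exactly one top-level directory.
        roots = {Path(m.name).parts[0] for m in members if Path(m.name).parts}
        if len(roots) != 1:
            raise ValueError("expected a single top-level directory in the archive")

        archive.extractall(prefix_path)

    return prefix_path / roots.pop()
```

A call such as `install_tarball("hadoop-x.y.z.tar.gz", "~/opt")` (placeholder file name) would return the unpacked directory, which a deployment tool could then copy to each node or point its environment settings at.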

RPMs certainly have a mind of their own. Moreover, you generally need root access to install them. The files end up scattered in myriad places, and an installation done this way is very hard to track and maintain.

All said and done, the Hadoop universe is truly unique, as it brings a hitherto non-existent synergy between hardware, software and distributed computing. For me personally it is very good to be part of this universe, and I would even go to the extent of calling it the Vedanta of high-end computing. Enough pontification on Thanksgiving Day. With this, I am taking my hands off the keyboard and closing the screen on my MacBook Pro (another great perk of working with Hadoop).