Archive for August, 2011

muSOAing for 8/7/11 – The World of HBase

August 8, 2011

It seems that with each passing day there is something new happening in the HBase ecosystem. I have already mentioned the availability of co-processors from v0.91 onwards, but let us look at the plethora of options available today.

For someone from the relational world, HBase may come across as a very loosely typed database system. I was led to believe the same myself, and who would not be, when all it takes to create a table is to specify its name and a column family? You need not even specify the table key or its columns; that can be done on the fly. But beware: this simplicity is deceptive, akin to an iron hand in a velvet glove, and it does not mean that you can let your guard down. HBase was designed to handle mountains of information, and the way it does that is through its very rich Java API, which gives you myriad options to both ETL and query information in and out of its highly distributed and scalable architecture.
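To make the "name plus a column family" point concrete, here is a minimal sketch in plain Java. It is a toy in-memory model and not the HBase client API: the class ToyTable and its methods are hypothetical names invented for illustration. The only things fixed when the table is created are its name and its column families; row keys and column qualifiers simply appear on the first write.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model, NOT the HBase client API. All names here are hypothetical.
class ToyTable {
    final String name;
    // column family -> (row key -> (column qualifier -> value)); rows stay sorted by key
    private final Map<String, TreeMap<String, TreeMap<String, String>>> families = new TreeMap<>();

    // Creating a table needs only a name and at least one column family.
    ToyTable(String name, String... columnFamilies) {
        if (columnFamilies.length == 0) {
            throw new IllegalArgumentException("at least one column family is required");
        }
        this.name = name;
        for (String family : columnFamilies) {
            families.put(family, new TreeMap<>());
        }
    }

    // Row keys and qualifiers are created on the fly; only the family must already exist.
    void put(String row, String family, String qualifier, String value) {
        TreeMap<String, TreeMap<String, String>> rows = families.get(family);
        if (rows == null) {
            throw new IllegalArgumentException("unknown column family: " + family);
        }
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(qualifier, value);
    }

    // Returns null when the family, row or column does not exist.
    String get(String row, String family, String qualifier) {
        TreeMap<String, TreeMap<String, String>> rows = families.get(family);
        if (rows == null) {
            return null;
        }
        TreeMap<String, String> columns = rows.get(row);
        return columns == null ? null : columns.get(qualifier);
    }

    public static void main(String[] args) {
        ToyTable users = new ToyTable("users", "info");
        users.put("row1", "info", "email", "a@example.com"); // column "email" did not exist before
        users.put("row2", "info", "phone", "555-0100");      // a different column in another row
        System.out.println(users.get("row1", "info", "email"));
    }
}
```

Notice that two rows in the same family can carry entirely different columns, which is where the "loosely typed" impression comes from.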

Mull this over (with a glass of your favorite libation): for ETL alone you have a variety of options, starting with the most atomic PUT and moving on to batch PUTs, HFileOutputFormat, bulk uploads and endless combinations of these, including map/reduce. There is an equally diverse buffet of options for querying, such as atomic GETs, batch GETs, range scans, co-processors etc.
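The difference between these access patterns is easier to see in code. What follows is a rough sketch in plain Java, again a toy model and not the HBase API: ToyStore and its methods are hypothetical names, and a sorted TreeMap stands in for HBase's row-key ordering (strings here, whereas HBase sorts raw bytes). It shows a single put, a batch put, a single get, a batch get, and a range scan whose start row is inclusive and stop row exclusive, as in an HBase scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy in-memory model, NOT the HBase client API. All names here are hypothetical.
class ToyStore {
    // Rows are kept sorted by key, which is what makes range scans cheap.
    private final TreeMap<String, String> rows = new TreeMap<>();

    // Atomic PUT: write one row.
    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Batch PUT: write many rows in one call.
    void putAll(Map<String, String> batch) {
        rows.putAll(batch);
    }

    // Atomic GET: read one row, or null if it is absent.
    String get(String rowKey) {
        return rows.get(rowKey);
    }

    // Batch GET: results come back in request order, with null for missing rows.
    List<String> getAll(List<String> rowKeys) {
        List<String> results = new ArrayList<>();
        for (String key : rowKeys) {
            results.add(rows.get(key));
        }
        return results;
    }

    // Range scan: start row inclusive, stop row exclusive.
    SortedMap<String, String> scan(String startRow, String stopRow) {
        return rows.subMap(startRow, true, stopRow, false);
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("row-003", "c");

        Map<String, String> batch = new TreeMap<>();
        batch.put("row-001", "a");
        batch.put("row-002", "b");
        batch.put("row-004", "d");
        store.putAll(batch);

        System.out.println(store.get("row-003"));
        System.out.println(store.scan("row-002", "row-004"));
    }
}
```

HFileOutputFormat, bulk uploads, map/reduce and co-processors are deliberately left out here, since they depend on the cluster and cannot be imitated honestly with a sorted map.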

Watch this space for more detailed information including examples of these features.