HadoopDB

OK, so now we're talking. Thanks to Slashdot, I found this article on HadoopDB.


The great thing about this is that I can use JDBC in my front-end code and leverage the benefits of Hadoop and everything else on the back end without major code changes. At least, that is my initial thought.
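To make the "no major code changes" idea concrete, here is a rough sketch of front-end code that only touches the standard java.sql interfaces. It assumes the backend is reachable through some JDBC driver and URL, which is my hope rather than something I have confirmed for HadoopDB. The class name, the visits table and its columns are all made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class JdbcFrontEndSketch {
    // Hypothetical table and column names, for illustration only.
    public static final String QUERY =
        "SELECT page, COUNT(*) AS hits FROM visits WHERE day = ? GROUP BY page";

    // The front end sees only java.sql interfaces, never the backend itself.
    public static void printHits(Connection conn, String day) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(QUERY)) {
            stmt.setString(1, day);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
                }
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.out.println("usage: JdbcFrontEndSketch <jdbc-url> <day>");
            return;
        }
        // The JDBC URL is the only backend-specific piece; it comes from the
        // command line here (assumption: a suitable driver is on the classpath).
        try (Connection conn = DriverManager.getConnection(args[0])) {
            printHits(conn, args[1]);
        }
    }
}
```

If that assumption holds, swapping the backend should mostly come down to changing the URL and the driver jar, while printHits stays the same.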

hadoop thoughts

After two months of playing with hadoop-core, hbase and the rest of the hadoop-related projects, I have sat, pondered and wondered.

First, what is hadoop?

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.

Ok, if you want to know more about hadoop and hbase, etc., go read http://hadoop.apache.org/ and also http://www.cloudera.com/

My initial thoughts about hadoop were in no particular order:

awesome. cool. sweet. huh? hmmmm. what?

awesome:
HDFS. Hadoop file system. Finally, I had at my fingertips a network storage system that didn't cost a lot and was fairly easy to set up. I started playing around and toying with Lucene and other pieces, and that led me to:

cool:
yes, it is cool. if you are a geek, it's great. i see many applications for this, even without getting into map/reduce, which is one of the fundamental benefits of hadoop.

sweet:
web interfaces out of the box for all of the different services. hadoop, hbase. sweet! as much as i love to get down and dirty with a cli (command line interface) ... loading up multiple webpages in chrome was and still is appealing.

huh?
SPOF (single point of failure) ... by design, or not, the namenode in hadoop is a single point of failure. from the website:

The NameNode machine is a single point of failure for an HDFS cluster. If the NameNode machine fails, manual intervention is necessary. Currently, automatic restart and failover of the NameNode software to another machine is not supported.


hmmmmm:
OK, so presumably this limitation will be removed in the future. I have to keep reminding myself that the project is not that old and there are bound to be some letdowns along the way, so long as those letdowns get fixed by the time it finally gets somewhere. forward thinking and long term planning.

what?
POSIX (Portable Operating System Interface for Unix). my issue here is that hdfs is not a POSIX file system. this means that it doesn't behave like a normal fs, yet. i had hoped, and in a way had forced myself to believe, that updating existing files would be supported and supported well. after a lot of digging, it turned out this wasn't the case. a lot of digging. sure, you can add a few configuration lines and it will update files ... but you get into lots of problems.
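One common way to live with a file system that doesn't handle in-place updates well is to never update in place: write a fresh copy with the change applied, then swap it in for the old file. Here is a minimal sketch of that idea. It uses plain java.nio on a local temp file as a stand-in and does not talk to HDFS at all. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RewriteInsteadOfUpdate {
    // Replaces one line by writing a complete new copy of the file and then
    // moving it over the original, instead of modifying the original in place.
    public static void replaceLine(Path file, int index, String newLine) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(file, StandardCharsets.UTF_8));
        lines.set(index, newLine);
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        Files.write(tmp, lines, StandardCharsets.UTF_8);
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("records", ".txt");
        Files.write(file, Arrays.asList("a=1", "b=2", "c=3"), StandardCharsets.UTF_8);
        replaceLine(file, 1, "b=20");
        // prints [a=1, b=20, c=3]
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

The obvious cost is that a one-line change rewrites the whole file, which is fine for small files and painful for big ones. That is pretty much the trade-off I kept running into.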

Conclusion?
In 6-12 months I think that this technology will be amazing...but as it currently stands, I think it may be as great as Windows 7 ... If you lower your expectations and expect some blue screens, I think you'll be fine. If you want it to be perfect out of the gate with wonderful documentation and features that are needed en masse ... give it some time.

I am.