Revision 3 (Henning Blohm, 11.09.2012 22:22) → Revision 4/8 (Henning Blohm, 20.09.2012 12:29)

h1. How to use Z2 with Hadoop 

 *DRAFT* 
 
 One of Z2's most intriguing capabilities is its seamless integration with Hadoop, in the sense that Map-Reduce jobs are now just (really!) ordinary application components. 

 This means that M/R job implementations can re-use other application modules just like a Web app does. Job implementations may be built on Spring, use Hibernate, use the very same data source definitions, and so on. No special assembly and wiring is needed just because you want to run Map-Reduce with Hadoop.  

 Instead, when Hadoop executes a distributed job, Z2 runs embedded in the job's process and executes tasks within its normal component model. That is, your code now runs outside of its server context, but within exactly the same logical environment. 

 !z2_hadoop.png! 

 All required extensions are provided by the [[Hadoop Add-on]]. 

 Note that this add-on is version-specific with respect to Hadoop and companion components. While the Z2 extension modules should be rather version independent, the Hadoop and HBase client access is not. Please consult the add-on page to find out more about what is in the add-on and how you may be able to tweak it. 

 Hadoop is not a completely trivial system. Before you try Z2 with Hadoop, we assume some basic understanding of what Hadoop does.  

 On the other hand, the samples provided on this site may just be one of the simplest and fastest ways of getting a Hadoop setup up and running in the first place. There is an instructive sample, *z2-samples.hadoop-simple*, that we will refer to throughout and that you should make sure to have up and running before you continue to more complex setups. 

 To learn more, there are two instructive samples: 

 * [[Sample-hadoop-basic]]: An adaptation of the classic word count example on Z2. This is the simplest way to see something running and to get a feeling for how Z2 comes into the picture. 
 * [[Sample-hbase-full-stack-TBD]]: A more complex example featuring a richer application environment, based on HBase on Hadoop. It is closer to real application scenarios than the basic sample.
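
 For readers who have not seen the classic word count before, the following is a minimal, dependency-free sketch of its map and reduce phases in plain Java. It is not the code of the basic sample and it uses neither the Hadoop nor the Z2 API; the class name @WordCountSketch@ and its methods are hypothetical and chosen for illustration only. In a real job, Hadoop would perform the grouping step and distribute the work across task processes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop or Z2 classes involved.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all counts that were emitted for a single word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Local stand-in for the framework: group map output by word, then reduce each group.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the word counts in alphabetical order of the words.
        System.out.println(count(List.of("to be or not to be", "that is the question")));
    }
}
```

 The point of the Z2 integration described above is that the bodies of such map and reduce functions can call into ordinary application modules rather than living in a separately assembled job archive.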