h2. Install CDH4 from a preconfigured repository

This site provides a pre-configured, one-checkout, user-space installation of Cloudera's CDH4 Hadoop and HBase distributions. This page explains how to install it on your machine, which is really simple compared to the usually suggested Hadoop installation procedures.

*Note #1:* This will only work on Linux or Mac OS.

*Note #2:* The repository also contains an Eclipse project file and provides Eclipse launchers for most of the functions required.

In short, there are three steps:

# Clone the repository
# Adapt your local environment
# Format HDFS

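Put together, and anticipating the details in the sections below, the whole procedure boils down to roughly this sketch (paths are examples, and the scripts are assumed to sit at the repository root):

<pre><code class="ruby">
cd install
git clone http://git.z2-environment.net/z2-samples.cdh4-base
cd z2-samples.cdh4-base
# edit env.sh: set JAVA_HOME and NOSQL_HOME (see below)
# also enable password-less SSH to localhost (see below)
./format_dfs.sh
</code></pre>
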
h3. Clone the repository

The pre-configured distribution is stored in the repository "z2-samples-cdh4-base":http://redmine.z2-environment.net/projects/z2-samples/repository/z2-samples-cdh4-base. We assume you install everything (including an Eclipse workspace, if you run the samples) in a folder named *install*.

<pre><code class="ruby">
cd install
git clone http://git.z2-environment.net/z2-samples.cdh4-base
</code></pre>

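If the clone worked, you should now have a local working copy that contains, among other things, the *env.sh* script used in the next step. A quick sanity check (the folder name is assumed to be the default derived from the repository URL):

<pre><code class="ruby">
cd z2-samples.cdh4-base
ls env.sh
</code></pre>
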
h3. Adapt your environment

Before you can run anything, there are two customizations needed:

h4. Set important environment variables

There is a shell script "env.sh":http://redmine.z2-environment.net/projects/z2-samples/repository/z2-samples-cdh4-base/revisions/master/entry/env.sh that you should open and change as described. At the time of this writing you are required to define JAVA_HOME (please do, even if it is set elsewhere already) and NOSQL_HOME, which is the absolute path of the folder that contains the *env.sh* file. This script is called from many places.

h4. Enable password-less SSH

Currently this is still required for the start / stop scripts to work. This requirement may be dropped in the future.

If you have not created an SSH key yet, or have no idea what that is, run

<pre><code class="ruby">
ssh-keygen
</code></pre>

(just keep hitting enter). Next, copy that key over to the machine you want to log on to without a password, i.e. localhost in this case:

<pre><code class="ruby">
ssh-copy-id <your user name>@localhost
</code></pre>

| 47 | |||
| 48 | 1 | Henning Blohm | If this fails because your SSH works differently, or ssh will refuse to log on without password please "ask the internet". Sorry. |
| 49 | 2 | Henning Blohm | |
| 50 | All that matters is that in the end |
||
| 51 | |||
<pre><code class="ruby">
ssh <your user name>@localhost
</code></pre>

(substituting <your user name> with your actual user name, of course) works without asking for a password.

h3. Formatting HDFS

Finally, the last step before you can start up is to prepare the local node to store data. This is done by running the *format_dfs.sh* script. Alternatively, you can use the Eclipse launcher of the same name.

This should complete without any questions or errors. Otherwise, please verify your settings above.

h3. Start and Stop

Depending on your sample requirements, you can start Hadoop (HDFS, YARN, the history server) or HBase (including all the Hadoop services) using the *start_hadoop.sh* script (or launcher) or the *start_hbase.sh* script (or launcher), respectively. Similarly, you can stop everything again with the corresponding stop scripts.

If you ran the start script and it returned, here are some URLs you should check to verify that everything is looking good:

* Try to reach the NameNode at
