Hadoop CDH 4.4.0 Installation
-
Create a Hadoop directory. Download all your components under this directory.
sudo mkdir /usr/local/hadoop
-
Change to the installation directory
cd /usr/local/hadoop
-
Download Hadoop tarball file
wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.4.0.tar.gz
-
Unpack the tarball file
sudo tar -zxvf hadoop-2.0.0-cdh4.4.0.tar.gz
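As a quick sanity check (assuming the extraction finished without errors), list the unpacked directory; you should at least see the bin, etc and sbin directories that are referenced later in /etc/environment:
ls hadoop-2.0.0-cdh4.4.0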
-
Create Hadoop datastore directory
sudo mkdir hadoop-datastore
sudo mkdir hadoop-datastore/hadoop-<username>
Here <username> is your Linux username, so for the user hadoop the directory is hadoop-datastore/hadoop-hadoop (this matches the hadoop.tmp.dir value set later in core-site.xml).
-
Change ownership and permissions of all the folders to the current user
sudo chown -R hadoop.root *
sudo chown -R hadoop.root .
sudo chmod 755 *
sudo chmod 755 .
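To confirm the ownership change took effect, a minimal check (the expected owner hadoop and group root assume your username is hadoop, as in the commands above):
ls -la /usr/local/hadoop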
Adding hadoop binaries to /etc/environment
Current path: hadoop@localhost:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0$
sudo nano /etc/environment
Make the changes in this file as shown below:
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/lib/jvm/java-6-openjdk-amd64/bin:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0/bin:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0/sbin"
JAVA_HOME="/usr/lib/jvm/java-6-openjdk-amd64"
HADOOP_HOME="/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0"
HADOOP_CONF_DIR="/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0/etc/hadoop"
source /etc/environment
echo $HADOOP_HOME
Make sure this command prints the path below:
/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0
Type hado and hit Tab twice at the prompt; the hadoop keyword should be auto-completed. (This confirms a successful installation of Hadoop.)
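As an additional check that the new PATH is active, run the version command; it should report the CDH build (2.0.0-cdh4.4.0):
hadoop version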
-
Make sure the Hadoop installation directory is readable and writable by the current user
hadoop@localhost:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0$ sudo chown -R hadoop.root *
hadoop@localhost:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0$ sudo chown -R hadoop.root .
hadoop@localhost:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0$ sudo chmod 755 .
hadoop@localhost:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0$ sudo chmod 755 *
Configuring Hadoop
Current path: hadoop@localhost:/usr/local/hadoop/hadoop-2.0.0-cdh4.4.0/etc/hadoop$
sudo nano core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/hadoop-datastore/hadoop-${user.name}</value>
</property>
<!-- OOZIE proxy user setting -->
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
</configuration>
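Note that ${user.name} in hadoop.tmp.dir is expanded at runtime, so for the user hadoop it resolves to the hadoop-datastore/hadoop-hadoop directory created earlier. A minimal check that the directory exists for your user (assuming you created it in the earlier step):
ls -ld /usr/local/hadoop/hadoop-datastore/hadoop-$(whoami)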
sudo nano hadoop-env.sh
Add these two lines at the end of the file
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export JAVA_HOME="/usr/lib/jvm/java-6-openjdk-amd64"
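Before moving on, it is worth confirming that the JAVA_HOME path above actually exists on your machine (the exact directory name can differ depending on which JDK package is installed):
ls /usr/lib/jvm/java-6-openjdk-amd64/bin/java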
sudo nano hdfs-site.xml
Make sure you have the following contents in the file
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<!-- Immediately exit safemode as soon as one DataNode checks in.
On a multi-node cluster, these configurations must be removed. -->
<property>
<name>dfs.safemode.extension</name>
<value>0</value>
</property>
<property>
<name>dfs.safemode.min.datanodes</name>
<value>1</value>
</property>
<property>
<!-- specify this so that running 'hadoop namenode -format' formats the right dir -->
<name>dfs.name.dir</name>
<value>/usr/local/hadoop/hadoop-datastore/hadoop/dfs/name</value>
</property>
</configuration>
In the dfs.name.dir value above, the directory segment after hadoop-datastore/ (here, hadoop) has to be your username.
sudo nano mapred-site.xml
Make sure you have the following contents in this file
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
sudo nano yarn-site.xml
Make sure you have the following contents in this file
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
You’re done with Hadoop installation.
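Before starting the daemons for the first time, format the NameNode; this is a one-time step that initializes the dfs.name.dir configured above (run it as the hadoop user):
hadoop namenode -format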
These are commands to start and stop Hadoop:
start-all.sh
This should leave all 5 daemons (i.e., NameNode, Secondary NameNode, DataNode, ResourceManager and NodeManager) running
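A quick way to verify this is jps (shipped with the JDK), which lists the running Java processes; you should see one line per daemon (NameNode, SecondaryNameNode, DataNode, ResourceManager and NodeManager, each prefixed with its process ID):
jps
You can also browse the NameNode web UI at http://localhost:50070 and the ResourceManager web UI at http://localhost:8088 (the Hadoop 2 default ports).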
stop-all.sh
This command stops all 5 daemons running in your cluster
You can start the Job History Server using the command
mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver
You can stop the history server using the command below
mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR stop historyserver
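As an optional end-to-end smoke test (with the daemons running), you can submit the bundled pi example MapReduce job. The jar path below is an assumption about the CDH 4.4.0 tarball layout; if it does not match your system, locate the examples jar first and substitute its path:
find $HADOOP_HOME -name 'hadoop-mapreduce-examples*.jar'
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.0.0-cdh4.4.0.jar pi 2 10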