Hadoop Install (On Premise)


Prerequisites:

  • VMware Workstation (in my case, version 11)
  • CentOS 7 image (in my case, CentOS-7-x86_64-DVD-1611)

Architecture

 

 Architecture Big Data

 

Explanation

 

Objective: Building a Hadoop home lab is one of the best ways to learn Big Data. A home lab lets you gain hands-on experience without risking production systems at work, and without the cost of running several cloud systems.

In my opinion, the reasons you might build a home lab are:

  • Hands-on experience
  • Certification study and labs
  • Just do it :)

Hadoop home lab components:

  • NameNode (three NameNodes - "It keeps the directory tree of all files in the file system, and tracks where across the cluster the file data is kept. It does not store the data of these files itself." - wiki.apache.org).
  • DataNode (four DataNodes - each stores data in HDFS, the Hadoop Distributed File System. A functional filesystem has more than one DataNode, with data replicated across them).
  • Analytics (one R server - we will use Microsoft R Server and RStudio Server).

 

Preparing the environment

 

3.1. CentOS Installation:

NB: Don't forget to install VMware Tools on the guest machine.
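On CentOS 7, the easiest way to get VMware Tools functionality is the open-vm-tools package from the standard repositories (a minimal sketch; assumes the guest has network access to the base repo):

```shell
# Install the open-source VMware guest tools
yum install -y open-vm-tools
# Start the tools daemon and enable it at boot
systemctl start vmtoolsd
systemctl enable vmtoolsd
```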

CentOS Installation

3.2. IP/DNS Configuration:

 

Server      Name                                 IP
Master      NNMaster01.bigdatabigimpact.lan      192.168.116.40
Standby     NNStandby01.bigdatabigimpact.lan     192.168.116.41
Secondary   NNSecondary01.bigdatabigimpact.lan   192.168.116.42
Slave01     DNSlave01.bigdatabigimpact.lan       192.168.116.50
Slave02     DNSlave02.bigdatabigimpact.lan       192.168.116.51
Slave03     DNSlave03.bigdatabigimpact.lan       192.168.116.52
Slave04     DNSlave04.bigdatabigimpact.lan       192.168.116.53
Analytics   RServer.bigdatabigimpact.lan         192.168.116.45
Client      Hadoopuser01.bigdatabigimpact.lan    192.168.116.20


Add the following on all nodes (file /etc/hosts):

#Master NameNode
192.168.116.40 NNMaster01.bigdatabigimpact.lan NNMaster01
#Standby NameNode
192.168.116.41 NNStandby01.bigdatabigimpact.lan NNStandby01
#Secondary NameNode
192.168.116.42 NNSecondary01.bigdatabigimpact.lan NNSecondary01
#Slave01 DataNode
192.168.116.50 DNSlave01.bigdatabigimpact.lan DNSlave01
#Slave02 DataNode
192.168.116.51 DNSlave02.bigdatabigimpact.lan DNSlave02
#Slave03 DataNode
192.168.116.52 DNSlave03.bigdatabigimpact.lan DNSlave03
#Slave04 DataNode
192.168.116.53 DNSlave04.bigdatabigimpact.lan DNSlave04
#Analytics Server
192.168.116.45 RServer.bigdatabigimpact.lan RServer
#Hadoop Client
192.168.116.20 Hadoopuser01.bigdatabigimpact.lan Hadoopuser01
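Rather than typing the entries by hand on every node, the block above can be generated with a small loop (a sketch; it writes to /tmp/hadoop_hosts so you can review the result before appending it to /etc/hosts):

```shell
# Generate one hosts entry per node into /tmp/hadoop_hosts, then on each node:
#   cat /tmp/hadoop_hosts >> /etc/hosts
while read -r ip name; do
  printf '%s %s.bigdatabigimpact.lan %s\n' "$ip" "$name" "$name"
done > /tmp/hadoop_hosts <<'EOF'
192.168.116.40 NNMaster01
192.168.116.41 NNStandby01
192.168.116.42 NNSecondary01
192.168.116.50 DNSlave01
192.168.116.51 DNSlave02
192.168.116.52 DNSlave03
192.168.116.53 DNSlave04
192.168.116.45 RServer
192.168.116.20 Hadoopuser01
EOF
```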

If you want to change the hostname, run hostnamectl set-hostname <new-name> (CentOS 7).

3.3. SSH Configuration


Run on the Master node

#DSA key - leave the passphrase empty in order to log in automatically via ssh
ssh-keygen -t dsa -f ~/.ssh/id_dsa
#Add the DSA key to the trusted keys
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


Copy the DSA Key to other nodes
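One way to push the key to all the other nodes is ssh-copy-id in a loop (a sketch; /tmp/push_key.sh is an illustrative path, and it assumes ~/.ssh/id_dsa.pub exists on the master and password authentication is still enabled on the targets):

```shell
# Write one ssh-copy-id command per node to /tmp/push_key.sh, review it,
# then execute it from NNMaster01:  sh /tmp/push_key.sh
for node in NNStandby01 NNSecondary01 DNSlave01 DNSlave02 DNSlave03 DNSlave04 RServer Hadoopuser01; do
  echo "ssh-copy-id -i ~/.ssh/id_dsa.pub root@${node}.bigdatabigimpact.lan"
done > /tmp/push_key.sh
```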

3.4. Java Installation


Run on all nodes

#The Hadoop framework is written in Java!
#Download jdk-8u92-linux-x64.rpm
rpm -Uvh jdk-8u92-linux-x64.rpm
#Download hadoop-2.7.2.tar.gz
tar xfz hadoop-2.7.2.tar.gz
cp -R hadoop-2.7.2 /opt/hadoop


#Verify the Java version
[root@NNMaster01 default]# java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)
[root@NNMaster01 default]# ls /usr/java
default  jdk1.8.0_92  jre1.8.0_92  latest

This is not the right version: 1.8.0_121 (the system OpenJDK) instead of the 1.8.0_92 we just installed. We switch it with the alternatives --config java command.
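alternatives --config java prompts interactively for a selection number; a non-interactive equivalent looks like this (a sketch, run as root; the JDK path comes from the rpm install above):

```shell
# Register the Oracle JDK 8u92 binary with the alternatives system (priority 200)
alternatives --install /usr/bin/java java /usr/java/jdk1.8.0_92/bin/java 200
# Make it the active "java"
alternatives --set java /usr/java/jdk1.8.0_92/bin/java
# Confirm the switch
java -version
```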



 

3.5. Java & Hadoop Environment Variables


Run on all nodes

#cd ~
#vi .bash_profile
#Append the following lines at the end of the file:
### JAVA env variables
export JAVA_HOME=/usr/java/jdk1.8.0_92
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
### HADOOP env variables
export HADOOP_HOME=/opt/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
[root@NNMaster01 ~]# source .bash_profile 
[root@NNMaster01 ~]# echo $HADOOP_HOME 
[root@NNMaster01 ~]# echo $JAVA_HOME 

