Hadoop Install (Azure)


 Prerequisites:

  • Azure account (in my case, an MCT Azure account)
  • CentOS 7 image (in my case, CentOS-Based 7.3 from Rogue Wave Software)

Architecture

[Figure: Big Data lab architecture]

Explanation

Objective: Building a Hadoop home lab in the cloud is one of the best ways to learn Big Data. A home lab gives you hands-on experience without the worry of impacting production systems at work, and you can work on it from anywhere.

In my opinion, the reasons to use a home lab are:

  • Hands-on Experience
  • Certification Study and lab
  • Just do It :) 

Hadoop Home Lab components:

  • NameNode (three NameNodes - the NameNode keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept. It does not store the data of these files itself - wiki.apache.org).
  • DataNode (five DataNodes - a DataNode stores data in HDFS, the Hadoop distributed file system. A functional filesystem has more than one DataNode, with data replicated across them).
  • Analytics (one R server - we will use Microsoft R Server & RStudio Server).

 

Preparing the environment

 

3.1. Azure VM Creation with CentOS 7.3:

NB: Don't forget to choose an appropriate VM size to keep the cost impact under control.
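For repeatability, the VMs can also be created from the Azure CLI instead of the portal. The sketch below only builds and prints the command so it can be reviewed first; the resource group name, admin username, image URN, and VM size are assumptions to adapt to your subscription:

```shell
# Sketch: create one of the cluster VMs with the Azure CLI.
# RG, SIZE, IMAGE and the admin username are assumptions - adjust them.
RG="bigdata-lab"
SIZE="Standard_DS2_v2"
IMAGE="OpenLogic:CentOS:7.3:latest"

# Build the command first so it can be reviewed before running.
CMD="az vm create --resource-group $RG --name VMNNMaster01 \
--image $IMAGE --size $SIZE --admin-username hadoopadmin \
--generate-ssh-keys --private-ip-address 10.0.0.100"

echo "$CMD"   # review, then execute it with: eval "$CMD"
```

Repeat with the name and private IP of each node from the table in section 3.2.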

CentOS installation (screenshots)

3.2. IP/DNS Configuration:

 

Server      Name                                              IP
Master      VMNNMaster01.northeurope.cloudapp.azure.com       10.0.0.100
Standby     VMNNStandby01.northeurope.cloudapp.azure.com      10.0.0.101
Secondary   VMNNSecondary01.northeurope.cloudapp.azure.com    10.0.0.102
Worker01    VMWorker01.northeurope.cloudapp.azure.com         10.0.0.51
Worker02    VMWorker02.northeurope.cloudapp.azure.com         10.0.0.52
Worker03    VMWorker03.northeurope.cloudapp.azure.com         10.0.0.53
Worker04    VMWorker04.northeurope.cloudapp.azure.com         10.0.0.54
Worker05    VMWorker05.northeurope.cloudapp.azure.com         10.0.0.55
Analytics   VMRServer.northeurope.cloudapp.azure.com          10.0.0.110
Client      VMBigdatauser01.northeurope.cloudapp.azure.com    10.0.0.11


Add on all nodes (file /etc/hosts):

# Master NameNode
10.0.0.100 vmnnmaster01.northeurope.cloudapp.azure.com vmnnmaster01
# Standby NameNode
10.0.0.101 vmnnstandby01.northeurope.cloudapp.azure.com vmnnstandby01
# Secondary NameNode
10.0.0.102 vmnnsecondary01.northeurope.cloudapp.azure.com vmnnsecondary01
# Worker01 DataNode
10.0.0.51 vmworker01.northeurope.cloudapp.azure.com vmworker01
# Worker02 DataNode
10.0.0.52 vmworker02.northeurope.cloudapp.azure.com vmworker02
# Worker03 DataNode
10.0.0.53 vmworker03.northeurope.cloudapp.azure.com vmworker03
# Worker04 DataNode
10.0.0.54 vmworker04.northeurope.cloudapp.azure.com vmworker04
# Worker05 DataNode
10.0.0.55 vmworker05.northeurope.cloudapp.azure.com vmworker05
# Analytics Server
10.0.0.110 vmrserver.northeurope.cloudapp.azure.com vmrserver
# Win Hadoop Client
10.0.0.11 vmbigdatauser01.northeurope.cloudapp.azure.com vmbigdatauser01

Set the static hostname on each node, then check the IP and DNS resolution:

hostnamectl set-hostname htlab01 --static
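A quick way to check the DNS side is to loop over all the cluster hostnames and confirm that each one resolves. A minimal sketch, assuming the names from the table and /etc/hosts entries above:

```shell
# Sketch: verify that every cluster hostname resolves.
# The node list matches the /etc/hosts entries above.
NODES="vmnnmaster01 vmnnstandby01 vmnnsecondary01 \
vmworker01 vmworker02 vmworker03 vmworker04 vmworker05 \
vmrserver vmbigdatauser01"

for node in $NODES; do
  fqdn="$node.northeurope.cloudapp.azure.com"
  # getent consults /etc/hosts as well as DNS, matching runtime resolution
  if getent hosts "$fqdn" > /dev/null 2>&1; then
    echo "$fqdn: resolved"
  else
    echo "$fqdn: NOT resolved - add it to /etc/hosts"
  fi
done
```

Run it on every node; each of the ten hosts should report "resolved" before continuing.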

3.3. SSH Configuration


Add on the Master node:

# Generate a DSA key; leave the passphrase field blank to allow automatic login via ssh
ssh-keygen -t dsa -f ~/.ssh/id_dsa
# Add the DSA key to the trusted keys
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


Copy the DSA key to the other nodes.
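One way to copy the key is a loop over the other nodes with ssh-copy-id. This sketch defaults to a dry run that only prints the commands (the short hostnames are assumed from the /etc/hosts entries above, and root is assumed as the remote user):

```shell
# Sketch: distribute the public key to every other node with ssh-copy-id.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0
# to actually copy the key (you will be prompted for each node's password).
DRY_RUN="${DRY_RUN:-1}"
NODES="vmnnstandby01 vmnnsecondary01 vmworker01 vmworker02 \
vmworker03 vmworker04 vmworker05 vmrserver"

for node in $NODES; do
  cmd="ssh-copy-id -i ~/.ssh/id_dsa.pub root@$node"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
done
```

Afterwards, ssh from the master to each node should log in without a password prompt.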

3.4. Java Installation


Add on all nodes:

# The Hadoop framework is written in Java!
# Download jdk-8u92-linux-x64.rpm
mkdir Downloads
cd Downloads
wget https://mirror.its.sfu.ca/mirror/CentOS-Third-Party/NSG/common/x86_64/jdk-8u92-linux-x64.rpm
rpm -Uvh jdk-8u92-linux-x64.rpm
# Download hadoop-2.7.2.tar.gz
wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.2/hadoop-2.7.2.tar.gz
tar xfz hadoop-2.7.2.tar.gz
cp -R hadoop-2.7.2 /opt/hadoop



#Verify the Java version
[root@vmnnmaster01 Downloads]# java -version
java version "1.8.0_92"
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)
[root@vmnnmaster01 Downloads]# ls /usr/java
default  jdk1.8.0_92  latest

If a version other than 1.8.0_92 is reported, select the right one with the alternatives --config java command.

 

3.5. Java & Hadoop Environment Variables


Add on all nodes:

cd ~
vi .bash_profile
# Append the following lines at the end of the file:
### JAVA env variables
export JAVA_HOME=/usr/java/jdk1.8.0_92
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
## HADOOP env variables
export HADOOP_HOME=/opt/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

[root@vmnnmaster01 ~]# source .bash_profile
[root@vmnnmaster01 ~]# echo $HADOOP_HOME
[root@vmnnmaster01 ~]# echo $JAVA_HOME
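Since the same variables have to be added on every node, the manual vi edit can be replaced by a one-step append with a heredoc. A sketch, assuming the Java and Hadoop install paths used above:

```shell
# Sketch: append the section's environment variables to .bash_profile
# in one step instead of editing the file by hand.
# PROFILE can be overridden to write to a different file first.
PROFILE="${PROFILE:-$HOME/.bash_profile}"

cat >> "$PROFILE" <<'EOF'
### JAVA env variables
export JAVA_HOME=/usr/java/jdk1.8.0_92
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
## HADOOP env variables
export HADOOP_HOME=/opt/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
EOF

# Load the new variables into the current shell and verify
. "$PROFILE"
echo "$HADOOP_HOME"   # prints /opt/hadoop
```

The quoted 'EOF' keeps the $VARIABLE references literal in the file, so they are expanded when the profile is sourced, not when it is written.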

