Tutorial: Hadoop multi-node installation


In this tutorial we will see how to set up a Hadoop multi-node cluster by doing the following steps:

Step 1: You need to ensure that Hadoop is installed (single-node mode) on all machines (master + slaves). If not, use this link to do it.

Step 2: Now let's configure the machines to work as a multi-node cluster. As an example we will use one master and two slaves:

Add all host names to the /etc/hosts file on all machines (master and slave nodes):

# Add the following host names and their IPs to the host table
Suppose that your slave addresses are 192.168.0.151 and 192.168.0.152, and the master address is 192.168.0.150.
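With those example addresses, the entries would look like this (HadoopMaster, HadoopSlave1, and HadoopSlave2 are the host names assumed throughout this tutorial):

```shell
# Append to /etc/hosts on every node (requires sudo)
192.168.0.150  HadoopMaster
192.168.0.151  HadoopSlave1
192.168.0.152  HadoopSlave2
```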

Log in as hduser, then install rsync and reboot the machine:
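A minimal sketch of this step, assuming a Debian/Ubuntu system where hduser has sudo rights:

```shell
# Install rsync, then reboot so the new host names take effect
sudo apt-get install -y rsync
sudo reboot
```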

Now let's do the common configuration on all nodes (slaves + master):

1- Edit core-site.xml

Paste these lines into the <configuration> tag, or just update it by replacing localhost with HadoopMaster:
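A sketch of the resulting core-site.xml, assuming the HDFS port 9000 carried over from the usual single-node setup:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://HadoopMaster:9000</value>
  </property>
</configuration>
```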

2- Update hdfs-site.xml
Update this file by changing the replication factor from 1 to 3.

Then paste/update these lines inside the <configuration> tag:
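A sketch of the relevant property (other properties from your single-node setup stay as they are):

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```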

3- Update yarn-site.xml
Update this file by changing the host name from localhost to HadoopMaster in the following three properties.

Then paste/update these lines inside the <configuration> tag:
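A sketch of the result: the three ResourceManager address properties now point at HadoopMaster. The port numbers (8025, 8030, 8050) are assumptions commonly used in tutorials of this kind; adjust them to your setup:

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>HadoopMaster:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>HadoopMaster:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>HadoopMaster:8050</value>
  </property>
</configuration>
```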

4- Update mapred-site.xml
Update this file by updating and adding the following properties.

Then paste/update these lines inside the <configuration> tag:
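A sketch of the essential property, assuming Hadoop 2.x where MapReduce runs on top of YARN:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```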

5- Update the masters file

Then add the name of the master node:
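The masters file (in the Hadoop configuration directory, e.g. $HADOOP_HOME/etc/hadoop in Hadoop 2.x) would then contain a single line:

```shell
HadoopMaster
```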

6- Update the slaves file

Then add the names of the slave nodes:
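With the two example slaves, the slaves file (same directory as the masters file) would contain:

```shell
HadoopSlave1
HadoopSlave2
```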

Applying master-node-specific Hadoop configuration (only on master nodes):

1- Remove the existing Hadoop data folder (which was created during the single-node Hadoop setup).

2- Recreate the /usr/local/hadoop_tmp/hdfs directory and create the NameNode directory (/usr/local/hadoop_tmp/hdfs/namenode).

3- Make hduser the owner of that directory.
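The three steps above can be sketched as follows (the hadoop group is an assumption carried over from the usual single-node setup):

```shell
# On the master node only
sudo rm -rf /usr/local/hadoop_tmp
sudo mkdir -p /usr/local/hadoop_tmp/hdfs/namenode
sudo chown -R hduser:hadoop /usr/local/hadoop_tmp
```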


Applying slave-node-specific Hadoop configuration (only on slave nodes):

1- Remove the existing Hadoop data folder (which was created during the single-node Hadoop setup).

2- Recreate the same /usr/local/hadoop_tmp/hdfs directory and inside it create the DataNode directory (/usr/local/hadoop_tmp/hdfs/datanode).

3- Make hduser the owner of that directory.
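As on the master, these steps can be sketched as (the hadoop group is again an assumption from the single-node setup):

```shell
# On each slave node only
sudo rm -rf /usr/local/hadoop_tmp
sudo mkdir -p /usr/local/hadoop_tmp/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_tmp
```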

SSH Configuration:

On the master node, run the following commands to share the public SSH key (the ~/.ssh/id_rsa.pub file of the HadoopMaster node) with the authorized_keys file of hduser@HadoopSlave1 and hduser@HadoopSlave2 (in $HOME/.ssh/authorized_keys):
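A sketch of this step, assuming the key pair from the single-node setup already exists (otherwise generate one first with ssh-keygen -t rsa -P ""):

```shell
# Run as hduser on HadoopMaster; you will be prompted for each slave's password once
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@HadoopSlave1
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@HadoopSlave2
```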

Now let's format the NameNode (run on the master node):

# Run this command from the master node
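Assuming the Hadoop binaries are on hduser's PATH:

```shell
# Run once as hduser; reformatting later would erase all HDFS metadata
hdfs namenode -format
```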

Starting up the Hadoop cluster daemons (run on the master node):
Start the HDFS daemons:
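```shell
# Run as hduser on HadoopMaster; starts the NameNode here
# and the DataNodes on the nodes listed in the slaves file
start-dfs.sh
```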

Start the YARN daemons:
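```shell
# Run as hduser on HadoopMaster; starts the ResourceManager here
# and the NodeManagers on the slaves
start-yarn.sh
```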

Instead of both of these commands you can also use start-all.sh, but it is now deprecated, so it is not recommended for Hadoop operations.

Track/monitor/verify the Hadoop cluster (run on any node):
Verify the Hadoop daemons on the master:

Running jps, you should see only the NameNode, SecondaryNameNode, and ResourceManager daemons (plus jps itself).
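For example, as hduser on the master (process IDs will vary):

```shell
hduser@HadoopMaster:~$ jps
# Typical daemons listed:
#   NameNode
#   SecondaryNameNode
#   ResourceManager
#   Jps
```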

Verify the Hadoop daemons on all slave nodes:

You should see the DataNode and NodeManager daemons.
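For example, as hduser on a slave:

```shell
hduser@HadoopSlave1:~$ jps
# Typical daemons listed:
#   DataNode
#   NodeManager
#   Jps
```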

To view on the web:

For the ResourceManager – http://HadoopMaster:8088

For the NameNode – http://HadoopMaster:50070

Execute the word count example:

On the master node (always; clients should only talk to the master node):
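A sketch of the classic word-count run, as hduser on the master node (the examples jar path follows the Hadoop 2.x layout; adjust the version and paths to your installation):

```shell
# Put some input into HDFS, then run the bundled WordCount example
hdfs dfs -mkdir -p /user/hduser/input
hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /user/hduser/input
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount /user/hduser/input /user/hduser/output
# Inspect the result
hdfs dfs -cat /user/hduser/output/part-r-00000
```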

Author: Nizar Ellouze
