PRACTICAL 4: Working with Hadoop Ecosystem

February 12, 2018

This is a short guide to installing a single-node Hadoop cluster on a Windows computer. The process is straightforward. First, download and install the following software:

· JAVA

Download Java 1.8 from https://java.com/en/download/

Once it is installed, confirm that you are running the correct version by running the ‘java -version’ command from the command line. The output should look like this:

Figure 1- checking java

· HADOOP

Install a Hadoop distribution. To do so, download the Hadoop 3.0.0-alpha2 release (25 Jan, 2017), the most recent at the time of writing, in binary form from the Apache Download Mirror at http://hadoop.apache.org/releases.html

Once hadoop-3.0.0-alpha2.tar.gz (250 MB) has downloaded, extract it with WinRAR into the C:\hadoop-3.0.0-alpha2 folder:

Figure 2- extracting hadoop in c:

· Set Up Environment Variables

Ø In Windows 10, open the System Properties window and click the Environment Variables button:

Figure 3-setting environment variable

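Ø Create a new HADOOP_HOME variable and point it to the Hadoop installation folder, C:\hadoop-3.0.0-alpha2. Hadoop looks for its bin folder underneath HADOOP_HOME, so the variable should name the installation root rather than the bin folder itself: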

Figure 4-adding variable name

Ø Add the Hadoop bin directory to the PATH variable. Select PATH and click Edit:

Figure 5

Ø Add the ‘C:\hadoop-3.0.0-alpha2\bin’ path like this and click OK:

Figure 6

· Edit Hadoop Configuration

Ø Edit the C:\hadoop-3.0.0-alpha2\etc\hadoop\core-site.xml file like this:

Figure 7-editing core-site.xml
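The screenshot is not reproduced here, so the following is only a minimal sketch of what a single-node core-site.xml commonly contains. The fs.defaultFS property is a standard Hadoop setting; the hdfs://localhost:9000 address is an assumption and may differ from the value shown in the original figure.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the host and port below are assumed, not taken from the original figure. -->
<configuration>
  <property>
    <!-- URI of the default file system that clients and daemons connect to -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

Any free port can be used here, as long as every client that talks to HDFS uses the same address.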

Ø Next, go to the C:\hadoop-3.0.0-alpha2\etc\hadoop folder and rename mapred-site.xml.template to mapred-site.xml.

Ø Edit the mapred-site.xml file, adding the following XML configuration so that MapReduce runs on YARN:

Figure 8-editing mapred-site.xml
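As a rough illustration of this step, and assuming the figure showed the usual setting that tells MapReduce to run on YARN, the file might look something like this. The original screenshot may have contained additional properties.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes the standard "run MapReduce on YARN" setting. -->
<configuration>
  <property>
    <!-- Runtime framework used to execute MapReduce jobs -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```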

Ø The next step is to create a new ‘data’ folder in Hadoop’s home directory (C:\hadoop-3.0.0-alpha2\data).

Ø Once done, add a data node and name node to Hadoop by editing the C:\hadoop-3.0.0-alpha2\etc\hadoop\hdfs-site.xml file.

Ø Add the following configuration to this XML file:

Figure 9-editing hdfs-site.xml
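A minimal sketch of an hdfs-site.xml for a single-node setup is given below. The property names are standard HDFS settings, but the namenode and datanode subfolder names under the data folder are assumptions chosen for illustration; the original figure may have used different paths.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the namenode/datanode subfolder names are hypothetical. -->
<configuration>
  <property>
    <!-- A single node can hold only one copy of each block -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- Where the name node stores file system metadata -->
    <name>dfs.namenode.name.dir</name>
    <value>file:///C:/hadoop-3.0.0-alpha2/data/namenode</value>
  </property>
  <property>
    <!-- Where the data node stores the actual blocks -->
    <name>dfs.datanode.data.dir</name>
    <value>file:///C:/hadoop-3.0.0-alpha2/data/datanode</value>
  </property>
</configuration>
```

Keeping both directories under the data folder created in the earlier step puts all HDFS state in one place, which makes it easy to wipe and start again.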

Ø The next step is to add site-specific YARN configuration properties by editing C:\hadoop-3.0.0-alpha2\etc\hadoop\yarn-site.xml, like this:

Figure 10-edit yarn-site.xml
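For reference, a yarn-site.xml for a single-node cluster typically enables the MapReduce shuffle service on the node manager. The fragment below is a sketch under that assumption and is not a transcription of the original figure.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes the common shuffle-service setup for MapReduce on YARN. -->
<configuration>
  <property>
    <!-- Auxiliary service that serves map output to reducers -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <!-- Class that implements the mapreduce_shuffle service -->
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```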

Ø Go to the C:\hadoop-3.0.0-alpha2\etc\hadoop folder and edit hadoop-env.cmd so that it contains the line set JAVA_HOME=C:\java\jdk1.8.0_91 (adjust the path to match the location of your own JDK).

Ø Check the setup from a command prompt (cmd):

Figure 11-cmd

Ø Start Hadoop. Go to the C:\hadoop-3.0.0-alpha2\sbin folder and run “start-dfs.cmd” and “start-yarn.cmd” as administrator:

Figure 12-starting all windows

· Open Hadoop GUI

Once all of the above steps are complete, open a browser and navigate to: http://localhost:8088/cluster
