PRACTICAL 4: Working with Hadoop Ecosystem
This is a short guide to installing a single-node Hadoop cluster on a Windows
computer. The process is straightforward. First, download and install the
following software:
·        JAVA
Download Java 1.8 from https://java.com/en/download/
Once it is installed, confirm that you are running the correct
version by running the ‘java -version’ command from the command line.
The output should look like this:
Figure 1- checking java
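The same check can be scripted. The following is a minimal sketch, assuming Python 3.7 or later is available; the helper names parse_java_version and installed_java_version are hypothetical, and the sample banner is illustrative text rather than output captured from the original figure.

```python
import re
import subprocess


def parse_java_version(text):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', text)
    return match.group(1) if match else None


def installed_java_version():
    """Run `java -version` and parse its banner (the JVM prints it on stderr)."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except OSError:
        # java is not installed or not on PATH
        return None
    return parse_java_version(result.stderr + result.stdout)


# Illustrative banner line in the format Java 8 uses.
sample = 'java version "1.8.0_91"'
print(parse_java_version(sample))
print(installed_java_version())
```

A version string beginning with 1.8 indicates Java 8; a result of None suggests that java is not on the PATH.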
·        HADOOP
Install a Hadoop distribution. To do so, download the
most recent release, Hadoop 3.0.0-alpha2
(25 Jan, 2017), in binary form from the Apache Download Mirror at http://hadoop.apache.org/releases.html
Once hadoop-3.0.0-alpha2.tar.gz (250 MB) has
downloaded, extract it with WinRAR into the C:\hadoop-3.0.0-alpha2 folder:
Figure 2- extracting hadoop in c:
·        Set Up Environment Variables
Ø  In Windows 10, open the System Properties window and click
the Environment Variables button:
Figure 3-setting environment variable
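Ø  Create a new HADOOP_HOME variable and point it to the Hadoop
installation folder, C:\hadoop-3.0.0-alpha2 (the installation root rather than its bin subfolder, since Hadoop looks for its bin folder beneath HADOOP_HOME):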
Figure 4-adding variable name
Ø  Add the Hadoop bin directory to the PATH variable.
Select PATH and click Edit:
Figure 5
Ø  Add the ‘C:\hadoop-3.0.0-alpha2\bin’ path like this
and click OK:
Figure 6
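After closing the dialogs, a new command prompt should see both settings. The following is a minimal sketch of such a check, assuming Python is installed; check_hadoop_env and normalize are hypothetical helper names, and the check only verifies that HADOOP_HOME is set and that the bin folder named above appears on PATH.

```python
import ntpath
import os

HADOOP_BIN = r"C:\hadoop-3.0.0-alpha2\bin"


def normalize(path):
    """Compare Windows paths case-insensitively, ignoring trailing slashes."""
    return ntpath.normcase(ntpath.normpath(path.strip().strip('"')))


def check_hadoop_env(env, hadoop_bin=HADOOP_BIN):
    """Return a list of problems found in an environment mapping."""
    problems = []
    if not env.get("HADOOP_HOME"):
        problems.append("HADOOP_HOME is not set")
    # Windows separates PATH entries with semicolons.
    entries = [normalize(p) for p in env.get("PATH", "").split(";") if p.strip()]
    if normalize(hadoop_bin) not in entries:
        problems.append(hadoop_bin + " is not on PATH")
    return problems


print(check_hadoop_env(os.environ) or "environment looks fine")
```

Environment variables are read when a process starts, so the script needs to run in a command prompt opened after the variables were saved.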
·       Edit Hadoop Configuration
Ø  Edit the C:\hadoop-3.0.0-alpha2\etc\hadoop\core-site.xml
file, just like this:
Figure 7-editing core-site.xml
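For readers without access to the screenshot, a typical single-node core-site.xml can be sketched as below. This is an assumption rather than a transcription of the figure: fs.defaultFS is the standard Hadoop property naming the default file system, while the hdfs://localhost:9000 value is an illustrative choice. The Python wrapper, with its hypothetical read_properties helper, only confirms that the XML is well formed.

```python
import xml.etree.ElementTree as ET

# Assumed content for core-site.xml; port 9000 is an illustrative choice.
CORE_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


print(read_properties(CORE_SITE_XML))
```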
Ø  Next, go to the C:\hadoop-3.0.0-alpha2\etc\hadoop folder
and rename mapred-site.xml.template
to mapred-site.xml.
Ø  Edit the mapred-site.xml file, adding the following
YARN configuration for MapReduce:
Figure 8-editing mapred-site.xml
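A minimal sketch of what this file usually contains, assuming the goal is simply to run MapReduce jobs on YARN: mapreduce.framework.name is the standard property for that, and yarn is its usual value. The content is not copied from the figure, and read_properties is a hypothetical helper used only to validate the XML.

```python
import xml.etree.ElementTree as ET

# Assumed content for mapred-site.xml: run MapReduce jobs on YARN.
MAPRED_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


print(read_properties(MAPRED_SITE_XML))
```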
Ø  Next, create a new ‘data’ folder in Hadoop’s home directory (C:\hadoop-3.0.0-alpha2\data).
Ø  Once done, add a data node and name node to Hadoop
by editing the c:\hadoop-3.0.0-alpha2\etc\hadoop\hdfs-site.xml
file.
Ø  Add the following configuration to this XML file:
Figure 9-editing hdfs-site.xml
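The configuration might look like the following sketch. The property names dfs.replication, dfs.namenode.name.dir and dfs.datanode.data.dir are standard HDFS settings; the namenode and datanode subfolder names and the file:/// URI form are assumptions made for illustration, placed under the ‘data’ folder created above. The hypothetical read_properties helper only checks that the XML parses.

```python
import xml.etree.ElementTree as ET

# Assumed content for hdfs-site.xml on a single node; the subfolder
# names "namenode" and "datanode" are illustrative choices.
HDFS_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///C:/hadoop-3.0.0-alpha2/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///C:/hadoop-3.0.0-alpha2/data/datanode</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


for name, value in read_properties(HDFS_SITE_XML).items():
    print(name, "=", value)
```

A replication factor of 1 suits a single-node cluster, because there is only one data node to hold each block.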
Ø  Next, add site-specific YARN
configuration properties by editing C:\hadoop-3.0.0-alpha2\etc\hadoop\yarn-site.xml, like this:
Figure 10-edit yarn-site.xml
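A commonly used minimal yarn-site.xml enables the MapReduce shuffle service on the node manager. The sketch below assumes that is what the step intends; the two property names and the ShuffleHandler class are standard Hadoop settings, but the content is not transcribed from the figure, and read_properties is a hypothetical validation helper.

```python
import xml.etree.ElementTree as ET

# Assumed content for yarn-site.xml: enable the MapReduce shuffle service.
YARN_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


for name, value in read_properties(YARN_SITE_XML).items():
    print(name, "=", value)
```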
Ø  Go to the location “C:\hadoop-3.0.0-alpha2\etc\hadoop”
and edit “hadoop-env.cmd” by adding the line
set JAVA_HOME=C:\java\jdk1.8.0_91 (adjust the path to match your own JDK installation).
Ø  Check the setup from the command prompt (cmd):
Figure 11-cmd
Ø  Start Hadoop. Go to the location “C:\hadoop-3.0.0-alpha2\sbin” and run the following files as administrator:
“start-dfs.cmd” and “start-yarn.cmd”
Figure 12-starting all windows
·       Open Hadoop GUI
Once all the above steps are completed, open a browser and navigate to: http://localhost:8088/cluster
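The page can also be probed from a script, which helps when the daemons take a little while to come up. The following is a minimal sketch using only the Python standard library; is_reachable is a hypothetical helper name and the three-second timeout is an arbitrary choice.

```python
import http.client
import urllib.request

CLUSTER_UI = "http://localhost:8088/cluster"


def is_reachable(url, timeout=3.0):
    """Return True if the URL answers with a successful HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status or malformed reply.
        return False


print("YARN web UI reachable:", is_reachable(CLUSTER_UI))
```

A False result right after running the start scripts may only mean that the resource manager has not finished starting, so it is worth retrying after a short wait.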












 
 