Ashish Patel: September 2016

Saturday, September 3, 2016

HIVE 1.2.1 INSTALLATION ON HADOOP 2.7.1 SINGLE NODE CLUSTER

PREREQUISITES

1) Ubuntu 12.04 or higher

2) Java 7 or higher (Hive 1.2 and later require Java 1.7 or newer)

3) Hadoop must already be installed and configured on the system (this guide uses Hadoop 2.7.1).

Note :- Hive runs on Linux and Windows. Mac OS X is commonly used as a development environment.

INSTALLATION STEPS

Step1 :sudo tar -zxvf apache-hive-1.2.1-bin.tar.gz

Step2 :sudo mkdir -p /usr/local/hive && sudo mv apache-hive-1.2.1-bin /usr/local/hive/

Next, edit ~/.bashrc as the hadoop user.

Log in as the hadoop user with this command:

su hadoop-user (use the user name from your own system)

Enter the password when prompted, then continue with the steps below.

Step3 :sudo nano ~/.bashrc

Step4: add these lines to the ~/.bashrc file and save

export HIVE_HOME=/usr/local/hive/apache-hive-1.2.1-bin

export HIVE_CONF_DIR=$HIVE_HOME/conf

export HIVE_CLASS_PATH=$HIVE_CONF_DIR

export PATH=$HIVE_HOME/bin:$PATH

Step5 : source ~/.bashrc
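
Before starting Hive, it can help to confirm that the variables added in Step4 point where you expect. The following is a minimal sketch, not part of Hive itself: check_hive_env is a hypothetical helper name, and it only assumes that the launcher lives at $HIVE_HOME/bin/hive.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the Hive variables exported in ~/.bashrc.
# check_hive_env is a hypothetical helper name, not something Hive provides.
check_hive_env() {
    # HIVE_HOME must be set and point at an existing directory.
    if [ -z "${HIVE_HOME:-}" ] || [ ! -d "$HIVE_HOME" ]; then
        echo "HIVE_HOME is unset or not a directory" >&2
        return 1
    fi
    # The launcher script should exist and be executable.
    if [ ! -x "$HIVE_HOME/bin/hive" ]; then
        echo "no executable at $HIVE_HOME/bin/hive" >&2
        return 2
    fi
    # PATH should contain the bin directory so that typing "hive" works.
    # Both spellings are accepted in case HIVE_HOME ends with a slash.
    case ":$PATH:" in
        *":$HIVE_HOME/bin:"*|*":${HIVE_HOME%/}/bin:"*) ;;
        *) echo "$HIVE_HOME/bin is not on PATH" >&2; return 3 ;;
    esac
    echo "Hive environment looks consistent: $HIVE_HOME"
}

# Usage, after "source ~/.bashrc":
#   check_hive_env
```

A non-zero return code tells you which of the three checks failed (1, 2 or 3 in the order above).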

Step6 :start Hadoop first with the start-all.sh command, then start Hive by typing the hive command. You should see output like this:

hadoop-user@ubuntu:~$ hive

Logging initialized using configuration in jar:file:/usr/local/hive/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop-2.7.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.

hive>

Step-by-Step MapReduce Example on Hadoop 2.7.1

Step 1: Open a terminal with Ctrl+Alt+T

Step 2: Switch to the hadoop user

hadoopashish@ubuntu:~$ su hadoop-user

Step 3: Start all Hadoop daemons with the command below

hadoop-user@ubuntu:~$ start-all.sh

Step 4: Create a new directory named data on the Desktop and create a text file inside it

Desktop/data/text.txt

Step 5: After Hadoop has started, run jps to list the running daemons

hadoop-user@ubuntu:~$ jps

6385 ResourceManager

6146 SecondaryNameNode

5637 NameNode

6696 NodeManager

5866 DataNode

12510 Jps
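
If one of the five daemons is missing from the list, the later steps will fail, so it can be worth checking the jps output mechanically. This is a minimal sketch under the assumption that the output has the "pid name" layout shown above; missing_daemons is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: report which of the five expected daemons are absent from jps output.
# missing_daemons is a hypothetical helper; it reads jps output on stdin.
missing_daemons() {
    local output name
    output=$(cat)
    for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare the whole second field so that "NameNode"
        # does not accidentally match "SecondaryNameNode".
        if ! printf '%s\n' "$output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}

# Usage on a live system (jps ships with the JDK):
#   jps | missing_daemons
```

An empty result means all five daemons were found; otherwise each missing daemon is printed on its own line.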

Step 6: Change to the Hadoop installation directory

hadoop-user@ubuntu:~$ cd /usr/local/hadoop/

or to whichever path Hadoop is installed under on your system. Mine is

hadoop-user@ubuntu:~$ cd /usr/local/hadoop-2.7.1/

Step 7: From this directory, create the user directories in HDFS. Press Enter after each command.

hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ bin/hdfs dfs -mkdir /user

hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ bin/hdfs dfs -mkdir /user/ashish/

Step 8: Open a web browser to verify that the directory was created. Type this in the address bar

localhost:50070/

and press Enter. The Hadoop web interface opens. Go to Utilities --> Browse the file system to see the Browse Directory list. Click the Go button and the user directory appears; select it to see the directory created in the previous step. In my case it is named ashish.

The verification is then done.

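Step 9: Now upload the input, which is stored in Desktop/data/text.txt

hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ hdfs dfs -put /home/hadoopashish/Desktop/data input

This copies the local data directory into HDFS as input. Because input is a relative path, HDFS resolves it against /user/<your user name> (here /user/hadoop-user), so that directory must already exist; create it the same way as in Step 7 if it does not.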
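
To make the path rule in Step 9 concrete, here is a small sketch that only mimics how a relative HDFS path is expanded; it does not contact the cluster. hdfs_abs_path is a hypothetical helper name, and the user name is passed in as an argument (defaulting to the current login).

```shell
#!/usr/bin/env bash
# Sketch: show where a relative HDFS path ends up.
# hdfs_abs_path is a hypothetical helper that mimics the rule
# "relative paths resolve against /user/<user name>"; it never talks to HDFS.
hdfs_abs_path() {
    local path=$1 user=${2:-$(id -un)}
    case "$path" in
        /*) printf '%s\n' "$path" ;;                   # already absolute
        *)  printf '/user/%s/%s\n' "$user" "$path" ;;  # relative to the HDFS home directory
    esac
}

hdfs_abs_path input hadoop-user    # prints /user/hadoop-user/input
```

On a live cluster the matching home directory could be created with hdfs dfs -mkdir -p /user/hadoop-user before running the -put command.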

Step 10: This is the main step, running the MapReduce job.

The command has these parts:

hadoop is the Hadoop command-line launcher.

jar tells it to run a program from a Java archive (.jar) file, here hadoop-mapreduce-examples-2.7.1.jar.

wordcount is the name of the example program, which ships inside hadoop-mapreduce-examples-2.7.1.jar.

input is the HDFS input directory, and output is the HDFS directory where the job writes its results. The output directory must not exist before the job runs.

hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount input output

Press Enter and the MapReduce job starts; you can follow its progress in the terminal window.

Step 11: When the job has finished, go back to the browser and open localhost:50070, then go to Utilities --> Browse the file system. Click the Go button, open user, and then open hadoop-user (my directory, where the files are stored).

You can see two directories there, input and output.

The output folder contains a part-r-00000 file and a _SUCCESS file. The part-r-00000 file is our output file.

Open it to see the word counts.
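
For a small input file, the result can be cross-checked without the cluster. The sketch below counts whitespace-separated words locally and prints them as "word<TAB>count" lines, which is the layout the wordcount example writes. local_wordcount is a hypothetical helper name, and the byte-order sort is an assumption that should match the job output for plain ASCII text.

```shell
#!/usr/bin/env bash
# Sketch: a local word count for cross-checking the job output on small files.
# local_wordcount is a hypothetical helper; it splits on whitespace only.
local_wordcount() {
    tr -s '[:space:]' '\n' |     # put one token on each line
        grep -v '^$' |           # drop the empty line a leading space can produce
        LC_ALL=C sort |          # byte-order sort so equal words are adjacent
        uniq -c |                # count adjacent duplicates
        awk '{ print $2 "\t" $1 }'
}

printf 'hello hadoop\nhello hive\n' | local_wordcount
```

On a live system the two results could be compared by running local_wordcount on Desktop/data/text.txt and viewing the job result with hdfs dfs -cat output/part-r-00000.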

Friday, September 2, 2016

Hadoop 2.7.0 Single Node Cluster Setup on Ubuntu 15.04

$ sudo apt-get update

$ sudo apt-get install default-jdk

$ java -version

$ sudo apt-get install ssh

$ sudo apt-get install rsync

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
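
Running the cat command above more than once appends the same key again each time. As a minimal sketch, the helper below adds the key only when it is missing and tightens the file permissions; ensure_authorized_key is a hypothetical name and both file paths are passed as arguments.

```shell
#!/usr/bin/env bash
# Sketch: append a public key to authorized_keys only if it is not already there.
# ensure_authorized_key is a hypothetical helper; both paths are arguments.
ensure_authorized_key() {
    local pub=$1 auth=$2
    [ -r "$pub" ] || { echo "cannot read $pub" >&2; return 1; }
    touch "$auth"
    # -x matches whole lines, -F treats the key as a fixed string.
    if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # sshd may ignore the file if group or others can write to it.
    chmod 600 "$auth"
}

# Usage with the key generated above:
#   ensure_authorized_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
```

Afterwards, ssh localhost should log in without asking for a password, which is what start-all.sh relies on.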

$ wget -c http://apache.mirrors.lucidnetworks.net/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz

$ sudo tar -zxvf hadoop-2.7.0.tar.gz

$ sudo mv hadoop-2.7.0 /usr/local/hadoop

$ update-alternatives --config java

$ sudo nano ~/.bashrc

#Hadoop Variables
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

$ source ~/.bashrc

$ cd /usr/local/hadoop/etc/hadoop

$ sudo nano hadoop-env.sh

#The java implementation to use.
export JAVA_HOME="/usr/lib/jvm/java-7-openjdk-amd64"

$ sudo nano core-site.xml

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

$ sudo nano yarn-site.xml

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

$ sudo cp mapred-site.xml.template mapred-site.xml

$ sudo nano mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

$ sudo nano hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
</property>
</configuration>
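
A single unclosed tag in any of these *-site.xml files is enough to stop the daemons from starting. The sketch below is a rough heuristic, not a real XML parser: it only counts opening and closing tags for the four element names used in the files above. tags_balanced is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: heuristic check that a Hadoop *-site.xml file has matching tags.
# tags_balanced is a hypothetical helper; it counts tags and does not parse XML.
tags_balanced() {
    local file=$1 tag opens closes
    for tag in configuration property name value; do
        opens=$(grep -o "<$tag>" "$file" | wc -l | tr -d ' ')
        closes=$(grep -o "</$tag>" "$file" | wc -l | tr -d ' ')
        if [ "$opens" -ne "$closes" ]; then
            echo "$file: $opens <$tag> but $closes </$tag>" >&2
            return 1
        fi
    done
}

# Usage from /usr/local/hadoop/etc/hadoop:
#   for f in core-site.xml yarn-site.xml mapred-site.xml hdfs-site.xml; do
#       tags_balanced "$f" && echo "$f looks balanced"
#   done
```

If the xmllint tool happens to be installed, xmllint --noout on each file is a stricter check.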

$ cd

$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode

$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode

#Replace chaal with your own user and group name.
$ sudo chown chaal:chaal -R /usr/local/hadoop

$ hdfs namenode -format

$ start-all.sh

$ jps

ResourceManager web UI: http://192.168.56.10:8088/
NameNode web UI: http://192.168.56.10:50070/
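
One caution about the hdfs namenode -format step above: running it again on a cluster that already holds data wipes the HDFS metadata. The sketch below guards against that by checking for the current/VERSION file that a formatted NameNode directory contains. namenode_needs_format is a hypothetical helper name, and the directory is passed as an argument.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the NameNode directory still needs formatting.
# namenode_needs_format is a hypothetical helper; the directory is an argument.
namenode_needs_format() {
    local dir=$1
    # A formatted NameNode directory contains current/VERSION.
    [ ! -f "$dir/current/VERSION" ]
}

# Usage (path taken from dfs.namenode.name.dir in hdfs-site.xml):
#   if namenode_needs_format /usr/local/hadoop/hadoop_data/hdfs/namenode; then
#       hdfs namenode -format
#   else
#       echo "NameNode already formatted, skipping"
#   fi
```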