Saturday, September 3, 2016

HIVE 1.2.1 INSTALLATION ON HADOOP 2.7.1 SINGLE NODE CLUSTER



PREREQUISITES
1) Ubuntu 12.04 or higher
2) Java 7 or higher (Hive 1.2 and Hadoop 2.7 both need Java 7 or newer)
3) Hadoop 2.7.1 installed and configured on the system
Note: Hive runs on Linux and Windows. Mac OS X is also commonly used as a development environment.
INSTALLATION STEPS
Step 1: sudo tar -zxvf apache-hive-1.2.1-bin.tar.gz
Step 2: sudo mkdir -p /usr/local/hive
sudo mv apache-hive-1.2.1-bin /usr/local/hive/
The remaining changes go in the ~/.bashrc of the Hadoop user, so log in as that user first:
su hadoop-user (use the user name from your own system here)
Enter the password when prompted, then continue.
Step 3: nano ~/.bashrc
Step 4: Add these lines to ~/.bashrc and save the file
export HIVE_HOME=/usr/local/hive/apache-hive-1.2.1-bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_CLASS_PATH=$HIVE_CONF_DIR
export PATH=$HIVE_HOME/bin:$PATH
Step 5: source ~/.bashrc
Step 6: Start Hadoop first with the start-all.sh command, then start Hive by typing hive. You should see output like this:

hadoop-user@ubuntu:~$ hive

Logging initialized using configuration in jar:file:/usr/local/hive/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop-2.7.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.


hive> 
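If the hive command is not found or fails to start, a quick check of the variables from Step 4 can help. The function below is only a rough sketch: check_hive_env is a made-up helper name (it is not part of Hive), and it assumes the launcher lives at $HIVE_HOME/bin/hive as in the layout used above.

```shell
#!/bin/sh
# check_hive_env is a hypothetical helper, not something shipped with Hive.
# It checks that the variables exported in ~/.bashrc point at a real install.
check_hive_env() {
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set" >&2
        return 1
    fi
    # The launcher script is assumed to be $HIVE_HOME/bin/hive.
    if [ ! -x "$HIVE_HOME/bin/hive" ]; then
        echo "no executable hive launcher under $HIVE_HOME/bin" >&2
        return 1
    fi
    # Fall back to $HIVE_HOME/conf when HIVE_CONF_DIR is not exported.
    if [ ! -d "${HIVE_CONF_DIR:-$HIVE_HOME/conf}" ]; then
        echo "Hive conf directory is missing" >&2
        return 1
    fi
    echo "Hive environment looks consistent: $HIVE_HOME"
}
```

After running source ~/.bashrc, call check_hive_env in the same terminal; an error message points at the variable that needs fixing.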


Step-by-Step MapReduce Example on Hadoop 2.7.1 (with video)




Step 1: Open a terminal with Ctrl+Alt+T.

Step 2: Switch to the Hadoop user:
hadoopashish@ubuntu:~$ su hadoop-user

Step 3: Start all Hadoop daemons with the command below:
hadoop-user@ubuntu:~$ start-all.sh

Step 4: Create a new directory named data on the Desktop and create a text file inside it:
Desktop/data/text.txt

Step 5: Once Hadoop has started, run jps to confirm that the daemons are running:
hadoop-user@ubuntu:~$ jps
6385 ResourceManager
6146 SecondaryNameNode
5637 NameNode
6696 NodeManager
5866 DataNode
12510 Jps
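Checking the jps list by eye is easy to get wrong, so here is a small sketch that does it for you. check_daemons is a made-up helper name; it assumes the five daemons shown above are the ones expected on a single-node setup and reads the jps output from standard input.

```shell
#!/bin/sh
# check_daemons is a hypothetical helper: it reads `jps` output on stdin
# and reports which of the five expected Hadoop daemons are missing.
check_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare against the whole second field so that "NameNode"
        # is not satisfied by the "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all daemons running"
}
```

Use it as jps | check_daemons. If a daemon is reported missing, look at its log file under the Hadoop logs directory before going on to the next step.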

Step 6: Change to the Hadoop installation directory:
hadoop-user@ubuntu:~$ cd /usr/local/hadoop/
or to wherever Hadoop is installed on your system. Mine is:
hadoop-user@ubuntu:~$ cd /usr/local/hadoop-2.7.1/

Step 7: Inside this directory, create the HDFS user directories, pressing Enter after each command:
hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ bin/hdfs dfs -mkdir /user
hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ bin/hdfs dfs -mkdir /user/ashish/
Relative HDFS paths such as the input and output used in the later steps resolve to /user/<name of the user running the command>, so it is safest to also have a directory named after that user (hadoop-user in Step 11).

Step 8: Verify that the directory was created. Open a web browser, type the following in the address bar, and press Enter:
localhost:50070/
The Hadoop NameNode page opens. Go to Utilities --> Browse the file system, which shows the Browse Directory list. Click the Go button and you will see the user directory. Select it to see the directory created in Step 7; in my case it is named ashish.
That completes the verification.

Step 9: Upload the input stored in Desktop/data/text.txt to HDFS:
hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ hdfs dfs -put /home/hadoopashish/Desktop/data input
This copies the local data directory into an HDFS directory named input.

Step 10: This is the main step: running the MapReduce job. The command has these parts:
hadoop is the Hadoop command-line launcher.
jar tells it to run a program packaged in a JAR (Java archive) file, here hadoop-mapreduce-examples-2.7.1.jar.
wordcount is the name of the example program (WordCount) that ships inside that JAR.
input is the HDFS directory to read, and output is the HDFS directory where the results of the job will be stored.
hadoop-user@ubuntu:/usr/local/hadoop-2.7.1$ hadoop  jar  share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar  wordcount  input  output

Press Enter and the MapReduce job starts; you can follow its progress in the terminal window.
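To get a feel for what the wordcount job computes, the map, shuffle, and reduce stages can be imitated locally with a plain shell pipeline. This is only a rough sketch and not how Hadoop implements it: local_wordcount is a made-up function name, it assumes words are separated by whitespace, and it runs on one machine with no HDFS involved.

```shell
#!/bin/sh
# local_wordcount is a hypothetical helper that imitates the three stages
# of a word count job on text read from stdin.
local_wordcount() {
    # Map: emit one "word<TAB>1" pair per whitespace-separated word.
    awk '{ for (i = 1; i <= NF; i++) print $i "\t1" }' |
    # Shuffle: sort so that identical words end up on adjacent lines.
    LC_ALL=C sort |
    # Reduce: add up the counts of each run of identical words.
    awk -F '\t' '
        $1 != key { if (NR > 1) print key "\t" sum; key = $1; sum = 0 }
        { sum += $2 }
        END { if (NR > 0) print key "\t" sum }
    '
}
```

Running local_wordcount < Desktop/data/text.txt prints one word and its count per line, tab-separated and sorted by word, which makes it easy to compare against the job's output on a small input file.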

Step 11: When the job finishes, go back to the browser, open localhost:50070, and go to Utilities --> Browse the file system. Click the Go button, open user, and then open hadoop-user (the directory where my files are stored).
You will see two directories, input and output.
The output directory contains a part-r-00000 file and a _SUCCESS file. part-r-00000 is the output file.
Open it to see the result of the job.




Friday, September 2, 2016

Hadoop 2.7.0 Single Node Cluster Setup on Ubuntu 15.04




$ sudo apt-get update

$ sudo apt-get install default-jdk

$ java -version

$ sudo apt-get install ssh

$ sudo apt-get install rsync

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

$ wget -c http://apache.mirrors.lucidnetworks.net/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz

$ sudo tar -zxvf hadoop-2.7.0.tar.gz

$ sudo mv hadoop-2.7.0 /usr/local/hadoop

$ update-alternatives --config java

$ sudo nano ~/.bashrc

          #Hadoop Variables
          export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
          export HADOOP_HOME=/usr/local/hadoop
          export PATH=$PATH:$HADOOP_HOME/bin
          export PATH=$PATH:$HADOOP_HOME/sbin
          export HADOOP_MAPRED_HOME=$HADOOP_HOME
          export HADOOP_COMMON_HOME=$HADOOP_HOME
          export HADOOP_HDFS_HOME=$HADOOP_HOME
          export YARN_HOME=$HADOOP_HOME
          export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
          export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

$ source ~/.bashrc
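
Before moving on, it can be worth confirming that the exports above really put the Hadoop bin and sbin directories on PATH, since start-all.sh and hdfs are found that way. A small sketch follows; path_has and check_hadoop_path are made-up helper names, and the check assumes HADOOP_HOME is set as in the lines above.

```shell
#!/bin/sh
# path_has is a hypothetical helper: it succeeds if the given directory
# is one of the colon-separated entries of $PATH.
path_has() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# check_hadoop_path is also hypothetical; it reports on the two
# directories that the exports append to PATH.
check_hadoop_path() {
    status=0
    for dir in "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"; do
        if path_has "$dir"; then
            echo "ok: $dir"
        else
            echo "not on PATH: $dir"
            status=1
        fi
    done
    return "$status"
}
```

Run check_hadoop_path in the terminal where ~/.bashrc was sourced; both lines should start with ok.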

$ cd /usr/local/hadoop/etc/hadoop

$ sudo nano hadoop-env.sh

          #The java implementation to use.
          export JAVA_HOME="/usr/lib/jvm/java-7-openjdk-amd64"

$ sudo nano core-site.xml

          <configuration>
                  <property>
                      <name>fs.defaultFS</name>
                      <value>hdfs://localhost:9000</value>
                  </property>
          </configuration>

$ sudo nano yarn-site.xml

          <configuration>
                  <property>
                      <name>yarn.nodemanager.aux-services</name>
                      <value>mapreduce_shuffle</value>
                  </property>
                  <property>
                      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
                      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
                  </property>
          </configuration>

$ sudo cp mapred-site.xml.template mapred-site.xml

$ sudo nano mapred-site.xml

          <configuration>
                  <property>
                      <name>mapreduce.framework.name</name>
                      <value>yarn</value>
                  </property>
          </configuration>

$ sudo nano hdfs-site.xml

          <configuration>
                  <property>
                      <name>dfs.replication</name>
                      <value>1</value>
                  </property>
                  <property>
                      <name>dfs.namenode.name.dir</name>
                      <value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
                  </property>
                  <property>
                      <name>dfs.datanode.data.dir</name>
                      <value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
                  </property>
          </configuration>

$ cd

$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode

$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode

$ sudo chown -R chaal:chaal /usr/local/hadoop
(chaal is the user name on this system; use your own user and group here)

$ hdfs namenode -format

$ start-all.sh

$ jps



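ResourceManager web UI: http://192.168.56.10:8088/
NameNode web UI: http://192.168.56.10:50070/
(192.168.56.10 is the address of the machine used here; use localhost when browsing on the same machine)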