{"id":107,"date":"2019-03-01T15:41:58","date_gmt":"2019-03-01T14:41:58","guid":{"rendered":"http:\/\/deleforterie.com\/wordpress\/?p=107"},"modified":"2023-01-18T08:53:12","modified_gmt":"2023-01-18T07:53:12","slug":"installing-hadoop-yarn-and-hbase-on-your-linux-in-10-minutes","status":"publish","type":"post","link":"https:\/\/deleforterie.com\/wordpress\/index.php\/2019\/03\/01\/installing-hadoop-yarn-and-hbase-on-your-linux-in-10-minutes\/","title":{"rendered":"Installing Hadoop, Yarn and HBase on your Linux in 10 minutes"},"content":{"rendered":"\n<p>This is a howto for installing Hadoop (hdfs and MapReduce), Yarn and HBase on your Linux box in 10 minutes (after the binary download).<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\">Prerequisites<\/h2>\n\n\n\n<p>Install Java (OpenJDK) and find the Java home directory. On Ubuntu, look in \/usr\/lib\/jvm and choose Java 8 (1.8) rather than Java 11, which for the moment causes trouble with these versions.<\/p>\n\n\n\n<p>Download the Hadoop binaries from the <a href=\"https:\/\/hadoop.apache.org\/releases.html\">Hadoop<\/a> releases page. Version 3.1.2 is a good start; choose the binary archive.<\/p>\n\n\n\n<p>Download the HBase binaries from the <a href=\"http:\/\/hbase.apache.org\/downloads.html\">HBase<\/a> downloads page. Version 2.1.3 is a good start; choose the binary archive.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Installing Hadoop<\/h2>\n\n\n\n<p>We will follow the <a href=\"https:\/\/hadoop.apache.org\/docs\/r3.1.2\/hadoop-project-dist\/hadoop-common\/SingleCluster.html\">Hadoop Documentation<\/a> to start a cluster in Pseudo-Distributed&nbsp;mode.<\/p>\n\n\n\n<p>Untar the Hadoop archive in a directory, e.g. \/home\/&lt;username&gt;\/hadoop\/hadoop-3.1.2<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Edit the Hdfs configuration files<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">\/home\/&lt;username&gt;\/hadoop\/hadoop-3.1.2\/etc\/hadoop\/core-site.xml<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"xml\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" 
data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">&lt;configuration>\n    &lt;property>\n        &lt;name>fs.defaultFS&lt;\/name>\n        &lt;value>hdfs:\/\/localhost:9000&lt;\/value>\n    &lt;\/property>\n&lt;\/configuration><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">\/home\/&lt;username&gt;\/hadoop\/hadoop-3.1.2\/etc\/hadoop\/hdfs-site.xml<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"xml\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">&lt;configuration>\n    &lt;property>\n        &lt;name>dfs.replication&lt;\/name>\n        &lt;value>1&lt;\/value>\n    &lt;\/property>\n    &lt;property>\n        &lt;name>dfs.namenode.name.dir&lt;\/name>\n        &lt;value>\/home\/hadoop\/dfs&lt;\/value>\n    &lt;\/property>\n    &lt;property>\n        &lt;name>dfs.datanode.data.dir&lt;\/name>\n        &lt;value>\/home\/hadoop\/data&lt;\/value>\n    &lt;\/property>\n    &lt;property>\n        &lt;name>dfs.namenode.checkpoint.dir&lt;\/name>\n        &lt;value>\/home\/hadoop\/dfs\/namesecondary&lt;\/value>\n    &lt;\/property>    \n&lt;\/configuration>\n<\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">\/home\/&lt;username&gt;\/hadoop\/hadoop-3.1.2\/etc\/hadoop\/hadoop-env.sh<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">export JAVA_HOME=\/usr\/lib\/jvm\/java-1.8.0-openjdk-amd64<\/pre>\n\n\n\n<p>You should be able to run ssh localhost without a passphrase. If this is not the case, generate a key with ssh-keygen if you do not already have one, then run ssh-copy-id localhost to enable it.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Initialize hdfs<\/h3>\n\n\n\n<p>With the following commands we are going to create the local 
directory \/home\/hadoop to store the files, format the namenode and start the hdfs processes.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[~]> sudo mkdir \/home\/hadoop\n[~]> sudo chown &lt;username>: \/home\/hadoop\n[~]> cd \/home\/&lt;username>\/hadoop\/hadoop-3.1.2\n[hadoop-3.1.2]> bin\/hdfs namenode -format\n[hadoop-3.1.2]> sbin\/start-dfs.sh\n[hadoop-3.1.2]> jps\n13579 Jps\n11757 DataNode\n12029 SecondaryNameNode<\/pre>\n\n\n\n<p>Now HDFS is up and you can run some hdfs commands to create directories and test it.<\/p>\n\n\n\n<p>You can look at the WebUI to check health and configuration.<\/p>\n\n\n\n<p>NameNode UI : <a href=\"http:\/\/localhost:9870\">http:\/\/localhost:9870<\/a><\/p>\n\n\n\n<p><\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[hadoop-3.1.2]> bin\/hdfs dfs -mkdir \/user\n[hadoop-3.1.2]> bin\/hdfs dfs -mkdir \/user\/&lt;username>\n[hadoop-3.1.2]> bin\/hdfs dfs -ls \/\nFound 1 items\ndrwxr-xr-x   - &lt;username> supergroup          0 2019-03-01 14:39 \/user\n<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Edit the Yarn configuration files<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">\/home\/&lt;username&gt;\/hadoop\/hadoop-3.1.2\/etc\/hadoop\/mapred-site.xml<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"xml\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">&lt;configuration>\n    &lt;property>\n        &lt;name>mapreduce.framework.name&lt;\/name>\n        &lt;value>yarn&lt;\/value>\n    
&lt;\/property>\n    &lt;property>\n        &lt;name>mapreduce.application.classpath&lt;\/name>\n        &lt;value>$HADOOP_MAPRED_HOME\/share\/hadoop\/mapreduce\/*:$HADOOP_MAPRED_HOME\/share\/hadoop\/mapreduce\/lib\/*&lt;\/value>\n    &lt;\/property>\n&lt;\/configuration><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">\/home\/&lt;username&gt;\/hadoop\/hadoop-3.1.2\/etc\/hadoop\/yarn-site.xml<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"xml\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">&lt;configuration>\n    &lt;property>\n        &lt;name>yarn.nodemanager.aux-services&lt;\/name>\n        &lt;value>mapreduce_shuffle&lt;\/value>\n    &lt;\/property>\n    &lt;property>\n        &lt;name>yarn.nodemanager.env-whitelist&lt;\/name>        \n        &lt;value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME&lt;\/value>\n    &lt;\/property>\n    &lt;property>\n        &lt;name>yarn.nodemanager.local-dirs&lt;\/name>\n        &lt;value>\/home\/hadoop\/nm-local-dir\/&lt;\/value>\n    &lt;\/property>\n&lt;\/configuration><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Start Yarn and test the jobs<\/h3>\n\n\n\n<p>With the following commands we are going to start Yarn (the ResourceManager and the NodeManager), which can then run MapReduce jobs.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[hadoop-3.1.2]> sbin\/start-yarn.sh\n[hadoop-3.1.2]> jps\n22432 NodeManager\n22661 Jps\n18773 NameNode\n22038 ResourceManager\n19017 DataNode\n19307 SecondaryNameNode\n<\/pre>\n\n\n\n<p>Now that Yarn is running you can test the Yarn WebUI : <a 
href=\"http:\/\/localhost:8088\">http:\/\/localhost:8088<\/a><\/p>\n\n\n\n<p>Let&#8217;s run a MapReduce job. During its execution you can follow the job in the WebUI.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[hadoop-3.1.2]> bin\/hdfs dfs -mkdir input\n[hadoop-3.1.2]> bin\/hdfs dfs -put etc\/hadoop\/*.xml input\n[hadoop-3.1.2]> bin\/hadoop jar share\/hadoop\/mapreduce\/hadoop-mapreduce-examples-3.1.2.jar grep input output 'dfs[a-z.]+'\n[hadoop-3.1.2]> bin\/hdfs dfs -get output output\n[hadoop-3.1.2]> cat output\/*\n[hadoop-3.1.2]> bin\/hdfs dfs -cat output\/*<\/pre>\n\n\n\n<p>From this point we have a running Hadoop system in Pseudo-Distributed mode.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Installing HBase<\/h2>\n\n\n\n<p>We will follow the <a href=\"http:\/\/hbase.apache.org\/book.html#quickstart_pseudo\">HBase Documentation<\/a> to start HBase in Pseudo-Distributed&nbsp;mode.<\/p>\n\n\n\n<p>Untar the HBase archive in a directory, e.g. \/home\/&lt;username&gt;\/hadoop\/hbase-2.1.3<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Edit the HBase configuration files<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">\/home\/&lt;username&gt;\/hadoop\/hbase-2.1.3\/conf\/hbase-site.xml<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"xml\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">&lt;configuration>\n&lt;property>\n    &lt;name>hbase.cluster.distributed&lt;\/name>\n    &lt;value>true&lt;\/value>\n&lt;\/property>\n&lt;property>\n    &lt;name>hbase.rootdir&lt;\/name>\n    &lt;value>hdfs:\/\/localhost:9000\/hbase&lt;\/value>\n&lt;\/property>\n&lt;property>\n    &lt;name>hbase.zookeeper.property.dataDir&lt;\/name>\n    
&lt;value>\/home\/hadoop\/zookeeper&lt;\/value>\n&lt;\/property>\n&lt;\/configuration><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">\/home\/&lt;username&gt;\/hadoop\/hbase-2.1.3\/conf\/hbase-env.sh<\/h4>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">export JAVA_HOME=\/usr\/lib\/jvm\/java-1.8.0-openjdk-amd64<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Start HBase<\/h3>\n\n\n\n<p>With the following commands we are going to start HBase.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[~]> cd \/home\/&lt;username>\/hadoop\/hbase-2.1.3\n[hbase-2.1.3]> bin\/start-hbase.sh\n[hbase-2.1.3]> jps\n22432 NodeManager\n18773 NameNode\n22038 ResourceManager\n25624 HRegionServer\n19017 DataNode\n25530 HMaster\n25451 HQuorumPeer\n19307 SecondaryNameNode\n25949 Jps<\/pre>\n\n\n\n<p>Now you have an HBase database running on your Hadoop in Pseudo-Distributed mode, and you can check HBase with its WebUI : <a href=\"http:\/\/localhost:16010\">http:\/\/localhost:16010<\/a><\/p>\n\n\n\n<p>You can also view the HBase files in Hdfs:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[hbase-2.1.3]> cd \/home\/&lt;username>\/hadoop\/hadoop-3.1.2\n[hadoop-3.1.2]> bin\/hdfs dfs -ls \/hbase\nFound 13 items\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:16 \/hbase\/.hbck\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:17 \/hbase\/.tmp\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:16 
\/hbase\/MasterProcWALs\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:16 \/hbase\/WALs\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:16 \/hbase\/archive\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:16 \/hbase\/corrupt\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:17 \/hbase\/data\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:17 \/hbase\/hbase\n-rw-r--r--   3 rico supergroup         42 2019-03-01 15:16 \/hbase\/hbase.id\n-rw-r--r--   3 rico supergroup          7 2019-03-01 15:16 \/hbase\/hbase.version\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:16 \/hbase\/mobdir\ndrwxr-xr-x   - rico supergroup          0 2019-03-01 15:16 \/hbase\/oldWALs\ndrwx--x--x   - rico supergroup          0 2019-03-01 15:16 \/hbase\/staging\n<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Testing HBase<\/h3>\n\n\n\n<p>With the following commands we are going to do the exercises from <a href=\"http:\/\/hbase.apache.org\/book.html#shell_exercises\">HBase Documentation<\/a>. 
<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"sql\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[~]> cd \/home\/&lt;username>\/hadoop\/hbase-2.1.3\n[hbase-2.1.3]> bin\/hbase shell\nHBase Shell\nUse \"help\" to get list of supported commands.\nUse \"exit\" to quit this interactive shell.\nFor Reference, please visit: http:\/\/hbase.apache.org\/2.0\/book.html#shell\nVersion 2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019\nTook 0.0034 seconds                                                                                                                                                                                   \nhbase(main):001:0> \nhbase(main):002:0> create 'test', 'cf'\nCreated table test\nTook 5.3775 seconds                                                                                                                                                                                   \n=> Hbase::Table - test\nhbase(main):003:0> list 'test'\nTABLE                                                                                                                                                                                                 \ntest                                                                                                                                                                                                  \n1 row(s)\nTook 0.0218 seconds                                                                                                                                                                                   \n=> [\"test\"]\nhbase(main):004:0> describe 'test'\nTable test is ENABLED                                                                                                                                                                                
 \ntest                                                                                                                                                                                                  \nCOLUMN FAMILIES DESCRIPTION                                                                                                                                                                           \n{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL =\n> 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => \n'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}                                                                                                                           \n1 row(s)\nTook 0.2403 seconds                                                                                                                                                                                   \nhbase(main):005:0> put 'test', 'row1', 'cf:a', 'value1'\nTook 0.0952 seconds                                                                                                                                                                                   \nhbase(main):006:0> put 'test', 'row2', 'cf:b', 'value2'\nTook 0.0048 seconds                                                                                                                                                                                   \nhbase(main):007:0> put 'test', 'row3', 'cf:c', 'value3'\nTook 0.0048 seconds                                                                                                                                                                                   
\nhbase(main):008:0> scan 'test'\nROW                                                COLUMN+CELL                                                                                                                                        \n row1                                              column=cf:a, timestamp=1551450920340, value=value1                                                                                                 \n row2                                              column=cf:b, timestamp=1551450930781, value=value2                                                                                                 \n row3                                              column=cf:c, timestamp=1551450938910, value=value3                                                                                                 \n3 row(s)\nTook 0.0247 seconds\n\n<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"mce_37\">Stop all<\/h3>\n\n\n\n<p>With the following commands we are going to stop HBase, Yarn and Hdfs. 
<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"shell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[~]> cd \/home\/&lt;username>\/hadoop\/hbase-2.1.3\n[hbase-2.1.3]> bin\/stop-hbase.sh\n[hbase-2.1.3]> cd \/home\/&lt;username>\/hadoop\/hadoop-3.1.2\n[hadoop-3.1.2]> sbin\/stop-yarn.sh\n[hadoop-3.1.2]> sbin\/stop-dfs.sh\n[hadoop-3.1.2]> jps\n25949 Jps<\/pre>\n\n\n\n<p>Hope this howto will help you to discover the Hadoop and HBase technologies.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is a howto for installing Hadoop (hdfs and MapReduce), Yarn and HBase on your Linux box in 10 minutes (after the binary download).<\/p>\n","protected":false},"author":2,"featured_media":170,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[4,14,21,16],"tags":[],"class_list":["post-107","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata","category-hadoop","category-hbase","category-yarn"],"_links":{"self":[{"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/107","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/comments?post=107"}],"version-history":[{"count":20,"href":"https:\/\/deleforterie.com\/wordpress\/index.
php\/wp-json\/wp\/v2\/posts\/107\/revisions"}],"predecessor-version":[{"id":166,"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/107\/revisions\/166"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/media\/170"}],"wp:attachment":[{"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/media?parent=107"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/categories?post=107"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/deleforterie.com\/wordpress\/index.php\/wp-json\/wp\/v2\/tags?post=107"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}