| Title | Steps to Install Hadoop on Mac OSX |
|---|---|
| Author | Plamond Colaso |
| Course | Network Routing |
| Institution | University of Missouri-Kansas City |
Steps to install and run Hadoop on a MacBook Pro. (I tried the installation on macOS Sierra, and it worked for me, so good luck!)
Make sure Homebrew (https://brew.sh) is installed:
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Install Java and set it up. (On Ubuntu, you would use apt-get to install packages.)
$ brew cask install java
Get the directory where Java is installed:
$ /usr/libexec/java_home
Download Hadoop:
$ brew install wget
$ wget http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.8.1/hadoop-2.8.1.tar.gz
$ tar xvf hadoop-2.8.1.tar.gz
Set the environment variable $JAVA_HOME in etc/hadoop/hadoop-env.sh:
$ export JAVA_HOME=/the/output/from/the/previous/command
Update the .xml files in hadoop-2.8.1/etc/hadoop as described in Step 4 of [3]. Add the following to hdfs-site.xml to set the local directory that will be used by the NameNode in HDFS.
<property>
  <name>dfs.name.dir</name>
  <value>/some/directory/name/that/exists</value>
  <final>true</final>
</property>

Add the following to hdfs-site.xml to set the local directory that will be used by the DataNode in HDFS.

<property>
  <name>dfs.data.dir</name>
  <value>/some/directory/name/that/exists/but/different/from/the/namenode</value>
</property>
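Assembled, the relevant part of etc/hadoop/hdfs-site.xml might look like the sketch below. The directory paths are placeholders, and the dfs.replication property is an assumption on my part: single-node guides such as [3] typically set it to 1 so HDFS does not try to keep multiple copies of each block.

```xml
<!-- Sketch of hdfs-site.xml for a single-node setup; paths are placeholders -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/some/directory/name/that/exists</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/some/directory/name/that/exists/but/different/from/the/namenode</value>
  </property>
  <!-- Assumption: replication factor 1 is the usual single-node choice -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```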
Then initialize the HDFS file system:
$ bin/hdfs namenode -format
Now it is time to start HDFS and YARN!
$ sbin/start-dfs.sh
$ sbin/start-yarn.sh
Check whether all processes have started using "jps". If everything went well, you should see several Hadoop-related processes (DataNode, NameNode, ResourceManager, NodeManager, SecondaryNameNode), plus Jps itself.
$ jps
Sample output:
24849 Jps
24068 NodeManager
23689 NameNode
23771 DataNode
23870 SecondaryNameNode
23983 ResourceManager
If you don't want to type your password each time you start/stop the services, try this:
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
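A slightly more defensive version of the same idea is sketched below: it only generates a key if none exists, so rerunning it is harmless. The existence check, the chmod calls, and the SSH_DIR variable are my additions, not part of the original commands.

```shell
# Safe to rerun: generate an RSA key with an empty passphrase only if one
# does not already exist, then authorize it for local ssh logins.
# SSH_DIR is introduced here for illustration; it defaults to ~/.ssh.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
if [ ! -f "$SSH_DIR/id_rsa" ]; then
  ssh-keygen -t rsa -P "" -f "$SSH_DIR/id_rsa"
fi
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

On macOS you may also need to enable Remote Login (System Preferences > Sharing) so that sshd is actually running.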
Let's try some HDFS commands:
$ bin/hdfs dfs -mkdir /yourname
$ bin/hdfs dfs -ls
$ bin/hdfs dfs -copyFromLocal LICENSE.txt /yourname/
$ bin/hdfs dfs -ls
$ bin/hdfs dfs -cat /yourname/LICENSE.txt
You can add the Hadoop bin/ directory to your $PATH variable. To run the wordcount example, do the following:
$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar wordcount /yourname/LICENSE.txt /yourname/out
$ hdfs dfs -ls /yourname/out/
$ hdfs dfs -cat /yourname/out/part-r-00000
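To see what the wordcount job computes without a cluster, here is the same idea expressed with ordinary Unix tools; the sample sentence is made up, and tr/sort/uniq are stand-ins for the map and reduce phases, not part of Hadoop.

```shell
# Local illustration of the MapReduce wordcount example:
# split the input into one word per line ("map"), then count how many
# times each word occurs ("reduce"), most frequent first.
printf 'to be or not to be\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | sort -rn
```

The part-r-00000 file produced by the real job has the same word/count pairs, one per line.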
Finally, you can shut down Hadoop (gracefully):
$ sbin/stop-dfs.sh
$ sbin/stop-yarn.sh
You can set $HADOOP_HOME to point to where Hadoop is installed. Always be curious about what is written in the log files in the logs/ directory. Use the links below to check the status of the Hadoop cluster:
http://localhost:8088/cluster
http://localhost:50070/dfshealth.html
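For example, assuming the tarball was unpacked in your home directory (the exact path is an assumption; adjust it to your location), the $HADOOP_HOME and $PATH setup might look like this:

```shell
# Assumption: Hadoop was unpacked as ~/hadoop-2.8.1; adjust to your location.
export HADOOP_HOME="$HOME/hadoop-2.8.1"
# Putting bin/ and sbin/ on $PATH lets you run hdfs, hadoop, start-dfs.sh,
# etc. from any directory instead of from inside the Hadoop tree.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

Add these lines to your shell profile (e.g. ~/.bash_profile) to make them permanent.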
(You could also install Hadoop using brew. It was too slow for me.)
References
[1] http://www.lonecpluspluscoder.com/2017/04/27/installing-java-8-jdk-os-x-using-homebrew/
[2] https://brew.sh/
[3] https://www.quickprogrammingtips.com/big-data/how-to-install-hadoop-onmac-os-x-el-capitan.html
[4] http://stackoverflow.com/...