== Hadoop Pseudo Distributed Mode ==

A Hadoop cluster can be emulated with "pseudo-distributed mode":
* all Hadoop daemons run, and applications feel like they are being executed on a cluster
Suppose Hadoop is unpacked to <code>~/soft/hadoop-2.6.0/</code>. Set <code>HADOOP_CONF_DIR</code> to some directory with the configuration, e.g. <code>~/conf/hadoop-local</code>.
You need to export the following environment variables:

```bash
#!/bin/bash
export HADOOP_HOME=~/soft/hadoop-2.6.0
export HADOOP_BIN=$HADOOP_HOME/bin
export HADOOP_CONF_DIR=~/conf/hadoop-cluster
export YARN_CONF_DIR=$HADOOP_CONF_DIR
export PATH=$HADOOP_BIN:$HADOOP_HOME/sbin:$PATH
```
Also, if you don't have Java on your <code>PATH</code>, you need to create <code>hadoop-env.sh</code> in <code>HADOOP_CONF_DIR</code> and add (or replace):

```bash
export JAVA_HOME=/home/user/soft/jdk1.8.0_60/
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
```
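Before starting any daemons, it can save time to verify that <code>JAVA_HOME</code> really points at a JDK. This is a hypothetical helper, not part of Hadoop; the path is the same example as above:

```shell
# Hypothetical sanity check: make sure JAVA_HOME points at a directory
# that actually contains an executable bin/java.
check_java_home() {
  [ -n "$JAVA_HOME" ] && [ -x "$JAVA_HOME/bin/java" ]
}

export JAVA_HOME=/home/user/soft/jdk1.8.0_60/   # example path from hadoop-env.sh
if check_java_home; then
  echo "JAVA_HOME looks fine"
else
  echo "JAVA_HOME does not point at a JDK" >&2
fi
```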
Hadoop in "Pseudo-distributed mode" should have properties similar to these:
<code>core-site.xml</code>:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/agrigorev/tmp/hadoop/</value>
  </property>
</configuration>
```

<code>hdfs-site.xml</code>:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

<code>mapred-site.xml</code>:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

<code>yarn-site.xml</code>:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```
Then format the namenode:

```shell
hdfs namenode -format
```

Note: if <code>hadoop.tmp.dir</code> is not specified, Hadoop uses <code>/tmp/hadoop-${user.name}</code>, which is cleaned after each reboot.
Also make sure that <code>ssh localhost</code> works and that <code>ssh-agent</code> is running.
To start, use:

```shell
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
```
Make sure the namenode started:

```shell
telnet localhost 8020
```
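If telnet is not installed, the same check can be done with a small bash function. This is a sketch using bash's <code>/dev/tcp</code> redirection (a bash feature, so it won't work in plain <code>sh</code>); 8020 is the default RPC port implied by <code>hdfs://localhost/</code>:

```shell
# Probe a TCP port: returns 0 (success) if something accepts the connection.
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null
}

if port_open localhost 8020; then
  echo "namenode is listening on 8020"
else
  echo "nothing is listening on 8020" >&2
fi
```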
If the namenode doesn't start in local mode, do [1]:

```shell
rm -Rf tmp_dir
hadoop namenode -format
start-dfs.sh
```

Starting the datanodes:

```shell
hadoop-daemon.sh start datanode
```
To check that everything works, put a file to HDFS and list it:

```shell
hadoop fs -put somefile /home/username/
hadoop fs -ls /home/username/
```
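If the <code>-put</code> fails because the target directory does not exist yet, it can be created first. A sketch of the full round trip against a running cluster (the file name and paths are just examples):

```shell
echo "hello hadoop" > somefile
hadoop fs -mkdir -p /home/username/    # -p creates parent dirs, like the shell's mkdir -p
hadoop fs -put somefile /home/username/
hadoop fs -ls /home/username/
hadoop fs -cat /home/username/somefile # should print "hello hadoop" back
```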
Troubleshooting: make sure <code>HADOOP_CONF_DIR</code> is set and points to the right configuration directory.
To list running YARN applications and kill one of them:

```shell
yarn application -list
yarn application -kill application_1445857836386_0002
```
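Killing applications one by one gets tedious when experiments pile up. A hedged sketch of killing everything at once, assuming <code>yarn</code> is on the <code>PATH</code> and that application IDs appear in the first column of the <code>-list</code> output (which is how Hadoop 2.x prints them); <code>list_app_ids</code> is a hypothetical helper:

```shell
# Pull application IDs out of `yarn application -list` output:
# IDs start with "application_" and sit in the first column.
list_app_ids() {
  awk '$1 ~ /^application_/ { print $1 }'
}

# Kill every application currently known to the ResourceManager.
yarn application -list 2>/dev/null | list_app_ids | while read -r app; do
  yarn application -kill "$app"
done
```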