While installing Hadoop on my Windows 7 laptop, I ran into a few problems that cost me a couple of hours, so I am writing the process down.
The standard process is:
- Install Cygwin (the OpenSSH package must be selected during installation).
- Download the Hadoop package from the official site and extract it. (PS: on my laptop I had to extract it inside the directory where Cygwin is installed, otherwise the Hadoop namenode would not start.)
- Set the environment variables JAVA_HOME and HADOOP_INSTALL, and add %HADOOP_INSTALL%/bin to PATH.
- Open %HADOOP_INSTALL%/conf/hadoop-env.sh and add "export JAVA_HOME=***" and "export HADOOP_INSTALL=**".
- Open a Cygwin terminal and type "hadoop version". If no error is reported, the setup is fine. If there is an error, check your HADOOP_INSTALL and PATH settings. (PS: beware of JAVA_HOME; it is recommended that this path contain no spaces.)
- Modify core-site.xml, hdfs-site.xml and mapred-site.xml using the content below:
<?xml version="1.0"?>
<!-- core-site.xml -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost/</value>
</property>
</configuration>
<?xml version="1.0"?>
<!-- hdfs-site.xml -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
<?xml version="1.0"?>
<!-- mapred-site.xml -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
</configuration>
- Configure SSH with the command ssh-host-config. When prompted, choose not to create the separate privileged account (the prompt may be worded differently, but it is something like that).
Then generate a new SSH key, so that we, and Hadoop running on top of SSH, can log in without a password:
% ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
% cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Test with: ssh localhost (the Cygwin SSH service must be started before we can log in).
- Initialize the Hadoop namenode with: hadoop namenode -format. If you can find "tmp/hadoop-*/dfs/name" in the directory where Hadoop is located, then congratulations.
- Run start-all.sh (it starts the namenode, jobtracker, secondary namenode and datanode). Use the jps command to verify that these daemon processes started successfully. You can also check http://localhost:50030/ for the jobtracker and http://localhost:50070/ for the namenode.
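The jps check in the last step can be scripted. The following is only a rough sketch: missing_daemons is a hypothetical helper name, and it assumes jps prints one "<pid> <ClassName>" pair per line, with the four daemons showing up as NameNode, SecondaryNameNode, DataNode and JobTracker.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): reads jps output on stdin and
# prints the name of every expected daemon that is NOT listed there.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode SecondaryNameNode DataNode JobTracker; do
        # Compare the whole second field, so "SecondaryNameNode"
        # is not mistaken for "NameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage, after start-all.sh has finished:
#   jps | missing_daemons
```

If nothing is printed, all four daemons were found. Any name that is printed is a daemon to look for in the Hadoop logs.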
On my laptop I also had to change the Cygwin service configuration: the service logon account had to be changed to an account that belongs to the Administrators group.
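Since most of my lost time went into wrong environment settings, here is a small sketch that checks the variables from the early steps before "hadoop version" is tried. check_hadoop_env is a hypothetical helper name, and the PATH check assumes that HADOOP_INSTALL holds a POSIX-style path as seen from the Cygwin shell, which may not be true if it was set only on the Windows side.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Cygwin): reports common
# mistakes in JAVA_HOME, HADOOP_INSTALL and PATH.
# Returns 0 when nothing suspicious is found, 1 otherwise.
check_hadoop_env() {
    problems=0
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        problems=1
    else
        case "$JAVA_HOME" in
            # A space in the path is the problem warned about above.
            *" "*) echo "JAVA_HOME contains a space: $JAVA_HOME"; problems=1 ;;
        esac
    fi
    if [ -z "${HADOOP_INSTALL:-}" ]; then
        echo "HADOOP_INSTALL is not set"
        problems=1
    else
        # Assumption: PATH is colon-separated and HADOOP_INSTALL is POSIX-style.
        case ":$PATH:" in
            *":$HADOOP_INSTALL/bin:"*) ;;
            *) echo "$HADOOP_INSTALL/bin is not on PATH"; problems=1 ;;
        esac
    fi
    return "$problems"
}

# Usage in a Cygwin terminal:
#   check_hadoop_env && hadoop version
```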