Tuesday, May 17, 2011

Hadoop issue: jobtracker.info could only be replicated to 0 nodes, instead of 1

After wasting 36 hours of my life, I finally found a way to bypass the problem.
Instead of using hadoop/bin/start-all.sh, run each of the commands separately:
* ./bin/hadoop-daemon.sh start namenode
* ./bin/hadoop-daemon.sh start jobtracker
* ./bin/hadoop-daemon.sh start tasktracker
* ./bin/hadoop-daemon.sh start datanode
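
The four commands above can be wrapped in a small script, roughly like the sketch below. The HADOOP_HOME and DRY_RUN variables and the start_daemons function are my own names for illustration, not part of Hadoop. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: start each Hadoop daemon individually instead of via start-all.sh.
# HADOOP_HOME and DRY_RUN are assumptions of this sketch; adjust to your install.
HADOOP_HOME="${HADOOP_HOME:-.}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

start_daemons() {
    # Same order as the list in the post.
    for daemon in namenode jobtracker tasktracker datanode; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$HADOOP_HOME/bin/hadoop-daemon.sh start $daemon"
        else
            # Stop at the first daemon that fails to launch.
            "$HADOOP_HOME/bin/hadoop-daemon.sh" start "$daemon" || return 1
        fi
    done
}

start_daemons
```

Run it from the Hadoop install directory, or point HADOOP_HOME at it.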

There may still be a lingering issue with the dfs dir. Try deleting the dfs dir and cleaning up everything of the form /tmp/hadoop*
Then run hadoop namenode -format.
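
The cleanup steps above might look roughly like this as a script. The location of the dfs dir depends on your configuration; the default below assumes it lives under /tmp/hadoop-<username>/dfs, which is only a guess for an unconfigured setup. DFS_DIR, DRY_RUN, run and reset_hdfs are names made up for this sketch. It only prints the commands unless DRY_RUN=0, since the deletion is destructive.

```shell
#!/bin/sh
# Sketch: wipe the dfs dir and /tmp/hadoop*, then reformat the namenode.
# DFS_DIR is an assumed location; set it to wherever your dfs dir really is.
DFS_DIR="${DFS_DIR:-/tmp/hadoop-$(id -un)/dfs}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reset_hdfs() {
    run rm -rf "$DFS_DIR"
    # Remove leftovers of the form /tmp/hadoop*, skipping the unexpanded glob.
    for path in /tmp/hadoop*; do
        if [ -e "$path" ]; then
            run rm -rf "$path"
        fi
    done
    run hadoop namenode -format
}

reset_hdfs
```

Check the printed commands first, then rerun with DRY_RUN=0 once the paths look right.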

Note: before doing any of this, make sure you have killed/stopped all the Hadoop processes running on your machine.
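
One way to check for leftover processes is to filter the output of jps (the JDK tool that lists running Java processes) for the Hadoop daemon names. This is just a sketch: running_hadoop_daemons is a helper name I made up, and the list of daemon class names assumes a classic JobTracker/TaskTracker setup like the one in this post.

```shell
#!/bin/sh
# Sketch: report Hadoop daemons that are still running, based on jps output.
running_hadoop_daemons() {
    # jps prints "<pid> <MainClass>" per line; keep only Hadoop daemon names.
    awk '$2 ~ /^(NameNode|SecondaryNameNode|DataNode|JobTracker|TaskTracker)$/ { print $2 }'
}

# Only query jps if it is installed (it ships with the JDK).
if command -v jps >/dev/null 2>&1; then
    leftover=$(jps | running_hadoop_daemons)
    if [ -n "$leftover" ]; then
        echo "Still running:"
        echo "$leftover"
    fi
fi
```

If anything is listed, stop it (for example with ./bin/hadoop-daemon.sh stop <daemon>, or kill the pid) before deleting the dfs dir or reformatting.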