Help
Dependencies
Hadoop itself
I got Hadoop from some mirrored place like this.
Java
Did a yum install java and seem to have received java-1.5.0-gcj. This could be a problem since the instructions ask for: "JavaTM 1.6.x, preferably from Sun, must be installed."
So over at the evil Oracle site they’re promising free blow jobs with:
jre-6u29-linux-x64-rpm.bin
which unzips to:
jre-6u29-linux-amd64.rpm
Then I think you need to do something like this:
export JAVA_HOME=/usr/java/jre1.6.0_29/
And installs to a mess that doesn’t quite work as far as I can tell.
SSH
SSH needs to be present and working. If it’s not already, you probably have no business doing serious things with your computing hardware.
Installing and Testing
Unzip and untar the hadoop package. Should unpack ok and you can test right from that directory. So in the Hadoop directory that you just unpacked, do this:
$ mkdir input $ cp conf/*.xml input $ bin/hadoop jar hadoop-examples-0.20.203.0.jar grep input output 'dfs[a-z.]+'
If the Java is happy it should run and process an example. Hopefully there are no errors or sad exit codes. A directory called output should appear and "results" can be checked with:
$ cat output/*