Help

Dependencies

Hadoop itself

I got Hadoop from some mirrored place like this.

Java

Did a yum install java and seem to have received java-1.5.0-gcj. This could be a problem since the instructions ask for: "JavaTM 1.6.x, preferably from Sun, must be installed."

So over at the evil Oracle site they’re promising free blow jobs with:

jre-6u29-linux-x64-rpm.bin

which unzips to:

jre-6u29-linux-amd64.rpm

Then I think you need to do something like this:

export JAVA_HOME=/usr/java/jre1.6.0_29/

And installs to a mess that doesn’t quite work as far as I can tell.

SSH

SSH needs to be present and working. If it’s not already, you probably have no business doing serious things with your computing hardware.

Installing and Testing

Unzip and untar the hadoop package. Should unpack ok and you can test right from that directory. So in the Hadoop directory that you just unpacked, do this:

$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-examples-0.20.203.0.jar grep input output 'dfs[a-z.]+'

If the Java is happy it should run and process an example. Hopefully there are no errors or sad exit codes. A directory called output should appear and "results" can be checked with:

$ cat output/*