- Download the Hadoop tar file from the Cloudera website. I used Hadoop 0.20.2.
- Untar it and set the HADOOP_HOME environment variable to point to the extracted directory.
- cd to the conf folder and update the hdfs-site.xml file with these changes (replace the items in bold with the relevant folders, which need to have the right permissions):
- Update the core-site.xml with:
- Update the “masters” and “slaves” configuration files with “localhost”
- Make sure that ssh is turned on for your Mac (Remote Login under System Preferences > Sharing).
- Format the DFS: hadoop namenode -format
- Run start-all.sh. This should start the following processes:
- Copy a few files from local to dfs and you are all set.
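As a rough illustration of the configuration steps above, here is a minimal sketch that writes a single-node configuration into a scratch directory instead of touching $HADOOP_HOME/conf. The directory names (OUT_DIR, DATA_ROOT) and the port 9000 are my assumptions, not the values from the original listings; the property names are the standard ones for Hadoop 0.20.

```shell
#!/bin/sh
# Sketch: generate sample single-node config files in a scratch directory.
# OUT_DIR, DATA_ROOT and port 9000 are assumptions; adjust them to your machine.
set -e

OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"            # hypothetical scratch directory
DATA_ROOT="${DATA_ROOT:-${HOME:-/tmp}/hadoop-data}"   # hypothetical parent of the DFS folders

mkdir -p "$OUT_DIR"

# hdfs-site.xml: a single replica plus the namenode and datanode folders.
# These folders must exist and be writable by the user running Hadoop.
cat > "$OUT_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>$DATA_ROOT/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>$DATA_ROOT/data</value>
  </property>
</configuration>
EOF

# core-site.xml: point the default filesystem at the local namenode.
cat > "$OUT_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# masters and slaves both list only this machine.
echo localhost > "$OUT_DIR/masters"
echo localhost > "$OUT_DIR/slaves"

echo "Wrote sample configuration to $OUT_DIR"
```

Once the values look right, the four files can be copied into $HADOOP_HOME/conf before formatting the DFS.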
Before you build Hadoop, you need to download the source for the version that you require.
You can explore the SVN tree here:
Since I was interested in the “branch-0.20-append” branch, that is what I used in the following command:
svn co http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ hadoop-0.20-append
Once it is downloaded, create a build.properties file if you want control over the names of the archives that the build generates:
Insert the following:
Save the file and then run:
This will generate the required files. You can search for them with:
find ~/.m2 -name "hadoop-*.jar"
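For the build.properties step, here is a minimal sketch of what overriding the archive name could look like. It assumes that the checkout's build.xml uses a `version` property when naming the jars; SRC_DIR and BUILD_VERSION are hypothetical values I picked for illustration, not the ones from the original listing.

```shell
#!/bin/sh
# Sketch: write a build.properties that overrides the version string used in
# the generated archive names. SRC_DIR and BUILD_VERSION are hypothetical.
set -e

SRC_DIR="${SRC_DIR:-./hadoop-0.20-append}"            # checkout directory from the svn co step
BUILD_VERSION="${BUILD_VERSION:-0.20-append-local}"   # made-up version label

mkdir -p "$SRC_DIR"

cat > "$SRC_DIR/build.properties" <<EOF
# Assumed property: build.xml reads "version" when naming the jars.
version=$BUILD_VERSION
EOF

echo "Wrote $SRC_DIR/build.properties (version=$BUILD_VERSION)"

# Before building, list the targets this checkout actually provides:
#   (cd "$SRC_DIR" && ant -projecthelp)
# The find command above searches ~/.m2, which suggests a target that installs
# into the local Maven repository; its exact name is not assumed here.
```

With a label like this, the jars found under ~/.m2 should carry the chosen version in their file names, which makes it easy to tell a local build apart from a stock release.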