Running Cloudera Hadoop on Mac OS X (the quick and easy way)

  • Download Hadoop from the Cloudera website. You need the tar file; I downloaded Hadoop 0.20.2.
  • Untar it and set the environment variable HADOOP_HOME to point to the extracted folder.
  • cd to the conf folder and
    update the hdfs-site.xml file with these changes (replace the items in bold with the relevant folders, which must have the right permissions):
  • Update the core-site.xml with:
  • Update the “masters” and “slaves” configuration files so that each contains “localhost”
  • Make sure that ssh is turned on for your Mac (Remote Login under System Preferences > Sharing).
  • Format the DFS: hadoop namenode -format
  • Start Hadoop; this should start the following processes, which you can list with jps:
    $ jps
  • Copy a few files from the local filesystem to DFS and you are all set.
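
The configuration steps above can be sketched as a small shell function. This is only a sketch under stated assumptions: the property names (dfs.replication, dfs.name.dir, dfs.data.dir, fs.default.name) are the ones Hadoop 0.20 uses, but the data folder, the replication factor of 1 and the NameNode port 9000 are illustrative values of mine, not the settings from the original post. The function name write_single_node_conf is hypothetical.

```shell
#!/bin/sh
# Sketch: write a single-node Hadoop 0.20 configuration into a conf folder.
# Assumptions (not from the original post): the data folder layout, a
# replication factor of 1 and port 9000 are illustrative values only.

write_single_node_conf() {
    conf_dir=$1    # for example "$HADOOP_HOME/conf"
    data_root=$2   # hypothetical parent folder for HDFS data; must be writable

    # Only the parent folder is created here; the name folder itself is
    # left for "hadoop namenode -format" to create.
    mkdir -p "$conf_dir" "$data_root" || return 1

    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>$data_root/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>$data_root/data</value>
  </property>
</configuration>
EOF

    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

    # Single-node setup: this machine is both the master and the only slave.
    echo localhost > "$conf_dir/masters"
    echo localhost > "$conf_dir/slaves"
}
```

Calling write_single_node_conf "$HADOOP_HOME/conf" "$HOME/hadoop-data" (the second path is a made-up example) would overwrite the four files in the conf folder, so back up the originals first. After that, formatting the DFS and starting the daemons proceeds as in the steps above; in Hadoop 0.20 one common way to start everything is the bin/start-all.sh script.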

Building Hadoop – download and build

    Before you build Hadoop, you need to download the version that you require.
    You can explore the SVN tree here:

    Since I was interested in the “branch-20-append” branch, that is what I used in the following command:
    svn co hadoop-0.20-append

    Once it is downloaded, create a file if you want control over the names of the archives that the build generates:
    cd hadoop-0.20-append/

    Insert the following:


    Save the file and then run:
    ant mvn-install
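
The file name and its contents are not shown above, so the following is only a guess at what such a file could look like. It assumes the file is build.properties in the source root (the Hadoop 0.20 Ant build reads a file of that name) and that a version property there decides the names of the generated archives. The function name and the version label are hypothetical.

```shell
#!/bin/sh
# Sketch: create a build.properties file in the Hadoop source folder.
# Assumption (not confirmed by the original post): the Ant build picks up
# the "version" property from this file and uses it in the archive names.

write_build_properties() {
    src_dir=$1   # for example "hadoop-0.20-append"
    version=$2   # hypothetical version label chosen by you

    [ -d "$src_dir" ] || { echo "no such folder: $src_dir" >&2; return 1; }
    printf 'version=%s\n' "$version" > "$src_dir/build.properties"
}
```

For example, write_build_properties hadoop-0.20-append 0.20-append-custom (a made-up label) would write a single line, version=0.20-append-custom, before you run the build.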

    This will generate the required jar files. You can search for them with:
    find ~/.m2 -name "hadoop-*.jar"
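
The same search can be wrapped in a small helper that reports a missing local Maven repository instead of printing nothing. This is a sketch only; the function name list_hadoop_jars is hypothetical, and it assumes the default repository location of ~/.m2 unless another folder is passed in.

```shell
#!/bin/sh
# Sketch: list the Hadoop jars found under a local Maven repository.
# Assumption: the repository lives under ~/.m2 unless a folder is given.

list_hadoop_jars() {
    repo_dir=${1:-$HOME/.m2}

    if [ ! -d "$repo_dir" ]; then
        echo "no Maven repository at $repo_dir" >&2
        return 1
    fi

    # Regular files only, sorted so the output is stable between runs.
    find "$repo_dir" -type f -name 'hadoop-*.jar' | sort
}
```

Running list_hadoop_jars with no argument searches ~/.m2, just like the find command above.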