Running Cloudera Hadoop on Mac OS X (the Quick and Easy Way)

  • Download the Hadoop tar file from the Cloudera website. I used Hadoop version 0.20.2.
  • Untar it and set the environment variable HADOOP_HOME to point to the extracted folder.
  • cd to the conf folder and add these properties to hdfs-site.xml, inside the <configuration> element (replace the <value> paths with folders on your machine; the user running Hadoop needs read and write permission on them):
    <property>
    <name>dfs.name.dir</name>
    <value>/Users/khanna/temp/hdfs</value>
    </property>
    <property>
    <name>dfs.data.dir</name>
    <value>/Users/khanna/temp/hdfsData</value>
    </property>
  • Add this property to core-site.xml, again inside the <configuration> element:
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/khanna/temp/hadoop</value>
    </property>
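  • Set the contents of the “masters” and “slaves” configuration files (also in the conf folder) to “localhost”.
  • Make sure that ssh is turned on for your Mac (System Preferences > Sharing > Remote Login). start-all.sh uses ssh to launch the daemons, even on localhost.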
  • Format the DFS: hadoop namenode -format
  • Run start-all.sh (found in $HADOOP_HOME/bin). Running jps afterwards should list the following processes:
    $ jps
    TaskTracker
    DataNode
    NameNode
    SecondaryNameNode
    JobTracker
  • Copy a few files from the local file system to DFS (for example with hadoop fs -put) and you are all set.
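
The folder, configuration, and verification steps above can be sketched as a small shell script. This is a minimal sketch, assuming a POSIX shell: write_site_configs and missing_daemons are hypothetical helper names (not Hadoop commands), and the XML is written to a scratch folder for review rather than directly into $HADOOP_HOME/conf.

```shell
#!/bin/sh
# Sketch only: the function names below are illustrative, not part of Hadoop.

# Create the three storage folders under base_dir and write hdfs-site.xml
# and core-site.xml into out_dir. Copy the files into $HADOOP_HOME/conf
# yourself once they look right.
write_site_configs() {
    base_dir=$1    # e.g. /Users/khanna/temp
    out_dir=$2     # scratch folder for the generated XML
    mkdir -p "$base_dir/hdfs" "$base_dir/hdfsData" "$base_dir/hadoop" "$out_dir" || return 1

    cat > "$out_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>$base_dir/hdfs</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>$base_dir/hdfsData</value>
  </property>
</configuration>
EOF

    cat > "$out_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$base_dir/hadoop</value>
  </property>
</configuration>
EOF
}

# Read jps-style output ("pid ClassName" per line) on stdin and print every
# expected daemon that is not listed.
missing_daemons() {
    listing=$(cat)
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # -w matches whole words, so NameNode does not match SecondaryNameNode
        printf '%s\n' "$listing" | grep -qw "$daemon" || printf '%s\n' "$daemon"
    done
}
```

For example, write_site_configs "$HOME/temp" ./conf-sketch generates both files for inspection, and jps | missing_daemons prints nothing when all five daemons are running.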

Building Hadoop – download and build

    Before you build Hadoop, you need to download the version that you require.
    You can explore the SVN tree here:
    http://svn.apache.org/repos/asf/hadoop/common/

    I was interested in the branch “branch-0.20-append”, so I checked it out with:
    svn co http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ hadoop-0.20-append

    Once the checkout finishes, create a build.properties file if you want to control the version string used in the names of the generated archives:
    cd hadoop-0.20-append/
    vi build.properties

    Insert the following:

    version=0.20-append

    Save the file and then run:
    ant mvn-install

    This builds the jar files and installs them into your local Maven repository (~/.m2). You can find them with:
    find ~/.m2 -name "hadoop-*.jar"
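
    The build steps above can be collected into one script. This is a rough sketch, assuming svn and ant are on the PATH; write_build_properties and build_append_branch are hypothetical helper names, and nothing is checked out or built until build_append_branch is called.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative. Defining the functions has no
# side effects; call build_append_branch to run the actual build.

BRANCH_URL="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/"

# Write build.properties so the generated archives carry the given version.
write_build_properties() {
    src_dir=$1
    version=$2
    printf 'version=%s\n' "$version" > "$src_dir/build.properties"
}

# Check out the branch, set the version, build, and list the installed jars.
build_append_branch() {
    work_dir=${1:-hadoop-0.20-append}
    svn co "$BRANCH_URL" "$work_dir" || return 1
    write_build_properties "$work_dir" "0.20-append" || return 1
    (cd "$work_dir" && ant mvn-install) || return 1
    find "$HOME/.m2" -name "hadoop-*.jar"
}
```

    Sourcing the script and running build_append_branch repeats the manual steps in order and stops at the first one that fails.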

    References:

    http://wiki.apache.org/hadoop/HowToContribute

    http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment

    http://www.michael-noll.com/blog/2011/04/14/building-an-hadoop-0-20-x-version-for-hbase-0-90-2/