Running Cloudera Hadoop on Mac OS X (the quick and easy way)

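  • Download Hadoop from the Cloudera website. You need the tar file; I used the Hadoop 0.20.2 version.
  • Untar it and set the environment variable HADOOP_HOME to point to the extracted folder. Adding $HADOOP_HOME/bin to your PATH lets you run the hadoop and start-all.sh commands below without typing the full path.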
  • cd to the conf folder
    and add these properties inside the <configuration> element of hdfs-site.xml (replace the paths in the <value> elements with folders on your machine; they must be writable by the user that runs Hadoop):
    <property>
    <name>dfs.name.dir</name>
    <value>/Users/khanna/temp/hdfs</value>
    </property>
    <property>
    <name>dfs.data.dir</name>
    <value>/Users/khanna/temp/hdfsData</value>
    </property>
  • Update core-site.xml in the same way, with:
    <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/khanna/temp/hadoop</value>
    </property>
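  • Make sure the “masters” and “slaves” configuration files in the conf folder each contain “localhost”.
  • Make sure that ssh is turned on for your Mac (Remote Login under System Preferences > Sharing), because start-all.sh uses ssh to localhost to launch the daemons.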
  • Format the DFS: hadoop namenode -format
  • Run start-all.sh. This should start the following processes, which you can list with jps:
    $ jps
    TaskTracker
    DataNode
    NameNode
    SecondaryNameNode
    JobTracker
  • Copy a few files from the local filesystem to the DFS (for example with hadoop fs -put) and you are all set.
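
The folders named in hdfs-site.xml and core-site.xml can be prepared up front. Below is a minimal sketch, assuming the same three folder names as in the config above (hdfs, hdfsData, hadoop) under a single base folder; the function name prepare_hadoop_dirs is made up for this example and is not part of Hadoop.

```shell
# Hypothetical helper (not a Hadoop command): create the folders used for
# dfs.name.dir, dfs.data.dir and hadoop.tmp.dir under one base folder and
# confirm the current user can write to each of them.
prepare_hadoop_dirs() {
    base="$1"
    for name in hdfs hdfsData hadoop; do
        mkdir -p "$base/$name" || return 1
        # Give the owner full access to the folder.
        chmod u+rwx "$base/$name" || return 1
        # Fail if the folder is still not writable.
        [ -w "$base/$name" ] || return 1
    done
}
```

With the paths from this post, the call would be prepare_hadoop_dirs /Users/khanna/temp, run as the same user that will start Hadoop.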
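
To check that all five daemons came up, the jps listing can be compared against the expected names. This is a rough sketch; missing_daemons is a name invented for this example, and it only assumes that each daemon name appears as a separate word somewhere in the jps output.

```shell
# Hypothetical helper: read jps output on stdin and print the name of each
# expected Hadoop daemon that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # -w matches whole words only, so "NameNode" is not satisfied
        # by a line that only contains "SecondaryNameNode".
        printf '%s\n' "$jps_output" | grep -qw "$daemon" || echo "$daemon"
    done
}
```

Running jps | missing_daemons after start-all.sh prints nothing when every daemon is up. Any name it does print is a daemon that failed to start, and the files in the logs folder under HADOOP_HOME are the place to look for the reason.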
