About Me

I am a Software Architect and Developer based in Bangalore, India. I have experience in building (scalable) applications using Java, JSP, JSF, JBoss Drools, Spring Framework, Hibernate, Ajax, JavaScript, MySQL, NoSQL (HBase, Project Voldemort). I am also a fan of Ruby on Rails, and have done some experimental work with it.

Saturday, August 06, 2011

Configuring LZO Compression for CDH3 HBase

This post explains the steps for configuring LZO compression for CDH3 HBase. It applies to CDH3 HBase 0.90.1 and above.  The main source for this post is http://wiki.apache.org/hadoop/UsingLzoCompression. I have added details of the steps that are needed for CDH3 (step 3), as some instructions in the wiki page did not seem to be applicable to CDH3 HBase.

I tried the steps below on Ubuntu.  If you are using Linux, give Todd Lipcon's https://github.com/toddlipcon/hadoop-lzo-packager a try.

Also, adding LZO compression may require additional configuration changes in your HBase setup.  Refer to http://search-hadoop.com/m/WUnLM6ojHm1/Long+client+pauses+with+compression&subj=Long+client+pauses+with+compression
  1. Download and build http://www.oberhumer.com/opensource/lzo/. These steps are based on http://www.linuxfromscratch.org/blfs/view/cvs/general/LZO.html.
    • wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.05.tar.gz
    • gunzip lzo-2.05.tar.gz
    • tar -xvf lzo-2.05.tar
    • cd lzo-2.05/
    • ./configure --prefix=/usr --enable-shared
    • make
    • make check
      • All checks should pass.
    • make test 
      • All tests should pass.
    • sudo make install
  2. Download and build http://code.google.com/p/hadoop-gpl-compression/ 
    • Copy the Hadoop GPL Compression JAR to HBase lib
      • cp build/hadoop-gpl-compression-0.2.0-dev.jar /path/to/hbase/lib/
    • Copy the native files to /usr/local/lib/
      • sudo cp build/native/Linux-amd64-64/lib/* /usr/local/lib/
  3. Edit /path/to/hbase/conf/hbase-env.sh and add /usr/local/lib to the HBASE_LIBRARY_PATH environment variable.  This is needed because hbase/lib no longer seems to have a "native" folder (refer to http://cdh3u0.cloudera.com/cdh/3/hbase-0.90.1+15.18.releasenotes.html and https://issues.apache.org/jira/browse/HBASE-3533).
    • vi /path/to/hbase/conf/hbase-env.sh
    • Add the following line at the end:
      • export HBASE_LIBRARY_PATH=/usr/local/lib/
  4. Start HBase.  The logs should show that the native LZO library was loaded successfully.  I tried this in standalone mode; I still need to confirm whether the lines below appear in the master or the region server logs.
    • 2011-08-06 14:54:08,776 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
    • 2011-08-06 14:54:08,782 INFO com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library
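
Since I have not confirmed which log file these two lines end up in, a small helper that searches the whole log directory can save some guessing. The following is only a sketch: the function name find_lzo_load_messages is my own, and the log directory you pass in (for example /path/to/hbase/logs) is an assumption that depends on your install. It relies on grep's -r option, which GNU grep on Ubuntu supports.

```shell
#!/bin/sh
# Sketch: report which log files under a directory contain the two
# native-LZO load messages quoted above.
# The function name and the example path are hypothetical.
#
# Example call (adjust the path to your setup):
#   find_lzo_load_messages /path/to/hbase/logs

find_lzo_load_messages() {
    log_dir="$1"

    # -r: search recursively, -l: print only matching file names,
    # -F: treat the pattern as a fixed string (so '&' and '.' are literal)
    gpl_hits=$(grep -rlF "GPLNativeCodeLoader: Loaded native gpl library" "$log_dir" 2>/dev/null)
    lzo_hits=$(grep -rlF "LzoCodec: Successfully loaded & initialized native-lzo library" "$log_dir" 2>/dev/null)

    # Both messages must be present somewhere under the directory
    if [ -n "$gpl_hits" ] && [ -n "$lzo_hits" ]; then
        echo "native LZO load messages found in:"
        printf '%s\n%s\n' "$gpl_hits" "$lzo_hits" | sort -u
        return 0
    fi

    echo "native LZO load messages not found under $log_dir"
    return 1
}
```

The function returns 0 only when both messages are found, so it can also be used in a start-up check. The file names it prints show whether the messages were written by the master or by a region server.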
