This post explains how to configure LZO compression for CDH3 HBase; it applies to CDH3 HBase 0.90.1 and above. The main source is http://wiki.apache.org/hadoop/UsingLzoCompression. I have added details for the steps that CDH3 needs (step 3), as some instructions on the wiki page did not seem to apply to CDH3 HBase.
I tried the steps below on Ubuntu. If you are using Linux, give Todd Lipcon's https://github.com/toddlipcon/hadoop-lzo-packager a try.
Also, adding LZO compression may require additional configuration changes in your HBase setup. Refer to http://search-hadoop.com/m/WUnLM6ojHm1/Long+client+pauses+with+compression&subj=Long+client+pauses+with+compression
- Download and build http://www.oberhumer.com/opensource/lzo/. These steps are based on http://www.linuxfromscratch.org/blfs/view/cvs/general/LZO.html.
  - wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.05.tar.gz
  - gunzip lzo-2.05.tar.gz
  - tar -xvf lzo-2.05.tar
  - cd lzo-2.05/
  - ./configure --prefix=/usr --enable-shared
  - make
  - make check
    - All checks should pass.
  - make test
    - All tests should pass.
  - sudo make install
- Download and build http://code.google.com/p/hadoop-gpl-compression/
  - svn checkout http://svn.codespot.com/a/apache-extras.org/hadoop-gpl-compression/trunk/ hadoop-gpl-compression-read-only
  - cd hadoop-gpl-compression-read-only/
  - export CFLAGS="-m64"
  - ant compile-native
  - ant jar
- Copy the Hadoop GPL Compression JAR to HBase lib
  - cp build/hadoop-gpl-compression-0.2.0-dev.jar /path/to/hbase/lib/
- Copy the native files to /usr/local/lib/
  - sudo cp build/native/Linux-amd64-64/lib/* /usr/local/lib/
- Edit /path/to/hbase/conf/hbase-env.sh and set the HBASE_LIBRARY_PATH environment variable to /usr/local/lib/. This is needed because hbase/lib no longer seems to have a "native" folder (refer to http://cdh3u0.cloudera.com/cdh/3/hbase-0.90.1+15.18.releasenotes.html and https://issues.apache.org/jira/browse/HBASE-3533).
  - vi /path/to/hbase/conf/hbase-env.sh
  - Add the following line at the end:
    - export HBASE_LIBRARY_PATH=/usr/local/lib/
- Start HBase. The logs should show that the native LZO library was loaded successfully. I was running in standalone mode, so I still need to confirm whether the lines below appear in the master or the region server logs.
  - 2011-08-06 14:54:08,776 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
  - 2011-08-06 14:54:08,782 INFO com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library
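
The last two steps (setting HBASE_LIBRARY_PATH and checking the logs) can be scripted. The following is only a minimal sketch: the function names set_hbase_library_path and lzo_loaded_in_log are hypothetical helpers of my own, not part of HBase or hadoop-gpl-compression, and the log file location is left to the caller because it depends on your setup. The two strings searched for are the log messages shown above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not shipped with HBase.

# Append the HBASE_LIBRARY_PATH export to hbase-env.sh unless the exact
# line is already present, so running it twice does not duplicate the line.
#   $1: path to hbase-env.sh
#   $2: directory that holds the native libraries (e.g. /usr/local/lib/)
set_hbase_library_path() {
    env_file="$1"
    lib_dir="$2"
    line="export HBASE_LIBRARY_PATH=$lib_dir"
    if grep -qxF "$line" "$env_file"; then
        return 0    # already configured, leave the file untouched
    fi
    printf '%s\n' "$line" >> "$env_file"
}

# Succeed only if both native-LZO messages appear in the given log file.
#   $1: path to an HBase log file
lzo_loaded_in_log() {
    log_file="$1"
    grep -qF "GPLNativeCodeLoader: Loaded native gpl library" "$log_file" &&
    grep -qF "LzoCodec: Successfully loaded & initialized native-lzo library" "$log_file"
}
```

For example, after copying the native files you could run `set_hbase_library_path /path/to/hbase/conf/hbase-env.sh /usr/local/lib/`, start HBase, and then call `lzo_loaded_in_log` on whichever log file your installation writes. The sketch assumes hbase-env.sh already ends with a newline.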

