How to compress and decompress with Snappy in Hadoop

Date: 2015-02-21 18:44:12

Tags: java hadoop snappy

I am using the following code to compress

    Configuration conf = new Configuration();
    // compress the intermediate map output
    conf.setBoolean("mapred.compress.map.output", true);
    // use Snappy as the map-output codec
    conf.set("mapred.map.output.compression.codec", "org.apache.hadoop.io.compress.SnappyCodec");

with the Snappy algorithm. But when compressing an input file of moderate size (70 to 100 MB), the compressed output comes out larger than the input file, and if I try an input directory containing all kinds of files (.jpg, .mp3, .mp4, etc.) totalling 100 to 150 MB, it fails with this error:

    log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    Java HotSpot(TM) Server VM warning: INFO: os::commit_memory(0x930c0000, 105119744, 0) failed; error='Cannot allocate memory' (errno=12)
    #
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (malloc) failed to allocate 105119744 bytes for committing reserved memory.
    # An error report file with more information is saved as:
    # /home/hduser/workspace/TestProject/hs_err_pid16619.log

Please advise me here: when I try to compress and decompress data with the Snappy algorithm, how can I compress the data with Snappy so that it takes up less space?
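For reference, this is roughly how I compress and decompress a single file outside of MapReduce (a simplified sketch; the class name SnappyFileCompressor and the command-line paths are placeholders, and it assumes the Hadoop native library with Snappy support is on java.library.path):

    import java.io.InputStream;
    import java.io.OutputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.util.ReflectionUtils;

    public class SnappyFileCompressor {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // SnappyCodec needs the native hadoop library (libhadoop/libsnappy)
            CompressionCodec codec = ReflectionUtils.newInstance(SnappyCodec.class, conf);
            FileSystem fs = FileSystem.get(conf);

            Path in = new Path(args[0]);
            Path out = new Path(args[0] + codec.getDefaultExtension()); // ".snappy"

            // wrap the raw output stream in a compressing stream
            InputStream is = fs.open(in);
            OutputStream os = codec.createOutputStream(fs.create(out));
            IOUtils.copyBytes(is, os, conf); // copies and closes both streams

            // decompression is symmetric: wrap the input side instead, e.g.
            // InputStream back = codec.createInputStream(fs.open(out));
        }
    }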

I am using

Ubuntu 13.10 (32-bit), JDK 7 (32-bit), with hadoop-2.2.0.
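The full driver is set up roughly like this (a simplified sketch using the newer Hadoop 2.x property names, which replace the deprecated mapred.* keys shown above; SnappyCompressJob and the mapper/reducer wiring are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class SnappyCompressJob {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hadoop 2.x names for the deprecated mapred.* properties above
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setClass("mapreduce.map.output.compress.codec",
                          SnappyCodec.class, CompressionCodec.class);

            Job job = Job.getInstance(conf, "snappy compression test");
            job.setJarByClass(SnappyCompressJob.class);
            // ... mapper/reducer/key/value setup elided ...

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            // also compress the final job output, not just the map output
            FileOutputFormat.setCompressOutput(job, true);
            FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }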

0 Answers:

No answers