以下是可以正常使用的代码段:
Configuration conf = new Configuration();
//PROBLEM PART!!!!!
//conf.setBoolean("mapred.compress.map.output", true);
//conf.set("mapred.output.compression.type", "BLOCK");
//conf.setClass("mapred.map.output.compression.codec", GzipCodec.class, CompressionCodec.class);
Job job = new Job(conf, "WordCount");
job.setJarByClass(WordCount.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setMapperClass(WordCountMap.class);
job.setReducerClass(Reduce.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.waitForCompletion(true);
但如果我在上面的代码段中启用问题部分,则控制台的输出将停留在:
13/12/26 18:08:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/12/26 18:08:06 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/12/26 18:08:06 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/12/26 18:08:06 INFO input.FileInputFormat: Total input paths to process : 20
13/12/26 18:08:06 WARN snappy.LoadSnappy: Snappy native library not loaded
13/12/26 18:08:06 INFO mapred.JobClient: Running job: job_local1943436108_0001
13/12/26 18:08:06 INFO mapred.LocalJobRunner: Waiting for map tasks
13/12/26 18:08:06 INFO mapred.LocalJobRunner: Starting task: attempt_local1943436108_0001_m_000000_0
13/12/26 18:08:07 INFO util.ProcessTree: setsid exited with exit code 0
13/12/26 18:08:07 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@731d2572
13/12/26 18:08:07 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/jude/input/capacity-scheduler.xml:0+7457
13/12/26 18:08:07 INFO mapred.MapTask: io.sort.mb = 100
13/12/26 18:08:07 INFO mapred.MapTask: data buffer = 79691776/99614720
13/12/26 18:08:07 INFO mapred.MapTask: record buffer = 262144/327680
13/12/26 18:08:07 INFO mapred.MapTask: Starting flush of map output
13/12/26 18:08:07 INFO compress.CodecPool: Got brand-new compressor
13/12/26 18:08:07 INFO mapred.MapTask: Starting flush of map output
13/12/26 18:08:07 INFO mapred.JobClient: map 0% reduce 0%
13/12/26 18:08:12 INFO mapred.LocalJobRunner:
13/12/26 18:08:13 INFO mapred.JobClient: map 5% reduce 0%
//no more
我只想压缩地图的输出,我的代码有什么问题吗?非常感谢!
答案 0 :(得分:1)
使用压缩需要Hadoop为您的平台使用本机库,但显然您没有它们(或者没有正确配置库的路径)。这是解释问题的信息:
[...] NativeCodeLoader: Unable to load native-hadoop library for your platform...
可能的解决方案:
最常见的问题是在64位架构上使用32位库。您可以下载pre-compiled 64 native libraries OR compile them yourself using mvn package -Pdist,**native**,docs
。
或者,您可能需要正确配置本机库的路径;请参阅以下有关如何执行此操作的其他问题:use -Djava.libray.path或LD_LIBRARY_PATH。