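I have not pasted the input, output, mapper, and reducer classes below; this is just my main function. I am running the code on Hadoop 1.0.4. It works fine until I try to compress the reducer's output. The compile error is pasted after the code: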
public static void main(String[] args) throws Exception
{
    Configuration conf = new Configuration();
    conf.set("xmlinput.start", "<page>");
    conf.set("xmlinput.end", "</page>");
    Job job = new Job(conf); //configure the job, submit it, control its execution, and query the state
    job.setJarByClass(XmlParser11.class); //set jar by finding where the class came from
    job.setOutputKeyClass(Text.class); //Set the key class for the job output data
    job.setOutputValueClass(Text.class); //Set the value class for the job output data
    //job.setCompressMapOutput(true);
    //job.setMapOutputCompressorClass(GzipCodec.class);
    //job.setCompressOutput(job, true);
    //job.setClass("mapred.output.compression.codec", GzipCodec.class,CompressionCodec.class);
    job.setMapperClass(XmlParser11.Map.class);
    job.setReducerClass(XmlParser11.Reduce.class);
    job.setInputFormatClass(XmlInputFormat1.class); //Set the InputFormat for the job
    job.setOutputFormatClass(TextOutputFormat.class); //Set the OutputFormat for the job
    FileOutputFormat.setCompressOutput(job,true);
    FileOutputFormat.setOutputCompressorClass(job,GzipCodec.class);
    FileInputFormat.addInputPath(job, new Path(args[0])); //the job for which the input path should be modified
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}
[ravisg@topsail-sn ~]$ javac -classpath /var/hadoop/hadoop-core-1.0.4.jar -d stopWords/ XmlParser11.java
XmlParser11.java:306: error: cannot find symbol
FileOutputFormat.setOutputCompressorClass(job,GzipCodec.class);
^
symbol: class GzipCodec
location: class XmlParser11
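Can you tell me how to compress my reducer's output, or point out what I am doing wrong? I have tried the different compression approaches suggested on Stack Overflow, but I always end up with a similar error.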
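Answer 0: (score: 1)
Sorry, it turns out I just needed to use the fully qualified class name
FileOutputFormat.setOutputCompressorClass(job, org.apache.hadoop.io.compress.GzipCodec.class);
instead of the simple name, which the compiler could not resolve because the class was never imported (adding an import for org.apache.hadoop.io.compress.GzipCodec at the top of the file should work equally well):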
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
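
Once the job compiles and runs with the codec set, the reducer output should be ordinary gzip data (part files typically named something like part-r-00000.gz), so it can be checked without Hadoop. The following is only a minimal sketch using java.util.zip from the JDK. The class name GzipPartFileDemo, its helper methods, and the sample lines are made up for illustration and are not part of the original code; the in-memory bytes stand in for a real part file copied out of HDFS.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of XmlParser11 or of Hadoop.
final class GzipPartFileDemo {

    // Stand-in for the job output: newline-terminated "key<TAB>value" lines, gzip-compressed.
    static byte[] gzipLines(List<String> lines) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the writer also finishes the gzip stream and writes its trailer.
        try (Writer out = new OutputStreamWriter(new GZIPOutputStream(buffer), StandardCharsets.UTF_8)) {
            for (String line : lines) {
                out.write(line);
                out.write('\n');
            }
        }
        return buffer.toByteArray();
    }

    // Decompresses a gzip stream and returns its text lines.
    static List<String> readGzipLines(InputStream compressed) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Sample records are invented; a real check would open the copied part file instead.
        byte[] gz = gzipLines(Arrays.asList("Title A\tsome text", "Title B\tmore text"));
        for (String line : readGzipLines(new ByteArrayInputStream(gz))) {
            System.out.println(line);
        }
    }
}
```

To inspect a real part file, pass a java.io.FileInputStream for the local copy to readGzipLines in place of the in-memory stream.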
Answer 1: (score: 0)
When compiling the code, you need to add the hadoop-common-*.jar from your Hadoop distribution to the classpath. That jar contains the GzipCodec class.