I am not getting any output from this program. When I run this MapReduce job, it completes but produces no results.
Input file: dict1.txt
apple,seo
apple,sev
dog,kukura
dog,kutta
cat,bilei
cat,billi
Desired output:
apple seo|sev
dog kukura|kutta
cat bilei|billi
Mapper class code:
package com.accure.Dict;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
public class DictMapper extends MapReduceBase implements Mapper<Text,Text,Text,Text> {
    private Text word = new Text();

    public void map(Text key, Text value, OutputCollector<Text,Text> output, Reporter reporter) throws IOException {
        StringTokenizer itr = new StringTokenizer(value.toString(), ",");
        while (itr.hasMoreTokens())
        {
            System.out.println(key);
            word.set(itr.nextToken());
            output.collect(key, word);
        }
    }
}
Reducer code:
package com.accure.Dict;
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
public class DictReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {
    private Text result = new Text();

    public void reduce(Text key, Iterator<Text> values, OutputCollector<Text,Text> output, Reporter reporter) throws IOException {
        String translations = "";
        while (values.hasNext()) {
            translations += "|" + values.next().toString();
        }
        result.set(translations);
        output.collect(key, result);
    }
}
Driver code:
package com.accure.driver;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import com.accure.Dict.DictMapper;
import com.accure.Dict.DictReducer;
public class DictDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        conf.setJobName("wordcount_pradosh");
        System.setProperty("HADOOP_USER_NAME", "accure");
        conf.set("fs.default.name", "hdfs://host2.hadoop.career.com:54310/");
        conf.set("hadoop.job.ugi", "accuregrp");
        conf.set("mapred.job.tracker", "host2.hadoop.career.com:54311");
        /*mapper and reduce class */
        conf.setMapperClass(DictMapper.class);
        conf.setReducerClass(DictReducer.class);
        /*This particular jar file has your classes*/
        conf.setJarByClass(DictMapper.class);
        Path inputPath = new Path("/myCareer/pradosh/input");
        Path outputPath = new Path("/myCareer/pradosh/output" + System.currentTimeMillis());
        /*input and output directory path */
        FileInputFormat.setInputPaths(conf, inputPath);
        FileOutputFormat.setOutputPath(conf, outputPath);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(Text.class);
        /*output key and value class*/
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        /*input and output format */
        conf.setInputFormat(KeyValueTextInputFormat.class); /*Here the file is a text file*/
        conf.setOutputFormat(TextOutputFormat.class);
        JobClient.runJob(conf);
    }
}
Output log:
14/04/02 08:33:38 INFO mapred.JobClient: Running job: job_201404010637_0011
14/04/02 08:33:39 INFO mapred.JobClient: map 0% reduce 0%
14/04/02 08:33:58 INFO mapred.JobClient: map 50% reduce 0%
14/04/02 08:33:59 INFO mapred.JobClient: map 100% reduce 0%
14/04/02 08:34:21 INFO mapred.JobClient: map 100% reduce 16%
14/04/02 08:34:23 INFO mapred.JobClient: map 100% reduce 100%
14/04/02 08:34:25 INFO mapred.JobClient: Job complete: job_201404010637_0011
14/04/02 08:34:25 INFO mapred.JobClient: Counters: 29
14/04/02 08:34:25 INFO mapred.JobClient: Job Counters
14/04/02 08:34:25 INFO mapred.JobClient: Launched reduce tasks=1
14/04/02 08:34:25 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=33692
14/04/02 08:34:25 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/04/02 08:34:25 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/04/02 08:34:25 INFO mapred.JobClient: Launched map tasks=2
14/04/02 08:34:25 INFO mapred.JobClient: Data-local map tasks=2
14/04/02 08:34:25 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=25327
14/04/02 08:34:25 INFO mapred.JobClient: File Input Format Counters
14/04/02 08:34:25 INFO mapred.JobClient: Bytes Read=92
14/04/02 08:34:25 INFO mapred.JobClient: File Output Format Counters
14/04/02 08:34:25 INFO mapred.JobClient: Bytes Written=0
14/04/02 08:34:25 INFO mapred.JobClient: FileSystemCounters
14/04/02 08:34:25 INFO mapred.JobClient: FILE_BYTES_READ=6
14/04/02 08:34:25 INFO mapred.JobClient: HDFS_BYTES_READ=336
14/04/02 08:34:25 INFO mapred.JobClient: FILE_BYTES_WRITTEN=169311
14/04/02 08:34:25 INFO mapred.JobClient: Map-Reduce Framework
14/04/02 08:34:25 INFO mapred.JobClient: Map output materialized bytes=12
14/04/02 08:34:25 INFO mapred.JobClient: Map input records=6
14/04/02 08:34:25 INFO mapred.JobClient: Reduce shuffle bytes=12
14/04/02 08:34:25 INFO mapred.JobClient: Spilled Records=0
14/04/02 08:34:25 INFO mapred.JobClient: Map output bytes=0
14/04/02 08:34:25 INFO mapred.JobClient: Total committed heap usage (bytes)=246685696
14/04/02 08:34:25 INFO mapred.JobClient: CPU time spent (ms)=2650
14/04/02 08:34:25 INFO mapred.JobClient: Map input bytes=61
14/04/02 08:34:25 INFO mapred.JobClient: SPLIT_RAW_BYTES=244
14/04/02 08:34:25 INFO mapred.JobClient: Combine input records=0
14/04/02 08:34:25 INFO mapred.JobClient: Reduce input records=0
14/04/02 08:34:25 INFO mapred.JobClient: Reduce input groups=0
14/04/02 08:34:25 INFO mapred.JobClient: Combine output records=0
14/04/02 08:34:25 INFO mapred.JobClient: Physical memory (bytes) snapshot=392347648
14/04/02 08:34:25 INFO mapred.JobClient: Reduce output records=0
14/04/02 08:34:25 INFO mapred.JobClient: Virtual memory (bytes) snapshot=2173820928
14/04/02 08:34:25 INFO mapred.JobClient: Map output records=0
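Answer 0 (score: 2)
When reading the input, you have set the input format to KeyValueTextInputFormat.
This format expects a separator byte between the key and the value on each line; by default that separator is a tab character. In your input the key and value are separated by "," instead, so the whole line is passed to the mapper as the key and the value is empty.
That is why execution never enters the following loop in the mapper: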
while (itr.hasMoreTokens())
{
    System.out.println(key);
    word.set(itr.nextToken());
    output.collect(key, word);
}
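To make the cause concrete, here is a minimal plain-Java sketch that imitates the splitting rule described above, without any Hadoop dependency. The class `SeparatorDemo` and its helper methods are hypothetical names made up for this illustration; they are not part of Hadoop and only approximate what the record reader does.

```java
import java.util.StringTokenizer;

public class SeparatorDemo {
    // Hypothetical helper imitating the key/value split: cut the line at the
    // first separator; if the separator is absent, the whole line becomes the
    // key and the value is empty.
    static String[] splitKeyValue(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    // Same tokenizing step as the mapper in the question.
    static int countTokens(String value) {
        return new StringTokenizer(value, ",").countTokens();
    }

    public static void main(String[] args) {
        // The input lines contain no tab, so the value ends up empty.
        String[] kv = splitKeyValue("apple,seo", '\t');
        System.out.println("key=[" + kv[0] + "] value=[" + kv[1]
                + "] tokens=" + countTokens(kv[1]));
    }
}
```

With an empty value the tokenizer yields zero tokens, so `hasMoreTokens()` is false on the first check and `output.collect` is never called.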
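You should tokenize the key yourself, then emit the first token as the key and the second token as the value.
The log confirms this: Map input records=6, but Map output records=0.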
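One way the suggested fix could look is sketched below in plain Java so it can be run without a cluster. `DictFixSketch`, `splitEntry`, and `joinTranslations` are hypothetical names introduced for illustration only. The join helper also avoids the leading "|" that the reducer in the question would produce, which is an observation beyond the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class DictFixSketch {
    // Split a line such as "apple,seo" at the first comma.
    // Returns null for lines without a comma so the caller can skip them.
    static String[] splitEntry(String line) {
        int pos = line.indexOf(',');
        if (pos < 0) {
            return null;
        }
        return new String[] { line.substring(0, pos).trim(),
                              line.substring(pos + 1).trim() };
    }

    // Join the values with "|" between them and no leading separator.
    static String joinTranslations(Iterator<String> values) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        while (values.hasNext()) {
            if (!first) {
                sb.append('|');
            }
            sb.append(values.next());
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for what the mapper receives as keys and what the
        // reducer later receives as grouped values.
        List<String> lines = Arrays.asList("apple,seo", "apple,sev");
        List<String> translations = new ArrayList<>();
        String word = null;
        for (String line : lines) {
            String[] entry = splitEntry(line);
            if (entry == null) {
                continue; // skip malformed lines
            }
            word = entry[0];
            translations.add(entry[1]);
        }
        System.out.println(word + "\t" + joinTranslations(translations.iterator()));
    }
}
```

In the actual mapper this would amount to calling `splitEntry(key.toString())` and, when the result is not null, emitting `output.collect(new Text(entry[0]), new Text(entry[1]))`. An alternative worth checking against your Hadoop version is to keep the mapper as a simple pass-through and change the separator in the driver; in the old `mapred` API the property is, to my knowledge, `key.value.separator.in.input.line`, but treat that name as an assumption and verify it before relying on it.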