My MapReduce program isn't producing any output. Can someone take a look?

Asked: 2014-04-02 15:51:18

Tags: hadoop mapreduce hdfs

I'm getting no output from this program. When I run this MapReduce job, I get no results.

Input file: dict1.txt

apple,seo
apple,sev
dog,kukura
dog,kutta
cat,bilei
cat,billi

Expected output:

apple seo|sev
dog kukura|kutta
cat bilei|billi

Mapper class code:

package com.accure.Dict;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;


public class DictMapper extends MapReduceBase implements Mapper<Text, Text, Text, Text> {

    private Text word = new Text();

    public void map(Text key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        StringTokenizer itr = new StringTokenizer(value.toString(), ",");
        while (itr.hasMoreTokens()) {
            System.out.println(key);
            word.set(itr.nextToken());
            output.collect(key, word);
        }
    }
}

Reducer code:

package com.accure.Dict;

import java.io.IOException;
import java.util.Iterator;


import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class DictReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {

    private Text result = new Text();

    public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        String translations = "";
        while (values.hasNext()) {
            translations += "|" + values.next().toString();
        }
        result.set(translations);
        output.collect(key, result);
    }
}

Driver code:

package com.accure.driver;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

import com.accure.Dict.DictMapper;
import com.accure.Dict.DictReducer;
public class DictDriver {

    public static void main(String[] args) throws Exception{
        // TODO Auto-generated method stub


        JobConf conf=new JobConf();
        conf.setJobName("wordcount_pradosh");
        System.setProperty("HADOOP_USER_NAME","accure");

        conf.set("fs.default.name","hdfs://host2.hadoop.career.com:54310/");
        conf.set("hadoop.job.ugi","accuregrp");
        conf.set("mapred.job.tracker","host2.hadoop.career.com:54311");

        /*mapper and reduce class */
        conf.setMapperClass(DictMapper.class);
        conf.setReducerClass(DictReducer.class);

        /*This particular jar file has your classes*/
        conf.setJarByClass(DictMapper.class);

        Path inputPath= new Path("/myCareer/pradosh/input");
        Path outputPath=new Path("/myCareer/pradosh/output"+System.currentTimeMillis());


        /*input and output directory path */
        FileInputFormat.setInputPaths(conf,inputPath);
        FileOutputFormat.setOutputPath(conf,outputPath);

        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(Text.class);

        /*output key and value class*/
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        /*input and output format */

        conf.setInputFormat(KeyValueTextInputFormat.class); /*Here the file is a text file*/
        conf.setOutputFormat(TextOutputFormat.class);

        JobClient.runJob(conf);


    }

}

Output log:

14/04/02 08:33:38 INFO mapred.JobClient: Running job: job_201404010637_0011
14/04/02 08:33:39 INFO mapred.JobClient:  map 0% reduce 0%
14/04/02 08:33:58 INFO mapred.JobClient:  map 50% reduce 0%
14/04/02 08:33:59 INFO mapred.JobClient:  map 100% reduce 0%
14/04/02 08:34:21 INFO mapred.JobClient:  map 100% reduce 16%
14/04/02 08:34:23 INFO mapred.JobClient:  map 100% reduce 100%
14/04/02 08:34:25 INFO mapred.JobClient: Job complete: job_201404010637_0011
14/04/02 08:34:25 INFO mapred.JobClient: Counters: 29
14/04/02 08:34:25 INFO mapred.JobClient:   Job Counters 
14/04/02 08:34:25 INFO mapred.JobClient:     Launched reduce tasks=1
14/04/02 08:34:25 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33692
14/04/02 08:34:25 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/04/02 08:34:25 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/04/02 08:34:25 INFO mapred.JobClient:     Launched map tasks=2
14/04/02 08:34:25 INFO mapred.JobClient:     Data-local map tasks=2
14/04/02 08:34:25 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=25327
14/04/02 08:34:25 INFO mapred.JobClient:   File Input Format Counters 
14/04/02 08:34:25 INFO mapred.JobClient:     Bytes Read=92
14/04/02 08:34:25 INFO mapred.JobClient:   File Output Format Counters 
14/04/02 08:34:25 INFO mapred.JobClient:     Bytes Written=0
14/04/02 08:34:25 INFO mapred.JobClient:   FileSystemCounters
14/04/02 08:34:25 INFO mapred.JobClient:     FILE_BYTES_READ=6
14/04/02 08:34:25 INFO mapred.JobClient:     HDFS_BYTES_READ=336
14/04/02 08:34:25 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=169311
14/04/02 08:34:25 INFO mapred.JobClient:   Map-Reduce Framework
14/04/02 08:34:25 INFO mapred.JobClient:     Map output materialized bytes=12
14/04/02 08:34:25 INFO mapred.JobClient:     Map input records=6
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce shuffle bytes=12
14/04/02 08:34:25 INFO mapred.JobClient:     Spilled Records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Map output bytes=0
14/04/02 08:34:25 INFO mapred.JobClient:     Total committed heap usage (bytes)=246685696
14/04/02 08:34:25 INFO mapred.JobClient:     CPU time spent (ms)=2650
14/04/02 08:34:25 INFO mapred.JobClient:     Map input bytes=61
14/04/02 08:34:25 INFO mapred.JobClient:     SPLIT_RAW_BYTES=244
14/04/02 08:34:25 INFO mapred.JobClient:     Combine input records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce input records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce input groups=0
14/04/02 08:34:25 INFO mapred.JobClient:     Combine output records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Physical memory (bytes) snapshot=392347648
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce output records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2173820928
14/04/02 08:34:25 INFO mapred.JobClient:     Map output records=0

1 Answer:

Answer 0 (score: 2)

When reading the input, you set the input format to KeyValueTextInputFormat.

This format expects a tab character between the key and the value. In your input, the key and value are separated by ",", so the entire line becomes the key and the value is empty.

That is why execution never enters this loop in the mapper:

while (itr.hasMoreTokens()) {
    System.out.println(key);
    word.set(itr.nextToken());
    output.collect(key, word);
}
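This is easy to verify in isolation: a StringTokenizer over an empty string yields no tokens, so the loop body is skipped entirely:

```java
import java.util.StringTokenizer;

// With an empty value (what the mapper receives when the whole line became
// the key), StringTokenizer has nothing to emit, so the while loop never runs.
public class EmptyValueDemo {
    public static void main(String[] args) {
        StringTokenizer itr = new StringTokenizer("", ",");
        System.out.println(itr.hasMoreTokens()); // prints "false"
    }
}
```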

You should tokenize the key yourself: use the first split as the key and the second split as the value.
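In plain Java terms, that means splitting the incoming key (which holds the whole line, e.g. "apple,seo") on the first comma. A minimal sketch of the parsing logic, with the Hadoop types omitted (the helper name here is hypothetical):

```java
// Sketch of the fixed parsing: the mapper's key contains the whole line,
// so split it at the first comma into the real key and the real value.
public class MapperSplitSketch {
    static String[] toKeyValue(String wholeLineKey) {
        String[] parts = wholeLineKey.split(",", 2); // at most 2 fields
        String k = parts[0];
        String v = parts.length > 1 ? parts[1] : ""; // no comma: empty value
        return new String[] { k, v };
    }

    public static void main(String[] args) {
        String[] kv = toKeyValue("apple,seo");
        System.out.println(kv[0] + " -> " + kv[1]); // prints "apple -> seo"
    }
}
```

Inside the real map(), you would then call output.collect() with those two pieces wrapped in Text objects.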

The logs confirm this: Map input records=6, but Map output records=0.
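Alternatively, you can keep KeyValueTextInputFormat and tell it to split on "," instead of the default tab. With the old mapred API this is controlled by the key.value.separator.in.input.line property (a sketch against the driver above; verify the property name for your Hadoop version):

```java
// In the driver, before submitting the job: make KeyValueTextInputFormat
// split each line at the first ',' rather than at the default tab.
conf.set("key.value.separator.in.input.line", ",");
```

With that in place the mapper receives "apple" as the key and "seo" as the value. Note also that the reducer above prepends "|" before the first value, so it would produce "|seo|sev" rather than the expected "seo|sev"; join with "|" only between values to match the desired output.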